[libvirt] [PATCHv5 00/19] Introduce x86 Cache Monitoring Technology (CMT)

This series, together with the patches that have already been merged, introduces x86 Cache Monitoring Technology (CMT) to libvirt by interacting with the kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU features that belong to Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores.

In the v1 series, an original and complete feature for CMT was introduced. The v2 and v3 patches address the host capability of CMT. v4 addresses monitoring the cache occupancy of a VM vcpu thread set and reporting it through a virsh command.

We have had several discussions about enabling CMT; please refer to the following links for the RFCs.

RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html

The commits already merged, for the host capability of CMT, are listed below.
6af8417415508c31f8ce71234b573b4999f35980
8f6887998bf63594ae26e3db18d4d5896c5f2cb4
58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8
12093f1feaf8f5023dcd9d65dff111022842183d
a5d293c18831dcf69ec6195798387fbb70c9f461

1. Why is CMT necessary in libvirt?

The perf events of 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf based cmt,mbm in libvirt will not work with the latest Linux kernel. These patches add the CMT feature to libvirt through the kernel resctrlfs interface.

2. Create cache monitoring group (cache monitor).

The main interface for creating a monitoring group is the XML file. The proposed configuration is like:

      <cputune>
        <cachetune vcpus='1'>
          <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
          <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
    +     <monitor level='3' vcpus='1'/>
        </cachetune>
        <cachetune vcpus='4-7'>
    +     <monitor level='3' vcpus='4-6'/>
        </cachetune>
      </cputune>

The XML above creates 2 cache resctrl allocation groups and 2 resctrl monitoring groups. Changes to the cache monitor take effect at the next boot of the VM.

3. Show CMT result through command 'domstats'

An interface is added in qemu to report this information for resource monitor groups through the command 'virsh domstats --cpu-total'. Below is a typical output:

    # virsh domstats 1 --cpu-total
    Domain: 'ubuntu16.04-base'
      ...
      cpu.cache.monitor.count=2
      cpu.cache.0.name=vcpus_1
      cpu.cache.0.vcpus=1
      cpu.cache.0.bank.count=2
      cpu.cache.0.bank.0.id=0
      cpu.cache.0.bank.0.bytes=4505600
      cpu.cache.0.bank.1.id=1
      cpu.cache.0.bank.1.bytes=5586944
      cpu.cache.1.name=vcpus_4-6
      cpu.cache.1.vcpus=4,5,6
      cpu.cache.1.bank.count=2
      cpu.cache.1.bank.0.id=0
      cpu.cache.1.bank.0.bytes=17571840
      cpu.cache.1.bank.1.id=1
      cpu.cache.1.bank.1.bytes=29106176

Changes in v5:
- qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection.
- Add the document for domain configuration related to resctrl monitor.

Changes in v4:
v4 addresses monitoring the cache occupancy of a VM vcpu thread set and reporting it through a virsh command.
- Introduced resctrl default allocation
- Introduced resctrl monitor and default monitor

Changes in v3:
- Addressed John Ferlan's review.
- Typo fixed.
- Removed VIR_ENUM_DECL(virMonitor);

Changes in v2:
- Introduced MBM capability.
- Capability layout changed
  * Moved <monitor> from cache <bank> to <cache>
  * Renamed <Threshold> to <reuseThreshold>
- Document for 'reuseThreshold' changed.
- Introduced API virResctrlInfoGetMonitorPrefix
- Added more tests, covering standalone CMT, fake new feature.
- Creating CMT resource control group will be a subsequent job.

Wang Huaqiang (19):
  docs: Refactor schemas to support default allocation
  util: Introduce resctrl monitor for CMT
  util: Refactor code for adding PID to the resource group
  util: Add interface for adding PID to monitor
  util: Refactor code for determining allocation path
  util: Add monitor interface to determine path
  util: Refactor code for creating resctrl group
  util: Add interface for creating monitor group
  util: Add more interfaces for resctrl monitor
  util: Introduce default monitor
  conf: Refactor code for matching existing resctrls
  conf: Refactor virDomainResctrlAppend
  conf: Add resctrl monitor configuration
  Util: Add function for checking if monitor is running
  qemu: enable resctrl monitor in qemu
  conf: Add a 'id' to virDomainResctrlDef
  qemu: refactor qemuDomainGetStatsCpu
  qemu: Report cache occupancy (CMT) with domstats
  qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection

 docs/formatdomain.html.in                          |  30 +-
 docs/schemas/domaincommon.rng                      |  14 +-
 src/conf/domain_conf.c                             | 327 ++++++++++--
 src/conf/domain_conf.h                             |  12 +
 src/libvirt-domain.c                               |   9 +
 src/libvirt_private.syms                           |  12 +
 src/qemu/qemu_driver.c                             | 271 +++++++++-
 src/qemu/qemu_process.c                            |  52 +-
 src/util/virresctrl.c                              | 562 ++++++++++++++++++++-
 src/util/virresctrl.h                              |  49 ++
 tests/genericxml2xmlindata/cachetune-cdp.xml       |   3 +
 .../cachetune-colliding-monitor.xml                |  30 ++
 tests/genericxml2xmlindata/cachetune-small.xml     |   7 +
 tests/genericxml2xmltest.c                         |   2 +
 14 files changed, 1277 insertions(+), 103 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml

-- 
2.7.4
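The per-bank "bytes" values in the domstats output correspond to counters that the kernel exposes as plain files under each monitoring group. The following is a minimal sketch, not code from this series: it assumes the "mon_data/mon_L3_XX/llc_occupancy" layout described in the kernel resctrl documentation, and both function names are hypothetical.

```c
#include <stdio.h>

/* Build the path of the L3 occupancy counter for one cache bank of a
 * monitoring group. The "mon_data/mon_L3_XX/llc_occupancy" layout is an
 * assumption taken from the kernel resctrl documentation; the function
 * name is hypothetical and not part of libvirt. */
int
build_llc_occupancy_path(char *buf, size_t buflen,
                         const char *group_dir, unsigned int cache_id)
{
    int n = snprintf(buf, buflen, "%s/mon_data/mon_L3_%02u/llc_occupancy",
                     group_dir, cache_id);

    /* snprintf reports truncation by returning the untruncated length. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Read one occupancy value, in bytes, from the given counter file.
 * Returns 0 on success and -1 if the file is missing or malformed. */
int
read_llc_occupancy(const char *path, unsigned long long *bytes)
{
    FILE *fp = fopen(path, "r");
    int rc;

    if (!fp)
        return -1;

    rc = (fscanf(fp, "%llu", bytes) == 1) ? 0 : -1;
    fclose(fp);
    return rc;
}
```

A caller would build one path per cache bank id and report each value the way the bank.N.bytes fields do.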

This patch introduces the resctrl default allocation. It refers to the resctrl root directory (/sys/fs/resctrl), which is created immediately after mounting, owns all the tasks and cpus in the system and can make full use of all resources.

No dedicated amount of resource, either cache or memory bandwidth, is intentionally allocated for the default allocation.

The default allocation is meaningful when a system task has no resource control applied but you still want to know the task's cache or memory bandwidth utilization. We create the resctrl monitor under the default allocation for this kind of task.

Refactor the schemas, docs and APIs to create a default cache allocation by allowing a <cachetune> with no <cache> element.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 docs/formatdomain.html.in     |  4 ++--
 docs/schemas/domaincommon.rng |  4 ++--
 src/conf/domain_conf.c        | 32 +++++++++++++++++++-------------
 src/util/virresctrl.c         | 27 +++++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8189959..b1651e3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -943,8 +943,8 @@
         <dl>
           <dt><code>cache</code></dt>
           <dd>
-            This element controls the allocation of CPU cache and has the
-            following attributes:
+            This optional element controls the allocation of CPU cache and has
+            the following attributes:
             <dl>
               <dt><code>level</code></dt>
               <dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949..5c533d6 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -956,7 +956,7 @@
         <attribute name="vcpus">
           <ref name='cpuset'/>
         </attribute>
-        <oneOrMore>
+        <zeroOrMore>
           <element name="cache">
             <attribute name="id">
               <ref name='unsignedInt'/>
@@ -980,7 +980,7 @@
               </attribute>
             </optional>
           </element>
-        </oneOrMore>
+        </zeroOrMore>
       </element>
     </zeroOrMore>
     <zeroOrMore>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..b77680e 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
         goto cleanup;
     }
 
-    if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
-        goto cleanup;
-
-    if (!alloc) {
-        alloc = virResctrlAllocNew();
-        if (!alloc)
+    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
+     * this means it is a default alloction. For default allocation,
+     * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
+    if (n != 0) {
+        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
             goto cleanup;
-    } else {
-        virReportError(VIR_ERR_XML_ERROR, "%s",
-                       _("Identical vcpus in cachetunes found"));
-        goto cleanup;
-    }
 
-    for (i = 0; i < n; i++) {
-        if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+        if (!alloc) {
+            alloc = virResctrlAllocNew();
+            if (!alloc)
+                goto cleanup;
+        } else {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("Identical vcpus in cachetunes found"));
             goto cleanup;
+        }
+
+        for (i = 0; i < n; i++) {
+            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+                goto cleanup;
+        }
     }
 
     if (virResctrlAllocIsEmpty(alloc)) {
@@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 
     if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
         goto cleanup;
+
     vcpus = NULL;
     alloc = NULL;
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index df5b512..697424c 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon)
  * in case there is no allocation for that particular cache allocation (level,
  * cache, ...) or memory allocation for particular node).
  *
+ * Resctrl file system root directory, /sys/fs/sysctrl/, is called the default
+ * allocation, which is created, immediately after mounting, owns all the
+ * tasks and cpus in the system and can make full use of all resources.
+ *
  * =====Cache allocation technology (CAT)=====
  *
  * Since one allocation can be made for caches on different levels, the first
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc,
                             unsigned int cache,
                             unsigned long long size)
 {
+    if (!alloc)
+        return 0;
+
     if (virResctrlAllocCheckCollision(alloc, level, type, cache)) {
         virReportError(VIR_ERR_XML_ERROR,
                        _("Colliding cache allocations for cache "
@@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc,
 {
     virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
 
+    if (!alloc)
+        return 0;
+
     if (memory_bandwidth > 100) {
         virReportError(VIR_ERR_XML_ERROR, "%s",
                        _("Memory Bandwidth value exceeding 100 is invalid."));
@@ -1304,6 +1314,11 @@ int
 virResctrlAllocSetID(virResctrlAllocPtr alloc,
                      const char *id)
 {
+    /* If passed a default allocation in, @alloc will be NULL. This is
+     * a valid case, return normally. */
+    if (!alloc)
+        return 0;
+
     if (!id) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Resctrl allocation 'id' cannot be NULL"));
@@ -1317,6 +1332,9 @@ virResctrlAllocSetID(virResctrlAllocPtr alloc,
 const char *
 virResctrlAllocGetID(virResctrlAllocPtr alloc)
 {
+    if (!alloc)
+        return NULL;
+
     return alloc->id;
 }
 
@@ -2209,6 +2227,9 @@ int
 virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
                              const char *machinename)
 {
+    if (!alloc)
+        return 0;
+
     if (!alloc->id) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Resctrl Allocation ID must be set before creation"));
@@ -2302,6 +2323,9 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc,
     char *pidstr = NULL;
     int ret = 0;
 
+    if (!alloc)
+        return 0;
+
     if (!alloc->path) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Cannot add pid to non-existing resctrl allocation"));
@@ -2334,6 +2358,9 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc)
 {
     int ret = 0;
 
+    if (!alloc)
+        return 0;
+
     if (!alloc->path)
         return 0;
 
-- 
2.7.4

s/docs/conf,util/

It's more than just docs...

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
This patch introduces the resctrl default allocation. It refers to the resctrl root directory (/sys/fs/resctrl), which is created immediately after mounting, owns all the tasks and cpus in the system and can make full use of all resources.

No dedicated amount of resource, either cache or memory bandwidth, is intentionally allocated for the default allocation.

The default allocation is meaningful when a system task has no resource control applied but you still want to know the task's cache or memory bandwidth utilization. We create the resctrl monitor under the default allocation for this kind of task.

Refactor the schemas, docs and APIs to create a default cache allocation by allowing a <cachetune> with no <cache> element.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 docs/formatdomain.html.in     |  4 ++--
 docs/schemas/domaincommon.rng |  4 ++--
 src/conf/domain_conf.c        | 32 +++++++++++++++++++-------------
 src/util/virresctrl.c         | 27 +++++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 17 deletions(-)
How would this look in XML output - supply a <cachetune> w/o the <cache> element? Probably also need the <monitor> there as well in at least one entry just to prove it works.

It seems <memorytune> entries have their own unique "back door" of sorts, calling virResctrlAllocNew when there are no <cachetune> entries. This is similar to what happens if there were cachetune entries for vcpus "0-1" and "2-5", but nothing for "6-7". The assumption always being that <memorytune> is read after <cachetune> and, as long as there's no overlap, just create the <memorytune> entry, by supplying a <cachetune> entry without <cache> entries.

It's a little awkward to read, but now makes me think about all the existing/strange linkages. In a way I suppose having no <cachetune> entries is proven to work by tests/genericxml2xmlindata/memorytune.xml. The reality is they get created, but without visibility.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8189959..b1651e3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -943,8 +943,8 @@
         <dl>
           <dt><code>cache</code></dt>
           <dd>
-            This element controls the allocation of CPU cache and has the
-            following attributes:
+            This optional element controls the allocation of CPU cache and has
+            the following attributes:
             <dl>
               <dt><code>level</code></dt>
               <dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949..5c533d6 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -956,7 +956,7 @@
         <attribute name="vcpus">
           <ref name='cpuset'/>
         </attribute>
-        <oneOrMore>
+        <zeroOrMore>
           <element name="cache">
             <attribute name="id">
               <ref name='unsignedInt'/>
@@ -980,7 +980,7 @@
               </attribute>
             </optional>
           </element>
-        </oneOrMore>
+        </zeroOrMore>
       </element>
     </zeroOrMore>
     <zeroOrMore>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..b77680e 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
         goto cleanup;
     }
 
-    if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
-        goto cleanup;
-
-    if (!alloc) {
-        alloc = virResctrlAllocNew();
-        if (!alloc)
+    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
+     * this means it is a default alloction. For default allocation,
s/alloction/allocation
+ * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
Not sure what ^^ this was...
+    if (n != 0) {
+        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
             goto cleanup;
-    } else {
-        virReportError(VIR_ERR_XML_ERROR, "%s",
-                       _("Identical vcpus in cachetunes found"));
-        goto cleanup;
-    }
diff is perhaps easier to read if you:

    if (n == 0) {
        ret = 0;
        goto cleanup;
    }

then none of the indent is needed for n != 0
 
-    for (i = 0; i < n; i++) {
-        if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+        if (!alloc) {
+            alloc = virResctrlAllocNew();
+            if (!alloc)
+                goto cleanup;
+        } else {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("Identical vcpus in cachetunes found"));
             goto cleanup;
+        }
+
+        for (i = 0; i < n; i++) {
+            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+                goto cleanup;
+        }
     }
 
     if (virResctrlAllocIsEmpty(alloc)) {
@@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 
     if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
         goto cleanup;
+
Superfluous
     vcpus = NULL;
     alloc = NULL;
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index df5b512..697424c 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon)
  * in case there is no allocation for that particular cache allocation (level,
  * cache, ...) or memory allocation for particular node).
  *
+ * Resctrl file system root directory, /sys/fs/sysctrl/, is called the default
+ * allocation, which is created, immediately after mounting, owns all the
+ * tasks and cpus in the system and can make full use of all resources.
+ *
  * =====Cache allocation technology (CAT)=====
  *
  * Since one allocation can be made for caches on different levels, the first
I assume you searched on:

    virResctrlAllocGetType (w/ callers:)
        virResctrlAllocUpdateMask
        virResctrlAllocUpdateSize
        virResctrlAllocCopyMasks

It's kind of "painful" to back trace all the callers and determine if any/each of them does the if (!alloc) check "originally" somewhere. I took a quick look and they seem OK.
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc,
                             unsigned int cache,
                             unsigned long long size)
 {
+    if (!alloc)
+        return 0;
+
     if (virResctrlAllocCheckCollision(alloc, level, type, cache)) {
         virReportError(VIR_ERR_XML_ERROR,
                        _("Colliding cache allocations for cache "
@@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc,
 {
     virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
^^ This wouldn't have been too happy would it if alloc == NULL; however,
+    if (!alloc)
+        return 0;
+
I don't think it'll matter since the only caller is virDomainMemorytuneDefParse which will allocate an @alloc if one didn't exist *and* pass that through to here, so this check shouldn't be necessary. In researching this I realized that although we have a memorytune-colliding-allocs.xml and memorytune.xml, there is no <memorytune> example that includes <cachetune> entries as well.
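To make the ordering problem concrete: if a NULL check were kept in such a setter, it would have to run before the first dereference of @alloc. The sketch below uses hypothetical, simplified stand-ins for the libvirt types (MemBWSketch, MemBWAllocSketch and the function name are assumptions, not the real definitions) and only illustrates the ordering.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libvirt types; these are not
 * the real virResctrlAlloc definitions. */
typedef struct {
    unsigned int bandwidth;   /* percent, 0-100 */
} MemBWSketch;

typedef struct {
    MemBWSketch *mem_bw;
} MemBWAllocSketch;

/* Returns 0 on success, including the NULL "default allocation" case,
 * and -1 on invalid input. */
int
alloc_set_memory_bandwidth(MemBWAllocSketch *alloc, unsigned int percent)
{
    MemBWSketch *mem_bw;

    /* The check must come first: initializing mem_bw from alloc->mem_bw in
     * the declaration would already dereference a NULL pointer. */
    if (!alloc)
        return 0;

    mem_bw = alloc->mem_bw;
    if (percent > 100 || !mem_bw)
        return -1;

    mem_bw->bandwidth = percent;
    return 0;
}
```

Whether the check belongs there at all is a separate question; as noted above, the only caller always passes a valid @alloc.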
     if (memory_bandwidth > 100) {
         virReportError(VIR_ERR_XML_ERROR, "%s",
                        _("Memory Bandwidth value exceeding 100 is invalid."));
@@ -1304,6 +1314,11 @@ int
 virResctrlAllocSetID(virResctrlAllocPtr alloc,
                      const char *id)
 {
+    /* If passed a default allocation in, @alloc will be NULL. This is
+     * a valid case, return normally. */
This is the only one to get that type of comment... Probably something that should instead be more clearly indicated perhaps in the CAT and MBA comments at the top of the module.
+    if (!alloc)
+        return 0;
+
     if (!id) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Resctrl allocation 'id' cannot be NULL"));
@@ -1317,6 +1332,9 @@ virResctrlAllocSetID(virResctrlAllocPtr alloc,
 const char *
 virResctrlAllocGetID(virResctrlAllocPtr alloc)
 {
+    if (!alloc)
+        return NULL;
+
Probably need to consider current callers... I see that both virDomainCachetuneDefFormat and virDomainMemorytuneDefFormat would return -1 for some unknown reason. Although perhaps the latter would work fine since it'd create its own @alloc if resctrl->alloc == NULL. Hence why I asked for an XML example above.
     return alloc->id;
 }
 
@@ -2209,6 +2227,9 @@ int
 virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
                              const char *machinename)
 {
+    if (!alloc)
+        return 0;
+
     if (!alloc->id) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Resctrl Allocation ID must be set before creation"));
@@ -2302,6 +2323,9 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc,
     char *pidstr = NULL;
     int ret = 0;
 
+    if (!alloc)
+        return 0;
+
     if (!alloc->path) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Cannot add pid to non-existing resctrl allocation"));
@@ -2334,6 +2358,9 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc)
 {
     int ret = 0;
 
+    if (!alloc)
+        return 0;
+
     if (!alloc->path)
         return 0;
These two could be combined

John
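One reading of the "combined" remark is that the two early returns in virResctrlAllocRemove collapse into a single short-circuit condition. A minimal sketch of that idea, using a hypothetical one-field stand-in type (PathAllocSketch) and a hypothetical helper name rather than the real libvirt structure:

```c
#include <stddef.h>

/* Hypothetical stand-in: only the field relevant to the check is modelled. */
typedef struct {
    char *path;
} PathAllocSketch;

/* Returns 1 when there is nothing to remove: either the default allocation
 * (NULL) or an allocation whose resctrl directory was never created.
 * The || short-circuits, so alloc->path is only read when alloc is set. */
int
alloc_nothing_to_remove(const PathAllocSketch *alloc)
{
    return !alloc || !alloc->path;
}
```

In the patched function this would correspond to a single "if (!alloc || !alloc->path) return 0;" guard.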

-----Original Message-----
From: John Ferlan [mailto:jferlan@redhat.com]
Sent: Wednesday, October 10, 2018 4:36 AM
To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com
Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com>
Subject: Re: [libvirt] [PATCHv5 01/19] docs: Refactor schemas to support default allocation
s/docs/conf,util/
It's more than just docs...
Yes, the title will be changed accordingly.
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
This patch introduces the resctrl default allocation. It refers to the resctrl root directory (/sys/fs/resctrl), which is created immediately after mounting, owns all the tasks and cpus in the system and can make full use of all resources.

No dedicated amount of resource, either cache or memory bandwidth, is intentionally allocated for the default allocation.

The default allocation is meaningful when a system task has no resource control applied but you still want to know the task's cache or memory bandwidth utilization. We create the resctrl monitor under the default allocation for this kind of task.

Refactor the schemas, docs and APIs to create a default cache allocation by allowing a <cachetune> with no <cache> element.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 docs/formatdomain.html.in     |  4 ++--
 docs/schemas/domaincommon.rng |  4 ++--
 src/conf/domain_conf.c        | 32 +++++++++++++++++++-------------
 src/util/virresctrl.c         | 27 +++++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 17 deletions(-)
How would this look in XML output - supply a <cachetune> w/o the <cache> element? Probably also need the <monitor> there as well in at least one entry just to prove it works.
If neither <monitor> nor <cache> is parsed from the XML, the <cachetune> element will not be shown. The <cachetune> element can only be seen if it contains at least one <cache> or <monitor> element. The following layout cannot be seen in the XML:

    <cputune>
      ...
      <cachetune vcpus='0'/>
    </cputune>

As of this patch, the 'resctrl monitor' has not yet been introduced. For a better understanding of the purpose of this patch, let me take an example to illustrate what will happen after applying the whole series.

Suppose we are going to create a monitor over vcpu 0 to get its cache utilization, and we haven't created any cache allocation for vcpu 0. This could happen if the user wants to know the cache usage of a specific vcpu but doesn't want to allocate any dedicated amount of cache for it. The XML layout would be:

    <cputune>
      <cachetune vcpus='0'>
        <monitor level='3' vcpus='0'/>
      </cachetune>
    </cputune>

To support the above XML layout in future patches, we need to append a virDomainResctrlDef element to @(virDomainDefPtr*)def->resctrls even if virDomainResctrlDef.alloc is empty. This patch changes the code to implement this, and it will not create any cache allocation if @def->resctrls[i]->alloc is set to NULL. The monitor specified in the above configuration will be created under '/sys/fs/resctrl/mon_groups'.

This piece of code is refactored several times in this patch and subsequent patches; each patch works and does not break the functionality already implemented. But the resctrl monitor functionality only works after the whole series has been applied.
It seems <memorytune> entries have their own unique "back door" of sorts calling virResctrlAllocNew when there are no <cachetune> entries.
Yes. memorytune creates a separate entry in <cputune>.
Similar to what happens if there were entries cachetune for vcpus of "0-1" and "2-5", but nothing for "6-7". The assumption always being <memorytune> read after <cachetune> and as long as there's no overlap, just create the <memorytune> entry, by supplying a <cachetune> entry without <cache> entries.
I don't fully understand the above paragraph. I suppose your 'cachetune' entry means a corresponding element in the @def->resctrls array. Up to this patch, it is not allowed to append a virDomainResctrlDef element to @def->resctrls if there is no <cache> element. Later, a virDomainResctrlDef element can only be appended if there exists at least one <cache> element or one <monitor> element.
supplying a <cachetune> entry without <cache> entries.
A <cachetune> entry without <cache> entries and, at the same time, without <monitor> entries is not permitted here.
It's a little awkward to read, but now makes me think about all the existing/strange linkages. In a way I suppose having no <cachetune> entries is proven to work by tests/genericxml2xmlindata/memorytune.xml. The reality is they get created, but without visibility.
What is created without visibility? I don't understand. If there are no <cachetune> entries, no virDomainResctrlDef element will be appended to @def->resctrls. Up to this patch, no <cachetune> entry is created in a later invocation of virDomainCachetuneDefFormat and no element is appended to @def->resctrls during XML parsing, even with the following XML configuration:

    <cputune>
      ...
      <cachetune vcpus='0-1'/>
    </cputune>

Even after the whole patch series is integrated, the XML configuration above will not create any @def->resctrls element during configuration file parsing, or any <cachetune> XML element in a later call of virDomainCachetuneDefFormat.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8189959..b1651e3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -943,8 +943,8 @@
         <dl>
           <dt><code>cache</code></dt>
           <dd>
-            This element controls the allocation of CPU cache and has the
-            following attributes:
+            This optional element controls the allocation of CPU cache and has
+            the following attributes:
             <dl>
               <dt><code>level</code></dt>
               <dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949..5c533d6 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -956,7 +956,7 @@
         <attribute name="vcpus">
           <ref name='cpuset'/>
         </attribute>
-        <oneOrMore>
+        <zeroOrMore>
           <element name="cache">
             <attribute name="id">
               <ref name='unsignedInt'/>
@@ -980,7 +980,7 @@
               </attribute>
             </optional>
           </element>
-        </oneOrMore>
+        </zeroOrMore>
       </element>
     </zeroOrMore>
     <zeroOrMore>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..b77680e 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
         goto cleanup;
     }
 
-    if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
-        goto cleanup;
-
-    if (!alloc) {
-        alloc = virResctrlAllocNew();
-        if (!alloc)
+    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
+     * this means it is a default alloction. For default allocation,
s/alloction/allocation
My mistake. Will be fixed.
+ * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
Not sure what ^^ this was...
Should be @virDomainResctrlDefPtr->alloc.
+    if (n != 0) {
+        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
             goto cleanup;
-    } else {
-        virReportError(VIR_ERR_XML_ERROR, "%s",
-                       _("Identical vcpus in cachetunes found"));
-        goto cleanup;
-    }
diff is perhaps easier to read if you:

    if (n == 0) {
        ret = 0;
        goto cleanup;
    }

then none of the indent is needed for n != 0
Your suggested change works here, but it will conflict with logic I introduce in patch 13. This part of the code is modified in a later patch (patch 13 of 19), which adds code for parsing the monitor configuration. At that point, letting this function return without error when n == 0 is no longer reasonable; it also needs to check whether <monitor> entries exist under the <cachetune> entry. I paste the code here for your reference:

    if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) {
        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                       _("Cannot extract cache nodes under cachetune"));
        goto cleanup;
    }

    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
     * this means it is a default allocation. For default allocation,
     * @virDomainResctrlDefPtr->alloc is set to NULL */
    if (n != 0) {
        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
            goto cleanup;

        if (!alloc) {
            alloc = virResctrlAllocNew();
            if (!alloc)
                goto cleanup;
        } else {
            virReportError(VIR_ERR_XML_ERROR, "%s",
                           _("Identical vcpus in cachetunes found"));
            goto cleanup;
        }

        for (i = 0; i < n; i++) {
            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
                goto cleanup;
        }
    }

    resctrl = virDomainResctrlNew(alloc, vcpus);
    if (!resctrl)
        goto cleanup;

    if (virDomainResctrlMonDefParse(def, ctxt, node,
                                    VIR_RESCTRL_MONITOR_TYPE_CACHE,
                                    resctrl) < 0)
        goto cleanup;

    if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {

--> If there is no 'cache' entry and no 'monitor' entry in the current <cachetune> entry, the code reaches this place, the function returns normally without error, and virDomainResctrlAppend is not executed.

        ret = 0;
        goto cleanup;
    }

    if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
        goto cleanup;

    resctrl = NULL;
    ret = 0;
 cleanup:
    ctxt->node = oldnode;
    virDomainResctrlDefFree(resctrl);
    virObjectUnref(alloc);
    virBitmapFree(vcpus);
    VIR_FREE(nodes);
    return ret;
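The three outcomes discussed for a <cachetune> entry can be summarized as a small decision function. This is only a sketch of the rule as described in this thread, not code from the series: the enum, its values and the function name are hypothetical, and it simplifies by treating "at least one <cache> element" as equivalent to a non-empty allocation.

```c
#include <stddef.h>

/* Hypothetical classification of a parsed <cachetune> entry. */
typedef enum {
    CACHETUNE_SKIP,            /* no <cache>, no <monitor>: nothing appended */
    CACHETUNE_DEFAULT_ALLOC,   /* only <monitor>: appended, alloc stays NULL */
    CACHETUNE_DEDICATED_ALLOC  /* at least one <cache>: appended with alloc */
} CachetuneActionSketch;

/* ncaches and nmonitors are the numbers of <cache> and <monitor> children
 * found under one <cachetune> element. */
CachetuneActionSketch
cachetune_classify(size_t ncaches, size_t nmonitors)
{
    if (ncaches > 0)
        return CACHETUNE_DEDICATED_ALLOC;
    if (nmonitors > 0)
        return CACHETUNE_DEFAULT_ALLOC;
    return CACHETUNE_SKIP;
}
```

Under this reading, an unconditional early return for n == 0 would turn the middle case into a skip, which is the conflict with patch 13 described above.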
 
-    for (i = 0; i < n; i++) {
-        if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+        if (!alloc) {
+            alloc = virResctrlAllocNew();
+            if (!alloc)
+                goto cleanup;
+        } else {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("Identical vcpus in cachetunes found"));
             goto cleanup;
+        }
+
+        for (i = 0; i < n; i++) {
+            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+                goto cleanup;
+        }
     }
 
     if (virResctrlAllocIsEmpty(alloc)) {
@@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 
     if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
         goto cleanup;
+
Superfluous
This blank line was introduced by mistake and will be removed :)
     vcpus = NULL;
     alloc = NULL;
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index df5b512..697424c 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon)
  * in case there is no allocation for that particular cache allocation (level,
  * cache, ...) or memory allocation for particular node).
  *
+ * Resctrl file system root directory, /sys/fs/sysctrl/, is called the default
+ * allocation, which is created, immediately after mounting, owns all the
+ * tasks and cpus in the system and can make full use of all resources.
+ *
  * =====Cache allocation technology (CAT)=====
  *
  * Since one allocation can be made for caches on different levels, the first
I assume you searched on:

    virResctrlAllocGetType (w/ callers:)
        virResctrlAllocUpdateMask
        virResctrlAllocUpdateSize
        virResctrlAllocCopyMasks

It's kind of "painful" to back trace all the callers and determine if any/each of them does the if (!alloc) check "originally" somewhere. I took a quick look and they seem OK.
Yes, and I also double-checked these functions. These patches focus on cache monitor entries; the functions will be checked further when the memory bandwidth monitor is introduced later.
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc,
                             unsigned int cache,
                             unsigned long long size)
 {
+    if (!alloc)
+        return 0;
+
     if (virResctrlAllocCheckCollision(alloc, level, type, cache)) {
         virReportError(VIR_ERR_XML_ERROR,
                        _("Colliding cache allocations for cache "
@@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc,
 {
     virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
^^ This wouldn't have been too happy would it if alloc == NULL; however,
+    if (!alloc)
+        return 0;
+
I don't think it'll matter since the only caller is virDomainMemorytuneDefParse which will allocate an @alloc if one didn't exist *and* pass that through to here, so this check shouldn't be necessary.
I didn't realize that @alloc had already been used! I will remove the pointer check for @alloc.
In researching this I realized that although we have a memorytune-colliding-allocs.xml and memorytune.xml, there is no <memorytune> example that includes <cachetune> entries as well.
Let me add a test for this case in the next update.
     if (memory_bandwidth > 100) {
         virReportError(VIR_ERR_XML_ERROR, "%s",
                        _("Memory Bandwidth value exceeding 100 is invalid."));
@@ -1304,6 +1314,11 @@ int
 virResctrlAllocSetID(virResctrlAllocPtr alloc,
                      const char *id)
 {
+    /* If passed a default allocation in, @alloc will be NULL. This is
+     * a valid case, return normally. */
Will remove the comment. I'll try to add this comment to the CAT and MBA comments.
This is the only one to get that type of comment... Probably something that should instead be more clearly indicated perhaps in the CAT and MBA comments at the top of the module.
+ if (!alloc) + return 0; + if (!id) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Resctrl allocation 'id' cannot be NULL")); @@ -1317,6 +1332,9 @@ virResctrlAllocSetID(virResctrlAllocPtr alloc, const char * virResctrlAllocGetID(virResctrlAllocPtr alloc) { + if (!alloc) + return NULL; +
Probably need to consider current callers... I see that both virDomainCachetuneDefFormat and virDomainMemorytuneDefFormat would return -1 for some unknown reason. Although perhaps the latter would work fine since it'd create its own @alloc if resctrl->alloc == NULL.
Hence why I asked for an XML example above.
There is a subsequent patch, patch 16, that handles this. Up to now no monitor has been introduced, so there is no way to pass in an empty @alloc and this code will not cause any trouble. In patch 16, a 'virDomainResctrl.id' is introduced, and later code will use 'virDomainResctrlDef.id' to track the id of the cachetune. I did this because I have extended the scope of virDomainResctrlDef in later patches by adding monitors, and one virDomainResctrlDef is the abstraction of one <cachetune> entry, so logically the 'id' of <cachetune> should be kept in the virDomainResctrlDef structure.
return alloc->id; }
@@ -2209,6 +2227,9 @@ int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) { + if (!alloc) + return 0; + if (!alloc->id) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Resctrl Allocation ID must be set before creation")); @@ -2302,6 +2323,9 @@ virResctrlAllocAddPID(virResctrlAllocPtr
alloc,
char *pidstr = NULL; int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Cannot add pid to non-existing resctrl allocation")); @@ -2334,6 +2358,9 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) return 0;
These two could be combined
Will be combined.
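For illustration, the combined guard agreed on above could look like the following. This is a minimal sketch, not the code in virresctrl.c: ExampleAlloc and exampleAllocRemove are hypothetical stand-ins that model only the 'path' field.

```c
#include <stddef.h>

/* Hypothetical stand-in for virResctrlAlloc; only the field used here. */
typedef struct {
    const char *path;   /* resctrl directory backing this allocation, or NULL */
} ExampleAlloc;

/* Returns 0 when there is nothing to remove, 1 when a removal would be
 * attempted. */
static int
exampleAllocRemove(const ExampleAlloc *alloc)
{
    /* One guard covers both "no allocation at all" and "allocation whose
     * directory was never created". */
    if (!alloc || !alloc->path)
        return 0;

    /* The real function would remove the directory at alloc->path here. */
    return 1;
}
```

Because || short-circuits, alloc->path is only read when alloc is non-NULL, so the merged condition behaves the same as the two separate checks.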
John
Thanks for review. Huaqiang

On 10/10/18 9:44 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 4:36 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 01/19] docs: Refactor schemas to support default allocation
s/docs/conf,util/
It's more than just docs...
Yes, the title will be changed accordingly.
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The resctrl default allocation is introduced in this patch. It refers to the root directory (/sys/fs/resctrl), which is created immediately after mounting, owns all the tasks and CPUs in the system, and can make full use of all resources.
It does not intentionally allocate any dedicated amount of resource, either cache or memory bandwidth, for default allocation.
If a system task has no resource control applied but you want to know the task's cache or memory bandwidth utilization, the default allocation is meaningful. We create the resctrl monitor under the default allocation for this kind of task.
Refactor the schemas, docs and APIs to create a default cache allocation by allowing a <cachetune> with no <cache> element.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/conf/domain_conf.c | 32 +++++++++++++++++++------------- src/util/virresctrl.c | 27 +++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 17 deletions(-)
How would this look in XML output - supply a <cachetune> w/o the <cache> element? Probably also need the <monitor> there as well in at least one entry just prove it works.
If no <monitor> and no <cache> is parsed from the XML, the <cachetune> element will not be shown. The <cachetune> element can only be seen if there is at least one <cache> or <monitor> element.
In a way this has become obvious as I've gone through things, but after really thinking through 13 patches, I'm not sure it matters if a <cachetune> entry exists without <cache> or <monitor>. It does nothing and can be documented that way. Far too much work and effort goes into concerning ourselves with concepts that in the end don't seem to matter and perhaps would never be done. But if they are done (e.g. <cachetune> without <cache> and <monitor>), so what it does nothing and can be documented that way.
The following layout will not appear in the XML output:
<cputune> ... <cachetune vcpus='0'/> </cputune>
The resctrl monitor has not been introduced as of this patch. To make the purpose of this patch clearer, let me use an example to illustrate what will happen after the whole series is applied.
Suppose we are going to create a monitor over vcpu 0 to get its cache utilization, and we haven't created any cache allocation for vcpu 0. This could happen if the user wants to know the cache usage of a specific vcpu but doesn't want to allocate any dedicated amount of cache for it.
The XML layout would be:
<cputune> <cachetune vcpus='0'> <monitor level='3' vcpus='0'/> </cachetune> </cputune>
To support the above XML layout in future patches, we need to append a virDomainResctrlDef element to @(virDomainDefPtr*)def->resctrls even if virDomainResctrlDef.alloc is empty. This patch changes the code to implement this, and it also avoids creating any cache allocation if @def->resctrls[i]->alloc is set to NULL.
If someone has a <cachetune> element, then they get an resctrl->alloc. If it doesn't have <cache> elements (all that matters at this point), who cares.
Also, the monitor specified in the above configuration will be created under '/sys/fs/resctrl/mon_groups'.
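As a rough illustration of where such a monitor's data would live, the sketch below builds the path of the L3 occupancy file for a monitoring group under the default group. The 'mon_data/mon_L3_XX/llc_occupancy' layout follows the kernel resctrl documentation as I understand it; the function name and the group name used in the usage note are hypothetical and are not what libvirt actually generates.

```c
#include <stdio.h>

/* Build the llc_occupancy path for monitoring group @group and cache bank
 * @cache_id under the default (root) resctrl group.
 * Returns 0 on success, -1 if @buf is too small or formatting fails. */
static int
exampleMonitorOccupancyPath(char *buf, size_t buflen,
                            const char *group, unsigned int cache_id)
{
    int n = snprintf(buf, buflen,
                     "/sys/fs/resctrl/mon_groups/%s/mon_data/mon_L3_%02u/llc_occupancy",
                     group, cache_id);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With a made-up group name such as "vm1-vcpus_0" and cache id 0, the function would produce /sys/fs/resctrl/mon_groups/vm1-vcpus_0/mon_data/mon_L3_00/llc_occupancy.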
This piece of code has been refactored several times in this patch and subsequent patches. Each patch works and does not break the functionality already implemented, but the resctrl monitor functionality only works after the whole series has been applied.
It seems <memorytune> entries have their own unique "back door" of sorts calling virResctrlAllocNew when there are no <cachetune> entries.
Yes. memorytune creates a separate entry in <cputune>.
Similar to what happens if there were entries cachetune for vcpus of "0-1" and "2-5", but nothing for "6-7". The assumption always being <memorytune> read after <cachetune> and as long as there's no overlap, just create the <memorytune> entry, by supplying a <cachetune> entry without <cache> entries.
I do not fully understand the above paragraph. I suppose your 'cachetune' entry means a corresponding element in the @def->resctrls array.
This probably crossed boundaries, but the point was if <memorytune> didn't find a <cachetune> for the 'vcpus' it had, then it creates one. But this patch goes through a few handstands to not create ->alloc when as I've come to realize later it really doesn't seem to need to do.
Up to this patch, it is not allowed to append the virDomainResctrlDef element to @def->resctrls if there is no <cache> element.
Later, a virDomainResctrlDef element could only be appended if there exists at least one <cache> element or one <monitor>.
supplying a <cachetune> entry without <cache> entries.
A <cachetune> entry without <cache> entries and, at the same time, without <monitor> entries is not permitted here.
It's a little awkward to read, but now makes me think about all the existing/strange linkages. In a way I suppose having no <cachetune> entries is proven to work by tests/genericxml2xmlindata/memorytune.xml. The reality is they get created, but without visibility.
What is created without visibility? I do not understand.
It's that memorytune path I noted above. Nothing is ever saved with them, but they do exist 'internally' (virDomainMemorytuneDefParse calls virResctrlAllocNew and will eventually virDomainResctrlAppend because virResctrlAllocIsEmpty is false since memorytune would have a @bandwidth (and virResctrlAllocSetMemoryBandwidth fills in alloc->mem_bw).
If there are no <cachetune> entries, no virDomainResctrlDef element will be appended to @def->resctrls.
Up to this patch, no <cachetune> entry is created by a later invocation of virDomainCachetuneDefFormat, and no element is appended to @def->resctrls during XML parsing, even with the following XML configuration:
<cputune> ... <cachetune vcpus='0-1'/> </cputune>
Even after the whole patch series is integrated, the above XML configuration will not create any @def->resctrls elements when the configuration file is parsed, nor any <cachetune> XML element in a later call of virDomainCachetuneDefFormat.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 8189959..b1651e3 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -943,8 +943,8 @@ <dl> <dt><code>cache</code></dt> <dd> - This element controls the allocation of CPU cache and has the - following attributes: + This optional element controls the allocation of CPU cache and has + the following attributes: <dl> <dt><code>level</code></dt> <dd> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 099a949..5c533d6 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -956,7 +956,7 @@ <attribute name="vcpus"> <ref name='cpuset'/> </attribute> - <oneOrMore> + <zeroOrMore> <element name="cache"> <attribute name="id"> <ref name='unsignedInt'/> @@ -980,7 +980,7 @@ </attribute> </optional> </element> - </oneOrMore> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9911d56..b77680e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19002,22 +19002,27 @@
virDomainCachetuneDefParse(virDomainDefPtr def,
goto cleanup; }
- if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) - goto cleanup; - - if (!alloc) { - alloc = virResctrlAllocNew(); - if (!alloc) + /* If 'n' equals 0, then no <cache> element found in <cachetune>, + * this means it is a default alloction. For default allocation,
s/alloction/allocation
My mistake. Will be fixed.
+ * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
Not sure what ^^ this was...
Should be @virDomainResctrlDefPtr->alloc.
+ if (n != 0) { + if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) goto cleanup; - } else { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Identical vcpus in cachetunes found")); - goto cleanup; - }
diff is perhaps easier to read if you:
if (n == 0) { ret = 0; goto cleanup; }
then none of the indent is needed for n != 0
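The early-exit shape suggested above can be sketched in isolation as follows. This is a minimal illustration with hypothetical names (exampleParseCachetune, and a counter standing in for the per-<cache> parsing), not the code in domain_conf.c.

```c
/* Hypothetical parser skeleton: @n is the number of <cache> children found,
 * @parsed receives how many of them were processed. */
static int
exampleParseCachetune(int n, int *parsed)
{
    int ret = -1;
    int i;

    *parsed = 0;

    /* A negative count models a failed node lookup. */
    if (n < 0)
        goto cleanup;

    /* Early exit: nothing to parse is not an error, and the code below
     * no longer needs an extra level of indentation for n != 0. */
    if (n == 0) {
        ret = 0;
        goto cleanup;
    }

    for (i = 0; i < n; i++)
        (*parsed)++;

    ret = 0;
 cleanup:
    return ret;
}
```

The trade-off raised in the reply still applies: once other child elements also have to be parsed after the <cache> loop, returning as soon as n == 0 would skip them.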
Your suggested change works here, but it will conflict with later logic that I introduce in patch 13.
Yeah about where I stopped and started thinking when a <cachetune> is seen we create the @alloc
This part of the code will be modified in a later patch (patch 13 of 19), which adds code to parse the monitor configuration. At that point, letting this function return without error when n == 0 is no longer reasonable logic; it needs to check whether <monitor> entries exist under the <cachetune> entry.
The code is pasted here for your reference:
    if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) {
        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                       _("Cannot extract cache nodes under cachetune"));
        goto cleanup;
    }

    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
     * this means it is a default alloction. For default allocation,
     * @virDomainResctrlDefPtr->alloc is set to NULL */
    if (n != 0) {
But from here...
        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
            goto cleanup;

        if (!alloc) {
            alloc = virResctrlAllocNew();
            if (!alloc)
                goto cleanup;
        } else {
            virReportError(VIR_ERR_XML_ERROR, "%s",
                           _("Identical vcpus in cachetunes found"));
            goto cleanup;
        }
...to here has nothing to do with whether <cache> elements exist, so why would we restrict creation of @alloc based on whether <cache> entries existed. So what if no <cache> entries exist. This is perhaps less "default allocation" and more don't require <cache> elements. In that case, ... <whatever it means>... Later we're going to add <monitor> elements and they don't require <cache> elements, so having ->alloc predicated on whether <cache> entries exists complicates a lot of code. Simplify things.
        for (i = 0; i < n; i++) {
            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
                goto cleanup;
        }
    }

    resctrl = virDomainResctrlNew(alloc, vcpus);
    if (!resctrl)
        goto cleanup;

    if (virDomainResctrlMonDefParse(def, ctxt, node,
                                    VIR_RESCTRL_MONITOR_TYPE_CACHE,
                                    resctrl) < 0)
        goto cleanup;

    if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
--> If there is no 'cache' entry and no 'monitor' entry in the current <cachetune> entry, the code reaches this place, the function returns normally without error, and virDomainResctrlAppend is not executed.
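The decision described here reduces to a single predicate. The sketch below models it with hypothetical types (ExampleCachetune, exampleShouldAppend) and treats "no <cache> children" as the allocation being empty, which is an assumption made only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one parsed <cachetune> element. */
typedef struct {
    size_t ncaches;     /* number of parsed <cache> children */
    size_t nmonitors;   /* number of parsed <monitor> children */
} ExampleCachetune;

/* Mirrors the quoted condition
 *     if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors)
 * which skips the append. Returns true when the entry should be appended. */
static bool
exampleShouldAppend(const ExampleCachetune *ct)
{
    bool alloc_is_empty = (ct->ncaches == 0);

    return !(alloc_is_empty && ct->nmonitors == 0);
}
```

So a <cachetune> with only a <monitor> child is kept, while one with neither child is silently dropped.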
But a <memorytune> can allocate and insert too. As I pointed out later I think the ResctrlNew and ResctrlAppend don't need to be separate either.
        ret = 0;
        goto cleanup;
    }

    if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
        goto cleanup;

    resctrl = NULL;
    ret = 0;
 cleanup:
    ctxt->node = oldnode;
    virDomainResctrlDefFree(resctrl);
    virObjectUnref(alloc);
    virBitmapFree(vcpus);
    VIR_FREE(nodes);
    return ret;
- for (i = 0; i < n; i++) { - if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + if (!alloc) { + alloc = virResctrlAllocNew(); + if (!alloc) + goto cleanup; + } else { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Identical vcpus in cachetunes found")); goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } }
if (virResctrlAllocIsEmpty(alloc)) { @@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) goto cleanup; +
Superfluous
This blank line was included by mistake and will be removed :)
vcpus = NULL; alloc = NULL;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index df5b512..697424c 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * in case there is no allocation for that particular cache allocation (level, * cache, ...) or memory allocation for particular node). * + * Resctrl file system root directory, /sys/fs/sysctrl/, is called + the default + * allocation, which is created, immediately after mounting, owns all + the + * tasks and cpus in the system and can make full use of all resources. + * * =====Cache allocation technology (CAT)===== * * Since one allocation can be made for caches on different levels, the first
I assume you searched on:
virResctrlAllocGetType (w/ callers:) virResctrlAllocUpdateMask virResctrlAllocUpdateSize virResctrlAllocCopyMasks
It's kind of "painful" to back trace all the callers and determine if any/each of them does the if (!alloc) check "originally" somewhere. I took a quick look and they seem OK
Yes, and I also double-checked these functions.
I am focusing on cache monitor entries in these patches; they will be checked further when the memoryBW monitor is introduced later.
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, unsigned int cache, unsigned long long size) { + if (!alloc) + return 0; + if (virResctrlAllocCheckCollision(alloc, level, type, cache)) { virReportError(VIR_ERR_XML_ERROR, _("Colliding cache allocations for cache " @@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, { virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
^^ This wouldn't have been too happy would it if alloc == NULL; however,
+ if (!alloc) + return 0; +
I don't think it'll matter since the only caller is virDomainMemorytuneDefParse which will allocate an @alloc if one didn't exist *and* pass that through to here, so this check shouldn't be necessary.
I didn't realize that @alloc had already been used! I will remove the pointer check for @alloc.
And yes, this is where/why I think you shouldn't have a concept of default allocation... It should just be an allocation (aka cachetune) without specific cache entries. Later that allows monitor entries, but it may allow something else now, who knows. Again, as I pointed out you can have a domain with <memorytune> only entries and have the same "thing", so there's absolutely no reason to not just allow <cachetune> without <cache>.
In researching this I realized that although we have a memorytune-colliding- allocs.xml and memorytune.xml, there is no <memorytune> example that includes <cachetune> entries as well.
Let me add a test for this case in next update.
if (memory_bandwidth > 100) { virReportError(VIR_ERR_XML_ERROR, "%s", _("Memory Bandwidth value exceeding 100 is invalid.")); @@ -1304,6 +1314,11 @@ int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id) { + /* If passed a default allocation in, @alloc will be NULL. This is + * a valid case, return normally. */
Will remove the comment. I'll try to add this comment to the CAT and MBA comments.
This is the only one to get that type of comment... Probably something that should instead be more clearly indicated perhaps in the CAT and MBA comments at the top of the module.
+ if (!alloc) + return 0; + if (!id) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Resctrl allocation 'id' cannot be NULL")); @@ -1317,6 +1332,9 @@ virResctrlAllocSetID(virResctrlAllocPtr alloc, const char * virResctrlAllocGetID(virResctrlAllocPtr alloc) { + if (!alloc) + return NULL; +
Probably need to consider current callers... I see that both virDomainCachetuneDefFormat and virDomainMemorytuneDefFormat would return -1 for some unknown reason. Although perhaps the latter would work fine since it'd create its own @alloc if resctrl->alloc == NULL.
Hence why I asked for an XML example above.
There is a subsequent patch, patch 16, handling this.
I never got there, heap overflowed stack. I'm currently of the opinion that all of the "if (!alloc)" checks won't be necessary if you create the resctrl->alloc once a <cachetune> entry is seen. Similarly, if a <memorytune> references 'vcpus' that don't already have a <cachetune> entry, then one is created - whether it's formatted is a different story (currently it's not, which is fine). I think that'll simplify things. John
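To make the proposal above concrete, here is a small model of "always create the alloc when a <cachetune> is seen, and let <memorytune> reuse or create one". Everything in it is hypothetical: the Example* types, the fixed-size array, and the use of a plain bitmask with exact-match lookup in place of libvirt's vcpu bitmap handling.

```c
#include <stdlib.h>

#define EXAMPLE_MAX_RESCTRLS 8

typedef struct {
    size_t ncaches;        /* number of <cache> children; 0 is allowed */
    unsigned int mem_bw;   /* bandwidth percentage from <memorytune>, 0 if unset */
} ExampleAlloc;

typedef struct {
    unsigned long vcpus;   /* vcpu set as a bitmask */
    ExampleAlloc *alloc;   /* never NULL once the entry exists */
} ExampleResctrl;

typedef struct {
    ExampleResctrl items[EXAMPLE_MAX_RESCTRLS];
    size_t n;
} ExampleDef;

/* Look up the entry whose vcpu set is exactly @vcpus. */
static ExampleResctrl *
exampleFind(ExampleDef *def, unsigned long vcpus)
{
    size_t i;

    for (i = 0; i < def->n; i++) {
        if (def->items[i].vcpus == vcpus)
            return &def->items[i];
    }
    return NULL;
}

/* <cachetune>: the alloc is created unconditionally, even if ncaches == 0.
 * Returns NULL on duplicate vcpus, a full table, or allocation failure. */
static ExampleResctrl *
exampleAddCachetune(ExampleDef *def, unsigned long vcpus, size_t ncaches)
{
    ExampleResctrl *r;
    ExampleAlloc *alloc;

    if (def->n == EXAMPLE_MAX_RESCTRLS || exampleFind(def, vcpus))
        return NULL;

    if (!(alloc = calloc(1, sizeof(*alloc))))
        return NULL;
    alloc->ncaches = ncaches;

    r = &def->items[def->n++];
    r->vcpus = vcpus;
    r->alloc = alloc;
    return r;
}

/* <memorytune>: reuse the entry for the same vcpus, or create one. */
static ExampleResctrl *
exampleAddMemorytune(ExampleDef *def, unsigned long vcpus, unsigned int bw)
{
    ExampleResctrl *r = exampleFind(def, vcpus);

    if (!r && !(r = exampleAddCachetune(def, vcpus, 0)))
        return NULL;

    r->alloc->mem_bw = bw;
    return r;
}

/* Release every alloc owned by @def. */
static void
exampleDefClear(ExampleDef *def)
{
    size_t i;

    for (i = 0; i < def->n; i++)
        free(def->items[i].alloc);
    def->n = 0;
}
```

Since every entry owns a non-NULL alloc in this model, none of the helper functions needs an "if (!alloc)" guard, which is the simplification being argued for.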
Up to now no monitor has been introduced, so there is no way to pass in an empty @alloc and this code will not cause any trouble.
In patch 16, a 'virDomainResctrl.id' is introduced, and later code will use 'virDomainResctrlDef.id' to track the id of cachetune. I did this, because I have extended the scope of virDomainResctrlDef in later patches by adding monitors, and one virDomainResctrlDef is the abstraction of one <cachetune> entry, so logically 'id' of <cachetune> should be kept in structure virDomainResctrlDef.
return alloc->id; }
@@ -2209,6 +2227,9 @@ int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) { + if (!alloc) + return 0; + if (!alloc->id) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Resctrl Allocation ID must be set before creation")); @@ -2302,6 +2323,9 @@ virResctrlAllocAddPID(virResctrlAllocPtr
alloc,
char *pidstr = NULL; int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Cannot add pid to non-existing resctrl allocation")); @@ -2334,6 +2358,9 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) return 0;
These two could be combined
Will be combined.
John
Thanks for review. Huaqiang

I think I forgot to reply to this review.
On 10/11/2018 5:28 AM, John Ferlan wrote:
On 10/10/18 9:44 AM, Wang, Huaqiang wrote:
s/docs/conf,util/
It's more than just docs...
Yes, the title will be changed accordingly.
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The resctrl default allocation is introduced in this patch. It refers to the root directory (/sys/fs/resctrl), which is created immediately after mounting, owns all the tasks and CPUs in the system, and can make full use of all resources.
It does not intentionally allocate any dedicated amount of resource, either cache or memory bandwidth, for default allocation.
If a system task has no resource control applied but you want to know the task's cache or memory bandwidth utilization, the default allocation is meaningful. We create the resctrl monitor under the default allocation for this kind of task.
Refactor the schemas, docs and APIs to create a default cache allocation by allowing a <cachetune> with no <cache> element.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/conf/domain_conf.c | 32 +++++++++++++++++++------------- src/util/virresctrl.c | 27 +++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 17 deletions(-)
How would this look in XML output - supply a <cachetune> w/o the <cache> element? Probably also need the <monitor> there as well in at least one entry just prove it works.
If no <monitor> and no <cache> is parsed from the XML, the <cachetune> element will not be shown. The <cachetune> element can only be seen if there is at least one <cache> or <monitor> element.
In a way this has become obvious as I've gone through things, but after really thinking through 13 patches, I'm not sure it matters if a <cachetune> entry exists without <cache> or <monitor>. It does nothing and can be documented that way. Far too much work and effort goes into concerning ourselves with concepts that in the end don't seem to matter and perhaps would never be done. But if they are done (e.g. <cachetune> without <cache> and <monitor>), so what it does nothing and can be documented that way.
The following layout will not appear in the XML output:
<cputune> ... <cachetune vcpus='0'/> </cputune>
The resctrl monitor has not been introduced as of this patch. To make the purpose of this patch clearer, let me use an example to illustrate what will happen after the whole series is applied.
Suppose we are going to create a monitor over vcpu 0 to get its cache utilization, and we haven't created any cache allocation for vcpu 0. This could happen if the user wants to know the cache usage of a specific vcpu but doesn't want to allocate any dedicated amount of cache for it.
The XML layout would be:
<cputune> <cachetune vcpus='0'> <monitor level='3' vcpus='0'/> </cachetune> </cputune>
To support the above XML layout in future patches, we need to append a virDomainResctrlDef element to @(virDomainDefPtr*)def->resctrls even if virDomainResctrlDef.alloc is empty. This patch changes the code to implement this, and it also avoids creating any cache allocation if @def->resctrls[i]->alloc is set to NULL.
If someone has a <cachetune> element, then they get an resctrl->alloc. If it doesn't have <cache> elements (all that matters at this point), who cares.
So your suggestion is to create @resctrl->alloc whenever a <cachetune> is seen, while my solution is to not create it, leaving it as NULL, if an empty <cachetune> element is found. Your suggestion should work as long as we do not let it create any actual directory in '/sys/fs/resctrl'.
Also, the monitor specified in the above configuration will be created under '/sys/fs/resctrl/mon_groups'.
This piece of code has been refactored several times in this patch and subsequent patches. Each patch works and does not break the functionality already implemented, but the resctrl monitor functionality only works after the whole series has been applied.
It seems <memorytune> entries have their own unique "back door" of sorts calling virResctrlAllocNew when there are no <cachetune> entries.
Yes. memorytune creates a separate entry in <cputune>.
Similar to what happens if there were entries cachetune for vcpus of "0-1" and "2-5", but nothing for "6-7". The assumption always being <memorytune> read after <cachetune> and as long as there's no overlap, just create the <memorytune> entry, by supplying a <cachetune> entry without <cache> entries.
I do not fully understand the above paragraph. I suppose your 'cachetune' entry means a corresponding element in the @def->resctrls array.
This probably crossed boundaries, but the point was if <memorytune> didn't find a <cachetune> for the 'vcpus' it had, then it creates one. But this patch goes through a few handstands to not create ->alloc when as I've come to realize later it really doesn't seem to need to do.
The boundary check between the vcpus of <cachetune> and <memorytune> should be considered. As stated, I will create resctrl->alloc for every <cachetune> element.
Up to this patch, it is not allowed to append the virDomainResctrlDef element to @def->resctrls if there is no <cache> element.
Later, a virDomainResctrlDef element could only be appended if there exists at least one <cache> element or one <monitor>.
supplying a <cachetune> entry without <cache> entries.
A <cachetune> entry without <cache> entries and, at the same time, without <monitor> entries is not permitted here.
It's a little awkward to read, but now makes me think about all the existing/strange linkages. In a way I suppose having no <cachetune> entries is proven to work by tests/genericxml2xmlindata/memorytune.xml. The reality is they get created, but without visibility.
What is created without visibility? I do not understand.
It's that memorytune path I noted above. Nothing is ever saved with them, but they do exist 'internally' (virDomainMemorytuneDefParse calls virResctrlAllocNew and will eventually virDomainResctrlAppend because virResctrlAllocIsEmpty is false since memorytune would have a @bandwidth (and virResctrlAllocSetMemoryBandwidth fills in alloc->mem_bw).
This is because memorytune and cachetune share the same data structure, 'virResctrlAlloc'. Both are called an 'allocation' but refer to different sub-fields. From the interface point of view, cachetune and memorytune work independently.
If there are no <cachetune> entries, no virDomainResctrlDef element will be appended to @def->resctrls.
Up to this patch, no <cachetune> entry is created by a later invocation of virDomainCachetuneDefFormat, and no element is appended to @def->resctrls during XML parsing, even with the following XML configuration:
<cputune> ... <cachetune vcpus='0-1'/> </cputune>
Even after the whole patch series is integrated, the above XML configuration will not create any @def->resctrls elements when the configuration file is parsed, nor any <cachetune> XML element in a later call of virDomainCachetuneDefFormat.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 8189959..b1651e3 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -943,8 +943,8 @@ <dl> <dt><code>cache</code></dt> <dd> - This element controls the allocation of CPU cache and has the - following attributes: + This optional element controls the allocation of CPU cache and has + the following attributes: <dl> <dt><code>level</code></dt> <dd> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 099a949..5c533d6 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -956,7 +956,7 @@ <attribute name="vcpus"> <ref name='cpuset'/> </attribute> - <oneOrMore> + <zeroOrMore> <element name="cache"> <attribute name="id"> <ref name='unsignedInt'/> @@ -980,7 +980,7 @@ </attribute> </optional> </element> - </oneOrMore> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9911d56..b77680e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; }
- if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) - goto cleanup; - - if (!alloc) { - alloc = virResctrlAllocNew(); - if (!alloc) + /* If 'n' equals 0, then no <cache> element found in <cachetune>, + * this means it is a default alloction. For default allocation,
s/alloction/allocation
My mistake. Will be fixed.
+ * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
Not sure what ^^ this was...
Should be @virDomainResctrlDefPtr->alloc.
+ if (n != 0) { + if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) goto cleanup; - } else { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Identical vcpus in cachetunes found")); - goto cleanup; - }
diff is perhaps easier to read if you:
if (n == 0) { ret = 0; goto cleanup; }
then none of the indent is needed for n != 0
Your suggested change works here, but it will conflict with later logic that I introduce in patch 13.
Yeah about where I stopped and started thinking when a <cachetune> is seen we create the @alloc
This part does not need to change, since we decided to create @alloc when the code reaches this place.
This part of the code will be modified in a later patch (patch 13 of 19), which adds code to parse the monitor configuration. At that point, letting this function return without error when n == 0 is no longer reasonable logic; it needs to check whether <monitor> entries exist under the <cachetune> entry.
The code is pasted here for your reference:
    if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) {
        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                       _("Cannot extract cache nodes under cachetune"));
        goto cleanup;
    }

    /* If 'n' equals 0, then no <cache> element found in <cachetune>,
     * this means it is a default alloction. For default allocation,
     * @virDomainResctrlDefPtr->alloc is set to NULL */
    if (n != 0) {
But from here...
        if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0)
            goto cleanup;

        if (!alloc) {
            alloc = virResctrlAllocNew();
            if (!alloc)
                goto cleanup;
        } else {
            virReportError(VIR_ERR_XML_ERROR, "%s",
                           _("Identical vcpus in cachetunes found"));
            goto cleanup;
        }
...to here has nothing to do with whether <cache> elements exist, so why would we restrict creation of @alloc based on whether <cache> entries existed. So what if no <cache> entries exist.
This is perhaps less "default allocation" and more don't require <cache> elements. In that case, ... <whatever it means>... Later we're going to add <monitor> elements and they don't require <cache> elements, so having ->alloc predicated on whether <cache> entries exists complicates a lot of code. Simplify things.
Understood. I will remove the 'default allocation' related pieces and create @->alloc.
        for (i = 0; i < n; i++) {
            if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
                goto cleanup;
        }
    }

    resctrl = virDomainResctrlNew(alloc, vcpus);
    if (!resctrl)
        goto cleanup;

    if (virDomainResctrlMonDefParse(def, ctxt, node,
                                    VIR_RESCTRL_MONITOR_TYPE_CACHE,
                                    resctrl) < 0)
        goto cleanup;

    if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
--> If there is no 'cache' entry and no 'monitor' entry in the current <cachetune> entry, the code reaches this place, the function returns normally without error, and virDomainResctrlAppend is not executed.
But a <memorytune> can allocate and insert too.
As I pointed out later I think the ResctrlNew and ResctrlAppend don't need to be separate either.
I can do that, but ResctrlAppend would then need a long parameter list: I would have to pass every element of resctrl into this function, because this is the place where all the information is submitted to @def->resctrls.
        ret = 0;
        goto cleanup;
    }

    if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
        goto cleanup;

    resctrl = NULL;
    ret = 0;
 cleanup:
    ctxt->node = oldnode;
    virDomainResctrlDefFree(resctrl);
    virObjectUnref(alloc);
    virBitmapFree(vcpus);
    VIR_FREE(nodes);
    return ret;
- for (i = 0; i < n; i++) { - if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + if (!alloc) { + alloc = virResctrlAllocNew(); + if (!alloc) + goto cleanup; + } else { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Identical vcpus in cachetunes found")); goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } }
if (virResctrlAllocIsEmpty(alloc)) { @@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
     if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
         goto cleanup;
+

Superfluous

This blank line was included by mistake; it will be removed :)
vcpus = NULL; alloc = NULL;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index df5b512..697424c 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * in case there is no allocation for that particular cache allocation (level, * cache, ...) or memory allocation for particular node). * + * Resctrl file system root directory, /sys/fs/sysctrl/, is called + the default + * allocation, which is created, immediately after mounting, owns all + the + * tasks and cpus in the system and can make full use of all resources. + * * =====Cache allocation technology (CAT)===== * * Since one allocation can be made for caches on different levels, the first
I assume you searched on:
virResctrlAllocGetType (w/ callers:) virResctrlAllocUpdateMask virResctrlAllocUpdateSize virResctrlAllocCopyMasks
It's kind of "painful" to back trace all the callers and determine if any/each of them does the if (!alloc) check "originally" somewhere. I took a quick look and they seem OK.

Yes, and I also double-checked these functions.
I am focusing on cache monitor entries in these patches; these callers will be checked further when the memory bandwidth monitor is introduced later.
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, unsigned int cache, unsigned long long size) { + if (!alloc) + return 0; + if (virResctrlAllocCheckCollision(alloc, level, type, cache)) { virReportError(VIR_ERR_XML_ERROR, _("Colliding cache allocations for cache " @@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, { virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
^^ This wouldn't have been too happy would it if alloc == NULL; however,
+    if (!alloc)
+        return 0;
+

I don't think it'll matter since the only caller is virDomainMemorytuneDefParse which will allocate an @alloc if one didn't exist *and* pass that through to here, so this check shouldn't be necessary.

I didn't realize that @alloc had already been used! Will remove the pointer check for @alloc.
And yes, this is where/why I think you shouldn't have a concept of default allocation... It should just be an allocation (aka cachetune) without specific cache entries. Later that allows monitor entries, but it may allow something else now, who knows.
Will remove 'default allocation'.
Again, as I pointed out you can have a domain with <memorytune> only entries and have the same "thing", so there's absolutely no reason to not just allow <cachetune> without <cache>.
In researching this I realized that although we have a memorytune-colliding-allocs.xml and memorytune.xml, there is no <memorytune> example that includes <cachetune> entries as well.

Let me add a test for this case in the next update.
if (memory_bandwidth > 100) { virReportError(VIR_ERR_XML_ERROR, "%s", _("Memory Bandwidth value exceeding 100 is invalid.")); @@ -1304,6 +1314,11 @@ int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id) { + /* If passed a default allocation in, @alloc will be NULL. This is + * a valid case, return normally. */
Will remove the comment. I'll try to add this comment to the CAT and MBA comments.
This is the only one to get that type of comment... Probably something that should instead be more clearly indicated perhaps in the CAT and MBA comments at the top of the module.
+    if (!alloc)
+        return 0;
+
     if (!id) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("Resctrl allocation 'id' cannot be NULL"));
@@ -1317,6 +1332,9 @@ virResctrlAllocSetID(virResctrlAllocPtr alloc,
 const char *
 virResctrlAllocGetID(virResctrlAllocPtr alloc)
 {
+    if (!alloc)
+        return NULL;
+

Probably need to consider current callers... I see that both virDomainCachetuneDefFormat and virDomainMemorytuneDefFormat would return -1 for some unknown reason. Although perhaps the latter would work fine since it'd create its own @alloc if resctrl->alloc == NULL.

Hence why I asked for an XML example above.

There is a subsequent patch, patch 16, handling this.

I never got there, heap overflowed stack.
I'm currently of the opinion that all of the "if (!alloc)" checks won't be necessary if you create the resctrl->alloc once a <cachetune> entry is seen. Similarly, if a <memorytune> references 'vcpus' that don't already have a <cachetune> entry, then one is created - whether it's formatted is a different story (currently it's not, which is fine). I think that'll simplify things.
Yes. In this case the 'if (!alloc)' checks are meaningless and will be removed.
John
Thanks for review and suggestions! Huaqiang
Up to now, no monitor has been introduced, so there is no way to pass in an empty @alloc, and the code will not cause any trouble.
In patch 16, a 'virDomainResctrlDef.id' is introduced, and later code uses 'virDomainResctrlDef.id' to track the id of a cachetune. I did this because later patches extend the scope of virDomainResctrlDef by adding monitors, and since one virDomainResctrlDef is the abstraction of one <cachetune> entry, the 'id' of a <cachetune> logically belongs in the virDomainResctrlDef structure.
return alloc->id; }
@@ -2209,6 +2227,9 @@ int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) { + if (!alloc) + return 0; + if (!alloc->id) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Resctrl Allocation ID must be set before creation")); @@ -2302,6 +2323,9 @@ virResctrlAllocAddPID(virResctrlAllocPtr
alloc,
char *pidstr = NULL; int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Cannot add pid to non-existing resctrl allocation")); @@ -2334,6 +2358,9 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0;
+ if (!alloc) + return 0; + if (!alloc->path) return 0;
These two could be combined

Will be combined.
John
Thanks for review. Huaqiang

Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task. This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 7 ++++++ 3 files changed, 64 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorNew; # util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 697424c..18ee560 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; static virClassPtr virResctrlAllocClass; +static virClassPtr virResctrlMonitorClass; /* virResctrlInfo */ @@ -319,6 +320,35 @@ struct _virResctrlAlloc { char *path; }; +/* virResctrlMonitor */ + +/* + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl + * monitor represents a resctrl monitoring group, which can be used to + * monitor the resource utilization information for either cache or + * memory bandwidth. + * + * From hardware perspective, cache monitoring technology (CMT), memory + * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal + * features. The monitor will be created under the scope of default allocation + * if no CAT or MBA supported in the system. + */ +struct _virResctrlMonitor { + virObject parent; + + /* In resctrl, each monitor is associated with one specific allocation, + * either the allocation under /sys/fs/resctrl or the default allocation. 
+ * If this pointer is NULL, then the monitor will be associated with + * default allocation, otherwise, this pointer points to the allocation + * this monitor associated with. */ + virResctrlAllocPtr alloc; + /* The monitor identifier */ + char *id; + /* libvirt-generated path in /sys/fs/resctrl for this particular + * monitor */ + char *path; +}; + static void virResctrlAllocDispose(void *obj) @@ -368,6 +398,17 @@ virResctrlAllocDispose(void *obj) } +static void +virResctrlMonitorDispose(void *obj) +{ + virResctrlMonitorPtr monitor = obj; + + virObjectUnref(monitor->alloc); + VIR_FREE(monitor->id); + VIR_FREE(monitor->path); +} + + /* Global initialization for classes */ static int virResctrlOnceInit(void) @@ -378,6 +419,9 @@ virResctrlOnceInit(void) if (!VIR_CLASS_NEW(virResctrlAlloc, virClassForObject())) return -1; + if (!VIR_CLASS_NEW(virResctrlMonitor, virClassForObject())) + return -1; + return 0; } @@ -2372,3 +2416,15 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) return ret; } + + +/* virResctrlMonitor-related definitions */ + +virResctrlMonitorPtr +virResctrlMonitorNew(void) +{ + if (virResctrlInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlMonitorClass); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 10505e9..f59a9aa 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -185,4 +185,11 @@ int virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, const char *prefix, virResctrlInfoMonPtr *monitor); + +/* Monitor-related things */ +typedef struct _virResctrlMonitor virResctrlMonitor; +typedef virResctrlMonitor *virResctrlMonitorPtr; + +virResctrlMonitorPtr +virResctrlMonitorNew(void); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4
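The ownership rule in this patch is that virResctrlMonitorDispose() drops the monitor's reference on monitor->alloc. The following is only a rough standalone sketch of that pattern, using plain C reference counts instead of libvirt's virObject. Every name in it (struct alloc, struct monitor, monitor_new() and so on) is hypothetical, and having monitor_new() take the allocation as a parameter is an assumption made for illustration; the patch's virResctrlMonitorNew() takes no arguments.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for virResctrlAlloc: only the refcount and path. */
struct alloc {
    int refs;
    char *path;
};

/* Hypothetical stand-in for virResctrlMonitor. @alloc may be NULL, in
 * which case the monitor is not tied to a specific allocation. */
struct monitor {
    int refs;
    struct alloc *alloc;
    char *id;
    char *path;
};

struct alloc *
alloc_new(void)
{
    struct alloc *a = calloc(1, sizeof(*a));

    if (a)
        a->refs = 1;
    return a;
}

struct alloc *
alloc_ref(struct alloc *a)
{
    if (a)
        a->refs++;
    return a;
}

void
alloc_unref(struct alloc *a)
{
    if (!a)
        return;
    assert(a->refs > 0);
    if (--a->refs > 0)
        return;
    free(a->path);
    free(a);
}

/* Assumption: the monitor takes its own reference on @alloc at creation. */
struct monitor *
monitor_new(struct alloc *alloc)
{
    struct monitor *m = calloc(1, sizeof(*m));

    if (!m)
        return NULL;
    m->refs = 1;
    m->alloc = alloc_ref(alloc);
    return m;
}

/* Mirrors the dispose step: release the allocation reference, then the
 * monitor's own strings. */
void
monitor_unref(struct monitor *m)
{
    if (!m)
        return;
    assert(m->refs > 0);
    if (--m->refs > 0)
        return;
    alloc_unref(m->alloc);
    free(m->id);
    free(m->path);
    free(m);
}
```

With this shape, the caller can drop its own reference to the allocation right after creating the monitor, and the allocation stays alive until the monitor is released.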

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task.
This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 7 ++++++ 3 files changed, 64 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorNew;
# util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 697424c..18ee560 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; static virClassPtr virResctrlAllocClass; +static virClassPtr virResctrlMonitorClass;
/* virResctrlInfo */ @@ -319,6 +320,35 @@ struct _virResctrlAlloc { char *path; };
+/* virResctrlMonitor */ + +/* + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl + * monitor represents a resctrl monitoring group, which can be used to + * monitor the resource utilization information for either cache or + * memory bandwidth. + * + * From hardware perspective, cache monitoring technology (CMT), memory + * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal + * features. The monitor will be created under the scope of default allocation + * if no CAT or MBA supported in the system.
"if no specific CAT or MBA entries are provided for the guest" The rest seems reasonable at least for now, so Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:08 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 02/19] util: Introduce resctrl monitor for CMT
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task.
This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 7 ++++++ 3 files changed, 64 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorNew;
# util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 697424c..18ee560 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; static virClassPtr virResctrlAllocClass; +static virClassPtr virResctrlMonitorClass;
/* virResctrlInfo */ @@ -319,6 +320,35 @@ struct _virResctrlAlloc { char *path; };
+/* virResctrlMonitor */ + +/* + * virResctrlMonitor is the data structure for resctrl monitor. +Resctrl + * monitor represents a resctrl monitoring group, which can be used +to + * monitor the resource utilization information for either cache or + * memory bandwidth. + * + * From hardware perspective, cache monitoring technology (CMT), +memory + * bandwidth technology (MBM), as well as the CAT and MBA, are all +orthogonal + * features. The monitor will be created under the scope of default +allocation + * if no CAT or MBA supported in the system.
"if no specific CAT or MBA entries are provided for the guest"
OK.
The rest seems reasonable at least for now, so
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
[...]
Thanks for review. Huaqiang

The code of adding PID to the allocation could be reused, refactor it for later reusing.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/util/virresctrl.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 18ee560..41c7e51 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2359,24 +2359,21 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
 }
 
 
-int
-virResctrlAllocAddPID(virResctrlAllocPtr alloc,
-                      pid_t pid)
+static int
+virResctrlAddPID(const char *path,
+                 pid_t pid)
 {
     char *tasks = NULL;
     char *pidstr = NULL;
     int ret = 0;
 
-    if (!alloc)
-        return 0;
-
-    if (!alloc->path) {
+    if (!path) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                       _("Cannot add pid to non-existing resctrl allocation"));
+                       _("Cannot add pid to non-existing resctrl group"));
         return -1;
     }
 
-    if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0)
+    if (virAsprintf(&tasks, "%s/tasks", path) < 0)
         return -1;
 
     if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0)
@@ -2398,6 +2395,17 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc,
 
 
 int
+virResctrlAllocAddPID(virResctrlAllocPtr alloc,
+                      pid_t pid)
+{
+    if (!alloc)
+        return 0;
+
+    return virResctrlAddPID(alloc->path, pid);
+}
+
+
+int
 virResctrlAllocRemove(virResctrlAllocPtr alloc)
 {
     int ret = 0;
-- 
2.7.4
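The shape of this refactoring can be pictured with a small standalone sketch: one helper that only knows about a group directory, plus thin per-type wrappers that pass their own path. This is not the libvirt code. The names resctrl_add_pid(), struct alloc_group and struct monitor_group are hypothetical, plain stdio replaces libvirt's file helpers, and the file is opened in append mode purely so the sketch behaves sensibly on an ordinary directory.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-in for the shared helper: write @pid into
 * "<path>/tasks". Returns 0 on success, -1 on error. */
int
resctrl_add_pid(const char *path, pid_t pid)
{
    char tasks[4096];
    FILE *fp;
    int n;

    if (!path) {
        fprintf(stderr, "Cannot add pid to non-existing resctrl group\n");
        return -1;
    }

    n = snprintf(tasks, sizeof(tasks), "%s/tasks", path);
    if (n < 0 || (size_t)n >= sizeof(tasks))
        return -1;

    /* Assumption for the demo: append, so repeated calls accumulate
     * PIDs when @path is a regular directory rather than resctrl. */
    fp = fopen(tasks, "a");
    if (!fp)
        return -1;

    if (fprintf(fp, "%lld\n", (long long)pid) < 0) {
        fclose(fp);
        return -1;
    }

    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical group types: each one only contributes its path. */
struct alloc_group {
    const char *path;
};

struct monitor_group {
    const char *path;
};

int
alloc_add_pid(struct alloc_group *a, pid_t pid)
{
    return resctrl_add_pid(a->path, pid);
}

int
monitor_add_pid(struct monitor_group *m, pid_t pid)
{
    return resctrl_add_pid(m->path, pid);
}
```

Because the helper takes a path rather than an allocation object, a monitor wrapper needs no knowledge of allocations at all, which appears to be the point of the split.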

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reusing.
reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 03/19] util: Refactor code for adding PID to the resource group
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reusing.
reuse.
Thanks.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang

Add interface for adding task PID to monitor. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d2573c5..4a52a86 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorAddPID; virResctrlMonitorNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 41c7e51..c9a79f7 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2436,3 +2436,11 @@ virResctrlMonitorNew(void) return virObjectNew(virResctrlMonitorClass); } + + +int +virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, + pid_t pid) +{ + return virResctrlAddPID(monitor->path, pid); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index f59a9aa..cb9bfae 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -192,4 +192,8 @@ typedef virResctrlMonitor *virResctrlMonitorPtr; virResctrlMonitorPtr virResctrlMonitorNew(void); + +int +virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, + pid_t pid); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for adding task PID to monitor.
to the monitor
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+)
Kind of an odd order considering @path gets introduced later, but it's not that big a deal. Reviewed-by: John Ferlan <jferlan@redhat.com> John

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 04/19] util: Add interface for adding PID to monitor
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for adding task PID to monitor.
to the monitor
Will be changed.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+)
Kind of an odd order considering @path gets introduced later, but it's not that big a deal.
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c9a79f7..03001cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *pathparent,
s/pathparent/parentpath
OK.
+ const char *prefix, + const char *id) { + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl resource ID must be set before + creation"));
s/Resctrl resource ID/'%s' resctrl ID/
where %s is @parentpath
Working in the @parentpath, then we'd know which it was Alloc or Monitor
How about changes like these? if (!id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl resource ID must be set before creation")); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl ID must be set before determining resctrl " + "path under '%s'"), + parentpath); return NULL; }
+ return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
parentpath
Will be renamed.
+ return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, if (!alloc) return 0;
- if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Resctrl group path is expected to be + NULL"));
Resctrl alloc group path is already set '%s'
I think this is an internal error not an invalid arg, too
Will be changed. if (alloc->path) { - virReportError(VIR_ERR_INVALID_ARG, "%s", - _("Resctrl group path is expected to be NULL")); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl allocation path is already set to '%s'"), + alloc->path); return -1; }
John
Thanks for review. Huaqiang
return -1; }
- if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, + alloc->id); + if (!alloc->path) return -1;
return 0;

Hi John, my last reply was meant for "[PATCHv5 05/19] util: Refactor code for determining allocation path". Please ignore it. Sorry for the inconvenience. BR Huaqiang
-----Original Message----- From: Wang, Huaqiang Sent: Wednesday, October 10, 2018 9:48 PM To: 'John Ferlan' <jferlan@redhat.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: RE: [libvirt] [PATCHv5 04/19] util: Add interface for adding PID to monitor

The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index c9a79f7..03001cc 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl,
 }
 
 
+static char *
+virResctrlDeterminePath(const char *pathparent,
+                        const char *prefix,
+                        const char *id)
+{
+    char *path = NULL;
+
+    if (!id) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("Resctrl resource ID must be set before creation"));
+        return NULL;
+    }
+
+    if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
+        return NULL;
+
+    return path;
+}
+
+
 int
 virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
                              const char *machinename)
@@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
     if (!alloc)
         return 0;
 
-    if (!alloc->id) {
-        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                       _("Resctrl Allocation ID must be set before creation"));
+    if (alloc->path) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("Resctrl group path is expected to be NULL"));
         return -1;
     }
 
-    if (!alloc->path &&
-        virAsprintf(&alloc->path, "%s/%s-%s",
-                    SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0)
+    alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH,
+                                          machinename,
+                                          alloc->id);
+    if (!alloc->path)
         return -1;
 
     return 0;
-- 
2.7.4
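To make the "<parent>/<prefix>-<id>" path rule in this patch concrete, here is a minimal standalone sketch in plain C. It is not the libvirt implementation: resctrl_determine_path() is a hypothetical name, snprintf() and malloc() stand in for virAsprintf(), and the diagnostic goes to stderr instead of virReportError(). The wording of the error follows the variant discussed in review, which names the parent path so the reader can tell an allocation from a monitor.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<parentpath>/<prefix>-<id>" in freshly
 * allocated memory. Returns NULL when @id is missing or on allocation
 * failure. The caller owns the result and must free() it. */
char *
resctrl_determine_path(const char *parentpath,
                       const char *prefix,
                       const char *id)
{
    char *path;
    int len;

    /* @parentpath and @prefix come from the program itself, so a NULL
     * here is treated as a programming error in this sketch. */
    assert(parentpath && prefix);

    if (!id) {
        fprintf(stderr,
                "Resctrl ID must be set before determining resctrl "
                "path under '%s'\n", parentpath);
        return NULL;
    }

    /* First pass computes the required length, second pass formats. */
    len = snprintf(NULL, 0, "%s/%s-%s", parentpath, prefix, id);
    if (len < 0)
        return NULL;

    path = malloc((size_t)len + 1);
    if (!path)
        return NULL;

    snprintf(path, (size_t)len + 1, "%s/%s-%s", parentpath, prefix, id);
    return path;
}
```

A caller that follows the reviewed version of virResctrlAllocDeterminePath() would first refuse to proceed if its path is already set, and only then store the returned string.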

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c9a79f7..03001cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *pathparent,
s/pathparent/parentpath
+ const char *prefix, + const char *id) +{ + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl resource ID must be set before creation"));
s/Resctrl resource ID/'%s' resctrl ID/ where %s is @parentpath Working in the @parentpath, then we'd know which it was Alloc or Monitor
+ return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
parentpath
+ return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, if (!alloc) return 0;
- if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Resctrl group path is expected to be NULL"));
Resctrl alloc group path is already set '%s' I think this is an internal error not an invalid arg, too John
return -1; }
- if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, + alloc->id); + if (!alloc->path) return -1;
return 0;

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c9a79f7..03001cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *pathparent,
s/pathparent/parentpath
OK.
+ const char *prefix, + const char *id) { + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl resource ID must be set before + creation"));
s/Resctrl resource ID/'%s' resctrl ID/
where %s is @parentpath
Working in the @parentpath, then we'd know which it was Alloc or Monitor
How about changes like these? if (!id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl resource ID must be set before creation")); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl ID must be set before determining resctrl " + "path under '%s'"), + parentpath); return NULL; }
+ return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
parentpath
Will be renamed.
+ return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, if (!alloc) return 0;
- if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Resctrl group path is expected to be + NULL"));
Resctrl alloc group path is already set '%s'
I think this is an internal error not an invalid arg, too
Will be changed.

     if (alloc->path) {
-        virReportError(VIR_ERR_INVALID_ARG, "%s",
-                       _("Resctrl group path is expected to be NULL"));
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Resctrl allocation path is already set to '%s'"),
+                       alloc->path);
         return -1;
     }
John
Thanks for review. Huaqiang
return -1; }
- if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, + alloc->id); + if (!alloc->path) return -1;
return 0;

On 10/10/18 9:55 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c9a79f7..03001cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *pathparent,
s/pathparent/parentpath
OK.
+ const char *prefix, + const char *id) { + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl resource ID must be set before + creation"));
s/Resctrl resource ID/'%s' resctrl ID/
where %s is @parentpath
Working in the @parentpath, we'd then know whether it was Alloc or Monitor
How about changes like these?
     if (!id) {
-        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                       _("Resctrl resource ID must be set before creation"));
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Resctrl ID must be set before determining resctrl "
+                         "path under '%s'"),
+                       parentpath);
         return NULL;
     }
Seems reasonable... although instead of "path under", just go with "parentpath='%s'" John
+ return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
parentpath
Will be renamed.
+ return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, if (!alloc) return 0;
- if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Resctrl group path is expected to be + NULL"));
Resctrl alloc group path is already set '%s'
I think this is an internal error not an invalid arg, too
Will be changed.
     if (alloc->path) {
-        virReportError(VIR_ERR_INVALID_ARG, "%s",
-                       _("Resctrl group path is expected to be NULL"));
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Resctrl allocation path is already set to '%s'"),
+                       alloc->path);
         return -1;
     }
John
Thanks for review. Huaqiang
return -1; }
- if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, + alloc->id); + if (!alloc->path) return -1;
return 0;

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 5:41 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path
On 10/10/18 9:55 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reusing.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c9a79f7..03001cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *pathparent,
s/pathparent/parentpath
OK.
+ const char *prefix, + const char *id) { + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl resource ID must be set before + creation"));
s/Resctrl resource ID/'%s' resctrl ID/
where %s is @parentpath
Working in the @parentpath, we'd then know whether it was Alloc or Monitor
How about changes like these?
     if (!id) {
-        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                       _("Resctrl resource ID must be set before creation"));
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Resctrl ID must be set before determining resctrl "
+                         "path under '%s'"),
+                       parentpath);
         return NULL;
     }
Seems reasonable... although instead of "path under", just go with "parentpath='%s'"
Thanks for the advice; will be fixed.
John
+ return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", pathparent, prefix, id) < 0)
parentpath
Will be renamed.
+ return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, if (!alloc) return 0;
- if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Resctrl group path is expected to be + NULL"));
Resctrl alloc group path is already set '%s'
I think this is an internal error not an invalid arg, too
Will be changed.
     if (alloc->path) {
-        virReportError(VIR_ERR_INVALID_ARG, "%s",
-                       _("Resctrl group path is expected to be NULL"));
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Resctrl allocation path is already set to '%s'"),
+                       alloc->path);
         return -1;
     }
John
Thanks for review. Huaqiang
return -1; }
- if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, + alloc->id); + if (!alloc->path) return -1;
return 0;
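The helper discussed in this sub-thread, with the agreed rename to @parentpath and the reworded error message, can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: determine_path() is a hypothetical stand-in for virResctrlDeterminePath(), snprintf()/malloc() replace virAsprintf(), and fprintf() replaces virReportError(); it is not the code that was merged.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for virResctrlDeterminePath(): returns a newly
 * allocated "<parentpath>/<prefix>-<id>" string, or NULL on error.
 * The caller owns the result and must free() it. */
static char *
determine_path(const char *parentpath, const char *prefix, const char *id)
{
    char *path = NULL;
    int len;

    if (!id) {
        /* Naming the parent path tells the reader whether an allocation
         * or a monitor was being set up. */
        fprintf(stderr,
                "Resctrl ID must be set before determining resctrl "
                "parentpath='%s'\n", parentpath);
        return NULL;
    }

    /* First pass computes the required length, second pass formats. */
    len = snprintf(NULL, 0, "%s/%s-%s", parentpath, prefix, id);
    if (len < 0)
        return NULL;

    path = malloc((size_t)len + 1);
    if (!path)
        return NULL;

    snprintf(path, (size_t)len + 1, "%s/%s-%s", parentpath, prefix, id);
    return path;
}
```

Calling determine_path("/sys/fs/resctrl", machinename, id) would cover the allocation case; the monitor case passes a "mon_groups" directory as the parent instead. The mount point and the machine name used in such a call are example values.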

Add interface for resctrl monitor to determine the path.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c    | 32 ++++++++++++++++++++++++++++++++
 src/util/virresctrl.h    |  3 +++
 3 files changed, 36 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4a52a86..e175c8b 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
 virResctrlMonitorAddPID;
+virResctrlMonitorDeterminePath;
 virResctrlMonitorNew;

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 03001cc..1a5578e 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 {
     return virResctrlAddPID(monitor->path, pid);
 }
+
+int
+virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
+                               const char *machinename)
+{
+    char *alloc_path = NULL;
+    char *parentpath = NULL;
+
+    if (!monitor) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("Invalid resctrl monitor"));
+        return -1;
+    }
+
+    if (monitor->alloc)
+        alloc_path = monitor->alloc->path;
+    else
+        alloc_path = (char *)SYSFS_RESCTRL_PATH;
+
+    if (virAsprintf(&parentpath, "%s/mon_groups", alloc_path) < 0)
+        return -1;
+
+    monitor->path = virResctrlDeterminePath(parentpath, machinename,
+                                            monitor->id);
+
+    VIR_FREE(parentpath);
+
+    if (!monitor->path)
+        return -1;
+
+    return 0;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index cb9bfae..69b6b1d 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -196,4 +196,7 @@ virResctrlMonitorNew(void);
 int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
                         pid_t pid);
+int
+virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
+                               const char *machinename);
 #endif /* __VIR_RESCTRL_H__ */
-- 
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 32 ++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 3 +++ 3 files changed, 36 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4a52a86..e175c8b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 03001cc..1a5578e 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, { return virResctrlAddPID(monitor->path, pid); } + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) +{ + char *alloc_path = NULL;
const char
+ char *parentpath = NULL;
VIR_AUTOFREE(char *) parentpath = NULL; (thus the VIR_FREE later isn't necessary)
+ + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + }
Shouldn't there be a monitor->path check here like there is in virResctrlAllocDeterminePath?
+ + if (monitor->alloc) + alloc_path = monitor->alloc->path; + else + alloc_path = (char *)SYSFS_RESCTRL_PATH;
s/(char *)//
+ + if (virAsprintf(&parentpath, "%s/mon_groups", alloc_path) < 0) + return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + + VIR_FREE(parentpath);
see above... John
+ + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index cb9bfae..69b6b1d 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,4 +196,7 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:16 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 32 ++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 3 +++ 3 files changed, 36 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4a52a86..e175c8b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 03001cc..1a5578e 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, { return virResctrlAddPID(monitor->path, pid); } + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) { + char *alloc_path = NULL;
const char
OK, a pointer to const char will be better.
+ char *parentpath = NULL;
VIR_AUTOFREE(char *) parentpath = NULL;
(thus the VIR_FREE later isn't necessary)
I wasn't aware of this kind of variable 'decorator'. VIR_AUTOFREE will be used in the next update. Thanks.
+ + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + }
Shouldn't there be a monitor->path check here like there is in virResctrlAllocDeterminePath?
Since virResctrlAllocDeterminePath has this kind of safety check, let's add a similar check here. Will be added.
+ + if (monitor->alloc) + alloc_path = monitor->alloc->path; + else + alloc_path = (char *)SYSFS_RESCTRL_PATH;
s/(char *)//
Will be removed.
+ + if (virAsprintf(&parentpath, "%s/mon_groups", alloc_path) < 0) + return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + + VIR_FREE(parentpath);
see above...
Line will be removed.
John
Thanks for review. Huaqiang
+ + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index cb9bfae..69b6b1d 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,4 +196,7 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

On 10/10/18 9:56 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:16 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 32 ++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 3 +++ 3 files changed, 36 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4a52a86..e175c8b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 03001cc..1a5578e 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, { return virResctrlAddPID(monitor->path, pid); } + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) { + char *alloc_path = NULL;
const char
OK, a pointer to const char will be better.
+ char *parentpath = NULL;
VIR_AUTOFREE(char *) parentpath = NULL;
(thus the VIR_FREE later isn't necessary)
I wasn't aware of this kind of variable 'decorator'. VIR_AUTOFREE will be used in the next update. Thanks.
Yeah it's "newer" stuff, but it hasn't always gotten into my review cadence so sometimes I remember, sometimes I don't. We had a Google summer of code student working through those changes, but there's still many more places to change. John
+ + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + }
Shouldn't there be a monitor->path check here like there is in virResctrlAllocDeterminePath?
Since virResctrlAllocDeterminePath has this kind of safety check, let's add a similar check here. Will be added.
+ + if (monitor->alloc) + alloc_path = monitor->alloc->path; + else + alloc_path = (char *)SYSFS_RESCTRL_PATH;
s/(char *)//
Will be removed.
+ + if (virAsprintf(&parentpath, "%s/mon_groups", alloc_path) < 0) + return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + + VIR_FREE(parentpath);
see above...
Line will be removed.
John
Thanks for review. Huaqiang
+ + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index cb9bfae..69b6b1d 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,4 +196,7 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */
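For readers who, like the patch author, had not met VIR_AUTOFREE before: scope-based freeing of this kind is typically built on the GCC/Clang `cleanup` variable attribute. The sketch below shows that mechanism with a hypothetical AUTOFREE macro and auto_free() handler; both names, the build_parent() helper and the free_calls counter are made up for illustration and are not libvirt's definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int free_calls;  /* counts handler invocations, for demonstration only */

/* Cleanup handler: the compiler passes the ADDRESS of the annotated
 * variable when that variable goes out of scope. */
static void
auto_free(void *ptrptr)
{
    void **p = ptrptr;

    free(*p);
    *p = NULL;
    free_calls++;
}

/* Hypothetical macro in the spirit of VIR_AUTOFREE. */
#define AUTOFREE(type) __attribute__((cleanup(auto_free))) type

/* Builds "<alloc_path>/mon_groups" into @out. Every return path frees
 * @parentpath automatically, so no explicit free is needed. */
static int
build_parent(const char *alloc_path, char *out, size_t outlen)
{
    AUTOFREE(char *) parentpath = NULL;
    int len = snprintf(NULL, 0, "%s/mon_groups", alloc_path);

    if (len < 0)
        return -1;

    parentpath = malloc((size_t)len + 1);
    if (!parentpath)
        return -1;
    snprintf(parentpath, (size_t)len + 1, "%s/mon_groups", alloc_path);

    if ((size_t)len >= outlen)
        return -1;              /* early return: still freed */

    memcpy(out, parentpath, (size_t)len + 1);
    return 0;                   /* normal return: freed here as well */
}
```

The attribute is a compiler extension, so this only builds with GCC or Clang. Its benefit in the function under review is that the explicit VIR_FREE(parentpath) before the final checks can simply be dropped.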

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 5:43 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path
On 10/10/18 9:56 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, October 10, 2018 7:16 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 32 ++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 3 +++ 3 files changed, 36 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4a52a86..e175c8b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 03001cc..1a5578e 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, { return virResctrlAddPID(monitor->path, pid); } + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) { + char *alloc_path = NULL;
const char
OK, a pointer to const char will be better.
+ char *parentpath = NULL;
VIR_AUTOFREE(char *) parentpath = NULL;
(thus the VIR_FREE later isn't necessary)
I wasn't aware of this kind of variable 'decorator'. VIR_AUTOFREE will be used in the next update. Thanks.
Yeah it's "newer" stuff, but it hasn't always gotten into my review cadence so sometimes I remember, sometimes I don't. We had a Google summer of code student working through those changes, but there's still many more places to change.
John
Thanks! Huaqiang
+ + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + }
Shouldn't there be a monitor->path check here like there is in virResctrlAllocDeterminePath?
Since virResctrlAllocDeterminePath has this kind of safety check, let's add a similar check here. Will be added.
+ + if (monitor->alloc) + alloc_path = monitor->alloc->path; + else + alloc_path = (char *)SYSFS_RESCTRL_PATH;
s/(char *)//
Will be removed.
+ + if (virAsprintf(&parentpath, "%s/mon_groups", alloc_path) < 0) + return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + + VIR_FREE(parentpath);
see above...
Line will be removed.
John
Thanks for review. Huaqiang
+ + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index cb9bfae..69b6b1d 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,4 +196,7 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */
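Putting the review comments on this patch together (const pointer for the allocation path, a check that the path is not already set, no cast on the root path), the monitor path logic could look roughly like the standalone sketch below. The struct layouts, RESCTRL_ROOT and the function names are assumptions made for illustration; the extra NULL check on the parent is my own addition and is not something requested in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RESCTRL_ROOT "/sys/fs/resctrl"  /* assumed resctrl mount point */

/* Simplified, hypothetical versions of the allocation and monitor objects. */
struct alloc { char *path; };
struct monitor { struct alloc *alloc; char *id; char *path; };

/* Allocate and return "<parent>/mon_groups/<prefix>-<id>". */
static char *
monitor_group_path(const char *parent, const char *prefix, const char *id)
{
    int len = snprintf(NULL, 0, "%s/mon_groups/%s-%s", parent, prefix, id);
    char *path;

    if (len < 0)
        return NULL;
    path = malloc((size_t)len + 1);
    if (path)
        snprintf(path, (size_t)len + 1, "%s/mon_groups/%s-%s",
                 parent, prefix, id);
    return path;
}

static int
monitor_determine_path(struct monitor *mon, const char *machinename)
{
    const char *parent;     /* const: the root path is a string literal */

    if (!mon || !mon->id)
        return -1;

    if (mon->path)          /* mirrors the check in the allocation variant */
        return -1;

    /* A monitor without an allocation lives under the default group. */
    parent = mon->alloc ? mon->alloc->path : RESCTRL_ROOT;
    if (!parent)            /* assumption: treat an unset alloc path as error */
        return -1;

    mon->path = monitor_group_path(parent, machinename, mon->id);
    return mon->path ? 0 : -1;
}
```

With an allocation at ".../qemu-1-vm-vcpus_4-7" and a monitor id of "vcpus_4-6" (example values), the resulting directory sits in that allocation's own mon_groups subdirectory, which is how resctrl ties a monitoring group to its control group.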

The code for creating resctrl allocation group could be reused for monitoring group, refactor it for reusing in the later patch.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/util/virresctrl.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 1a5578e..8b617a6 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2310,6 +2310,26 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
 }
 
 
+/* This function creates a resctrl directory in resource control file system,
+ * and the directory path is specified by @path. */
+static int
+virResctrlCreateGroupPath(const char *path)
+{
+    /* Directory exists, return */
+    if (virFileExists(path))
+        return 0;
+
+    if (virFileMakePath(path) < 0) {
+        virReportSystemError(errno,
+                             _("Cannot create resctrl directory '%s'"),
+                             path);
+        return -1;
+    }
+
+    return 0;
+}
+
+
 /* This checks if the directory for the alloc exists. If not it tries to create
  * it and apply appropriate alloc settings. */
 int
@@ -2334,13 +2354,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
     if (virResctrlAllocDeterminePath(alloc, machinename) < 0)
         return -1;
 
-    if (virFileExists(alloc->path)) {
-        virReportError(VIR_ERR_INTERNAL_ERROR,
-                       _("Path '%s' for resctrl allocation exists"),
-                       alloc->path);
-        goto cleanup;
-    }
-
     lockfd = virResctrlLockWrite();
     if (lockfd < 0)
         goto cleanup;
@@ -2348,6 +2361,9 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
     if (virResctrlAllocAssign(resctrl, alloc) < 0)
         goto cleanup;
 
+    if (virResctrlCreateGroupPath(alloc->path) < 0)
+        goto cleanup;
+
     alloc_str = virResctrlAllocFormat(alloc);
     if (!alloc_str)
         goto cleanup;
@@ -2355,13 +2371,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
     if (virAsprintf(&schemata_path, "%s/schemata", alloc->path) < 0)
         goto cleanup;
 
-    if (virFileMakePath(alloc->path) < 0) {
-        virReportSystemError(errno,
-                             _("Cannot create resctrl directory '%s'"),
-                             alloc->path);
-        goto cleanup;
-    }
-
     VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, schemata_path);
     if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) {
         rmdir(alloc->path);
-- 
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
The code for creating resctrl allocation group could be reused for monitoring group, refactor it for reusing in the later patch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John (NB: Done for the day too).
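The idempotent "create the group directory if it is not there yet" step that this patch factors out can be sketched with plain POSIX calls as below. It is a simplified illustration, not the libvirt code: create_group_path() and remove_group_path() are hypothetical names, stat() stands in for virFileExists(), and a single mkdir() stands in for virFileMakePath(), so missing parent directories are an error here.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the group directory at @path unless it already exists.
 * Returns 0 on success (including "already there"), -1 on failure. */
static int
create_group_path(const char *path)
{
    struct stat sb;

    /* Directory (or any file) exists: nothing to do. */
    if (stat(path, &sb) == 0)
        return 0;

    if (mkdir(path, 0777) < 0) {
        fprintf(stderr, "Cannot create resctrl directory '%s': %s\n",
                path, strerror(errno));
        return -1;
    }

    return 0;
}

/* Roll back a group directory, similar in spirit to the rmdir() call the
 * allocation code makes when writing the schemata fails. */
static int
remove_group_path(const char *path)
{
    return rmdir(path);
}
```

Because an existing directory is treated as success, the same helper can serve both the allocation group and the monitoring group, and repeated calls for the same path are harmless.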

Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c    | 28 ++++++++++++++++++++++++++++
 src/util/virresctrl.h    |  6 ++++++
 3 files changed, 35 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index e175c8b..a878083 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
 virResctrlMonitorAddPID;
+virResctrlMonitorCreate;
 virResctrlMonitorDeterminePath;
 virResctrlMonitorNew;

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 8b617a6..b3d20cc 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2475,6 +2475,7 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
     return virResctrlAddPID(monitor->path, pid);
 }
 
+
 int
 virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
                                const char *machinename)
@@ -2506,3 +2507,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
 
     return 0;
 }
+
+
+int
+virResctrlMonitorCreate(virResctrlAllocPtr alloc,
+                        virResctrlMonitorPtr monitor,
+                        const char *machinename)
+{
+    int lockfd = -1;
+    int ret = -1;
+
+    if (!monitor)
+        return 0;
+
+    monitor->alloc = virObjectRef(alloc);
+
+    if (virResctrlMonitorDeterminePath(monitor, machinename) < 0)
+        return -1;
+
+    lockfd = virResctrlLockWrite();
+    if (lockfd < 0)
+        return -1;
+
+    ret = virResctrlCreateGroupPath(monitor->path);
+
+    virResctrlUnlock(lockfd);
+    return ret;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 69b6b1d..1efe394 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -196,7 +196,13 @@ virResctrlMonitorNew(void);
 int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
                         pid_t pid);
+
 int
 virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
                                const char *machinename);
+
+int
+virResctrlMonitorCreate(virResctrlAllocPtr alloc,
+                        virResctrlMonitorPtr monitor,
+                        const char *machinename);
 #endif /* __VIR_RESCTRL_H__ */
-- 
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ src/util/virresctrl.h | 6 ++++++ 3 files changed, 35 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e175c8b..a878083 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8b617a6..b3d20cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2475,6 +2475,7 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, return virResctrlAddPID(monitor->path, pid); }
+
This should have been squashed into patch6
int virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename) @@ -2506,3 +2507,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
return 0; } + + +int +virResctrlMonitorCreate(virResctrlAllocPtr alloc, + virResctrlMonitorPtr monitor, + const char *machinename) +{ + int lockfd = -1; + int ret = -1; + + if (!monitor) + return 0; + + monitor->alloc = virObjectRef(alloc);
Can @alloc be NULL here? I see that the eventual caller from qemuProcessResctrlCreate would pass vm->def->resctrls[i]->alloc after a virResctrlAllocCreate, but that API can return 0 immediately "if (!alloc)"?

Furthermore, if we Ref it here but return -1 in the subsequent steps, are we sure that the Unref gets done?

The one "thing" about the order of patches here is that it forces me to look forward to ensure decisions made in previous patches will be handled in the future.
+ + if (virResctrlMonitorDeterminePath(monitor, machinename) < 0) + return -1;
At least for now virResctrlMonitorDeterminePath can handle a NULL monitor->alloc... John
+ + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + return -1; + + ret = virResctrlCreateGroupPath(monitor->path); + + virResctrlUnlock(lockfd); + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 69b6b1d..1efe394 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,7 +196,13 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); + int virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorCreate(virResctrlAllocPtr alloc, + virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

On 10/11/2018 3:13 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ src/util/virresctrl.h | 6 ++++++ 3 files changed, 35 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e175c8b..a878083 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8b617a6..b3d20cc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2475,6 +2475,7 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, return virResctrlAddPID(monitor->path, pid); }
+ This should have been squashed into patch6
OK. Will keep two blank lines between functions.
int virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename) @@ -2506,3 +2507,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
return 0; } + + +int +virResctrlMonitorCreate(virResctrlAllocPtr alloc, + virResctrlMonitorPtr monitor, + const char *machinename) +{ + int lockfd = -1; + int ret = -1; + + if (!monitor) + return 0; + + monitor->alloc = virObjectRef(alloc); Can @alloc be NULL here? I see that the eventual caller from qemuProcessResctrlCreate would pass vm->def->resctrls[i]->alloc after a virResctrlAllocCreate, but that API can return 0 immediately "if (!alloc)"?
@alloc can be NULL. If virResctrlAllocCreate is passed a NULL @alloc it returns 0, and this is not considered an error. In the code that invokes virResctrlMonitorCreate, if @vm->def->resctrls[i]->alloc is NULL, virResctrlAllocCreate returns 0 and execution continues; virResctrlMonitorCreate is then invoked if @nmonitors is not 0, and in that case its first argument, @alloc, is NULL.

    for (i = 0; i < vm->def->nresctrls; i++) {
        size_t j = 0;
        if (virResctrlAllocCreate(caps->host.resctrl,
                                  vm->def->resctrls[i]->alloc,
                                  priv->machineName) < 0)
            goto cleanup;

        for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
            virDomainResctrlMonDefPtr mon = NULL;

            mon = vm->def->resctrls[i]->monitors[j];
            if (virResctrlMonitorCreate(vm->def->resctrls[i]->alloc,
                                        mon->instance,
                                        priv->machineName) < 0)
                goto cleanup;

        }
    }
Furthermore, if we Ref it here but return -1 in the subsequent steps, are we sure that the Unref gets done?
If an error happens in a later step and -1 is returned, the resources are released through virResctrlMonitorDispose and virResctrlAllocDispose. The unref of @monitor->alloc is done in virResctrlMonitorDispose (a function introduced in patch 2):

    static void
    virResctrlMonitorDispose(void *obj)
    {
        virResctrlMonitorPtr monitor = obj;

        virObjectUnref(monitor->alloc);
        VIR_FREE(monitor->id);
        VIR_FREE(monitor->path);
    }
The one "thing" about the order of patches here is that it forces me to look forward to ensure decisions made in previous patches will be handled in the future.
Maybe combine virResctrlMonitorDispose and virResctrlMonitorCreate in one patch? Would that make it easier to follow how @monitor->alloc is used?
+ + if (virResctrlMonitorDeterminePath(monitor, machinename) < 0) + return -1; At least for now virResctrlMonitorDeterminePath can handle a NULL monitor->alloc...
John
Thanks for review. Huaqiang
+ + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + return -1; + + ret = virResctrlCreateGroupPath(monitor->path); + + virResctrlUnlock(lockfd); + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 69b6b1d..1efe394 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -196,7 +196,13 @@ virResctrlMonitorNew(void); int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); + int virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorCreate(virResctrlAllocPtr alloc, + virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

[...]
402 virResctrlMonitorDispose(void *obj) 403 { 404 virResctrlMonitorPtr monitor = obj; 405 406 virObjectUnref(monitor->alloc); 407 VIR_FREE(monitor->id); 408 VIR_FREE(monitor->path); 409 }
The one "thing" about the order of patches here is that it forces me to look forward to ensure decisions made in previous patches will be handled in the future.
Maybe combine virResctrlMonitorDispose and virResctrlMonitorCreate in one patch? Would that make it easier to follow how @monitor->alloc is used?
That usually is best. I think the order has me off a bit too, but there is no perfect solution for order. I find myself looking forward a lot and trying to keep track. John

On 10/12/2018 10:27 PM, John Ferlan wrote:
[...]
402 virResctrlMonitorDispose(void *obj) 403 { 404 virResctrlMonitorPtr monitor = obj; 405 406 virObjectUnref(monitor->alloc); 407 VIR_FREE(monitor->id); 408 VIR_FREE(monitor->path); 409 }
The one "thing" about the order of patches here is that it forces me to look forward to ensure decisions made in previous patches will be handled in the future.
Maybe combine virResctrlMonitorDispose and virResctrlMonitorCreate in one patch? Would that make it easier to follow how @monitor->alloc is used?
That usually is best. I think the order has me off a bit too, but there is no perfect solution for order. I find myself looking forward a lot and trying to keep track.
I'll change the order of the series to fix this.
John
Thanks for review. Huaqiang

Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc. The interface for getting cache occupancy information from the monitor is also added. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 6 ++ src/util/virresctrl.c | 209 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 23 ++++++ 3 files changed, 236 insertions(+), 2 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a878083..c93d19f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,7 +2683,13 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheLevel; +virResctrlMonitorGetCacheOccupancy; +virResctrlMonitorGetID; virResctrlMonitorNew; +virResctrlMonitorRemove; +virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetID; # util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b3d20cc..fca1f6f 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -225,11 +225,19 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) } + +/* + * virResctrlAlloc and virResctrlMonitor are representing a resource control + * group (in XML under cputune/cachetune and consequently a directory under + * /sys/fs/resctrl). virResctrlAlloc is the data structure for resource + * allocation, while the virResctrlMonitor represents the resource monitoring + * part. + */ + /* virResctrlAlloc */ /* - * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and - * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * virResctrlAlloc represents one allocation. 
Since it can have multiple * parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL * in case there is no allocation for that particular cache allocation (level, @@ -347,6 +355,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* The cache 'level', special for cache monitor */ + unsigned int cache_level; }; @@ -2510,6 +2520,27 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl monitor 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(monitor->id, id); +} + + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor) +{ + return monitor->id; +} + + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename) @@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlUnlock(lockfd); return ret; } + + +int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; + + if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove %s (%d)"), + monitor->path, errno); + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno); + } + + return ret; +} + + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level) +{ + /* Only supports cache level 3 CMT */ + if (level != 3) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor cache level")); + return -1; + } + + monitor->cache_level = level; + + return 0; +} + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor) +{ + return monitor->cache_level; +} + + +/* 
+ * virResctrlMonitorGetStatistic + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occpancy' for cache resource. + * "mbm_totol_bytes" and "mbm_local_bytes" for memory bandwidth resource. + * @len: The array length for @ids, and @vals + * @ids: The id array for resource statistic information, ids[0] + * stores the first node id value, ids[1] stores the second node id value, + * ... and so on. + * @vals: The resource resource utilization information array. vals[0] + * stores the cache or memory bandwidth utilization value for first node, + * vals[1] stores the second value ... and so on. + * + * Get cache or memory bandwidth utilization information from monitor that + * specified by @id. + * + * Returns 0 for success, -1 for error. + */ +static int +virResctrlMonitorGetStatistic(virResctrlMonitorPtr monitor, + const char *resource, + size_t *len, + unsigned int **ids, + unsigned int **vals) +{ + int rv = -1; + int ret = -1; + size_t nids = 0; + size_t nvals = 0; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *len = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *str_id = NULL; + unsigned int id = 0; + unsigned int val = 0; + size_t i = 0; + size_t cur_id_pos = 0; + unsigned int tmp_id = 0; + unsigned int tmp_val = 0; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_l3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. 
+ */ + if (ent->d_type != DT_DIR) + continue; + + if (STRNEQLEN(ent->d_name, "mon_L", 5)) + continue; + + str_id = strchr(ent->d_name, '_'); + if (!str_id) + continue; + + str_id = strchr(++str_id, '_'); + if (!str_id) + continue; + + if (virStrToLong_uip(++str_id, NULL, 0, &id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*ids, nids, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*vals, nvals, val) < 0) + goto cleanup; + + /* Sort @ids and @vals arrays in the ascending order of id */ + cur_id_pos = nids - 1; + for (i = 0; i < cur_id_pos; i++) { + if ((*ids)[cur_id_pos] < (*ids)[i]) { + tmp_id = (*ids)[cur_id_pos]; + tmp_val = (*vals)[cur_id_pos]; + (*ids)[cur_id_pos] = (*ids)[i]; + (*vals)[cur_id_pos] = (*vals)[i]; + (*ids)[i] = tmp_id; + (*vals)[i] = tmp_val; + } + } + } + + *len = nids; + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* Get cache occupancy data from @monitor */ +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches) +{ + return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", + nbank, bankids, bankcaches); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 1efe394..6137fee 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -202,7 +202,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename); int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id); + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor); + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename); + +int 
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level); + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc.
The interface for getting cache occupancy information from the monitor is also added.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 6 ++ src/util/virresctrl.c | 209 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 23 ++++++ 3 files changed, 236 insertions(+), 2 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a878083..c93d19f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,7 +2683,13 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheLevel; +virResctrlMonitorGetCacheOccupancy; +virResctrlMonitorGetID; virResctrlMonitorNew; +virResctrlMonitorRemove; +virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetID;
# util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b3d20cc..fca1f6f 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -225,11 +225,19 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) }
+ +/* + * virResctrlAlloc and virResctrlMonitor are representing a resource control + * group (in XML under cputune/cachetune and consequently a directory under + * /sys/fs/resctrl). virResctrlAlloc is the data structure for resource + * allocation, while the virResctrlMonitor represents the resource monitoring + * part. + */ + /* virResctrlAlloc */
/* - * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and - * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * virResctrlAlloc represents one allocation. Since it can have multiple
I would think that perhaps the comments changing here would go earlier when virResctrlMonitor was introduced. The comment with the single virResctrlAlloc could be changed to "virResctrlAlloc and virResctrlMonitor", then merge in what you've typed above.
* parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL * in case there is no allocation for that particular cache allocation (level, @@ -347,6 +355,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* The cache 'level', special for cache monitor */ + unsigned int cache_level; };
@@ -2510,6 +2520,27 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl monitor 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(monitor->id, id); +} + + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor) +{ + return monitor->id; +} + + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename) @@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlUnlock(lockfd); return ret; } + + +int
The eventual only caller never checks status...
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; +
So unlike Alloc, @monitor cannot be NULL...
+ if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove %s (%d)"), + monitor->path, errno); + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno);
Either virReportSystemError if you're going to handle the returned status or VIR_ERROR if you're not (and are just ignoring), but not both.
+ } + + return ret; +} + + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level) +{ + /* Only supports cache level 3 CMT */ + if (level != 3) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor cache level")); + return -1; + } + + monitor->cache_level = level; + + return 0; +} + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor) +{ + return monitor->cache_level; +}
Based on usage, maybe we should just give up on API's like this. Create a VIR_RESCTRL_MONITOR_CACHE_LEVEL (or something like it)... Then use it at least for now when reading/supplying the level. Thus we leave it to future developer to create the API's and handle the new levels... If when we Parse we don't find the constant, then error.
+ + +/* + * virResctrlMonitorGetStatistic
Usually just GetStats is fine or GetOneStat
+ * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occpancy' for cache resource.
occupancy
+ * "mbm_totol_bytes" and "mbm_local_bytes" for memory bandwidth resource.
mbm_total_bytes
Although the actual names could go below when describing [1]
+ * @len: The array length for @ids, and @vals + * @ids: The id array for resource statistic information, ids[0] + * stores the first node id value, ids[1] stores the second node id value, + * ... and so on. + * @vals: The resource resource utilization information array. vals[0] + * stores the cache or memory bandwidth utilization value for first node, + * vals[1] stores the second value ... and so on. + *
[1] e.g. here - what you'd expect @resource to be...
+ * Get cache or memory bandwidth utilization information from monitor that + * specified by @id. + * + * Returns 0 for success, -1 for error. + */ +static int +virResctrlMonitorGetStatistic(virResctrlMonitorPtr monitor, + const char *resource, + size_t *len, + unsigned int **ids, + unsigned int **vals) +{ + int rv = -1; + int ret = -1; + size_t nids = 0; + size_t nvals = 0; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *len = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *str_id = NULL; + unsigned int id = 0; + unsigned int val = 0; + size_t i = 0; + size_t cur_id_pos = 0; + unsigned int tmp_id = 0; + unsigned int tmp_val = 0; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_l3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + if (STRNEQLEN(ent->d_name, "mon_L", 5)) + continue; + + str_id = strchr(ent->d_name, '_'); + if (!str_id) + continue; + + str_id = strchr(++str_id, '_'); + if (!str_id) + continue;
I think all of the above could be replaced by:

    if (!(str_id = STRSKIP(ent->d_name, "mon_L")))
        continue;

I personally am not a fan of pre-auto-incr. I get particularly uncomfortable when it involves pointer arithmetic... But I think it's unnecessary if you use STRSKIP
+ + if (virStrToLong_uip(++str_id, NULL, 0, &id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*ids, nids, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*vals, nvals, val) < 0) + goto cleanup;
If you had a single structure for id and val, then...
+ + /* Sort @ids and @vals arrays in the ascending order of id */ + cur_id_pos = nids - 1; + for (i = 0; i < cur_id_pos; i++) { + if ((*ids)[cur_id_pos] < (*ids)[i]) { + tmp_id = (*ids)[cur_id_pos]; + tmp_val = (*vals)[cur_id_pos]; + (*ids)[cur_id_pos] = (*ids)[i]; + (*vals)[cur_id_pos] = (*vals)[i]; + (*ids)[i] = tmp_id; + (*vals)[i] = tmp_val; + }
...qsort would be a much better option than the above open code...
+ } + } + + *len = nids; + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* Get cache occupancy data from @monitor */ +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches) +{ + return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", + nbank, bankids, bankcaches); +}
I think the above two may be a case for waiting until they're needed, but I haven't got that far yet. IOW: Some extraction and reordering. John
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 1efe394..6137fee 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -202,7 +202,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename);
int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id); + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor); + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level); + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches); #endif /* __VIR_RESCTRL_H__ */

On 10/11/2018 3:13 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc.
The interface for getting cache occupancy information from the monitor is also added.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 6 ++ src/util/virresctrl.c | 209 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 23 ++++++ 3 files changed, 236 insertions(+), 2 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a878083..c93d19f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,7 +2683,13 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheLevel; +virResctrlMonitorGetCacheOccupancy; +virResctrlMonitorGetID; virResctrlMonitorNew; +virResctrlMonitorRemove; +virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetID;
# util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b3d20cc..fca1f6f 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -225,11 +225,19 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) }
+ +/* + * virResctrlAlloc and virResctrlMonitor are representing a resource control + * group (in XML under cputune/cachetune and consequently a directory under + * /sys/fs/resctrl). virResctrlAlloc is the data structure for resource + * allocation, while the virResctrlMonitor represents the resource monitoring + * part. + */ + /* virResctrlAlloc */
/* - * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and - * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * virResctrlAlloc represents one allocation. Since it can have multiple
I would think that perhaps the comments changing here would go earlier when virResctrlMonitor was introduced. The comment with the single virResctrlAlloc could be changed to "virResctrlAlloc and virResctrlMonitor", then merge in what you've typed above.
* parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL * in case there is no allocation for that particular cache allocation (level, @@ -347,6 +355,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* The cache 'level', special for cache monitor */ + unsigned int cache_level; };
@@ -2510,6 +2520,27 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl monitor 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(monitor->id, id); +} + + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor) +{ + return monitor->id; +} + + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename) @@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlUnlock(lockfd); return ret; } + + +int
The eventual only caller never checks status...
That's true, and I noticed that. The back-end logic is copied from virResctrlAllocRemove. Maybe with a change of removing the following virReportSystemError lines.
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; +
So unlike Alloc, @monitor cannot be NULL...
@monitor is not allowed to be NULL. This is guaranteed by the caller.
+ if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove %s (%d)"), + monitor->path, errno); + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno);
Either virReportSystemError if you're going to handle the returned status or VIR_ERROR if you're not (and are just ignoring), but not both.
I would like to remove the virReportSystemError lines and keep the VIR_ERROR line.
+ } + + return ret; +} + + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level) +{ + /* Only supports cache level 3 CMT */ + if (level != 3) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor cache level")); + return -1; + } + + monitor->cache_level = level; + + return 0; +} + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor) +{ + return monitor->cache_level; +}
Based on usage, maybe we should just give up on API's like this. Create a VIR_RESCTRL_MONITOR_CACHE_LEVEL (or something like it)... Then use it at least for now when reading/supplying the level. Thus we leave it to future developer to create the API's and handle the new levels...
If when we Parse we don't find the constant, then error.
Do you mean removing the two functions virResctrlMonitorSetCacheLevel and virResctrlMonitorGetCacheLevel from here, and handling the monitor cache level directly in domain_conf.c during XML parsing and formatting?
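One way to read the suggestion above, as a minimal sketch only: a single constant names the supported level, and the parse step rejects anything else. `RESCTRL_MONITOR_CACHE_LEVEL` and `monitor_parse_cache_level` are hypothetical names chosen for illustration; the thread only proposes "VIR_RESCTRL_MONITOR_CACHE_LEVEL (or something like it)" and does not define the parsing code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical constant: the only cache level a monitor supports for now. */
#define RESCTRL_MONITOR_CACHE_LEVEL 3

/* Hypothetical parse-time check: accept @str only if it names the supported
 * level. Returns 0 and stores the level on success, -1 otherwise. */
int
monitor_parse_cache_level(const char *str, unsigned int *level)
{
    char *end = NULL;
    unsigned long val;

    errno = 0;
    val = strtoul(str, &end, 10);
    /* Reject empty input, trailing garbage and out-of-range values. */
    if (errno != 0 || end == str || *end != '\0')
        return -1;

    if (val != RESCTRL_MONITOR_CACHE_LEVEL)
        return -1;

    *level = RESCTRL_MONITOR_CACHE_LEVEL;
    return 0;
}
```

With this shape no setter or getter is needed in the utility layer; supporting another level later means revisiting the constant and the check in one place.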
+ + +/* + * virResctrlMonitorGetStatistic
Usually just GetStats is fine or GetOneStat
I prefer GetStats -> virResctrlMonitorGetStats
+ * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occpancy' for cache resource.
occupancy
OK
+ * "mbm_totol_bytes" and "mbm_local_bytes" for memory bandwidth resource.
mbm_total_bytes
Yes, I misspelled it. It will be corrected.
Although the actual names could go below when describing [1]
The description of @resource would be trimmed to: "@resource: The name for resource name."
+ * @len: The array length for @ids, and @vals + * @ids: The id array for resource statistic information, ids[0] + * stores the first node id value, ids[1] stores the second node id value, + * ... and so on. + * @vals: The resource resource utilization information array. vals[0] + * stores the cache or memory bandwidth utilization value for first node, + * vals[1] stores the second value ... and so on. + *
[1] e.g. here - what you'd expect @resource to be...
 * @vals: The data array of the current @resource utilization information.
 * The first element of @vals stores the first node's data, the second
 * array element stores the second node data if there is multi-node system.
 * Similarly the third array element stores the third ... and so on.
 * If @resource is 'llc_occupancy', the array stores the cache occupancy
 * information for all nodes. If @resource is 'mbm_total_bytes' or
 * 'mbm_local_bytes', then the array stores memory bandwidth related
 * information for all nodes.

These comments might be changed, as we want to combine @ids with @vals.
+ * Get cache or memory bandwidth utilization information from monitor that + * specified by @id. + * + * Returns 0 for success, -1 for error. + */ +static int +virResctrlMonitorGetStatistic(virResctrlMonitorPtr monitor, + const char *resource, + size_t *len, + unsigned int **ids, + unsigned int **vals) +{ + int rv = -1; + int ret = -1; + size_t nids = 0; + size_t nvals = 0; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *len = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *str_id = NULL; + unsigned int id = 0; + unsigned int val = 0; + size_t i = 0; + size_t cur_id_pos = 0; + unsigned int tmp_id = 0; + unsigned int tmp_val = 0; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_l3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + if (STRNEQLEN(ent->d_name, "mon_L", 5)) + continue; + + str_id = strchr(ent->d_name, '_'); + if (!str_id) + continue; + + str_id = strchr(++str_id, '_'); + if (!str_id) + continue;
I think all of the above could be replaced by:
if (!(str_id = STRSKIP(ent->d_name, "mon_L"))) continue;
I personally am not a fan of pre-auto-incr. I get particularly uncomfortable when it involves pointer arithmetic... But I think it's unnecessary if you use STRSKIP
The target directory name looks like 'mon_L3_00', or 'mon_L3CODE_00' if CDP is enabled. The two digits after the second '_' character are the ID number, so we need to locate the second '_'. One STRSKIP is not enough; a strchr call is still needed to find the second '_', like this:

    if (!(str_id = STRSKIP(ent->d_name, "mon_L")))
        continue;

    str_id = strchr(str_id, '_');
    if (!str_id)
        continue;
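The two-step lookup described above can be tried outside libvirt with plain libc calls. This is a sketch under stated assumptions: `parse_mon_dir_id` is a hypothetical helper, `strncmp` stands in for STRSKIP, and `strtoul` stands in for virStrToLong_uip. It also parses the id in base 10, whereas the patch passes base 0; with base 0 a leading zero selects octal in `strtoul`, so an id such as "08" would not parse. Whether such ids occur depends on how the kernel numbers the directories, which the thread does not settle.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libvirt: extract the numeric id from a
 * mon_data directory name such as "mon_L3_00".
 * Returns 0 and stores the id in *id on success, -1 if the name does not
 * have the "mon_L<something>_<id>" layout. */
int
parse_mon_dir_id(const char *name, unsigned int *id)
{
    static const char prefix[] = "mon_L";
    const char *p;
    char *end = NULL;
    unsigned long val;

    /* Equivalent of STRSKIP(): require the prefix, then step past it. */
    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = name + sizeof(prefix) - 1;

    /* Find the second '_' of the full name; the id must follow it. */
    p = strchr(p, '_');
    if (!p || !isdigit((unsigned char)p[1]))
        return -1;
    p++;

    /* Base 10 on purpose, so zero-padded ids are read as decimal. */
    errno = 0;
    val = strtoul(p, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT_MAX)
        return -1;

    *id = (unsigned int)val;
    return 0;
}
```

Advancing the pointer in a separate `p++` statement also avoids the pre-increment inside a call argument that the review objected to.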
+ + if (virStrToLong_uip(++str_id, NULL, 0, &id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*ids, nids, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*vals, nvals, val) < 0) + goto cleanup;
If you had a single structure for id and val, then...
How about:

    struct _virResctrlMonitorStats {
        unsigned int id;
        unsigned int val;
    };
+ + /* Sort @ids and @vals arrays in the ascending order of id */ + cur_id_pos = nids - 1; + for (i = 0; i < cur_id_pos; i++) { + if ((*ids)[cur_id_pos] < (*ids)[i]) { + tmp_id = (*ids)[cur_id_pos]; + tmp_val = (*vals)[cur_id_pos]; + (*ids)[cur_id_pos] = (*ids)[i]; + (*vals)[cur_id_pos] = (*vals)[i]; + (*ids)[i] = tmp_id; + (*vals)[i] = tmp_val; + }
...qsort would be a much better option than the above open code...
If @ids and @vals are combined into one structure, it becomes very convenient to sort the array with qsort. The sorting will be moved out of the loop, to just before the successful return. Will be done.
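A minimal sketch of that combination, assuming a plain struct that pairs each id with its value. The names `monitor_stat`, `monitor_stat_cmp` and `monitor_stats_sort` are hypothetical and only mirror the struct proposed above; the array is sorted once after the directory scan instead of being re-ordered on every append.

```c
#include <stdlib.h>

/* Hypothetical pairing of a node id with the value read for that node. */
struct monitor_stat {
    unsigned int id;
    unsigned int val;
};

/* qsort() comparator: ascending order of id. */
int
monitor_stat_cmp(const void *a, const void *b)
{
    const struct monitor_stat *sa = a;
    const struct monitor_stat *sb = b;

    /* Compare explicitly; subtracting unsigned values could wrap. */
    return (sa->id > sb->id) - (sa->id < sb->id);
}

/* Sort the collected entries once, after the directory scan is complete. */
void
monitor_stats_sort(struct monitor_stat *stats, size_t nstats)
{
    if (nstats > 1)
        qsort(stats, nstats, sizeof(*stats), monitor_stat_cmp);
}
```

Keeping id and value in one element means a swap can never separate them, which the two parallel arrays in the patch have to guarantee by hand.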
+ } + } + + *len = nids; + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* Get cache occupancy data from @monitor */ +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches) +{ + return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", + nbank, bankids, bankcaches); +}
I think the above two may be a case for waiting until they're needed, but I haven't got that far yet. IOW: Some extraction and reordering.
This function is used in patch 18, so these two functions can move into patch 18. Will be done.
John
Thanks for review. Huaqiang
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 1efe394..6137fee 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -202,7 +202,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename);
int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id); + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor); + +int virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level); + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + size_t *nbank, + unsigned int **bankids, + unsigned int **bankcaches); #endif /* __VIR_RESCTRL_H__ */

[...]
virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename) @@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlUnlock(lockfd); return ret; } + + +int
The eventual only caller never checks status...
That's true, and I noticed that. The back-end logic is copied from virResctrlAllocRemove.
Understood, but that doesn't check status either. Maybe it needs to change to a void since it uses VIR_ERROR.
Maybe along with removing the following virReportSystemError lines.
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; +
So unlike Alloc, @monitor cannot be NULL...
@monitor is not allowed to be NULL. This is guaranteed by the caller.
+ if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove %s (%d)"), + monitor->path, errno); + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno);
Either virReportSystemError if you're going to handle the returned status or VIR_ERROR if you're not (and are just ignoring), but not both.
I would like to remove the virReportSystemError lines and keep the VIR_ERROR line.
I can agree to that along with it being void since it's a best effort.
+ } + + return ret; +} + + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level) +{ + /* Only supports cache level 3 CMT */ + if (level != 3) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor cache level")); + return -1; + } + + monitor->cache_level = level; + + return 0; +} + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor) +{ + return monitor->cache_level; +}
Based on usage, maybe we should just give up on API's like this. Create a VIR_RESCTRL_MONITOR_CACHE_LEVEL (or something like it)... Then use it at least for now when reading/supplying the level. Thus we leave it to future developer to create the API's and handle the new levels...
If when we Parse we don't find the constant, then error.
Do you mean removing the two functions 'virResctrlMonitorSetCacheLevel' and 'virResctrlMonitorGetCacheLevel' from here, and processing the monitor cache level information directly in domain_conf.c during the XML parse and format process?
Yeah, I think I've come to that conclusion. In the Parse code you'd still parse the value, but then compare against the constant. In the Format code, you can just format what you have. Whomever creates a different level in the future can have the fun of managing range and of course managing if whatever is being fetched is in the particular cache level.
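The "parse the value, then compare against a constant" idea can be sketched in plain C as follows. This is only an illustration under stated assumptions: MONITOR_CACHE_LEVEL and parse_monitor_cache_level are hypothetical names, and standard strtoul stands in for whatever libvirt helper the real Parse code would use.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical constant: the only cache level the monitor supports for now. */
#define MONITOR_CACHE_LEVEL 3

/* Parse a 'level' attribute string and accept it only if it equals the
 * supported constant. Returns 0 on success, -1 on any other input. */
static int
parse_monitor_cache_level(const char *str, unsigned int *level)
{
    char *end = NULL;
    unsigned long v;

    assert(str != NULL && level != NULL);

    errno = 0;
    v = strtoul(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0')
        return -1;              /* not a plain number */

    if (v != MONITOR_CACHE_LEVEL)
        return -1;              /* a number, but not a supported level */

    *level = (unsigned int)v;
    return 0;
}
```

With this shape, the Format side simply prints the stored level, and supporting another level later means replacing the equality check with a range or table lookup.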
+ + +/* + * virResctrlMonitorGetStatistic
[...]
+ str_id = strchr(++str_id, '_'); + if (!str_id) + continue;
I think all of the above could be replaced by:
if (!(str_id = STRSKIP(ent->d_name, "mon_L"))) continue;
I personally am not a fan of pre-auto-incr. I get particularly uncomfortable when it involves pointer arithmetic... But I think it's unnecessary if you use STRSKIP
The target directory name of interest looks like 'mon_L3_00', or 'mon_L3CODE_00' if CDP is enabled. The two digits after the second '_' character are the ID number, so we need to locate the second '_' character. One STRSKIP is not enough; a call to strchr is still needed to locate the second '_', in the following way:
if (!(str_id = STRSKIP(ent->d_name, "mon_L"))) continue;
str_id = strchr(str_id , '_'); if (!str_id) continue;
So MON_L3DATA_00 would do the same as would MON_L3FUTURE_00? Then let's STRSKIP(str_id, "_") - IOW: Skip to the next The first STRSKIP is to get to the "level"#, right? The second to the "id"#. Maybe the variables should be named that way to make it clear along with some comments.
+ + if (virStrToLong_uip(++str_id, NULL, 0, &id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*ids, nids, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*vals, nvals, val) < 0) + goto cleanup;
If you had a single structure for id and val, then...
How about:
struct _virResctrlMonitorStats { unsigned int id; unsigned int val; };
Your call how you do it, but that self created sort (below) is more what is objectionable. I remember peeking quickly - there are plenty of qsort examples to choose from in libvirt sources. John
+ + /* Sort @ids and @vals arrays in the ascending order of id */ + cur_id_pos = nids - 1; + for (i = 0; i < cur_id_pos; i++) { + if ((*ids)[cur_id_pos] < (*ids)[i]) { + tmp_id = (*ids)[cur_id_pos]; + tmp_val = (*vals)[cur_id_pos]; + (*ids)[cur_id_pos] = (*ids)[i]; + (*vals)[cur_id_pos] = (*vals)[i]; + (*ids)[i] = tmp_id; + (*vals)[i] = tmp_val; + }
[...]

On 10/12/2018 10:40 PM, John Ferlan wrote:
[...]
virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlMonitorPtr monitor, const char *machinename) @@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc, virResctrlUnlock(lockfd); return ret; } + + +int The eventual only caller never checks status... That's true, and I noticed that. The back-end logic is copied from virResctrlAllocRemove. Understood, but that doesn't check status either. Maybe it needs to change to a void since it uses VIR_ERROR.
Maybe along with removing the following virReportSystemError lines.
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; + So unlike Alloc, @monitor cannot be NULL... @monitor is not allowed to be NULL. This is guaranteed by the caller.
+ if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove %s (%d)"), + monitor->path, errno); + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno); Either virReportSystemError if you're going to handle the returned status or VIR_ERROR if you're not (and are just ignoring), but not both. I would like to remove the virReportSystemError lines and keep the VIR_ERROR line.
I can agree to that along with it being void since it's a best effort.
+ } + + return ret; +} + + +int +virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor, + unsigned int level) +{ + /* Only supports cache level 3 CMT */ + if (level != 3) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor cache level")); + return -1; + } + + monitor->cache_level = level; + + return 0; +} + +unsigned int +virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor) +{ + return monitor->cache_level; +} Based on usage, maybe we should just give up on API's like this. Create a VIR_RESCTRL_MONITOR_CACHE_LEVEL (or something like it)... Then use it at least for now when reading/supplying the level. Thus we leave it to future developer to create the API's and handle the new levels...
If when we Parse we don't find the constant, then error.
Do you mean removing the two functions 'virResctrlMonitorSetCacheLevel' and 'virResctrlMonitorGetCacheLevel' from here, and processing the monitor cache level information directly in domain_conf.c during the XML parse and format process?
Yeah, I think I've come to that conclusion. In the Parse code you'd still parse the value, but then compare against the constant. In the Format code, you can just format what you have. Whomever creates a different level in the future can have the fun of managing range and of course managing if whatever is being fetched is in the particular cache level.
+ + +/* + * virResctrlMonitorGetStatistic [...]
+ str_id = strchr(++str_id, '_'); + if (!str_id) + continue; I think all of the above could be replaced by:
if (!(str_id = STRSKIP(ent->d_name, "mon_L"))) continue;
I personally am not a fan of pre-auto-incr. I get particularly uncomfortable when it involves pointer arithmetic... But I think it's unnecessary if you use STRSKIP
The target directory name of interest looks like 'mon_L3_00', or 'mon_L3CODE_00' if CDP is enabled. The two digits after the second '_' character are the ID number, so we need to locate the second '_' character. One STRSKIP is not enough; a call to strchr is still needed to locate the second '_', in the following way:
if (!(str_id = STRSKIP(ent->d_name, "mon_L"))) continue;
str_id = strchr(str_id , '_'); if (!str_id) continue;
So MON_L3DATA_00 would do the same as would MON_L3FUTURE_00?
Yes. Here I am only interested in the two digits after the second '_' character, which are the node id.
Then let's STRSKIP(str_id, "_") - IOW: Skip to the next
Do you mean three steps in total?

    if (!(str_id = STRSKIP(ent->d_name, "mon_L")))
        continue;

    str_id = strchr(str_id, '_');
    if (!str_id)
        continue;

    if (!(str_id = STRSKIP(str_id, "_"))) /* instead of STRSKIP(ent->d_name, '_') */
        continue;

    if (virStrToLong_uip(str_id, NULL, 0, &id) < 0)
        goto cleanup;
The first STRSKIP is to get to the "level"#, right?
Yes. But I skipped verifying the cache level, since we only support L3 cache monitoring.
The second to the "id"#. Maybe the variables should be named that way to make it clear along with some comments.
Good suggestion. I'll directly use 'node_id' for the name, and the code would look like this:

    /* Looking for a directory that contains the resource utilization
     * information file. The directory name is arranged in the format
     * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and
     * "mon_L3_01" are the two target directories on a two-node system,
     * each holding the resource utilization data file for its node.
     */
    if (ent->d_type != DT_DIR)
        continue;

    /* Look for a directory with the prefix 'mon_L' */
    if (!(node_id = STRSKIP(ent->d_name, "mon_L")))
        continue;

    /* Look for another '_' in the directory name */
    node_id = strchr(node_id, '_');
    if (!node_id)
        continue;

    /* Skip the '_' character */
    if (!(node_id = STRSKIP(node_id, "_")))
        continue;

    /* The node ID number should be here; parse it. */
    if (virStrToLong_uip(node_id, NULL, 0, &id) < 0)
        goto cleanup;
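The directory-name parsing discussed here can be tried outside libvirt with a small standalone sketch. It is not the patch code: parse_mon_node_id is a hypothetical helper, strncmp/strchr/strtoul stand in for STRSKIP and virStrToLong_uip, and base 10 is used deliberately (an assumption of this sketch) so that a zero-padded id such as "08" is not treated as octal by strtoul.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Extract the node id from a mon_data directory name of the form
 * "mon_L<something>_<id>", e.g. "mon_L3_00" or "mon_L3CODE_01".
 * Returns 0 on success, -1 if the name does not match. */
static int
parse_mon_node_id(const char *name, unsigned int *id)
{
    static const char prefix[] = "mon_L";
    const char *p;
    char *end = NULL;
    unsigned long v;

    assert(name != NULL && id != NULL);

    /* Step 1: require and skip the "mon_L" prefix. */
    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = name + sizeof(prefix) - 1;

    /* Step 2: find the second '_' and step past it. */
    p = strchr(p, '_');
    if (!p)
        return -1;
    p++;

    /* Step 3: the rest must be a plain decimal number. */
    if (*p < '0' || *p > '9')
        return -1;
    errno = 0;
    v = strtoul(p, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT_MAX)
        return -1;

    *id = (unsigned int)v;
    return 0;
}
```

Names such as "mon_groups" or "mon_L3" are rejected because they lack either the prefix or the second '_', which matches the "continue" branches in the loop above.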
+ + if (virStrToLong_uip(++str_id, NULL, 0, &id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*ids, nids, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*vals, nvals, val) < 0) + goto cleanup; If you had a single structure for id and val, then...
How about:
struct _virResctrlMonitorStats { unsigned int id; unsigned int val; };
Your call how you do it, but that self created sort (below) is more what is objectionable. I remember peeking quickly - there are plenty of qsort examples to choose from in libvirt sources.
I'll use the _virResctrlMonitorStats structure and sort the output with qsort, as you suggested.
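A minimal sketch of what the combined structure plus qsort could look like, assuming the two-field layout proposed above. The names monitor_stats, monitor_stats_cmp and monitor_stats_sort are hypothetical stand-ins, not the identifiers of the final patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed _virResctrlMonitorStats. */
struct monitor_stats {
    unsigned int id;    /* node (cache bank) id */
    unsigned int val;   /* statistic read for that node */
};

/* qsort comparator: ascending order of id. */
static int
monitor_stats_cmp(const void *a, const void *b)
{
    const struct monitor_stats *sa = a;
    const struct monitor_stats *sb = b;

    if (sa->id < sb->id)
        return -1;
    if (sa->id > sb->id)
        return 1;
    return 0;
}

/* Sort once, after the directory scan has collected every entry. */
static void
monitor_stats_sort(struct monitor_stats *stats, size_t nstats)
{
    assert(stats != NULL || nstats == 0);

    if (nstats > 1)
        qsort(stats, nstats, sizeof(*stats), monitor_stats_cmp);
}
```

Keeping id and val in one element means a single qsort call moves both together, so the paired swap of two parallel arrays inside the read loop is no longer needed.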
John
Thanks for the review. Huaqiang
+ + /* Sort @ids and @vals arrays in the ascending order of id */ + cur_id_pos = nids - 1; + for (i = 0; i < cur_id_pos; i++) { + if ((*ids)[cur_id_pos] < (*ids)[i]) { + tmp_id = (*ids)[cur_id_pos]; + tmp_val = (*vals)[cur_id_pos]; + (*ids)[cur_id_pos] = (*ids)[i]; + (*vals)[cur_id_pos] = (*vals)[i]; + (*ids)[i] = tmp_id; + (*vals)[i] = tmp_val; + }
[...]

In resctrl file system, more than one monitoring groups could be created within one allocation group, along with the creation of allocation group, a monitoring group is created at the same, which monitors the resource utilization information of whole allocation group. This patch is introducing the concept of default monitor, which represents the particular monitoring group that created along with the creation of allocation group. Default monitor shares the common 'vcpu' list with the allocation. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 23 +++++++++++++++++++++++ src/util/virresctrl.h | 2 ++ 3 files changed, 26 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c93d19f..4b22ed4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetDefault; virResctrlMonitorSetID; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fca1f6f..41e8d48 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -340,6 +340,13 @@ struct _virResctrlAlloc { * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal * features. The monitor will be created under the scope of default allocation * if no CAT or MBA supported in the system. + * + * In resctrl file sytem, more than one monitoring groups could be created + * within one allocation group, along with the creation of allocation group, + * a monitoring group is created at the same, which monitors the resource + * utilization information of whole allocation group. + * A virResctrlMonitor with @default_monitor marked as 'true' is representing + * the monitoring group created along with the creation of allocation group. 
*/ struct _virResctrlMonitor { virObject parent; @@ -355,6 +362,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Boolean flag for default monitor */ + bool default_monitor; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return -1; } + if (monitor->default_monitor) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) + return -1; + + return 0; + } + if (monitor->alloc) alloc_path = monitor->alloc->path; else @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", nbank, bankids, bankcaches); } + + +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) +{ + monitor->default_monitor = true; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 6137fee..371df8a 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -228,4 +228,6 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, size_t *nbank, unsigned int **bankids, unsigned int **bankcaches); +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
In resctrl file system, more than one monitoring groups could be created within one allocation group, along with the creation of allocation group, a monitoring group is created at the same, which monitors the resource utilization information of whole allocation group.
This patch is introducing the concept of default monitor, which represents the particular monitoring group that created along with the creation of allocation group.
Default monitor shares the common 'vcpu' list with the allocation.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 23 +++++++++++++++++++++++ src/util/virresctrl.h | 2 ++ 3 files changed, 26 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c93d19f..4b22ed4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetDefault; virResctrlMonitorSetID;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fca1f6f..41e8d48 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -340,6 +340,13 @@ struct _virResctrlAlloc { * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal * features. The monitor will be created under the scope of default allocation * if no CAT or MBA supported in the system. + * + * In resctrl file sytem, more than one monitoring groups could be created'
In the resctrl file system, more than one monitoring group could...
+ * within one allocation group, along with the creation of allocation group,
s/group, along/group. Along/
+ * a monitoring group is created at the same, which monitors the resource + * utilization information of whole allocation group.
Reword - it's a bit redundant
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing + * the monitoring group created along with the creation of allocation group.
Well I'm a bit lost, but let's see what happens. I'm not sure what you're trying to delineate here. There is the creation of a resctrl->alloc when a <monitor> is found but no <cachetune> is found. Thus, we create an empty <cachetune> (one with no <cache> elements). To me that's a default environment. I assume a similar paradigm could exist if there was or wasn't a <memorytune> element...
*/ struct _virResctrlMonitor { virObject parent; @@ -355,6 +362,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Boolean flag for default monitor */ + bool default_monitor; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return -1; }
+ if (monitor->default_monitor) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
See check below ... at this point monitor->alloc could be NULL and won't make this STRDUP very happy.
+ return -1; + + return 0; + } + if (monitor->alloc) alloc_path = monitor->alloc->path; else @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", nbank, bankids, bankcaches); } + + +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) +{ + monitor->default_monitor = true; +}
I really don't see what the value of this is. Looking later on it seems there's some sort of check that the vcpus desired for the monitor match that of some cachetune entry and you then set default. To me that could happen multiple times, e.g.: <cachetune vcpus='0-1' ... /> <cachetune vcpus='2-3' ... /> and <monitor vcpus='0-1' .../> <monitor vcpus='2-3' .../> so, then it would seem there would be two defaults. Is all this being done to save a few steps in virResctrlMonitorDeterminePath? If so, then I see no value. It only adds confusion. John
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 6137fee..371df8a 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -228,4 +228,6 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, size_t *nbank, unsigned int **bankids, unsigned int **bankcaches); +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */

On 10/11/2018 3:14 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
In resctrl file system, more than one monitoring groups could be created within one allocation group, along with the creation of allocation group, a monitoring group is created at the same, which monitors the resource utilization information of whole allocation group.
This patch is introducing the concept of default monitor, which represents the particular monitoring group that created along with the creation of allocation group.
Default monitor shares the common 'vcpu' list with the allocation.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 23 +++++++++++++++++++++++ src/util/virresctrl.h | 2 ++ 3 files changed, 26 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c93d19f..4b22ed4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetDefault; virResctrlMonitorSetID;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fca1f6f..41e8d48 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -340,6 +340,13 @@ struct _virResctrlAlloc { * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal * features. The monitor will be created under the scope of default allocation * if no CAT or MBA supported in the system. + * + * In resctrl file sytem, more than one monitoring groups could be created'
In the resctrl file system, more than one monitoring group could...
Got it.
+ * within one allocation group, along with the creation of allocation group,
s/group, along/group. Along/
Got it.
+ * a monitoring group is created at the same, which monitors the resource + * utilization information of whole allocation group.
Reword - it's a bit redundant
Reworded like this:

 * There are two types of monitors: the default monitor and the non-default
 * monitor. Within one allocation, at most one default monitor and multiple
 * non-default monitors can be created.
 *
 * The flag 'default_monitor' defined in structure virResctrlMonitor denotes
 * whether a monitor is the default one or not.
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing + * the monitoring group created along with the creation of allocation group.
Well I'm a bit lost, but let's see what happens. I'm not sure what you're trying to delineate here. There is the creation of a resctrl->alloc when a <monitor> is found but no <cachetune> is found. Thus, we create an empty <cachetune> (one with no <cache> elements). To me that's a default environment.
I assume a similar paradigm could exist if there was or wasn't a <memorytune> element...
Sorry for the confusion, my bad. I was trying to describe both the behavior of the '/sys/fs/resctrl' filesystem and what I have done here at the same time.

Regarding the XML layout for the default monitor, the following XML lines represent a default monitor. The key rule is that the default monitor's <monitor> entry has the same 'vcpus' setting as its <cachetune>:

  <cachetune vcpus='0-1'>
    <monitor level='3' vcpus='0-1'/>
  </cachetune>

The following example illustrates a <cachetune> with two monitors, one of which is a default monitor:

  <cachetune vcpus='0-1'>
    <monitor level='3' vcpus='0-1'/>
    <monitor level='3' vcpus='0'/>
  </cachetune>

In particular, the following XML layout doesn't define any monitor or allocation. These lines have no effect and do not indicate an error:

  <cachetune vcpus='0-1'/>

or

  <cachetune vcpus='0-1'>
  </cachetune>
*/ struct _virResctrlMonitor { virObject parent; @@ -355,6 +362,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Boolean flag for default monitor */ + bool default_monitor; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return -1; }
+ if (monitor->default_monitor) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
See check below ... at this point monitor->alloc could be NULL and won't make this STRDUP very happy.
Thanks for catching my bug. I did not run the code on my machine with a default monitor, because I have trouble running libvirt with CAT when creating a non-default resctrl allocation. I am working on it, and will run more tests for the monitor.

I'll use the following code to fix it (the next lines are also changed):

    if (monitor->alloc)
        alloc_path = monitor->alloc->path;
    else
        alloc_path = SYSFS_RESCTRL_PATH;

    if (monitor->default_monitor) {
        if (VIR_STRDUP(monitor->path, alloc_path) < 0)
            return -1;

        return 0;
    }
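The NULL-safe path selection from this fix can be exercised in isolation with the sketch below. It is an approximation only: struct alloc, struct monitor, RESCTRL_ROOT and monitor_determine_path are hypothetical stand-ins for the libvirt types and SYSFS_RESCTRL_PATH, malloc replaces VIR_STRDUP, and the non-default branch assumes the "<alloc>/mon_groups/<name>" layout described elsewhere in this thread rather than the exact name libvirt generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RESCTRL_ROOT "/sys/fs/resctrl"   /* stand-in for SYSFS_RESCTRL_PATH */

struct alloc {
    const char *path;
};

struct monitor {
    const struct alloc *alloc;   /* NULL means the default allocation */
    bool default_monitor;
    const char *name;            /* sub-directory name for non-default monitors */
};

/* Return a newly allocated path for @mon, or NULL on failure.
 * The caller frees the result. */
static char *
monitor_determine_path(const struct monitor *mon)
{
    const char *alloc_path;
    char *path;
    size_t len;

    assert(mon != NULL);

    /* Resolve the allocation path first, so a NULL alloc is never dereferenced. */
    if (mon->alloc && mon->alloc->path)
        alloc_path = mon->alloc->path;
    else
        alloc_path = RESCTRL_ROOT;

    if (mon->default_monitor) {
        /* The default monitor lives in the allocation directory itself. */
        len = strlen(alloc_path) + 1;
        path = malloc(len);
        if (path)
            memcpy(path, alloc_path, len);
        return path;
    }

    if (!mon->name)
        return NULL;

    len = strlen(alloc_path) + strlen("/mon_groups/") + strlen(mon->name) + 1;
    path = malloc(len);
    if (path)
        snprintf(path, len, "%s/mon_groups/%s", alloc_path, mon->name);
    return path;
}
```

Resolving alloc_path before the default_monitor check is the whole point of the fix: both branches then share one NULL-safe source for the allocation directory.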
+ return -1; + + return 0; + } + if (monitor->alloc) alloc_path = monitor->alloc->path; else @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", nbank, bankids, bankcaches); } + + +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) +{ + monitor->default_monitor = true; +}
I really don't see what the value of this is. Looking later on it seems there's some sort of check that the vcpus desired for the monitor match that of some cachetune entry and you then set default.
To me that could happen multiple times, e.g.:
<cachetune vcpus='0-1' ... /> <cachetune vcpus='2-3' ... />
and
<monitor vcpus='0-1' .../> <monitor vcpus='2-3' .../>
so, then it would seem there would be two defaults.
Yes, two defaults. But I think the <monitor> entry should be placed under the <cachetune> entry. I'd like to rewrite your configuration in the following way:

  <cachetune vcpus='0-1'>
    <monitor level='3' vcpus='0-1'/>
  </cachetune>
  <cachetune vcpus='2-3'>
    <monitor level='3' vcpus='2-3'/>
  </cachetune>

The configuration above creates two allocations and one default monitor for each allocation. By the way, <cachetune vcpus='0-1'/> has no effect on resctrl but is not considered an error.

Unlike the default allocation, a default monitor can exist multiple times on a host, with the limitation that at most one default monitor can be created per allocation. There is only one default allocation in the whole system. Every monitor is linked to a specific allocation, either the default allocation or a non-default allocation (or just 'allocation').

This behavior of monitors is determined by the kernel's 'resctrl' file system. If I create an allocation by creating a directory 'p0' under '/sys/fs/resctrl/', then add two vcpus' PIDs to the file '/sys/fs/resctrl/p0/tasks' and make some changes to the '/sys/fs/resctrl/p0/schemata' file, the allocation becomes functional for resource limitation. The libvirt CAT code does similar operations for a VM automatically.

Let me show the files under '/sys/fs/resctrl/p0':

.
├── cpus
├── cpus_list
├── mon_data
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
├── schemata
└── tasks

We can find some files for the monitoring role: the 'llc_occupancy' file, the 'mbm_local_bytes' file and the 'mbm_total_bytes' file. The truth is that the 'resctrl' fs creates a monitoring group for each allocation group, and this is what I am referring to as the default monitor.
Since this monitoring group is created whenever a libvirt allocation is created, in the design of the libvirt resctrl monitor I chose to show or hide it according to the XML configuration.

To create other monitoring groups, just make sub-directories under the 'mon_groups' directory and add the corresponding vcpu PIDs to each sub-directory's 'tasks' file. The virResctrlMonitorCreate function creates this kind of sub-directory under the 'mon_groups' directory for a non-default monitor.

For an allocation with one default monitor and one non-default monitor (or just 'monitor' in wording), the file layout looks like this:

.
├── cpus
├── cpus_list
├── mon_data                  <--- default monitor interface
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
│   └── mon1
│       ├── cpus
│       ├── cpus_list
│       ├── mon_data          <--- non-default monitor interface
│       │   ├── mon_L3_00
│       │   │   ├── llc_occupancy
│       │   │   ├── mbm_local_bytes
│       │   │   └── mbm_total_bytes
│       │   └── mon_L3_01
│       │       ├── llc_occupancy
│       │       ├── mbm_local_bytes
│       │       └── mbm_total_bytes
│       └── tasks
├── schemata
└── tasks

This patch gives a monitor the capability to be marked as the default monitor. The default monitor 'physically' exists in the kernel 'resctrl' file system and behaves differently from other monitors.
Is all this being done to save a few steps in virResctrlMonitorDeterminePath? If so, then I see no value. It only adds confusion.
The default monitor has a different role from other monitors; I hope I have documented it clearly.
John
Thanks for the review. Huaqiang
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 6137fee..371df8a 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -228,4 +228,6 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, size_t *nbank, unsigned int **bankids, unsigned int **bankcaches); +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */

Answers refined. On 10/11/2018 3:14 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
In resctrl file system, more than one monitoring groups could be created within one allocation group, along with the creation of allocation group, a monitoring group is created at the same, which monitors the resource utilization information of whole allocation group.
This patch is introducing the concept of default monitor, which represents the particular monitoring group that created along with the creation of allocation group.
Default monitor shares the common 'vcpu' list with the allocation.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 23 +++++++++++++++++++++++ src/util/virresctrl.h | 2 ++ 3 files changed, 26 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c93d19f..4b22ed4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetDefault; virResctrlMonitorSetID;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fca1f6f..41e8d48 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -340,6 +340,13 @@ struct _virResctrlAlloc { * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal * features. The monitor will be created under the scope of default allocation * if no CAT or MBA supported in the system. + * + * In resctrl file sytem, more than one monitoring groups could be created' In the resctrl file system, more than one monitoring group could...
Got it.
+ * within one allocation group, along with the creation of allocation group, s/group, along/group. Along/
Got it.
+ * a monitoring group is created at the same, which monitors the resource + * utilization information of whole allocation group. Reword - it's a bit redundant
Reworded like this:

 * There are two types of monitors: the default monitor and the non-default
 * monitor. Within one allocation, at most one default monitor and multiple
 * non-default monitors can be created.
 *
 * The flag 'default_monitor' defined in structure virResctrlMonitor denotes
 * whether a monitor is the default one or not.
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing + * the monitoring group created along with the creation of allocation group.
Well I'm a bit lost, but let's see what happens. I'm not sure what you're trying to delineate here. There is the creation of a resctrl->alloc when a <monitor> is found but no <cachetune> is found. Thus, we create an empty <cachetune> (one with no <cache> elements). To me that's a default environment.
I assume a similar paradigm could exist if there was or wasn't a <memorytune> element...
Sorry for the confusion, my bad. I was trying to describe both the behavior of the '/sys/fs/resctrl' filesystem and what I have done here at the same time.

Regarding the XML layout for the default monitor, the following XML lines represent a default monitor. The key rule is that the default monitor's <monitor> entry has the same 'vcpus' setting as its <cachetune>:

  <cachetune vcpus='0-1'>
    <monitor level='3' vcpus='0-1'/>
  </cachetune>

The following example illustrates a <cachetune> with two monitors, one of which is a default monitor:

  <cachetune vcpus='0-1'>
    <monitor level='3' vcpus='0-1'/>
    <monitor level='3' vcpus='0'/>
  </cachetune>

In particular, the following XML layout doesn't define any monitor or allocation. These lines have no effect and do not indicate an error:

  <cachetune vcpus='0-1'/>

or

  <cachetune vcpus='0-1'>
  </cachetune>
 */
 struct _virResctrlMonitor {
     virObject parent;
@@ -355,6 +362,8 @@ struct _virResctrlMonitor {
     /* libvirt-generated path in /sys/fs/resctrl for this particular
      * monitor */
     char *path;
+    /* Boolean flag for default monitor */
+    bool default_monitor;
     /* The cache 'level', special for cache monitor */
     unsigned int cache_level;
 };
@@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
         return -1;
     }
+    if (monitor->default_monitor) {
+        if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
See check below ... at this point monitor->alloc could be NULL and won't make this STRDUP very happy.
Thanks for catching my bug. I did not run the code on my machine with a default monitor, because I have trouble running libvirt with CAT and creating a non-default resctrl allocation. I am working on it, and will add more tests for the monitor. I will use the following code to fix it (the next lines are changed as well):
    if (monitor->alloc)
        alloc_path = monitor->alloc->path;
    else
        alloc_path = SYSFS_RESCTRL_PATH;
    if (monitor->default_monitor) {
        if (VIR_STRDUP(monitor->path, alloc_path) < 0)
            return -1;
        return 0;
    }
+            return -1;
+
+        return 0;
+    }
+
     if (monitor->alloc)
         alloc_path = monitor->alloc->path;
     else
@@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
     return virResctrlMonitorGetStatistic(monitor, "llc_occupancy",
                                          nbank, bankids, bankcaches);
 }
+
+
+void
+virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor)
+{
+    monitor->default_monitor = true;
+}
I really don't see what the value of this is. Looking later on it seems there's some sort of check that the vcpus desired for the monitor match that of some cachetune entry and you then set default.
To me that could happen multiple times, e.g.:
<cachetune vcpus='0-1' ... /> <cachetune vcpus='2-3' ... />
and
<monitor vcpus='0-1' .../> <monitor vcpus='2-3' .../>
so, then it would seem there would be two defaults.
Yes, two defaults. But I think the <monitor .. > entry should be placed under the <cachetune> entry. I'd like to rewrite your configuration in the following way:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
  <monitor level='3' vcpus='2-3'/>
</cachetune>
The configuration above creates two allocations and one default monitor for each allocation. By the way, <cachetune vcpus='0-1' /> has no effect on resctrl but is not considered an error.
Unlike the default allocation, a default monitor can be created multiple times across the system, with the limitation that each allocation has at most one default monitor. There is only one default allocation in the whole system. Every monitor is linked to a specific allocation, either the default allocation or a non-default allocation (or just 'allocation').
This behavior of monitors is determined by the kernel's 'resctrl' file system. If I create an allocation by creating a directory 'p0' under '/sys/fs/resctrl/', add two vcpus' PIDs to the file '/sys/fs/resctrl/p0/tasks' and make some changes to the '/sys/fs/resctrl/p0/schemata' file, then the allocation is functional for resource limitation. The libvirt CAT code does similar operations for a VM automatically.
Let me show the files under '/sys/fs/resctrl/p0':
.
├── cpus
├── cpus_list
├── mon_data
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
├── schemata
└── tasks
We can find some files for the monitoring role: the 'llc_occupancy', 'mbm_local_bytes' and 'mbm_total_bytes' files. In fact, the 'resctrl' fs creates a monitoring group for each allocation group along with its creation, and this monitoring group is what I introduced as the default monitor. The default monitor shares the same 'tasks' file with the allocation, so it monitors the resource utilization of all PIDs listed in the 'tasks' file.
Since this monitoring group is created whenever a libvirt allocation is created, in the design of the libvirt resctrl monitor I chose to show it or not according to the XML configuration.
To create other monitoring groups, just make sub-directories under the 'mon_groups' directory and add the corresponding vcpu PIDs to each sub-directory's 'tasks' file. The virResctrlMonitorCreate function creates this kind of sub-directory under the 'mon_groups' directory for a non-default monitor. For a non-default monitor, you can specify a subset of the PIDs in the allocation's 'tasks' file, and no PID overlap is allowed between non-default monitors.
For an allocation with one default monitor and one non-default monitor (or just 'monitor'), the file layout looks like this:
.
├── cpus
├── cpus_list
├── mon_data                 <--- default monitor interface
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
│   └── mon1
│       ├── cpus
│       ├── cpus_list
│       ├── mon_data         <--- non-default monitor interface
│       │   ├── mon_L3_00
│       │   │   ├── llc_occupancy
│       │   │   ├── mbm_local_bytes
│       │   │   └── mbm_total_bytes
│       │   └── mon_L3_01
│       │       ├── llc_occupancy
│       │       ├── mbm_local_bytes
│       │       └── mbm_total_bytes
│       └── tasks
├── schemata
└── tasks
This patch gives a monitor the capability to be marked as a default monitor. The default monitor 'physically' exists in the kernel 'resctrl' file system and behaves differently from other monitors.
Is all this being done to save a few steps in virResctrlMonitorDeterminePath? If so, then I see no value. It only adds confusion.
The default monitor has a different role from other monitors; I hope I have documented it clearly. Without identifying the default monitor, all monitors would be created under the allocation's 'mon_groups' directory, and the following configuration would not be supported due to the overlap between monitors:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
  <monitor level='3' vcpus='0'/>
</cachetune>
So the default monitor is valuable once you know the backend mechanism of the kernel resctrl file system.
John
Thanks for review. Huaqiang
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 6137fee..371df8a 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -228,4 +228,6 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
                                    size_t *nbank,
                                    unsigned int **bankids,
                                    unsigned int **bankcaches);
+void
+virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor);
 #endif /* __VIR_RESCTRL_H__ */

On 10/11/18 8:02 AM, Wang, Huaqiang wrote:
Answers refined.
On 10/11/2018 3:14 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
In the resctrl file system, more than one monitoring group can be created within one allocation group. Along with the creation of an allocation group, a monitoring group is created at the same time, which monitors the resource utilization of the whole allocation group.
This patch introduces the concept of a default monitor, which represents the particular monitoring group that is created along with the allocation group.
The default monitor shares the common 'vcpu' list with the allocation.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c    | 23 +++++++++++++++++++++++
 src/util/virresctrl.h    |  2 ++
 3 files changed, 26 insertions(+)
I assume the two responses to my review are very similar, but I'll use this second one since it's timewise later...
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index c93d19f..4b22ed4 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2689,6 +2689,7 @@ virResctrlMonitorGetID;
 virResctrlMonitorNew;
 virResctrlMonitorRemove;
 virResctrlMonitorSetCacheLevel;
+virResctrlMonitorSetDefault;
 virResctrlMonitorSetID;
[...]
+ * a monitoring group is created at the same, which monitors the resource
+ * utilization information of whole allocation group.
Reword - it's a bit redundant
Reworded like this:
 * There are two types of monitors, the default monitor and the non-default
 * monitor. Within one allocation, up to one default monitor and more than
 * one non-default monitors could be created.
 *
 * The flag 'default_monitor' defined in structure virResctrlMonitor denotes
 * if a monitor is the default one or not.
Starting with "Within one allocation,..." - doesn't make much sense to me as a reader, sorry.
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing
+ * the monitoring group created along with the creation of allocation group.
Well I'm a bit lost, but let's see what happens. I'm not sure what you're trying to delineate here. There is the creation of a resctrl->alloc when a <monitor> is found but no <cachetune> is found. Thus, we create an empty <cachetune> (one with no <cache> elements). To me that's a default environment.
I assume a similar paradigm could exist if there was or wasn't a <memorytune> element...
Sorry for the confusion, my bad. I was trying to describe the layout of the '/sys/fs/resctrl' filesystem and what this patch does at the same time.
Regarding the XML layout for a default monitor: the following XML lines represent a default monitor. The key rule is that the default monitor's <monitor> entry has the same 'vcpus' setting as its <cachetune>.
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
The following example illustrates a <cachetune> with two monitors, one of which is a default monitor:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
This gets tagged as a default monitor just because the vcpus match?
<monitor level='3' vcpus='0'/>
and this is not a default monitor because the vcpus don't match
</cachetune>
Note that the following XML layouts do not define any monitor or allocation. They have no effect and are not treated as an error.
<cachetune vcpus='0-1'/>
or
<cachetune vcpus='0-1'> </cachetune>
I understand that, but that wasn't my point. If someone modified their XML and added just that, then add the resctrl->alloc as soon as you see it. Then when you see a <cache>, that gets added to some cache list. When you see a <monitor> that gets added to some monitor list. The whole concept of calling some a default something just isn't working for me. It's a cache or monitor for either all or some subset of the vcpus for the cachetune (or resctrl->alloc). Nothing more, nothing less.
*/ struct _virResctrlMonitor { virObject parent; @@ -355,6 +362,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Boolean flag for default monitor */ + bool default_monitor; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return -1; }
+ if (monitor->default_monitor) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) See check below ... at this point monitor->alloc could be NULL and won't make this STRDUP very happy.
Thanks for catching my bug. I did not run the code on my machine with a default monitor, because I have trouble running libvirt with CAT and creating a non-default resctrl allocation. I am working on it, and will add more tests for the monitor.
I will use the following code to fix it (the next lines are changed as well):
    if (monitor->alloc)
        alloc_path = monitor->alloc->path;
    else
        alloc_path = SYSFS_RESCTRL_PATH;
    if (monitor->default_monitor) {
        if (VIR_STRDUP(monitor->path, alloc_path) < 0)
            return -1;
        return 0;
    }
+ return -1; + + return 0; + } + if (monitor->alloc) alloc_path = monitor->alloc->path; else @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", nbank, bankids, bankcaches); } + + +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) +{ + monitor->default_monitor = true; +}
I really don't see what the value of this is. Looking later on it seems there's some sort of check that the vcpus desired for the monitor match that of some cachetune entry and you then set default.
To me that could happen multiple times, e.g.:
<cachetune vcpus='0-1' ... /> <cachetune vcpus='2-3' ... />
and
<monitor vcpus='0-1' .../> <monitor vcpus='2-3' .../>
so, then it would seem there would be two defaults.
Yes, two defaults. But I think the <monitor .. > entry should be placed under the <cachetune> entry.
I'd like to rewrite your configuration in the following way:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
  <monitor level='3' vcpus='2-3'/>
</cachetune>
The configuration above creates two allocations and one default monitor for each allocation.
and a default monitor vs. a non-default monitor has no specific meaning afaict.
By the way,
<cachetune vcpus='0-1' />
has no effect on resctrl but is not considered an error.
Unlike the default allocation, a default monitor can be created multiple times across the system, with the limitation that each allocation has at most one default monitor. There is only one default allocation in the whole system.
Every monitor is linked to a specific allocation, either the default allocation or a non-default allocation (or just 'allocation').
Simplify the code - default allocation means no <cache> lines, true?
Is the monitor related to the cache or the cachetune? If you had:
<cachetune 0-4> <cache 0> <cache 1-2> <cache 3> <cache 4> ...
Then is:
<monitor 0-4> <monitor 2>
acceptable? If so, then I'm not seeing the need for default monitor and/or default allocation.
This behavior of monitors is determined by the kernel's 'resctrl' file system.
If I create an allocation by creating a directory 'p0' under '/sys/fs/resctrl/', add two vcpus' PIDs to the file '/sys/fs/resctrl/p0/tasks' and make some changes to the '/sys/fs/resctrl/p0/schemata' file, then the allocation is functional for resource limitation. The libvirt CAT code does similar operations for a VM automatically.
The next hunk I'll need to look at later - too many other things cycling through my head right now. Nice picture, but correlating this to code is not clicking. My quick look though - I see the same files in both the default and non-default pictures - the path to get there is different, but that path afaik is generated as part of the processing and shouldn't know whether it's default or non-default. It's just a path to data based on some other "base" path. John
Let me show the files under '/sys/fs/resctrl/p0':
.
├── cpus
├── cpus_list
├── mon_data
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
├── schemata
└── tasks
We can find some files for the monitoring role: the 'llc_occupancy', 'mbm_local_bytes' and 'mbm_total_bytes' files. In fact, the 'resctrl' fs creates a monitoring group for each allocation group along with its creation, and this monitoring group is what I introduced as the default monitor. The default monitor shares the same 'tasks' file with the allocation, so it monitors the resource utilization of all PIDs listed in the 'tasks' file.
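As a rough illustration of where these counters live, the sketch below builds the path of one monitoring file for a given group directory. The helper name mon_file_path and its parameters are hypothetical and not part of libvirt; only the mon_data/mon_L3_XX/<file> layout is taken from the tree above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libvirt API): write
 * "<group_dir>/mon_data/mon_L3_<bank>/<file>" into buf.
 * Returns 0 on success, -1 if the result does not fit in buf. */
static int
mon_file_path(char *buf, size_t buflen, const char *group_dir,
              unsigned int bank_id, const char *file)
{
    int n;

    assert(buf && group_dir && file);

    /* Bank directories are named with a two-digit id: mon_L3_00, mon_L3_01 */
    n = snprintf(buf, buflen, "%s/mon_data/mon_L3_%02u/%s",
                 group_dir, bank_id, file);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

For the 'p0' group above, bank 1 and "llc_occupancy" would give '/sys/fs/resctrl/p0/mon_data/mon_L3_01/llc_occupancy'.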
Since this monitoring group is created whenever a libvirt allocation is created, in the design of the libvirt resctrl monitor I chose to show it or not according to the XML configuration.
To create other monitoring groups, just make sub-directories under the 'mon_groups' directory and add the corresponding vcpu PIDs to each sub-directory's 'tasks' file. The virResctrlMonitorCreate function creates this kind of sub-directory under the 'mon_groups' directory for a non-default monitor. For a non-default monitor, you can specify a subset of the PIDs in the allocation's 'tasks' file, and no PID overlap is allowed between non-default monitors.
For an allocation with one default monitor and one non-default monitor (or just 'monitor'), the file layout looks like this:
.
├── cpus
├── cpus_list
├── mon_data                 <--- default monitor interface
│   ├── mon_L3_00
│   │   ├── llc_occupancy
│   │   ├── mbm_local_bytes
│   │   └── mbm_total_bytes
│   └── mon_L3_01
│       ├── llc_occupancy
│       ├── mbm_local_bytes
│       └── mbm_total_bytes
├── mon_groups
│   └── mon1
│       ├── cpus
│       ├── cpus_list
│       ├── mon_data         <--- non-default monitor interface
│       │   ├── mon_L3_00
│       │   │   ├── llc_occupancy
│       │   │   ├── mbm_local_bytes
│       │   │   └── mbm_total_bytes
│       │   └── mon_L3_01
│       │       ├── llc_occupancy
│       │       ├── mbm_local_bytes
│       │       └── mbm_total_bytes
│       └── tasks
├── schemata
└── tasks
This patch gives a monitor the capability to be marked as a default monitor. The default monitor 'physically' exists in the kernel 'resctrl' file system and behaves differently from other monitors.
Is all this being done to save a few steps in virResctrlMonitorDeterminePath? If so, then I see no value. It only adds confusion.
The default monitor has a different role from other monitors; I hope I have documented it clearly.
Without identifying the default monitor, all monitors would be created under the allocation's 'mon_groups' directory, and the following configuration would not be supported due to the overlap between monitors:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
  <monitor level='3' vcpus='0'/>
</cachetune>
So the default monitor is valuable once you know the backend mechanism of the kernel resctrl file system.
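The rules described in this reply can be sketched as a small check. This is only an illustration under stated assumptions: vcpu sets are modelled as a 64-bit mask instead of libvirt's virBitmap, and the name monitors_valid is hypothetical. A monitor whose vcpus equal the allocation's vcpus is treated as the one that reuses the allocation's own mon_data; every other monitor must be a subset of the allocation and must not overlap another such monitor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per vcpu; a simplified stand-in for libvirt's virBitmap. */
typedef uint64_t vcpu_mask;

/* Hypothetical check (not libvirt API): returns 0 if the set of monitors
 * is acceptable for an allocation covering alloc_vcpus, -1 otherwise. */
static int
monitors_valid(vcpu_mask alloc_vcpus, const vcpu_mask *mons, size_t nmons)
{
    vcpu_mask seen = 0;     /* vcpus already claimed under mon_groups */
    int whole_group = 0;    /* monitors covering the whole allocation */
    size_t i;

    assert(mons || nmons == 0);

    for (i = 0; i < nmons; i++) {
        /* Reject empty monitors and vcpus outside the allocation */
        if (mons[i] == 0 || (mons[i] & ~alloc_vcpus))
            return -1;

        /* Same vcpus as the allocation: at most one such monitor */
        if (mons[i] == alloc_vcpus) {
            if (++whole_group > 1)
                return -1;
            continue;
        }

        /* Monitors under mon_groups must not share vcpus */
        if (mons[i] & seen)
            return -1;
        seen |= mons[i];
    }

    return 0;
}
```

With this reading, the configuration above (vcpus '0-1' plus vcpus '0' inside a '0-1' <cachetune>) passes, while two monitors that both name vcpu '0' do not.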
[...]

On 10/12/2018 11:18 PM, John Ferlan wrote:
Answers refined.
On 10/11/2018 3:14 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
In the resctrl file system, more than one monitoring group can be created within one allocation group. Along with the creation of an allocation group, a monitoring group is created at the same time, which monitors the resource utilization of the whole allocation group.
This patch introduces the concept of a default monitor, which represents the particular monitoring group that is created along with the allocation group.
The default monitor shares the common 'vcpu' list with the allocation.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 23 +++++++++++++++++++++++ src/util/virresctrl.h | 2 ++ 3 files changed, 26 insertions(+)
I assume the two responses to my review are very similar, but I'll use this second one since it's timewise later...
On 10/11/18 8:02 AM, Wang, Huaqiang wrote:
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c93d19f..4b22ed4 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2689,6 +2689,7 @@ virResctrlMonitorGetID; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; +virResctrlMonitorSetDefault; virResctrlMonitorSetID;
[...]
+ * a monitoring group is created at the same, which monitors the resource
+ * utilization information of whole allocation group.
Reword - it's a bit redundant
Reworded like this:
 * There are two types of monitors, the default monitor and the non-default
 * monitor. Within one allocation, up to one default monitor and more than
 * one non-default monitors could be created.
 *
 * The flag 'default_monitor' defined in structure virResctrlMonitor denotes
 * if a monitor is the default one or not.
Starting with "Within one allocation,..." - doesn't make much sense to me as a reader, sorry.
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing
+ * the monitoring group created along with the creation of allocation group.
Well I'm a bit lost, but let's see what happens. I'm not sure what you're trying to delineate here. There is the creation of a resctrl->alloc when a <monitor> is found but no <cachetune> is found. Thus, we create an empty <cachetune> (one with no <cache> elements). To me that's a default environment.
I re-read the paragraph above. You made your point very clearly, and now I finally follow you ... the default monitor can be simplified: because there is already a pointer to the 'allocation' (@resctrl->alloc in domain_conf.c or @monitor->alloc in virresctrl.c), the 'default' case can be determined by checking whether @alloc is a NULL pointer. Then there is no need for 'virResctrlMonitorSetDefault' or the '@monitor->default_monitor' variable. I will remove the concept of 'default monitor' and the related data field in virResctrlMonitor, as well as the API(s) (mainly virResctrlMonitorSetDefault). Wording related to the 'default allocation' concept will be removed too. There are not many data fields and API(s) related to the default allocation, and I will check my code to remove all of them. Is there still any gap between us on this?
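A minimal sketch of that simplification, assuming heavily reduced stand-ins for the libvirt structures: the struct layouts, RESCTRL_ROOT and monitor_path are hypothetical names for illustration, not the v6 code. The base directory is chosen purely from whether the allocation pointer is NULL, mirroring the alloc_path selection quoted earlier in this thread, so no separate flag is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for SYSFS_RESCTRL_PATH */
#define RESCTRL_ROOT "/sys/fs/resctrl"

/* Simplified stand-ins for virResctrlAlloc / virResctrlMonitor */
struct alloc { const char *path; };
struct monitor { const struct alloc *alloc; const char *id; };

/* Write "<base>/mon_groups/<id>" into buf, where base is the allocation's
 * path, or the resctrl root when the monitor has no allocation.
 * Returns 0 on success, -1 if the result does not fit in buf. */
static int
monitor_path(const struct monitor *m, char *buf, size_t buflen)
{
    const char *base;
    int n;

    assert(m && m->id && buf);

    /* The NULL check replaces the 'default_monitor' flag */
    base = m->alloc ? m->alloc->path : RESCTRL_ROOT;

    n = snprintf(buf, buflen, "%s/mon_groups/%s", base, m->id);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

How a monitor that covers exactly the allocation's vcpus should be mapped (its own sub-directory or the allocation's directory) is left open here, since the thread has not settled it.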
I assume a similar paradigm could exist if there was or wasn't a <memorytune> element...
Sorry for the confusion, my bad. I was trying to describe the layout of the '/sys/fs/resctrl' filesystem and what this patch does at the same time.
Regarding the XML layout for a default monitor: the following XML lines represent a default monitor. The key rule is that the default monitor's <monitor> entry has the same 'vcpus' setting as its <cachetune>.
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
The following example illustrates a <cachetune> with two monitors, one of which is a default monitor:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
This gets tagged as a default monitor just because the vcpus match?
Yes.
  <monitor level='3' vcpus='0'/>
and this is not a default monitor because the vcpus don't match
Yes.
</cachetune>
Note that the following XML layouts do not define any monitor or allocation. They have no effect and are not treated as an error.
<cachetune vcpus='0-1'/>
or
<cachetune vcpus='0-1'> </cachetune>
I understand that, but that wasn't my point. If someone modified their XML and added just that, then add the resctrl->alloc as soon as you see it. Then when you see a <cache>, that gets added to some cache list. When you see a <monitor> that gets added to some monitor list.
The whole concept of calling some a default something just isn't working for me. It's a cache or monitor for either all or some subset of the vcpus for the cachetune (or resctrl->alloc). Nothing more, nothing less.
Again, I'll remove the default concepts that I tried to introduce before. The concepts of 'default allocation' and 'default monitor' will be removed. The difference between what I called the 'default monitor' and a 'non-default monitor' is tiny and can be handled with a simple check. (Whether a monitor is a default one or not can be found out by checking the associated @alloc pointer.)
*/ struct _virResctrlMonitor { virObject parent; @@ -355,6 +362,8 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Boolean flag for default monitor */ + bool default_monitor; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return -1; }
+    if (monitor->default_monitor) {
+        if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
See check below ... at this point monitor->alloc could be NULL and won't make this STRDUP very happy.
Thanks for catching my bug. I did not run the code on my machine with a default monitor, because I have trouble running libvirt with CAT and creating a non-default resctrl allocation. I am working on it, and will add more tests for the monitor.
I will use the following code to fix it (the next lines are changed as well):
    if (monitor->alloc)
        alloc_path = monitor->alloc->path;
    else
        alloc_path = SYSFS_RESCTRL_PATH;
    if (monitor->default_monitor) {
        if (VIR_STRDUP(monitor->path, alloc_path) < 0)
            return -1;
        return 0;
    }
+ return -1; + + return 0; + } + if (monitor->alloc) alloc_path = monitor->alloc->path; else @@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, return virResctrlMonitorGetStatistic(monitor, "llc_occupancy", nbank, bankids, bankcaches); } + + +void +virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) +{ + monitor->default_monitor = true; +} I really don't see what the value of this is. Looking later on it seems there's some sort of check that the vcpus desired for the monitor match that of some cachetune entry and you then set default.
To me that could happen multiple times, e.g.:
<cachetune vcpus='0-1' ... /> <cachetune vcpus='2-3' ... />
and
<monitor vcpus='0-1' .../> <monitor vcpus='2-3' .../>
so, then it would seem there would be two defaults.
Yes, two defaults. But I think the <monitor .. > entry should be placed under the <cachetune> entry.
I'd like to rewrite your configuration in the following way:
<cachetune vcpus='0-1'>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
  <monitor level='3' vcpus='2-3'/>
</cachetune>
The configuration above creates two allocations and one default monitor for each allocation.
and a default monitor vs. a non-default monitor has no specific meaning afaict.
Agree. Will remove this 'default' concept and related things.
By the way,
<cachetune vcpus='0-1' />
has no effect on resctrl but is not considered an error.
Unlike the default allocation, a default monitor can be created multiple times across the system, with the limitation that each allocation has at most one default monitor. There is only one default allocation in the whole system.
Every monitor is linked to a specific allocation, either the default allocation or a non-default allocation (or just 'allocation').
Simplify the code - default allocation means no <cache> lines true?
Maybe not, or I misinterpreted your words. The 'default allocation' group refers to the resctrl group '/sys/fs/resctrl' itself, which defines the cache or memory bandwidth policy in the file '/sys/fs/resctrl/schemata'. It is possible to change the resource allocation policy by changing the content of '/sys/fs/resctrl/schemata'. But in the libvirt virresctrl model, I am not sure whether it is possible to change that resource allocation policy by adding a <cache> entry somewhere. I need to investigate this for a full test of CMT. Anyway, it should be possible to simplify the concept of 'default allocation'. I'll do that in my v6 code.
Is monitor the related to the cache or the cachetune? If you had:
<cachetune 0-4> <cache 0> <cache 1-2> <cache 3> <cache 4> ...
Do the numbers in the <cache> elements above refer to vcpus? If so, it is not possible to split vcpus within one <cachetune>. This is based on the assumption that one <cachetune> represents one directory under '/sys/fs/resctrl' (or '/sys/fs/resctrl' itself), so you can create only one allocation, with one cache/memory bandwidth allocation policy, for a set of vcpus. It is impossible to create multiple cache allocation policies in one <cachetune>, and it is not allowed to specify 'vcpus' in a <cache> entry. But you can assign a different cache policy to the cache bank of each host node within one <cachetune>:
<cachetune>
  <cache id='0' level='3' type='both' size='2816' unit='KiB'/>  <-- cannot specify 'vcpus'
  <cache id='1' level='3' type='both' size='2816' unit='KiB'/>  <-- the first number is 'id', not vcpu
  <monitor level='3' vcpus='0-2'/>
  <monitor level='3' vcpus='0'/>
  <monitor level='3' vcpus='1'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
Then is :
<monitor 0-4> <monitor 2>
acceptible? If so, then I'm not seeing the need for default monitor and/or default allocation.
Again, I realized this part can be simplified. It will be done in the next code update for your review.
This behavior of monitors is determined by the kernel's 'resctrl' file system.
If I create an allocation by creating a directory 'p0' under '/sys/fs/resctrl/', add two vcpus' PIDs to the file '/sys/fs/resctrl/p0/tasks' and make some changes to the '/sys/fs/resctrl/p0/schemata' file, then the allocation is functional for resource limitation. The libvirt CAT code does similar operations for a VM automatically.
The next hunk I'll need to look at later - too many other things cycling through my head right now. Nice picture, but correlating this to code is not clicking. My quick look though - I see the same files in both the default and non-default pictures - the path to get there is different, but that path afaik is generated as part of the processing and shouldn't know whether it's default or non-default. It's just a path to data based on some other "base" path.
John
Thanks for review. Huaqiang
[...]

Refactoring the code of matching the new resctrl with existing resctrl groups. Add the virObjectRef action into function virDomainResctrlVcpuMatch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/conf/domain_conf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b77680e..e2b4701 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18833,7 +18833,7 @@ virDomainResctrlVcpuMatch(virDomainDefPtr def,
          * Just updating memory allocation information of that group
          */
         if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) {
-            *alloc = def->resctrls[i]->alloc;
+            *alloc = virObjectRef(def->resctrls[i]->alloc);
             break;
         }
         if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) {
@@ -19225,8 +19225,6 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
             if (!alloc)
                 goto cleanup;
             new_alloc = true;
-        } else {
-            alloc = virObjectRef(alloc);
         }
         for (i = 0; i < n; i++) {
--
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactoring the code of matching the new resctrl with existing resctrl groups. Add the virObjectRef action into function virDomainResctrlVcpuMatch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 10/11/2018 3:15 AM, John Ferlan wrote:
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactoring the code of matching the new resctrl with existing resctrl groups. Add the virObjectRef action into function virDomainResctrlVcpuMatch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactoring the code of matching the new resctrl with existing resctrl groups. Add the virObjectRef action into function virDomainResctrlVcpuMatch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
Extra question here... Another caller to virDomainResctrlVcpuMatch is virDomainCachetuneDefParse... Prior to this change, if @alloc was returned we'd go to Unref(alloc) which I think is a bug, right? All things considered. At least with this change, the Unref wouldn't be for the only Ref ever done on @alloc. I can push this separately, but the answer perhaps changes the commit message a bit... John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b77680e..e2b4701 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18833,7 +18833,7 @@ virDomainResctrlVcpuMatch(virDomainDefPtr def,
          * Just updating memory allocation information of that group
          */
         if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) {
-            *alloc = def->resctrls[i]->alloc;
+            *alloc = virObjectRef(def->resctrls[i]->alloc);
             break;
         }
         if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) {
@@ -19225,8 +19225,6 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
         if (!alloc)
             goto cleanup;
         new_alloc = true;
-    } else {
-        alloc = virObjectRef(alloc);
     }

     for (i = 0; i < n; i++) {
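The reference-counting question raised in this message can be illustrated with a toy model. This is only a sketch, assuming a simple counter: `ToyAlloc`, `toy_ref`, `toy_unref` and `toy_match` are hypothetical stand-ins, not the virObject API. It shows why a lookup that hands back a new reference lets every caller run an unconditional unref in its cleanup path without dropping the reference held by the stored definition.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted allocation object. */
typedef struct {
    int refs;
} ToyAlloc;

static ToyAlloc *
toy_alloc_new(void)
{
    ToyAlloc *a = calloc(1, sizeof(*a));

    if (a)
        a->refs = 1;
    return a;
}

/* Take one more reference and return the same pointer. */
static ToyAlloc *
toy_ref(ToyAlloc *a)
{
    if (a)
        a->refs++;
    return a;
}

/* Drop one reference; returns the count left (0 means the object was freed). */
static int
toy_unref(ToyAlloc *a)
{
    int left;

    if (!a)
        return 0;
    left = --a->refs;
    if (left == 0)
        free(a);
    return left;
}

/* Lookup that returns a NEW reference, as the patch does in the match function. */
static ToyAlloc *
toy_match(ToyAlloc *stored)
{
    return toy_ref(stored);
}

/* A parser-like caller: match, then unconditionally unref in cleanup. */
static int
toy_refs_left_after_parse(void)
{
    ToyAlloc *stored = toy_alloc_new();   /* reference held by the definition */
    ToyAlloc *found;
    int left;

    if (!stored)
        return -1;

    found = toy_match(stored);            /* refs == 2 */
    left = toy_unref(found);              /* caller cleanup, refs == 1 */

    toy_unref(stored);                    /* tear down the toy definition */
    return left;
}
```

If `toy_match` returned the stored pointer without taking a reference, the caller's cleanup would release the definition's only reference, which is the situation the question above describes for the pre-patch code.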

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 5:58 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 11/19] conf: Refactor code for matching existing resctrls
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactoring the code of matching the new resctrl with existing resctrl groups. Add the virObjectRef action into function virDomainResctrlVcpuMatch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
Extra question here... Another caller to virDomainResctrlVcpuMatch is virDomainCachetuneDefParse...
Prior to this change, if @alloc was returned we'd go to Unref(alloc) which I think is a bug, right? All things considered.
Yes.
At least with this change, the Unref wouldn't be for the only Ref ever done on @alloc.
I can push this separately, but the answer perhaps changes the commit message a bit...
I will make this a standalone patch for pushing and change the commit message accordingly.
John
Thanks for review. Huaqiang
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b77680e..e2b4701 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18833,7 +18833,7 @@ virDomainResctrlVcpuMatch(virDomainDefPtr def,
          * Just updating memory allocation information of that group
          */
         if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) {
-            *alloc = def->resctrls[i]->alloc;
+            *alloc = virObjectRef(def->resctrls[i]->alloc);
             break;
         }
         if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) {
@@ -19225,8 +19225,6 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
         if (!alloc)
             goto cleanup;
         new_alloc = true;
-    } else {
-        alloc = virObjectRef(alloc);
     }

     for (i = 0; i < n; i++) {

Refactor virDomainResctrlAppend to facilitate virDomainResctrlDef with the capability to hold more element.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/conf/domain_conf.c | 64 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 19 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index e2b4701..9a514a6 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18920,24 +18920,43 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt,
 }


+static virDomainResctrlDefPtr
+virDomainResctrlNew(virResctrlAllocPtr alloc,
+                    virBitmapPtr vcpus)
+{
+    virDomainResctrlDefPtr resctrl = NULL;
+
+    if (VIR_ALLOC(resctrl) < 0)
+        return NULL;
+
+    if ((resctrl->vcpus = virBitmapNewCopy(vcpus)) == NULL) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("failed to copy 'vcpus'"));
+        goto error;
+    }
+
+    resctrl->alloc = virObjectRef(alloc);
+
+    return resctrl;
+ error:
+    virDomainResctrlDefFree(resctrl);
+    return NULL;
+}
+
+
 static int
 virDomainResctrlAppend(virDomainDefPtr def,
                        xmlNodePtr node,
-                       virResctrlAllocPtr alloc,
-                       virBitmapPtr vcpus,
+                       virDomainResctrlDefPtr resctrl,
                        unsigned int flags)
 {
     char *vcpus_str = NULL;
     char *alloc_id = NULL;
-    virDomainResctrlDefPtr tmp_resctrl = NULL;
     int ret = -1;

-    if (VIR_ALLOC(tmp_resctrl) < 0)
-        goto cleanup;
-
     /* We need to format it back because we need to be consistent in the naming
      * even when users specify some "sub-optimal" string there. */
-    vcpus_str = virBitmapFormat(vcpus);
+    vcpus_str = virBitmapFormat(resctrl->vcpus);
     if (!vcpus_str)
         goto cleanup;

@@ -18954,18 +18973,14 @@ virDomainResctrlAppend(virDomainDefPtr def,
         goto cleanup;
     }

-    if (virResctrlAllocSetID(alloc, alloc_id) < 0)
+    if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0)
         goto cleanup;

-    tmp_resctrl->vcpus = vcpus;
-    tmp_resctrl->alloc = alloc;
-
-    if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0)
+    if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0)
         goto cleanup;

     ret = 0;
  cleanup:
-    virDomainResctrlDefFree(tmp_resctrl);
     VIR_FREE(alloc_id);
     VIR_FREE(vcpus_str);
     return ret;
@@ -18982,6 +18997,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
     xmlNodePtr *nodes = NULL;
     virBitmapPtr vcpus = NULL;
     virResctrlAllocPtr alloc = NULL;
+    virDomainResctrlDefPtr resctrl = NULL;
     ssize_t i = 0;
     int n;
     int ret = -1;
@@ -19030,15 +19046,18 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
         goto cleanup;
     }

-    if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
+    resctrl = virDomainResctrlNew(alloc, vcpus);
+    if (!resctrl)
         goto cleanup;

-    vcpus = NULL;
-    alloc = NULL;
+    if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
+        goto cleanup;

+    resctrl = NULL;
     ret = 0;
  cleanup:
     ctxt->node = oldnode;
+    virDomainResctrlDefFree(resctrl);
     virObjectUnref(alloc);
     virBitmapFree(vcpus);
     VIR_FREE(nodes);
@@ -19196,6 +19215,8 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
     xmlNodePtr *nodes = NULL;
     virBitmapPtr vcpus = NULL;
     virResctrlAllocPtr alloc = NULL;
+    virDomainResctrlDefPtr resctrl = NULL;
+
     ssize_t i = 0;
     int n;
     int ret = -1;
@@ -19240,15 +19261,20 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
      * just update the existing alloc information, which is done in above
      * virDomainMemorytuneDefParseMemory */
     if (new_alloc) {
-        if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
+        resctrl = virDomainResctrlNew(alloc, vcpus);
+        if (!resctrl)
             goto cleanup;
-        vcpus = NULL;
-        alloc = NULL;
+
+        if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
+            goto cleanup;
+
+        resctrl = NULL;
     }

     ret = 0;
  cleanup:
     ctxt->node = oldnode;
+    virDomainResctrlDefFree(resctrl);
     virObjectUnref(alloc);
     virBitmapFree(vcpus);
     VIR_FREE(nodes);
--
2.7.4

This is more "Introduce virDomainResctrlNew" On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactor virDomainResctrlAppend to facilitate virDomainResctrlDef with the capability to hold more element.
and then this says something like: "Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call."
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 64 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 45 insertions(+), 19 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e2b4701..9a514a6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,24 +18920,43 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, }
+static virDomainResctrlDefPtr +virDomainResctrlNew(virResctrlAllocPtr alloc, + virBitmapPtr vcpus) +{ + virDomainResctrlDefPtr resctrl = NULL; + + if (VIR_ALLOC(resctrl) < 0) + return NULL; + + if ((resctrl->vcpus = virBitmapNewCopy(vcpus)) == NULL) {
I'd prefer: if (!(resctrl->vcpus = virBitmapNewCopy(vcpus))) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to copy 'vcpus'")); + goto error; + } + + resctrl->alloc = virObjectRef(alloc); + + return resctrl; + error: + virDomainResctrlDefFree(resctrl); + return NULL; +} + + static int virDomainResctrlAppend(virDomainDefPtr def, xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, + virDomainResctrlDefPtr resctrl, unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; int ret = -1;
- if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; -
Based on below, I think this is where you call the virDomainResctrlNew w/ @cpus and @alloc
/* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(resctrl->vcpus); if (!vcpus_str) goto cleanup;
@@ -18954,18 +18973,14 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; }
- if (virResctrlAllocSetID(alloc, alloc_id) < 0) + if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0) goto cleanup;
- tmp_resctrl->vcpus = vcpus; - tmp_resctrl->alloc = alloc; - - if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) goto cleanup;
ret = 0; cleanup: - virDomainResctrlDefFree(tmp_resctrl);
You'd keep the above by use @resctrl for a parameter. On success, VIR_APPEND_ELEMENT will set resctrl = NULL so the *DefFree won't do anything. Without that, then the @resctrl would be leaked if the APPEND failed for any reason.
VIR_FREE(alloc_id); VIR_FREE(vcpus_str); return ret; @@ -18982,6 +18997,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; ssize_t i = 0; int n; int ret = -1; @@ -19030,15 +19046,18 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; }
- if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(alloc, vcpus); + if (!resctrl) goto cleanup;
- vcpus = NULL; - alloc = NULL; + if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) + goto cleanup;
+ resctrl = NULL;
Of course this is where it gets tricky - although you could pass &resctrl and then use (*resctrl)-> in the function - that way upon successful return resctrl is either added or free'd already. Alternatively, since both areas are changing to first alloc and then append, is there any specific reason the virDomainResctrlNew has to be outside of virDomainResctrlAppend? I do see the future does some other virDomainResctrlMonDefParse and virResctrlAllocIsEmpty calls before virDomainResctrlAppend - may have to rethink all that or just go with the &resctrl logic. Maybe I'll have a different thought later - let's see what happens when I get there. John
ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes); @@ -19196,6 +19215,8 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; + ssize_t i = 0; int n; int ret = -1; @@ -19240,15 +19261,20 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, * just update the existing alloc information, which is done in above * virDomainMemorytuneDefParseMemory */ if (new_alloc) { - if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(alloc, vcpus); + if (!resctrl) goto cleanup; - vcpus = NULL; - alloc = NULL; + + if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) + goto cleanup; + + resctrl = NULL; }
ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes);
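The "&resctrl" idea mentioned in this review can be sketched in isolation. The code below is a toy model under stated assumptions, not the libvirt implementation: `ToyItem`, `ToyList`, `toy_list_append` and `toy_parse_one` are hypothetical names. The append function takes a pointer to the caller's pointer and clears it only on success, so the caller can free unconditionally in one cleanup path without leaking on failure or double-freeing on success.

```c
#include <stdlib.h>

/* Hypothetical element and container types used only for this sketch. */
typedef struct {
    int id;
} ToyItem;

typedef struct {
    ToyItem **items;
    size_t nitems;
} ToyList;

/*
 * Append *item to the list. On success the list owns the item and *item is
 * set to NULL. On failure *item is left untouched, so the caller still owns it.
 */
static int
toy_list_append(ToyList *list, ToyItem **item)
{
    ToyItem **tmp = realloc(list->items,
                            (list->nitems + 1) * sizeof(*list->items));

    if (!tmp)
        return -1;

    list->items = tmp;
    list->items[list->nitems++] = *item;
    *item = NULL;
    return 0;
}

/* Release every element and the array itself. */
static void
toy_list_clear(ToyList *list)
{
    size_t i;

    for (i = 0; i < list->nitems; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->nitems = 0;
}

/* Caller with a single cleanup label, in the style used by the parsers. */
static int
toy_parse_one(ToyList *list, int id)
{
    ToyItem *item = calloc(1, sizeof(*item));
    int ret = -1;

    if (!item)
        goto cleanup;
    item->id = id;

    if (toy_list_append(list, &item) < 0)
        goto cleanup;

    ret = 0;
 cleanup:
    free(item);   /* no-op after a successful append, because item is NULL */
    return ret;
}
```

The alternative kept in the posted patch, where the caller sets its pointer to NULL by hand after a successful append, gives the same ownership result; the sketch only shows how the hand-off can be made part of the callee's contract.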

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 3:54 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 12/19] conf: Refactor virDomainResctrlAppend
This is more "Introduce virDomainResctrlNew"
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Refactor virDomainResctrlAppend to facilitate virDomainResctrlDef with the capability to hold more element.
and then this says something like:
"Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call."
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 64 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 45 insertions(+), 19 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e2b4701..9a514a6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,24 +18920,43 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, }
+static virDomainResctrlDefPtr +virDomainResctrlNew(virResctrlAllocPtr alloc, + virBitmapPtr vcpus) { + virDomainResctrlDefPtr resctrl = NULL; + + if (VIR_ALLOC(resctrl) < 0) + return NULL; + + if ((resctrl->vcpus = virBitmapNewCopy(vcpus)) == NULL) {
I'd prefer:
if (!(resctrl->vcpus = virBitmapNewCopy(vcpus))) {
OK. Seems more consistent with the rest of the code.
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to copy 'vcpus'")); + goto error; + } + + resctrl->alloc = virObjectRef(alloc); + + return resctrl; + error: + virDomainResctrlDefFree(resctrl); + return NULL; +} + + static int virDomainResctrlAppend(virDomainDefPtr def, xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, + virDomainResctrlDefPtr resctrl, unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; int ret = -1;
- if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; -
Based on below, I think this is where you call the virDomainResctrlNew w/ @cpus and @alloc
/* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(resctrl->vcpus); if (!vcpus_str) goto cleanup;
@@ -18954,18 +18973,14 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; }
- if (virResctrlAllocSetID(alloc, alloc_id) < 0) + if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0) goto cleanup;
- tmp_resctrl->vcpus = vcpus; - tmp_resctrl->alloc = alloc; - - if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) goto cleanup;
ret = 0; cleanup: - virDomainResctrlDefFree(tmp_resctrl);
You'd keep the above by use @resctrl for a parameter. On success, VIR_APPEND_ELEMENT will set resctrl = NULL so the *DefFree won't do anything. Without that, then the @resctrl would be leaked if the APPEND failed for any reason.
After the refactoring, this function's @resctrl argument is passed in by its caller. The caller allocates the memory for @resctrl, so it is better to let the caller release the object when this function returns an error. The @resctrl object is allocated, and freed on error, in virDomainCachetuneDefParse and virDomainMemorytuneDefParse. I don't think this code leaks memory.
VIR_FREE(alloc_id); VIR_FREE(vcpus_str); return ret; @@ -18982,6 +18997,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; ssize_t i = 0; int n; int ret = -1; @@ -19030,15 +19046,18 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; }
- if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(alloc, vcpus); + if (!resctrl) goto cleanup;
- vcpus = NULL; - alloc = NULL; + if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) + goto cleanup;
+ resctrl = NULL;
Of course this is where it gets tricky - although you could pass &resctrl and then use (*resctrl)-> in the function - that way upon successful return resctrl is either added or free'd already.
Passing a pointer to @resctrl into virDomainResctrlAppend will work as you said. I don't think the current implementation of this refactoring causes the memory leak you pointed out above, but since you prefer passing &resctrl to virDomainResctrlAppend, I can change the code accordingly.
Alternatively, since both areas are changing to first alloc and then append, is there any specific reason the virDomainResctrlNew has to be outside of virDomainResctrlAppend?
The @resctrl structure, which is of type virDomainResctrlDefPtr, will be extended later into this form:

struct _virDomainResctrlDef {
    char *id;
    virBitmapPtr vcpus;
    virResctrlAllocPtr alloc;
    virDomainResctrlMonDefPtr *monitors;
    size_t nmonitors;
};

Without virDomainResctrlNew, virDomainResctrlAppend would eventually have a long argument list, like this:

static int
virDomainResctrlAppend(virDomainDefPtr def,
                       xmlNodePtr node,
                       virResctrlAllocPtr alloc,
                       virBitmapPtr vcpus,
                       virDomainResctrlMonDefPtr monitors,
                       size_t nmonitors,
                       unsigned int flags)

To keep the argument list concise, we decided in v1 to add the virDomainResctrlNew function and create @resctrl outside of virDomainResctrlAppend.
I do see the future does some other virDomainResctrlMonDefParse and virResctrlAllocIsEmpty calls before virDomainResctrlAppend - may have to rethink all that or just go with the &resctrl logic. Maybe I'll have a different thought later - let's see what happens when I get there.
John
Thanks for review. Huaqiang
ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes); @@ -19196,6 +19215,8 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; + ssize_t i = 0; int n; int ret = -1; @@ -19240,15 +19261,20 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, * just update the existing alloc information, which is done in above * virDomainMemorytuneDefParseMemory */ if (new_alloc) { - if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(alloc, vcpus); + if (!resctrl) goto cleanup; - vcpus = NULL; - alloc = NULL; + + if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) + goto cleanup; + + resctrl = NULL; }
ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes);

Introducing <monitor> element under <cachetune> to represent a cache monitor.

Supports two kind of monitors, which are, monitor under default allocation or monitor under particular allocation.

Monitor supervises the cache or memory bandwidth usage for interested vcpu thread set, if the vcpu thread set is belong to some resctrl allocation, then the monitor will be created under this allocation, that is, creating a resctrl monitoring group directory under the directory of '@alloc->path/mon_group'. Otherwise, the monitor will be created under default allocation.

For default allocation monitor, it will have such kind of XML layout:

  <cachetune vcpus='1'>
    <monitor level=3 vcpus='1'/>
  </cachetune>

For other type monitor, the XML layout will be something like:

  <cachetune vcpus='2-4'>
    <cache id='0' level='3' type='both' size='3' unit='MiB'/>
    <cache id='1' level='3' type='both' size='3' unit='MiB'/>
    <monitor level=3 vcpus='2'/>
  </cachetune>

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 docs/formatdomain.html.in                          |  26 +++
 docs/schemas/domaincommon.rng                      |  10 +
 src/conf/domain_conf.c                             | 217 ++++++++++++++++++++-
 src/conf/domain_conf.h                             |  11 ++
 tests/genericxml2xmlindata/cachetune-cdp.xml       |   3 +
 .../cachetune-colliding-monitor.xml                |  30 +++
 tests/genericxml2xmlindata/cachetune-small.xml     |   7 +
 tests/genericxml2xmltest.c                         |   2 +
 8 files changed, 301 insertions(+), 5 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index b1651e3..2fd665c 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -759,6 +759,12 @@
     <cachetune vcpus='0-3'>
       <cache id='0' level='3' type='both' size='3' unit='MiB'/>
       <cache id='1' level='3' type='both' size='3' unit='MiB'/>
+      <monitor level='3' vcpus='1'/>
+      <monitor level='3' vcpus='0-3'/>
+    </cachetune>
+    <cachetune vcpus='4-5'>
+      <monitor level='3' vcpus='4'/>
+      <monitor level='3' vcpus='5'/>
     </cachetune>
     <memorytune vcpus='0-3'>
       <node id='0' bandwidth='60'/>
@@ -978,6 +984,26 @@
             </dd>
           </dl>
         </dd>
+        <dt><code>monitor</code></dt>
+        <dd>
+          The optional element <code>monitor</code> creates the cache
+          monitor(s) for current cache allocation and has the following
+          required attributes:
+          <dl>
+            <dt><code>level</code></dt>
+            <dd>
+              Host cache level the monitor belongs to.
+            </dd>
+            <dt><code>vcpus</code></dt>
+            <dd>
+              vCPU list the monitor applies to. A monitor's vCPU list
+              can only be the member(s) of the vCPU list of associating
+              allocation. The default monitor has the same vCPU list as the
+              associating allocation. For non-default monitors, there
+              are no vCPU overlap permitted.
+            </dd>
+          </dl>
+        </dd>
       </dl>
     </dd>

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 5c533d6..7ce49d3 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -981,6 +981,16 @@
               </optional>
             </element>
           </zeroOrMore>
+          <zeroOrMore>
+            <element name="monitor">
+              <attribute name="level">
+                <ref name='unsignedInt'/>
+              </attribute>
+              <attribute name="vcpus">
+                <ref name='cpuset'/>
+              </attribute>
+            </element>
+          </zeroOrMore>
         </element>
       </zeroOrMore>
       <zeroOrMore>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9a514a6..4f4604f 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)


 static void
+virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon)
+{
+    if (!domresmon)
+        return;
+
+    virBitmapFree(domresmon->vcpus);
+    virObjectUnref(domresmon->instance);
+}
+
+
+static void
 virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl)
 {
+    size_t i = 0;
+
     if (!resctrl)
         return;

+    for (i = 0; i < resctrl->nmonitors; i++)
+        virDomainResctrlMonDefFree(resctrl->monitors[i]);
+
     virObjectUnref(resctrl->alloc);
     virBitmapFree(resctrl->vcpus);
+    VIR_FREE(resctrl->monitors);
     VIR_FREE(resctrl);
 }

@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt,
     return ret;
 }

+/* Checking if the monitor's vcpus is conflicted with existing allocation
+ * and monitors.
+ *
+ * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a default
+ * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict and
+ * @vcpus is not equal to @resctrl->vcpus.
+ * */
+static int
+virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl,
+                                virBitmapPtr vcpus)
+{
+    size_t i = 0;
+    int vcpu = -1;
+
+    if (virBitmapIsAllClear(vcpus)) {
+        virReportError(VIR_ERR_INVALID_ARG, "%s",
+                       _("vcpus is empty"));
+        return -1;
+    }
+
+    while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) {
+        if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) {
+            virReportError(VIR_ERR_INVALID_ARG, "%s",
+                           _("Monitor vcpus conflicts with allocation"));
+            return -1;
+        }
+    }
+
+    if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
+        return 1;
+
+    for (i = 0; i < resctrl->nmonitors; i++) {
+        if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus))
+            continue;
+
+        if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) {
+            virReportError(VIR_ERR_INVALID_ARG, "%s",
+                           _("Monitor vcpus conflicts with monitors"));
+
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+
+static int
+virDomainResctrlMonDefParse(virDomainDefPtr def,
+                            xmlXPathContextPtr ctxt,
+                            xmlNodePtr node,
+                            virResctrlMonitorType tag,
+                            virDomainResctrlDefPtr resctrl)
+{
+    virDomainResctrlMonDefPtr domresmon = NULL;
+    xmlNodePtr oldnode = ctxt->node;
+    xmlNodePtr *nodes = NULL;
+    unsigned int level = 0;
+    char * tmp = NULL;
+    char * id = NULL;
+    size_t i = 0;
+    int n = 0;
+    int rv = -1;
+    int ret = -1;
+
+    ctxt->node = node;
+
+    if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("Cannot extract monitor nodes"));
+        goto cleanup;
+    }
+
+    for (i = 0; i < n; i++) {
+
+        if (VIR_ALLOC(domresmon) < 0)
+            goto cleanup;
+
+        domresmon->tag = tag;
+
+        domresmon->instance = virResctrlMonitorNew();
+        if (!domresmon->instance) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("Could not create monitor"));
+            goto cleanup;
+        }
+
+        if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
+            tmp = virXMLPropString(nodes[i], "level");
+            if (!tmp) {
+                virReportError(VIR_ERR_XML_ERROR, "%s",
+                               _("Missing monitor attribute 'level'"));
+                goto cleanup;
+            }
+
+            if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) {
+                virReportError(VIR_ERR_XML_ERROR,
+                               _("Invalid monitor attribute 'level' value '%s'"),
+                               tmp);
+                goto cleanup;
+            }
+
+            if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0)
+                goto cleanup;
+
+            VIR_FREE(tmp);
+        }
+
+        if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0)
+            goto cleanup;
+
+        rv = virDomainResctrlMonValidateVcpu(resctrl, domresmon->vcpus);
+
+        /* If monitor's vcpu list is identical to allocation's vcpu list,
+         * set as default monitor */
+        if (rv == 1 && resctrl->alloc)
+            virResctrlMonitorSetDefault(domresmon->instance);
+        else if (rv < 0)
+            goto cleanup;
+
+        if (!(tmp = virBitmapFormat(domresmon->vcpus)))
+            goto cleanup;
+
+        if (virAsprintf(&id, "vcpus_%s", tmp) < 0)
+            goto cleanup;
+
+        if (virResctrlMonitorSetID(domresmon->instance, id) < 0)
+            goto cleanup;
+
+        if (VIR_APPEND_ELEMENT(resctrl->monitors,
+                               resctrl->nmonitors,
+                               domresmon) < 0)
+            goto cleanup;
+
+        VIR_FREE(id);
+        VIR_FREE(tmp);
+        domresmon = NULL;
+    }
+
+    ret = 0;
+ cleanup:
+    ctxt->node = oldnode;
+    VIR_FREE(id);
+    VIR_FREE(tmp);
+    virDomainResctrlMonDefFree(domresmon);
+    return ret;
+}
+

 static virDomainResctrlDefPtr
 virDomainResctrlNew(virResctrlAllocPtr alloc,
@@ -19041,15 +19206,20 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
         }
     }

-    if (virResctrlAllocIsEmpty(alloc)) {
-        ret = 0;
-        goto cleanup;
-    }
-
     resctrl = virDomainResctrlNew(alloc, vcpus);
     if (!resctrl)
         goto cleanup;

+    if (virDomainResctrlMonDefParse(def, ctxt, node,
+                                    VIR_RESCTRL_MONITOR_TYPE_CACHE,
+                                    resctrl) < 0)
+        goto cleanup;
+
+    if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
+        ret = 0;
+        goto cleanup;
+    }
+
     if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
         goto cleanup;

@@ -27085,12 +27255,42 @@ virDomainCachetuneDefFormatHelper(unsigned int level,


 static int
+virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon,
+                                   virResctrlMonitorType tag,
+                                   virBufferPtr buf)
+{
+    char *vcpus = NULL;
+    unsigned int level = 0;
+
+    if (domresmon->tag != tag)
+        return 0;
+
+    virBufferAddLit(buf, "<monitor ");
+
+    if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
+        level = virResctrlMonitorGetCacheLevel(domresmon->instance);
+        virBufferAsprintf(buf, "level='%u' ", level);
+    }
+
+    vcpus = virBitmapFormat(domresmon->vcpus);
+    if (!vcpus)
+        return -1;
+
+    virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
+
+    VIR_FREE(vcpus);
+    return 0;
+}
+
+
+static int
 virDomainCachetuneDefFormat(virBufferPtr buf,
                             virDomainResctrlDefPtr resctrl,
                             unsigned int flags)
 {
     virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
     char *vcpus = NULL;
+    size_t i = 0;
     int ret = -1;

     virBufferSetChildIndent(&childrenBuf, buf);
@@ -27099,6 +27299,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
                                &childrenBuf) < 0)
         goto cleanup;

+    for (i = 0; i < resctrl->nmonitors; i ++) {
+        if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
+                                               VIR_RESCTRL_MONITOR_TYPE_CACHE,
+                                               &childrenBuf) < 0)
+            goto cleanup;
+    }
+
     if (virBufferCheckError(&childrenBuf) < 0)
         goto cleanup;

diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index e30a4b2..60f6464 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2236,12 +2236,23 @@ struct _virDomainCputune {
 };


+typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef;
+typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr;
+struct _virDomainResctrlMonDef {
+    virBitmapPtr vcpus;
+    virResctrlMonitorType tag;
+    virResctrlMonitorPtr instance;
+};
+
 typedef struct _virDomainResctrlDef virDomainResctrlDef;
 typedef virDomainResctrlDef *virDomainResctrlDefPtr;

 struct _virDomainResctrlDef {
     virBitmapPtr vcpus;
     virResctrlAllocPtr alloc;
+
+    virDomainResctrlMonDefPtr *monitors;
+    size_t nmonitors;
 };


diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml
index 9718f06..9f4c139 100644
--- a/tests/genericxml2xmlindata/cachetune-cdp.xml
+++ b/tests/genericxml2xmlindata/cachetune-cdp.xml
@@ -8,9 +8,12 @@
     <cachetune vcpus='0-1'>
       <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
       <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
+      <monitor level='3' vcpus='0'/>
+      <monitor level='3' vcpus='1'/>
     </cachetune>
     <cachetune vcpus='2'>
       <cache id='1' level='3' type='code' size='6' unit='MiB'/>
+      <monitor level='3' vcpus='2'/>
     </cachetune>
     <cachetune vcpus='3'>
       <cache id='1' level='3' type='data' size='6912' unit='KiB'/>
diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
new file mode 100644
index 0000000..d481fb5
--- /dev/null
+++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
@@ -0,0 +1,30 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>4</vcpu>
+  <cputune>
+    <cachetune vcpus='0-1'>
+      <cache id='0' level='3' type='both' size='768' unit='KiB'/>
+      <monitor level='3' vcpus='2'/>
+    </cachetune>
+  </cputune>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-i686</emulator>
+    <controller type='usb' index='0'/>
+    <controller type='ide' index='0'/>
+    <controller type='pci' index='0' model='pci-root'/>
+    <input type='mouse' bus='ps2'/>
+    <input type='keyboard' bus='ps2'/>
+    <memballoon model='virtio'/>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml
index ab2d9cf..748be08 100644
--- a/tests/genericxml2xmlindata/cachetune-small.xml
+++ b/tests/genericxml2xmlindata/cachetune-small.xml
@@ -7,6 +7,13 @@
   <cputune>
     <cachetune vcpus='0-1'>
       <cache id='0' level='3' type='both' size='768' unit='KiB'/>
+      <monitor level='3' vcpus='0'/>
+      <monitor level='3' vcpus='1'/>
+      <monitor level='3' vcpus='0-1'/>
+    </cachetune>
+    <cachetune vcpus='2-3'>
+      <monitor level='3' vcpus='2'/>
+      <monitor level='3' vcpus='3'/>
     </cachetune>
   </cputune>
   <os>
diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
index fa941f0..4393d44 100644
--- a/tests/genericxml2xmltest.c
+++ b/tests/genericxml2xmltest.c
@@ -137,6 +137,8 @@ mymain(void)
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
     DO_TEST_FULL("cachetune-colliding-types", false, true,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
+    DO_TEST_FULL("cachetune-colliding-monitor", false, true,
+                 TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
     DO_TEST("memorytune");
     DO_TEST_FULL("memorytune-colliding-allocs", false, true,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
--
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Supports two kinds of monitors: a monitor under the default allocation and a monitor under a particular allocation.
I still don't see what the big difference is with singling out default. Does it really matter - what advantage is there. Having a monitor for 'n' vcpus is a choice.
Monitor supervises the cache or memory bandwidth usage for
At this point memory bandwidth is not in play...
the vcpu thread set of interest. If the vcpu thread set belongs to some resctrl allocation, then the monitor is created under this allocation, that is, a resctrl monitoring group directory is created under '@alloc->path/mon_groups'. Otherwise, the monitor is created under the default allocation.
This is becoming increasingly difficult to describe/decipher - makes me wonder who would really use it.
For a monitor under the default allocation, the XML layout looks like this:
<cachetune vcpus='1'> <monitor level='3' vcpus='1'/> </cachetune>
For the other type of monitor, the XML layout will be something like:
<cachetune vcpus='2-4'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> <monitor level='3' vcpus='2'/> </cachetune>
Since all we get is the "cache_occupancy" that would seem to me to be most important when there is a <cache> bank, right? So what would the first (or your default style) collect/display. Or does cache not really matter. IOW: What is cache_occupancy measuring? Each cache? The entire thing? If there's no cache elements, then what? I honestly think this just needs to be simplified as much as possible. When you monitor specific vcpus within a cachetune, then you get what? If the cachetune has no specific cache entries, you get what? If you monitor multiple vcpus within a cachetune then you get what? (?An aggregation of all?). This whole default and specific description doesn't make sense.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 217 ++++++++++++++++++++- src/conf/domain_conf.h | 11 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 301 insertions(+), 5 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating + allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there + are no vCPU overlap permitted. + </dd> + </dl> + </dd> </dl> </dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -981,6 +981,16 @@ </optional> </element> </zeroOrMore> + <zeroOrMore> + <element name="monitor"> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + </element> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9a514a6..4f4604f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
static void +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) +{ + if (!domresmon) + return; + + virBitmapFree(domresmon->vcpus); + virObjectUnref(domresmon->instance);
VIR_FREE(domresmon);
+} + + +static void virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { + size_t i = 0; + if (!resctrl) return;
+ for (i = 0; i < resctrl->nmonitors; i++) + virDomainResctrlMonDefFree(resctrl->monitors[i]); + virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl->monitors); VIR_FREE(resctrl); }
@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, return ret; }
Two blank lines
+/* Checking if the monitor's vcpus is conflicted with existing allocation + * and monitors. + * + * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a default + * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict and + * @vcpus is not equal to @resctrl->vcpus. + * */ +static int +virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl, + virBitmapPtr vcpus)
This should be *ValidateVcpus.
+{ + size_t i = 0; + int vcpu = -1; + + if (virBitmapIsAllClear(vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("vcpus is empty")); + return -1; + } + + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) { + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with allocation")); + return -1; + } + } + + if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
The ->alloc check is confusing in light of having a monitor as a child of cachetune.
+ return 1; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus)) + continue; + + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with monitors")); + + return -1; + } + } + + return 0; +} + + +static int +virDomainResctrlMonDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlMonitorType tag, + virDomainResctrlDefPtr resctrl) +{ + virDomainResctrlMonDefPtr domresmon = NULL; + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + unsigned int level = 0; + char * tmp = NULL; + char * id = NULL; + size_t i = 0; + int n = 0; + int rv = -1; + int ret = -1; + + ctxt->node = node; + + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract monitor nodes")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + + if (VIR_ALLOC(domresmon) < 0) + goto cleanup; + + domresmon->tag = tag; + + domresmon->instance = virResctrlMonitorNew(); + if (!domresmon->instance) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not create monitor")); + goto cleanup; + } + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + tmp = virXMLPropString(nodes[i], "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing monitor attribute 'level'")); + goto cleanup; + } + + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + + if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0) + goto cleanup; + + VIR_FREE(tmp); + } + + if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0) + goto cleanup; + + rv = virDomainResctrlMonValidateVcpu(resctrl, domresmon->vcpus); + + /* If monitor's vcpu list is identical to allocation's vcpu list, + * set as default monitor */ + if (rv == 1 && 
resctrl->alloc)
I'm still not seeing the need for default... FWIW: The resctrl->alloc check is unnecessary since the only way rv == 1 is if it's there.
+ virResctrlMonitorSetDefault(domresmon->instance); + else if (rv < 0) + goto cleanup; + + if (!(tmp = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + if (virAsprintf(&id, "vcpus_%s", tmp) < 0) + goto cleanup; + + if (virResctrlMonitorSetID(domresmon->instance, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(resctrl->monitors, + resctrl->nmonitors, + domresmon) < 0) + goto cleanup; + + VIR_FREE(id); + VIR_FREE(tmp); + domresmon = NULL; + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(id); + VIR_FREE(tmp); + virDomainResctrlMonDefFree(domresmon);
VIR_FREE(nodes);
+ return ret; +} +
static virDomainResctrlDefPtr virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20 @@ virDomainCachetuneDefParse(virDomainDefPtr def, } }
- if (virResctrlAllocIsEmpty(alloc)) { - ret = 0; - goto cleanup; - } - resctrl = virDomainResctrlNew(alloc, vcpus);
So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could be NULL if there were no <cache> entries, IIUC. I'm starting to see a real downside to a resctrl->alloc == NULL. I really don't want to continue to see churn on the internal hierarchy though. However, if I look at how it could be filled in within the context of this function, the virDomainResctrlVcpuMatch call and subsequent possible allocation of @alloc would seemingly be possible outside the context of whether specific <cache> entries existed. Probably could just do away with the term default allocation - all it seems to be is an allocation without <cache> elements, but it can have <monitor> elements. If someone places a <cachetune> without <cache> and without <monitor>, so what - who cares. Probably doesn't do much other than limit other <cachetune> (and perhaps <memorytune>) elements.
if (!resctrl) goto cleanup;
+ if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { + ret = 0; + goto cleanup; + } +
Moving the AllocIsEmpty check should be separate. I'm losing steam, but the next couple of patches had Coverity issues, so I figured I'll note that before going back to read all the comments you've posted today while I was reviewing without trying to go back. John
if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) goto cleanup;
@@ -27085,12 +27255,42 @@ virDomainCachetuneDefFormatHelper(unsigned int level,
static int +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon, + virResctrlMonitorType tag, + virBufferPtr buf) +{ + char *vcpus = NULL; + unsigned int level = 0; + + if (domresmon->tag != tag) + return 0; + + virBufferAddLit(buf, "<monitor "); + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + level = virResctrlMonitorGetCacheLevel(domresmon->instance); + virBufferAsprintf(buf, "level='%u' ", level); + } + + vcpus = virBitmapFormat(domresmon->vcpus); + if (!vcpus) + return -1; + + virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus); + + VIR_FREE(vcpus); + return 0; +} + + +static int virDomainCachetuneDefFormat(virBufferPtr buf, virDomainResctrlDefPtr resctrl, unsigned int flags) { virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; char *vcpus = NULL; + size_t i = 0; int ret = -1;
virBufferSetChildIndent(&childrenBuf, buf); @@ -27099,6 +27299,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf, &childrenBuf) < 0) goto cleanup;
+ for (i = 0; i < resctrl->nmonitors; i ++) { + if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i], + VIR_RESCTRL_MONITOR_TYPE_CACHE, + &childrenBuf) < 0) + goto cleanup; + } + if (virBufferCheckError(&childrenBuf) < 0) goto cleanup;
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e30a4b2..60f6464 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2236,12 +2236,23 @@ struct _virDomainCputune { };
+typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef; +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; +struct _virDomainResctrlMonDef { + virBitmapPtr vcpus; + virResctrlMonitorType tag; + virResctrlMonitorPtr instance; +}; + typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr;
struct _virDomainResctrlDef { virBitmapPtr vcpus; virResctrlAllocPtr alloc; + + virDomainResctrlMonDefPtr *monitors; + size_t nmonitors; };
diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml index 9718f06..9f4c139 100644 --- a/tests/genericxml2xmlindata/cachetune-cdp.xml +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -8,9 +8,12 @@ <cachetune vcpus='0-1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='2'> <cache id='1' level='3' type='code' size='6' unit='MiB'/> + <monitor level='3' vcpus='2'/> </cachetune> <cachetune vcpus='3'> <cache id='1' level='3' type='data' size='6912' unit='KiB'/> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml new file mode 100644 index 0000000..d481fb5 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='2'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml index ab2d9cf..748be08 
100644 --- a/tests/genericxml2xmlindata/cachetune-small.xml +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -7,6 +7,13 @@ <cputune> <cachetune vcpus='0-1'> <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-1'/> + </cachetune> + <cachetune vcpus='2-3'> + <monitor level='3' vcpus='2'/> + <monitor level='3' vcpus='3'/> </cachetune> </cputune> <os> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index fa941f0..4393d44 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -137,6 +137,8 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-monitor", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("memorytune"); DO_TEST_FULL("memorytune-colliding-allocs", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 4:58 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Supports two kinds of monitors: a monitor under the default allocation and a monitor under a particular allocation.
I still don't see what the big difference is with singling out default. Does it really matter - what advantage is there. Having a monitor for 'n' vcpus is a choice.
Monitor supervises the cache or memory bandwidth usage for
At this point memory bandwidth is not in play...
Yes, I will remove the wording related to memory bandwidth.
the vcpu thread set of interest. If the vcpu thread set belongs to some resctrl allocation, then the monitor is created under this allocation, that is, a resctrl monitoring group directory is created under '@alloc->path/mon_groups'. Otherwise, the monitor is created under the default allocation.
This is becoming increasingly difficult to describe/decipher - makes me wonder who would really use it.
A monitor in the default allocation would commonly be used when we want to observe the resource usage of VM vcpu threads that have no dedicated resource limit set through CAT or MBA. Another common case is a host where CAT/MBA is not supported but CMT/MBM is enabled in libvirt.

I introduced the monitor in the previous patch, and the monitor in the default allocation in this patch, because monitors in the two kinds of allocation behave differently. I will focus on CAT and CMT below, since MBA and MBM are very similar cases.

Before CAT was introduced, any process in Linux shared the L3 cache with all other active processes. With CAT enabled in libvirt, it is possible to apply cache isolation and assign a dedicated amount of cache to some key VM vcpu threads. To observe the real L3 cache usage, we need the help of a monitor.

== Monitor usage case 1: monitor in default allocation ==

If you want to get cache utilization data before applying any cache isolation to the VM vcpu threads, you need to create a monitor in the default allocation, because you haven't created any cache allocation yet.

== Monitor usage case 2: monitor in non-default allocation ==

If you have created a cache allocation for specific VM vcpu threads and are not sure how many cache lines these threads really use, you need to create a monitor under this allocation to get the real cache usage. If you find that the threads use only a small part of the cache that has been allocated, you might consider reducing the allocation.
== Usage for default monitor and non-default monitor ==

With the 'default monitor' and 'non-default monitor' concepts introduced in previous patches, you can monitor the cache usage of all VM vcpu threads added to an allocation, and you can also monitor a subset of the allocation's vcpu list. Without the 'default monitor', it is impossible to get the cache usage of all vcpu threads in the allocation and, at the same time, the cache usage of some vcpu threads of particular interest within it.

== Monitor usage case 3: allocating dedicated cache to one VM and monitoring its usage, and observing its influence on another VM ==

Consider a scenario with two VMs. One VM is known to be very cache sensitive and should not share cache with other workloads. For the other VM we have no knowledge of its cache requirement, but we are required to monitor the cache usage of both VMs. With the concepts introduced so far, we need to create an allocation for the first VM and then create a default monitor in this allocation. For the other VM, we need to create a non-default monitor under the default allocation.

With the introduced concepts (the allocation, the default allocation, the monitor and the default monitor) it is possible to fulfill the requirements of all these scenarios.

Creating monitors is not very flexible. As I stated in my reply to the review comments on v5 patch 0, monitors need to follow these rules:

1. More than one monitor can be specified in each <cachetune> entry.
2. At most one allocation can be specified in each <cachetune> entry.
3. The allocation uses the vcpu list specified in the <cachetune> attribute 'vcpus'.
4. A monitor with the same vcpu list as the allocation is allowed; this monitor is the allocation's default monitor.
5. A monitor whose vcpu list is a subset of the allocation's vcpu list is allowed.
6. For non-default monitors, no vcpu list overlap is permitted.
In addition, the number of monitors that can be created is limited by the number of hardware RMIDs. For more detail about the behavior of the underlying '/sys/fs/resctrl' file system, see: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
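To make rules 4 to 6 concrete, here is a minimal sketch of the vcpu check. It is not the code from the patch: it assumes a plain 64-bit mask as a stand-in for virBitmap, it assumes the <cachetune> entry really has an allocation, and the names vcpu_mask and monitor_validate_vcpus are hypothetical. The return values follow the convention described in the patch comment (1 for a default monitor, 0 for an acceptable non-default monitor, -1 for a conflict).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for virBitmap: bit N set means vcpu N is listed. */
typedef uint64_t vcpu_mask;

/*
 * Check a new monitor's vcpu list @mon against the allocation's vcpu
 * list @alloc and the vcpu lists of the monitors already accepted.
 *
 * Returns  1 if @mon equals @alloc (default monitor, rule 4),
 *          0 if @mon is a non-overlapping subset (rules 5 and 6),
 *         -1 if @mon is empty or breaks one of the rules.
 */
static int
monitor_validate_vcpus(vcpu_mask alloc,
                       const vcpu_mask *monitors,
                       size_t nmonitors,
                       vcpu_mask mon)
{
    size_t i;

    /* An empty vcpu list is never valid. */
    if (mon == 0)
        return -1;

    /* Rule 5: every monitored vcpu must belong to the allocation. */
    if (mon & ~alloc)
        return -1;

    /* Rule 4: identical vcpu list means default monitor. */
    if (mon == alloc)
        return 1;

    /* Rule 6: non-default monitors must not overlap each other.
     * An existing default monitor is skipped, since it covers
     * every vcpu of the allocation by definition. */
    for (i = 0; i < nmonitors; i++) {
        if (monitors[i] == alloc)
            continue;
        if (monitors[i] & mon)
            return -1;
    }

    return 0;
}
```

With an allocation on vcpus 0-1 (mask 0x3), a monitor on vcpu 2 (mask 0x4) is rejected, which is the situation the cachetune-colliding-monitor.xml test case is meant to cover.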
For a monitor under the default allocation, the XML layout looks like this:
<cachetune vcpus='1'> <monitor level='3' vcpus='1'/> </cachetune>
For the other type of monitor, the XML layout will be something like:
<cachetune vcpus='2-4'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> <monitor level='3' vcpus='2'/> </cachetune>
Since all we get is the "cache_occupancy" that would seem to me to be most important when there is a <cache> bank, right? So what would the first (or your default style) collect/display. Or does cache not really matter.
<cache> does not matter for the monitor. On a system without CAT, a <cache> entry will never appear under <cachetune>. Valid <cache> entries mean that some cache lines are allocated to the VM vcpu threads specified in the 'vcpus' attribute. If <cachetune> has no <cache> entry, that does not mean these vcpu threads use no cache resource; it means they use the cache resource specified in the default allocation. Normally the default allocation's cache resource is shared by many cache-insensitive workloads.
IOW: What is cache_occupancy measuring? Each cache? The entire thing? If there's no cache elements, then what?
cache_occupancy is measured per cache bank. A 2-socket Intel Xeon system is considered to have two cache banks, one per socket. The typical output for each monitor in this case is:

  cpu.cache.0.name=vcpus_1
  cpu.cache.0.vcpus=1
  cpu.cache.0.bank.count=2            <--- 2 cache banks
  cpu.cache.0.bank.0.id=0             <--- bank.id.0 cache_occupancy
  cpu.cache.0.bank.0.bytes=9371648     _|
  cpu.cache.0.bank.1.id=1             <--- bank.id.1 cache_occupancy
  cpu.cache.0.bank.1.bytes=1081344     _|

If you want to know the total cache occupancy of the VM vcpu threads of this monitor, you need to add the per-bank values up.
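The "add them up" step is a plain sum over the banks. A minimal sketch, assuming the per-bank values have already been parsed out of the domstats output; struct bank_stat and monitor_total_bytes are hypothetical names, not libvirt API:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one "cpu.cache.N.bank.M.*" pair. */
struct bank_stat {
    unsigned int id;   /* cpu.cache.N.bank.M.id */
    uint64_t bytes;    /* cpu.cache.N.bank.M.bytes */
};

/* Total cache occupancy of one monitor: the sum over all of its banks. */
static uint64_t
monitor_total_bytes(const struct bank_stat *banks, size_t nbanks)
{
    uint64_t total = 0;
    size_t i;

    for (i = 0; i < nbanks; i++)
        total += banks[i].bytes;

    return total;
}
```

For the two banks in the output above this gives 9371648 + 1081344 = 10452992 bytes.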
I honestly think this just needs to be simplified as much as possible.
When you monitor specific vcpus within a cachetune, then you get what?
In this case, the monitor you created only monitors the specific vcpus you listed for it. Either of the following two configurations matches your scenario, and in both the single monitor reports the cache usage of the thread of vcpu 2.

  <cachetune vcpus='2-4'>
    <cache id='0' level='3' type='both' size='3' unit='MiB'/>
    <cache id='1' level='3' type='both' size='3' unit='MiB'/>
    <monitor level='3' vcpus='2'/>
  </cachetune>

  <cachetune vcpus='2-4'>
    <monitor level='3' vcpus='2'/>
  </cachetune>
If the cachetune has no specific cache entries, you get what?
If there is no cache entry in cachetune, the monitor still reports the vcpu threads' cache utilization per cache bank. No cache entry in the cachetune means the threads use the cache allocation policy of the default allocation, which is the file /sys/fs/resctrl/schemata. If valid cache entries are provided in cachetune, then an allocation is created for the threads of the vcpus listed in the <cachetune> 'vcpus' attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0, the cache resource limit is applied to these threads. The monitor does not care whether the vcpu threads are restricted to a limited amount of cache lines or not. It only reports the amount of cache these threads occupy.
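For reference, this is roughly where the per-bank number comes from in resctrl. The sketch below only builds the path of the occupancy file for one bank of a group directory such as /sys/fs/resctrl/p0 from the example above. The mon_data/mon_L3_XX/llc_occupancy layout is taken from the kernel document linked earlier in this thread; the function name llc_occupancy_path is hypothetical, and this is not how the patch series itself locates the file.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build the path of the L3 occupancy file for cache bank @bank_id
 * inside the resctrl group directory @group_dir.
 *
 * Returns 0 on success, -1 if @buf is too small.
 */
static int
llc_occupancy_path(const char *group_dir,
                   unsigned int bank_id,
                   char *buf,
                   size_t buflen)
{
    /* The kernel names each bank directory mon_L3_<two-digit id>. */
    int n = snprintf(buf, buflen, "%s/mon_data/mon_L3_%02u/llc_occupancy",
                     group_dir, bank_id);

    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

Reading that file yields the number of bytes reported for the bank, independent of whether the group has its own cache allocation or uses the default one.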
If you monitor multiple vcpus within a cachetune then you get what? (?An aggregation of all?).
Yes. Suppose you have this vcpus setting for <cachetune>:

  <cachetune vcpus='0-4,8' ..../>

and you choose to monitor the cache usage of vcpus 0, 3 and 8. You then create the following monitor entry inside the cachetune entry, and the output of the monitor gives you the aggregated cache occupancy of the threads of vcpus 0, 3 and 8.

  <cachetune vcpus='0-4,8'>
    <monitor level='3' vcpus='0,3,8'/>
  </cachetune>
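In the domstats output such a monitor is identified by a name built from its vcpu list; the patch builds it with the format "vcpus_%s" from the formatted bitmap. A minimal sketch of that naming, assuming a 64-bit mask instead of virBitmap and a hypothetical helper monitor_id_format that prints single vcpus and ranges in cpuset style:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the set bits of @mask as "vcpus_<list>", where <list> uses
 * cpuset notation: single vcpus separated by commas, and runs of
 * consecutive vcpus written as "first-last".
 *
 * Returns 0 on success, -1 if @buf is too small.
 */
static int
monitor_id_format(uint64_t mask, char *buf, size_t buflen)
{
    const char *sep = "";
    size_t used;
    int bit = 0;
    int n;

    n = snprintf(buf, buflen, "vcpus_");
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    used = (size_t)n;

    while (bit < 64) {
        int start;
        int end;

        if (!(mask & (UINT64_C(1) << bit))) {
            bit++;
            continue;
        }

        /* Extend the run while the next bit is also set. */
        start = bit;
        while (bit + 1 < 64 && (mask & (UINT64_C(1) << (bit + 1))))
            bit++;
        end = bit;
        bit++;

        if (start == end)
            n = snprintf(buf + used, buflen - used, "%s%d", sep, start);
        else
            n = snprintf(buf + used, buflen - used, "%s%d-%d",
                         sep, start, end);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;

        used += (size_t)n;
        sep = ",";
    }

    return 0;
}
```

For the monitor above (vcpus 0, 3 and 8) the sketch produces the name vcpus_0,3,8, and all values reported under that name are aggregated over those three vcpu threads.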
This whole default and specific description doesn't make sense.
Sorry for the confusion; I'll try to refine the descriptions.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 217 ++++++++++++++++++++- src/conf/domain_conf.h | 11 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 301 insertions(+), 5 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating + allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there + are no vCPU overlap permitted. + </dd> + </dl> + </dd> </dl> </dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -981,6 +981,16 @@ </optional> </element> </zeroOrMore> + <zeroOrMore> + <element name="monitor"> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + </element> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9a514a6..4f4604f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
static void +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) { + if (!domresmon) + return; + + virBitmapFree(domresmon->vcpus); + virObjectUnref(domresmon->instance);
VIR_FREE(domresmon);
I forgot to free the monitor itself. Will fix.
+} + + +static void virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { + size_t i = 0; + if (!resctrl) return;
+ for (i = 0; i < resctrl->nmonitors; i++) + virDomainResctrlMonDefFree(resctrl->monitors[i]); + virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl->monitors); VIR_FREE(resctrl); }
@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, return ret; }
Two blank lines
OK.
+/* Checking if the monitor's vcpus is conflicted with existing +allocation + * and monitors. + * + * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a +default + * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict +and + * @vcpus is not equal to @resctrl->vcpus. + * */ +static int +virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl, + virBitmapPtr vcpus)
This should be *ValidateVcpus.
OK.
+{ + size_t i = 0; + int vcpu = -1; + + if (virBitmapIsAllClear(vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("vcpus is empty")); + return -1; + } + + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) { + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with allocation")); + return -1; + } + } + + if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
The ->alloc check is confusing in light of having a monitor as a child of cachetune.
I see cachetune as an entity that does performance tuning through better arrangement of cache resources. In a general performance tuning task, we first need to learn the resource utilization and then apply some resource arrangement, such as cache isolation or allocation. @resctrl->alloc represents the resource arrangement role, and @resctrl->monitors is the performance detecting/monitoring part. Performance monitoring belongs to the scope of performance tuning, just like the part that applies the resource limits.

Based on this understanding, I combined @alloc and @monitors and made both of them children of @resctrl.

If you still think it is strange to put @monitors under @resctrl, then I have to change it and create a data structure at the level of @def->resctrls. The new _virDomainDef structure would then be:

  struct _virDomainDef {
      ...
      virDomainResctrlDefPtr *resctrls;
      size_t nresctrls;
      virDomainResctrlMonDefPtr *monitors;
      size_t nmonitors;
      ...
  }

In that case "virDomainResctrlDefPtr" should probably be refactored as well, since its name would imply a wider scope than it covers. Further, even after such a refactoring, I would have to add a lot of code to maintain the relationship between @resctrls->vcpus and the vcpus of each entry in @monitors.
+ return 1; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus)) + continue; + + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with + monitors")); + + return -1; + } + } + + return 0; +} + + +static int +virDomainResctrlMonDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlMonitorType tag, + virDomainResctrlDefPtr resctrl) { + virDomainResctrlMonDefPtr domresmon = NULL; + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + unsigned int level = 0; + char * tmp = NULL; + char * id = NULL; + size_t i = 0; + int n = 0; + int rv = -1; + int ret = -1; + + ctxt->node = node; + + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract monitor nodes")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + + if (VIR_ALLOC(domresmon) < 0) + goto cleanup; + + domresmon->tag = tag; + + domresmon->instance = virResctrlMonitorNew(); + if (!domresmon->instance) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not create monitor")); + goto cleanup; + } + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + tmp = virXMLPropString(nodes[i], "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing monitor attribute 'level'")); + goto cleanup; + } + + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + + if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0) + goto cleanup; + + VIR_FREE(tmp); + } + + if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0) + goto cleanup; + + rv = virDomainResctrlMonValidateVcpu(resctrl, + domresmon->vcpus); + + /* If monitor's vcpu list is identical to allocation's vcpu list, + * set as default monitor */ + if (rv == 1 
&& resctrl->alloc)
I'm still not seeing the need for default...
As I stated, the default monitor, the non-default monitor, and the monitor in the default allocation have different characteristics and behavior in resctrl, so all of them are necessary. If we don't support the default monitor, we cannot monitor the cache usage of the whole allocation and, at the same time, the cache usage of a subset of the allocation's vcpus. If we don't support the default allocation, we will have to create many duplicated allocations that have the same cache settings, that is, that share the same cache resources. In that case, we lose the flexibility that the kernel resctrl fs provides.
FWIW: The resctrl->alloc check is unnecessary since the only way rv == 1 is if it's there.
You are right, it is checked twice. It changes to: if (rv == 1) virResctrlMonitorSetDefault(domresmon->instance);
+ virResctrlMonitorSetDefault(domresmon->instance); + else if (rv < 0) + goto cleanup; + + if (!(tmp = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + if (virAsprintf(&id, "vcpus_%s", tmp) < 0) + goto cleanup; + + if (virResctrlMonitorSetID(domresmon->instance, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(resctrl->monitors, + resctrl->nmonitors, + domresmon) < 0) + goto cleanup; + + VIR_FREE(id); + VIR_FREE(tmp); + domresmon = NULL; + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(id); + VIR_FREE(tmp); + virDomainResctrlMonDefFree(domresmon);
VIR_FREE(nodes);
I forgot to free it. Will be added.
+ return ret; +} +
static virDomainResctrlDefPtr virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20 @@ virDomainCachetuneDefParse(virDomainDefPtr def, } }
- if (virResctrlAllocIsEmpty(alloc)) { - ret = 0; - goto cleanup; - } - resctrl = virDomainResctrlNew(alloc, vcpus);
So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could be NULL if there were no <cache> entries, IIUC.
Your understanding is right.
I'm starting to see a real downside to a resctrl->alloc == NULL. I really don't want to continue to see churn on the internal hierarchy though.
By the downside, do you mean the 'churn on the internal hierarchy'?
However, if I look at how it could be filled in within the context of this function, the virDomainResctrlVcpuMatch call and subsequent possible allocation of @alloc would seemingly be possible outside the context of whether specific <cache> entries existed.
If there is no <cache> entry, the statement 'n = virXPathNodeSet("./cache", ctxt, &nodes)' sets n to 0. If n is 0, virDomainResctrlVcpuMatch is not called and the subsequent steps that create @alloc are not executed.
Probably could just do away with the term default allocation - all it seems to be is an allocation without <cache> elements, but it can have <monitor> elements.
Correct, this is how code works.
If someone places a <cachetune> without <cache> and without <monitor>, so what - who cares. Probably doesn't do much other than limit other <cachetune> (and perhaps <memorytune>) elements.
Yes.
if (!resctrl) goto cleanup;
+ if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { + ret = 0; + goto cleanup; + } +
Moving the AllocIsEmpty check should be separate.
Got it. Will be done.
I'm losing steam, but the next couple of patches had Coverity issues, so I figured I'll note that before going back to read all the comments you've posted today while I was reviewing without trying to go back.
John
I used a lot of paragraphs to introduce the usage of 'default monitor', 'default allocation' and 'monitor in non-default allocation', and to state that these 'default' concepts are necessary for saving RMIDs, removing allocation duplication and keeping the monitor flexibility that the kernel interface provides. I hope I explained them clearly. Thanks for the review. Huaqiang
if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) goto cleanup;
@@ -27085,12 +27255,42 @@ virDomainCachetuneDefFormatHelper(unsigned int level,

 static int
+virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon,
+                                   virResctrlMonitorType tag,
+                                   virBufferPtr buf)
+{
+    char *vcpus = NULL;
+    unsigned int level = 0;
+
+    if (domresmon->tag != tag)
+        return 0;
+
+    virBufferAddLit(buf, "<monitor ");
+
+    if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
+        level = virResctrlMonitorGetCacheLevel(domresmon->instance);
+        virBufferAsprintf(buf, "level='%u' ", level);
+    }
+
+    vcpus = virBitmapFormat(domresmon->vcpus);
+    if (!vcpus)
+        return -1;
+
+    virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
+
+    VIR_FREE(vcpus);
+    return 0;
+}
+
+
+static int
 virDomainCachetuneDefFormat(virBufferPtr buf,
                             virDomainResctrlDefPtr resctrl,
                             unsigned int flags)
 {
     virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
     char *vcpus = NULL;
+    size_t i = 0;
     int ret = -1;
virBufferSetChildIndent(&childrenBuf, buf); @@ -27099,6 +27299,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf, &childrenBuf) < 0) goto cleanup;
+ for (i = 0; i < resctrl->nmonitors; i ++) { + if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i], + VIR_RESCTRL_MONITOR_TYPE_CACHE, + &childrenBuf) < 0) + goto cleanup; + } + if (virBufferCheckError(&childrenBuf) < 0) goto cleanup;
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e30a4b2..60f6464 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2236,12 +2236,23 @@ struct _virDomainCputune { };
+typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef; +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; struct +_virDomainResctrlMonDef { + virBitmapPtr vcpus; + virResctrlMonitorType tag; + virResctrlMonitorPtr instance; +}; + typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr;
struct _virDomainResctrlDef { virBitmapPtr vcpus; virResctrlAllocPtr alloc; + + virDomainResctrlMonDefPtr *monitors; + size_t nmonitors; };
diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml index 9718f06..9f4c139 100644 --- a/tests/genericxml2xmlindata/cachetune-cdp.xml +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -8,9 +8,12 @@ <cachetune vcpus='0-1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='2'> <cache id='1' level='3' type='code' size='6' unit='MiB'/> + <monitor level='3' vcpus='2'/> </cachetune> <cachetune vcpus='3'> <cache id='1' level='3' type='data' size='6912' unit='KiB'/> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml new file mode 100644 index 0000000..d481fb5 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='2'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml index ab2d9cf..748be08 
100644 --- a/tests/genericxml2xmlindata/cachetune-small.xml +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -7,6 +7,13 @@ <cputune> <cachetune vcpus='0-1'> <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-1'/> + </cachetune> + <cachetune vcpus='2-3'> + <monitor level='3' vcpus='2'/> + <monitor level='3' vcpus='3'/> </cachetune> </cputune> <os> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index fa941f0..4393d44 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -137,6 +137,8 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-monitor", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("memorytune"); DO_TEST_FULL("memorytune-colliding-allocs", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);

On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 4:58 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Two kinds of monitors are supported: a monitor under the default allocation and a monitor under a particular allocation.
I still don't see what the big difference is with singling out default. Does it really matter - what advantage is there. Having a monitor for 'n' vcpus is a choice.
Monitor supervises the cache or memory bandwidth usage for
At this point memory bandwidth is not in play...
Yes. Will remove the memory BW related words.
interested vcpu thread set. If the vcpu thread set belongs to some resctrl allocation, then the monitor will be created under this allocation, that is, a resctrl monitoring group directory is created under the directory '@alloc->path/mon_groups'. Otherwise, the monitor will be created under the default allocation.
This is becoming increasingly difficult to describe/decipher - makes me wonder who would really use it.
A monitor in the default allocation is commonly used when we want to observe the resource usage of VM vcpu threads that have no dedicated resource limitation applied through CAT or MBA.
Another common case is when CAT/MBA is not supported but CMT/MBM is enabled in libvirt.
The reason I introduced the monitor in the previous patch and further introduce the monitor in the default allocation in this patch is that monitors in the two kinds of allocation behave differently.
Let's focus on CAT and CMT in the lines below, since MBA and MBM are very similar cases.
As we know, before CAT was introduced, every process in Linux shared the L3 cache with all other active processes. With CAT enabled in libvirt, it is possible to apply cache isolation and assign a dedicated amount of cache to some key VM vcpu threads.
If we want to observe the real L3 cache usage, we need the help of a monitor.
== Monitor usage case 1: monitor in default allocation ==
If you want to get the cache utilization data before applying any cache isolation to the VM vcpu threads, you need to create a monitor in the default allocation, because you haven't created any cache allocation yet.
== Monitor usage case 2: monitor in non-default allocation ==
If you have created a cache allocation for specific VM vcpu threads and are not sure how many cache lines these threads really use, you need to create a monitor under this allocation to get the real cache usage information.
If you find that your VM vcpu threads use only a small part of the cache that has been allocated, you might consider reducing the allocation.
== Usage for default monitor and non-default monitor ==
Since we introduced the 'default monitor' and 'non-default monitor' concepts in previous patches, you can now monitor the cache usage of all VM vcpu threads added to an allocation, and you can also monitor a subset of the allocation's vcpu list.
Without the 'default monitor', it is impossible to get the cache usage of all vcpu threads in the allocation and, at the same time, the cache usage of some vcpu threads of particular interest in that allocation.
"Default monitor" is in name only. It's just "a" monitor for all the threads in cachetune (or memorytune eventually); whereas, a "non default monitor" is "a" monitor for specific vcpus within the range in cachetune. Thus your description is that a monitor can be a monitor all vcpus or a monitor for some subset of of all vcpus. A <monitor> describes which vcpus of a cachetune (or memorytune) can be monitored - there are 3 options - "all", "m-to-n", or "one-to-one". What they relate to in the filesystem is the magic of the code of paths to the data. What data structures are called or how this is described in docs just doesn't seem to need to make a delineation.
== Monitor usage case 3: allocating dedicated cache and monitoring its usage of one VM, and getting its influence over another VM ==
Think about a scenario with two VMs. One VM is known to be very cache sensitive and should not share cache with other workloads. For the other VM we have no knowledge of its cache requirement, but we are required to monitor the cache usage of both VMs.
With the concepts introduced so far, we need to create an allocation for the first VM and then create a default monitor in this allocation. For the other VM, we need to create a non-default monitor under the default allocation.
With the introduced concepts, that is, the allocation, the default allocation, the monitor and the default monitor, it is possible to fulfill the requirements of all these scenarios.
I think it's far easier to describe 2 things rather than 4.
The creation of monitors does not have much flexibility. As I stated in my reply to the review comments on patch 0 of v5, monitors need to follow the rules below:
1. More than one monitor can be specified in each <cachetune> entry.
2. At most one allocation can be specified in each <cachetune> entry.
3. The allocation uses the vcpu list specified in the <cachetune> attribute 'vcpus'.
An allocation is either from the list of vcpus for a cachetune or a memorytune if cachetune for the listed vcpus in a memorytune doesn't exist (and yes, doesn't conflict with any vcpus in any found cachetune). If the domain was created with 4 vcpus, then that is "theoretically" the default allocation for 4 vcpus. Allocations beyond that allow one to slice and dice *as long as* there is no overlap with the definitions. I don't think "default" or "hidden" really needs to be described or called out, it just is. Maybe I'm oversimplifying.
4. A monitor that has the same vcpu list as the allocation is allowed; this monitor is the allocation's default monitor.
5. A monitor whose vcpu list is a subset of the allocation's vcpu list is allowed.
6. For non-default monitors, no vcpu list overlap is permitted.
Also, the number of monitors that can be created is limited by the number of hardware RMIDs.
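Rules 4 to 6 can be sketched as a small standalone check. This is only an illustration under stated assumptions: vcpu lists are modelled as 64-bit masks instead of virBitmap, and monitor_check_vcpus() and the MON_* values are hypothetical names, not libvirt API. Unlike the patch, the sketch does not look at whether an allocation object exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, chosen to mirror the 1 / 0 / -1 convention
 * described for the validation helper in this patch. */
enum { MON_CONFLICT = -1, MON_OK = 0, MON_DEFAULT = 1 };

/* Hypothetical helper, not libvirt API. Each vcpu list is a 64-bit mask:
 * bit N set means vcpu N is a member.
 * @alloc_vcpus: vcpu list of the <cachetune> element
 * @monitors:    vcpu lists of the monitors accepted so far
 * @vcpus:       vcpu list of the monitor being added */
static int
monitor_check_vcpus(uint64_t alloc_vcpus,
                    const uint64_t *monitors, size_t nmonitors,
                    uint64_t vcpus)
{
    size_t i;

    /* An empty list, or a vcpu outside the <cachetune> list, is rejected. */
    if (vcpus == 0 || (vcpus & ~alloc_vcpus) != 0)
        return MON_CONFLICT;

    /* Rule 4: the same list as the allocation means default monitor. */
    if (vcpus == alloc_vcpus)
        return MON_DEFAULT;

    /* Rule 6: non-default monitors must not overlap one another. */
    for (i = 0; i < nmonitors; i++) {
        if (monitors[i] == alloc_vcpus)
            continue; /* an existing default monitor may overlap */
        if (monitors[i] & vcpus)
            return MON_CONFLICT;
    }

    return MON_OK;
}
```

With <cachetune vcpus='0-1'>, a monitor for vcpu 2 is rejected by the first check, which is the case the cachetune-colliding-monitor.xml test exercises.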
About the behavior of underlying '/sys/fs/resctrl' file system, you can get more detail from this link: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
For a monitor in the default allocation, the XML layout looks like:
<cachetune vcpus='1'>
  <monitor level='3' vcpus='1'/>
</cachetune>
For the other type of monitor, the XML layout will be something like:
<cachetune vcpus='2-4'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
Since all we get is the "cache_occupancy" that would seem to me to be most important when there is a <cache> bank, right? So what would the first (or your default style) collect/display. Or does cache not really matter.
<cache> does not matter for monitor. Even in case of no CAT in system, the <cache> entry will never be shown under <cachetune>.
If valid <cache> entries exist, part of the cache lines is allocated to the VM vcpu threads specified in the attribute 'vcpus'. If no <cache> entry exists in <cachetune>, it does not mean that these vcpu threads use no cache resource; it means that they use the cache resource specified in the default allocation. Normally the default allocation's cache resource is shared by a lot of cache-insensitive workloads.
Right it's not overriding anything. It's just using the values for all vcpus present, but that's an implementation detail of the underlying technology which it feels like could be "papered over".
IOW: What is cache_occupancy measuring? Each cache? The entire thing? If there's no cache elements, then what?
cache_occupancy is measured per cache bank. A 2-socket Intel Xeon system is considered to have two cache banks, one per socket. The typical output for each monitor in this case is:
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2            <--- 2 cache banks
cpu.cache.0.bank.0.id=0             <--- bank.id.0 cache_occupancy
cpu.cache.0.bank.0.bytes=9371648    _|
cpu.cache.0.bank.1.id=1             <--- bank.id.1 cache_occupancy
cpu.cache.0.bank.1.bytes=1081344    _|
If you want to know the total cache occupancy for VM vcpu threads of this monitor, you need to add them up.
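As a minimal sketch of "adding them up", assuming the per-bank values have already been read into an array: the struct and function names below are hypothetical and only mirror the bank.N.id / bank.N.bytes fields of the domstats output.

```c
#include <stddef.h>

/* Hypothetical mirror of one "bank.N.id / bank.N.bytes" pair. */
struct bank_occupancy {
    unsigned int id;            /* cache bank id */
    unsigned long long bytes;   /* occupancy reported for this bank */
};

/* Hypothetical helper, not libvirt API: total occupancy of one monitor
 * is the sum of the occupancy reported for every cache bank. */
static unsigned long long
monitor_total_bytes(const struct bank_occupancy *banks, size_t nbanks)
{
    unsigned long long total = 0;
    size_t i;

    for (i = 0; i < nbanks; i++)
        total += banks[i].bytes;

    return total;
}
```

For the two banks in the sample output, 9371648 and 1081344 bytes, the total is 10452992 bytes.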
So if you have: <monitor... vcpus=0-1> what do you get in output for cache_occupancy? 0 + 1?
I honestly think this just needs to be simplified as much as possible.
When you monitor specific vcpus within a cachetune, then you get what?
In this case, the monitor you created only monitors the specific vcpus you added to the monitor.
The following two configurations satisfy your scenario, and in both of them the only monitor reports the cache usage of the thread of vcpu 2.
<cachetune vcpus='2-4'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
<cachetune vcpus='2-4'>
  <monitor level='3' vcpus='2'/>
</cachetune>
Perhaps my question was mistyped or misinterpreted. In the above top example, if we have <monitor ... vcpus='2-4'>, then do the values in <cache> have any impact on the calculation as opposed to if they weren't there?
If the cachetune has no specific cache entries, you get what?
If there is no cache entry in the cachetune, you also get the vcpu threads' cache utilization information per cache bank. No cache entry in the cachetune means the threads use the cache allocation policy of the default allocation, which is in the file /sys/fs/resctrl/schemata.
If valid cache entries are provided in the cachetune, then an allocation is created for the threads of the vcpus listed in the <cachetune> 'vcpus' attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0, the cache resource limitation is applied to these threads.
The monitor does not care whether vcpu threads are allowed to access only a limited amount of cache lines. The monitor only reports the amount of cache that has been accessed.
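The directory layout described above can be sketched as follows. This is an illustration under assumptions, not the patch's code: monitor_group_path() is a hypothetical helper, "p0" and the "vcpus_N" id are only example names, and the layout follows the kernel resctrl convention that monitoring groups live in a mon_groups directory under either the resctrl root (default allocation) or an allocation directory.

```c
#include <stdio.h>

#define RESCTRL_ROOT "/sys/fs/resctrl"

/* Hypothetical helper, not libvirt API.
 * @alloc_dir is the allocation directory name (for example "p0"), or
 * NULL when the monitor belongs to the default allocation.
 * Returns 0 on success, -1 if @buf is too small or formatting fails. */
static int
monitor_group_path(char *buf, size_t buflen,
                   const char *alloc_dir, const char *monitor_id)
{
    int n;

    if (alloc_dir)
        n = snprintf(buf, buflen, "%s/%s/mon_groups/%s",
                     RESCTRL_ROOT, alloc_dir, monitor_id);
    else
        n = snprintf(buf, buflen, "%s/mon_groups/%s",
                     RESCTRL_ROOT, monitor_id);

    /* snprintf returns the length it wanted to write; treat truncation
     * as an error. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

In the kernel interface, the per-bank occupancy would then be read from files such as mon_data/mon_L3_00/llc_occupancy below that group directory.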
If you monitor multiple vcpus within a cachetune then you get what? (?An aggregation of all?).
Yes. Suppose you have this vcpus setting for <cachetune>: <cachetune vcpus='0-4,8' ..../>
If you choose to monitor the cache usage of vcpus 0, 3 and 8, you create the following monitor entry inside the cachetune entry. The output of this monitor gives you the aggregated cache occupancy of the threads of vcpus 0, 3 and 8.
<cachetune vcpus='0-4,8'>
  <monitor level='3' vcpus='0,3,8'/>
</cachetune>
This whole default and specific description doesn't make sense.
Sorry for confusing you, I'll try to refine the descriptions.
In this last case if you also had <monitor level='3' vcpus='4'/> <monitor level='3' vcpus='0-4,8'/> then I'd expect that the values output in "0-4,8" to match those that I could add by myself with "4" and "0-3,8". True? Is it apparent yet why I'm saying mentioning default just confuses things? If so, I'm not sure what else I can do to explain.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 217 ++++++++++++++++++++- src/conf/domain_conf.h | 11 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 301 insertions(+), 5 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating + allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there + are no vCPU overlap permitted. + </dd> + </dl> + </dd> </dl> </dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -981,6 +981,16 @@ </optional> </element> </zeroOrMore> + <zeroOrMore> + <element name="monitor"> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + </element> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9a514a6..4f4604f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
static void +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) { + if (!domresmon) + return; + + virBitmapFree(domresmon->vcpus); + virObjectUnref(domresmon->instance);
VIR_FREE(domresmon);
I forgot to free the monitor itself. Will be fixed.
Coverity found this!
+} + + +static void virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { + size_t i = 0; + if (!resctrl) return;
+ for (i = 0; i < resctrl->nmonitors; i++) + virDomainResctrlMonDefFree(resctrl->monitors[i]); + virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl->monitors); VIR_FREE(resctrl); }
@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, return ret; }
Two blank lines
OK.
+/* Checking if the monitor's vcpus is conflicted with existing allocation
+ * and monitors.
+ *
+ * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a default
+ * monitor. Returns -1 if a conflict found. Returns 0 if no conflict and
+ * @vcpus is not equal to @resctrl->vcpus.
+ * */
+static int
+virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl,
+                                virBitmapPtr vcpus)
This should be *ValidateVcpus.
OK.
+{ + size_t i = 0; + int vcpu = -1; + + if (virBitmapIsAllClear(vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("vcpus is empty")); + return -1; + } + + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) { + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with allocation")); + return -1; + } + } + + if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
The ->alloc check is confusing in light of having a monitor as a child of cachetune.
I am looking at cachetune as an entity capable of doing performance tuning tasks through a better cache resource arrangement.
In a general performance tuning task, we first need to learn the resource utilization and then apply some resource arrangement, such as cache isolation or allocation. @resctrl->alloc represents the resource arranging role, and @resctrl->monitors represents the performance detecting/monitoring part.
Performance monitoring belongs to the scope of performance tuning, just like the part that does the resource limitation. Based on this understanding, I combined @alloc and @monitors and made them children of @resctrl.
If you still think it is strange to put @monitors under @resctrl, then I have to change it by creating a data structure at the level of @def->resctrls. The new _virDomainDef structure would then be:
I don't believe this is what I was saying.
struct _virDomainDef {
    ...
    virDomainResctrlDefPtr *resctrls;
    size_t nresctrls;

    virDomainResctrlMonDefPtr *monitors;
    size_t nmonitors;
    ...
}
It seems that "virDomainResctrlDefPtr" would then have to be refactored to reduce the scope that its name implies.
Further, even after refactoring like this, I would have to add a lot of code to maintain the relationship between @resctrls->vcpus and each of the @monitors->vcpus.
+ return 1; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus)) + continue; + + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with + monitors")); + + return -1; + } + } + + return 0; +} + + +static int +virDomainResctrlMonDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlMonitorType tag, + virDomainResctrlDefPtr resctrl) { + virDomainResctrlMonDefPtr domresmon = NULL; + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + unsigned int level = 0; + char * tmp = NULL; + char * id = NULL; + size_t i = 0; + int n = 0; + int rv = -1; + int ret = -1; + + ctxt->node = node; + + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract monitor nodes")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + + if (VIR_ALLOC(domresmon) < 0) + goto cleanup; + + domresmon->tag = tag; + + domresmon->instance = virResctrlMonitorNew(); + if (!domresmon->instance) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not create monitor")); + goto cleanup; + } + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + tmp = virXMLPropString(nodes[i], "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing monitor attribute 'level'")); + goto cleanup; + } + + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + + if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0) + goto cleanup; + + VIR_FREE(tmp); + } + + if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0) + goto cleanup; + + rv = virDomainResctrlMonValidateVcpu(resctrl, + domresmon->vcpus); + + /* If monitor's vcpu list is identical to allocation's vcpu list, + * set as default monitor */ + if (rv == 1 
&& resctrl->alloc)
I'm still not seeing the need for default...
As I stated, the default monitor, the non-default monitor, and the monitor in the default allocation have different characteristics and behavior in resctrl, so all of them are necessary. If we don't support the default monitor, we cannot monitor the cache usage of the whole allocation and, at the same time, the cache usage of a subset of the allocation's vcpus. If we don't support the default allocation, we will have to create many duplicated allocations that have the same cache settings, that is, that share the same cache resources. In that case, we lose the flexibility that the kernel resctrl fs provides.
FWIW: The resctrl->alloc check is unnecessary since the only way rv == 1 is if it's there.
You are right, it is checked twice. It changes to:
if (rv == 1) virResctrlMonitorSetDefault(domresmon->instance);
+ virResctrlMonitorSetDefault(domresmon->instance); + else if (rv < 0) + goto cleanup; + + if (!(tmp = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + if (virAsprintf(&id, "vcpus_%s", tmp) < 0) + goto cleanup; + + if (virResctrlMonitorSetID(domresmon->instance, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(resctrl->monitors, + resctrl->nmonitors, + domresmon) < 0) + goto cleanup; + + VIR_FREE(id); + VIR_FREE(tmp); + domresmon = NULL; + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(id); + VIR_FREE(tmp); + virDomainResctrlMonDefFree(domresmon);
VIR_FREE(nodes);
I forgot to free it. Will be added.
Again, Coverity
+ return ret; +} +
static virDomainResctrlDefPtr virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20 @@ virDomainCachetuneDefParse(virDomainDefPtr def, } }
- if (virResctrlAllocIsEmpty(alloc)) { - ret = 0; - goto cleanup; - } - resctrl = virDomainResctrlNew(alloc, vcpus);
So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could be NULL if there were no <cache> entries, IIUC.
Your understanding is right.
I'm starting to see a real downside to a resctrl->alloc == NULL. I really don't want to continue to see churn on the internal hierarchy though.
By the downside, do you mean the 'churn on the internal hierarchy'?
The downside is it makes no sense to have resctrl->alloc == NULL if a <cachetune> or <memorytune> element exists. If there's a <cachetune> or a <memorytune>, then a resctrl->alloc is created for the listed vcpus. Everything builds off of that. The <monitor> elements are children of cachetune (and I assume eventually) memorytune. The fact that the filesystem has this other "default" thing means what when a cachetune or memorytune element is found - it's a child of the default thing
However, if I look at how it could be filled in within the context of this function, the virDomainResctrlVcpuMatch call and subsequent possible allocation of @alloc would seemingly be possible outside the context of whether specific <cache> entries existed.
If there is no <cache> entry, the statement 'n = virXPathNodeSet("./cache", ctxt, &nodes)' sets n to 0. If n is 0, virDomainResctrlVcpuMatch is not called and the subsequent steps that create @alloc are not executed.
Probably could just do away with the term default allocation - all it seems to be is an allocation without <cache> elements, but it can have <monitor> elements.
Correct, this is how code works.
If someone places a <cachetune> without <cache> and without <monitor>, so what - who cares. Probably doesn't do much other than limit other <cachetune> (and perhaps <memorytune>) elements.
Yes.
if (!resctrl) goto cleanup;
+ if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { + ret = 0; + goto cleanup; + } +
Moving the AllocIsEmpty check should be separate.
Got it. Will be done.
I'm losing steam, but the next couple of patches had Coverity issues, so I figured I'll note that before going back to read all the comments you've posted today while I was reviewing without trying to go back.
John
I used a lot of paragraphs to introduce the usage of 'default monitor', 'default allocation' and 'monitor in non-default allocation', and to state that these 'default' concepts are necessary for saving RMIDs, removing allocation duplication and keeping the monitor flexibility that the kernel interface provides. I hope I explained them clearly.
Yes, lots of text, perhaps way too much. Still I'm no closer to figuring out what the need for this default wording/mechnism really is. John
Thanks for review. Huaqiang
if (virDomainResctrlAppend(def, node, resctrl, flags) < 0) goto cleanup;
@@ -27085,12 +27255,42 @@ virDomainCachetuneDefFormatHelper(unsigned int level,
 static int
+virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon,
+                                   virResctrlMonitorType tag,
+                                   virBufferPtr buf)
+{
+    char *vcpus = NULL;
+    unsigned int level = 0;
+
+    if (domresmon->tag != tag)
+        return 0;
+
+    virBufferAddLit(buf, "<monitor ");
+
+    if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
+        level = virResctrlMonitorGetCacheLevel(domresmon->instance);
+        virBufferAsprintf(buf, "level='%u' ", level);
+    }
+
+    vcpus = virBitmapFormat(domresmon->vcpus);
+    if (!vcpus)
+        return -1;
+
+    virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
+
+    VIR_FREE(vcpus);
+    return 0;
+}
+
+
+static int
 virDomainCachetuneDefFormat(virBufferPtr buf,
                             virDomainResctrlDefPtr resctrl,
                             unsigned int flags)
 {
     virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
     char *vcpus = NULL;
+    size_t i = 0;
     int ret = -1;

     virBufferSetChildIndent(&childrenBuf, buf);
@@ -27099,6 +27299,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
                                           &childrenBuf) < 0)
         goto cleanup;

+    for (i = 0; i < resctrl->nmonitors; i ++) {
+        if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
+                                               VIR_RESCTRL_MONITOR_TYPE_CACHE,
+                                               &childrenBuf) < 0)
+            goto cleanup;
+    }
+
     if (virBufferCheckError(&childrenBuf) < 0)
         goto cleanup;
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e30a4b2..60f6464 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2236,12 +2236,23 @@ struct _virDomainCputune { };
+typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef; +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; struct +_virDomainResctrlMonDef { + virBitmapPtr vcpus; + virResctrlMonitorType tag; + virResctrlMonitorPtr instance; +}; + typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr;
struct _virDomainResctrlDef { virBitmapPtr vcpus; virResctrlAllocPtr alloc; + + virDomainResctrlMonDefPtr *monitors; + size_t nmonitors; };
diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml index 9718f06..9f4c139 100644 --- a/tests/genericxml2xmlindata/cachetune-cdp.xml +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -8,9 +8,12 @@ <cachetune vcpus='0-1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='2'> <cache id='1' level='3' type='code' size='6' unit='MiB'/> + <monitor level='3' vcpus='2'/> </cachetune> <cachetune vcpus='3'> <cache id='1' level='3' type='data' size='6912' unit='KiB'/> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml new file mode 100644 index 0000000..d481fb5 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='2'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml index ab2d9cf..748be08 
100644 --- a/tests/genericxml2xmlindata/cachetune-small.xml +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -7,6 +7,13 @@ <cputune> <cachetune vcpus='0-1'> <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-1'/> + </cachetune> + <cachetune vcpus='2-3'> + <monitor level='3' vcpus='2'/> + <monitor level='3' vcpus='3'/> </cachetune> </cputune> <os> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index fa941f0..4393d44 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -137,6 +137,8 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-monitor", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("memorytune"); DO_TEST_FULL("memorytune-colliding-allocs", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);

On 10/13/2018 6:29 AM, John Ferlan wrote:
-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 4:58 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Supports two kinds of monitors: a monitor under the default allocation, or a monitor under a particular allocation.
I still don't see what the big difference is with singling out default. Does it really matter - what advantage is there? Having a monitor for 'n' vcpus is a choice.
Monitor supervises the cache or memory bandwidth usage for
At this point memory bandwidth is not in play...
Yes. Will remove the memory bandwidth related wording.
interested vcpu thread set; if the vcpu thread set belongs to some resctrl allocation, then the monitor will be created under this allocation, that is, a resctrl monitoring group directory is created under the directory '@alloc->path/mon_group'. Otherwise, the monitor will be created under the default allocation.
This is becoming increasingly difficult to describe/decipher - makes me wonder who would really use it.
A monitor in the default allocation would commonly be used when we want to observe the resource usage of VM vcpu threads that have no dedicated resource limitation applied through CAT or MBA technology.
And another common case is it will be used if CAT/MBA is not supported but CMT/MBM is enabled in libvirt.
The reason I introduced the monitor in the previous patch, and further introduce the monitor in the default allocation in this patch, is that monitors in the two different kinds of allocation behave differently.
Let's focus on CAT and CMT technology in the lines below, since MBA and MBM are very similar cases.
As we know, before CAT technology was introduced, any process in Linux shared the L3 cache with all other active processes. With CAT enabled, libvirt has the capability to apply cache isolation and assign a dedicated amount of cache to some key VM vcpu threads.
If we want to observe the real L3 cache usage information, then we need the help of a monitor.
== Monitor usage case 1: monitor in default allocation ==
If you want to get the cache utilization data before applying any cache isolation to the VM vcpu threads, you need to create a monitor in the default allocation, because you haven't created any cache allocation.
== Monitor usage case 2: monitor in non-default allocation ==
If you have created a cache allocation for specific VM vcpu threads and are not sure how many cache lines these VM vcpu threads really use, you need to create a monitor under this allocation to get the real cache usage information.
If you find that your VM vcpu threads only use a small part of the cache that has been allocated, you might consider reducing the allocation.
== Usage for default monitor and non-default monitor ==
Since we introduced the 'default monitor' and 'non-default monitor' concepts in previous patches, you can now monitor the cache usage of all VM vcpu threads added to this allocation, and you can also monitor a subset of the allocation's vcpu list.
On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
Without 'default monitor', it is impossible to get the cache usage for all vcpu threads in the allocation and, at the same time, get the cache usage for some vcpu threads of particular interest within the allocation.
"Default monitor" is in name only. It's just "a" monitor for all the threads in cachetune (or memorytune eventually); whereas, a "non default monitor" is "a" monitor for specific vcpus within the range in cachetune.
Agree. As stated in my reply to patch 10, 'default monitor' will be removed, and the related code will be cleaned up.
Thus your description is that a monitor can be a monitor for all vcpus or a monitor for some subset of all vcpus. A <monitor> describes which vcpus of a cachetune (or memorytune) can be monitored - there are 3 options - "all", "m-to-n", or "one-to-one".
Thanks for the suggestion. Then nothing will be referred to as 'default'; code and documentation will be cleaned up accordingly.
What they relate to in the filesystem is the magic of the code of paths to the data. What data structures are called or how this is described in docs just doesn't seem to need to make a delineation.
Agree.
== Monitor usage case 3: allocating dedicated cache to one VM and monitoring its usage, and observing its influence on another VM ==
Consider a scenario with two VMs. One VM is known to be very cache sensitive and does not want to share cache with other workloads; for the other VM we have no knowledge of its cache requirements. Cache usage must be monitored for both VMs.
With the concepts introduced so far, we need to create an allocation for the first VM, then create a default monitor in this allocation for monitoring. For the other VM, we need to create a non-default monitor under the default allocation.
With the introduced concepts (the allocation, the default allocation, the monitor and the default monitor) it is possible to fulfill the requirements of all scenarios.
I think it's far easier to describe 2 things rather than 4.
Agree. Will be done. Only 'allocation' and 'monitor'.
The creation of monitors does not have much flexibility; as I stated in my reply to the review comments on v5 patch 0, monitors need to follow the rules below:
1. In each <cachetune> entry, more than one monitor can be specified.
2. In each <cachetune> entry, at most one allocation can be specified.
3. The allocation uses the vcpu list specified in the <cachetune> attribute 'vcpus'.
An allocation is either from the list of vcpus for a cachetune or a memorytune if cachetune for the listed vcpus in a memorytune doesn't exist (and yes, doesn't conflict with any vcpus in any found cachetune).
If the domain was created with 4 vcpus, then is "theoretically" the default allocation for 4 vcpus. Allocations beyond that allow one to slice and dice *as long as* there is no overlap with the definitions.
I don't think "default" or "hidden" really needs to be described or called out, it just is. Maybe I'm oversimplifying.
After going through the code, I think the 'default allocation' and 'default monitor' could be removed, as I stated, without any loss of the functionality currently implemented.
4. A monitor with the same vcpu list as the allocation is allowed; this monitor is the allocation's default monitor.
5. A monitor whose vcpu list is a subset of the allocation's vcpu list is allowed.
6. For non-default monitors, no vcpu list overlap is permitted.
Also, the number of monitors that can be created is limited by the number of hardware RMIDs.
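Rules 4 to 6 can be sketched as a small standalone check. This is only an illustration under simplifying assumptions: vcpu lists are modelled as 64-bit masks instead of virBitmap, the names `mon_check` and `check_monitor_vcpus` are hypothetical, and, unlike virDomainResctrlMonValidateVcpu in the patch, the sketch does not look at whether an allocation object exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch only. */
enum mon_check {
    MON_CONFLICT = -1, /* empty, outside the allocation, or overlapping */
    MON_PARTIAL = 0,   /* proper subset of the allocation's vcpus */
    MON_WHOLE = 1      /* same vcpu list as the allocation (rule 4) */
};

/* vcpu sets are 64-bit masks: bit N set means vcpu N is a member. */
static enum mon_check
check_monitor_vcpus(uint64_t alloc_vcpus,
                    const uint64_t *monitors, size_t nmonitors,
                    uint64_t candidate)
{
    size_t i;

    if (candidate == 0)
        return MON_CONFLICT;

    /* Rule 5: every monitored vcpu must belong to the allocation. */
    if (candidate & ~alloc_vcpus)
        return MON_CONFLICT;

    /* Rule 4: a monitor covering the whole allocation is always allowed. */
    if (candidate == alloc_vcpus)
        return MON_WHOLE;

    /* Rule 6: partial monitors must not overlap each other; an existing
     * whole-allocation monitor is skipped, as in the patch. */
    for (i = 0; i < nmonitors; i++) {
        if (monitors[i] == alloc_vcpus)
            continue;
        if (candidate & monitors[i])
            return MON_CONFLICT;
    }

    return MON_PARTIAL;
}
```

With an allocation on vcpus 0-1 (mask 0x3), a monitor on vcpu 2 (mask 0x4) is rejected, which is the case exercised by the cachetune-colliding-monitor.xml test in this patch.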
About the behavior of underlying '/sys/fs/resctrl' file system, you can get more detail from this link: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
A monitor in the default allocation has this kind of XML layout:
<cachetune vcpus='1'>
  <monitor level='3' vcpus='1'/>
</cachetune>
For other types of monitor, the XML layout looks like:
<cachetune vcpus='2-4'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
Since all we get is the "cache_occupancy" that would seem to me to be most important when there is a <cache> bank, right? So what would the first (or your default style) collect/display. Or does cache not really matter.
<cache> does not matter for the monitor. If there is no CAT in the system, the <cache> entry will never be shown under <cachetune>.
If valid <cache> entries exist, it means some part of the cache lines is allocated to the VM vcpu threads specified in the attribute 'vcpus'. If no <cache> entry exists in <cachetune>, it does not mean these vcpu threads do not use any cache resource; it means they use the cache resource specified in the default allocation. Normally the default allocation's cache resource is shared by many cache-insensitive workloads.
Right it's not overriding anything. It's just using the values for all vcpus present, but that's an implementation detail of the underlying technology which it feels like could be "papered over".
IOW: What is cache_occupancy measuring? Each cache? The entire thing? If there's no cache elements, then what?
cache_occupancy is measured per cache bank. An Intel 2-socket Xeon CPU is considered as two cache banks, one cache bank per socket. The typical output for each monitor in this case is:
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2            <--- 2 cache banks
cpu.cache.0.bank.0.id=0             <--- bank.id.0 cache_occupancy
cpu.cache.0.bank.0.bytes=9371648     _|
cpu.cache.0.bank.1.id=1             <--- bank.id.1 cache_occupancy
cpu.cache.0.bank.1.bytes=1081344     _|
If you want to know the total cache occupancy for VM vcpu threads of this monitor, you need to add them up.
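The "add them up" step can be written out as a short sketch. The `bank_occupancy` structure and the `total_occupancy` helper are hypothetical names used only for illustration; they are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* One per-bank reading, mirroring cpu.cache.N.bank.M.id / .bytes. */
struct bank_occupancy {
    unsigned int id;
    uint64_t bytes;
};

/* Total occupancy of a monitor is the sum over all cache banks. */
static uint64_t
total_occupancy(const struct bank_occupancy *banks, size_t nbanks)
{
    uint64_t total = 0;
    size_t i;

    for (i = 0; i < nbanks; i++)
        total += banks[i].bytes;

    return total;
}
```

For the sample output above, the two banks add up to 9371648 + 1081344 = 10452992 bytes for monitor vcpus_1.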
So if you have:
<monitor... vcpus=0-1>
what do you get in output for cache_occupancy? 0 + 1?
Yes. The output is the sum of the two vcpus.
for cache bank 0
    vcpus_0-1.bank.0.bytes = vcpus_0.bank.0.bytes + vcpus_1.bank.0.bytes
for cache bank 1
    vcpus_0-1.bank.1.bytes = vcpus_0.bank.1.bytes + vcpus_1.bank.1.bytes
I honestly think this just needs to be simplified as much as possible.
"I honestly think this just needs to be simplified as much as possible."
I reconsidered your comment (in the line above). Do you mean the XML configuration for 'monitor' needs to be simplified as well?
What I think is that, even after removing the 'default monitor' and 'default allocation' concepts, the XML configuration for monitors (of type 'all', 'm-to-n', 'one-to-one') still needs this kind of arrangement.
Take an example: a VM has 4 vcpus; vcpus 0 and 1 run a cache-sensitive workload and want to hold private L3 cache, and there is no specific requirement for the remaining vcpus, but their cache usage still needs to be monitored. Then we could create a cache allocation for vcpus 0 and 1, as well as a monitor to get the actual cache that these two vcpus use. For vcpus 2 and 3, we create only a monitor.
The XML configuration is (no change in the general rules compared to my previous examples):
<cachetune vcpus='0-1'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
  <monitor level='3' vcpus='2-3'/>
</cachetune>
Any suggestion from you is welcome.
When you monitor specific vcpus within a cachetune, then you get what?
In this case, the monitor you created only monitors the specific vcpus you listed for the monitor.
The following two configurations both satisfy your scenario; in each, the only monitor reports the cache usage of the thread of vcpu 2.
<cachetune vcpus='2-4'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
<cachetune vcpus='2-4'>
  <monitor level='3' vcpus='2'/>
</cachetune>
Perhaps my question was mistyped or misinterpreted. In the above top example, if we have <monitor ... vcpus='2-4'>, then do the values in <cache> have any impact on the calculation as opposed to if they weren't there?
Perhaps I still don't understand you well ... The <cache> entries have a significant influence on the monitor's output if they exist and vcpus 2-4 demand much more cache than the allocation can offer; if the cache that the allocation offers is much bigger than what vcpus 2-4 actually use, the influence will be tiny. In the other case, where there are no 'cache' entries, as shown in the second example, the output is still influenced by the cache that the 'allocation' offers. The difference from the first example is that the first example uses the cache resources of its own allocation, while the second example uses the resources defined in /sys/fs/resctrl/schemata, and this cache is shared by multiple system tasks.
If the cachetune has no specific cache entries, you get what?
If there is no cache entry in cachetune, you still get the vcpu threads' cache utilization information per cache bank. No cache entry specified for the cachetune means it uses the cache allocation policy of the default cache allocation, which is the file /sys/fs/resctrl/schemata.
If valid cache entries are provided in cachetune, then an allocation will be created for the threads of the vcpus listed in the <cachetune> 'vcpus' attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0, the cache resource limitation is applied to these threads.
The monitor does not care whether vcpu threads are restricted to a limited amount of cache lines or not. The monitor only reports the amount of cache that has been used.
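To make the filesystem side concrete, here is a minimal sketch of where the occupancy counter for one cache bank lives, following the layout described in the kernel's resctrl documentation (mon_groups/, mon_data/mon_L3_XX/llc_occupancy). The helper name `llc_occupancy_path` and the group name "vcpus_2" are assumptions for illustration; the directory names libvirt actually uses for its monitoring groups are not specified in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Build the path of the llc_occupancy file for one L3 cache bank.
 * alloc_dir is the allocation directory ("/sys/fs/resctrl" for the
 * default allocation); mon_group is NULL to read the allocation's own
 * counters, or the name of a child monitoring group.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
llc_occupancy_path(char *buf, size_t buflen, const char *alloc_dir,
                   const char *mon_group, unsigned int bank_id)
{
    int n;

    if (mon_group)
        n = snprintf(buf, buflen,
                     "%s/mon_groups/%s/mon_data/mon_L3_%02u/llc_occupancy",
                     alloc_dir, mon_group, bank_id);
    else
        n = snprintf(buf, buflen,
                     "%s/mon_data/mon_L3_%02u/llc_occupancy",
                     alloc_dir, bank_id);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The same helper covers both cases discussed above: a monitor inside the allocation /sys/fs/resctrl/p0 and a monitor for threads that stay in the default allocation at /sys/fs/resctrl.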
If you monitor multiple vcpus within a cachetune then you get what? (?An aggregation of all?).
Yes. Supposing you have this vcpus setting for <cachetune>:
<cachetune vcpus='0-4,8' ..../>
and you choose to monitor the cache usage for vcpus 0, 3 and 8, then you create the following monitor entry inside the cachetune entry; the monitor's output gives you the aggregated cache occupancy for the threads of vcpus 0, 3 and 8.
<cachetune vcpus='0-4,8'>
  <monitor level='3' vcpus='0,3,8'/>
</cachetune>
This whole default and specific description doesn't make sense.
Sorry for confusing you, I'll try to refine the descriptions.
In this last case if you also had
<monitor level='3' vcpus='4'/>
<monitor level='3' vcpus='0-4,8'/>
then I'd expect that the values output in "0-4,8" to match those that I could add by myself with "4" and "0-3,8". True?
Yes.
Is it apparent yet why I'm saying mentioning default just confuses things? If so, I'm not sure what else I can do to explain.
I agree with the conclusion that 'default xxx' is confusing. But I hope you understand that a monitor with the same vcpu list as the allocation is created along with the allocation, whether or not you define a <monitor> in <cachetune> with the same 'vcpus' setting as the allocation in the XML configuration. This is the behavior of the kernel resctrl fs. To get the cache utilization information for the whole allocation, enabling this system-created monitor is the most economical way in terms of saving RMIDs.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 217 ++++++++++++++++++++- src/conf/domain_conf.h | 11 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 301 insertions(+), 5 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating + allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there + are no vCPU overlap permitted. + </dd> + </dl> + </dd> </dl> </dd>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -981,6 +981,16 @@ </optional> </element> </zeroOrMore> + <zeroOrMore> + <element name="monitor"> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + </element> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9a514a6..4f4604f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
static void +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) { + if (!domresmon) + return; + + virBitmapFree(domresmon->vcpus); + virObjectUnref(domresmon->instance);
VIR_FREE(domresmon);
I forgot to free the monitor itself. Will be fixed.
Coverity found this!
Still thank you.
+} + + +static void virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { + size_t i = 0; + if (!resctrl) return;
+ for (i = 0; i < resctrl->nmonitors; i++) + virDomainResctrlMonDefFree(resctrl->monitors[i]); + virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl->monitors); VIR_FREE(resctrl); }
@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, return ret; }
Two blank lines OK.
+/* Checking if the monitor's vcpus is conflicted with existing +allocation + * and monitors. + * + * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a +default + * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict +and + * @vcpus is not equal to @resctrl->vcpus. + * */ +static int +virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl, + virBitmapPtr vcpus)
This should be *ValidateVcpus.
OK.
+{ + size_t i = 0; + int vcpu = -1; + + if (virBitmapIsAllClear(vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("vcpus is empty")); + return -1; + } + + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) { + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with allocation")); + return -1; + } + } + + if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
The ->alloc check is confusing in light of having a monitor as a child of cachetune.
I look at cachetune as an entity capable of performing performance-tuning tasks through a better arrangement of cache resources.
In a general performance tuning task, we first need to get the resource utilization information and then apply some resource arrangement, such as cache isolation or allocation. @resctrl->alloc represents the resource-arranging role; @resctrl->monitors are for the performance detecting/monitoring part.
Performance monitoring belongs to the scope of performance tuning, just like the part doing the resource limitation. Based on this understanding, I combined @alloc and @monitors, and made them children of @resctrl.
If you still think it is strange to put @monitors under @resctrl, then I have to change it, creating a data structure at the level of @def->resctrls; the new _virDomainDef structure would then be:
I don't believe this is what I was saying.
struct _virDomainDef { ... virDomainResctrlDefPtr *resctrls; size_t nresctrls;
virDomainResctrlMonDefPtr *monitors; size_t nmonitors; ... }
It seems "virDomainResctrlDefPtr" would then have to be refactored, reducing the scope that its name implies.
Further, even after refactoring like this, I would have to add a lot of code to maintain the relationship between @resctrls->vcpus and each of @monitors->vcpus.
+ return 1; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus)) + continue; + + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with + monitors")); + + return -1; + } + } + + return 0; +} + + +static int +virDomainResctrlMonDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlMonitorType tag, + virDomainResctrlDefPtr resctrl) { + virDomainResctrlMonDefPtr domresmon = NULL; + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + unsigned int level = 0; + char * tmp = NULL; + char * id = NULL; + size_t i = 0; + int n = 0; + int rv = -1; + int ret = -1; + + ctxt->node = node; + + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract monitor nodes")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + + if (VIR_ALLOC(domresmon) < 0) + goto cleanup; + + domresmon->tag = tag; + + domresmon->instance = virResctrlMonitorNew(); + if (!domresmon->instance) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not create monitor")); + goto cleanup; + } + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + tmp = virXMLPropString(nodes[i], "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing monitor attribute 'level'")); + goto cleanup; + } + + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + + if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0) + goto cleanup; + + VIR_FREE(tmp); + } + + if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0) + goto cleanup; + + rv = virDomainResctrlMonValidateVcpu(resctrl, + domresmon->vcpus); + + /* If monitor's vcpu list is identical to allocation's vcpu list, + * set as default monitor */ + if (rv == 1 
&& resctrl->alloc)
I'm still not seeing the need for default...
As I stated, the default monitor, the monitor, and the monitor in the default allocation have different characteristics and behavior in resctrl; they are necessary. If we don't support the default monitor, we cannot monitor the cache usage of the whole allocation and, at the same time, the cache usage of a subset of the allocation's vcpus. If we don't support the default allocation, we will have to create many duplicated allocations that have the same cache settings, that is, that share the same cache resources. In that case, we lose the flexibility that the kernel resctrl fs provides.
FWIW: The resctrl->alloc check is unnecessary since the only way rv == 1 is if it's there.
You are right, it is checked twice; it changes to
if (rv == 1) virResctrlMonitorSetDefault(domresmon->instance);
+ virResctrlMonitorSetDefault(domresmon->instance); + else if (rv < 0) + goto cleanup; + + if (!(tmp = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + if (virAsprintf(&id, "vcpus_%s", tmp) < 0) + goto cleanup; + + if (virResctrlMonitorSetID(domresmon->instance, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(resctrl->monitors, + resctrl->nmonitors, + domresmon) < 0) + goto cleanup; + + VIR_FREE(id); + VIR_FREE(tmp); + domresmon = NULL; + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(id); + VIR_FREE(tmp); + virDomainResctrlMonDefFree(domresmon);
VIR_FREE(nodes);
I forgot to free it. Will be added.
Again, Coverity
Thank you again. Hope someday I can hold the power of Coverity ...
+ return ret; +} +
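The monitor ID built in the parser above is "vcpus_" followed by the formatted vcpu list, which is where names such as vcpus_1 and vcpus_4-6 in the domstats output come from. A standalone sketch of that naming, assuming a 64-bit mask in place of virBitmap and a hypothetical helper name `format_monitor_id` (this is not libvirt's virBitmapFormat, only an imitation of its range notation):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write "vcpus_<list>" into buf, where <list> uses range notation
 * such as "0-4,8". Returns 0 on success, -1 if buf is too small. */
static int
format_monitor_id(uint64_t vcpus, char *buf, size_t buflen)
{
    size_t used;
    int n;
    int bit = 0;
    int first = 1;

    n = snprintf(buf, buflen, "vcpus_");
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    used = (size_t)n;

    while (bit < 64) {
        int start, end;

        if (!(vcpus & (UINT64_C(1) << bit))) {
            bit++;
            continue;
        }

        /* Extend the run of consecutive set bits. */
        start = bit;
        while (bit < 64 && (vcpus & (UINT64_C(1) << bit)))
            bit++;
        end = bit - 1;

        if (start == end)
            n = snprintf(buf + used, buflen - used, "%s%d",
                         first ? "" : ",", start);
        else
            n = snprintf(buf + used, buflen - used, "%s%d-%d",
                         first ? "" : ",", start, end);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;

        used += (size_t)n;
        first = 0;
    }

    return 0;
}
```

Because the ID is derived only from the vcpu list, two monitors with the same vcpus in one domain would get the same ID, which fits the rule that partial monitors must not overlap.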
static virDomainResctrlDefPtr virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20 @@ virDomainCachetuneDefParse(virDomainDefPtr def, } }
- if (virResctrlAllocIsEmpty(alloc)) { - ret = 0; - goto cleanup; - } - resctrl = virDomainResctrlNew(alloc, vcpus);
So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could be NULL if there were no <cache> entries, IIUC.
Your understanding is right.
I'm starting to see a real downside to a resctrl->alloc == NULL. I really don't want to continue to see churn on the internal hierarchy though.
By the downside, do you mean the 'churn on the internal hierarchy'?
The downside is it makes no sense to have resctrl->alloc == NULL if a <cachetune> or <memorytune> element exists.
Yes. virresctrl.c could determine a monitor is a default one by checking @resctrl->alloc.
If there's a <cachetune> or a <memorytune>, then a resctrl->alloc is created for the listed vcpus. Everything builds off of that.
The <monitor> elements are children of cachetune (and I assume eventually) memorytune. The fact that the filesystem has this other "default" thing means what when a cachetune or memorytune element is found - it's a child of the default thing
However, if I look at how it could be filled in within the context of this function, the virDomainResctrlVcpuMatch call and subsequent possible allocation of @alloc would seemingly be possible outside the context of whether specific <cache> entries existed.
If there is no <cache> entry, the 'n = virXPathNodeSet("./cache", ctxt, &nodes)' statement assigns 0 to n. If n is 0, virDomainResctrlVcpuMatch is not called and the subsequent steps for creating @alloc are not executed.
Probably could just do away with the term default allocation - all it seems to be is an allocation without <cache> elements, but it can have <monitor> elements.
Correct, this is how the code works.
If someone places a <cachetune> without <cache> and without <monitor>, so what - who cares. Probably doesn't do much other than limit other <cachetune> (and perhaps <memorytune>) elements.
Yes.
if (!resctrl) goto cleanup;
+ if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { + ret = 0; + goto cleanup; + } +
Moving the AllocIsEmpty check should be separate.
Got it. Will be done.
I'm losing steam, but the next couple of patches had Coverity issues, so I figured I'll note that before going back to read all the comments you've posted today while I was reviewing without trying to go back.
John
I used a lot of paragraphs to introduce the usage of 'default monitor', 'default allocation' and 'monitor in non-default allocation', and to state that these 'default' concepts are necessary for saving RMIDs, removing duplicated allocations and keeping the monitor flexibility that the kernel interface provides. I hope I explained them clearly.
Yes, lots of text, perhaps way too much. Still I'm no closer to figuring out what the need for this default wording/mechanism really is.
John
As stated in the prior paragraph, I will remove 'default monitor' and 'default allocation' and clean up the code and comments. Did I miss anything?
BTW, I find that the 'virsh domstats --cpu-total' output for monitors, introduced in patch 18, is not good enough. The current output is:
Domain: 'ubuntu16.04-base'
  cpu.cache.monitor.count=2
  cpu.cache.0.name=vcpus_0
  cpu.cache.0.vcpus=0
  cpu.cache.0.bank.count=2
  cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.0.bytes=9371648
  cpu.cache.0.bank.1.id=1
  cpu.cache.0.bank.1.bytes=1081344
  cpu.cache.1.name=vcpus_3
  cpu.cache.1.vcpus=3
  cpu.cache.1.bank.count=2
  cpu.cache.1.bank.0.id=0
  cpu.cache.1.bank.0.bytes=630784
  cpu.cache.1.bank.1.id=1
  cpu.cache.1.bank.1.bytes=10452992
I may change the output to the following by adding 'monitor' to each line:
Domain: 'ubuntu16.04-base'
  cpu.cache.monitor.count=2
  cpu.cache.monitor.0.name=vcpus_0
  cpu.cache.monitor.0.vcpus=0
  cpu.cache.monitor.0.bank.count=2
  cpu.cache.monitor.0.bank.0.id=0
  cpu.cache.monitor.0.bank.0.bytes=9371648
  cpu.cache.monitor.0.bank.1.id=1
  cpu.cache.monitor.0.bank.1.bytes=1081344
  cpu.cache.monitor.1.name=vcpus_3
  cpu.cache.monitor.1.vcpus=3
  cpu.cache.monitor.1.bank.count=2
  cpu.cache.monitor.1.bank.0.id=0
  cpu.cache.monitor.1.bank.0.bytes=630784
  cpu.cache.monitor.1.bank.1.id=1
  cpu.cache.monitor.1.bank.1.bytes=10452992
Please take this change into consideration when you review patch 18.
Thanks for the review.
Huaqiang
     if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
         goto cleanup;

@@ -27085,12 +27255,42 @@ virDomainCachetuneDefFormatHelper(unsigned int level,

 static int
+virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon,
+                                   virResctrlMonitorType tag,
+                                   virBufferPtr buf)
+{
IIUC.
+    char *vcpus = NULL;
+    unsigned int level = 0;
+
+    if (domresmon->tag != tag)
+        return 0;
+
+    virBufferAddLit(buf, "<monitor ");
+
+    if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
+        level = virResctrlMonitorGetCacheLevel(domresmon->instance);
+        virBufferAsprintf(buf, "level='%u' ", level);
+    }
+
+    vcpus = virBitmapFormat(domresmon->vcpus);
+    if (!vcpus)
+        return -1;
+
+    virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
+
+    VIR_FREE(vcpus);
+    return 0;
+}
+
+
+static int
 virDomainCachetuneDefFormat(virBufferPtr buf,
                             virDomainResctrlDefPtr resctrl,
                             unsigned int flags)
 {
     virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
     char *vcpus = NULL;
+    size_t i = 0;
     int ret = -1;

     virBufferSetChildIndent(&childrenBuf, buf);
@@ -27099,6 +27299,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
                                      &childrenBuf) < 0)
         goto cleanup;

+    for (i = 0; i < resctrl->nmonitors; i ++) {
+        if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
+                                               VIR_RESCTRL_MONITOR_TYPE_CACHE,
+                                               &childrenBuf) < 0)
+            goto cleanup;
+    }
+
     if (virBufferCheckError(&childrenBuf) < 0)
         goto cleanup;

diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index e30a4b2..60f6464 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2236,12 +2236,23 @@ struct _virDomainCputune {
 };

+typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef;
+typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr;
+struct _virDomainResctrlMonDef {
+    virBitmapPtr vcpus;
+    virResctrlMonitorType tag;
+    virResctrlMonitorPtr instance;
+};
+
 typedef struct _virDomainResctrlDef virDomainResctrlDef;
 typedef virDomainResctrlDef *virDomainResctrlDefPtr;

 struct _virDomainResctrlDef {
     virBitmapPtr vcpus;
     virResctrlAllocPtr alloc;
+
+    virDomainResctrlMonDefPtr *monitors;
+    size_t nmonitors;
 };
diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml index 9718f06..9f4c139 100644 --- a/tests/genericxml2xmlindata/cachetune-cdp.xml +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -8,9 +8,12 @@ <cachetune vcpus='0-1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='2'> <cache id='1' level='3' type='code' size='6' unit='MiB'/> + <monitor level='3' vcpus='2'/> </cachetune> <cachetune vcpus='3'> <cache id='1' level='3' type='data' size='6912' unit='KiB'/> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml new file mode 100644 index 0000000..d481fb5 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='2'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml index ab2d9cf..748be08 
100644 --- a/tests/genericxml2xmlindata/cachetune-small.xml +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -7,6 +7,13 @@ <cputune> <cachetune vcpus='0-1'> <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-1'/> + </cachetune> + <cachetune vcpus='2-3'> + <monitor level='3' vcpus='2'/> + <monitor level='3' vcpus='3'/> </cachetune> </cputune> <os> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index fa941f0..4393d44 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -137,6 +137,8 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-monitor", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("memorytune"); DO_TEST_FULL("memorytune-colliding-allocs", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);

On 10/15/18 11:25 AM, Wang, Huaqiang wrote:
On 10/13/2018 6:29 AM, John Ferlan wrote:
On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
-----Original Message-----
[...]
IOW: What is cache_occupancy measuring? Each cache? The entire thing? If there's no cache elements, then what? cache_occupancy is measured per cache bank. A two-socket Intel Xeon CPU is treated as two cache banks, one cache bank per socket. The typical output for each monitor in this case is:
     cpu.cache.0.name=vcpus_1
     cpu.cache.0.vcpus=1
     cpu.cache.0.bank.count=2           <--- 2 cache banks
     cpu.cache.0.bank.0.id=0            <--- bank.id.0 cache_occupancy
     cpu.cache.0.bank.0.bytes=9371648   _|
     cpu.cache.0.bank.1.id=1            <--- bank.id.1 cache_occupancy
     cpu.cache.0.bank.1.bytes=1081344   _|
If you want to know the total cache occupancy for VM vcpu threads of this monitor, you need to add them up.
So if you have:
   <monitor... vcpus=0-1>
what do you get in output for cache_occupancy? 0 + 1?
Yes. Output is sum of two vcpus.
for cache bank 0
    vcpus_0-1.bank.0.bytes = vcpus_0.bank.0.bytes + vcpus_1.bank.0.bytes
for cache bank 1
    vcpus_0-1.bank.1.bytes = vcpus_0.bank.1.bytes + vcpus_1.bank.1.bytes
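The "add them up" step for one monitor can be sketched in C as below. This is only an illustration under assumptions: `struct bank_reading` and `total_occupancy` are hypothetical names, not libvirt types or APIs, and the per-bank byte counts are taken to be already read from the monitor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one per-bank reading of a monitor
 * (mirrors the bank.N.id / bank.N.bytes pairs in the domstats output). */
struct bank_reading {
    unsigned int id;   /* cache bank id */
    uint64_t bytes;    /* occupancy reported for that bank */
};

/* Total cache occupancy of a monitor: the sum over all of its banks. */
uint64_t
total_occupancy(const struct bank_reading *banks, size_t nbanks)
{
    uint64_t sum = 0;
    size_t i;

    assert(banks || nbanks == 0);

    for (i = 0; i < nbanks; i++)
        sum += banks[i].bytes;

    return sum;
}
```

With the two bank values from the sample output above (9371648 and 1081344 bytes), the total for that monitor is 10452992 bytes.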
I honestly think this just needs to be simplified as much as possible.
"I honestly think this just needs to be simplified as much as possible."
I reconsidered your comment (in the line above): do you mean that the XML configuration for 'monitor' needs to be simplified as well?
This is/was a comment regarding default stuff which you are removing.
What I think is that, even after removing the 'default monitor' and 'default allocation' concepts, the XML configuration for monitors (of type 'all', 'm-to-n', 'one to one') still needs this kind of arrangement.
Take an example: a VM has 4 vcpus. vcpus 0 and 1 run a cache-sensitive workload and want to hold private L3 cache; there is no specific requirement for the remaining vcpus, but their cache usage still needs to be monitored.
Then we could create a cache allocation for vcpus 0 and 1, together with a monitor that reports the cache these two vcpus actually use. For vcpus 2 and 3, we create only a monitor.
The XML configuration is as follows (no change in the general rules compared to my previous examples):
   <cachetune vcpus='0-1'>
     <cache id='0' level='3' type='both' size='3' unit='MiB'/>
     <cache id='1' level='3' type='both' size='3' unit='MiB'/>
     <monitor level='3' vcpus='0-1'/>
   </cachetune>
   <cachetune vcpus='2-3'>
     <monitor level='3' vcpus='2-3'/>
   </cachetune>
Any suggestion from you is welcome.
I'm not sure what the question is and I'm not sure it matters at this point. If you only create an allocation for any <cachetune> or <memorytune> entry, then that's all that'll be reported which is what I was trying to point out. It's not that something else may or may not exist, it's what gets reported and can be queried via the XML.
When you monitor specific vcpus within a cachetune, then you get what? In this case, the monitor you created only monitors the specific vcpus you added for monitor.
The following two configurations both satisfy your scenario; in each of them the single monitor reports the cache usage of the vcpu 2 thread.
     <cachetune vcpus='2-4'>
       <cache id='0' level='3' type='both' size='3' unit='MiB'/>
       <cache id='1' level='3' type='both' size='3' unit='MiB'/>
       <monitor level='3' vcpus='2'/>
     </cachetune>

     <cachetune vcpus='2-4'>
       <monitor level='3' vcpus='2'/>
     </cachetune>
Perhaps my question was mistyped or misinterpreted. In the above top example, if we have <monitor ... vcpus='2-4'>, then do the values in <cache> have any impact on the calculation as opposed to if they weren't there?
Perhaps I still do not understand you well ... The <cache> entries have a significant influence on the monitor output if they exist and vcpus 2-4 demand much more cache than the allocation can offer; if the allocation offers much more cache than vcpus 2-4 actually use, the influence is tiny.
In the other case, where there are no 'cache' entries (as in the second example), the output is still influenced by the cache that the 'allocation' offers. The difference from the first example is that the first example uses the cache resources of its own allocation, while the second example uses the resources defined in /sys/fs/resctrl/schemata, and this cache is shared by multiple system tasks.
The question was related to how <monitor> is defined and trying to further describe my feeling that default was necessary.
If the cachetune has no specific cache entries, you get what? If there is no cache entry in the cachetune, the monitor still reports the vcpu threads' cache utilization per cache bank. Having no cache entry in the cachetune means the threads use the cache allocation policy of the default allocation, which is the file /sys/fs/resctrl/schemata.
If valid cache entries are provided in the cachetune, then an allocation is created for the threads of the vcpus listed in the <cachetune> 'vcpus' attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0, the cache resource limitation is applied to these threads.
The monitor does not care whether the vcpu threads are limited to a certain amount of cache lines or not. The monitor only reports the amount of cache that has been occupied.
If you monitor multiple vcpus within a cachetune then you get what? (?An aggregation of all?). Yes. Suppose you have this vcpus setting for <cachetune>:
    <cachetune vcpus='0-4,8' ..../>
and you choose to monitor the cache usage of vcpus 0, 3 and 8. You then create the following monitor entry inside the cachetune entry, and the monitor output gives you the aggregate cache occupancy for the threads of vcpus 0, 3 and 8.
    <cachetune vcpus='0-4,8'>
      <monitor level='3' vcpus='0,3,8'/>
    </cachetune>
This whole default and specific description doesn't make sense. Sorry for confusing you, I'll try to refine the descriptions.
In this last case if you also had
   <monitor level='3' vcpus='4'/>
   <monitor level='3' vcpus='0-4,8'/>
then I'd expect that the values output in "0-4,8" to match those that I could add by myself with "4" and "0-3,8". True?
Yes.
and this essentially solidifies the point I was making above.
Is it apparent yet why I'm saying mentioning default just confuses things? If so, I'm not sure what else I can do to explain.
Agreed with the conclusion that 'default xxx' is confusing.
But I hope you understand that a monitor with the same vcpu list as the allocation is created along with the allocation itself, whether or not you define a <monitor> in <cachetune> with the same 'vcpus' setting as the allocation in the XML configuration. This is the behavior of the kernel resctrl fs. To get the cache utilization for the whole allocation, enabling this system-created monitor is the most economical way in terms of saving RMIDs.
Sure, one cannot have too many monitors because there are limitations. [...]
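The point that the kernel creates a monitor together with each allocation can be pictured through the resctrl directory layout. The sketch below only builds the path of an L3 occupancy counter. It assumes the layout described in the kernel resctrl documentation (`mon_data/mon_L3_NN/llc_occupancy` inside a group directory, extra monitoring groups under `mon_groups/`) and a two-digit zero-padded bank id. `occupancy_path` and the group name `vcpus_2` are hypothetical and are not taken from this patch series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path of the L3 occupancy counter for one cache bank.
 *   group_dir: allocation directory, e.g. "/sys/fs/resctrl/p0"
 *   mon_group: name of an explicit monitoring group under mon_groups/,
 *              or NULL to use the monitor that comes with the allocation
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
occupancy_path(char *buf, size_t buflen, const char *group_dir,
               const char *mon_group, unsigned int bank_id)
{
    int n;

    assert(buf && group_dir);

    if (mon_group) {
        /* explicit monitoring group: consumes an additional RMID */
        n = snprintf(buf, buflen,
                     "%s/mon_groups/%s/mon_data/mon_L3_%02u/llc_occupancy",
                     group_dir, mon_group, bank_id);
    } else {
        /* monitor created along with the allocation itself */
        n = snprintf(buf, buflen,
                     "%s/mon_data/mon_L3_%02u/llc_occupancy",
                     group_dir, bank_id);
    }

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Passing NULL for `mon_group` corresponds to reusing the system-created monitor of the allocation, which is the RMID-saving case discussed above.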
I forget to free it. Will be added.
Again, Coverity
Thank you again. Hope someday I can hold the power of Coverity ...
It's nice to have, but it has its own issues. Learning to tell a real issue from a false positive takes a while. I'm sure there are other code analyzers out there. [...]
As stated in the prior paragraph, I will remove 'default monitor' and 'default allocation' and clean up the code and comments.
Did I miss anything?
I hope not, it's time consuming to read/comprehend everything. I see the need to post more because it doesn't necessarily make sense without understanding the future, but long series mean long reviews and long reviews mean more questions and more questions mean deeper responses in the mail list. In the long run I hope we get something acceptable to be used by/for libvirt to describe/summarize the depths that is CAT. I think we're getting closer that's for sure.
BTW, I find the 'virsh domstats --cpu-total' output for monitors, introduced in patch 18, is not good enough. The current output is:
Domain: 'ubuntu16.04-base'
  cpu.cache.monitor.count=2
  cpu.cache.0.name=vcpus_0
  cpu.cache.0.vcpus=0
  cpu.cache.0.bank.count=2
  cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.0.bytes=9371648
  cpu.cache.0.bank.1.id=1
  cpu.cache.0.bank.1.bytes=1081344
  cpu.cache.1.name=vcpus_3
  cpu.cache.1.vcpus=3
  cpu.cache.1.bank.count=2
  cpu.cache.1.bank.0.id=0
  cpu.cache.1.bank.0.bytes=630784
  cpu.cache.1.bank.1.id=1
  cpu.cache.1.bank.1.bytes=10452992
I may change the output to the following by adding 'monitor' to each line:
Domain: 'ubuntu16.04-base'
  cpu.cache.monitor.count=2
  cpu.cache.monitor.0.name=vcpus_0
  cpu.cache.monitor.0.vcpus=0
  cpu.cache.monitor.0.bank.count=2
  cpu.cache.monitor.0.bank.0.id=0
  cpu.cache.monitor.0.bank.0.bytes=9371648
  cpu.cache.monitor.0.bank.1.id=1
  cpu.cache.monitor.0.bank.1.bytes=1081344
  cpu.cache.monitor.1.name=vcpus_3
  cpu.cache.monitor.1.vcpus=3
  cpu.cache.monitor.1.bank.count=2
  cpu.cache.monitor.1.bank.0.id=0
  cpu.cache.monitor.1.bank.0.bytes=630784
  cpu.cache.monitor.1.bank.1.id=1
  cpu.cache.monitor.1.bank.1.bytes=10452992
Please take this change into consideration when you review patch 18.
Some day we'll get there. John BTW: Next week is KVM Forum - so that usually means less activity on this list and less time for reviews. [...]
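The renamed domstats keys proposed above follow one pattern: a monitor index, then a bank index, then a field name. A minimal sketch of that naming scheme, assuming a hypothetical helper `format_bank_key` (how patch 18 actually builds its parameter names is not shown in this thread):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a per-bank stats key such as "cpu.cache.monitor.1.bank.0.bytes".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
format_bank_key(char *buf, size_t buflen, size_t monitor, size_t bank,
                const char *field)
{
    int n;

    assert(buf && field);

    n = snprintf(buf, buflen, "cpu.cache.monitor.%zu.bank.%zu.%s",
                 monitor, bank, field);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Keeping 'monitor' in every key, as proposed, lets a consumer tell monitor entries apart from any other `cpu.cache.*` statistics by prefix alone.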

Check monitor status by checking the PIDs are in file 'task' or not.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c    | 84 +++++++++++++++++++++++++++++++++++++++++++++++-
 src/util/virresctrl.h    |  4 +++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4b22ed4..c90e48a 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2686,6 +2686,7 @@ virResctrlMonitorDeterminePath;
 virResctrlMonitorGetCacheLevel;
 virResctrlMonitorGetCacheOccupancy;
 virResctrlMonitorGetID;
+virResctrlMonitorIsRunning;
 virResctrlMonitorNew;
 virResctrlMonitorRemove;
 virResctrlMonitorSetCacheLevel;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 41e8d48..67dfbb8 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -364,6 +364,9 @@ struct _virResctrlMonitor {
     char *path;
     /* Boolean flag for default monitor */
     bool default_monitor;
+    /* Tracking the tasks' PID associated with this monitor */
+    pid_t *pids;
+    size_t npids;
     /* The cache 'level', special for cache monitor */
     unsigned int cache_level;
 };
@@ -425,6 +428,7 @@ virResctrlMonitorDispose(void *obj)
     virObjectUnref(monitor->alloc);
     VIR_FREE(monitor->id);
     VIR_FREE(monitor->path);
+    VIR_FREE(monitor->pids);
 }


@@ -2491,7 +2495,13 @@ int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
                         pid_t pid)
 {
-    return virResctrlAddPID(monitor->path, pid);
+    if (virResctrlAddPID(monitor->path, pid) < 0)
+        return -1;
+
+    if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0)
+        return -1;
+
+    return 0;
 }


@@ -2762,3 +2772,75 @@ virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor)
 {
     monitor->default_monitor = true;
 }
+
+
+static int
+virResctrlPIDCompare(const void *pida, const void *pidb)
+{
+    return *(pid_t*)pida - *(pid_t*)pidb;
+}
+
+
+bool
+virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor)
+{
+    char *pidstr = NULL;
+    char **spids = NULL;
+    size_t nspids = 0;
+    pid_t *pids = NULL;
+    size_t npids = 0;
+    size_t i = 0;
+    int rv = -1;
+    bool ret = false;
+
+    if (!monitor->path)
+        return false;
+
+    if (monitor->npids == 0)
+        return false;
+
+    rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path);
+    if (rv == -2)
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Task file '%s/tasks' does not exist"),
+                       monitor->path);
+    if (rv < 0)
+        goto cleanup;
+
+    /* no PID in task file */
+    if (!*pidstr)
+        goto cleanup;
+
+    spids = virStringSplitCount(pidstr, "\n", 0, &nspids);
+    if (nspids != monitor->npids)
+        return false;
+
+    for (i = 0; i < nspids; i++) {
+        unsigned int val = 0;
+        pid_t pid = 0;
+
+        if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0)
+            goto cleanup;
+
+        pid = (pid_t)val;
+
+        if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0)
+            goto cleanup;
+    }
+
+    qsort(pids, npids, sizeof(pid_t), virResctrlPIDCompare);
+    qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDCompare);
+
+    for (i = 0; i < monitor->npids; i++) {
+        if (monitor->pids[i] != pids[i])
+            goto cleanup;
+    }
+
+    ret = true;
+ cleanup:
+    virStringListFree(spids);
+    VIR_FREE(pids);
+    VIR_FREE(pidstr);
+
+    return ret;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 371df8a..c5794cb 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -230,4 +230,8 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor,
                                    unsigned int **bankcaches);
 void
 virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor);
+
+bool
+virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor);
+
 #endif /* __VIR_RESCTRL_H__ */
--
2.7.4

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Check monitor status by checking the PIDs are in file 'task' or not.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 4 +++ 3 files changed, 88 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4b22ed4..c90e48a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2686,6 +2686,7 @@ virResctrlMonitorDeterminePath; virResctrlMonitorGetCacheLevel; virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; +virResctrlMonitorIsRunning; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 41e8d48..67dfbb8 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -364,6 +364,9 @@ struct _virResctrlMonitor { char *path; /* Boolean flag for default monitor */ bool default_monitor; + /* Tracking the tasks' PID associated with this monitor */ + pid_t *pids; + size_t npids; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -425,6 +428,7 @@ virResctrlMonitorDispose(void *obj) virObjectUnref(monitor->alloc); VIR_FREE(monitor->id); VIR_FREE(monitor->path); + VIR_FREE(monitor->pids); }
@@ -2491,7 +2495,13 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { - return virResctrlAddPID(monitor->path, pid); + if (virResctrlAddPID(monitor->path, pid) < 0) + return -1; + + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) + return -1; + + return 0; }
@@ -2762,3 +2772,75 @@ virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) { monitor->default_monitor = true; } + + +static int +virResctrlPIDCompare(const void *pida, const void *pidb) +{ + return *(pid_t*)pida - *(pid_t*)pidb; +} + + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) +{ + char *pidstr = NULL; + char **spids = NULL; + size_t nspids = 0; + pid_t *pids = NULL; + size_t npids = 0; + size_t i = 0; + int rv = -1; + bool ret = false; + + if (!monitor->path) + return false; + + if (monitor->npids == 0) + return false; + + rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Task file '%s/tasks' does not exist"), + monitor->path); + if (rv < 0) + goto cleanup; + + /* no PID in task file */ + if (!*pidstr) + goto cleanup; + + spids = virStringSplitCount(pidstr, "\n", 0, &nspids); + if (nspids != monitor->npids) + return false;
This will cause a leak... "goto cleanup;"
+ + for (i = 0; i < nspids; i++) { + unsigned int val = 0; + pid_t pid = 0; + + if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0) + goto cleanup; + + pid = (pid_t)val; + + if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0) + goto cleanup; + } + + qsort(pids, npids, sizeof(pid_t), virResctrlPIDCompare); + qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDCompare); + + for (i = 0; i < monitor->npids; i++) { + if (monitor->pids[i] != pids[i]) + goto cleanup; + } + + ret = true; + cleanup:
NB: Any place where you get here without an error message, but w/ ret = false may cause "issues" for the caller especially since some of the ways to get here w/ @false do set an error message. John
+ virStringListFree(spids); + VIR_FREE(pids); + VIR_FREE(pidstr); + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 371df8a..c5794cb 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -230,4 +230,8 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, unsigned int **bankcaches); void virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor); + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + #endif /* __VIR_RESCTRL_H__ */

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Thursday, October 11, 2018 4:58 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [libvirt] [PATCHv5 14/19] Util: Add function for checking if monitor is running
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Check monitor status by checking the PIDs are in file 'task' or not.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 4 +++ 3 files changed, 88 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4b22ed4..c90e48a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2686,6 +2686,7 @@ virResctrlMonitorDeterminePath; virResctrlMonitorGetCacheLevel; virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; +virResctrlMonitorIsRunning; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetCacheLevel; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 41e8d48..67dfbb8 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -364,6 +364,9 @@ struct _virResctrlMonitor { char *path; /* Boolean flag for default monitor */ bool default_monitor; + /* Tracking the tasks' PID associated with this monitor */ + pid_t *pids; + size_t npids; /* The cache 'level', special for cache monitor */ unsigned int cache_level; }; @@ -425,6 +428,7 @@ virResctrlMonitorDispose(void *obj) virObjectUnref(monitor->alloc); VIR_FREE(monitor->id); VIR_FREE(monitor->path); + VIR_FREE(monitor->pids); }
@@ -2491,7 +2495,13 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { - return virResctrlAddPID(monitor->path, pid); + if (virResctrlAddPID(monitor->path, pid) < 0) + return -1; + + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) + return -1; + + return 0; }
@@ -2762,3 +2772,75 @@ virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor) { monitor->default_monitor = true; } + + +static int +virResctrlPIDCompare(const void *pida, const void *pidb) { + return *(pid_t*)pida - *(pid_t*)pidb; } + + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) { + char *pidstr = NULL; + char **spids = NULL; + size_t nspids = 0; + pid_t *pids = NULL; + size_t npids = 0; + size_t i = 0; + int rv = -1; + bool ret = false; + + if (!monitor->path) + return false; + + if (monitor->npids == 0) + return false; + + rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Task file '%s/tasks' does not exist"), + monitor->path); + if (rv < 0) + goto cleanup; + + /* no PID in task file */ + if (!*pidstr) + goto cleanup; + + spids = virStringSplitCount(pidstr, "\n", 0, &nspids); + if (nspids != monitor->npids) + return false;
This will cause a leak... "goto cleanup;"
My bad. Will be corrected.
+ + for (i = 0; i < nspids; i++) { + unsigned int val = 0; + pid_t pid = 0; + + if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0) + goto cleanup; + + pid = (pid_t)val; + + if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0) + goto cleanup; + } + + qsort(pids, npids, sizeof(pid_t), virResctrlPIDCompare); + qsort(monitor->pids, monitor->npids, sizeof(pid_t), + virResctrlPIDCompare); + + for (i = 0; i < monitor->npids; i++) { + if (monitor->pids[i] != pids[i]) + goto cleanup; + } + + ret = true; + cleanup:
NB: Any place where you get here without an error message, but w/ ret = false may cause "issues" for the caller especially since some of the ways to get here w/ @false do set an error message.
Got it. I will add comments for each return and each 'goto cleanup', and will refine the error messages.
John
Thanks for review. Huaqiang
+ virStringListFree(spids); + VIR_FREE(pids); + VIR_FREE(pidstr); + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 371df8a..c5794cb 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -230,4 +230,8 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, unsigned int **bankcaches); void virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor); + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + #endif /* __VIR_RESCTRL_H__ */
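The two review points in this sub-thread (the early `return false` that skips the cleanup label, and the PID comparison itself) can be illustrated with a standalone sketch. It is not the libvirt implementation: it uses plain libc instead of the `vir*` helpers, `pid_sets_match` and `pid_compare` are hypothetical names, and the contents of the `tasks` file are passed in as a string so that the logic can be exercised without resctrl. Every failure path goes through the single cleanup label, and the comparator uses comparisons rather than subtraction, which is a design choice of this sketch and not something the review asked for.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* qsort comparator for pid_t values, written without subtraction */
static int
pid_compare(const void *a, const void *b)
{
    pid_t pa = *(const pid_t *)a;
    pid_t pb = *(const pid_t *)b;

    return (pa > pb) - (pa < pb);
}

/*
 * Return true if the newline-separated PIDs in tasks_text are exactly
 * the PIDs in expected[] (order does not matter).
 */
bool
pid_sets_match(const char *tasks_text, const pid_t *expected, size_t nexpected)
{
    pid_t *seen = NULL;
    pid_t *want = NULL;
    size_t nseen = 0;
    const char *p = tasks_text;
    bool ret = false;

    assert(tasks_text && expected);

    if (nexpected == 0)
        goto cleanup;

    /* work on copies so the caller's array is not reordered */
    seen = calloc(nexpected, sizeof(*seen));
    want = malloc(nexpected * sizeof(*want));
    if (!seen || !want)
        goto cleanup;
    memcpy(want, expected, nexpected * sizeof(*want));

    while (*p) {
        char *end = NULL;
        long val;

        errno = 0;
        val = strtol(p, &end, 10);
        if (end == p || errno != 0 || val <= 0)
            goto cleanup;               /* malformed entry */
        if (*end != '\n' && *end != '\0')
            goto cleanup;               /* trailing garbage */
        if (nseen == nexpected)
            goto cleanup;               /* more PIDs than expected */

        seen[nseen++] = (pid_t)val;
        p = (*end == '\n') ? end + 1 : end;
    }

    if (nseen != nexpected)
        goto cleanup;                   /* count mismatch: no early return */

    qsort(seen, nseen, sizeof(*seen), pid_compare);
    qsort(want, nexpected, sizeof(*want), pid_compare);

    ret = memcmp(seen, want, nseen * sizeof(*seen)) == 0;

 cleanup:
    free(seen);
    free(want);
    return ret;
}
```

In the real function each failing branch would additionally need to decide whether it reports an error, which is the concern raised in the NB comment above; this sketch only returns false.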

Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.

Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
---
 src/qemu/qemu_process.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..a4bbef6 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2611,10 +2611,22 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver,
         return -1;

     for (i = 0; i < vm->def->nresctrls; i++) {
+        size_t j = 0;
         if (virResctrlAllocCreate(caps->host.resctrl,
                                   vm->def->resctrls[i]->alloc,
                                   priv->machineName) < 0)
             goto cleanup;
+
+        for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+            virDomainResctrlMonDefPtr mon = NULL;
+
+            mon = vm->def->resctrls[i]->monitors[j];
+            if (virResctrlMonitorCreate(vm->def->resctrls[i]->alloc,
+                                        mon->instance,
+                                        priv->machineName) < 0)
+                goto cleanup;
+
+        }
     }

     ret = 0;
@@ -5440,11 +5452,22 @@ qemuProcessSetupVcpu(virDomainObjPtr vm,
         return -1;

     for (i = 0; i < vm->def->nresctrls; i++) {
+        size_t j = 0;
         virDomainResctrlDefPtr ct = vm->def->resctrls[i];

         if (virBitmapIsBitSet(ct->vcpus, vcpuid)) {
             if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0)
                 return -1;
+
+            for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+                virDomainResctrlMonDefPtr mon = NULL;
+
+                mon = vm->def->resctrls[i]->monitors[j];
+                if (virBitmapIsBitSet(mon->vcpus, vcpuid)) {
+                    if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0)
+                        return -1;
+                }
+            }
             break;
         }
     }
@@ -7207,8 +7230,18 @@ void qemuProcessStop(virQEMUDriverPtr driver,
     /* Remove resctrl allocation after cgroups are cleaned up which makes it
      * kind of safer (although removing the allocation should work even with
      * pids in tasks file */
-    for (i = 0; i < vm->def->nresctrls; i++)
+    for (i = 0; i < vm->def->nresctrls; i++) {
+        size_t j = 0;
+
+        for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+            virDomainResctrlMonDefPtr mon = NULL;
+
+            mon = vm->def->resctrls[i]->monitors[j];
+            virResctrlMonitorRemove(mon->instance);
+        }
+
         virResctrlAllocRemove(vm->def->resctrls[i]->alloc);
+    }

     qemuProcessRemoveDomainStatus(driver, vm);

@@ -7222,8 +7255,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
                 virPortAllocatorRelease(graphics->data.vnc.port);
             } else if (graphics->data.vnc.portReserved) {
                 virPortAllocatorRelease(graphics->data.vnc.port);
-                graphics->data.vnc.portReserved = false;
-            }
+                graphics->data.vnc.portReserved = false; }
             if (graphics->data.vnc.websocketGenerated) {
                 virPortAllocatorRelease(graphics->data.vnc.websocket);
                 graphics->data.vnc.websocketGenerated = false;
@@ -7939,9 +7971,20 @@ qemuProcessReconnect(void *opaque)
         goto error;

     for (i = 0; i < obj->def->nresctrls; i++) {
+        size_t j = 0;
+
         if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc,
                                          priv->machineName) < 0)
             goto error;
+
+        for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) {
+            virDomainResctrlMonDefPtr mon = NULL;
+
+            mon = obj->def->resctrls[i]->monitors[j];
+            if (virResctrlMonitorDeterminePath(mon->instance,
+                                               priv->machineName) < 0)
+                goto error;
+        }
     }

     /* update domain state XML with possibly updated state in virDomainObj */
--
2.7.4
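The vcpu setup logic in the qemuProcessSetupVcpu hunk of this patch (find the first cachetune group that contains the vcpu, then join every monitor of that group whose vcpu list also contains it) can be sketched without libvirt types. The structures and the function below are hypothetical stand-ins that use a 64-bit mask instead of virBitmap, and they count the monitors instead of writing PIDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the domain definition structures */
struct monitor_def {
    uint64_t vcpus;                      /* bit N set: vcpu N is monitored */
};

struct resctrl_def {
    uint64_t vcpus;                      /* vcpus covered by the group */
    const struct monitor_def *monitors;
    size_t nmonitors;
};

static int
bit_is_set(uint64_t mask, unsigned int bit)
{
    return bit < 64 && ((mask >> bit) & 1);
}

/*
 * Return how many monitors the given vcpu would be added to, or -1 if
 * no group covers the vcpu. Only the first matching group is used,
 * mirroring the 'break' in the patch.
 */
int
monitors_for_vcpu(const struct resctrl_def *defs, size_t ndefs,
                  unsigned int vcpuid)
{
    size_t i, j;

    assert(defs || ndefs == 0);

    for (i = 0; i < ndefs; i++) {
        int joined = 0;

        if (!bit_is_set(defs[i].vcpus, vcpuid))
            continue;

        for (j = 0; j < defs[i].nmonitors; j++) {
            if (bit_is_set(defs[i].monitors[j].vcpus, vcpuid))
                joined++;
        }
        return joined;
    }

    return -1;
}
```

With the layout of the cachetune-small.xml test data (group 0-1 with monitors 0, 1 and 0-1; group 2-3 with monitors 2 and 3), vcpu 0 joins two monitors and vcpu 3 joins one.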

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 46 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e9c7618..a4bbef6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2611,10 +2611,22 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; if (virResctrlAllocCreate(caps->host.resctrl, vm->def->resctrls[i]->alloc, priv->machineName) < 0) goto cleanup; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorCreate(vm->def->resctrls[i]->alloc, + mon->instance, + priv->machineName) < 0) + goto cleanup; + + } }
ret = 0; @@ -5440,11 +5452,22 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + } break; } } @@ -7207,8 +7230,18 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->nresctrls; i++) + for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + virResctrlMonitorRemove(mon->instance); + } + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); + }
qemuProcessRemoveDomainStatus(driver, vm);
@@ -7222,8 +7255,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
                 virPortAllocatorRelease(graphics->data.vnc.port);
             } else if (graphics->data.vnc.portReserved) {
                 virPortAllocatorRelease(graphics->data.vnc.port);
-                graphics->data.vnc.portReserved = false;
-            }
+                graphics->data.vnc.portReserved = false; }
Rogue edit? Caused an issue w/ my Coverity environment because there's a false positive surrounding the data.vnc.websocket value being -1 and thus passing that to virPortAllocatorRelease and using it without checking > 0 in a few lines.
Anyway this hunk needs to be removed.
John
(OK, so I didn't look too hard at these last two, but I'm going back to the start to read what you've written today).
if (graphics->data.vnc.websocketGenerated) { virPortAllocatorRelease(graphics->data.vnc.websocket); graphics->data.vnc.websocketGenerated = false; @@ -7939,9 +7971,20 @@ qemuProcessReconnect(void *opaque) goto error;
for (i = 0; i < obj->def->nresctrls; i++) { + size_t j = 0; + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; + + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = obj->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorDeterminePath(mon->instance, + priv->machineName) < 0) + goto error; + } }
/* update domain state XML with possibly updated state in virDomainObj */

-----Original Message-----
From: John Ferlan [mailto:jferlan@redhat.com]
Sent: Thursday, October 11, 2018 4:59 AM
To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com
Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com>
Subject: Re: [libvirt] [PATCHv5 15/19] qemu: enable resctrl monitor in qemu
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 46 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e9c7618..a4bbef6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2611,10 +2611,22 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; if (virResctrlAllocCreate(caps->host.resctrl, vm->def->resctrls[i]->alloc, priv->machineName) < 0) goto cleanup; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorCreate(vm->def->resctrls[i]->alloc, + mon->instance, + priv->machineName) < 0) + goto cleanup; + + } }
ret = 0; @@ -5440,11 +5452,22 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + } break; } } @@ -7207,8 +7230,18 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->nresctrls; i++) + for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + virResctrlMonitorRemove(mon->instance); + } + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); + }
qemuProcessRemoveDomainStatus(driver, vm);
@@ -7222,8 +7255,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, virPortAllocatorRelease(graphics->data.vnc.port); } else if (graphics->data.vnc.portReserved) { virPortAllocatorRelease(graphics->data.vnc.port); - graphics->data.vnc.portReserved = false; - } + graphics->data.vnc.portReserved = false; }
Rogue edit? Caused an issue w/ my Coverity environment because there's a false positive surrounding the data.vnc.websocket value being -1 and thus passing that to virPortAllocatorRelease and using it without checking > 0 in a few lines.
Anyway this hunk needs to be removed.
I have no idea why I made such a change... I did not intend to change anything here; this hunk will be removed.
John
(OK, so I didn't look too hard at these last two, but I'm going back to the start to read what you've written today).
We need to reach some agreement on the usage of 'default allocation' and 'default monitor' before going on. Thanks for the review.
Huaqiang
if (graphics->data.vnc.websocketGenerated) { virPortAllocatorRelease(graphics->data.vnc.websocket); graphics->data.vnc.websocketGenerated = false; @@ -7939,9 +7971,20 @@ qemuProcessReconnect(void *opaque) goto error;
for (i = 0; i < obj->def->nresctrls; i++) { + size_t j = 0; + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; + + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = obj->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorDeterminePath(mon->instance, + priv->machineName) < 0) + goto error; + } }
/* update domain state XML with possibly updated state in virDomainObj */

Add element 'id' to virDomainResctrlDef. This 'id' reflects the attribute 'id' of the 'cachetune' element in the XML. virResctrlAlloc.id is a copy of virDomainResctrlDef.id. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 20 ++++++++------------ src/conf/domain_conf.h | 1 + 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 4f4604f..6da9dd4 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2979,6 +2979,7 @@ virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); VIR_FREE(resctrl->monitors); + VIR_FREE(resctrl->id); VIR_FREE(resctrl); } @@ -19138,6 +19139,9 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; } + if (VIR_STRDUP(resctrl->id, alloc_id) < 0) + goto cleanup; + if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0) goto cleanup; @@ -27320,13 +27324,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf, virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus); - if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id); - virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n"); virBufferAddBuffer(buf, &childrenBuf); @@ -27383,13 +27383,9 @@ virDomainMemorytuneDefFormat(virBufferPtr buf, virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus); - if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id); - virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n"); virBufferAddBuffer(buf, &childrenBuf); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 
60f6464..e190aa2 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2248,6 +2248,7 @@ typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr; struct _virDomainResctrlDef { + char *id; virBitmapPtr vcpus; virResctrlAllocPtr alloc; -- 2.7.4

Refactoring qemuDomainGetStatsCpu, make it possible to add more CPU statistics. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_driver.c | 45 ++++++++++++++++++++++----------------------- 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a52e249..12a5f8f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -19711,30 +19711,29 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, unsigned long long sys_time = 0; int err = 0; - if (!priv->cgroup) - return 0; - - err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.time", - cpu_time) < 0) - return -1; + if (priv->cgroup) { + err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.time", + cpu_time) < 0) + return -1; - err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.user", - user_time) < 0) - return -1; - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.system", - sys_time) < 0) - return -1; + err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.user", + user_time) < 0) + return -1; + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.system", + sys_time) < 0) + return -1; + } return 0; } -- 2.7.4

Adding the interface in qemu to report CMT statistic information through command 'virsh domstats --cpu-total'. Below is a typical output: # virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.0.name=vcpus_1 cpu.cache.0.vcpus=1 cpu.cache.0.bank.count=2 cpu.cache.0.bank.0.id=0 cpu.cache.0.bank.0.bytes=4505600 cpu.cache.0.bank.1.id=1 cpu.cache.0.bank.1.bytes=5586944 cpu.cache.1.name=vcpus_4-6 cpu.cache.1.vcpus=4,5,6 cpu.cache.1.bank.count=2 cpu.cache.1.bank.0.id=0 cpu.cache.1.bank.0.bytes=17571840 cpu.cache.1.bank.1.id=1 cpu.cache.1.bank.1.bytes=29106176 Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt-domain.c | 9 ++ src/qemu/qemu_driver.c | 230 +++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 239 insertions(+) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 7690339..218c93a 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long. * "cpu.system" - system cpu time spent in nanoseconds as unsigned long * long. + * "cpu.cache.monitor.count" - tocal cache monitoring groups + * "cpu.cache.M.name" - name of cache monitoring group 'M' + * "cpu.cache.M.vcpus" - vcpus for cache monitoring group 'M' + * "cpu.cache.M.bank.count" - total bank number of cache monitoring + * group 'M' + * "cpu.cache.M.bank.N.id" - OS assigned cache bank id for cache 'N' in + * cache monitoring group 'M' + * "cpu.cache.M.bank.N.bytes" - monitor's cache occupancy of cache bank + * 'N' in cache monitoring group 'M' * * VIR_DOMAIN_STATS_BALLOON: * Return memory balloon device information. 
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 12a5f8f..7510a62 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -102,6 +102,7 @@ #include "virnuma.h" #include "dirname.h" #include "netdev_bandwidth_conf.h" +#include "c-ctype.h" #define VIR_FROM_THIS VIR_FROM_QEMU @@ -19698,6 +19699,231 @@ typedef enum { #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB) +/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid + * outputs and represent different vcpu set. + * + * It is not easy to differentiate these two vcpu set formats at first glance. + * This function could be used to clear this ambiguity, it substitutes all '-' + * with ',' while generates semantically correct vcpu set. + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */ +static char * +qemuDomainVcpuFormatHelper(const char *vcpus) +{ + size_t i = 0; + int last = 0; + int start = 0; + char * tmp = NULL; + bool firstnum = true; + const char *cur = vcpus; + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *ret = NULL; + + if (virStringIsEmpty(cur)) + return NULL; + + while (*cur != '\0') { + if (!c_isdigit(*cur)) + goto cleanup; + + if (virStrToLong_i(cur, &tmp, 10, &start) < 0) + goto cleanup; + if (start < 0) + goto cleanup; + + cur = tmp; + + virSkipSpaces(&cur); + + if (*cur == ',' || *cur == 0) { + if (!firstnum) + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%d", start); + firstnum = false; + } else if (*cur == '-') { + cur++; + virSkipSpaces(&cur); + + if (virStrToLong_i(cur, &tmp, 10, &last) < 0) + goto cleanup; + + if (last < start) + goto cleanup; + cur = tmp; + + for (i = start; i <= last; i++) { + if (!firstnum) + + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%ld", i); + firstnum = 0; + } + + virSkipSpaces(&cur); + } + + if (*cur == ',') { + cur++; + virSkipSpaces(&cur); + } else if (*cur == 0) { + break; + } else { + goto cleanup; + } + } + + ret = virBufferContentAndReset(&buf); + cleanup: 
+ virBufferFreeAndReset(&buf); + return ret; +} + + +static int +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED, + virResctrlMonitorType restag) +{ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; + virDomainResctrlMonDefPtr domresmon = NULL; + virDomainResctrlDefPtr resctrl = NULL; + unsigned int nmonitors = NULL; + const char *restype = NULL; + unsigned int *vals = NULL; + unsigned int *ids = NULL; + size_t nvals = 0; + char *rawvcpus = NULL; + char *vcpus = NULL; + size_t i = 0; + size_t j = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + restype = "cache"; + } else { + VIR_DEBUG("Invalid CPU resource type"); + return -1; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + domresmon = resctrl->monitors[j]; + if (virResctrlMonitorIsRunning(domresmon->instance) && + domresmon->tag == restag) + nmonitors++; + } + } + + if (nmonitors) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.count", restype); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nmonitors) < 0) + goto cleanup; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + size_t l = 0; + virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance; + const char *id = virResctrlMonitorGetID(monitor); + + if (!id) + goto cleanup; + + domresmon = resctrl->monitors[j]; + + if (!virResctrlMonitorIsRunning(domresmon->instance)) + continue; + + if (!(rawvcpus = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + vcpus = qemuDomainVcpuFormatHelper(rawvcpus); + if (!vcpus) + goto cleanup; + + if (virResctrlMonitorGetCacheOccupancy(monitor, &nvals, + &ids, &vals) < 0) 
+ goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.%ld.name", restype, i); + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + id) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.%ld.vcpus", restype, i); + + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + vcpus) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.%ld.bank.count", restype, i); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nvals) < 0) + goto cleanup; + + for (l = 0; l < nvals; l++) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.%ld.bank.%ld.id", restype, i, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + ids[l]) < 0) + goto cleanup; + + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.%ld.bank.%ld.bytes", restype, i, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + vals[l]) < 0) + goto cleanup; + } + + VIR_FREE(ids); + VIR_FREE(vals); + VIR_FREE(vcpus); + nvals = 0; + } + } + + ret = 0; + cleanup: + VIR_FREE(ids); + VIR_FREE(vals); + VIR_FREE(vcpus); + return ret; +} + + static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, @@ -19735,6 +19961,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, return -1; } + if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags, + VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0) + return -1; + return 0; } -- 2.7.4
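The range expansion that qemuDomainVcpuFormatHelper performs in the patch above ('1-3' becomes '1,2,3') can be sketched with plain libc, outside of libvirt. This is only an illustration under stated assumptions: the function name expandVcpuList and the fixed-size output buffer are hypothetical, whitespace is not skipped, and unlike the patch a trailing comma is rejected.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libvirt: expand a vcpu list such as
 * "1-3,5" into "1,2,3,5". Returns 0 on success, -1 on malformed input or
 * when the output buffer is too small. */
static int
expandVcpuList(const char *in, char *out, size_t outlen)
{
    size_t used = 0;
    int first = 1;

    if (!in || !*in || !out || outlen == 0)
        return -1;
    out[0] = '\0';

    while (*in) {
        char *end = NULL;
        long start;
        long last;

        /* every item must begin with a digit */
        if (!isdigit((unsigned char)*in))
            return -1;
        start = strtol(in, &end, 10);
        last = start;
        in = end;

        /* optional "-N" turns the item into an inclusive range */
        if (*in == '-') {
            in++;
            if (!isdigit((unsigned char)*in))
                return -1;
            last = strtol(in, &end, 10);
            if (last < start)
                return -1;
            in = end;
        }

        for (long v = start; v <= last; v++) {
            int n = snprintf(out + used, outlen - used, "%s%ld",
                             first ? "" : ",", v);
            if (n < 0 || (size_t)n >= outlen - used)
                return -1;          /* output truncated */
            used += (size_t)n;
            first = 0;
        }

        if (*in == ',') {
            in++;
            if (*in == '\0')
                return -1;          /* trailing comma rejected in this sketch */
        } else if (*in != '\0') {
            return -1;              /* unexpected character */
        }
    }
    return 0;
}
```

Calling expandVcpuList("4-6", buf, sizeof(buf)) fills buf with "4,5,6", which is the form shown on the cpu.cache.N.vcpus line of the domstats output.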

Invoking qemuProcessSetupVcpus in process of VM reconnection. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 3 +++ src/util/virresctrl.c | 7 +++++++ 2 files changed, 10 insertions(+) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index a4bbef6..f85aef0 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -7987,6 +7987,9 @@ qemuProcessReconnect(void *opaque) } } + if (qemuProcessSetupVcpus(obj) < 0) + goto error; + /* update domain state XML with possibly updated state in virDomainObj */ if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) < 0) goto error; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 67dfbb8..b5717b2 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2495,9 +2495,16 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { + size_t i = 0; + if (virResctrlAddPID(monitor->path, pid) < 0) return -1; + for (i = 0; i < monitor->npids; i++) { + if (pid == monitor->pids[i]) + return 0; + } + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) return -1; -- 2.7.4
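The duplicate check added to virResctrlMonitorAddPID above keeps the in-memory PID list free of repeats when qemuProcessSetupVcpus runs again on reconnect. A minimal standalone sketch of that idea follows; struct pidList and pidListAdd are hypothetical names, and realloc stands in for libvirt's VIR_APPEND_ELEMENT.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for the pids/npids members of a monitor. */
struct pidList {
    pid_t *pids;
    size_t npids;
};

/* Append pid unless it is already recorded. Returns 0 on success
 * (including the "already present" case) and -1 on allocation failure. */
static int
pidListAdd(struct pidList *list, pid_t pid)
{
    pid_t *tmp;
    size_t i;

    for (i = 0; i < list->npids; i++) {
        if (list->pids[i] == pid)
            return 0;               /* already tracked, nothing to do */
    }

    tmp = realloc(list->pids, (list->npids + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;

    tmp[list->npids++] = pid;
    list->pids = tmp;
    return 0;
}
```

Adding the same PID twice therefore leaves npids unchanged, so a second pass over the vcpus during reconnection is harmless for the bookkeeping.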

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
This series of patches, together with the series already merged, introduces the x86 Cache Monitoring Technology (CMT) to libvirt by interacting with the kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU features belonging to the Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores.
In the v1 series, an original and complete feature for CMT was introduced. The v2 and v3 patches address the feature for the host capability of CMT. v4 addresses the feature for monitoring the cache occupancy of a VM vcpu thread set and reporting it through a virsh command.
We have had several discussions about enabling CMT; please refer to the following links for the RFCs. RFCv3 https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html RFCv2 https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html RFCv1 https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
The merged commits for the host capability of CMT are listed below. 6af8417415508c31f8ce71234b573b4999f35980 8f6887998bf63594ae26e3db18d4d5896c5f2cb4 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8 12093f1feaf8f5023dcd9d65dff111022842183d a5d293c18831dcf69ec6195798387fbb70c9f461
1. Why is CMT necessary in libvirt? The perf events 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf-based cmt and mbm in libvirt will not work with the latest Linux kernel. These patches add the CMT feature to libvirt through the kernel resctrlfs interface.
2. Create cache monitoring group (cache monitor).
The main interface for creating a monitoring group is the XML file. The proposed configuration looks like:
<cputune> <cachetune vcpus='1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='1'/>
Duplication of vcpus is odd for a child entry isn't it? It's not in the <cache> entry...
</cachetune> <cachetune vcpus='4-7'> + <monitor level='3' vcpus='4-6'/>
... but perhaps that means using 4-6 is OK because it's a subset of the parent cachetune 4-7? I'm not sure I can keep track of all the discussions we've had about this, so this could be something we've already covered, but has moved out of my short term memory.
</cachetune> </cputune>
The above XML creates 2 cache resctrl allocation groups and 2 resctrl monitoring groups. Changes to the cache monitor take effect at the next boot of the VM.
3. Show CMT result through command 'domstats'
Add an interface in qemu to report this information for the resource monitor group through the command 'virsh domstats --cpu-total'. Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.0.name=vcpus_1 cpu.cache.0.vcpus=1 cpu.cache.0.bank.count=2 cpu.cache.0.bank.0.id=0 cpu.cache.0.bank.0.bytes=4505600 cpu.cache.0.bank.1.id=1 cpu.cache.0.bank.1.bytes=5586944 cpu.cache.1.name=vcpus_4-6
So perhaps "this" is more correct 4-6 (I assume this comes from the <cachetune> entry)...
cpu.cache.1.vcpus=4,5,6
Interesting that a name can be 4-6, but these are each called out. Can someone have "5,7,9"? How does that look on the name line and then on the vcpus line.
cpu.cache.1.bank.count=2 cpu.cache.1.bank.0.id=0 cpu.cache.1.bank.0.bytes=17571840 cpu.cache.1.bank.1.id=1 cpu.cache.1.bank.1.bytes=29106176
Obviously a different example than above with only 1 <monitor> entry... and the .bytes values for everything doesn't match up with the kb values above.
Changes in v5: - qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection. - Add the document for domain configuration related to resctrl monitor.
Probably should have posted a reply to your v4 series to indicate you were working on a v5 due to whatever reason so that no one started reviewing it...
It takes a "long time" to set aside the time to review large series...
Also, while it may pass your compiler, the patch18 needed:
- unsigned int nmonitors = NULL;
+ unsigned int nmonitors = 0;
Something I thought I had pointed out in much earlier reviews...
I'll work through the series over the next day or so with any luck... It is on my short term radar at least.
John
Changes in v4: v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command. - Introduced resctrl default allocation - Introduced resctrl monitor and default monitor
Changes in v3: - Addressed John Ferlan's review. - Typo fixed. - Removed VIR_ENUM_DECL(virMonitor);
Changes in v2: - Introduced MBM capability. - Capability layout changed * Moved <monitor> from cache <bank> to <cache> * Renamed <Threshold> to <reuseThreshold> - Document for 'reuseThreshold' changed. - Introduced API virResctrlInfoGetMonitorPrefix - Added more tests, covering standalone CMT, fake new feature. - Creating CMT resource control group will be subsequent job.
Wang Huaqiang (19):
  docs: Refactor schemas to support default allocation
  util: Introduce resctrl monitor for CMT
  util: Refactor code for adding PID to the resource group
  util: Add interface for adding PID to monitor
  util: Refactor code for determining allocation path
  util: Add monitor interface to determine path
  util: Refactor code for creating resctrl group
  util: Add interface for creating monitor group
  util: Add more interfaces for resctrl monitor
  util: Introduce default monitor
  conf: Refactor code for matching existing resctrls
  conf: Refactor virDomainResctrlAppend
  conf: Add resctrl monitor configuration
  Util: Add function for checking if monitor is running
  qemu: enable resctrl monitor in qemu
  conf: Add a 'id' to virDomainResctrlDef
  qemu: refactor qemuDomainGetStatsCpu
  qemu: Report cache occupancy (CMT) with domstats
  qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection
 docs/formatdomain.html.in | 30 +-
 docs/schemas/domaincommon.rng | 14 +-
 src/conf/domain_conf.c | 327 ++++++++++--
 src/conf/domain_conf.h | 12 +
 src/libvirt-domain.c | 9 +
 src/libvirt_private.syms | 12 +
 src/qemu/qemu_driver.c | 271 +++++++++-
 src/qemu/qemu_process.c | 52 +-
 src/util/virresctrl.c | 562 ++++++++++++++++++++-
 src/util/virresctrl.h | 49 ++
 tests/genericxml2xmlindata/cachetune-cdp.xml | 3 +
 .../cachetune-colliding-monitor.xml | 30 ++
 tests/genericxml2xmlindata/cachetune-small.xml | 7 +
 tests/genericxml2xmltest.c | 2 +
 14 files changed, 1277 insertions(+), 103 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml

-----Original Message-----
From: John Ferlan [mailto:jferlan@redhat.com]
Sent: Wednesday, October 10, 2018 12:54 AM
To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com
Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com>
Subject: Re: [libvirt] [PATCHv5 00/19] Introduce x86 Cache Monitoring Technology (CMT)
On 10/9/18 6:30 AM, Wang Huaqiang wrote:
This series of patches, together with the series already merged, introduces the x86 Cache Monitoring Technology (CMT) to libvirt by interacting with the kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU features belonging to the Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores.
In the v1 series, an original and complete feature for CMT was introduced. The v2 and v3 patches address the feature for the host capability of CMT. v4 addresses the feature for monitoring the cache occupancy of a VM vcpu thread set and reporting it through a virsh command.
We have had several discussions about enabling CMT; please refer to the following links for the RFCs. RFCv3 https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html RFCv2 https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html RFCv1 https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
The merged commits for the host capability of CMT are listed below. 6af8417415508c31f8ce71234b573b4999f35980 8f6887998bf63594ae26e3db18d4d5896c5f2cb4 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8 12093f1feaf8f5023dcd9d65dff111022842183d a5d293c18831dcf69ec6195798387fbb70c9f461
1. Why is CMT necessary in libvirt? The perf events 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf-based cmt and mbm in libvirt will not work with the latest Linux kernel. These patches add the CMT feature to libvirt through the kernel resctrlfs interface.
2. Create cache monitoring group (cache monitor).
The main interface for creating a monitoring group is the XML file. The proposed configuration looks like:
<cputune> <cachetune vcpus='1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='1'/>
Duplication of vcpus is odd for a child entry isn't it? It's not in the <cache> entry...
It seems odd in this example if you think of one 'vcpus' as a copy of the other. Actually the two 'vcpus' attributes can be assigned different vcpu sets. Let me introduce some background.
From the perspective of the CPU hardware, allocation groups and monitor groups use different hardware resources. The number of hardware CLOSIDs (class of service IDs) determines how many allocation groups can be created. The number of hardware RMIDs (resource monitoring IDs) determines how many monitor groups can exist simultaneously on the host.
Normally we have more RMIDs than CLOSIDs, that is, we can create more monitors than allocations. Based on this hardware design, the kernel introduces the resctrl file system, which can create more monitors than allocations, and more than one monitor for each allocation. Allocations can only be created under the root directory '/sys/fs/resctrl', with one exception: the root directory itself is also an allocation, called the default allocation. Monitors can be created under the default allocation's 'mon_groups' directory, which is '/sys/fs/resctrl/mon_groups', or under another allocation's 'mon_groups' directory, for example '/sys/fs/resctrl/p0/mon_groups' for allocation 'p0'. Each directory under an allocation's 'mon_groups' occupies one RMID, and you can create several directories there. Each allocation itself also occupies one RMID and acts as a monitor for the cache or memory bandwidth utilization of all CPUs using this RMID. With the help of the kernel scheduler, the hardware RMID is assigned to the CPU servicing the current Linux thread at the time of a CPU context switch. That is, the RMID can be tracked through the PID of the vcpu.
Both <cachetune> and <monitor> have a 'vcpus' attribute, but the two may point to different vcpu lists. The rules are:
1. Each <cachetune> entry may specify more than one monitor.
2. Each <cachetune> entry may specify at most one allocation.
3. The allocation uses the vcpu list specified in the <cachetune> attribute 'vcpus'.
4. A monitor may have the same vcpu list as the allocation; such a monitor is the allocation's default monitor.
5. A monitor may have a vcpu list that is a subset of the allocation's.
6. The vcpu lists of non-default monitors must not overlap.
Since we treat both memorytune and cachetune as allocations, these rules also apply to memory bandwidth allocation if <cachetune> is replaced with <memorytune>.
So the following XML is all valid:
<cputune>
  <cachetune vcpus='1'>
    <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
    <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
    <monitor level='3' vcpus='1'/>
  </cachetune>
  <cachetune vcpus='2-5'>
    <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
    <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
    <monitor level='3' vcpus='2'/>
    <monitor level='3' vcpus='3,5'/>
    <monitor level='3' vcpus='4'/>
    <monitor level='3' vcpus='2-5'/>
  </cachetune>
  <cachetune vcpus='6'>
    <monitor level='3' vcpus='6'/>
  </cachetune>
</cputune>
Any of the following <cachetune> entries is invalid:
<cachetune vcpus='6'>
  <monitor level='3' vcpus='7'/>
</cachetune>
<cachetune vcpus='2-5'>
  <monitor level='3' vcpus='2-4'/>
  <monitor level='3' vcpus='2'/>
</cachetune>
<cachetune vcpus='6'>
  <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
  <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
  <monitor level='3' vcpus='7'/>
</cachetune>
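The subset and overlap rules above (rules 4 to 6) can be sketched as a small check. This is only an illustration, not the code in the series: vcpu lists are assumed to fit in a 64-bit mask where bit N stands for vcpu N, and the name monitorSetsValid is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check, not libvirt code. 'alloc' is the vcpu mask of the
 * <cachetune> entry and 'mons' holds one vcpu mask per <monitor>. */
static bool
monitorSetsValid(uint64_t alloc, const uint64_t *mons, size_t nmons)
{
    uint64_t seen = 0;      /* union of non-default monitor masks so far */
    size_t i;

    for (i = 0; i < nmons; i++) {
        /* a monitor must name at least one vcpu, all inside the allocation */
        if (mons[i] == 0 || (mons[i] & ~alloc) != 0)
            return false;

        /* same vcpu list as the allocation: default monitor (rule 4),
         * which is exempt from the overlap rule */
        if (mons[i] == alloc)
            continue;

        /* non-default monitors must not share any vcpu (rule 6) */
        if ((mons[i] & seen) != 0)
            return false;
        seen |= mons[i];
    }
    return true;
}
```

With alloc = 0x3C (vcpus 2-5), the masks {0x04, 0x28, 0x10, 0x3C} correspond to the valid '2', '3,5', '4', '2-5' example, while {0x1C, 0x04} corresponds to the invalid '2-4' plus '2' example because vcpu 2 appears in two non-default monitors.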
</cachetune> <cachetune vcpus='4-7'> + <monitor level='3' vcpus='4-6'/>
... but perhaps that means using 4-6 is OK because it's a subset of the parent cachetune 4-7?
I'm not sure I can keep track of all the discussions we've had about this, so this could be something we've already covered, but has moved out of my short term memory.
See above explanation.
</cachetune> </cputune>
The above XML creates 2 cache resctrl allocation groups and 2 resctrl monitoring groups. Changes to the cache monitor take effect at the next boot of the VM.
3. Show CMT result through command 'domstats'
Add an interface in qemu to report this information for the resource monitor group through the command 'virsh domstats --cpu-total'. Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.0.name=vcpus_1 cpu.cache.0.vcpus=1 cpu.cache.0.bank.count=2 cpu.cache.0.bank.0.id=0 cpu.cache.0.bank.0.bytes=4505600 cpu.cache.0.bank.1.id=1 cpu.cache.0.bank.1.bytes=5586944 cpu.cache.1.name=vcpus_4-6
So perhaps "this" is more correct 4-6 (I assume this comes from the <cachetune> entry)...
Actually 'cpu.cache.1.name' shows the cache monitor ID, the content of @virResctrlMonitor.id. I did not call it 'cpu.cache.x.id' because that would be more confusing: 'cache ID' already describes the cache bank index in some places. I also think 'vcpus_4-6' is more reasonable than '4-6' as a monitor's name or id. If you insist that '4-6' is better, I can change that.
cpu.cache.1.vcpus=4,5,6
Interesting that a name can be 4-6, but these are each called out. Can someone have "5,7,9"? How does that look on the name line and then on the vcpus line?
The vcpu list "5,7,9" is valid here, and the monitor's output would be:
  cpu.cache.monitor.count=...
  cpu.cache.0.name=vcpus_5,7,9
  cpu.cache.0.vcpus=5,7,9
  cpu.cache.0.bank.count=2
  cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.0.bytes=4505600
  cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.1.bytes=4505600
  cpu.cache.0.bank.1.id=0
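To make the relation between the name line and the vcpus line concrete, here is a small Python sketch. It only reproduces the pattern visible in the two outputs in this thread ('vcpus_' followed by the vcpu list as written, and the vcpus line with ranges expanded); how libvirt builds these strings internally is not shown here, and the helper names `expand_vcpus` and `monitor_fields` are hypothetical.

```python
def expand_vcpus(spec):
    """Expand '4-6' or '5,7,9' into a sorted list of vcpu numbers."""
    vcpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            vcpus.update(range(int(lo), int(hi) + 1))
        else:
            vcpus.add(int(part))
    return sorted(vcpus)


def monitor_fields(index, spec):
    """Build the name and vcpus entries for monitor number `index`."""
    return {
        # Assumed pattern: the name keeps the vcpu list exactly as written.
        f"cpu.cache.{index}.name": f"vcpus_{spec}",
        # The vcpus entry lists every vcpu individually.
        f"cpu.cache.{index}.vcpus": ",".join(str(v) for v in expand_vcpus(spec)),
    }


if __name__ == "__main__":
    for key, value in monitor_fields(1, "4-6").items():
        print(f"{key}={value}")
    for key, value in monitor_fields(0, "5,7,9").items():
        print(f"{key}={value}")
```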
  cpu.cache.1.bank.count=2
  cpu.cache.1.bank.0.id=0
  cpu.cache.1.bank.0.bytes=17571840
  cpu.cache.1.bank.1.id=1
  cpu.cache.1.bank.1.bytes=29106176
Obviously a different example than above with only 1 <monitor> entry... and the .bytes values for everything don't match up with the KiB values above.
The numbers in my example are not real numbers taken from a system, so they may not match up. .bytes reports the cache occupancy of the current monitor, which can be less than the cache allocated by the current allocation. It is reasonable that we allocate 10MB of cache to vcpu 0 and vcpu 0 uses only part of that cache resource. I am having some trouble making CAT work on my test machine; I'll try to capture some real numbers to illustrate this once I have fixed the CAT issue.
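As a rough illustration of comparing occupancy with an allocation, the Python sketch below parses the domstats lines from the cover letter and relates one bank's .bytes value to an allocation size given in KiB. The sample values are the illustrative ones from this thread, not measurements, and the 10240 KiB allocation is a made-up figure standing in for the 10MB example. The names `parse_domstats` and `occupancy_ratio` are hypothetical.

```python
SAMPLE = """\
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
"""


def parse_domstats(text):
    """Group cpu.cache.<m>.* lines by monitor index and bank index."""
    monitors = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("cpu.cache.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.split(".")
        if len(parts) < 4 or not parts[2].isdigit():
            continue  # skips e.g. cpu.cache.monitor.count
        mon = monitors.setdefault(int(parts[2]), {"banks": {}})
        if parts[3] == "bank" and len(parts) == 6 and parts[4].isdigit():
            # cpu.cache.<m>.bank.<b>.id / .bytes
            mon["banks"].setdefault(int(parts[4]), {})[parts[5]] = int(value)
        elif parts[3] in ("name", "vcpus"):
            mon[parts[3]] = value
    return monitors


def occupancy_ratio(used_bytes, allocated_kib):
    """Fraction of an allocation (given in KiB) that is currently occupied."""
    return used_bytes / (allocated_kib * 1024)


if __name__ == "__main__":
    stats = parse_domstats(SAMPLE)
    used = stats[0]["banks"][0]["bytes"]
    # Hypothetical 10240 KiB allocation for this monitor's vcpus.
    print(f"{stats[0]['name']}: {occupancy_ratio(used, 10240):.1%} of allocation in use")
```

A ratio below 1 simply means the vcpus are not currently occupying all of the cache they were allocated.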
Changes in v5:
- qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection.
- Add the document for domain configuration related to resctrl monitor.
Probably should have posted a reply to your v4 series to indicate you were working on a v5 due to whatever reason so that no one started reviewing it...
It takes a "long time" to set aside the time to review large series...
I understand, and I noticed that you also submit your own patch series to the community and carry a heavy review workload. In v5, I added some missing parts.
Also, while it may pass your compiler, patch 18 needed:
-    unsigned int nmonitors = NULL;
+    unsigned int nmonitors = 0;
Something I thought I had pointed out in much earlier reviews...
Yes you told me about this, my bad. Will be fixed.
I'll work through the series over the next day or so with any luck... It is on my short term radar at least.
John
Thank you very much for your review. Huaqiang
Changes in v4:
v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command.
- Introduced resctrl default allocation
- Introduced resctrl monitor and default monitor

Changes in v3:
- Addressed John Ferlan's review.
- Typo fixed.
- Removed VIR_ENUM_DECL(virMonitor);

Changes in v2:
- Introduced MBM capability.
- Capability layout changed
  * Moved <monitor> from cache <bank> to <cache>
  * Renamed <Threshold> to <reuseThreshold>
- Document for 'reuseThreshold' changed.
- Introduced API virResctrlInfoGetMonitorPrefix
- Added more tests, covering standalone CMT, fake new feature.
- Creating CMT resource control group will be subsequent job.
Wang Huaqiang (19):
  docs: Refactor schemas to support default allocation
  util: Introduce resctrl monitor for CMT
  util: Refactor code for adding PID to the resource group
  util: Add interface for adding PID to monitor
  util: Refactor code for determining allocation path
  util: Add monitor interface to determine path
  util: Refactor code for creating resctrl group
  util: Add interface for creating monitor group
  util: Add more interfaces for resctrl monitor
  util: Introduce default monitor
  conf: Refactor code for matching existing resctrls
  conf: Refactor virDomainResctrlAppend
  conf: Add resctrl monitor configuration
  Util: Add function for checking if monitor is running
  qemu: enable resctrl monitor in qemu
  conf: Add a 'id' to virDomainResctrlDef
  qemu: refactor qemuDomainGetStatsCpu
  qemu: Report cache occupancy (CMT) with domstats
  qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection
 docs/formatdomain.html.in                          |  30 +-
 docs/schemas/domaincommon.rng                      |  14 +-
 src/conf/domain_conf.c                             | 327 ++++++++++--
 src/conf/domain_conf.h                             |  12 +
 src/libvirt-domain.c                               |   9 +
 src/libvirt_private.syms                           |  12 +
 src/qemu/qemu_driver.c                             | 271 +++++++++-
 src/qemu/qemu_process.c                            |  52 +-
 src/util/virresctrl.c                              | 562 ++++++++++++++++++++-
 src/util/virresctrl.h                              |  49 ++
 tests/genericxml2xmlindata/cachetune-cdp.xml       |   3 +
 .../cachetune-colliding-monitor.xml                |  30 ++
 tests/genericxml2xmlindata/cachetune-small.xml     |   7 +
 tests/genericxml2xmltest.c                         |   2 +
 14 files changed, 1277 insertions(+), 103 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
participants (3)
- John Ferlan
- Wang Huaqiang
- Wang, Huaqiang