[libvirt] [PATCHv7 00/18] Introduce x86 Cache Monitoring Technology (CMT)

This series of patches and the series already been merged introduce the x86 Cache Monitoring Technology (CMT) to libvirt by interacting with kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU feature which belongs to the Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores. In the v1 series, an original and complete feature for CMT was introduced The v2 and v3 patches address the feature for the host capability of CMT. v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command. We have serval discussion about the enabling of CMT, please refer to following links for the RFCs. RFCv3 https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html RFCv2 https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html RFCv1 https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html And the merged commits are list as below, for host capability of CMT. 6af8417415508c31f8ce71234b573b4999f35980 8f6887998bf63594ae26e3db18d4d5896c5f2cb4 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8 12093f1feaf8f5023dcd9d65dff111022842183d a5d293c18831dcf69ec6195798387fbb70c9f461 1. About reason why CMT is necessary in libvirt? The perf events of 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, in libvirt the perf based cmt,mbm will not work with the latest linux kernel. These patches add CMT feature to libvirt through kernel resctrlfs interface. 2 Create cache monitoring group (cache monitor). The main interface for creating monitoring group is through XML file. The proposed configuration is like: <cputune> <cachetune vcpus='1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='4-7'> + <monitor level='3' vcpus='4-6'/> </cachetune> </cputune> In above XML, created 2 cache resctrl allocation groups and 2 resctrl monitoring groups. The changes of cache monitor will be effective in next booting of VM. 2 Show CMT result through command 'domstats' Adding the interface in qemu to report this information for resource monitor group through command 'virsh domstats --cpu-total'. Below is a typical output: # virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176 Changes in v7: - Add several lines removed by mistake. Changes in v6: - Addressing John's review comments for v5. - Removed and cleaned the concepts of 'default allocation' and 'default monitor'. - qemu: virsh domstats --cpu-total output for CMT, add identifier 'monitor' for each itm. Changes in v5: - qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection. - Add the document for domain configuration related to resctrl monitor. Changes in v4: v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command. - Introduced resctrl default allocation - Introduced resctrl monitor and default monitor Changes in v3: - Addressed John Ferlan's review. - Typo fixed. - Removed VIR_ENUM_DECL(virMonitor); Changes in v2: - Introduced MBM capability. - Capability layout changed * Moved <monitor> from cahe <bank> to <cache> * Renamed <Threshold> to <reuseThreshold> - Document for 'reuseThreshold' changed. - Introduced API virResctrlInfoGetMonitorPrefix - Added more tests, covering standalone CMT, fake new feature. - Creating CMT resource control group will be subsequent job. Wang Huaqiang (18): docs,util: Refactor schemas and virresctrl to support optional cache util: Introduce resctrl monitor for CMT util: Refactor code for determining allocation path util: Add interface to determine monitor path util: Refactor code for adding PID to the resource group util: Add interface for adding PID to the monitor util: Refactor code for creating resctrl group util: Add interface for creating monitor group util: Add more interfaces for resctrl monitor conf: Remove virDomainResctrlAppend and introduce virDomainResctrlNew conf: move virResctrlAllocIsEmpty to a place a litter lower conf: Introduce cache monitor element in cachetune qemu: enable resctrl monitor in qemu util: Add function for checking if monitor is running conf: Add 'id' to virDomainResctrlDef qemu: refactor qemuDomainGetStatsCpu qemu: Report cache occupancy (CMT) with domstats qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection docs/formatdomain.html.in | 30 +- docs/schemas/domaincommon.rng | 14 +- src/conf/domain_conf.c | 310 ++++++++++-- src/conf/domain_conf.h | 12 + src/libvirt-domain.c | 9 + src/libvirt_private.syms | 11 + src/qemu/qemu_driver.c | 270 +++++++++- src/qemu/qemu_process.c | 69 ++- src/util/virresctrl.c | 560 +++++++++++++++++++-- src/util/virresctrl.h | 49 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 ++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 14 files changed, 1283 insertions(+), 93 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml -- 2.7.4

Refactor schemas and virresctrl to support optional <cache> element in <cachetune>. Later, the monitor entry will be introduced and to be placed under <cachetune>. Either cache entry or monitor entry is an optional element of <cachetune>. An cachetune has no <cache> element is taking the default resource allocating policy defined in '/sys/fs/resctrl/schemata'. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 4 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 8189959..b1651e3 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -943,8 +943,8 @@ <dl> <dt><code>cache</code></dt> <dd> - This element controls the allocation of CPU cache and has the - following attributes: + This optional element controls the allocation of CPU cache and has + the following attributes: <dl> <dt><code>level</code></dt> <dd> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 099a949..5c533d6 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -956,7 +956,7 @@ <attribute name="vcpus"> <ref name='cpuset'/> </attribute> - <oneOrMore> + <zeroOrMore> <element name="cache"> <attribute name="id"> <ref name='unsignedInt'/> @@ -980,7 +980,7 @@ </attribute> </optional> </element> - </oneOrMore> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 5d811a2..91bd341 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -234,6 +234,11 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * in case there is no allocation for that particular cache allocation (level, * cache, ...) or memory allocation for particular node). * + * Allocation corresponding to root directory, /sys/fs/sysctrl/, defines the + * default resource allocating policy, which is created immediately after + * mounting, and owns all the tasks and cpus in the system. Cache or memory + * bandwidth resource will be shared for tasks in this allocation. + * * =====Cache allocation technology (CAT)===== * * Since one allocation can be made for caches on different levels, the first @@ -2215,6 +2220,15 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, return -1; } + /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ + if (virResctrlAllocIsEmpty(alloc)) { + if (!alloc->path && + VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + return -1; + + return 0; + } + if (!alloc->path && virAsprintf(&alloc->path, "%s/%s-%s", SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) @@ -2248,6 +2262,11 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (virResctrlAllocDeterminePath(alloc, machinename) < 0) return -1; + /* If the allocation is empty, then do not create directory in underlying + * resctrl file system */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; + if (virFileExists(alloc->path)) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Path '%s' for resctrl allocation exists"), @@ -2302,6 +2321,11 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc, char *pidstr = NULL; int ret = 0; + /* If the allocation is empty, then it is impossible to add a PID to + * allocation due to lacking of its 'tasks' file. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; + if (!alloc->path) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Cannot add pid to non-existing resctrl allocation")); @@ -2334,6 +2358,10 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0; + /* No directory have ever been created. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; + if (!alloc->path) return 0; -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactor schemas and virresctrl to support optional <cache> element in <cachetune>.
Later, the monitor entry will be introduced and to be placed under <cachetune>. Either cache entry or monitor entry is an optional element of <cachetune>.
An cachetune has no <cache> element is taking the default resource allocating policy defined in '/sys/fs/resctrl/schemata'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 4 deletions(-)
[...]
+ /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ + if (virResctrlAllocIsEmpty(alloc)) { + if (!alloc->path && + VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + return -1; + + return 0; + } +
Because of ...
if (!alloc->path && virAsprintf(&alloc->path, "%s/%s-%s", SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0)
[...]
@@ -2334,6 +2358,10 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0;
+ /* No directory have ever been created. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0;
... the change to virResctrlAllocDeterminePath to fill in alloc->path when virResctrlAllocIsEmpty to be a default path, this should be: if (STREQ_NULLABLE(alloc->path, SYSFS_RESCTRL_PATH)) return 0; or moved after the next check and the _NULLABLE removed. Whether the AllocIsEmpty is true or not shouldn't be the bearing on whether the directory created because of that
+ if (!alloc->path) return 0;
I can adjust for you, let me know; otherwise, things are fine for the Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:01, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactor schemas and virresctrl to support optional <cache> element in <cachetune>.
Later, the monitor entry will be introduced and to be placed under <cachetune>. Either cache entry or monitor entry is an optional element of <cachetune>.
An cachetune has no <cache> element is taking the default resource allocating policy defined in '/sys/fs/resctrl/schemata'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 4 deletions(-)
[...]
+ /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ + if (virResctrlAllocIsEmpty(alloc)) { + if (!alloc->path && + VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + return -1; + + return 0; + } + Because of ...
if (!alloc->path && virAsprintf(&alloc->path, "%s/%s-%s", SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0)
[...]
@@ -2334,6 +2358,10 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0;
+ /* No directory have ever been created. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; ... the change to virResctrlAllocDeterminePath to fill in alloc->path when virResctrlAllocIsEmpty to be a default path, this should be:
if (STREQ_NULLABLE(alloc->path, SYSFS_RESCTRL_PATH)) return 0;
or moved after the next check and the _NULLABLE removed.
Whether the AllocIsEmpty is true or not shouldn't be the bearing on whether the directory created because of that
Agree with the changes. "No need to create a directory that has already been created by system." (SYSFS_RESCTRL_PATH is the created directory).
+ if (!alloc->path) return 0;
I can adjust for you, let me know; otherwise, things are fine for the
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

On 2018年11月06日 16:18, Huaqiang,Wang wrote:
On 2018年11月05日 23:01, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactor schemas and virresctrl to support optional <cache> element in <cachetune>.
Later, the monitor entry will be introduced and to be placed under <cachetune>. Either cache entry or monitor entry is an optional element of <cachetune>.
An cachetune has no <cache> element is taking the default resource allocating policy defined in '/sys/fs/resctrl/schemata'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 4 deletions(-)
[...]
+ /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ + if (virResctrlAllocIsEmpty(alloc)) { + if (!alloc->path && + VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + return -1; + + return 0; + } + Because of ...
if (!alloc->path && virAsprintf(&alloc->path, "%s/%s-%s", SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) [...]
@@ -2334,6 +2358,10 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0; + /* No directory have ever been created. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; ... the change to virResctrlAllocDeterminePath to fill in alloc->path when virResctrlAllocIsEmpty to be a default path, this should be:
if (STREQ_NULLABLE(alloc->path, SYSFS_RESCTRL_PATH)) return 0;
or moved after the next check and the _NULLABLE removed.
Whether the AllocIsEmpty is true or not shouldn't be the bearing on whether the directory created because of that
Agree with the changes. "No need to create a directory that has already been created by system." (SYSFS_RESCTRL_PATH is the created directory).
Should be "no need to destroy a system created directory, the SYSFS_RESCTRL_PATH. Directory SYSFS_RESCTRL_PATH is governed by system." It might also be reasonable to make similar changes for virResctrlAllocCreate, because no need to create an already created directory, Right? And *alloc->path must be properly assigned, by virResctrlAllocDeterminePath, before a call of virResctrlAllocCreate.
+ if (!alloc->path) return 0;
I can adjust for you, let me know; otherwise, things are fine for the
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

On 11/6/18 3:30 AM, Huaqiang,Wang wrote:
On 2018年11月06日 16:18, Huaqiang,Wang wrote:
On 2018年11月05日 23:01, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactor schemas and virresctrl to support optional <cache> element in <cachetune>.
Later, the monitor entry will be introduced and to be placed under <cachetune>. Either cache entry or monitor entry is an optional element of <cachetune>.
An cachetune has no <cache> element is taking the default resource allocating policy defined in '/sys/fs/resctrl/schemata'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 4 ++-- docs/schemas/domaincommon.rng | 4 ++-- src/util/virresctrl.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 4 deletions(-)
[...]
+ /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ + if (virResctrlAllocIsEmpty(alloc)) { + if (!alloc->path && + VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + return -1; + + return 0; + } + Because of ...
if (!alloc->path && virAsprintf(&alloc->path, "%s/%s-%s", SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) [...]
@@ -2334,6 +2358,10 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0; + /* No directory have ever been created. Just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; ... the change to virResctrlAllocDeterminePath to fill in alloc->path when virResctrlAllocIsEmpty to be a default path, this should be:
if (STREQ_NULLABLE(alloc->path, SYSFS_RESCTRL_PATH)) return 0;
or moved after the next check and the _NULLABLE removed.
Whether the AllocIsEmpty is true or not shouldn't be the bearing on whether the directory created because of that
Agree with the changes. "No need to create a directory that has already been created by system." (SYSFS_RESCTRL_PATH is the created directory).
Should be "no need to destroy a system created directory, the SYSFS_RESCTRL_PATH. Directory SYSFS_RESCTRL_PATH is governed by system."
I'll add the comment: + /* Do not destroy if path is the system/default path for the allocation */
It might also be reasonable to make similar changes for virResctrlAllocCreate, because no need to create an already created directory, Right? And *alloc->path must be properly assigned, by virResctrlAllocDeterminePath, before a call of virResctrlAllocCreate.
True - I'll adjust the check there too to be: + /* If using the system/default path for the allocation, then we're done */ + if (STREQ(alloc->path, SYSFS_RESCTRL_PATH)) + return 0; + Tks - John
+ if (!alloc->path) return 0;
I can adjust for you, let me know; otherwise, things are fine for the
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task. This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 79 ++++++++++++++++++++++++++++++++++++++++++++---- src/util/virresctrl.h | 9 ++++++ 3 files changed, 83 insertions(+), 6 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorNew; # util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 91bd341..6530801 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -36,9 +36,9 @@ VIR_LOG_INIT("util.virresctrl") /* Resctrl is short for Resource Control. It might be implemented for various - * resources. Currently this supports cache allocation technology (aka CAT) and - * memory bandwidth allocation (aka MBA). More resources technologies may be - * added in the future. + * resources. Currently this supports cache allocation technology (aka CAT), + * memory bandwidth allocation (aka MBA) and cache monitoring technology (aka + * CMT). More resources technologies may be added in the future. */ @@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; static virClassPtr virResctrlAllocClass; +static virClassPtr virResctrlMonitorClass; /* virResctrlInfo */ @@ -224,11 +225,16 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) } -/* virResctrlAlloc */ +/* virResctrlAlloc and virResctrlMonitor*/ /* - * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and - * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * virResctrlAlloc and virResctrlMonitor are representing a resource control + * group (in XML under cputune/cachetune and consequently a directory under + * /sys/fs/resctrl). virResctrlAlloc is the data structure for resource + * allocation, while the virResctrlMonitor represents the resource monitoring + * part. + * + * virResctrlAlloc represents one allocation. Since it can have multiple * parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL * in case there is no allocation for that particular cache allocation (level, @@ -275,6 +281,18 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * a sparse array to represent whether a memory bandwidth allocation happens * on corresponding node. The available memory controller number is collected * in 'virResctrlInfo'. + * + * =====Cache monitoring technology (CMT)===== + * + * Cache monitoring technology is used to perceive how many cache the process + * is using actually. virResctrlMonitor represents the resource control + * monitoring group, it is supported to monitor resource utilization + * information on granularity of vcpu. + * + * From hardware perspective, cache monitoring technology (CMT), memory + * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal + * features. The monitor will be created under the scope of default resctl + * group if no specific CAT or MBA entries are provided for the guest." */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since everything @@ -320,6 +338,29 @@ struct _virResctrlAlloc { char *path; }; +/* + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl + * monitor represents a resctrl monitoring group, which can be used to + * monitor the resource utilization information for either cache or + * memory bandwidth. + */ +struct _virResctrlMonitor { + virObject parent; + + /* In resctrl, each monitor is associated with one specific allocation, + * either the allocation under / sys / fs / resctrl or allocation of the + * root directory itself. This pointer points to the allocation + * this monitor associated with. */ + virResctrlAllocPtr alloc; + /* The monitor identifier. For a monitor has the same @path name as its + * @alloc, the @id will be set to the same value as it is in @alloc->id. + */ + char *id; + /* libvirt-generated path in /sys/fs/resctrl for this particular + * monitor */ + char *path; +}; + static void virResctrlAllocDispose(void *obj) @@ -369,6 +410,17 @@ virResctrlAllocDispose(void *obj) } +static void +virResctrlMonitorDispose(void *obj) +{ + virResctrlMonitorPtr monitor = obj; + + virObjectUnref(monitor->alloc); + VIR_FREE(monitor->id); + VIR_FREE(monitor->path); +} + + /* Global initialization for classes */ static int virResctrlOnceInit(void) @@ -379,6 +431,9 @@ virResctrlOnceInit(void) if (!VIR_CLASS_NEW(virResctrlAlloc, virClassForObject())) return -1; + if (!VIR_CLASS_NEW(virResctrlMonitor, virClassForObject())) + return -1; + return 0; } @@ -2373,3 +2428,15 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc) return ret; } + + +/* virResctrlMonitor-related definitions */ + +virResctrlMonitorPtr +virResctrlMonitorNew(void) +{ + if (virResctrlInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlMonitorClass); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 10505e9..eaa077d 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -185,4 +185,13 @@ int virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, const char *prefix, virResctrlInfoMonPtr *monitor); + +/* Monitor-related things */ + +typedef struct _virResctrlMonitor virResctrlMonitor; +typedef virResctrlMonitor *virResctrlMonitorPtr; + + +virResctrlMonitorPtr +virResctrlMonitorNew(void); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task.
This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 79 ++++++++++++++++++++++++++++++++++++++++++++---- src/util/virresctrl.h | 9 ++++++ 3 files changed, 83 insertions(+), 6 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms
[...]
@@ -275,6 +281,18 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * a sparse array to represent whether a memory bandwidth allocation happens * on corresponding node. The available memory controller number is collected * in 'virResctrlInfo'. + * + * =====Cache monitoring technology (CMT)===== + * + * Cache monitoring technology is used to perceive how many cache the process + * is using actually. virResctrlMonitor represents the resource control + * monitoring group, it is supported to monitor resource utilization + * information on granularity of vcpu. + * + * From hardware perspective, cache monitoring technology (CMT), memory
From a
+ * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal + * features. The monitor will be created under the scope of default resctl
*resctrl
+ * group if no specific CAT or MBA entries are provided for the guest." */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since everything @@ -320,6 +338,29 @@ struct _virResctrlAlloc { char *path; };
+/* + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl + * monitor represents a resctrl monitoring group, which can be used to + * monitor the resource utilization information for either cache or + * memory bandwidth. + */ +struct _virResctrlMonitor { + virObject parent; + + /* In resctrl, each monitor is associated with one specific allocation,
Each ResctrlMonitor is associated...
+ * either the allocation under / sys / fs / resctrl or allocation of the
either the root directory allocation /sys/fs/resctrl or a specific allocation defined under the root directory. With these simple changes that I can make for you, Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 2018年11月05日 23:02, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Cache Monitoring Technology (aka CMT) provides the capability to report cache utilization information of system task.
This patch introduces the concept of resctrl monitor through data structure virResctrlMonitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 79 ++++++++++++++++++++++++++++++++++++++++++++---- src/util/virresctrl.h | 9 ++++++ 3 files changed, 83 insertions(+), 6 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..d2573c5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms [...]
@@ -275,6 +281,18 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon) * a sparse array to represent whether a memory bandwidth allocation happens * on corresponding node. The available memory controller number is collected * in 'virResctrlInfo'. + * + * =====Cache monitoring technology (CMT)===== + * + * Cache monitoring technology is used to perceive how many cache the process + * is using actually. virResctrlMonitor represents the resource control + * monitoring group, it is supported to monitor resource utilization + * information on granularity of vcpu. + * + * From hardware perspective, cache monitoring technology (CMT), memory From a
+ * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal + * features. The monitor will be created under the scope of default resctl *resctrl
+ * group if no specific CAT or MBA entries are provided for the guest." */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since everything @@ -320,6 +338,29 @@ struct _virResctrlAlloc { char *path; };
+/* + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl + * monitor represents a resctrl monitoring group, which can be used to + * monitor the resource utilization information for either cache or + * memory bandwidth. + */ +struct _virResctrlMonitor { + virObject parent; + + /* In resctrl, each monitor is associated with one specific allocation, Each ResctrlMonitor is associated...
+ * either the allocation under / sys / fs / resctrl or allocation of the either the root directory allocation /sys/fs/resctrl or a specific allocation defined under the root directory.
With these simple changes that I can make for you,
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Please help me correct these errors. Thanks for the review. Huaqiang
[...]

The code for determining resctrl allocation path could be reused for monitor. Refactor it for reuse. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 38 ++++++++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 8 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 6530801..956aca8 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2265,28 +2265,50 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, } +static char * +virResctrlDeterminePath(const char *parentpath, + const char *prefix, + const char *id) +{ + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl ID must be set before determining resctrl " + "parentpath='%s'"), parentpath); + return NULL; + } + + if (virAsprintf(&path, "%s/%s-%s", parentpath, prefix, id) < 0) + return NULL; + + return path; +} + + int virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, const char *machinename) { - if (!alloc->id) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Resctrl Allocation ID must be set before creation")); + if (alloc->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl allocation path is already set to '%s'"), + alloc->path); return -1; } /* If the allocation is empty, then the path will be SYSFS_RESCTRL_PATH */ if (virResctrlAllocIsEmpty(alloc)) { - if (!alloc->path && - VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) + if (VIR_STRDUP(alloc->path, SYSFS_RESCTRL_PATH) < 0) return -1; return 0; } - if (!alloc->path && - virAsprintf(&alloc->path, "%s/%s-%s", - SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH, + machinename, alloc->id); + + if (!alloc->path) return -1; return 0; -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 38 ++++++++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 8 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 6530801..956aca8 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2265,28 +2265,50 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *parentpath, + const char *prefix, + const char *id) +{ + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl ID must be set before determining resctrl " + "parentpath='%s'"), parentpath);
Add "for prefix='%s'" w/ prefix as argument especially since for Alloc's the parent path is SYSFS_RESCTRL_PATH so it's perhaps not specific enough. With this change that I can make for you, Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 2018年11月05日 23:02, John Ferlan wrote:
The code for determining resctrl allocation path could be reused for monitor. Refactor it for reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 38 ++++++++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 8 deletions(-)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 6530801..956aca8 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2265,28 +2265,50 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, }
+static char * +virResctrlDeterminePath(const char *parentpath, + const char *prefix, + const char *id) +{ + char *path = NULL; + + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl ID must be set before determining resctrl " + "parentpath='%s'"), parentpath); Add "for prefix='%s'" w/ prefix as argument especially since for Alloc's
On 10/22/18 4:01 AM, Wang Huaqiang wrote: the parent path is SYSFS_RESCTRL_PATH so it's perhaps not specific enough.
With this change that I can make for you,
Your change made the message be more clear. Agree with your change.
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for your review. Huaqiang
[...]

Add interface for resctrl monitor to determine the path. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 ++++- 3 files changed, 60 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d2573c5..5235046 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorDeterminePath; virResctrlMonitorNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 956aca8..1d0eb01 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2462,3 +2462,58 @@ virResctrlMonitorNew(void) return virObjectNew(virResctrlMonitorClass); } + + +/* + * virResctrlMonitorDeterminePath + * + * @monitor: Pointer to a resctrl monitor + * @machinename: Name string of the VM + * + * Determines the directory path that the underlying resctrl group will be + * created with. + * + * A monitor represents a directory under resource control file system, + * its directory path could be the same path as @monitor->alloc, could be a + * path of directory under 'mon_groups' of @monitor->alloc, or a path of + * directory under '/sys/fs/resctrl/mon_groups' if @monitor->alloc is NULL. + * + * Returns 0 on success, -1 on error. + */ +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) +{ + VIR_AUTOFREE(char *) parentpath = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (monitor->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl monitor path is already set to '%s'"), + monitor->path); + return -1; + } + + if (monitor->id && monitor->alloc && monitor->alloc->id) { + if (STREQ(monitor->id, monitor->alloc->id)) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) + return -1; + return 0; + } + } + + if (virAsprintf(&parentpath, "%s/mon_groups", monitor->alloc->path) < 0) + return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index eaa077d..baae554 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,7 +191,10 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr; - virResctrlMonitorPtr virResctrlMonitorNew(void); + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 ++++- 3 files changed, 60 insertions(+), 1 deletion(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:02, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 ++++- 3 files changed, 60 insertions(+), 1 deletion(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 ++++- 3 files changed, 60 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d2573c5..5235046 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 956aca8..1d0eb01 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2462,3 +2462,58 @@ virResctrlMonitorNew(void)
return virObjectNew(virResctrlMonitorClass); } + + +/* + * virResctrlMonitorDeterminePath + * + * @monitor: Pointer to a resctrl monitor + * @machinename: Name string of the VM + * + * Determines the directory path that the underlying resctrl group will be + * created with. + * + * A monitor represents a directory under resource control file system, + * its directory path could be the same path as @monitor->alloc, could be a + * path of directory under 'mon_groups' of @monitor->alloc, or a path of + * directory under '/sys/fs/resctrl/mon_groups' if @monitor->alloc is NULL. + * + * Returns 0 on success, -1 on error. + */ +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) +{ + VIR_AUTOFREE(char *) parentpath = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (monitor->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl monitor path is already set to '%s'"), + monitor->path); + return -1; + } + + if (monitor->id && monitor->alloc && monitor->alloc->id) { + if (STREQ(monitor->id, monitor->alloc->id)) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) + return -1; + return 0; + } + } + + if (virAsprintf(&parentpath, "%s/mon_groups", monitor->alloc->path) < 0)
Just ran the changes through Coverity - because of the above "monitor->alloc" check, Coverity notes right here that monitor->alloc could be NULL, so I think a check prior to here would be in order, such as either before or after the monitor->path check: if (!monitor->alloc) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Missing resctrl monitor allocation")); return -1; } Then the monitor->alloc check wouldn't be necessary. Thus the above STRDUP becomes: if (STREQ_NULLABLE(monitor->id, monitor->alloc->id)) { if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) return -1; return 0; } Let me know where you want it. John
+ return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index eaa077d..baae554 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,7 +191,10 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr;
- virResctrlMonitorPtr virResctrlMonitorNew(void); + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

On 2018年11月06日 01:19, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for resctrl monitor to determine the path.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 ++++- 3 files changed, 60 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d2573c5..5235046 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorDeterminePath; virResctrlMonitorNew;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 956aca8..1d0eb01 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2462,3 +2462,58 @@ virResctrlMonitorNew(void)
return virObjectNew(virResctrlMonitorClass); } + + +/* + * virResctrlMonitorDeterminePath + * + * @monitor: Pointer to a resctrl monitor + * @machinename: Name string of the VM + * + * Determines the directory path that the underlying resctrl group will be + * created with. + * + * A monitor represents a directory under resource control file system, + * its directory path could be the same path as @monitor->alloc, could be a + * path of directory under 'mon_groups' of @monitor->alloc, or a path of + * directory under '/sys/fs/resctrl/mon_groups' if @monitor->alloc is NULL. + * + * Returns 0 on success, -1 on error. + */ +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename) +{ + VIR_AUTOFREE(char *) parentpath = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (monitor->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Resctrl monitor path is already set to '%s'"), + monitor->path); + return -1; + } + + if (monitor->id && monitor->alloc && monitor->alloc->id) { + if (STREQ(monitor->id, monitor->alloc->id)) { + if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) + return -1; + return 0; + } + } + + if (virAsprintf(&parentpath, "%s/mon_groups", monitor->alloc->path) < 0) Just ran the changes through Coverity - because of the above "monitor->alloc" check, Coverity notes right here that monitor->alloc could be NULL, so I think a check prior to here would be in order,
Yes. Here exists a risk that this line you mentioned could be executed at the condition that @monitor->alloc is empty pointer, this will cause a crash.
such as either before or after the monitor->path check:
if (!monitor->alloc) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Missing resctrl monitor allocation")); return -1; }
Putting these lines before and after the monitor->path check should be OK for me. (I don't find the influence made by the order. )
Then the monitor->alloc check wouldn't be necessary. Thus the above STRDUP becomes:
if (STREQ_NULLABLE(monitor->id, monitor->alloc->id)) { if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0) return -1; return 0; }
Agree. Thanks.
Let me know where you want it.
John
Thanks for review. Huaqiang
+ return -1; + + monitor->path = virResctrlDeterminePath(parentpath, machinename, + monitor->id); + if (!monitor->path) + return -1; + + return 0; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index eaa077d..baae554 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,7 +191,10 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr;
- virResctrlMonitorPtr virResctrlMonitorNew(void); + +int +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */

The code of adding PID to the allocation could be reused, refactor it for later reuse. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 1d0eb01..5fc8772 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2390,26 +2390,21 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, } -int -virResctrlAllocAddPID(virResctrlAllocPtr alloc, - pid_t pid) +static int +virResctrlAddPID(const char *path, + pid_t pid) { char *tasks = NULL; char *pidstr = NULL; int ret = 0; - /* If the allocation is empty, then it is impossible to add a PID to - * allocation due to lacking of its 'tasks' file. Just return */ - if (virResctrlAllocIsEmpty(alloc)) - return 0; - - if (!alloc->path) { + if (!path) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot add pid to non-existing resctrl allocation")); + _("Cannot add pid to non-existing resctrl group")); return -1; } - if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0) + if (virAsprintf(&tasks, "%s/tasks", path) < 0) return -1; if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0) @@ -2431,6 +2426,19 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc, int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid) +{ + /* If allocation is empty, then no resctrl directory and the 'tasks' file + * exists, just return */ + if (virResctrlAllocIsEmpty(alloc)) + return 0; + + return virResctrlAddPID(alloc->path, pid); +} + + +int virResctrlAllocRemove(virResctrlAllocPtr alloc) { int ret = 0; -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:03, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

On 11/6/18 3:41 AM, Huaqiang,Wang wrote:
On 2018年11月05日 23:03, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang
While working through patch1 adjustments, I noted an extra space in a comment, so I fixed it: /* If the allocation is empty, then it is impossible to add a PID to * allocation due to lacking of its 'tasks' file. Just return */ ^^ Of course that resulted in a merge conflict in this patch where I (now) note you changed the comment to: /* If allocation is empty, then no resctrl directory and the 'tasks' file * exists, just return */ I'm going to restore the original comment; however, it made me stop and think about future patch14 which used the *tasks (and a new local *pid list) - perhaps you need to rethink the changes... Even patch1... What's the use of altering the *tasks file at all then? Fortunately it seems it's not really used for much more than logging the tids that are running the vcpu. I'll hold off on pushing anything... So far there's no real impact, but you may decide that you need to 'design in' a way to handle this default/system path/issue. John

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Tuesday, November 6, 2018 10:41 PM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; bing.niu@intel.com; Ding, Jian- feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [PATCHv7 05/18] util: Refactor code for adding PID to the resource group
On 11/6/18 3:41 AM, Huaqiang,Wang wrote:
On 2018年11月05日 23:03, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code of adding PID to the allocation could be reused, refactor it for later reuse.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang
While working through patch1 adjustments, I noted an extra space in a comment, so I fixed it:
/* If the allocation is empty, then it is impossible to add a PID to * allocation due to lacking of its 'tasks' file. Just return */ ^^ Of course that resulted in a merge conflict in this patch where I (now) note you changed the comment to:
/* If allocation is empty, then no resctrl directory and the 'tasks' file * exists, just return */
I'm going to restore the original comment; however, it made me stop and think about future patch14 which used the *tasks (and a new local *pid list) - perhaps you need to rethink the changes... Even patch1...
What's the use of altering the *tasks file at all then? Fortunately it seems it's not really used for much more than logging the tids that are running the vcpu.
I'll hold off on pushing anything... So far there's no real impact, but you may decide that you need to 'design in' a way to handle this default/system path/issue.
I'll send out my v8 patches today, and will be covered in my new series, please make a review. Thanks for review.
John

Add interface for adding task PID to the monitor. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 5235046..e175c8b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache; virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; +virResctrlMonitorAddPID; virResctrlMonitorDeterminePath; virResctrlMonitorNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 5fc8772..93b5e70 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2525,3 +2525,11 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, return 0; } + + +int +virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, + pid_t pid) +{ + return virResctrlAddPID(monitor->path, pid); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index baae554..52d283a 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -197,4 +197,8 @@ virResctrlMonitorNew(void); int virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, + pid_t pid); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for adding task PID to the monitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:03, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for adding task PID to the monitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 8 ++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 13 insertions(+)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

The code for creating resctrl allocation group could be reused for monitoring group, refactor it for reuse in the later patch. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 93b5e70..5b984d9 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2315,6 +2315,26 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc, } +/* This function creates a resctrl directory in resource control file system, + * and the directory path is specified by @path. */ +static int +virResctrlCreateGroupPath(const char *path) +{ + /* Directory exists, return */ + if (virFileExists(path)) + return 0; + + if (virFileMakePath(path) < 0) { + virReportSystemError(errno, + _("Cannot create resctrl directory '%s'"), + path); + return -1; + } + + return 0; +} + + /* This checks if the directory for the alloc exists. If not it tries to create * it and apply appropriate alloc settings. */ int @@ -2344,13 +2364,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (virResctrlAllocIsEmpty(alloc)) return 0; - if (virFileExists(alloc->path)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Path '%s' for resctrl allocation exists"), - alloc->path); - goto cleanup; - } - lockfd = virResctrlLockWrite(); if (lockfd < 0) goto cleanup; @@ -2358,6 +2371,9 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (virResctrlAllocAssign(resctrl, alloc) < 0) goto cleanup; + if (virResctrlCreateGroupPath(alloc->path) < 0) + goto cleanup; + alloc_str = virResctrlAllocFormat(alloc); if (!alloc_str) goto cleanup; @@ -2365,13 +2381,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (virAsprintf(&schemata_path, "%s/schemata", alloc->path) < 0) goto cleanup; - if (virFileMakePath(alloc->path) < 0) { - virReportSystemError(errno, - _("Cannot create resctrl directory '%s'"), - alloc->path); - goto cleanup; - } - VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, schemata_path); if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) { rmdir(alloc->path); -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code for creating resctrl allocation group could be reused for monitoring group, refactor it for reuse in the later patch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:03, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
The code for creating resctrl allocation group could be reused for monitoring group, refactor it for reuse in the later patch.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/util/virresctrl.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 24 ++++++++++++++++++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 29 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e175c8b..a878083 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix; virResctrlInfoMonFree; virResctrlInfoNew; virResctrlMonitorAddPID; +virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 5b984d9..9f42065 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2542,3 +2542,27 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, { return virResctrlAddPID(monitor->path, pid); } + + +int +virResctrlMonitorCreate(virResctrlMonitorPtr monitor, + const char *machinename) +{ + int lockfd = -1; + int ret = -1; + + if (!monitor) + return 0; + + if (virResctrlMonitorDeterminePath(monitor, machinename) < 0) + return -1; + + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + return -1; + + ret = virResctrlCreateGroupPath(monitor->path); + + virResctrlUnlock(lockfd); + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 52d283a..76e40a2 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -201,4 +201,8 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor, int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid); + +int +virResctrlMonitorCreate(virResctrlMonitorPtr monitor, + const char *machinename); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 24 ++++++++++++++++++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 29 insertions(+)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月05日 23:04, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interface for creating the resource monitoring group according to '@virResctrlMonitor->path'.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 24 ++++++++++++++++++++++++ src/util/virresctrl.h | 4 ++++ 3 files changed, 29 insertions(+)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang

Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 5 +++++ src/util/virresctrl.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 14 ++++++++++++++ 3 files changed, 66 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a878083..d59ac86 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,7 +2683,12 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetID; virResctrlMonitorNew; +virResctrlMonitorRemove; +virResctrlMonitorSetAlloc; +virResctrlMonitorSetID; + # util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 9f42065..b0205bc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2566,3 +2566,50 @@ virResctrlMonitorCreate(virResctrlMonitorPtr monitor, virResctrlUnlock(lockfd); return ret; } + + +int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl monitor 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(monitor->id, id); +} + + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor) +{ + return monitor->id; +} + + +void +virResctrlMonitorSetAlloc(virResctrlMonitorPtr monitor, + virResctrlAllocPtr alloc) +{ + monitor->alloc = virObjectRef(alloc); +} + + +int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; + + if (!monitor->path) + return 0; + + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); + if (rmdir(monitor->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno); + } + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 76e40a2..804d6f5 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -205,4 +205,18 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, int virResctrlMonitorCreate(virResctrlMonitorPtr monitor, const char *machinename); + +int +virResctrlMonitorSetID(virResctrlMonitorPtr monitor, + const char *id); + +const char * +virResctrlMonitorGetID(virResctrlMonitorPtr monitor); + +void +virResctrlMonitorSetAlloc(virResctrlMonitorPtr monitor, + virResctrlAllocPtr alloc); + +int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 5 +++++ src/util/virresctrl.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 14 ++++++++++++++ 3 files changed, 66 insertions(+)
[...]
+int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; + + if (!monitor->path) + return 0;
Similar to patch1 - if we are using a default path, then we don't want to removed even if it exists, so I *think* (but you need to confirm for me) that the following should be done: if (STREQ(monitor->path, monitor->alloc->path)) return 0; Although I wonder if a !monitor->alloc guard should be used as well. Whether it's part of a || !monitor->path return 0 or this check should be "if (monitor->alloc && STREQ(...))"... Thoughts?
+ + VIR_DEBUG("Removing resctrl monitor%s", monitor->path);
s/monitor%s/monitor path='%s'
+ if (rmdir(monitor->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno); + } + + return ret; +}
I can make the changes - just let me know your preferred way to proceed... Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 2018年11月05日 23:10, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add interfaces monitor group to support operations such as add PID, set ID, remove group ... etc.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 5 +++++ src/util/virresctrl.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 14 ++++++++++++++ 3 files changed, 66 insertions(+)
[...]
+int +virResctrlMonitorRemove(virResctrlMonitorPtr monitor) +{ + int ret = 0; + + if (!monitor->path) + return 0; Similar to patch1 - if we are using a default path, then we don't want to removed even if it exists, so I *think* (but you need to confirm for me) that the following should be done:
if (STREQ(monitor->path, monitor->alloc->path)) return 0;
Although I wonder if a !monitor->alloc guard should be used as well. Whether it's part of a || !monitor->path return 0 or this check should be "if (monitor->alloc && STREQ(...))"... Thoughts?
You are right, I should do the similar things that have done for removal of allocation. otherwise the allocation directory will be removed in removing monitor group. I did not find these, because the removal of allocation is just after removal of all monitor groups. @monitor->alloc and @monitor->path should not be NULL here, so following changes would be fine. if (STREQ(monitor->path, monitor->alloc->path)) return 0; Thanks
+ + VIR_DEBUG("Removing resctrl monitor%s", monitor->path); s/monitor%s/monitor path='%s'
OK. Thanks.
+ if (rmdir(monitor->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno); + } + + return ret; +} I can make the changes - just let me know your preferred way to proceed...
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for the review. Huaqiang
[...]

Introduced virDomainResctrlNew to do the most part of virDomainResctrlAppend and move the operation of appending resctrl to @def->resctrls out of function. Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 60 +++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 25 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e8e0adc..39bd396 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,26 +18920,22 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, } -static int -virDomainResctrlAppend(virDomainDefPtr def, - xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, - unsigned int flags) +static virDomainResctrlDefPtr +virDomainResctrlNew(xmlNodePtr node, + virResctrlAllocPtr *alloc, + virBitmapPtr *vcpus, + unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - int ret = -1; - - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; + virDomainResctrlDefPtr resctrl = NULL; + virDomainResctrlDefPtr ret = NULL; /* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(*vcpus); if (!vcpus_str) - goto cleanup; + return NULL; if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) alloc_id = virXMLPropString(node, "id"); @@ -18954,18 +18950,23 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; } - if (virResctrlAllocSetID(alloc, alloc_id) < 0) + if (virResctrlAllocSetID(*alloc, alloc_id) < 0) goto cleanup; - tmp_resctrl->vcpus = vcpus; - tmp_resctrl->alloc = alloc; + if (VIR_ALLOC(resctrl) < 0) + goto cleanup; - if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) + if (!(resctrl->vcpus = virBitmapNewCopy(*vcpus))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("failed to copy 'vcpus'")); goto cleanup; + } - ret = 0; + resctrl->alloc = virObjectRef(*alloc); + + VIR_STEAL_PTR(ret, resctrl); cleanup: - virDomainResctrlDefFree(tmp_resctrl); + virDomainResctrlDefFree(resctrl); VIR_FREE(alloc_id); VIR_FREE(vcpus_str); return ret; @@ -18982,6 +18983,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; ssize_t i = 0; int n; int ret = -1; @@ -19025,14 +19027,17 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } - if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(node, &alloc, &vcpus, flags); + if (!resctrl) + goto cleanup; + + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) goto cleanup; - vcpus = NULL; - alloc = NULL; ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes); @@ -19190,6 +19195,8 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; + virDomainResctrlDefPtr resctrl = NULL; + ssize_t i = 0; int n; int ret = -1; @@ -19234,15 +19241,18 @@ virDomainMemorytuneDefParse(virDomainDefPtr def, * just update the existing alloc information, which is done in above * virDomainMemorytuneDefParseMemory */ if (new_alloc) { - if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + resctrl = virDomainResctrlNew(node, &alloc, &vcpus, flags); + if (!resctrl) + goto cleanup; + + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) goto cleanup; - vcpus = NULL; - alloc = NULL; } ret = 0; cleanup: ctxt->node = oldnode; + virDomainResctrlDefFree(resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(nodes); -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Introduced virDomainResctrlNew to do the most part of virDomainResctrlAppend and move the operation of appending resctrl to @def->resctrls out of function.
Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 60 +++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 25 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e8e0adc..39bd396 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,26 +18920,22 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, }
-static int -virDomainResctrlAppend(virDomainDefPtr def, - xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, - unsigned int flags) +static virDomainResctrlDefPtr +virDomainResctrlNew(xmlNodePtr node, + virResctrlAllocPtr *alloc, + virBitmapPtr *vcpus,
Because we're not "stealing" @*alloc and/or @*vcpus, they do not need to be passed by reference. I can change these. There's some minor merge impact in later patches too, but no big deal.
+ unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - int ret = -1; - - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; + virDomainResctrlDefPtr resctrl = NULL; + virDomainResctrlDefPtr ret = NULL;
/* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(*vcpus); if (!vcpus_str) - goto cleanup; + return NULL;
if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) alloc_id = virXMLPropString(node, "id"); @@ -18954,18 +18950,23 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; }
/* NB: Callers assume new @alloc, need to fill in ID now */ Not that it would prevent someone in the future from passing an @alloc w/ ->id already filled in and overwriting, but at least for now it's not the case. With the changes (that I can make), Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 2018年11月06日 01:26, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Introduced virDomainResctrlNew to do the most part of virDomainResctrlAppend and move the operation of appending resctrl to @def->resctrls out of function.
Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 60 +++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 25 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e8e0adc..39bd396 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,26 +18920,22 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, }
-static int -virDomainResctrlAppend(virDomainDefPtr def, - xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, - unsigned int flags) +static virDomainResctrlDefPtr +virDomainResctrlNew(xmlNodePtr node, + virResctrlAllocPtr *alloc, + virBitmapPtr *vcpus, Because we're not "stealing" @*alloc and/or @*vcpus, they do not need to be passed by reference. I can change these. There's some minor merge impact in later patches too, but no big deal.
Agree. Please help make change.
+ unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - int ret = -1; - - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; + virDomainResctrlDefPtr resctrl = NULL; + virDomainResctrlDefPtr ret = NULL;
/* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(*vcpus); if (!vcpus_str) - goto cleanup; + return NULL;
if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) alloc_id = virXMLPropString(node, "id"); @@ -18954,18 +18950,23 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; }
/* NB: Callers assume new @alloc, need to fill in ID now */
Not that it would prevent someone in the future from passing an @alloc w/ ->id already filled in and overwriting, but at least for now it's not the case.
Yes, it might happen. If @alloc->id is specified through XML and is not following the naming convention virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) If you think it is necessary we might need to through a warning for this case.
With the changes (that I can make),
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang
[...]

On 11/6/18 4:51 AM, Huaqiang,Wang wrote:
On 2018年11月06日 01:26, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Introduced virDomainResctrlNew to do the most part of virDomainResctrlAppend and move the operation of appending resctrl to @def->resctrls out of function.
Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 60 +++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 25 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e8e0adc..39bd396 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,26 +18920,22 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, } -static int -virDomainResctrlAppend(virDomainDefPtr def, - xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, - unsigned int flags) +static virDomainResctrlDefPtr +virDomainResctrlNew(xmlNodePtr node, + virResctrlAllocPtr *alloc, + virBitmapPtr *vcpus, Because we're not "stealing" @*alloc and/or @*vcpus, they do not need to be passed by reference. I can change these. There's some minor merge impact in later patches too, but no big deal.
Agree. Please help make change.
+ unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - int ret = -1; - - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; + virDomainResctrlDefPtr resctrl = NULL; + virDomainResctrlDefPtr ret = NULL; /* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(*vcpus); if (!vcpus_str) - goto cleanup; + return NULL; if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) alloc_id = virXMLPropString(node, "id"); @@ -18954,18 +18950,23 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; } /* NB: Callers assume new @alloc, need to fill in ID now */
Not that it would prevent someone in the future from passing an @alloc w/ ->id already filled in and overwriting, but at least for now it's not the case.
Yes, it might happen. If @alloc->id is specified through XML and is not following the naming convention virAsprintf(&alloc_id, "vcpus_%s", vcpus_str)
If you think it is necessary we might need to through a warning for this case.
Let's see - virDomainResctrlNew has two callers: 1. virDomainCachetuneDefParse In this case, we "know" we have a new/empty @alloc because if virDomainResctrlVcpuMatch found one, then there'd be a failure. The virDomainCachetuneDefParseCache calls don't seem to fill in alloc->id, but virDomainResctrlNew will for the first time. 2. virDomainMemorytuneDefParse The virDomainResctrlVcpuMatch may find a preexisting @alloc, but @new_alloc is set to true. The virDomainMemorytuneDefParseMemory won't fill alloc->id. Then only if @new_alloc do we call virDomainResctrlNew So I think both are safe "for now". If you want I could modify the virResctrlAllocSetID code to : if (alloc->id) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Attempt to overwrite alloc->id='%s' with id='%s'"), alloc->id, id); return -1; } Let me know. John
With the changes (that I can make),
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang
[...]

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Wednesday, November 7, 2018 12:15 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; bing.niu@intel.com; Ding, Jian- feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [PATCHv7 10/18] conf: Remove virDomainResctrlAppend and introduce virDomainResctrlNew
On 11/6/18 4:51 AM, Huaqiang,Wang wrote:
On 2018年11月06日 01:26, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Introduced virDomainResctrlNew to do the most part of virDomainResctrlAppend and move the operation of appending resctrl to @def->resctrls out of function.
Rather than rely on virDomainResctrlAppend to perform the allocation, move the onus to the caller and make use of virBitmapNewCopy for @vcpus and virObjectRef for @alloc, thus removing the need to set each to NULL after the call.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 60 +++++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 25 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e8e0adc..39bd396 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18920,26 +18920,22 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, } -static int -virDomainResctrlAppend(virDomainDefPtr def, - xmlNodePtr node, - virResctrlAllocPtr alloc, - virBitmapPtr vcpus, - unsigned int flags) +static virDomainResctrlDefPtr +virDomainResctrlNew(xmlNodePtr node, + virResctrlAllocPtr *alloc, + virBitmapPtr *vcpus, Because we're not "stealing" @*alloc and/or @*vcpus, they do not need to be passed by reference. I can change these. There's some minor merge impact in later patches too, but no big deal.
Agree. Please help make change.
+ unsigned int flags) { char *vcpus_str = NULL; char *alloc_id = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - int ret = -1; - - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; + virDomainResctrlDefPtr resctrl = NULL; + virDomainResctrlDefPtr ret = NULL; /* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ - vcpus_str = virBitmapFormat(vcpus); + vcpus_str = virBitmapFormat(*vcpus); if (!vcpus_str) - goto cleanup; + return NULL; if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) alloc_id = virXMLPropString(node, "id"); @@ -18954,18 +18950,23 @@ virDomainResctrlAppend(virDomainDefPtr def, goto cleanup; }
/* NB: Callers assume new @alloc, need to fill in ID now */
Not that it would prevent someone in the future from passing an @alloc w/ ->id already filled in and overwriting, but at least for now it's not the case.
Yes, it might happen. If @alloc->id is specified through XML and is not following the naming convention virAsprintf(&alloc_id, "vcpus_%s", vcpus_str)
If you think it is necessary we might need to through a warning for this case.
Let's see - virDomainResctrlNew has two callers:
1. virDomainCachetuneDefParse
In this case, we "know" we have a new/empty @alloc because if virDomainResctrlVcpuMatch found one, then there'd be a failure.
The virDomainCachetuneDefParseCache calls don't seem to fill in alloc->id, but virDomainResctrlNew will for the first time.
2. virDomainMemorytuneDefParse
The virDomainResctrlVcpuMatch may find a preexisting @alloc, but @new_alloc is set to true. The virDomainMemorytuneDefParseMemory won't fill alloc->id. Then only if @new_alloc do we call virDomainResctrlNew
So I think both are safe "for now".
Yes. Agree. Thanks for the analysis.
If you want I could modify the virResctrlAllocSetID code to :
if (alloc->id) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Attempt to overwrite alloc->id='%s' with id='%s'"), alloc->id, id); return -1; }
Let me know.
virResctrlMonitorSetID should also do similar patch, right? Then the body of two functions, virRresctrlAllocSetID and virResctrlMonitorSetID, are very similar. I will introduce two patches, the first patch will refactor virRresctrlAllocSetID and the second patch will reuse the refactor for virResctrlMonitorSetID. I know you have a solution solving this, my code is just for your reference.
John
Thanks for review. Huaqiang
With the changes (that I can make),
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for review. Huaqiang
[...]

This refactor allows to add some code between virDomainResctrlNew and virResctrlAllocIsEmpty to extend the scope of resctrl. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 39bd396..a068d4d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19022,15 +19022,15 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } + resctrl = virDomainResctrlNew(node, &alloc, &vcpus, flags); + if (!resctrl) + goto cleanup; + if (virResctrlAllocIsEmpty(alloc)) { ret = 0; goto cleanup; } - resctrl = virDomainResctrlNew(node, &alloc, &vcpus, flags); - if (!resctrl) - goto cleanup; - if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) goto cleanup; -- 2.7.4

$SUBJ: s/litter/little Perhaps better said: conf: Alter order of Cachetune virDomainResctrlNew call On 10/22/18 4:01 AM, Wang Huaqiang wrote:
This refactor allows to add some code between virDomainResctrlNew and virResctrlAllocIsEmpty to extend the scope of resctrl.
this then becomes: Refactor the logic to handle subsequent generation of a resource monitor which would effect whether a non default cache exists.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月06日 01:26, John Ferlan wrote:
$SUBJ:
s/litter/little
Perhaps better said:
conf: Alter order of Cachetune virDomainResctrlNew call
This refactor allows to add some code between virDomainResctrlNew and virResctrlAllocIsEmpty to extend the scope of resctrl.
On 10/22/18 4:01 AM, Wang Huaqiang wrote: this then becomes:
Refactor the logic to handle subsequent generation of a resource monitor which would effect whether a non default cache exists.
Thanks for wording.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for your review. Huaqiang

Introducing <monitor> element under <cachetune> to represent a cache monitor. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 234 ++++++++++++++++++++- src/conf/domain_conf.h | 11 + tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 322 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating + allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there + are no vCPU overlap permitted. + </dd> + </dl> + </dd> </dl> </dd> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -981,6 +981,16 @@ </optional> </element> </zeroOrMore> + <zeroOrMore> + <element name="monitor"> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + </element> + </zeroOrMore> </element> </zeroOrMore> <zeroOrMore> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a068d4d..01f795f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,31 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) static void +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) +{ + if (!domresmon) + return; + + virBitmapFree(domresmon->vcpus); + virObjectUnref(domresmon->instance); + VIR_FREE(domresmon); +} + + +static void virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { + size_t i = 0; + if (!resctrl) return; + for (i = 0; i < resctrl->nmonitors; i++) + virDomainResctrlMonDefFree(resctrl->monitors[i]); + virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl->monitors); VIR_FREE(resctrl); } @@ -18920,6 +18938,177 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, } +/* Checking if the monitor's vcpus is conflicted with existing allocation + * and monitors. + * + * Returns 1 if @vcpus equals to @resctrl->vcpus, then the monitor will + * share the underlying resctrl group with @resctrl->alloc. Returns - 1 + * if any conflict found. Returns 0 if no conflict and @vcpus is not equal + * to @resctrl->vcpus. + */ +static int +virDomainResctrlMonValidateVcpus(virDomainResctrlDefPtr resctrl, + virBitmapPtr vcpus) +{ + size_t i = 0; + int vcpu = -1; + size_t mons_same_alloc_vcpus = 0; + + if (virBitmapIsAllClear(vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("vcpus is empty")); + return -1; + } + + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) { + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with allocation")); + return -1; + } + } + + if (virBitmapEqual(vcpus, resctrl->vcpus)) + return 1; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus)) { + mons_same_alloc_vcpus++; + continue; + } + + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Monitor vcpus conflicts with monitors")); + + return -1; + } + } + + if (mons_same_alloc_vcpus > 1) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Too many monitors have the same vcpu as allocation")); + return -1; + } + + return 0; +} + + +#define VIR_DOMAIN_RESCTRL_MONITOR_CACHELEVEL 3 + +static int +virDomainResctrlMonDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlMonitorType tag, + virDomainResctrlDefPtr resctrl) +{ + virDomainResctrlMonDefPtr domresmon = NULL; + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + unsigned int level = 0; + char *tmp = NULL; + char *id = NULL; + size_t i = 0; + int n = 0; + int rv = -1; + int ret = -1; + + ctxt->node = node; + + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract monitor nodes")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + if (VIR_ALLOC(domresmon) < 0) + goto cleanup; + + domresmon->tag = tag; + + domresmon->instance = virResctrlMonitorNew(); + if (!domresmon->instance) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not create monitor")); + goto cleanup; + } + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + tmp = virXMLPropString(nodes[i], "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing monitor attribute 'level'")); + goto cleanup; + } + + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + + if (level != VIR_DOMAIN_RESCTRL_MONITOR_CACHELEVEL) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid monitor cache level '%d'"), + level); + goto cleanup; + } + + VIR_FREE(tmp); + } + + if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0) + goto cleanup; + + rv = virDomainResctrlMonValidateVcpus(resctrl, domresmon->vcpus); + if (rv < 0) + goto cleanup; + + /* If monitor's vcpu list is identical to the vcpu list of the + * associated allocation, set monitor's id to the same value + * as the allocation. */ + if (rv == 1) { + const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); + + if (VIR_STRDUP(id, alloc_id) < 0) + goto cleanup; + } else { + if (!(tmp = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + if (virAsprintf(&id, "vcpus_%s", tmp) < 0) + goto cleanup; + } + + virResctrlMonitorSetAlloc(domresmon->instance, resctrl->alloc); + + if (virResctrlMonitorSetID(domresmon->instance, id) < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(resctrl->monitors, + resctrl->nmonitors, + domresmon) < 0) + goto cleanup; + + VIR_FREE(id); + VIR_FREE(tmp); + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(id); + VIR_FREE(tmp); + VIR_FREE(nodes); + virDomainResctrlMonDefFree(domresmon); + return ret; +} + + static virDomainResctrlDefPtr virDomainResctrlNew(xmlNodePtr node, virResctrlAllocPtr *alloc, @@ -19026,7 +19215,14 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!resctrl) goto cleanup; - if (virResctrlAllocIsEmpty(alloc)) { + if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + /* If no <cache> element or <monitor> element in <cachetune>, do not + * append any resctrl element */ + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { ret = 0; goto cleanup; } @@ -27063,12 +27259,41 @@ virDomainCachetuneDefFormatHelper(unsigned int level, static int +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon, + virResctrlMonitorType tag, + virBufferPtr buf) +{ + char *vcpus = NULL; + + if (domresmon->tag != tag) + return 0; + + virBufferAddLit(buf, "<monitor "); + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + virBufferAsprintf(buf, "level='%u' ", + VIR_DOMAIN_RESCTRL_MONITOR_CACHELEVEL); + } + + vcpus = virBitmapFormat(domresmon->vcpus); + if (!vcpus) + return -1; + + virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus); + + VIR_FREE(vcpus); + return 0; +} + + +static int virDomainCachetuneDefFormat(virBufferPtr buf, virDomainResctrlDefPtr resctrl, unsigned int flags) { virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; char *vcpus = NULL; + size_t i = 0; int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); @@ -27077,6 +27302,13 @@ virDomainCachetuneDefFormat(virBufferPtr buf, &childrenBuf) < 0) goto cleanup; + for (i = 0; i < resctrl->nmonitors; i ++) { + if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i], + VIR_RESCTRL_MONITOR_TYPE_CACHE, + &childrenBuf) < 0) + goto cleanup; + } + if (virBufferCheckError(&childrenBuf) < 0) goto cleanup; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e30a4b2..60f6464 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2236,12 +2236,23 @@ struct _virDomainCputune { }; +typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef; +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; +struct _virDomainResctrlMonDef { + virBitmapPtr vcpus; + virResctrlMonitorType tag; + virResctrlMonitorPtr instance; +}; + typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr; struct _virDomainResctrlDef { virBitmapPtr vcpus; virResctrlAllocPtr alloc; + + virDomainResctrlMonDefPtr *monitors; + size_t nmonitors; }; diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml index 9718f06..9f4c139 100644 --- a/tests/genericxml2xmlindata/cachetune-cdp.xml +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -8,9 +8,12 @@ <cachetune vcpus='0-1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='2'> <cache id='1' level='3' type='code' size='6' unit='MiB'/> + <monitor level='3' vcpus='2'/> </cachetune> <cachetune vcpus='3'> <cache id='1' level='3' type='data' size='6912' unit='KiB'/> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml new file mode 100644 index 0000000..d481fb5 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='2'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml index ab2d9cf..748be08 100644 --- a/tests/genericxml2xmlindata/cachetune-small.xml +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -7,6 +7,13 @@ <cputune> <cachetune vcpus='0-1'> <cache id='0' level='3' type='both' size='768' unit='KiB'/> + <monitor level='3' vcpus='0'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-1'/> + </cachetune> + <cachetune vcpus='2-3'> + <monitor level='3' vcpus='2'/> + <monitor level='3' vcpus='3'/> </cachetune> </cputune> <os> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index fa941f0..4393d44 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -137,6 +137,8 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-monitor", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("memorytune"); DO_TEST_FULL("memorytune-colliding-allocs", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 234 ++++++++++++++++++++- src/conf/domain_conf.h | 11 + tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 322 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating
the associated
+ allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there
associated
+ are no vCPU overlap permitted.
For non-default monitors, overlapping vCPUs are not permitted.
+ </dd> + </dl> + </dd> </dl> </dd>
[...]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a068d4d..01f795f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,31 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
[...]
@@ -19026,7 +19215,14 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!resctrl) goto cleanup;
- if (virResctrlAllocIsEmpty(alloc)) { + if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + /* If no <cache> element or <monitor> element in <cachetune>, do not + * append any resctrl element */ + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
Swap order since *IsEmpty is more compute intensive, also change to: if (resctrl->nmonitors == 0 &&
ret = 0; goto cleanup; } @@ -27063,12 +27259,41 @@ virDomainCachetuneDefFormatHelper(unsigned int level,
static int +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon, + virResctrlMonitorType tag, + virBufferPtr buf) +{ + char *vcpus = NULL; + + if (domresmon->tag != tag) + return 0; + + virBufferAddLit(buf, "<monitor "); + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + virBufferAsprintf(buf, "level='%u' ", + VIR_DOMAIN_RESCTRL_MONITOR_CACHELEVEL); + }
So is this because when <memtune> is introduced it won't use a cache level, right? Just trying to recall (and perhaps add a comment).
+ + vcpus = virBitmapFormat(domresmon->vcpus); + if (!vcpus) + return -1; + + virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus); + + VIR_FREE(vcpus); + return 0; +} + +
[...] I can fix the minor nits, just ack them... Reviewed-by: John Ferlan <jferlan@redhat.com> John

On 2018年11月06日 01:26, John Ferlan wrote:
Introducing <monitor> element under <cachetune> to represent a cache monitor.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- docs/formatdomain.html.in | 26 +++ docs/schemas/domaincommon.rng | 10 + src/conf/domain_conf.c | 234 ++++++++++++++++++++- src/conf/domain_conf.h | 11 + tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 8 files changed, 322 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index b1651e3..2fd665c 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,12 @@ <cachetune vcpus='0-3'> <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> + <monitor level='3' vcpus='1'/> + <monitor level='3' vcpus='0-3'/> + </cachetune> + <cachetune vcpus='4-5'> + <monitor level='3' vcpus='4'/> + <monitor level='3' vcpus='5'/> </cachetune> <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@ </dd> </dl> </dd> + <dt><code>monitor</code></dt> + <dd> + The optional element <code>monitor</code> creates the cache + monitor(s) for current cache allocation and has the following + required attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level the monitor belongs to. + </dd> + <dt><code>vcpus</code></dt> + <dd> + vCPU list the monitor applies to. A monitor's vCPU list + can only be the member(s) of the vCPU list of associating
On 10/22/18 4:01 AM, Wang Huaqiang wrote: the associated
Thanks.
+ allocation. The default monitor has the same vCPU list as the + associating allocation. For non-default monitors, there associated
Thanks.
+ are no vCPU overlap permitted. For non-default monitors, overlapping vCPUs are not permitted.
Thanks.
+ </dd> + </dl> + </dd> </dl> </dd>
[...]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a068d4d..01f795f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2955,13 +2955,31 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
[...]
@@ -19026,7 +19215,14 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!resctrl) goto cleanup;
- if (virResctrlAllocIsEmpty(alloc)) { + if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_CACHE, + resctrl) < 0) + goto cleanup; + + /* If no <cache> element or <monitor> element in <cachetune>, do not + * append any resctrl element */ + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) { Swap order since *IsEmpty is more compute intensive, also change to:
if (resctrl->nmonitors == 0 &&
Agree.
ret = 0; goto cleanup; } @@ -27063,12 +27259,41 @@ virDomainCachetuneDefFormatHelper(unsigned int level,
static int +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr domresmon, + virResctrlMonitorType tag, + virBufferPtr buf) +{ + char *vcpus = NULL; + + if (domresmon->tag != tag) + return 0; + + virBufferAddLit(buf, "<monitor "); + + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + virBufferAsprintf(buf, "level='%u' ", + VIR_DOMAIN_RESCTRL_MONITOR_CACHELEVEL); + }
So is this because when <memtune> is introduced it won't use a cache level, right? Just trying to recall (and perhaps add a comment).
Yes. 'level' is just for cachetune monitor. Planned to add the comment when extending this function to support memtune monitor, mentioning the memtune monitor ( tag == VIR_RESCTRL_MONITOR_TYPE_BW) here will cause confusing.
+ + vcpus = virBitmapFormat(domresmon->vcpus); + if (!vcpus) + return -1; + + virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus); + + VIR_FREE(vcpus); + return 0; +} + + [...]
I can fix the minor nits, just ack them...
Thanks please help fix.
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
Thanks for fixing and the review, I will make changes in my code. Huaqiang

Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e9c7618..fba4fb4 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2611,10 +2611,21 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, return -1; for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; if (virResctrlAllocCreate(caps->host.resctrl, vm->def->resctrls[i]->alloc, priv->machineName) < 0) goto cleanup; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorCreate(mon->instance, + priv->machineName) < 0) + goto cleanup; + + } } ret = 0; @@ -5430,6 +5441,7 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, { pid_t vcpupid = qemuDomainGetVcpuPid(vm, vcpuid); virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(vm->def, vcpuid); + virDomainResctrlMonDefPtr mon = NULL; size_t i = 0; if (qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, @@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1; for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i]; if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + /* The order of invoking virResctrlMonitorAddPID matters, it is + * required to invoke this function first for monitor that has + * the same vcpus setting as the allocation in same def->resctrl. + * Otherwise, some other monitor's pid may be removed from its + * resource group's 'tasks' file.*/ + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (!virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + break; + } + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + } break; } } @@ -7207,8 +7250,18 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->nresctrls; i++) + for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + virResctrlMonitorRemove(mon->instance); + } + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); + } qemuProcessRemoveDomainStatus(driver, vm); @@ -7939,9 +7992,20 @@ qemuProcessReconnect(void *opaque) goto error; for (i = 0; i < obj->def->nresctrls; i++) { + size_t j = 0; + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; + + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = obj->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorDeterminePath(mon->instance, + priv->machineName) < 0) + goto error; + } } /* update domain state XML with possibly updated state in virDomainObj */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e9c7618..fba4fb4 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c
[...]
@@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) {> if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + /* The order of invoking virResctrlMonitorAddPID matters, it is + * required to invoke this function first for monitor that has + * the same vcpus setting as the allocation in same def->resctrl. + * Otherwise, some other monitor's pid may be removed from its + * resource group's 'tasks' file.*/ + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
s/vm->def->resctrls[i]/ct/ (above and below)
+ mon = vm->def->resctrls[i]->monitors[j]; + + if (!virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + break;
It seems this break should be inside the IsBitSet, right (as is for the ct->alloc, vcpupid match)? Otherwise, we run the loop just once and not run until we find our vcpuid in mon->vcpus
+ } +
The next loop is duplicitous and can be removed, right? with some adjustments (which I can make as described), Reviewed-by: John Ferlan <jferlan@redhat.com> John [...]

On 11/6/2018 2:01 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e9c7618..fba4fb4 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c [...]
@@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) {> if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + /* The order of invoking virResctrlMonitorAddPID matters, it is + * required to invoke this function first for monitor that has + * the same vcpus setting as the allocation in same def->resctrl. + * Otherwise, some other monitor's pid may be removed from its + * resource group's 'tasks' file.*/ + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { s/vm->def->resctrls[i]/ct/ (above and below)
Yes. Will make such kind of substitution for all cases.
+ mon = vm->def->resctrls[i]->monitors[j]; + + if (!virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + break; It seems this break should be inside the IsBitSet, right (as is for the ct->alloc, vcpupid match)? Otherwise, we run the loop just once and not run until we find our vcpuid in mon->vcpus
Yes. You are right.
+ } + The next loop is duplicitous and can be removed, right?
In my original code logic, which is I need a @monitor->pids array to track all pids belonging to @monitor, these two loops are necessary. The first loop ignores all non-default monitors and adds the default monitor's PIDs to its 'tasks' file if default monitor exists. The second loop adds non-default monitors' PIDs. This order is intentionally designed because file writing for default monitor's 'tasks' file will remove PIDs that existing in other monitor's 'tasks' file. But I have taken your suggestion that the @monitor->pids is meaningless and removed this array, then, these two loops are not needed any more, only the second loop is kept. Along with the review comments you made the code would be: @@ -5440,11 +5451,26 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1; for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i]; if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + for (j = 0; j < ct->nmonitors; j++) { + mon = ct->monitors[j]; + + if (virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + break; + } + } + break; } }
with some adjustments (which I can make as described),
Reviewed-by: John Ferlan<jferlan@redhat.com>
John
Thanks for review. Huaqiang
[...]

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-)
[...]
@@ -5430,6 +5441,7 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, { pid_t vcpupid = qemuDomainGetVcpuPid(vm, vcpuid); virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(vm->def, vcpuid); + virDomainResctrlMonDefPtr mon = NULL; size_t i = 0;
if (qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, @@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + /* The order of invoking virResctrlMonitorAddPID matters, it is + * required to invoke this function first for monitor that has + * the same vcpus setting as the allocation in same def->resctrl. + * Otherwise, some other monitor's pid may be removed from its + * resource group's 'tasks' file.*/ + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (!virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + break; + } + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + } break; } }
As I'm reading the subsequent patch, I'm wondering about the order of the patches, what happens on the vcpu hotunplug case, and are we doing the "correct thing" on the reconnect path. 1. with respect to order - it probably doesn't really matter, but I guess adding another list to manage in the next patch got my attention and thinking well shouldn't the above code for MonitorAddPID be "ready" and not changed afterwards. 2. Since qemuProcessSetupVcpu is called from qemuDomainHotplugAddVcpu that means we probably need to handle qemuDomainHotplugDelVcpu processing. Of course, when Martin added the resctrl support in commit 9a2fc2db8, the corollary wasn't handled. So the *tasks file could have stale pids in it if vcpus were unplugged since there's nothing (obvious to me) that removes the pid from the file. Furthermore a stale pid in the grand scheme of pid reusage could become not stale and what happens if a pid is found - do we end up in some error path... 3. In the reconnect logic - it would seem that we would be starting from scratch again, right? So would the *tasks file even be valid since vcpus could be added/deleted and we didn't get notified... So perhaps we need some logic in the reconnect code to reinitialize the tasks file (IOW: delete it since we're going to recreate it - I assume). Perhaps more questions now vis-a-vis any sort of ACK/push for patches starting here. At least 1-12 are in decent shape. John
@@ -7207,8 +7250,18 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->nresctrls; i++) + for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + virResctrlMonitorRemove(mon->instance); + } + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); + }
qemuProcessRemoveDomainStatus(driver, vm);
@@ -7939,9 +7992,20 @@ qemuProcessReconnect(void *opaque) goto error;
for (i = 0; i < obj->def->nresctrls; i++) { + size_t j = 0; + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; + + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = obj->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorDeterminePath(mon->instance, + priv->machineName) < 0) + goto error; + } }
/* update domain state XML with possibly updated state in virDomainObj */

On 11/6/2018 2:44 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Add functions for creating, destroying, reconnecting resctrl monitor in qemu according to the configuration in domain XML.
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-)
[...]
@@ -5430,6 +5441,7 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, { pid_t vcpupid = qemuDomainGetVcpuPid(vm, vcpuid); virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(vm->def, vcpuid); + virDomainResctrlMonDefPtr mon = NULL; size_t i = 0;
if (qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, @@ -5440,11 +5452,42 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, return -1;
for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; virDomainResctrlDefPtr ct = vm->def->resctrls[i];
if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) return -1; + + /* The order of invoking virResctrlMonitorAddPID matters, it is + * required to invoke this function first for monitor that has + * the same vcpus setting as the allocation in same def->resctrl. + * Otherwise, some other monitor's pid may be removed from its + * resource group's 'tasks' file.*/ + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (!virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + break; + } + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + mon = vm->def->resctrls[i]->monitors[j]; + + if (virBitmapEqual(ct->vcpus, mon->vcpus)) + continue; + + if (virBitmapIsBitSet(mon->vcpus, vcpuid)) { + if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0) + return -1; + } + } break; } } As I'm reading the subsequent patch, I'm wondering about the order of the patches, what happens on the vcpu hotunplug case, and are we doing the "correct thing" on the reconnect path.
1. with respect to order - it probably doesn't really matter, but I guess adding another list to manage in the next patch got my attention and thinking well shouldn't the above code for MonitorAddPID be "ready" and not changed afterwards.
Then I need to merge the next patch, patch 14, with some previous patch, at least for the 'MonitorAddPID' function, to avoid a second changing of a same function. Got, thanks. But, in my next series of patches, patch 14 is removed with all logic in it. So the patch order seems rational then.
2. Since qemuProcessSetupVcpu is called from qemuDomainHotplugAddVcpu that means we probably need to handle qemuDomainHotplugDelVcpu processing. Of course, when Martin added the resctrl support in commit 9a2fc2db8, the corollary wasn't handled.
So the *tasks file could have stale pids in it if vcpus were unplugged since there's nothing (obvious to me) that removes the pid from the file. Furthermore a stale pid in the grand scheme of pid reusage could become not stale and what happens if a pid is found - do we end up in some error path...
The *tasks file is not a file kept in a block device, its content, the PIDS, could be removed if the process/task is terminated or PIDs have been added to some other resctrl group. It seems the *tasks file will not be stale in scenario of vcpu hot-plug. For vcpu unplugging case, qemuDomainHotplugDelVcpu removes and terminates the vcpu process, right? Then the *tasks file will automatically be updated by removing the process's PID. The removal of PID from *tasks is done by resctrl file system. So there is no hot-plug issue for resctrl in terms of tracking PID in *tasks.
3. In the reconnect logic - it would seem that we would be starting from scratch again, right?
In my understanding reconnect means the restart of libvirtd and the information re-building for VMs. Originally I believed that vcpupid would be changed during a reconnect, that is the reason I add the function for validating monitor in patch14. But the vcpupid has not been changed in process of reconnect. So I decide to remove patch14.
So would the *tasks file even be valid since vcpus could be added/deleted and we didn't get notified...
Do you still believe "vcpus could be added/deleted and we didn't get notified' in reconnect"? If so, can you list more details? And any operation for adding/deleting VM vcpus out of libvirt is not permitted right?
So perhaps we need some logic in the reconnect code to reinitialize the tasks file (IOW: delete it since we're going to recreate it - I assume).
*tasks file is still there with the PIDs in it. Seems it is not needed to recreate it. So I removed my last patch, in that patch I invoked qemuProcessSetupVcpus in process of re-connection.
Perhaps more questions now vis-a-vis any sort of ACK/push for patches starting here. At least 1-12 are in decent shape.
John
Thanks for review. Huaqiang
@@ -7207,8 +7250,18 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->nresctrls; i++) + for (i = 0; i < vm->def->nresctrls; i++) { + size_t j = 0; + + for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = vm->def->resctrls[i]->monitors[j]; + virResctrlMonitorRemove(mon->instance); + } + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); + }
qemuProcessRemoveDomainStatus(driver, vm);
@@ -7939,9 +7992,20 @@ qemuProcessReconnect(void *opaque) goto error;
for (i = 0; i < obj->def->nresctrls; i++) { + size_t j = 0; + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; + + for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) { + virDomainResctrlMonDefPtr mon = NULL; + + mon = obj->def->resctrls[i]->monitors[j]; + if (virResctrlMonitorDeterminePath(mon->instance, + priv->machineName) < 0) + goto error; + } }
/* update domain state XML with possibly updated state in virDomainObj */

Check whether monitor is running by checking the monitor's PIDs status. Monitor is looked as running normally if the vcpu PID list matches with the content of monitor's 'tasks' file. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 3 ++ 3 files changed, 105 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d59ac86..91801ed 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2684,6 +2684,7 @@ virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorGetID; +virResctrlMonitorIsRunning; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetAlloc; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b0205bc..fa3e6e9 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -359,6 +359,9 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Tracking the tasks' PID associated with this monitor */ + pid_t *pids; + size_t npids; }; @@ -418,6 +421,7 @@ virResctrlMonitorDispose(void *obj) virObjectUnref(monitor->alloc); VIR_FREE(monitor->id); VIR_FREE(monitor->path); + VIR_FREE(monitor->pids); } @@ -2540,7 +2544,20 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { - return virResctrlAddPID(monitor->path, pid); + size_t i = 0; + + if (virResctrlAddPID(monitor->path, pid) < 0) + return -1; + + for (i = 0; i < monitor->npids; i++) { + if (pid == monitor->pids[i]) + return 0; + } + + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) + return -1; + + return 0; } @@ -2613,3 +2630,86 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor) return ret; } + + +static int +virResctrlPIDSorter(const void *pida, const void *pidb) +{ + return *(pid_t*)pida - *(pid_t*)pidb; +} + + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) +{ + char *pidstr = NULL; + char **spids = NULL; + size_t nspids = 0; + pid_t *pids = NULL; + size_t npids = 0; + size_t i = 0; + int rv = -1; + bool ret = false; + + /* path is not determined yet, monitor is not running*/ + if (!monitor->path) + return false; + + /* No vcpu PID filled, regard monitor as not running */ + if (monitor->npids == 0) + return false; + + /* If no 'tasks' file found, underlying resctrl directory is not + * ready, regard monitor as not running */ + rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path); + if (rv < 0) + goto cleanup; + + /* no PID in task file, monitor is not running */ + if (!*pidstr) + goto cleanup; + + /* The tracking monitor PID list is not identical to the + * list in current resctrl directory. monitor is corrupted, + * report error and un-running state */ + spids = virStringSplitCount(pidstr, "\n", 0, &nspids); + if (nspids != monitor->npids) { + VIR_ERROR(_("Monitor %s PID list mismatch in length"), monitor->path); + goto cleanup; + } + + for (i = 0; i < nspids; i++) { + unsigned int val = 0; + pid_t pid = 0; + + if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Monitor %s failed in parse PID list"), + monitor->path); + goto cleanup; + } + + pid = (pid_t)val; + + if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0) + goto cleanup; + } + + qsort(pids, npids, sizeof(pid_t), virResctrlPIDSorter); + qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDSorter); + + for (i = 0; i < monitor->npids; i++) { + if (monitor->pids[i] != pids[i]) { + VIR_ERROR(_("Monitor %s PID list corrupted"), monitor->path); + goto cleanup; + } + } + + ret = true; + cleanup: + virStringListFree(spids); + VIR_FREE(pids); + VIR_FREE(pidstr); + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 804d6f5..8d8fdc2 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -219,4 +219,7 @@ virResctrlMonitorSetAlloc(virResctrlMonitorPtr monitor, int virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Check whether monitor is running by checking the monitor's PIDs status.
Monitor is looked as running normally if the vcpu PID list matches with the content of monitor's 'tasks' file.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 3 ++ 3 files changed, 105 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d59ac86..91801ed 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2684,6 +2684,7 @@ virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorGetID; +virResctrlMonitorIsRunning; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetAlloc; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b0205bc..fa3e6e9 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -359,6 +359,9 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Tracking the tasks' PID associated with this monitor */ + pid_t *pids; + size_t npids; };
@@ -418,6 +421,7 @@ virResctrlMonitorDispose(void *obj) virObjectUnref(monitor->alloc); VIR_FREE(monitor->id); VIR_FREE(monitor->path); + VIR_FREE(monitor->pids); }
@@ -2540,7 +2544,20 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { - return virResctrlAddPID(monitor->path, pid); + size_t i = 0; + + if (virResctrlAddPID(monitor->path, pid) < 0) + return -1; + + for (i = 0; i < monitor->npids; i++) { + if (pid == monitor->pids[i]) + return 0; + } + + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) + return -1; + + return 0;
could just be a return VIR_APPEND_ELEMENT() So we theoretically have our @pid in the task file (ResctrlAddPID) and this internal list of pids which makes me wonder why other than "quicker" lookup than opening the tasks file and looking for our pid, right? But based on the next hunk for virResctrlMonitorIsRunning, I'm not so sure. In fact I wonder why are we doing this?
}
@@ -2613,3 +2630,86 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor)
return ret; } + + +static int +virResctrlPIDSorter(const void *pida, const void *pidb) +{ + return *(pid_t*)pida - *(pid_t*)pidb; +} + + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) +{ + char *pidstr = NULL; + char **spids = NULL; + size_t nspids = 0; + pid_t *pids = NULL; + size_t npids = 0; + size_t i = 0; + int rv = -1; + bool ret = false; +
So this is a heavyweight type method and seems to me to do far more than just determine if a monitor is running. In fact, it seems it's validating that the internal list of monitor->pids matches whatever is in the *tasks file. There's multiple failure scenarios that may or may not "mean" a monitor is or isn't running.
+ /* path is not determined yet, monitor is not running*/ + if (!monitor->path) + return false; + + /* No vcpu PID filled, regard monitor as not running */ + if (monitor->npids == 0) + return false; + + /* If no 'tasks' file found, underlying resctrl directory is not + * ready, regard monitor as not running */ + rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path); + if (rv < 0) + goto cleanup;
So we read the file, but while we're reading someone could have added to it... There's some locking concerns here. The *tasks file can have a pid added by a <cache> and the same pid added by a <monitor> it seems at least from my reading of qemuProcessSetupVcpu logic where virResctrlAllocAddPID would be called first followed by a call to virResctrlMonitorAddPID. Neither checks if the pid exists before writing (so yes more questions/concerns about patch 13 logic).
+ + /* no PID in task file, monitor is not running */ + if (!*pidstr) + goto cleanup; + + /* The tracking monitor PID list is not identical to the + * list in current resctrl directory. monitor is corrupted, + * report error and un-running state */ + spids = virStringSplitCount(pidstr, "\n", 0, &nspids); + if (nspids != monitor->npids) { + VIR_ERROR(_("Monitor %s PID list mismatch in length"), monitor->path); + goto cleanup; + }
See this doesn't make sense - why have a lookaside list then? Either you trust what's in your file or you don't. Having boolean logic return true/false where false can either be an error generated or a truly false value just doesn't seem right.
+ + for (i = 0; i < nspids; i++) { + unsigned int val = 0; + pid_t pid = 0; + + if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Monitor %s failed in parse PID list"), + monitor->path); + goto cleanup; + } + + pid = (pid_t)val; + + if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0) + goto cleanup; + }
So we're building this list and could have problems doing so. I'm not sure this is "good" and why we need to do this. Either keep an internal list of pids or keep the list of pids in a file. John
+ + qsort(pids, npids, sizeof(pid_t), virResctrlPIDSorter); + qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDSorter); + + for (i = 0; i < monitor->npids; i++) { + if (monitor->pids[i] != pids[i]) { + VIR_ERROR(_("Monitor %s PID list corrupted"), monitor->path); + goto cleanup; + } + } + + ret = true; + cleanup: + virStringListFree(spids); + VIR_FREE(pids); + VIR_FREE(pidstr); + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 804d6f5..8d8fdc2 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -219,4 +219,7 @@ virResctrlMonitorSetAlloc(virResctrlMonitorPtr monitor,
int virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */

This patch is added previously because I though the vcpupid would be changed during libvirt re-connection, and if vcpupid is changed and libvirt is not aware of this change it will make monitor working on an stale *tasks file and monitor will get wrong cache information. But the vcpuid will not change in process of libvirt re-connection, the *tasks file is always tacking on right PIDs. So this patch is not necessary now, will be removed in next update of patch series. Thanks for review. Huaqiang On 11/6/2018 3:07 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Check whether monitor is running by checking the monitor's PIDs status.
Monitor is looked as running normally if the vcpu PID list matches with the content of monitor's 'tasks' file.
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 3 ++ 3 files changed, 105 insertions(+), 1 deletion(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d59ac86..91801ed 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2684,6 +2684,7 @@ virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; virResctrlMonitorGetID; +virResctrlMonitorIsRunning; virResctrlMonitorNew; virResctrlMonitorRemove; virResctrlMonitorSetAlloc; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b0205bc..fa3e6e9 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -359,6 +359,9 @@ struct _virResctrlMonitor { /* libvirt-generated path in /sys/fs/resctrl for this particular * monitor */ char *path; + /* Tracking the tasks' PID associated with this monitor */ + pid_t *pids; + size_t npids; };
@@ -418,6 +421,7 @@ virResctrlMonitorDispose(void *obj) virObjectUnref(monitor->alloc); VIR_FREE(monitor->id); VIR_FREE(monitor->path); + VIR_FREE(monitor->pids); }
@@ -2540,7 +2544,20 @@ int virResctrlMonitorAddPID(virResctrlMonitorPtr monitor, pid_t pid) { - return virResctrlAddPID(monitor->path, pid); + size_t i = 0; + + if (virResctrlAddPID(monitor->path, pid) < 0) + return -1; + + for (i = 0; i < monitor->npids; i++) { + if (pid == monitor->pids[i]) + return 0; + } + + if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0) + return -1; + + return 0; could just be a return VIR_APPEND_ELEMENT()
So we theoretically have our @pid in the task file (ResctrlAddPID) and this internal list of pids which makes me wonder why other than "quicker" lookup than opening the tasks file and looking for our pid, right?
But based on the next hunk for virResctrlMonitorIsRunning, I'm not so sure. In fact I wonder why are we doing this?
}
@@ -2613,3 +2630,86 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor)
return ret; } + + +static int +virResctrlPIDSorter(const void *pida, const void *pidb) +{ + return *(pid_t*)pida - *(pid_t*)pidb; +} + + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) +{ + char *pidstr = NULL; + char **spids = NULL; + size_t nspids = 0; + pid_t *pids = NULL; + size_t npids = 0; + size_t i = 0; + int rv = -1; + bool ret = false; + So this is a heavyweight type method and seems to me to do far more than just determine if a monitor is running. In fact, it seems it's validating that the internal list of monitor->pids matches whatever is in the *tasks file. There's multiple failure scenarios that may or may not "mean" a monitor is or isn't running.
+ /* path is not determined yet, monitor is not running*/ + if (!monitor->path) + return false; + + /* No vcpu PID filled, regard monitor as not running */ + if (monitor->npids == 0) + return false; + + /* If no 'tasks' file found, underlying resctrl directory is not + * ready, regard monitor as not running */ + rv = virFileReadValueString(&pidstr, "%s/tasks", monitor->path); + if (rv < 0) + goto cleanup; So we read the file, but while we're reading someone could have added to it... There's some locking concerns here.
The *tasks file can have a pid added by a <cache> and the same pid added by a <monitor> it seems at least from my reading of qemuProcessSetupVcpu logic where virResctrlAllocAddPID would be called first followed by a call to virResctrlMonitorAddPID. Neither checks if the pid exists before writing (so yes more questions/concerns about patch 13 logic).
+ + /* no PID in task file, monitor is not running */ + if (!*pidstr) + goto cleanup; + + /* The tracking monitor PID list is not identical to the + * list in current resctrl directory. monitor is corrupted, + * report error and un-running state */ + spids = virStringSplitCount(pidstr, "\n", 0, &nspids); + if (nspids != monitor->npids) { + VIR_ERROR(_("Monitor %s PID list mismatch in length"), monitor->path); + goto cleanup; + } See this doesn't make sense - why have a lookaside list then? Either you trust what's in your file or you don't.
Having boolean logic return true/false where false can either be an error generated or a truly false value just doesn't seem right.
+ + for (i = 0; i < nspids; i++) { + unsigned int val = 0; + pid_t pid = 0; + + if (virStrToLong_uip(spids[i], NULL, 0, &val) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Monitor %s failed in parse PID list"), + monitor->path); + goto cleanup; + } + + pid = (pid_t)val; + + if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0) + goto cleanup; + } So we're building this list and could have problems doing so.
I'm not sure this is "good" and why we need to do this. Either keep an internal list of pids or keep the list of pids in a file.
John
+ + qsort(pids, npids, sizeof(pid_t), virResctrlPIDSorter); + qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDSorter); + + for (i = 0; i < monitor->npids; i++) { + if (monitor->pids[i] != pids[i]) { + VIR_ERROR(_("Monitor %s PID list corrupted"), monitor->path); + goto cleanup; + } + } + + ret = true; + cleanup: + virStringListFree(spids); + VIR_FREE(pids); + VIR_FREE(pidstr); + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 804d6f5..8d8fdc2 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -219,4 +219,7 @@ virResctrlMonitorSetAlloc(virResctrlMonitorPtr monitor,
int virResctrlMonitorRemove(virResctrlMonitorPtr monitor); + +bool +virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); #endif /* __VIR_RESCTRL_H__ */

Adding element 'id' to virDomainResctrlDef tracking resource group id, it reflects the attribute 'id' of of element <cachetune> in XML. virResctrlAlloc.id is a copy from virDomanResctrlDef.id. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 20 ++++++++------------ src/conf/domain_conf.h | 1 + 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 01f795f..1aecd8c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2980,6 +2980,7 @@ virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); VIR_FREE(resctrl->monitors); + VIR_FREE(resctrl->id); VIR_FREE(resctrl); } @@ -19145,6 +19146,9 @@ virDomainResctrlNew(xmlNodePtr node, if (VIR_ALLOC(resctrl) < 0) goto cleanup; + if (VIR_STRDUP(resctrl->id, alloc_id) < 0) + goto cleanup; + if (!(resctrl->vcpus = virBitmapNewCopy(*vcpus))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to copy 'vcpus'")); @@ -27323,13 +27327,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf, virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus); - if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id); - virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n"); virBufferAddBuffer(buf, &childrenBuf); @@ -27386,13 +27386,9 @@ virDomainMemorytuneDefFormat(virBufferPtr buf, virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus); - if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id); - virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n"); virBufferAddBuffer(buf, &childrenBuf); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 60f6464..e190aa2 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2248,6 +2248,7 @@ typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr; struct _virDomainResctrlDef { + char *id; virBitmapPtr vcpus; virResctrlAllocPtr alloc; -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Adding element 'id' to virDomainResctrlDef tracking resource group id, it reflects the attribute 'id' of of element <cachetune> in XML.
virResctrlAlloc.id is a copy from virDomanResctrlDef.id.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 20 ++++++++------------ src/conf/domain_conf.h | 1 + 2 files changed, 9 insertions(+), 12 deletions(-)
Not sure I see the need to duplicate the alloc->id value especially since it's "invalid" have "resctrl->alloc == NULL", right? Can this resctrl->id ever be different? Not that I can see. I see this is a desire to reduce an error case in Format processing, but if virResctrlAllocGetID returned NULL there's other problems... John
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 01f795f..1aecd8c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2980,6 +2980,7 @@ virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); VIR_FREE(resctrl->monitors); + VIR_FREE(resctrl->id); VIR_FREE(resctrl); }
@@ -19145,6 +19146,9 @@ virDomainResctrlNew(xmlNodePtr node, if (VIR_ALLOC(resctrl) < 0) goto cleanup;
+ if (VIR_STRDUP(resctrl->id, alloc_id) < 0) + goto cleanup; + if (!(resctrl->vcpus = virBitmapNewCopy(*vcpus))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to copy 'vcpus'")); @@ -27323,13 +27327,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus);
- if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id);
- virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n");
virBufferAddBuffer(buf, &childrenBuf); @@ -27386,13 +27386,9 @@ virDomainMemorytuneDefFormat(virBufferPtr buf,
virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus);
- if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id);
- virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n");
virBufferAddBuffer(buf, &childrenBuf); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 60f6464..e190aa2 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2248,6 +2248,7 @@ typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr;
struct _virDomainResctrlDef { + char *id; virBitmapPtr vcpus; virResctrlAllocPtr alloc;

On 11/6/2018 3:15 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Adding element 'id' to virDomainResctrlDef tracking resource group id, it reflects the attribute 'id' of of element <cachetune> in XML.
virResctrlAlloc.id is a copy from virDomanResctrlDef.id.
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/conf/domain_conf.c | 20 ++++++++------------ src/conf/domain_conf.h | 1 + 2 files changed, 9 insertions(+), 12 deletions(-)
Not sure I see the need to duplicate the alloc->id value especially since it's "invalid" have "resctrl->alloc == NULL", right?
Can this resctrl->id ever be different? Not that I can see. I see this is a desire to reduce an error case in Format processing, but if virResctrlAllocGetID returned NULL there's other problems...
John
This patch should be removed and will be removed. It is introduced in series v4, in that series the 'resctrl->alloc' is allowed to be NULL representing a default monitor. Now we changed our design and resctrl->alloc is always assigned with an virResctrlAlloc data structure, then 'resctrl->id' is not necessary. We can keep the resctrl ID in alloc->id. Thanks for review. Huaqiang
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 01f795f..1aecd8c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2980,6 +2980,7 @@ virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) virObjectUnref(resctrl->alloc); virBitmapFree(resctrl->vcpus); VIR_FREE(resctrl->monitors); + VIR_FREE(resctrl->id); VIR_FREE(resctrl); }
@@ -19145,6 +19146,9 @@ virDomainResctrlNew(xmlNodePtr node, if (VIR_ALLOC(resctrl) < 0) goto cleanup;
+ if (VIR_STRDUP(resctrl->id, alloc_id) < 0) + goto cleanup; + if (!(resctrl->vcpus = virBitmapNewCopy(*vcpus))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to copy 'vcpus'")); @@ -27323,13 +27327,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus);
- if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id);
- virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n");
virBufferAddBuffer(buf, &childrenBuf); @@ -27386,13 +27386,9 @@ virDomainMemorytuneDefFormat(virBufferPtr buf,
virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus);
- if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); - if (!alloc_id) - goto cleanup; + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) + virBufferAsprintf(buf, " id='%s'", resctrl->id);
- virBufferAsprintf(buf, " id='%s'", alloc_id); - } virBufferAddLit(buf, ">\n");
virBufferAddBuffer(buf, &childrenBuf); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 60f6464..e190aa2 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2248,6 +2248,7 @@ typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef virDomainResctrlDef *virDomainResctrlDefPtr;
struct _virDomainResctrlDef { + char *id; virBitmapPtr vcpus; virResctrlAllocPtr alloc;

Refactoring qemuDomainGetStatsCpu, make it possible to add more CPU statistics. Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_driver.c | 45 ++++++++++++++++++++++----------------------- 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a52e249..12a5f8f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -19711,30 +19711,29 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, unsigned long long sys_time = 0; int err = 0; - if (!priv->cgroup) - return 0; - - err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.time", - cpu_time) < 0) - return -1; + if (priv->cgroup) { + err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.time", + cpu_time) < 0) + return -1; - err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.user", - user_time) < 0) - return -1; - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.system", - sys_time) < 0) - return -1; + err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.user", + user_time) < 0) + return -1; + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.system", + sys_time) < 0) + return -1; + } return 0; } -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactoring qemuDomainGetStatsCpu, make it possible to add more CPU statistics.
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_driver.c | 45 ++++++++++++++++++++++----------------------- 1 file changed, 22 insertions(+), 23 deletions(-)
Maybe instead of inverting the logic, let's create a cgroup helper... qemuDomainGetStatsCpuCgroup(dom, record, *maxparams) { logic as is now } static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, virDomainStatsRecordPtr record, int *maxparams, unsigned int privflags ATTRIBUTE_UNUSED) { return qemuDomainGetStatsCpuCgroup(dom, record, maxparams); } Then the subsequent patch would alter the above check and add the new call. John
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a52e249..12a5f8f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -19711,30 +19711,29 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, unsigned long long sys_time = 0; int err = 0;
- if (!priv->cgroup) - return 0; - - err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.time", - cpu_time) < 0) - return -1; + if (priv->cgroup) { + err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.time", + cpu_time) < 0) + return -1;
- err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.user", - user_time) < 0) - return -1; - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.system", - sys_time) < 0) - return -1; + err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.user", + user_time) < 0) + return -1; + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.system", + sys_time) < 0) + return -1; + }
return 0; }

On 11/6/2018 3:27 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Refactoring qemuDomainGetStatsCpu, make it possible to add more CPU statistics.
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/qemu/qemu_driver.c | 45 ++++++++++++++++++++++----------------------- 1 file changed, 22 insertions(+), 23 deletions(-)
Maybe instead of inverting the logic, let's create a cgroup helper...
qemuDomainGetStatsCpuCgroup(dom, record, *maxparams) { logic as is now }
static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, virDomainStatsRecordPtr record, int *maxparams, unsigned int privflags ATTRIBUTE_UNUSED) { return qemuDomainGetStatsCpuCgroup(dom, record, maxparams); }
Then the subsequent patch would alter the above check and add the new call.
John
Thanks for advice. Will be done in next series. Huaqiang
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a52e249..12a5f8f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -19711,30 +19711,29 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, unsigned long long sys_time = 0; int err = 0;
- if (!priv->cgroup) - return 0; - - err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.time", - cpu_time) < 0) - return -1; + if (priv->cgroup) { + err = virCgroupGetCpuacctUsage(priv->cgroup, &cpu_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.time", + cpu_time) < 0) + return -1;
- err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.user", - user_time) < 0) - return -1; - if (!err && virTypedParamsAddULLong(&record->params, - &record->nparams, - maxparams, - "cpu.system", - sys_time) < 0) - return -1; + err = virCgroupGetCpuacctStat(priv->cgroup, &user_time, &sys_time); + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.user", + user_time) < 0) + return -1; + if (!err && virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "cpu.system", + sys_time) < 0) + return -1; + }
return 0; }

Adding the interface in qemu to report CMT statistic information through command 'virsh domstats --cpu-total'. Below is a typical output: # virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176 Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt-domain.c | 9 ++ src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.c | 130 +++++++++++++++++++++++++++ src/util/virresctrl.h | 12 +++ 5 files changed, 381 insertions(+) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 7690339..4895f9f 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long. * "cpu.system" - system cpu time spent in nanoseconds as unsigned long * long. + * "cpu.cache.monitor.count" - tocal cache monitoring groups + * "cpu.cache.monitor.M.name" - name of cache monitoring group 'M' + * "cpu.cache.monitor.M.vcpus" - vcpus for cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.count" - total bank number of cache monitoring + * group 'M' + * "cpu.cache.monitor.M.bank.N.id" - OS assigned cache bank id for cache + * 'N' in cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.N.bytes" - monitor's cache occupancy of cache + * bank 'N' in cache monitoring group 'M' * * VIR_DOMAIN_STATS_BALLOON: * Return memory balloon device information. diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 91801ed..4551767 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,6 +2683,7 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; virResctrlMonitorIsRunning; virResctrlMonitorNew; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 12a5f8f..9828118 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -102,6 +102,7 @@ #include "virnuma.h" #include "dirname.h" #include "netdev_bandwidth_conf.h" +#include "c-ctype.h" #define VIR_FROM_THIS VIR_FROM_QEMU @@ -19698,6 +19699,230 @@ typedef enum { #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB) +/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid + * outputs and represent different vcpu set. But it is not easy to + * differentiate these two vcpu set formats at first glance. + * + * This function could be used to clear this ambiguity, it substitutes all '-' + * with ',' while generates semantically correct vcpu set. + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */ +static char * +qemuDomainVcpuFormatHelper(const char *vcpus) +{ + size_t i = 0; + int last = 0; + int start = 0; + char * tmp = NULL; + bool firstnum = true; + const char *cur = vcpus; + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *ret = NULL; + + if (virStringIsEmpty(cur)) + return NULL; + + while (*cur != '\0') { + if (!c_isdigit(*cur)) + goto cleanup; + + if (virStrToLong_i(cur, &tmp, 10, &start) < 0) + goto cleanup; + if (start < 0) + goto cleanup; + + cur = tmp; + + virSkipSpaces(&cur); + + if (*cur == ',' || *cur == 0) { + if (!firstnum) + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%d", start); + firstnum = false; + } else if (*cur == '-') { + cur++; + virSkipSpaces(&cur); + + if (virStrToLong_i(cur, &tmp, 10, &last) < 0) + goto cleanup; + + if (last < start) + goto cleanup; + cur = tmp; + + for (i = start; i <= last; i++) { + if (!firstnum) + + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%ld", i); + firstnum = 0; + } + + virSkipSpaces(&cur); + } + + if (*cur == ',') { + cur++; + virSkipSpaces(&cur); + } else if (*cur == 0) { + break; + } else { + goto cleanup; + } + } + + ret = virBufferContentAndReset(&buf); + cleanup: + virBufferFreeAndReset(&buf); + return ret; +} + + +static int +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED, + virResctrlMonitorType restag) +{ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; + virDomainResctrlMonDefPtr domresmon = NULL; + virResctrlMonitorStatsPtr stats = NULL; + size_t nstats = 0; + virDomainResctrlDefPtr resctrl = NULL; + unsigned int nmonitors = 0; + unsigned int imonitor = 0; + const char *restype = NULL; + char *rawvcpus = NULL; + char *vcpus = NULL; + size_t i = 0; + size_t j = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + restype = "cache"; + } else { + VIR_DEBUG("Invalid CPU resource type"); + return -1; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + domresmon = resctrl->monitors[j]; + if (virResctrlMonitorIsRunning(domresmon->instance) && + domresmon->tag == restag) + nmonitors++; + } + } + + if (nmonitors) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.count", restype); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nmonitors) < 0) + goto cleanup; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + size_t l = 0; + virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance; + const char *id = virResctrlMonitorGetID(monitor); + + if (!id) + goto cleanup; + + domresmon = resctrl->monitors[j]; + + if (!virResctrlMonitorIsRunning(domresmon->instance)) + continue; + + if (!(rawvcpus = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + vcpus = qemuDomainVcpuFormatHelper(rawvcpus); + if (!vcpus) + goto cleanup; + + if (virResctrlMonitorGetCacheOccupancy(monitor, &stats, + &nstats) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.name", restype, imonitor); + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + id) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.vcpus", restype, imonitor); + + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + vcpus) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.count", restype, imonitor); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nstats) < 0) + goto cleanup; + + for (l = 0; l < nstats; l++) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.id", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].id) < 0) + goto cleanup; + + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.bytes", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].val) < 0) + goto cleanup; + } + + VIR_FREE(stats); + VIR_FREE(vcpus); + imonitor++; + } + } + + ret = 0; + cleanup: + VIR_FREE(vcpus); + return ret; +} + + static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, @@ -19735,6 +19960,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, return -1; } + if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags, + VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0) + return -1; + return 0; } diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fa3e6e9..02e7095 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2713,3 +2713,133 @@ virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) return ret; } + + +static int +virResctrlMonitorStatsSorter(const void *a, + const void *b) +{ + return ((virResctrlMonitorStatsPtr)a)->id + - ((virResctrlMonitorStatsPtr)b)->id; +} + + +/* + * virResctrlMonitorGetStats + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occupancy' for cache resource. + * "mbm_total_bytes" and "mbm_local_bytes" for memory bandwidth resource. + * @stats: Array of virResctrlMonitorStatsPtr for holding cache or memory + * bandwidth usage data. + * @nstats: A size_t pointer to hold the returned array length of @stats + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ +static int +virResctrlMonitorGetStats(virResctrlMonitorPtr monitor, + const char *resource, + virResctrlMonitorStatsPtr *stats, + size_t *nstats) +{ + int rv = -1; + int ret = -1; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + virResctrlMonitorStatsPtr stat = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *nstats = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *node_id = NULL; + + if (VIR_ALLOC(stat) < 0) + goto cleanup; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_L3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + /* Looking for directory has a prefix 'mon_L' */ + if (!(node_id = STRSKIP(ent->d_name, "mon_L"))) + continue; + + /* Looking for directory has another '_' */ + node_id = strchr(node_id, '_'); + if (!node_id) + continue; + + /* Skip the character '_' */ + if (!(node_id = STRSKIP(node_id, "_"))) + continue; + + /* The node ID number should be here, parsing it. */ + if (virStrToLong_uip(node_id, NULL, 0, &stat->id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&stat->val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*stats, *nstats, *stat) < 0) + goto cleanup; + } + + /* Sort in id's ascending order */ + qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter); + + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_FREE(stat); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* + * virResctrlMonitorGetCacheOccupancy + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @caches: Array of virResctrlMonitorStatsPtr for receiving cache occupancy + * data. Caller is responsible to free this array. + * @ncaches: A size_t pointer to hold the returned array length of @caches + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches) +{ + return virResctrlMonitorGetStats(monitor, "llc_occupancy", + caches, ncaches); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 8d8fdc2..004c40e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,6 +191,13 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr; +typedef struct _virResctrlMonitorStats virResctrlMonitorStats; +typedef virResctrlMonitorStats *virResctrlMonitorStatsPtr; +struct _virResctrlMonitorStats { + unsigned int id; + unsigned int val; +}; + virResctrlMonitorPtr virResctrlMonitorNew(void); @@ -222,4 +229,9 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor); bool virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches); #endif /* __VIR_RESCTRL_H__ */ -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Adding the interface in qemu to report CMT statistic information through command 'virsh domstats --cpu-total'.
Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/libvirt-domain.c | 9 ++ src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.c | 130 +++++++++++++++++++++++++++ src/util/virresctrl.h | 12 +++ 5 files changed, 381 insertions(+)
I have a feeling I'll be asking for this to be split up a bit, but let's see... There's *util, *qemu, and *API code in the same patch.
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 7690339..4895f9f 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long. * "cpu.system" - system cpu time spent in nanoseconds as unsigned long * long. + * "cpu.cache.monitor.count" - tocal cache monitoring groups + * "cpu.cache.monitor.M.name" - name of cache monitoring group 'M' + * "cpu.cache.monitor.M.vcpus" - vcpus for cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.count" - total bank number of cache monitoring + * group 'M' + * "cpu.cache.monitor.M.bank.N.id" - OS assigned cache bank id for cache + * 'N' in cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.N.bytes" - monitor's cache occupancy of cache + * bank 'N' in cache monitoring group 'M' * * VIR_DOMAIN_STATS_BALLOON: * Return memory balloon device information. diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 91801ed..4551767 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,6 +2683,7 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; virResctrlMonitorIsRunning; virResctrlMonitorNew; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 12a5f8f..9828118 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -102,6 +102,7 @@ #include "virnuma.h" #include "dirname.h" #include "netdev_bandwidth_conf.h" +#include "c-ctype.h"
Ahh.. this one's a red flag... Says to me that something should move to a util function...
#define VIR_FROM_THIS VIR_FROM_QEMU
@@ -19698,6 +19699,230 @@ typedef enum { #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB)
+/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid + * outputs and represent different vcpu set. But it is not easy to + * differentiate these two vcpu set formats at first glance. + * + * This function could be used to clear this ambiguity, it substitutes all '-' + * with ',' while generates semantically correct vcpu set. + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */ +static char * +qemuDomainVcpuFormatHelper(const char *vcpus) +{ + size_t i = 0; + int last = 0; + int start = 0; + char * tmp = NULL; + bool firstnum = true; + const char *cur = vcpus; + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *ret = NULL; + + if (virStringIsEmpty(cur)) + return NULL; + + while (*cur != '\0') { + if (!c_isdigit(*cur))
Explains the #include, but in the long run, I don't think this method is worth the effort since all you're doing is printing the format in the output. Is there other libvirt generated output for cpu stats that does a similar conversion? If so, share code, if not drop this. No need to rearrange the range someone else entered and we've succesfully parsed in some manner. I think the output should look like the XML output unless there's some compelling reason to change it which I don't see.
From above the:
cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6
would then be:
cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4-6
right? I don't even want to think about someone who does "7,4-6"...
+ goto cleanup; + + if (virStrToLong_i(cur, &tmp, 10, &start) < 0) + goto cleanup; + if (start < 0) + goto cleanup; + + cur = tmp; + + virSkipSpaces(&cur); + + if (*cur == ',' || *cur == 0) { + if (!firstnum) + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%d", start); + firstnum = false; + } else if (*cur == '-') { + cur++; + virSkipSpaces(&cur); + + if (virStrToLong_i(cur, &tmp, 10, &last) < 0) + goto cleanup; + + if (last < start) + goto cleanup; + cur = tmp; + + for (i = start; i <= last; i++) { + if (!firstnum) + + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%ld", i); + firstnum = 0; + } + + virSkipSpaces(&cur); + } + + if (*cur == ',') { + cur++; + virSkipSpaces(&cur); + } else if (*cur == 0) { + break; + } else { + goto cleanup; + } + } + + ret = virBufferContentAndReset(&buf); + cleanup: + virBufferFreeAndReset(&buf); + return ret; +} + + +static int +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED, + virResctrlMonitorType restag)
Why bother passing ATTRIBUTE_UNUSED ?
+{ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; + virDomainResctrlMonDefPtr domresmon = NULL; + virResctrlMonitorStatsPtr stats = NULL; + size_t nstats = 0;
since these are local to the: + for (j = 0; j < resctrl->nmonitors; j++) { loop and there's local's in that for loop, maybe these two should move there too...
+ virDomainResctrlDefPtr resctrl = NULL; + unsigned int nmonitors = 0; + unsigned int imonitor = 0; + const char *restype = NULL; + char *rawvcpus = NULL; + char *vcpus = NULL; + size_t i = 0; + size_t j = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + restype = "cache"; + } else { + VIR_DEBUG("Invalid CPU resource type");
VIR_DEBUG? virReportError would seem to be more appropriate. As I've stated before there will be an error for some reason generated.
+ return -1; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + domresmon = resctrl->monitors[j]; + if (virResctrlMonitorIsRunning(domresmon->instance) && + domresmon->tag == restag)
Knowing how heavy wait *IsRunning is, the order of checking should be reversed at the very least.
+ nmonitors++; + } + } + + if (nmonitors) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.count", restype); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nmonitors) < 0) + goto cleanup; + }
All that above to ensure nmonitors == resctrl->nmonitors??? I think there's a lot of unnecessary stuff.
+ + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + size_t l = 0; + virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance; + const char *id = virResctrlMonitorGetID(monitor); + + if (!id) + goto cleanup;
I would think you have other problems in this case. Besides there's no error message associated with this.
+ + domresmon = resctrl->monitors[j]; + + if (!virResctrlMonitorIsRunning(domresmon->instance)) + continue;
Oh my one call for each iteration, ouch - highly inefficient.
+ + if (!(rawvcpus = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + vcpus = qemuDomainVcpuFormatHelper(rawvcpus); + if (!vcpus) + goto cleanup; + + if (virResctrlMonitorGetCacheOccupancy(monitor, &stats, + &nstats) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.name", restype, imonitor); + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + id) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.vcpus", restype, imonitor); + + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + vcpus) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.count", restype, imonitor); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nstats) < 0) + goto cleanup; + + for (l = 0; l < nstats; l++) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.id", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].id) < 0) + goto cleanup; + + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.bytes", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].val) < 0) + goto cleanup; + } + + VIR_FREE(stats);
nstats = 0; too!
+ VIR_FREE(vcpus); + imonitor++; + } + } + + ret = 0; + cleanup: + VIR_FREE(vcpus); + return ret; +} + + static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, @@ -19735,6 +19960,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, return -1; }
+ if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags, + VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0) + return -1; + return 0; }
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fa3e6e9..02e7095 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2713,3 +2713,133 @@ virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor)
return ret; } + + +static int +virResctrlMonitorStatsSorter(const void *a, + const void *b) +{ + return ((virResctrlMonitorStatsPtr)a)->id + - ((virResctrlMonitorStatsPtr)b)->id; +} + + +/* + * virResctrlMonitorGetStats + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occupancy' for cache resource. + * "mbm_total_bytes" and "mbm_local_bytes" for memory bandwidth resource. + * @stats: Array of virResctrlMonitorStatsPtr for holding cache or memory + * bandwidth usage data. + * @nstats: A size_t pointer to hold the returned array length of @stats + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ +static int +virResctrlMonitorGetStats(virResctrlMonitorPtr monitor, + const char *resource, + virResctrlMonitorStatsPtr *stats, + size_t *nstats) +{ + int rv = -1; + int ret = -1; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + virResctrlMonitorStatsPtr stat = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *nstats = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *node_id = NULL; + + if (VIR_ALLOC(stat) < 0) + goto cleanup; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_L3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + /* Looking for directory has a prefix 'mon_L' */ + if (!(node_id = STRSKIP(ent->d_name, "mon_L"))) + continue; + + /* Looking for directory has another '_' */ + node_id = strchr(node_id, '_'); + if (!node_id) + continue; + + /* Skip the character '_' */ + if (!(node_id = STRSKIP(node_id, "_"))) + continue; + + /* The node ID number should be here, parsing it. */ + if (virStrToLong_uip(node_id, NULL, 0, &stat->id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&stat->val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*stats, *nstats, *stat) < 0) + goto cleanup; + } + + /* Sort in id's ascending order */ + qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter);
Coverity notes that *stats could be NULL - if the above while loop gets nothing...
+ + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_FREE(stat); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* + * virResctrlMonitorGetCacheOccupancy + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @caches: Array of virResctrlMonitorStatsPtr for receiving cache occupancy + * data. Caller is responsible to free this array. + * @ncaches: A size_t pointer to hold the returned array length of @caches
They're @stats and @nstats elsewhere, be consistent.
+ * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches)
They're @stats and @nstats elsewhere, be consistent.
+{ + return virResctrlMonitorGetStats(monitor, "llc_occupancy", + caches, ncaches); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 8d8fdc2..004c40e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,6 +191,13 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr;
+typedef struct _virResctrlMonitorStats virResctrlMonitorStats; +typedef virResctrlMonitorStats *virResctrlMonitorStatsPtr; +struct _virResctrlMonitorStats { + unsigned int id; + unsigned int val; +}; + virResctrlMonitorPtr virResctrlMonitorNew(void);
@@ -222,4 +229,9 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor);
bool virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches);
They're @stats and @nstats elsewhere, be consistent.
#endif /* __VIR_RESCTRL_H__ */
I'm sure there's more I'll pick up later, but you should have plenty for now. As to splitting - I think some splitting could be done, I'll leave it up to you... Lots of change across multiple areas - ask yourself - can something be separated out? John

On 11/6/2018 4:04 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Adding the interface in qemu to report CMT statistic information through command 'virsh domstats --cpu-total'.
Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176
Signed-off-by: Wang Huaqiang<huaqiang.wang@intel.com> --- src/libvirt-domain.c | 9 ++ src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.c | 130 +++++++++++++++++++++++++++ src/util/virresctrl.h | 12 +++ 5 files changed, 381 insertions(+)
I have a feeling I'll be asking for this to be split up a bit, but let's see... There's *util, *qemu, and *API code in the same patch.
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 7690339..4895f9f 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long. * "cpu.system" - system cpu time spent in nanoseconds as unsigned long * long. + * "cpu.cache.monitor.count" - tocal cache monitoring groups + * "cpu.cache.monitor.M.name" - name of cache monitoring group 'M' + * "cpu.cache.monitor.M.vcpus" - vcpus for cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.count" - total bank number of cache monitoring + * group 'M' + * "cpu.cache.monitor.M.bank.N.id" - OS assigned cache bank id for cache + * 'N' in cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.N.bytes" - monitor's cache occupancy of cache + * bank 'N' in cache monitoring group 'M' * * VIR_DOMAIN_STATS_BALLOON: * Return memory balloon device information. diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 91801ed..4551767 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,6 +2683,7 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; virResctrlMonitorIsRunning; virResctrlMonitorNew; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 12a5f8f..9828118 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -102,6 +102,7 @@ #include "virnuma.h" #include "dirname.h" #include "netdev_bandwidth_conf.h" +#include "c-ctype.h" Ahh.. this one's a red flag... Says to me that something should move to a util function...
Got. Header file will be removed (actually I planned to remove all code using it.)
#define VIR_FROM_THIS VIR_FROM_QEMU
@@ -19698,6 +19699,230 @@ typedef enum { #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB)
+/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid + * outputs and represent different vcpu set. But it is not easy to + * differentiate these two vcpu set formats at first glance. + * + * This function could be used to clear this ambiguity, it substitutes all '-' + * with ',' while generates semantically correct vcpu set. + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */ +static char * +qemuDomainVcpuFormatHelper(const char *vcpus) +{ + size_t i = 0; + int last = 0; + int start = 0; + char * tmp = NULL; + bool firstnum = true; + const char *cur = vcpus; + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *ret = NULL; + + if (virStringIsEmpty(cur)) + return NULL; + + while (*cur != '\0') { + if (!c_isdigit(*cur))
Explains the #include, but in the long run, I don't think this method is worth the effort since all you're doing is printing the format in the output. Is there other libvirt generated output for cpu stats that does a similar conversion? If so, share code, if not drop this.
No need to rearrange the range someone else entered and we've succesfully parsed in some manner. I think the output should look like the XML output unless there's some compelling reason to change it which I don't see.
From above the:
cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6
would then be:
cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4-6
right?
Yes, this heavy function is just re-formatting the output of vcpus. I also think it not worth to do this kind of reformat. Let's remove it.
I don't even want to think about someone who does "7,4-6"...
+ goto cleanup; + + if (virStrToLong_i(cur, &tmp, 10, &start) < 0) + goto cleanup; + if (start < 0) + goto cleanup; + + cur = tmp; + + virSkipSpaces(&cur); + + if (*cur == ',' || *cur == 0) { + if (!firstnum) + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%d", start); + firstnum = false; + } else if (*cur == '-') { + cur++; + virSkipSpaces(&cur); + + if (virStrToLong_i(cur, &tmp, 10, &last) < 0) + goto cleanup; + + if (last < start) + goto cleanup; + cur = tmp; + + for (i = start; i <= last; i++) { + if (!firstnum) + + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%ld", i); + firstnum = 0; + } + + virSkipSpaces(&cur); + } + + if (*cur == ',') { + cur++; + virSkipSpaces(&cur); + } else if (*cur == 0) { + break; + } else { + goto cleanup; + } + } + + ret = virBufferContentAndReset(&buf); + cleanup: + virBufferFreeAndReset(&buf); + return ret; +} + + +static int +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED, + virResctrlMonitorType restag) Why bother passing ATTRIBUTE_UNUSED ?
I just copy-pasted the function header from some other function. Will be removed since it is an internal function.
+{ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; + virDomainResctrlMonDefPtr domresmon = NULL; + virResctrlMonitorStatsPtr stats = NULL; + size_t nstats = 0; since these are local to the:
+ for (j = 0; j < resctrl->nmonitors; j++) {
loop and there's local's in that for loop, maybe these two should move there too...
If the lines with virResctrlMonitorIsRunning could be removed, then this loop is not necessary any more. virResctrlMonitorIsRunning is validating monitor through verifying PID lists, but it seems there is no need to do such kind of validation and the PIDs in *tasks file will not change without control: -. The PID in *tasks file will be removed by kernel if vcpu process is terminated, but the vcpu process will not exit except performing vcpu hot-plug. -. In process of libvirt re-connection, VM will not make any changes in terms of resctrl groups and vcpus, so the content of *tasks will not change and the layout of resctrl groups will not change. -. Any changes out of libvirt are not allowed, for example, manually creating or deleting any resctrl groups by bash commands are not allowed. I decided to remove the lines invoking virResctrlMonitorIsRunning, and the loop will be removed. Actually I will split this function into three functions: qemuDomainGetCpuResMonitorData qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor qemuDomainGetCpuResMonitorData will fetch data such as the monitor group ID, vcpus, and cache information for all monitors, any error happened in this function will reported to user through virReportError. So the error will be reported to user before dumping the final result. qemuDomainGetStatsCpuResMonitorPerTag is responsible for showing the data retrieved by a successful call of qemuDomainGetCpuResMonitorData. In this function it does not dump any error message created by virReportError which will interrupt the showing of cache monitor information. qemuDomainGetStatsCPUResMonitor is the interface called by qemuDomainGetStatsCPU. we could invoke qemuDomainGetStatsCpuResMonitorPerTag with tag VIR_RESCTRL_MONITOR_TYPE_CACHE for cache monitor. memBW monitor result will be processed by invoking this function through specify tag VIR_RESCTRL_MONITOR_TYPE_MEMBW.
+ virDomainResctrlDefPtr resctrl = NULL; + unsigned int nmonitors = 0; + unsigned int imonitor = 0; + const char *restype = NULL; + char *rawvcpus = NULL; + char *vcpus = NULL; + size_t i = 0; + size_t j = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + restype = "cache"; + } else { + VIR_DEBUG("Invalid CPU resource type"); VIR_DEBUG? virReportError would seem to be more appropriate. As I've stated before there will be an error for some reason generated.
Originally, the purpose of using VIR_DEBUG,not virReportError, to report error is to avoid showing partial monitor information with an error message on screen. I will split this function into three functions, qemuDomainGetCpuResMonitorData qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor qemuDomainGetCpuResMonitorData will get necessary data and reported any error occurred to user promptly. qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor will report the cache monitor result. This will avoid the mix of cache monitor result and error message if error occurs. Please review my new code.
+ return -1; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + domresmon = resctrl->monitors[j]; + if (virResctrlMonitorIsRunning(domresmon->instance) && + domresmon->tag == restag) Knowing how heavy wait *IsRunning is, the order of checking should be reversed at the very least.
virResctrlMonitorIsRunning will not be called any more as I stated. and the code will be changed, my new code will be more efficient.
+ nmonitors++; + } + } + + if (nmonitors) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.count", restype); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nmonitors) < 0) + goto cleanup; + } All that above to ensure nmonitors == resctrl->nmonitors??? I think there's a lot of unnecessary stuff.
This part will be removed. Here the purpose of code is to get the total 'valid' (after a successful check of virResctrlMonitorIsRunning) monitor groups. It actually performing the validation of nmonitors == resctrl->nmonitors, just as you said. Now I think there is no need to validate the monitor any more.
+ + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + size_t l = 0; + virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance; + const char *id = virResctrlMonitorGetID(monitor); + + if (!id) + goto cleanup; I would think you have other problems in this case. Besides there's no error message associated with this.
Error message should be reported. And will be reported. Please review the new code.
+ + domresmon = resctrl->monitors[j]; + + if (!virResctrlMonitorIsRunning(domresmon->instance)) + continue; Oh my one call for each iteration, ouch - highly inefficient.
+ + if (!(rawvcpus = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + vcpus = qemuDomainVcpuFormatHelper(rawvcpus); + if (!vcpus) + goto cleanup; + + if (virResctrlMonitorGetCacheOccupancy(monitor, &stats, + &nstats) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.name", restype, imonitor); + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + id) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.vcpus", restype, imonitor); + + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + vcpus) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.count", restype, imonitor); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nstats) < 0) + goto cleanup; + + for (l = 0; l < nstats; l++) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.id", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].id) < 0) + goto cleanup; + + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.bytes", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].val) < 0) + goto cleanup; + } + + VIR_FREE(stats); nstats = 0; If virResctrlMonitorGetCacheOccupancy returns without error, nstats will be assigned with
virResctrlMonitorIsRunning disappears in code and patch for this file will be redesigned. Please renew the new code. the proper bank group number. If virResctrlMonitorGetCacheOccupancy returns an error, then code will go through cleanup path and returns. Seems no need to clear nstats to 0. Any way in my new code, this nstats will not be used any more, please review the logic of new code later.
too!
+ VIR_FREE(vcpus); + imonitor++; + } + } + + ret = 0; + cleanup: + VIR_FREE(vcpus); + return ret; +} + + static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, @@ -19735,6 +19960,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, return -1; }
+ if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags, + VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0) + return -1; + return 0; }
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fa3e6e9..02e7095 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2713,3 +2713,133 @@ virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor)
return ret; } + + +static int +virResctrlMonitorStatsSorter(const void *a, + const void *b) +{ + return ((virResctrlMonitorStatsPtr)a)->id + - ((virResctrlMonitorStatsPtr)b)->id; +} + + +/* + * virResctrlMonitorGetStats + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occupancy' for cache resource. + * "mbm_total_bytes" and "mbm_local_bytes" for memory bandwidth resource. + * @stats: Array of virResctrlMonitorStatsPtr for holding cache or memory + * bandwidth usage data. + * @nstats: A size_t pointer to hold the returned array length of @stats + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ +static int +virResctrlMonitorGetStats(virResctrlMonitorPtr monitor, + const char *resource, + virResctrlMonitorStatsPtr *stats, + size_t *nstats) +{ + int rv = -1; + int ret = -1; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + virResctrlMonitorStatsPtr stat = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *nstats = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *node_id = NULL; + + if (VIR_ALLOC(stat) < 0) + goto cleanup; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_L3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + /* Looking for directory has a prefix 'mon_L' */ + if (!(node_id = STRSKIP(ent->d_name, "mon_L"))) + continue; + + /* Looking for directory has another '_' */ + node_id = strchr(node_id, '_'); + if (!node_id) + continue; + + /* Skip the character '_' */ + if (!(node_id = STRSKIP(node_id, "_"))) + continue; + + /* The node ID number should be here, parsing it. */ + if (virStrToLong_uip(node_id, NULL, 0, &stat->id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&stat->val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*stats, *nstats, *stat) < 0) + goto cleanup; + } + + /* Sort in id's ascending order */ + qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter); Coverity notes that *stats could be NULL - if the above while loop gets nothing...
Will be changed to if (*nstats) qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter);
+ + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_FREE(stat); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* + * virResctrlMonitorGetCacheOccupancy + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @caches: Array of virResctrlMonitorStatsPtr for receiving cache occupancy + * data. Caller is responsible to free this array. + * @ncaches: A size_t pointer to hold the returned array length of @caches They're @stats and @nstats elsewhere, be consistent.
Will use @stats and @nstats for virResctrlMonitorGetCacheOccupancy and later API (for memBW).
+ * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches) They're @stats and @nstats elsewhere, be consistent.
Will use @stats and @nstats for parameters.
+{ + return virResctrlMonitorGetStats(monitor, "llc_occupancy", + caches, ncaches); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 8d8fdc2..004c40e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,6 +191,13 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr;
+typedef struct _virResctrlMonitorStats virResctrlMonitorStats; +typedef virResctrlMonitorStats *virResctrlMonitorStatsPtr; +struct _virResctrlMonitorStats { + unsigned int id; + unsigned int val; +}; + virResctrlMonitorPtr virResctrlMonitorNew(void);
@@ -222,4 +229,9 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor);
bool virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches); They're @stats and @nstats elsewhere, be consistent.
Will use @stats and @nstats for parameters.
#endif /* __VIR_RESCTRL_H__ */
I'm sure there's more I'll pick up later, but you should have plenty for now. As to splitting - I think some splitting could be done, I'll leave it up to you... Lots of change across multiple areas - ask yourself - can something be separated out?
The virresctrl.* part could be separated, and how about merging this part with previous patch that adding more monitor interfaces in virresctrl.*. I'll do this in my next update for your review.
John
Thanks for review. Huaqiang

Invoking qemuProcessSetupVcpus in process of VM reconnection. The vcpu pid information need to be refilled to resctrl monitor after a VM reconnection./ Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index fba4fb4..ed0330b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8008,6 +8008,9 @@ qemuProcessReconnect(void *opaque) } } + if (qemuProcessSetupVcpus(obj) < 0) + goto error; + /* update domain state XML with possibly updated state in virDomainObj */ if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) < 0) goto error; -- 2.7.4

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Invoking qemuProcessSetupVcpus in process of VM reconnection.
The vcpu pid information need to be refilled to resctrl monitor after a VM reconnection./
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index fba4fb4..ed0330b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8008,6 +8008,9 @@ qemuProcessReconnect(void *opaque) } }
+ if (qemuProcessSetupVcpus(obj) < 0) + goto error; +
IIRC the reason for not needing this is that the assumption is that the initial startup (ProcessLaunch) and any hotplug thereafter would end up calling qemuProcessSetupVcpu. Not sure this is necessary for everything except perhaps the resctrl monitor *pids list which I'm not even sure is necessary. John
/* update domain state XML with possibly updated state in virDomainObj */ if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) < 0) goto error;

-----Original Message----- From: John Ferlan [mailto:jferlan@redhat.com] Sent: Tuesday, November 6, 2018 4:09 AM To: Wang, Huaqiang <huaqiang.wang@intel.com>; libvir-list@redhat.com Cc: Feng, Shaohe <shaohe.feng@intel.com>; Niu, Bing <bing.niu@intel.com>; Ding, Jian-feng <jian-feng.ding@intel.com>; Zang, Rui <rui.zang@intel.com> Subject: Re: [PATCHv7 18/18] qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
Invoking qemuProcessSetupVcpus in process of VM reconnection.
The vcpu pid information need to be refilled to resctrl monitor after a VM reconnection./
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com> --- src/qemu/qemu_process.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index fba4fb4..ed0330b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8008,6 +8008,9 @@ qemuProcessReconnect(void *opaque) } }
+ if (qemuProcessSetupVcpus(obj) < 0) + goto error; +
IIRC the reason for not needing this is that the assumption is that the initial startup (ProcessLaunch) and any hotplug thereafter would end up calling qemuProcessSetupVcpu.
Not sure this is necessary for everything except perhaps the resctrl monitor *pids list which I'm not even sure is necessary.
This time I think this patch is not necessary. This patch is only required if *tasks file content is changed in process of re-connection. Now I don't think the *tasks file will be changed because the vcpu processes are still there in process of reconnection and no add no removal. Patch will be removed.
John
Thanks for review. Huaqiang
/* update domain state XML with possibly updated state in virDomainObj */ if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) < 0) goto error;

On 10/22/18 4:01 AM, Wang Huaqiang wrote:
This series of patches and the series already been merged introduce the x86 Cache Monitoring Technology (CMT) to libvirt by interacting with kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU feature which belongs to the Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores.
Rather than top or bottom post ;-) - as noted during the reviewing - many of the patches are fine; however, once we get to the meat in patch 12 and beyond I think more adjustment is necessary. I wouldn't want to introduce the XML from patch 12, but have no teeth behind it from the remaining patches. Getting closer, but not quite there yet. If you're good with my proposed adjustments for patches 1-11, then I can get those pushed so that we can make some progress. On your next round, please let's try to get a news.xml to summarize the changes added at the end. Saves me from chasing later on. I won't have a lot of cycles for the rest of the week to debate or provide feedback. I'm in class Wed -> Fri. I really don't think the comments I've provided in the later patches are that controversial. I do think we probably need a way to handle the Unplug with the existing usage of the *tasks file before we go too far and add monitor. You also need to figure a way to not add the same pid twice. I realize now that a *Remove operation will remove the *tasks file from a $path so some concerns I had were at least reduced. John
In the v1 series, an original and complete feature for CMT was introduced The v2 and v3 patches address the feature for the host capability of CMT. v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command.
We have serval discussion about the enabling of CMT, please refer to following links for the RFCs. RFCv3 https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html RFCv2 https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html RFCv1 https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
And the merged commits are list as below, for host capability of CMT. 6af8417415508c31f8ce71234b573b4999f35980 8f6887998bf63594ae26e3db18d4d5896c5f2cb4 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8 12093f1feaf8f5023dcd9d65dff111022842183d a5d293c18831dcf69ec6195798387fbb70c9f461
1. About reason why CMT is necessary in libvirt? The perf events of 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, in libvirt the perf based cmt,mbm will not work with the latest linux kernel. These patches add CMT feature to libvirt through kernel resctrlfs interface.
2 Create cache monitoring group (cache monitor).
The main interface for creating monitoring group is through XML file. The proposed configuration is like:
<cputune> <cachetune vcpus='1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='4-7'> + <monitor level='3' vcpus='4-6'/> </cachetune> </cputune>
In above XML, created 2 cache resctrl allocation groups and 2 resctrl monitoring groups. The changes of cache monitor will be effective in next booting of VM.
2 Show CMT result through command 'domstats'
Adding the interface in qemu to report this information for resource monitor group through command 'virsh domstats --cpu-total'. Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176
Changes in v7: - Add several lines removed by mistake.
Changes in v6: - Addressing John's review comments for v5. - Removed and cleaned the concepts of 'default allocation' and 'default monitor'. - qemu: virsh domstats --cpu-total output for CMT, add identifier 'monitor' for each itm.
Changes in v5: - qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection. - Add the document for domain configuration related to resctrl monitor.
Changes in v4: v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command. - Introduced resctrl default allocation - Introduced resctrl monitor and default monitor
Changes in v3: - Addressed John Ferlan's review. - Typo fixed. - Removed VIR_ENUM_DECL(virMonitor);
Changes in v2: - Introduced MBM capability. - Capability layout changed * Moved <monitor> from cahe <bank> to <cache> * Renamed <Threshold> to <reuseThreshold> - Document for 'reuseThreshold' changed. - Introduced API virResctrlInfoGetMonitorPrefix - Added more tests, covering standalone CMT, fake new feature. - Creating CMT resource control group will be subsequent job.
Wang Huaqiang (18): docs,util: Refactor schemas and virresctrl to support optional cache util: Introduce resctrl monitor for CMT util: Refactor code for determining allocation path util: Add interface to determine monitor path util: Refactor code for adding PID to the resource group util: Add interface for adding PID to the monitor util: Refactor code for creating resctrl group util: Add interface for creating monitor group util: Add more interfaces for resctrl monitor conf: Remove virDomainResctrlAppend and introduce virDomainResctrlNew conf: move virResctrlAllocIsEmpty to a place a litter lower conf: Introduce cache monitor element in cachetune qemu: enable resctrl monitor in qemu util: Add function for checking if monitor is running conf: Add 'id' to virDomainResctrlDef qemu: refactor qemuDomainGetStatsCpu qemu: Report cache occupancy (CMT) with domstats qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection
docs/formatdomain.html.in | 30 +- docs/schemas/domaincommon.rng | 14 +- src/conf/domain_conf.c | 310 ++++++++++-- src/conf/domain_conf.h | 12 + src/libvirt-domain.c | 9 + src/libvirt_private.syms | 11 + src/qemu/qemu_driver.c | 270 +++++++++- src/qemu/qemu_process.c | 69 ++- src/util/virresctrl.c | 560 +++++++++++++++++++-- src/util/virresctrl.h | 49 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 ++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 14 files changed, 1283 insertions(+), 93 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml

On 11/6/2018 4:19 AM, John Ferlan wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:
This series of patches and the series already been merged introduce the x86 Cache Monitoring Technology (CMT) to libvirt by interacting with kernel resource control (resctrl) interface. CMT is one of the Intel(R) x86 CPU feature which belongs to the Resource Director Technology (RDT). CMT reports the occupancy of the last level cache, which is shared by all CPU cores.
Rather than top or bottom post ;-) - as noted during the reviewing - many of the patches are fine; however, once we get to the meat in patch 12 and beyond I think more adjustment is necessary. I wouldn't want to introduce the XML from patch 12, but have no teeth behind it from the remaining patches.
Really appreciate your review and comments, your comments not only pointed out my faults but also provide very reasonable suggestions. For patch 1-11, I also addressed your review opinions in my next version code, please make a review at your convenient time. For patches after 11, I also addressed your review suggestions by removing the patches for adding 'resctrl->id' and for 'virResctrlMonitorIsRunning', and made some code split for patch17. Please make a review.
Getting closer, but not quite there yet. If you're good with my proposed adjustments for patches 1-11, then I can get those pushed so that we can make some progress.
On your next round, please let's try to get a news.xml to summarize the changes added at the end. Saves me from chasing later on.
OK. A separate patch for change of news.xml is appended.
I won't have a lot of cycles for the rest of the week to debate or provide feedback. I'm in class Wed -> Fri. I really don't think the comments I've provided in the later patches are that controversial.
Agree. Your comments are very accurate. This time I have time to prepare the code along with the reply to your review emails.
I do think we probably need a way to handle the Unplug with the existing usage of the *tasks file before we go too far and add monitor.
By analyzing the code handling vcpu unplug , it seems there is no problem to keep the *tasks file aligned with PIDs. Because the removal of PID is automatically done by kernel if kernel find some vcpu process is disappeared. I need more time to test on vcpu unplug cases for my code. I have not achieved a successful live removal of vcpus, even with code without my new resctrl monitor patches. I will update when I have some conclusion.
You also need to figure a way to not add the same pid twice. I realize now that a *Remove operation will remove the *tasks file from a $path so some concerns I had were at least reduced.
To add PID twice is because I thought the *tasks file could be removed/added in process of re-connection, when performing 'systemctl restart libvirtd', but it seems is does not. So it is not necessary to add same pid twice. The libvirtd daemon restart process works fine after the remove of relevant patch. And *tasks file is not removed or changed in process of re-connection. So patch18 and patch14 are removed.
John
Thanks for all for your review. Huaqiang
In the v1 series, an original and complete feature for CMT was introduced The v2 and v3 patches address the feature for the host capability of CMT. v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command.
We have serval discussion about the enabling of CMT, please refer to following links for the RFCs. RFCv3 https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html RFCv2 https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html RFCv1 https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
And the merged commits are list as below, for host capability of CMT. 6af8417415508c31f8ce71234b573b4999f35980 8f6887998bf63594ae26e3db18d4d5896c5f2cb4 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8 12093f1feaf8f5023dcd9d65dff111022842183d a5d293c18831dcf69ec6195798387fbb70c9f461
1. About reason why CMT is necessary in libvirt? The perf events of 'CMT, MBML, MBMT' have been phased out since Linux kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, in libvirt the perf based cmt,mbm will not work with the latest linux kernel. These patches add CMT feature to libvirt through kernel resctrlfs interface.
2 Create cache monitoring group (cache monitor).
The main interface for creating monitoring group is through XML file. The proposed configuration is like:
<cputune> <cachetune vcpus='1'> <cache id='0' level='3' type='code' size='7680' unit='KiB'/> <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + <monitor level='3' vcpus='1'/> </cachetune> <cachetune vcpus='4-7'> + <monitor level='3' vcpus='4-6'/> </cachetune> </cputune>
In above XML, created 2 cache resctrl allocation groups and 2 resctrl monitoring groups. The changes of cache monitor will be effective in next booting of VM.
2 Show CMT result through command 'domstats'
Adding the interface in qemu to report this information for resource monitor group through command 'virsh domstats --cpu-total'. Below is a typical output:
# virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176
Changes in v7: - Add several lines removed by mistake.
Changes in v6: - Addressing John's review comments for v5. - Removed and cleaned the concepts of 'default allocation' and 'default monitor'. - qemu: virsh domstats --cpu-total output for CMT, add identifier 'monitor' for each itm.
Changes in v5: - qemu: Setting up vcpu and adding pids to resctrl monitor groups during re-connection. - Add the document for domain configuration related to resctrl monitor.
Changes in v4: v4 is addressing the feature for monitoring VM vcpu thread set cache occupancy and reporting it through a virsh command. - Introduced resctrl default allocation - Introduced resctrl monitor and default monitor
Changes in v3: - Addressed John Ferlan's review. - Typo fixed. - Removed VIR_ENUM_DECL(virMonitor);
Changes in v2: - Introduced MBM capability. - Capability layout changed * Moved <monitor> from cahe <bank> to <cache> * Renamed <Threshold> to <reuseThreshold> - Document for 'reuseThreshold' changed. - Introduced API virResctrlInfoGetMonitorPrefix - Added more tests, covering standalone CMT, fake new feature. - Creating CMT resource control group will be subsequent job.
Wang Huaqiang (18): docs,util: Refactor schemas and virresctrl to support optional cache util: Introduce resctrl monitor for CMT util: Refactor code for determining allocation path util: Add interface to determine monitor path util: Refactor code for adding PID to the resource group util: Add interface for adding PID to the monitor util: Refactor code for creating resctrl group util: Add interface for creating monitor group util: Add more interfaces for resctrl monitor conf: Remove virDomainResctrlAppend and introduce virDomainResctrlNew conf: move virResctrlAllocIsEmpty to a place a litter lower conf: Introduce cache monitor element in cachetune qemu: enable resctrl monitor in qemu util: Add function for checking if monitor is running conf: Add 'id' to virDomainResctrlDef qemu: refactor qemuDomainGetStatsCpu qemu: Report cache occupancy (CMT) with domstats qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection
docs/formatdomain.html.in | 30 +- docs/schemas/domaincommon.rng | 14 +- src/conf/domain_conf.c | 310 ++++++++++-- src/conf/domain_conf.h | 12 + src/libvirt-domain.c | 9 + src/libvirt_private.syms | 11 + src/qemu/qemu_driver.c | 270 +++++++++- src/qemu/qemu_process.c | 69 ++- src/util/virresctrl.c | 560 +++++++++++++++++++-- src/util/virresctrl.h | 49 ++ tests/genericxml2xmlindata/cachetune-cdp.xml | 3 + .../cachetune-colliding-monitor.xml | 30 ++ tests/genericxml2xmlindata/cachetune-small.xml | 7 + tests/genericxml2xmltest.c | 2 + 14 files changed, 1283 insertions(+), 93 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
participants (4)
-
Huaqiang,Wang
-
John Ferlan
-
Wang Huaqiang
-
Wang, Huaqiang