[libvirt] [PATCH v2 RESEND 00/17] Introduce RDT memory bandwidth allocation support

From: Bing Niu <bing.niu@intel.com> This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation. The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate control over memory bandwidth available per-core. This feature provides a method to control applications which may be over-utilizing bandwidth relative to their priority in environments such as the data-center. The details can be found in Intel's SDM 17.19.7. Kernel supports MBA through resctrl file system same as CAT. Each resctrl group have a MB parameter to control how much memory bandwidth it can utilize in unit of percentage. In this series, MBA is enabled by enhancing existing virresctrl implementation. The policy employed for MBA is similar with CAT: The sum of each MBA group's bandwidth dose not exceed 100%. The enhancement of virresctrl include two main parts: Part 1: Add two new structures virResctrlInfoMemMB and virResctrlAllocMemBW for collecting host system MBA capability and domain memory bandwidth allocation. Those two structures are the extension of existing virResctrlInfo and virResctrlAlloc. With them, virresctrl framework can support MBA and CAT concurrently. Each virResctrlAlloc represent a resource allocation including CAT, or MBA, or CAT&MBA. The policy of MBA is that: total memory bandwidth of each resctrl group, created by virresctrl, does not exceed to 100%. Part 2: On XML part, add new elements to host capabilities query and domain allocation to support memory bandwidth allocation. --------------------------------------------------------------------------------------------- For host capabilities XML, new XML format like below example, <host> ..... <memory_bandwidth> <node id='0' cpus='0-19'> <control granularity='10' min ='10' maxAllocs='8'/> </node> </memory_bandwidth> </host> granularity --- memory bandwidth granularity min --- minimum memory bandwidth allowed maxAllocs --- maximum concurrent memory bandwidth allocation allowed. --------------------------------------------------------------------------------------------- For domain XML, new format as below example <domain type='kvm' id='2'> ...... <cputune> ...... <shares>1024</shares> <memorytune vcpus='0-1'> <node id='0' bandwidth='20'/> </memorytune> </cputune> ...... </domain> id --- node where memory bandwidth allocation will happen bandwidth --- bandwidth allocated in percentage ---------------------------------------------------------------------------------------------- With this extension of the virresctrl, the overall working follow of CAT and MBA is described by below picture. XML parser will aggregate MBA and CAT configuration and represents it in one virresctrl object. The methods of virresctrl class will manipulate resctrl interface to allocate corresponding resources. <memorytune cpus='0-3'> +---------+ | <cachetune vcpus='0-3'> XML | + parser +-----------+ | | +------------------------------+ | | internal object +------v--------------+ virResctrlAlloc | backing object | +------+--------------+ | | +------------------------------+ | +--v-------+ | | | schemata | /sys/fs/resctrl | tasks | | . | | . | | | +----------+ --------------------------------------------------------------------- previous versions and discussion can be found at v1: https://www.redhat.com/archives/libvir-list/2018-July/msg01144.html RFC v2: https://www.redhat.com/archives/libvir-list/2018-June/msg01268.html RFC v1: https://www.redhat.com/archives/libvir-list/2018-May/msg02101.html Changelog: v1 -> this: John's comment: 1. Split calculation of number of memory bandwidth control to one patch. 2. Split virResctrlAllocMemBW relating methods to 5 patch, each provides one kind of function, eg: schemata processing, memory bandwidth calculation..... 3. Use resctrl to replace cachetune in domain conf. 4. Split refactor virDomainCachetuneDefParse into 3 patches. And adjust some logic, eg: use %s format error log, renaming functions..... 5. Complete doc description. eg: update cachetune part about vcpus overlapping with memorytune, update libvirt version info for memory bandwidth control availability. 6. Some coding style fix. RFC_v2->v1: John's comment: 1. use name MemBW to replace MB for a more clear description. 2. split rename patch and put refactor function part separately. 3. split virResctrlInfoMemMB and virResctrlAllocMemBW to different patches. 4. add docs/schemas/*.rng for XML related patches. 5. some cleanup for coding conventions. RFC_ v1->RFC_v2: Jano's comment: 1. put renaming parts into separated patches. 2. set the initial return value as -1. 3. using full name in structure definition. 4. do not use VIR_CACHE_TYPE_LAST for memory bandwidth allocation formatting. Pavel's comment: 1. add host capabilities XML for memory bandwidth allocation. 2. do not mix use cachetune section in XML for memory bandwidth allocation in domain XML. define a dedicated one for memory bandwidth allocation. Bing Niu (17): util: Rename some functions of virresctrl util: Refactor virResctrlGetInfo in virresctrl util: Refactor virResctrlAllocFormat of virresctrl util: Add MBA capability information query to resctrl util: Add MBA check to virResctrlInfoGetCache util: Add MBA allocation to virresctrl util: Add MBA schemata parse and format methods util: Add support to calculate MBA utilization util: Introduce virResctrlAllocForeachMemory util: Introduce virResctrlAllocSetMemoryBandwidth conf: Rename cachetune to resctrl conf: Factor out vcpus parsing part from virDomainCachetuneDefParse conf: Factor out vcpus overlapping from virDomainCachetuneDefParse conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse conf: Add support for memorytune XML processing for resctrl MBA conf: Add return value check to virResctrlAllocForeachCache conf: Add memory bandwidth allocation capability of host docs/formatdomain.html.in | 39 +- docs/schemas/capability.rng | 33 ++ docs/schemas/domaincommon.rng | 17 + src/conf/capabilities.c | 107 ++++ src/conf/capabilities.h | 11 + src/conf/domain_conf.c | 428 ++++++++++++--- src/conf/domain_conf.h | 10 +- src/libvirt_private.syms | 6 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- src/util/virresctrl.c | 611 +++++++++++++++++++-- src/util/virresctrl.h | 55 +- .../memorytune-colliding-allocs.xml | 30 + .../memorytune-colliding-cachetune.xml | 32 ++ tests/genericxml2xmlindata/memorytune.xml | 33 ++ tests/genericxml2xmltest.c | 5 + .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 + tests/virresctrldata/resctrl.schemata | 1 + 21 files changed, 1280 insertions(+), 169 deletions(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Some functions in virresctrl are for CAT only, while some of other functions are for resource allocation, not just CAT. So change their names to reflect the reality. Signed-off-by: Bing Niu <bing.niu@intel.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/conf/domain_conf.c | 8 ++++---- src/libvirt_private.syms | 4 ++-- src/util/virresctrl.c | 30 +++++++++++++++--------------- src/util/virresctrl.h | 26 +++++++++++++------------- 4 files changed, 34 insertions(+), 34 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index eff8af2..abb6c5e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19013,7 +19013,7 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, ULLONG_MAX, true) < 0) goto cleanup; - if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) + if (virResctrlAllocSetCacheSize(alloc, level, type, cache, size) < 0) goto cleanup; ret = 0; @@ -27002,9 +27002,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); - virResctrlAllocForeachSize(cachetune->alloc, - virDomainCachetuneDefFormatHelper, - &childrenBuf); + virResctrlAllocForeachCache(cachetune->alloc, + virDomainCachetuneDefFormatHelper, + &childrenBuf); if (virBufferCheckError(&childrenBuf) < 0) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index fc386e1..bc489cb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2648,14 +2648,14 @@ virCacheTypeToString; virResctrlAllocAddPID; virResctrlAllocCreate; virResctrlAllocDeterminePath; -virResctrlAllocForeachSize; +virResctrlAllocForeachCache; virResctrlAllocFormat; virResctrlAllocGetID; virResctrlAllocGetUnused; virResctrlAllocNew; virResctrlAllocRemove; +virResctrlAllocSetCacheSize; virResctrlAllocSetID; -virResctrlAllocSetSize; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index e492a63..6d69c8d 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -754,11 +754,11 @@ virResctrlAllocCheckCollision(virResctrlAllocPtr alloc, int -virResctrlAllocSetSize(virResctrlAllocPtr alloc, - unsigned int level, - virCacheType type, - unsigned int cache, - unsigned long long size) +virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) { if (virResctrlAllocCheckCollision(alloc, level, type, cache)) { virReportError(VIR_ERR_XML_ERROR, @@ -773,9 +773,9 @@ virResctrlAllocSetSize(virResctrlAllocPtr alloc, int -virResctrlAllocForeachSize(virResctrlAllocPtr alloc, - virResctrlAllocForeachSizeCallback cb, - void *opaque) +virResctrlAllocForeachCache(virResctrlAllocPtr alloc, + virResctrlAllocForeachCacheCallback cb, + void *opaque) { int ret = 0; unsigned int level = 0; @@ -939,9 +939,9 @@ virResctrlAllocParseProcessCache(virResctrlInfoPtr resctrl, static int -virResctrlAllocParseProcessLine(virResctrlInfoPtr resctrl, - virResctrlAllocPtr alloc, - char *line) +virResctrlAllocParseCacheLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) { char **caches = NULL; char *tmp = NULL; @@ -1009,7 +1009,7 @@ virResctrlAllocParse(virResctrlInfoPtr resctrl, lines = virStringSplitCount(schemata, "\n", 0, &nlines); for (i = 0; i < nlines; i++) { - if (virResctrlAllocParseProcessLine(resctrl, alloc, lines[i]) < 0) + if (virResctrlAllocParseCacheLine(resctrl, alloc, lines[i]) < 0) goto cleanup; } @@ -1401,8 +1401,8 @@ virResctrlAllocCopyMasks(virResctrlAllocPtr dst, * transforming `sizes` into `masks`. */ static int -virResctrlAllocMasksAssign(virResctrlInfoPtr resctrl, - virResctrlAllocPtr alloc) +virResctrlAllocAssign(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc) { int ret = -1; unsigned int level = 0; @@ -1526,7 +1526,7 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (lockfd < 0) goto cleanup; - if (virResctrlAllocMasksAssign(resctrl, alloc) < 0) + if (virResctrlAllocAssign(resctrl, alloc) < 0) goto cleanup; alloc_str = virResctrlAllocFormat(alloc); diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 9052a2b..d657c06 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -67,11 +67,11 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, typedef struct _virResctrlAlloc virResctrlAlloc; typedef virResctrlAlloc *virResctrlAllocPtr; -typedef int virResctrlAllocForeachSizeCallback(unsigned int level, - virCacheType type, - unsigned int cache, - unsigned long long size, - void *opaque); +typedef int virResctrlAllocForeachCacheCallback(unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size, + void *opaque); virResctrlAllocPtr virResctrlAllocNew(void); @@ -80,16 +80,16 @@ bool virResctrlAllocIsEmpty(virResctrlAllocPtr alloc); int -virResctrlAllocSetSize(virResctrlAllocPtr alloc, - unsigned int level, - virCacheType type, - unsigned int cache, - unsigned long long size); +virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size); int -virResctrlAllocForeachSize(virResctrlAllocPtr alloc, - virResctrlAllocForeachSizeCallback cb, - void *opaque); +virResctrlAllocForeachCache(virResctrlAllocPtr alloc, + virResctrlAllocForeachCacheCallback cb, + void *opaque); int virResctrlAllocSetID(virResctrlAllocPtr alloc, -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Separate resctrl common information parts from CAT specific parts, so that common information parts can be reused among different resource allocation technologies. Signed-off-by: Bing Niu <bing.niu@intel.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/util/virresctrl.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 6d69c8d..313f964 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -318,9 +318,9 @@ virResctrlUnlock(int fd) /* virResctrlInfo-related definitions */ static int -virResctrlGetInfo(virResctrlInfoPtr resctrl) +virResctrlGetCacheInfo(virResctrlInfoPtr resctrl, + DIR *dirp) { - DIR *dirp = NULL; char *endptr = NULL; char *tmp_str = NULL; int ret = -1; @@ -332,12 +332,6 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) virResctrlInfoPerLevelPtr i_level = NULL; virResctrlInfoPerTypePtr i_type = NULL; - rv = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); - if (rv <= 0) { - ret = rv; - goto cleanup; - } - while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { VIR_DEBUG("Parsing info type '%s'", ent->d_name); if (ent->d_name[0] != 'L') @@ -443,12 +437,32 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) ret = 0; cleanup: - VIR_DIR_CLOSE(dirp); VIR_FREE(i_type); return ret; } +static int +virResctrlGetInfo(virResctrlInfoPtr resctrl) +{ + DIR *dirp = NULL; + int ret = -1; + + ret = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); + if (ret <= 0) + goto cleanup; + + ret = virResctrlGetCacheInfo(resctrl, dirp); + if (ret < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_DIR_CLOSE(dirp); + return ret; +} + + virResctrlInfoPtr virResctrlInfoNew(void) { -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Refactor virResctrlAllocFormat so that it is easy to support other resource allocation technologies. Signed-off-by: Bing Niu <bing.niu@intel.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/util/virresctrl.c | 41 ++++++++++++++++++++++++++--------------- 1 file changed, 26 insertions(+), 15 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 313f964..a38c926 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -849,17 +849,14 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) } -char * -virResctrlAllocFormat(virResctrlAllocPtr alloc) +static int +virResctrlAllocFormatCache(virResctrlAllocPtr alloc, + virBufferPtr buf) { - virBuffer buf = VIR_BUFFER_INITIALIZER; unsigned int level = 0; unsigned int type = 0; unsigned int cache = 0; - if (!alloc) - return NULL; - for (level = 0; level < alloc->nlevels; level++) { virResctrlAllocPerLevelPtr a_level = alloc->levels[level]; @@ -872,7 +869,7 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) if (!a_type) continue; - virBufferAsprintf(&buf, "L%u%s:", level, virResctrlTypeToString(type)); + virBufferAsprintf(buf, "L%u%s:", level, virResctrlTypeToString(type)); for (cache = 0; cache < a_type->nmasks; cache++) { virBitmapPtr mask = a_type->masks[cache]; @@ -882,21 +879,35 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) continue; mask_str = virBitmapToString(mask, false, true); - if (!mask_str) { - virBufferFreeAndReset(&buf); - return NULL; - } + if (!mask_str) + return -1; - virBufferAsprintf(&buf, "%u=%s;", cache, mask_str); + virBufferAsprintf(buf, "%u=%s;", cache, mask_str); VIR_FREE(mask_str); } - virBufferTrim(&buf, ";", 1); - virBufferAddChar(&buf, '\n'); + virBufferTrim(buf, ";", 1); + virBufferAddChar(buf, '\n'); } } - virBufferCheckError(&buf); + return virBufferCheckError(buf); +} + + +char * +virResctrlAllocFormat(virResctrlAllocPtr alloc) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!alloc) + return NULL; + + if (virResctrlAllocFormatCache(alloc, &buf) < 0) { + virBufferFreeAndReset(&buf); + return NULL; + } + return virBufferContentAndReset(&buf); } -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Introducing virResctrlInfoMemBW for the information memory bandwidth allocation information. Signed-off-by: Bing Niu <bing.niu@intel.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/util/virresctrl.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 84 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index a38c926..b12a05c 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -80,6 +80,9 @@ typedef virResctrlInfoPerType *virResctrlInfoPerTypePtr; typedef struct _virResctrlInfoPerLevel virResctrlInfoPerLevel; typedef virResctrlInfoPerLevel *virResctrlInfoPerLevelPtr; +typedef struct _virResctrlInfoMemBW virResctrlInfoMemBW; +typedef virResctrlInfoMemBW *virResctrlInfoMemBWPtr; + typedef struct _virResctrlAllocPerType virResctrlAllocPerType; typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; @@ -116,11 +119,30 @@ struct _virResctrlInfoPerLevel { virResctrlInfoPerTypePtr *types; }; +/* Information about memory bandwidth allocation */ +struct _virResctrlInfoMemBW { + /* minimum memory bandwidth allowed */ + unsigned int min_bandwidth; + /* bandwidth granularity */ + unsigned int bandwidth_granularity; + /* Maximum number of simultaneous allocations */ + unsigned int max_allocation; + /* level number of last level cache */ + unsigned int last_level_cache; + /* max id of last level cache, this is used to track + * how many last level cache available in host system, + * the number of memory bandwidth allocation controller + * is identical with last level cache. */ + unsigned int max_id; +}; + struct _virResctrlInfo { virObject parent; virResctrlInfoPerLevelPtr *levels; size_t nlevels; + + virResctrlInfoMemBWPtr membw_info; }; @@ -146,6 +168,7 @@ virResctrlInfoDispose(void *obj) VIR_FREE(level); } + VIR_FREE(resctrl->membw_info); VIR_FREE(resctrl->levels); } @@ -443,6 +466,60 @@ virResctrlGetCacheInfo(virResctrlInfoPtr resctrl, static int +virResctrlGetMemoryBandwidthInfo(virResctrlInfoPtr resctrl) +{ + int ret = -1; + int rv = -1; + virResctrlInfoMemBWPtr i_membw = NULL; + + /* query memory bandwidth allocation info */ + if (VIR_ALLOC(i_membw) < 0) + goto cleanup; + rv = virFileReadValueUint(&i_membw->bandwidth_granularity, + SYSFS_RESCTRL_PATH "/info/MB/bandwidth_gran"); + if (rv == -2) { + /* The file doesn't exist, so it's unusable for us, + * probably memory bandwidth allocation unsupported */ + VIR_INFO("The path '" SYSFS_RESCTRL_PATH "/info/MB/bandwidth_gran'" + "does not exist"); + ret = 0; + goto cleanup; + } else if (rv < 0) { + /* Other failures are fatal, so just quit */ + goto cleanup; + } + + rv = virFileReadValueUint(&i_membw->min_bandwidth, + SYSFS_RESCTRL_PATH "/info/MB/min_bandwidth"); + if (rv == -2) { + /* If the previous file exists, so should this one. Hence -2 is + * fatal in this case (errors out in next condition) - the kernel + * interface might've changed too much or something else is wrong. */ + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get min bandwidth from resctrl memory info")); + } + if (rv < 0) + goto cleanup; + + rv = virFileReadValueUint(&i_membw->max_allocation, + SYSFS_RESCTRL_PATH "/info/MB/num_closids"); + if (rv == -2) { + /* Similar reasoning to min_bandwidth above. */ + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get max allocation from resctrl memory info")); + } + if (rv < 0) + goto cleanup; + + VIR_STEAL_PTR(resctrl->membw_info, i_membw); + ret = 0; + cleanup: + VIR_FREE(i_membw); + return ret; +} + + +static int virResctrlGetInfo(virResctrlInfoPtr resctrl) { DIR *dirp = NULL; @@ -452,6 +529,10 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) if (ret <= 0) goto cleanup; + ret = virResctrlGetMemoryBandwidthInfo(resctrl); + if (ret < 0) + goto cleanup; + ret = virResctrlGetCacheInfo(resctrl, dirp); if (ret < 0) goto cleanup; @@ -493,6 +574,9 @@ virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) if (!resctrl) return true; + if (resctrl->membw_info) + return false; + for (i = 0; i < resctrl->nlevels; i++) { virResctrlInfoPerLevelPtr i_level = resctrl->levels[i]; -- 2.7.4

From: Bing Niu <bing.niu@intel.com> If we have some membw_info data, then we need to calculate the number of MBA controllers on the system. The value cannot be obtained from a direct query to the RDT kernel module, but it is the same as the last level cache value which is calculated by traversing the cache hierarchy of host(/sys/bus/cpu/devices/cpuX/cache/). Signed-off-by: Bing Niu <bing.niu@intel.com> Reviewed-by: John Ferlan <jferlan@redhat.com> --- src/util/virresctrl.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b12a05c..f454868 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -608,6 +608,20 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, if (virResctrlInfoIsEmpty(resctrl)) return 0; + /* Let's take the opportunity to update the number of last level + * cache. This number of memory bandwidth controller is same with + * last level cache */ + if (resctrl->membw_info) { + virResctrlInfoMemBWPtr membw_info = resctrl->membw_info; + + if (level > membw_info->last_level_cache) { + membw_info->last_level_cache = level; + membw_info->max_id = 0; + } else if (membw_info->last_level_cache == level) { + membw_info->max_id++; + } + } + if (level >= resctrl->nlevels) return 0; -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Add memory bandwidth allocation support to virresctrl class. Introducing virResctrlAllocMemBW which is used for allocating memory bandwidth. Following virResctrlAllocPerType, it also employs a nested sparse array to indicate whether allocation is available for particular last level cache. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/util/virresctrl.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 59 insertions(+), 4 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f454868..8a25798 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -36,9 +36,9 @@ VIR_LOG_INIT("util.virresctrl") /* Resctrl is short for Resource Control. It might be implemented for various - * resources, but at the time of this writing this is only supported for cache - * allocation technology (aka CAT). Hence the reson for leaving 'Cache' out of - * all the structure and function names for now (can be added later if needed. + * resources. Currently this supports cache allocation technology (aka CAT) and + * memory bandwidth allocation (aka MBA). More resources technologies may be + * added in the future. */ @@ -89,6 +89,9 @@ typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; +typedef struct _virResctrlAllocMemBW virResctrlAllocMemBW; +typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; + /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; @@ -180,7 +183,10 @@ virResctrlInfoDispose(void *obj) * consequently a directory under /sys/fs/resctrl). Since it can have multiple * parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL - * in case there is no allocation for that particular one (level, cache, ...)). + * in case there is no allocation for that particular cache allocation (level, + * cache, ...) or memory allocation for particular node). + * + * =====Cache allocation technology (CAT)===== * * Since one allocation can be made for caches on different levels, the first * nested sparse array is of types virResctrlAllocPerLevel. For example if you @@ -205,6 +211,17 @@ virResctrlInfoDispose(void *obj) * all of them. While doing that we store the bitmask in a sparse array of * virBitmaps named `masks` indexed the same way as `sizes`. The upper bounds * of the sparse arrays are stored in nmasks or nsizes, respectively. + + * + * =====Memory Bandwidth allocation technology (MBA)===== + * + * The memory bandwidth allocation support in virResctrlAlloc works in the + * same fashion as CAT. However, memory bandwidth controller doesn't have a + * hierarchy organization as cache, each node have one memory bandwidth + * controller to memory bandwidth distribution. The number of memory bandwidth + * controller is identical with number of last level cache. So MBA also employs + * a sparse array to represent whether a memory bandwidth allocation happens + * on corresponding node. The available memory controller number is collected + * in 'virResctrlInfo'. */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since everything @@ -225,12 +242,24 @@ struct _virResctrlAllocPerLevel { * VIR_CACHE_TYPE_LAST number of items */ }; +/* + * virResctrlAllocMemBW represents one memory bandwidth allocation. + * Since it can have several last level caches in a NUMA system, it is + * also represented as a nested sparse arrays as virRestrlAllocPerLevel. + */ +struct _virResctrlAllocMemBW { + unsigned int **bandwidths; + size_t nbandwidths; +}; + struct _virResctrlAlloc { virObject parent; virResctrlAllocPerLevelPtr *levels; size_t nlevels; + virResctrlAllocMemBWPtr mem_bw; + /* The identifier (any unique string for now) */ char *id; /* libvirt-generated path in /sys/fs/resctrl for this particular @@ -274,6 +303,13 @@ virResctrlAllocDispose(void *obj) VIR_FREE(level); } + if (alloc->mem_bw) { + virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw; + for (i = 0; i < mem_bw->nbandwidths; i++) + VIR_FREE(mem_bw->bandwidths[i]); + VIR_FREE(alloc->mem_bw); + } + VIR_FREE(alloc->id); VIR_FREE(alloc->path); VIR_FREE(alloc->levels); @@ -692,6 +728,9 @@ virResctrlAllocIsEmpty(virResctrlAllocPtr alloc) if (!alloc) return true; + if (alloc->mem_bw) + return false; + for (i = 0; i < alloc->nlevels; i++) { virResctrlAllocPerLevelPtr a_level = alloc->levels[i]; @@ -1266,6 +1305,22 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) } } + /* set default free memory bandwidth to 100%*/ + if (info->membw_info) { + if (VIR_ALLOC(ret->mem_bw) < 0) + goto error; + + if (VIR_EXPAND_N(ret->mem_bw->bandwidths, ret->mem_bw->nbandwidths, + info->membw_info->max_id + 1) < 0) + goto error; + + for (i = 0; i < ret->mem_bw->nbandwidths; i++) { + if (VIR_ALLOC(ret->mem_bw->bandwidths[i]) < 0) + goto error; + *(ret->mem_bw->bandwidths[i]) = 100; + } + } + cleanup: virBitmapFree(mask); return ret; -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Introduce virResctrlAllocMemoryBandwidthFormat and virResctrlAllocParseMemoryBandwidthLine which will format and parse an entry in the schemata file for MBA. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/util/virresctrl.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8a25798..1cbf9b3 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -986,6 +986,139 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) } +/* Format the Memory Bandwidth Allocation line that will be found in + * the schemata files. The line should be start with "MB:" and be + * followed by "id=value" pairs separated by a colon such as: + * + * MB:0=100;1=100 + * + * which indicates node id 0 has 100 percent bandwith and node id 1 + * has 100 percent bandwidth + */ +static int +virResctrlAllocMemoryBandwidthFormat(virResctrlAllocPtr alloc, + virBufferPtr buf) +{ + size_t i; + + if (!alloc->mem_bw) + return 0; + + virBufferAddLit(buf, "MB:"); + + for (i = 0; i < alloc->mem_bw->nbandwidths; i++) { + if (alloc->mem_bw->bandwidths[i]) { + virBufferAsprintf(buf, "%zd=%u;", i, + *(alloc->mem_bw->bandwidths[i])); + } + } + + virBufferTrim(buf, ";", 1); + virBufferAddChar(buf, '\n'); + return virBufferCheckError(buf); +} + + +static int +virResctrlAllocParseProcessMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *mem_bw) +{ + unsigned int bandwidth; + unsigned int id; + char *tmp = NULL; + + tmp = strchr(mem_bw, '='); + if (!tmp) + return 0; + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(mem_bw, NULL, 10, &id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid node id %u "), id); + return -1; + } + if (virStrToLong_uip(tmp, NULL, 10, &bandwidth) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid bandwidth %u"), bandwidth); + return -1; + } + if (bandwidth < resctrl->membw_info->min_bandwidth || + id > resctrl->membw_info->max_id) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Missing or inconsistent resctrl info for " + "memory bandwidth node '%u'"), id); + return -1; + } + if (alloc->mem_bw->nbandwidths <= id && + VIR_EXPAND_N(alloc->mem_bw->bandwidths, alloc->mem_bw->nbandwidths, + id - alloc->mem_bw->nbandwidths + 1) < 0) { + return -1; + } + if (!alloc->mem_bw->bandwidths[id]) { + if (VIR_ALLOC(alloc->mem_bw->bandwidths[id]) < 0) + return -1; + } + + *(alloc->mem_bw->bandwidths[id]) = bandwidth; + return 0; +} + + +/* Parse a schemata formatted MB: entry. Format details are described in + * virResctrlAllocMemoryBandwidthFormat. + */ +static int +virResctrlAllocParseMemoryBandwidthLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **mbs = NULL; + char *tmp = NULL; + size_t nmbs = 0; + size_t i; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + if (STRNEQLEN(line, "MB", 2)) + return 0; + + if (!resctrl || !resctrl->membw_info || + !resctrl->membw_info->min_bandwidth || + !resctrl->membw_info->bandwidth_granularity) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or inconsistent resctrl info for " + "memory bandwidth allocation")); + } + + if (!alloc->mem_bw) { + if (VIR_ALLOC(alloc->mem_bw) < 0) + return -1; + } + + tmp = strchr(line, ':'); + if (!tmp) + return 0; + tmp++; + + mbs = virStringSplitCount(tmp, ";", 0, &nmbs); + if (nmbs == 0) + return 0; + + for (i = 0; i < nmbs; i++) { + if (virResctrlAllocParseProcessMemoryBandwidth(resctrl, alloc, mbs[i]) < 0) + goto cleanup; + } + ret = 0; + cleanup: + virStringListFree(mbs); + return ret; +} + + static int virResctrlAllocFormatCache(virResctrlAllocPtr alloc, virBufferPtr buf) @@ -1045,6 +1178,11 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) return NULL; } + if (virResctrlAllocMemoryBandwidthFormat(alloc, &buf) < 0) { + virBufferFreeAndReset(&buf); + return NULL; + } + return virBufferContentAndReset(&buf); } @@ -1173,6 +1311,9 @@ virResctrlAllocParse(virResctrlInfoPtr resctrl, for (i = 0; i < nlines; i++) { if (virResctrlAllocParseCacheLine(resctrl, alloc, lines[i]) < 0) goto cleanup; + if (virResctrlAllocParseMemoryBandwidthLine(resctrl, alloc, lines[i]) < 0) + goto cleanup; + } ret = 0; -- 2.7.4

On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
Introduce virResctrlAllocMemoryBandwidthFormat and virResctrlAllocParseMemoryBandwidthLine which will format and parse an entry in the schemata file for MBA.
Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/util/virresctrl.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8a25798..1cbf9b3 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -986,6 +986,139 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) }
+/* Format the Memory Bandwidth Allocation line that will be found in + * the schemata files. The line should be start with "MB:" and be + * followed by "id=value" pairs separated by a colon such as:
semi-colon
+ * + * MB:0=100;1=100 + * + * which indicates node id 0 has 100 percent bandwith and node id 1 + * has 100 percent bandwidth
s/$/. A trailing semi-colon is not formatted./
+ */
[...]
+/* Parse a schemata formatted MB: entry. Format details are described in + * virResctrlAllocMemoryBandwidthFormat. + */ +static int +virResctrlAllocParseMemoryBandwidthLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **mbs = NULL; + char *tmp = NULL; + size_t nmbs = 0; + size_t i; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + if (STRNEQLEN(line, "MB", 2)) + return 0; + + if (!resctrl || !resctrl->membw_info || + !resctrl->membw_info->min_bandwidth || + !resctrl->membw_info->bandwidth_granularity) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or inconsistent resctrl info for " + "memory bandwidth allocation"));
I assume a return -1 is in order here. Missed this earlier, but the Coverity checker found it.
+ } + + if (!alloc->mem_bw) { + if (VIR_ALLOC(alloc->mem_bw) < 0) + return -1; + } + + tmp = strchr(line, ':'); + if (!tmp) + return 0; + tmp++; + + mbs = virStringSplitCount(tmp, ";", 0, &nmbs); + if (nmbs == 0) + return 0;
As strange as this is - the above 2 aren't necessary given the next for loop. Keeping them cause Coverity to whine that virStringSplitCount can return allocated memory which is then leaked. It's a false positive, but avoidable.
+ + for (i = 0; i < nmbs; i++) { + if (virResctrlAllocParseProcessMemoryBandwidth(resctrl, alloc, mbs[i]) < 0) + goto cleanup; + } + ret = 0; + cleanup: + virStringListFree(mbs); + return ret; +} + +
[...] I'll adjust in my branch before pushing. John

On 2018年07月31日 06:09, John Ferlan wrote:
On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
Introduce virResctrlAllocMemoryBandwidthFormat and virResctrlAllocParseMemoryBandwidthLine which will format and parse an entry in the schemata file for MBA.
Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/util/virresctrl.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8a25798..1cbf9b3 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -986,6 +986,139 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc)
[....]
+/* Parse a schemata formatted MB: entry. Format details are described in + * virResctrlAllocMemoryBandwidthFormat. + */ +static int +virResctrlAllocParseMemoryBandwidthLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **mbs = NULL; + char *tmp = NULL; + size_t nmbs = 0; + size_t i; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + if (STRNEQLEN(line, "MB", 2)) + return 0; + + if (!resctrl || !resctrl->membw_info || + !resctrl->membw_info->min_bandwidth || + !resctrl->membw_info->bandwidth_granularity) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or inconsistent resctrl info for " + "memory bandwidth allocation"));
I assume a return -1 is in order here.
Missed this earlier, but the Coverity checker found it.
Yes. I miss this and should return -1 here.
+ } + + if (!alloc->mem_bw) { + if (VIR_ALLOC(alloc->mem_bw) < 0) + return -1; + } + + tmp = strchr(line, ':'); + if (!tmp) + return 0; + tmp++; + + mbs = virStringSplitCount(tmp, ";", 0, &nmbs); + if (nmbs == 0) + return 0;
As strange as this is - the above 2 aren't necessary given the next for loop. Keeping them cause Coverity to whine that virStringSplitCount can return allocated memory which is then leaked. It's a false positive, but avoidable.
Yes my bad, sorry. I saw virStringSplitCount will return a pointer array even 0 delim found. And this pointer array need to do a virStringListFree. Should change like: mbs = virStringSplitCount(tmp, ";", 0, &nmbs); if (mbs == 0) return 0; Bing
+ + for (i = 0; i < nmbs; i++) { + if (virResctrlAllocParseProcessMemoryBandwidth(resctrl, alloc, mbs[i]) < 0) + goto cleanup; + } + ret = 0; + cleanup: + virStringListFree(mbs); + return ret; +} + +
[...]
I'll adjust in my branch before pushing.
John

From: Bing Niu <bing.niu@intel.com> Introduce virResctrlMemoryBandwidthSubtract and virResctrlAllocMemoryBandwidth to be used as part of the virResctrlAllocAssign processing to configure the available memory bandwidth. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/util/virresctrl.c | 105 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 93 insertions(+), 12 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 1cbf9b3..c22b30f 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -1409,6 +1409,22 @@ virResctrlAllocSubtract(virResctrlAllocPtr dst, } +static void +virResctrlMemoryBandwidthSubtract(virResctrlAllocPtr free, + virResctrlAllocPtr used) +{ + size_t i; + + if (!used->mem_bw) + return; + + for (i = 0; i < used->mem_bw->nbandwidths; i++) { + if (used->mem_bw->bandwidths[i]) + *(free->mem_bw->bandwidths[i]) -= *(used->mem_bw->bandwidths[i]); + } +} + + static virResctrlAllocPtr virResctrlAllocNewFromInfo(virResctrlInfoPtr info) { @@ -1472,14 +1488,15 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) } /* - * This function creates an allocation that represents all unused parts of all - * caches in the system. It uses virResctrlInfo for creating a new full - * allocation with all bits set (using virResctrlAllocNewFromInfo()) and then - * scans for all allocations under /sys/fs/resctrl and subtracts each one of - * them from it. That way it can then return an allocation with only bit set - * being those that are not mentioned in any other allocation. It is used for - * two things, a) calculating the masks when creating allocations and b) from - * tests. + * This function creates an allocation that represents all unused parts of + * all caches and memory bandwidth in the system. It uses virResctrlInfo + * for creating a new full allocation with all bits set (using the + * virResctrlAllocNewFromInfo()), sets memory bandwidth 100%, and then scans + * for all allocations under /sys/fs/resctrl and subtracts each one of them + * from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation for CAT and + * available memory bandwidth for MBA. It is used for two things, calculating + * the masks and bandwidth available when creating allocations and from tests. */ virResctrlAllocPtr virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) @@ -1525,6 +1542,7 @@ virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) goto error; } + virResctrlMemoryBandwidthSubtract(ret, alloc); virResctrlAllocSubtract(ret, alloc); virObjectUnref(alloc); alloc = NULL; @@ -1676,6 +1694,66 @@ virResctrlAllocFindUnused(virResctrlAllocPtr alloc, static int +virResctrlAllocMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + virResctrlAllocPtr free) +{ + size_t i; + virResctrlAllocMemBWPtr mem_bw_alloc = alloc->mem_bw; + virResctrlAllocMemBWPtr mem_bw_free = free->mem_bw; + virResctrlInfoMemBWPtr mem_bw_info = resctrl->membw_info; + + if (!mem_bw_alloc) + return 0; + + if (mem_bw_alloc && !mem_bw_info) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("RDT Memory Bandwidth allocation unsupported")); + return -1; + } + + for (i = 0; i < mem_bw_alloc->nbandwidths; i++) { + if (!mem_bw_alloc->bandwidths[i]) + continue; + + if (*(mem_bw_alloc->bandwidths[i]) % mem_bw_info->bandwidth_granularity) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is not divisible by granularity %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->bandwidth_granularity); + return -1; + } + if (*(mem_bw_alloc->bandwidths[i]) < mem_bw_info->min_bandwidth) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is smaller than the minimum " + "allowed allocation %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->min_bandwidth); + return -1; + } + if (i > mem_bw_info->max_id) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("bandwidth controller id %zd does not " + "exist, max controller id %u"), + i, mem_bw_info->max_id); + return -1; + } + if (*(mem_bw_alloc->bandwidths[i]) > *(mem_bw_free->bandwidths[i])) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of %u%% " + "bandwidth on node %zd, available bandwidth %u%%"), + *(mem_bw_alloc->bandwidths[i]), i, + *(mem_bw_free->bandwidths[i])); + return -1; + } + } + return 0; +} + + +static int virResctrlAllocCopyMasks(virResctrlAllocPtr dst, virResctrlAllocPtr src) { @@ -1714,10 +1792,10 @@ virResctrlAllocCopyMasks(virResctrlAllocPtr dst, /* - * This function is called when creating an allocation in the system. What it - * does is that it gets all the unused bits using virResctrlAllocGetUnused() and - * then tries to find a proper space for every requested allocation effectively - * transforming `sizes` into `masks`. + * This function is called when creating an allocation in the system. + * What it does is that it gets all the unused resources using + * virResctrlAllocGetUnused and then tries to find a proper space for + * every requested allocation effectively transforming `sizes` into `masks`. */ static int virResctrlAllocAssign(virResctrlInfoPtr resctrl, @@ -1736,6 +1814,9 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, if (!alloc_default) goto cleanup; + if (virResctrlAllocMemoryBandwidth(resctrl, alloc, alloc_free) < 0) + goto cleanup; + if (virResctrlAllocCopyMasks(alloc, alloc_default) < 0) goto cleanup; -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Introduce an API that will traverse the memory bandwidth data calling a callback function for each defined bandwidth entry. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 33 +++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 9 +++++++++ 3 files changed, 43 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc489cb..cd598f9 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2649,6 +2649,7 @@ virResctrlAllocAddPID; virResctrlAllocCreate; virResctrlAllocDeterminePath; virResctrlAllocForeachCache; +virResctrlAllocForeachMemory; virResctrlAllocFormat; virResctrlAllocGetID; virResctrlAllocGetUnused; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c22b30f..dc57c89 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -965,6 +965,39 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, } +/* virResctrlAllocForeachMemory + * @alloc: Pointer to an active allocation + * @cb: Callback function + * @opaque: Opaque data to be passed to @cb + * + * If available, traverse the defined memory bandwidth allocations and + * call the @cb function. + * + * Returns 0 on success, -1 and immediate failure if the @cb has any failure. + */ +int +virResctrlAllocForeachMemory(virResctrlAllocPtr alloc, + virResctrlAllocForeachMemoryCallback cb, + void *opaque) +{ + size_t i = 0; + virResctrlAllocMemBWPtr mem_bw; + + if (!alloc || !alloc->mem_bw) + return 0; + + mem_bw = alloc->mem_bw; + for (i = 0; i < mem_bw->nbandwidths; i++) { + if (mem_bw->bandwidths[i]) { + if (cb(i, *mem_bw->bandwidths[i], opaque) < 0) + return -1; + } + } + + return 0; +} + + int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id) diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index d657c06..5ea5b27 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -73,6 +73,10 @@ typedef int virResctrlAllocForeachCacheCallback(unsigned int level, unsigned long long size, void *opaque); +typedef int virResctrlAllocForeachMemoryCallback(unsigned int id, + unsigned int size, + void *opaque); + virResctrlAllocPtr virResctrlAllocNew(void); @@ -92,6 +96,11 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, void *opaque); int +virResctrlAllocForeachMemory(virResctrlAllocPtr resctrl, + virResctrlAllocForeachMemoryCallback cb, + void *opaque); + +int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id); const char * -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Introduce an API to allow setting of the MBA from domain XML. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/libvirt_private.syms | 1 + src/util/virresctrl.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 5 +++++ 3 files changed, 54 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cd598f9..0d2fc5c 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2657,6 +2657,7 @@ virResctrlAllocNew; virResctrlAllocRemove; virResctrlAllocSetCacheSize; virResctrlAllocSetID; +virResctrlAllocSetMemoryBandwidth; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index dc57c89..f25e7bc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -965,6 +965,54 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, } +/* virResctrlAllocSetMemoryBandwidth + * @alloc: Pointer to an active allocation + * @id: node id of MBA to be set + * @memory_bandwidth: new memory bandwidth value + * + * Set the @memory_bandwidth for the node @id entry in the @alloc. + * + * Returns 0 on success, -1 on failure with error message set. + */ +int +virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth) +{ + virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw; + + if (memory_bandwidth > 100) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Memory Bandwidth value exceeding 100 is invalid.")); + return -1; + } + + if (!mem_bw) { + if (VIR_ALLOC(mem_bw) < 0) + return -1; + alloc->mem_bw = mem_bw; + } + + if (mem_bw->nbandwidths <= id && + VIR_EXPAND_N(mem_bw->bandwidths, mem_bw->nbandwidths, + id - mem_bw->nbandwidths + 1) < 0) + return -1; + + if (mem_bw->bandwidths[id]) { + virReportError(VIR_ERR_XML_ERROR, + _("Memory Bandwidth already defined for node %u"), + id); + return -1; + } + + if (VIR_ALLOC(mem_bw->bandwidths[id]) < 0) + return -1; + + *(mem_bw->bandwidths[id]) = memory_bandwidth; + return 0; +} + + /* virResctrlAllocForeachMemory * @alloc: Pointer to an active allocation * @cb: Callback function diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 5ea5b27..8d62517 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -96,6 +96,11 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, void *opaque); int +virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth); + +int virResctrlAllocForeachMemory(virResctrlAllocPtr resctrl, virResctrlAllocForeachMemoryCallback cb, void *opaque); -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Resctrl not only supports cache tuning, but also memory bandwidth tuning. Renaming cachetune to resctrl to reflect that. With resctrl, all allocation for different resources (cache, memory bandwidth) are aggregated and represented by a virResctrlAllocPtr inside virDomainResctrlDef. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 44 ++++++++++++++++++++++---------------------- src/conf/domain_conf.h | 10 +++++----- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +++++++++--------- 4 files changed, 37 insertions(+), 37 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index abb6c5e..c1527b2 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2966,14 +2966,14 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) static void -virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { - if (!cachetune) + if (!resctrl) return; - virObjectUnref(cachetune->alloc); - virBitmapFree(cachetune->vcpus); - VIR_FREE(cachetune); + virObjectUnref(resctrl->alloc); + virBitmapFree(resctrl->vcpus); + VIR_FREE(resctrl); } @@ -3163,9 +3163,9 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems); - for (i = 0; i < def->ncachetunes; i++) - virDomainCachetuneDefFree(def->cachetunes[i]); - VIR_FREE(def->cachetunes); + for (i = 0; i < def->nresctrls; i++) + virDomainResctrlDefFree(def->resctrls[i]); + VIR_FREE(def->resctrls); VIR_FREE(def->keywrap); @@ -19034,7 +19034,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = virResctrlAllocNew(); - virDomainCachetuneDefPtr tmp_cachetune = NULL; + virDomainResctrlDefPtr tmp_resctrl = NULL; char *tmp = NULL; char *vcpus_str = NULL; char *alloc_id = NULL; @@ -19047,7 +19047,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!alloc) goto cleanup; - if (VIR_ALLOC(tmp_cachetune) < 0) + if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; vcpus_str = virXMLPropString(node, "vcpus"); @@ -19088,8 +19088,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } - for (i = 0; i < def->ncachetunes; i++) { - if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { + for (i = 0; i < def->nresctrls; i++) { + if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { virReportError(VIR_ERR_XML_ERROR, "%s", _("Overlapping vcpus in cachetunes")); goto cleanup; @@ -19119,16 +19119,16 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (virResctrlAllocSetID(alloc, alloc_id) < 0) goto cleanup; - VIR_STEAL_PTR(tmp_cachetune->vcpus, vcpus); - VIR_STEAL_PTR(tmp_cachetune->alloc, alloc); + VIR_STEAL_PTR(tmp_resctrl->vcpus, vcpus); + VIR_STEAL_PTR(tmp_resctrl->alloc, alloc); - if (VIR_APPEND_ELEMENT(def->cachetunes, def->ncachetunes, tmp_cachetune) < 0) + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) goto cleanup; ret = 0; cleanup: ctxt->node = oldnode; - virDomainCachetuneDefFree(tmp_cachetune); + virDomainResctrlDefFree(tmp_resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(alloc_id); @@ -26994,7 +26994,7 @@ virDomainCachetuneDefFormatHelper(unsigned int level, static int virDomainCachetuneDefFormat(virBufferPtr buf, - virDomainCachetuneDefPtr cachetune, + virDomainResctrlDefPtr resctrl, unsigned int flags) { virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; @@ -27002,7 +27002,7 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); - virResctrlAllocForeachCache(cachetune->alloc, + virResctrlAllocForeachCache(resctrl->alloc, virDomainCachetuneDefFormatHelper, &childrenBuf); @@ -27015,14 +27015,14 @@ virDomainCachetuneDefFormat(virBufferPtr buf, goto cleanup; } - vcpus = virBitmapFormat(cachetune->vcpus); + vcpus = virBitmapFormat(resctrl->vcpus); if (!vcpus) goto cleanup; virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus); if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { - const char *alloc_id = virResctrlAllocGetID(cachetune->alloc); + const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); if (!alloc_id) goto cleanup; @@ -27143,8 +27143,8 @@ virDomainCputuneDefFormat(virBufferPtr buf, def->iothreadids[i]->iothread_id); } - for (i = 0; i < def->ncachetunes; i++) - virDomainCachetuneDefFormat(&childrenBuf, def->cachetunes[i], flags); + for (i = 0; i < def->nresctrls; i++) + virDomainCachetuneDefFormat(&childrenBuf, def->resctrls[i], flags); if (virBufferCheckError(&childrenBuf) < 0) return -1; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index c1dfa37..ade60f5 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2232,10 +2232,10 @@ struct _virDomainCputune { }; -typedef struct _virDomainCachetuneDef virDomainCachetuneDef; -typedef virDomainCachetuneDef *virDomainCachetuneDefPtr; +typedef struct _virDomainResctrlDef virDomainResctrlDef; +typedef virDomainResctrlDef *virDomainResctrlDefPtr; -struct _virDomainCachetuneDef { +struct _virDomainResctrlDef { virBitmapPtr vcpus; virResctrlAllocPtr alloc; }; @@ -2413,8 +2413,8 @@ struct _virDomainDef { virDomainCputune cputune; - virDomainCachetuneDefPtr *cachetunes; - size_t ncachetunes; + virDomainResctrlDefPtr *resctrls; + size_t nresctrls; virDomainNumaPtr numa; virDomainResourceDefPtr resource; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index de05627..c6388fd 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -3999,7 +3999,7 @@ qemuDomainDefValidate(const virDomainDef *def, } } - if (def->ncachetunes && + if (def->nresctrls && def->virtType != VIR_DOMAIN_VIRT_KVM) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("cachetune is only supported for KVM domains")); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index c4e3372..02fdc55 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2559,7 +2559,7 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, virCapsPtr caps = NULL; qemuDomainObjPrivatePtr priv = vm->privateData; - if (!vm->def->ncachetunes) + if (!vm->def->nresctrls) return 0; /* Force capability refresh since resctrl info can change @@ -2568,9 +2568,9 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, if (!caps) return -1; - for (i = 0; i < vm->def->ncachetunes; i++) { + for (i = 0; i < vm->def->nresctrls; i++) { if (virResctrlAllocCreate(caps->host.resctrl, - vm->def->cachetunes[i]->alloc, + vm->def->resctrls[i]->alloc, priv->machineName) < 0) goto cleanup; } @@ -5388,8 +5388,8 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, &vcpu->sched) < 0) return -1; - for (i = 0; i < vm->def->ncachetunes; i++) { - virDomainCachetuneDefPtr ct = vm->def->cachetunes[i]; + for (i = 0; i < vm->def->nresctrls; i++) { + virDomainResctrlDefPtr ct = vm->def->resctrls[i]; if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) @@ -7091,8 +7091,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, /* Remove resctrl allocation after cgroups are cleaned up which makes it * kind of safer (although removing the allocation should work even with * pids in tasks file */ - for (i = 0; i < vm->def->ncachetunes; i++) - virResctrlAllocRemove(vm->def->cachetunes[i]->alloc); + for (i = 0; i < vm->def->nresctrls; i++) + virResctrlAllocRemove(vm->def->resctrls[i]->alloc); qemuProcessRemoveDomainStatus(driver, vm); @@ -7818,8 +7818,8 @@ qemuProcessReconnect(void *opaque) if (qemuConnectAgent(driver, obj) < 0) goto error; - for (i = 0; i < obj->def->ncachetunes; i++) { - if (virResctrlAllocDeterminePath(obj->def->cachetunes[i]->alloc, + for (i = 0; i < obj->def->nresctrls; i++) { + if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc, priv->machineName) < 0) goto error; } -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Extract vcpus parsing part from virDomainCachetuneDefParse into one function called virDomainResctrlParseVcpus. So that vcpus parsing logic can be reused by other resource control technologies. Adjust error message and use node->name so that the error message can fit to all technologies. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 48 +++++++++++++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 15 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c1527b2..d6314de 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18951,6 +18951,38 @@ virDomainDefParseBootOptions(virDomainDefPtr def, static int +virDomainResctrlParseVcpus(virDomainDefPtr def, + xmlNodePtr node, + virBitmapPtr *vcpus) +{ + char *vcpus_str = NULL; + int ret = -1; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, _("Missing %s attribute 'vcpus'"), + node->name); + goto cleanup; + } + if (virBitmapParse(vcpus_str, vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid %s attribute 'vcpus' value '%s'"), + vcpus_str, node->name); + goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(*vcpus, def->maxvcpus); + + ret = 0; + cleanup: + VIR_FREE(vcpus_str); + return ret; +} + + +static int virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, xmlNodePtr node, virResctrlAllocPtr alloc) @@ -19050,22 +19082,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; - vcpus_str = virXMLPropString(node, "vcpus"); - if (!vcpus_str) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Missing cachetune attribute 'vcpus'")); - goto cleanup; - } - if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { - virReportError(VIR_ERR_XML_ERROR, - _("Invalid cachetune attribute 'vcpus' value '%s'"), - vcpus_str); + if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) goto cleanup; - } - - /* We need to limit the bitmap to number of vCPUs. If there's nothing left, - * then we can just clean up and return 0 immediately */ - virBitmapShrink(vcpus, def->maxvcpus); if (virBitmapIsAllClear(vcpus)) { ret = 0; -- 2.7.4

On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
Extract vcpus parsing part from virDomainCachetuneDefParse into one function called virDomainResctrlParseVcpus. So that vcpus parsing logic can be reused by other resource control technologies. Adjust error message and use node->name so that the error message can fit to all technologies.
Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 48 +++++++++++++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 15 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c1527b2..d6314de 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18951,6 +18951,38 @@ virDomainDefParseBootOptions(virDomainDefPtr def,
static int +virDomainResctrlParseVcpus(virDomainDefPtr def, + xmlNodePtr node, + virBitmapPtr *vcpus) +{ + char *vcpus_str = NULL; + int ret = -1; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, _("Missing %s attribute 'vcpus'"), + node->name); + goto cleanup; + } + if (virBitmapParse(vcpus_str, vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid %s attribute 'vcpus' value '%s'"), + vcpus_str, node->name);
Arguments reversed - should "Invalid {cputunes|memtunes} attribute 'vcpus' value '$VCPUS_STR'"
+ goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(*vcpus, def->maxvcpus); + + ret = 0; + cleanup: + VIR_FREE(vcpus_str); + return ret; +} + +
[...] I'll fix in my branch before pushing. John

From: Bing Niu <bing.niu@intel.com> Factor out vcpus overlapping detecting part from virDomainCachetuneDefParse and introduce virDomainResctrlVcpuMatch. Instead of allocating virResctrlAllocPtr by default, allocating virResctrlAllocPtr after confirm vcpus not overlap with existing ones. And virDomainResctrlVcpuMatch can be reused by other resource control technologies. virDomainResctrlVcpuMatch can clarify old vcpus overlap error whether an overlap or a redefinition. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 51 ++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d6314de..da8681d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18983,6 +18983,31 @@ virDomainResctrlParseVcpus(virDomainDefPtr def, static int +virDomainResctrlVcpuMatch(virDomainDefPtr def, + virBitmapPtr vcpus, + virResctrlAllocPtr *alloc) +{ + ssize_t i = 0; + + for (i = 0; i < def->nresctrls; i++) { + /* vcpus group has been created, directly use the existing one. + * Just updating memory allocation information of that group + */ + if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) { + *alloc = def->resctrls[i]->alloc; + break; + } + if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in resctrls")); + return -1; + } + } + return 0; +} + + +static int virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, xmlNodePtr node, virResctrlAllocPtr alloc) @@ -19065,7 +19090,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr oldnode = ctxt->node; xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; - virResctrlAllocPtr alloc = virResctrlAllocNew(); + virResctrlAllocPtr alloc = NULL; virDomainResctrlDefPtr tmp_resctrl = NULL; char *tmp = NULL; char *vcpus_str = NULL; @@ -19076,9 +19101,6 @@ virDomainCachetuneDefParse(virDomainDefPtr def, ctxt->node = node; - if (!alloc) - goto cleanup; - if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; @@ -19096,6 +19118,19 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } + if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) + goto cleanup; + + if (!alloc) { + alloc = virResctrlAllocNew(); + if (!alloc) + goto cleanup; + } else { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Identical vcpus in cachetunes found")); + goto cleanup; + } + for (i = 0; i < n; i++) { if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) goto cleanup; @@ -19106,14 +19141,6 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } - for (i = 0; i < def->nresctrls; i++) { - if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("Overlapping vcpus in cachetunes")); - goto cleanup; - } - } - /* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ VIR_FREE(vcpus_str); -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Factor out vcpus virDomainResctrlDef update from virDomainCachetuneDefParse and introduce virDomainResctrlAppend. virDomainResctrlAppend will format vcpus string and append a new virDomainResctrlDef to virDomainDefPtr. So that this logic can be reusable. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 93 +++++++++++++++++++++++++++++--------------------- 1 file changed, 55 insertions(+), 38 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index da8681d..f45def8 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19082,6 +19082,58 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, static int +virDomainResctrlAppend(virDomainDefPtr def, + xmlNodePtr node, + virResctrlAllocPtr alloc, + virBitmapPtr vcpus, + unsigned int flags) +{ + char *vcpus_str = NULL; + char *alloc_id = NULL; + virDomainResctrlDefPtr tmp_resctrl = NULL; + int ret = -1; + + if (VIR_ALLOC(tmp_resctrl) < 0) + goto cleanup; + + /* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ + vcpus_str = virBitmapFormat(vcpus); + if (!vcpus_str) + goto cleanup; + + if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) + alloc_id = virXMLPropString(node, "id"); + + if (!alloc_id) { + /* The number of allocations is limited and the directory structure is flat, + * not hierarchical, so we need to have all same allocations in one + * directory, so it's nice to have it named appropriately. For now it's + * 'vcpus_...' but it's designed in order for it to be changeable in the + * future (it's part of the status XML). */ + if (virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) < 0) + goto cleanup; + } + + if (virResctrlAllocSetID(alloc, alloc_id) < 0) + goto cleanup; + + tmp_resctrl->vcpus = vcpus; + tmp_resctrl->alloc = alloc; + + if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) + goto cleanup; + + ret = 0; + cleanup: + virDomainResctrlDefFree(tmp_resctrl); + VIR_FREE(alloc_id); + VIR_FREE(vcpus_str); + return ret; +} + + +static int virDomainCachetuneDefParse(virDomainDefPtr def, xmlXPathContextPtr ctxt, xmlNodePtr node, @@ -19091,19 +19143,12 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; - virDomainResctrlDefPtr tmp_resctrl = NULL; - char *tmp = NULL; - char *vcpus_str = NULL; - char *alloc_id = NULL; ssize_t i = 0; int n; int ret = -1; ctxt->node = node; - if (VIR_ALLOC(tmp_resctrl) < 0) - goto cleanup; - if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) goto cleanup; @@ -19141,45 +19186,17 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } - /* We need to format it back because we need to be consistent in the naming - * even when users specify some "sub-optimal" string there. */ - VIR_FREE(vcpus_str); - vcpus_str = virBitmapFormat(vcpus); - if (!vcpus_str) - goto cleanup; - - if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) - alloc_id = virXMLPropString(node, "id"); - - if (!alloc_id) { - /* The number of allocations is limited and the directory structure is flat, - * not hierarchical, so we need to have all same allocations in one - * directory, so it's nice to have it named appropriately. For now it's - * 'vcpus_...' but it's designed in order for it to be changeable in the - * future (it's part of the status XML). */ - if (virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) < 0) - goto cleanup; - } - - if (virResctrlAllocSetID(alloc, alloc_id) < 0) - goto cleanup; - - VIR_STEAL_PTR(tmp_resctrl->vcpus, vcpus); - VIR_STEAL_PTR(tmp_resctrl->alloc, alloc); - - if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) + if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) goto cleanup; + vcpus = NULL; + alloc = NULL; ret = 0; cleanup: ctxt->node = oldnode; - virDomainResctrlDefFree(tmp_resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); - VIR_FREE(alloc_id); - VIR_FREE(vcpus_str); VIR_FREE(nodes); - VIR_FREE(tmp); return ret; } -- 2.7.4

From: Bing Niu <bing.niu@intel.com> Introduce a new section memorytune to support memory bandwidth allocation. This is consistent with existing cachetune. As the example: below: <cputune> ...... <memorytune vcpus='0'> <node id='0' bandwidth='30'/> </memorytune> </cputune> vpus --- vpus subjected to this memory bandwidth. id --- on which node memory bandwidth to be set. bandwidth --- the memory bandwidth percent to set. Signed-off-by: Bing Niu <bing.niu@intel.com> --- docs/formatdomain.html.in | 39 +++- docs/schemas/domaincommon.rng | 17 ++ src/conf/domain_conf.c | 200 +++++++++++++++++++++ .../memorytune-colliding-allocs.xml | 30 ++++ .../memorytune-colliding-cachetune.xml | 32 ++++ tests/genericxml2xmlindata/memorytune.xml | 33 ++++ tests/genericxml2xmltest.c | 5 + 7 files changed, 355 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 19b7312..1ae6e98 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,10 @@ <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> </cachetune> + <memorytune vcpus='0-3'> + <node id='0' bandwidth='60'/> + </memorytune> + </cputune> ... </domain> @@ -932,7 +936,9 @@ size and required granularity are reported as well. The required attribute <code>vcpus</code> specifies to which vCPUs this allocation applies. A vCPU can only be member of one <code>cachetune</code> element - allocations. Supported subelements are: + allocations. The vCPUs specified by cachetune can be identical with those + in memorytune, however they are not allowed to overlap. + Supported subelements are: <dl> <dt><code>cache</code></dt> <dd> @@ -972,7 +978,38 @@ </dl> </dd> </dl> + </dd> + <dt><code>memorytune</code><span class="since">Since 4.7.0</span></dt> + <dd> + Optional <code>memorytune</code> element can control allocations for + memory bandwidth using the resctrl on the host. Whether or not is this + supported can be gathered from capabilities where some limitations like + minimum bandwidth and required granularity are reported as well. The + required attribute <code>vcpus</code> specifies to which vCPUs this + allocation applies. A vCPU can only be member of one + <code>memorytune</code> element allocations. The <code>vcpus</code> specified + by <code>memorytune</code> can be identical to those specified by + <code>cachetune</code>. However they are not allowed to overlap each other. + Supported subelements are: + <dl> + <dt><code>node</code></dt> + <dd> + This element controls the allocation of CPU memory bandwidth and has the + following attributes: + <dl> + <dt><code>id</code></dt> + <dd> + Host node id from which to allocate memory bandwidth. + </dd> + <dt><code>bandwidth</code></dt> + <dd> + The memory bandwidth to allocate from this node. The value by default + is in percentage. + </dd> + </dl> + </dd> + </dl> </dd> </dl> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index ac04af5..48f0637 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -983,6 +983,23 @@ </oneOrMore> </element> </zeroOrMore> + <zeroOrMore> + <element name="memorytune"> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + <oneOrMore> + <element name="node"> + <attribute name="id"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="bandwidth"> + <ref name='unsignedInt'/> + </attribute> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f45def8..d8333ad 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19290,6 +19290,129 @@ virDomainDefParseCaps(virDomainDefPtr def, } +static int +virDomainMemorytuneDefParseMemory(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ + xmlNodePtr oldnode = ctxt->node; + unsigned int id; + unsigned int bandwidth; + char *tmp = NULL; + int ret = -1; + + ctxt->node = node; + + tmp = virXMLPropString(node, "id"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing memorytune attribute 'id'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &id) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid memorytune attribute 'id' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "bandwidth"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing memorytune attribute 'bandwidth'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &bandwidth) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid memorytune attribute 'bandwidth' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + if (virResctrlAllocSetMemoryBandwidth(alloc, id, bandwidth) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(tmp); + return ret; +} + + +static int +virDomainMemorytuneDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + virBitmapPtr vcpus = NULL; + virResctrlAllocPtr alloc = NULL; + ssize_t i = 0; + int n; + int ret = -1; + bool new_alloc = false; + + ctxt->node = node; + + if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) + goto cleanup; + + if (virBitmapIsAllClear(vcpus)) { + ret = 0; + goto cleanup; + } + + if ((n = virXPathNodeSet("./node", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract memory nodes under memorytune")); + goto cleanup; + } + + if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) + goto cleanup; + + if (!alloc) { + alloc = virResctrlAllocNew(); + if (!alloc) + goto cleanup; + new_alloc = true; + } else { + alloc = virObjectRef(alloc); + } + + for (i = 0; i < n; i++) { + if (virDomainMemorytuneDefParseMemory(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } + if (virResctrlAllocIsEmpty(alloc)) { + ret = 0; + goto cleanup; + } + /* + * If this is a new allocation, format ID and append to resctrl, otherwise + * just update the existing alloc information, which is done in above + * virDomainMemorytuneDefParseMemory */ + if (new_alloc) { + if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) + goto cleanup; + vcpus = NULL; + alloc = NULL; + } + + ret = 0; + cleanup: + ctxt->node = oldnode; + virObjectUnref(alloc); + virBitmapFree(vcpus); + VIR_FREE(nodes); + return ret; +} + + static virDomainDefPtr virDomainDefParseXML(xmlDocPtr xml, xmlNodePtr root, @@ -19804,6 +19927,18 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes); + if ((n = virXPathNodeSet("./cputune/memorytune", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("cannot extract memorytune nodes")); + goto error; + } + + for (i = 0; i < n; i++) { + if (virDomainMemorytuneDefParse(def, ctxt, nodes[i], flags) < 0) + goto error; + } + VIR_FREE(nodes); + if (virCPUDefParseXML(ctxt, "./cpu[1]", VIR_CPU_TYPE_GUEST, &def->cpu) < 0) goto error; @@ -27104,6 +27239,68 @@ virDomainCachetuneDefFormat(virBufferPtr buf, static int +virDomainMemorytuneDefFormatHelper(unsigned int id, + unsigned int bandwidth, + void *opaque) +{ + virBufferPtr buf = opaque; + + virBufferAsprintf(buf, + "<node id='%u' bandwidth='%u'/>\n", + id, bandwidth); + return 0; +} + + +static int +virDomainMemorytuneDefFormat(virBufferPtr buf, + virDomainResctrlDefPtr resctrl, + unsigned int flags) +{ + virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; + char *vcpus = NULL; + int ret = -1; + + virBufferSetChildIndent(&childrenBuf, buf); + if (virResctrlAllocForeachMemory(resctrl->alloc, + virDomainMemorytuneDefFormatHelper, + &childrenBuf) < 0) + goto cleanup; + + if (virBufferCheckError(&childrenBuf) < 0) + goto cleanup; + + if (!virBufferUse(&childrenBuf)) { + ret = 0; + goto cleanup; + } + + vcpus = virBitmapFormat(resctrl->vcpus); + if (!vcpus) + goto cleanup; + + virBufferAsprintf(buf, "<memorytune vcpus='%s'", vcpus); + + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { + const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); + if (!alloc_id) + goto cleanup; + + virBufferAsprintf(buf, " id='%s'", alloc_id); + } + virBufferAddLit(buf, ">\n"); + + virBufferAddBuffer(buf, &childrenBuf); + virBufferAddLit(buf, "</memorytune>\n"); + + ret = 0; + cleanup: + virBufferFreeAndReset(&childrenBuf); + VIR_FREE(vcpus); + return ret; +} + +static int virDomainCputuneDefFormat(virBufferPtr buf, virDomainDefPtr def, unsigned int flags) @@ -27208,6 +27405,9 @@ virDomainCputuneDefFormat(virBufferPtr buf, for (i = 0; i < def->nresctrls; i++) virDomainCachetuneDefFormat(&childrenBuf, def->resctrls[i], flags); + for (i = 0; i < def->nresctrls; i++) + virDomainMemorytuneDefFormat(&childrenBuf, def->resctrls[i], flags); + if (virBufferCheckError(&childrenBuf) < 0) return -1; diff --git a/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml b/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml new file mode 100644 index 0000000..9b8ebaa --- /dev/null +++ b/tests/genericxml2xmlindata/memorytune-colliding-allocs.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <memorytune vcpus='0'> + <node id='0' bandwidth='50'/> + <node id='0' bandwidth='50'/> + </memorytune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml b/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml new file mode 100644 index 0000000..5416870 --- /dev/null +++ b/tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml @@ -0,0 +1,32 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='code' size='12' unit='KiB'/> + </cachetune> + <memorytune vcpus='0'> + <node id='0' bandwidth='50'/> + </memorytune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/memorytune.xml b/tests/genericxml2xmlindata/memorytune.xml new file mode 100644 index 0000000..ea03e22 --- /dev/null +++ b/tests/genericxml2xmlindata/memorytune.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <memorytune vcpus='0-1'> + <node id='0' bandwidth='20'/> + <node id='1' bandwidth='30'/> + </memorytune> + <memorytune vcpus='3'> + <node id='0' bandwidth='50'/> + </memorytune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index 7a4fc1e..e6d4ef2 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -140,6 +140,11 @@ mymain(void) TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST_FULL("cachetune-colliding-types", false, true, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST("memorytune"); + DO_TEST_FULL("memorytune-colliding-allocs", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("memorytune-colliding-cachetune", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); DO_TEST("tseg"); -- 2.7.4

On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
Introduce a new section memorytune to support memory bandwidth allocation. This is consistent with existing cachetune. As the example: below: <cputune> ...... <memorytune vcpus='0'> <node id='0' bandwidth='30'/> </memorytune> </cputune>
vpus --- vpus subjected to this memory bandwidth. id --- on which node memory bandwidth to be set. bandwidth --- the memory bandwidth percent to set.
Signed-off-by: Bing Niu <bing.niu@intel.com> --- docs/formatdomain.html.in | 39 +++- docs/schemas/domaincommon.rng | 17 ++ src/conf/domain_conf.c | 200 +++++++++++++++++++++ .../memorytune-colliding-allocs.xml | 30 ++++ .../memorytune-colliding-cachetune.xml | 32 ++++ tests/genericxml2xmlindata/memorytune.xml | 33 ++++ tests/genericxml2xmltest.c | 5 + 7 files changed, 355 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 19b7312..1ae6e98 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,10 @@ <cache id='0' level='3' type='both' size='3' unit='MiB'/> <cache id='1' level='3' type='both' size='3' unit='MiB'/> </cachetune> + <memorytune vcpus='0-3'> + <node id='0' bandwidth='60'/> + </memorytune> + </cputune> ... </domain> @@ -932,7 +936,9 @@ size and required granularity are reported as well. The required attribute <code>vcpus</code> specifies to which vCPUs this allocation applies. A vCPU can only be member of one <code>cachetune</code> element - allocations. Supported subelements are: + allocations. The vCPUs specified by cachetune can be identical with those
s/allocations/allocation (since we're touching the line anyway)
+ in memorytune, however they are not allowed to overlap. + Supported subelements are: <dl> <dt><code>cache</code></dt> <dd> @@ -972,7 +978,38 @@ </dl> </dd> </dl> + </dd>
+ <dt><code>memorytune</code><span class="since">Since 4.7.0</span></dt> + <dd> + Optional <code>memorytune</code> element can control allocations for + memory bandwidth using the resctrl on the host. Whether or not is this + supported can be gathered from capabilities where some limitations like + minimum bandwidth and required granularity are reported as well. The + required attribute <code>vcpus</code> specifies to which vCPUs this + allocation applies. A vCPU can only be member of one + <code>memorytune</code> element allocations. The <code>vcpus</code> specified
s/allocations/allocation/
+ by <code>memorytune</code> can be identical to those specified by + <code>cachetune</code>. However they are not allowed to overlap each other. + Supported subelements are: + <dl> + <dt><code>node</code></dt> + <dd> + This element controls the allocation of CPU memory bandwidth and has the + following attributes: + <dl> + <dt><code>id</code></dt> + <dd> + Host node id from which to allocate memory bandwidth. + </dd> + <dt><code>bandwidth</code></dt> + <dd> + The memory bandwidth to allocate from this node. The value by default + is in percentage. + </dd> + </dl> + </dd> + </dl> </dd> </dl>
[...] I'll fix in my branch before pushing John

From: Bing Niu <bing.niu@intel.com> Add return value check to virResctrlAllocForeachCache in virDomainCachetuneDefFormat. The virResctrlAllocForeachCache dose have return value, so need check return value to make sure function executed without error. Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d8333ad..095c903 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -27199,10 +27199,10 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); - virResctrlAllocForeachCache(resctrl->alloc, - virDomainCachetuneDefFormatHelper, - &childrenBuf); - + if (virResctrlAllocForeachCache(resctrl->alloc, + virDomainCachetuneDefFormatHelper, + &childrenBuf) < 0) + goto cleanup; if (virBufferCheckError(&childrenBuf) < 0) goto cleanup; -- 2.7.4

On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
Add return value check to virResctrlAllocForeachCache in virDomainCachetuneDefFormat. The virResctrlAllocForeachCache dose have
s/dose/does/
return value, so need check return value to make sure function executed without error.
Signed-off-by: Bing Niu <bing.niu@intel.com> --- src/conf/domain_conf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
I'll fix in my branch before pushing John

From: Bing Niu <bing.niu@intel.com> Add new XML section to report host's memory bandwidth allocation capability. The format as below example: <host> ..... <memory_bandwidth> <node id='0' cpus='0-19'> <control granularity='10' min ='10' maxAllocs='8'/> </node> </memory_bandwidth> </host> granularity ---- granularity of memory bandwidth, unit percentage. min ---- minimum memory bandwidth allowed, unit percentage. maxAllocs ---- maximum memory bandwidth allocation group supported. Signed-off-by: Bing Niu <bing.niu@intel.com> --- docs/schemas/capability.rng | 33 +++++++ src/conf/capabilities.c | 107 +++++++++++++++++++++ src/conf/capabilities.h | 11 +++ src/util/virresctrl.c | 20 ++++ src/util/virresctrl.h | 15 +++ .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 ++ tests/virresctrldata/resctrl.schemata | 1 + 10 files changed, 198 insertions(+) create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids diff --git a/docs/schemas/capability.rng b/docs/schemas/capability.rng index 52164d5..d61515c 100644 --- a/docs/schemas/capability.rng +++ b/docs/schemas/capability.rng @@ -51,6 +51,9 @@ <optional> <ref name='cache'/> </optional> + <optional> + <ref name='memory_bandwidth'/> + </optional> <zeroOrMore> <ref name='secmodel'/> </zeroOrMore> @@ -326,6 +329,36 @@ </attribute> </define> + <define name='memory_bandwidth'> + <element name='memory_bandwidth'> + <oneOrMore> + <element name='node'> + <attribute name='id'> + <ref name='unsignedInt'/> + </attribute> + <attribute name='cpus'> + <ref name='cpuset'/> + </attribute> + <zeroOrMore> + <element name='control'> + <attribute name='granularity'> + <ref name='unsignedInt'/> + </attribute> + <optional> + <attribute name='min'> + <ref name='unsignedInt'/> + </attribute> + </optional> + <attribute name='maxAllocs'> + <ref name='unsignedInt'/> + </attribute> + </element> + </zeroOrMore> + </element> + </oneOrMore> + </element> + </define> + <define name='guestcaps'> <element name='guest'> <ref name='ostype'/> diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index 0f96500..ef2ca81 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -198,6 +198,16 @@ virCapabilitiesFreeNUMAInfo(virCapsPtr caps) } static void +virCapsHostMemBWNodeFree(virCapsHostMemBWNodePtr ptr) +{ + if (!ptr) + return; + + virBitmapFree(ptr->cpus); + VIR_FREE(ptr); +} + +static void virCapabilitiesClearSecModel(virCapsHostSecModelPtr secmodel) { size_t i; @@ -239,6 +249,10 @@ virCapsDispose(void *object) virCapsHostCacheBankFree(caps->host.caches[i]); VIR_FREE(caps->host.caches); + for (i = 0; i < caps->host.nnodes; i++) + virCapsHostMemBWNodeFree(caps->host.nodes[i]); + VIR_FREE(caps->host.nodes); + VIR_FREE(caps->host.netprefix); VIR_FREE(caps->host.pagesSize); virCPUDefFree(caps->host.cpu); @@ -957,6 +971,58 @@ virCapabilitiesFormatCaches(virBufferPtr buf, return 0; } +static int +virCapabilitiesFormatMemoryBandwidth(virBufferPtr buf, + size_t nnodes, + virCapsHostMemBWNodePtr *nodes) +{ + size_t i = 0; + virBuffer controlBuf = VIR_BUFFER_INITIALIZER; + + if (!nnodes) + return 0; + + virBufferAddLit(buf, "<memory_bandwidth>\n"); + virBufferAdjustIndent(buf, 2); + + for (i = 0; i < nnodes; i++) { + virCapsHostMemBWNodePtr node = nodes[i]; + virResctrlInfoMemBWPerNodePtr control = &node->control; + char *cpus_str = virBitmapFormat(node->cpus); + + if (!cpus_str) + return -1; + + virBufferAsprintf(buf, + "<node id='%u' cpus='%s'", + node->id, cpus_str); + VIR_FREE(cpus_str); + + virBufferSetChildIndent(&controlBuf, buf); + virBufferAsprintf(&controlBuf, + "<control granularity='%u' min ='%u' " + "maxAllocs='%u'/>\n", + control->granularity, control->min, + control->max_allocation); + + if (virBufferCheckError(&controlBuf) < 0) + return -1; + + if (virBufferUse(&controlBuf)) { + virBufferAddLit(buf, ">\n"); + virBufferAddBuffer(buf, &controlBuf); + virBufferAddLit(buf, "</node>\n"); + } else { + virBufferAddLit(buf, "/>\n"); + } + } + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</memory_bandwidth>\n"); + + return 0; +} + /** * virCapabilitiesFormatXML: * @caps: capabilities to format @@ -1060,6 +1126,10 @@ virCapabilitiesFormatXML(virCapsPtr caps) caps->host.caches) < 0) goto error; + if (virCapabilitiesFormatMemoryBandwidth(&buf, caps->host.nnodes, + caps->host.nodes) < 0) + goto error; + for (i = 0; i < caps->host.nsecModels; i++) { virBufferAddLit(&buf, "<secmodel>\n"); virBufferAdjustIndent(&buf, 2); @@ -1602,6 +1672,40 @@ virCapabilitiesInitResctrl(virCapsPtr caps) } +static int +virCapabilitiesInitResctrlMemory(virCapsPtr caps) +{ + virCapsHostMemBWNodePtr node = NULL; + size_t i = 0; + int ret = -1; + + for (i = 0; i < caps->host.ncaches; i++) { + virCapsHostCacheBankPtr bank = caps->host.caches[i]; + if (VIR_ALLOC(node) < 0) + goto cleanup; + + if (virResctrlInfoGetMemoryBandwidth(caps->host.resctrl, + bank->level, &node->control) > 0) { + node->id = bank->id; + if (!(node->cpus = virBitmapNewCopy(bank->cpus))) + goto cleanup; + + if (VIR_APPEND_ELEMENT(caps->host.nodes, + caps->host.nnodes, node) < 0) { + goto cleanup; + } + } + virCapsHostMemBWNodeFree(node); + node = NULL; + } + + ret = 0; + cleanup: + virCapsHostMemBWNodeFree(node); + return ret; +} + + int virCapabilitiesInitCaches(virCapsPtr caps) { @@ -1731,6 +1835,9 @@ virCapabilitiesInitCaches(virCapsPtr caps) qsort(caps->host.caches, caps->host.ncaches, sizeof(*caps->host.caches), virCapsHostCacheBankSorter); + if (virCapabilitiesInitResctrlMemory(caps) < 0) + goto cleanup; + ret = 0; cleanup: VIR_FREE(type); diff --git a/src/conf/capabilities.h b/src/conf/capabilities.h index fe1b9ea..046e275 100644 --- a/src/conf/capabilities.h +++ b/src/conf/capabilities.h @@ -151,6 +151,14 @@ struct _virCapsHostCacheBank { virResctrlInfoPerCachePtr *controls; }; +typedef struct _virCapsHostMemBWNode virCapsHostMemBWNode; +typedef virCapsHostMemBWNode *virCapsHostMemBWNodePtr; +struct _virCapsHostMemBWNode { + unsigned int id; + virBitmapPtr cpus; /* All CPUs that belong to this node*/ + virResctrlInfoMemBWPerNode control; +}; + typedef struct _virCapsHost virCapsHost; typedef virCapsHost *virCapsHostPtr; struct _virCapsHost { @@ -175,6 +183,9 @@ struct _virCapsHost { size_t ncaches; virCapsHostCacheBankPtr *caches; + size_t nnodes; + virCapsHostMemBWNodePtr *nodes; + size_t nsecModels; virCapsHostSecModelPtr secModels; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f25e7bc..4d5adcc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -630,6 +630,26 @@ virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) int +virResctrlInfoGetMemoryBandwidth(virResctrlInfoPtr resctrl, + unsigned int level, + virResctrlInfoMemBWPerNodePtr control) +{ + virResctrlInfoMemBWPtr membw_info = resctrl->membw_info; + + if (!membw_info) + return 0; + + if (membw_info->last_level_cache != level) + return 0; + + control->granularity = membw_info->bandwidth_granularity; + control->min = membw_info->min_bandwidth; + control->max_allocation = membw_info->max_allocation; + return 1; +} + + +int virResctrlInfoGetCache(virResctrlInfoPtr resctrl, unsigned int level, unsigned long long size, diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 8d62517..cfd56dd 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -50,6 +50,17 @@ struct _virResctrlInfoPerCache { unsigned int max_allocation; }; +typedef struct _virResctrlInfoMemBWPerNode virResctrlInfoMemBWPerNode; +typedef virResctrlInfoMemBWPerNode *virResctrlInfoMemBWPerNodePtr; +struct _virResctrlInfoMemBWPerNode { + /* Smallest possible increase of the allocation bandwidth in percentage */ + unsigned int granularity; + /* Minimal allocatable bandwidth in percentage */ + unsigned int min; + /* Maximum number of simultaneous allocations */ + unsigned int max_allocation; +}; + typedef struct _virResctrlInfo virResctrlInfo; typedef virResctrlInfo *virResctrlInfoPtr; @@ -63,6 +74,10 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, size_t *ncontrols, virResctrlInfoPerCachePtr **controls); +int +virResctrlInfoGetMemoryBandwidth(virResctrlInfoPtr resctrl, + unsigned int level, + virResctrlInfoMemBWPerNodePtr control); /* Alloc-related things */ typedef struct _virResctrlAlloc virResctrlAlloc; typedef virResctrlAlloc *virResctrlAllocPtr; diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran new file mode 100644 index 0000000..f599e28 --- /dev/null +++ b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran @@ -0,0 +1 @@ +10 diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth new file mode 100644 index 0000000..f599e28 --- /dev/null +++ b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth @@ -0,0 +1 @@ +10 diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids new file mode 100644 index 0000000..b8626c4 --- /dev/null +++ b/tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids @@ -0,0 +1 @@ +4 diff --git a/tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml b/tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml index 4840614..9b00cf0 100644 --- a/tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml +++ b/tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml @@ -49,6 +49,14 @@ <control granularity='768' min='1536' unit='KiB' type='both' maxAllocs='4'/> </bank> </cache> + <memory_bandwidth> + <node id='0' cpus='0-5'> + <control granularity='10' min ='10' maxAllocs='4'/> + </node> + <node id='1' cpus='6-11'> + <control granularity='10' min ='10' maxAllocs='4'/> + </node> + </memory_bandwidth> </host> </capabilities> diff --git a/tests/virresctrldata/resctrl.schemata b/tests/virresctrldata/resctrl.schemata index fa980e5..2578822 100644 --- a/tests/virresctrldata/resctrl.schemata +++ b/tests/virresctrldata/resctrl.schemata @@ -1 +1,2 @@ L3:0=000ff;1=000f0 +MB:0=100;1=100 -- 2.7.4

On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation.
The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate control over memory bandwidth available per-core. This feature provides a method to control applications which may be over-utilizing bandwidth relative to their priority in environments such as the data-center. The details can be found in Intel's SDM 17.19.7. Kernel supports MBA through resctrl file system same as CAT. Each resctrl group have a MB parameter to control how much memory bandwidth it can utilize in unit of percentage.
In this series, MBA is enabled by enhancing existing virresctrl implementation. The policy employed for MBA is similar with CAT: The sum of each MBA group's bandwidth dose not exceed 100%.
The enhancement of virresctrl include two main parts:
Part 1: Add two new structures virResctrlInfoMemMB and virResctrlAllocMemBW for collecting host system MBA capability and domain memory bandwidth allocation. Those two structures are the extension of existing virResctrlInfo and virResctrlAlloc. With them, virresctrl framework can support MBA and CAT concurrently. Each virResctrlAlloc represent a resource allocation including CAT, or MBA, or CAT&MBA. The policy of MBA is that: total memory bandwidth of each resctrl group, created by virresctrl, does not exceed to 100%.
Part 2: On XML part, add new elements to host capabilities query and domain allocation to support memory bandwidth allocation. --------------------------------------------------------------------------------------------- For host capabilities XML, new XML format like below example, <host> ..... <memory_bandwidth> <node id='0' cpus='0-19'> <control granularity='10' min ='10' maxAllocs='8'/> </node> </memory_bandwidth> </host>
granularity --- memory bandwidth granularity min --- minimum memory bandwidth allowed maxAllocs --- maximum concurrent memory bandwidth allocation allowed.
--------------------------------------------------------------------------------------------- For domain XML, new format as below example <domain type='kvm' id='2'> ...... <cputune> ...... <shares>1024</shares> <memorytune vcpus='0-1'> <node id='0' bandwidth='20'/> </memorytune> </cputune> ...... </domain>
id --- node where memory bandwidth allocation will happen bandwidth --- bandwidth allocated in percentage
----------------------------------------------------------------------------------------------
With this extension of the virresctrl, the overall working follow of CAT and MBA is described by below picture. XML parser will aggregate MBA and CAT configuration and represents it in one virresctrl object. The methods of virresctrl class will manipulate resctrl interface to allocate corresponding resources.
<memorytune cpus='0-3'> +---------+ | <cachetune vcpus='0-3'> XML | + parser +-----------+ | | +------------------------------+ | | internal object +------v--------------+ virResctrlAlloc | backing object | +------+--------------+ | | +------------------------------+ | +--v-------+ | | | schemata | /sys/fs/resctrl | tasks | | . | | . | | | +----------+ ---------------------------------------------------------------------
previous versions and discussion can be found at v1: https://www.redhat.com/archives/libvir-list/2018-July/msg01144.html RFC v2: https://www.redhat.com/archives/libvir-list/2018-June/msg01268.html RFC v1: https://www.redhat.com/archives/libvir-list/2018-May/msg02101.html
Changelog: v1 -> this: John's comment: 1. Split calculation of number of memory bandwidth control to one patch. 2. Split virResctrlAllocMemBW relating methods to 5 patch, each provides one kind of function, eg: schemata processing, memory bandwidth calculation..... 3. Use resctrl to replace cachetune in domain conf. 4. Split refactor virDomainCachetuneDefParse into 3 patches. And adjust some logic, eg: use %s format error log, renaming functions..... 5. Complete doc description. eg: update cachetune part about vcpus overlapping with memorytune, update libvirt version info for memory bandwidth control availability. 6. Some coding style fix.
RFC_v2->v1: John's comment: 1. use name MemBW to replace MB for a more clear description. 2. split rename patch and put refactor function part separately. 3. split virResctrlInfoMemMB and virResctrlAllocMemBW to different patches. 4. add docs/schemas/*.rng for XML related patches. 5. some cleanup for coding conventions. RFC_ v1->RFC_v2: Jano's comment: 1. put renaming parts into separated patches. 2. set the initial return value as -1. 3. using full name in structure definition. 4. do not use VIR_CACHE_TYPE_LAST for memory bandwidth allocation formatting.
Pavel's comment: 1. add host capabilities XML for memory bandwidth allocation. 2. do not mix use cachetune section in XML for memory bandwidth allocation in domain XML. define a dedicated one for memory bandwidth allocation.
Bing Niu (17): util: Rename some functions of virresctrl util: Refactor virResctrlGetInfo in virresctrl util: Refactor virResctrlAllocFormat of virresctrl util: Add MBA capability information query to resctrl util: Add MBA check to virResctrlInfoGetCache util: Add MBA allocation to virresctrl util: Add MBA schemata parse and format methods util: Add support to calculate MBA utilization util: Introduce virResctrlAllocForeachMemory util: Introduce virResctrlAllocSetMemoryBandwidth conf: Rename cachetune to resctrl conf: Factor out vcpus parsing part from virDomainCachetuneDefParse conf: Factor out vcpus overlapping from virDomainCachetuneDefParse conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse conf: Add support for memorytune XML processing for resctrl MBA conf: Add return value check to virResctrlAllocForeachCache conf: Add memory bandwidth allocation capability of host
docs/formatdomain.html.in | 39 +- docs/schemas/capability.rng | 33 ++ docs/schemas/domaincommon.rng | 17 + src/conf/capabilities.c | 107 ++++ src/conf/capabilities.h | 11 + src/conf/domain_conf.c | 428 ++++++++++++--- src/conf/domain_conf.h | 10 +- src/libvirt_private.syms | 6 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- src/util/virresctrl.c | 611 +++++++++++++++++++-- src/util/virresctrl.h | 55 +- .../memorytune-colliding-allocs.xml | 30 + .../memorytune-colliding-cachetune.xml | 32 ++ tests/genericxml2xmlindata/memorytune.xml | 33 ++ tests/genericxml2xmltest.c | 5 + .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 + tests/virresctrldata/resctrl.schemata | 1 + 21 files changed, 1280 insertions(+), 169 deletions(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids
Reviewed-by: John Ferlan <jferlan@redhat.com> (series) I'll push once the tree is open for 4.7.0 commits unless someone else chimes in with other major issues that need to be addressed. John

On 2018年07月31日 06:14, John Ferlan wrote:
On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation.
[....]
Bing Niu (17): util: Rename some functions of virresctrl util: Refactor virResctrlGetInfo in virresctrl util: Refactor virResctrlAllocFormat of virresctrl util: Add MBA capability information query to resctrl util: Add MBA check to virResctrlInfoGetCache util: Add MBA allocation to virresctrl util: Add MBA schemata parse and format methods util: Add support to calculate MBA utilization util: Introduce virResctrlAllocForeachMemory util: Introduce virResctrlAllocSetMemoryBandwidth conf: Rename cachetune to resctrl conf: Factor out vcpus parsing part from virDomainCachetuneDefParse conf: Factor out vcpus overlapping from virDomainCachetuneDefParse conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse conf: Add support for memorytune XML processing for resctrl MBA conf: Add return value check to virResctrlAllocForeachCache conf: Add memory bandwidth allocation capability of host
docs/formatdomain.html.in | 39 +- docs/schemas/capability.rng | 33 ++ docs/schemas/domaincommon.rng | 17 + src/conf/capabilities.c | 107 ++++ src/conf/capabilities.h | 11 + src/conf/domain_conf.c | 428 ++++++++++++--- src/conf/domain_conf.h | 10 +- src/libvirt_private.syms | 6 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- src/util/virresctrl.c | 611 +++++++++++++++++++-- src/util/virresctrl.h | 55 +- .../memorytune-colliding-allocs.xml | 30 + .../memorytune-colliding-cachetune.xml | 32 ++ tests/genericxml2xmlindata/memorytune.xml | 33 ++ tests/genericxml2xmltest.c | 5 + .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 + tests/virresctrldata/resctrl.schemata | 1 + 21 files changed, 1280 insertions(+), 169 deletions(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids
Reviewed-by: John Ferlan <jferlan@redhat.com> (series)
I'll push once the tree is open for 4.7.0 commits unless someone else chimes in with other major issues that need to be addressed.
Thanks for this :). Bing
John

On 07/30/2018 11:54 PM, bing.niu wrote:
On 2018年07月31日 06:14, John Ferlan wrote:
On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation.
[....]
Bing Niu (17): util: Rename some functions of virresctrl util: Refactor virResctrlGetInfo in virresctrl util: Refactor virResctrlAllocFormat of virresctrl util: Add MBA capability information query to resctrl util: Add MBA check to virResctrlInfoGetCache util: Add MBA allocation to virresctrl util: Add MBA schemata parse and format methods util: Add support to calculate MBA utilization util: Introduce virResctrlAllocForeachMemory util: Introduce virResctrlAllocSetMemoryBandwidth conf: Rename cachetune to resctrl conf: Factor out vcpus parsing part from virDomainCachetuneDefParse conf: Factor out vcpus overlapping from virDomainCachetuneDefParse conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse conf: Add support for memorytune XML processing for resctrl MBA conf: Add return value check to virResctrlAllocForeachCache conf: Add memory bandwidth allocation capability of host
docs/formatdomain.html.in | 39 +- docs/schemas/capability.rng | 33 ++ docs/schemas/domaincommon.rng | 17 + src/conf/capabilities.c | 107 ++++ src/conf/capabilities.h | 11 + src/conf/domain_conf.c | 428 ++++++++++++--- src/conf/domain_conf.h | 10 +- src/libvirt_private.syms | 6 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- src/util/virresctrl.c | 611 +++++++++++++++++++-- src/util/virresctrl.h | 55 +- .../memorytune-colliding-allocs.xml | 30 + .../memorytune-colliding-cachetune.xml | 32 ++ tests/genericxml2xmlindata/memorytune.xml | 33 ++ tests/genericxml2xmltest.c | 5 + .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 + tests/virresctrldata/resctrl.schemata | 1 + 21 files changed, 1280 insertions(+), 169 deletions(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids
Reviewed-by: John Ferlan <jferlan@redhat.com> (series)
I'll push once the tree is open for 4.7.0 commits unless someone else chimes in with other major issues that need to be addressed.
Tree re-opened after I left for a week away from the virtual world... Now that I'm back and digging out of email, figured I'd sync this series up with current top and push. Please post a followup docs/news.xml article describing the change. Tks - John

On 2018年08月14日 02:33, John Ferlan wrote:
On 07/30/2018 11:54 PM, bing.niu wrote:
On 2018年07月31日 06:14, John Ferlan wrote:
On 07/29/2018 11:12 PM, bing.niu@intel.com wrote:
From: Bing Niu <bing.niu@intel.com>
This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation.
[....]
Bing Niu (17): util: Rename some functions of virresctrl util: Refactor virResctrlGetInfo in virresctrl util: Refactor virResctrlAllocFormat of virresctrl util: Add MBA capability information query to resctrl util: Add MBA check to virResctrlInfoGetCache util: Add MBA allocation to virresctrl util: Add MBA schemata parse and format methods util: Add support to calculate MBA utilization util: Introduce virResctrlAllocForeachMemory util: Introduce virResctrlAllocSetMemoryBandwidth conf: Rename cachetune to resctrl conf: Factor out vcpus parsing part from virDomainCachetuneDefParse conf: Factor out vcpus overlapping from virDomainCachetuneDefParse conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse conf: Add support for memorytune XML processing for resctrl MBA conf: Add return value check to virResctrlAllocForeachCache conf: Add memory bandwidth allocation capability of host
docs/formatdomain.html.in | 39 +- docs/schemas/capability.rng | 33 ++ docs/schemas/domaincommon.rng | 17 + src/conf/capabilities.c | 107 ++++ src/conf/capabilities.h | 11 + src/conf/domain_conf.c | 428 ++++++++++++--- src/conf/domain_conf.h | 10 +- src/libvirt_private.syms | 6 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- src/util/virresctrl.c | 611 +++++++++++++++++++-- src/util/virresctrl.h | 55 +- .../memorytune-colliding-allocs.xml | 30 + .../memorytune-colliding-cachetune.xml | 32 ++ tests/genericxml2xmlindata/memorytune.xml | 33 ++ tests/genericxml2xmltest.c | 5 + .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth | 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 + tests/virresctrldata/resctrl.schemata | 1 + 21 files changed, 1280 insertions(+), 169 deletions(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids
Reviewed-by: John Ferlan <jferlan@redhat.com> (series)
I'll push once the tree is open for 4.7.0 commits unless someone else chimes in with other major issues that need to be addressed.
Tree re-opened after I left for a week away from the virtual world... Now that I'm back and digging out of email, figured I'd sync this series up with current top and push.
Please post a followup docs/news.xml article describing the change.
Thanks for this. I will cook a patch for this. :)
Tks -
John
participants (3)
-
bing.niu
-
bing.niu@intel.com
-
John Ferlan