[libvirt] [PATCH 00/10] Cache Allocation Technology support, this time for real (??)

So again all the reviews are incorporated. I managed to get my hands on a machine with CAT support, so I also found out some things that needed tweaking. @Eli: I would still appreciate you helping with some testing as I wasn't able to find a machine with multiple caches (sockets) and I couldn't test that properly (mainly the fact that full mask needs to be specified for caches for which we don't want any allocation to happen). Martin Kletzander (10): Rename virResctrlInfo to virResctrlInfoPerCache util: Add virResctrlInfo conf: Use virResctrlInfo in capabilities util: Remove now-unneeded resctrl functions resctrl: Add functions to work with resctrl allocations conf: Add support for cputune/cachetune tests: Add virresctrltest qemu: Add support for resctrl docs: Add CAT (resctrl) support into news.xml tests: Clean up and modify some vircaps2xmldata docs/formatdomain.html.in | 54 + docs/news.xml | 9 + docs/schemas/domaincommon.rng | 32 + po/POTFILES.in | 1 + src/Makefile.am | 2 +- src/conf/capabilities.c | 55 +- src/conf/capabilities.h | 4 +- src/conf/domain_conf.c | 295 +++- src/conf/domain_conf.h | 13 + src/libvirt_private.syms | 15 +- src/qemu/qemu_process.c | 60 +- src/util/virresctrl.c | 1542 ++++++++++++++++++-- src/util/virresctrl.h | 73 +- src/util/virresctrlpriv.h | 27 + tests/Makefile.am | 9 +- tests/genericxml2xmlindata/cachetune-cdp.xml | 36 + .../cachetune-colliding-allocs.xml | 30 + .../cachetune-colliding-tunes.xml | 32 + .../cachetune-colliding-types.xml | 30 + tests/genericxml2xmlindata/cachetune-small.xml | 29 + tests/genericxml2xmlindata/cachetune.xml | 33 + tests/genericxml2xmltest.c | 10 + .../vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus | 1 - .../linux-resctrl-cdp/resctrl/tasks | 0 .../linux-resctrl-skx-twocaches/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/cpus | 1 - .../vircaps2xmldata/linux-resctrl/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/tasks | 0 tests/virresctrldata/resctrl-cdp.schemata | 2 + .../virresctrldata/resctrl-skx-twocaches.schemata | 1 + tests/virresctrldata/resctrl-skx.schemata | 1 + tests/virresctrldata/resctrl.schemata | 1 + tests/virresctrltest.c | 102 ++ 33 files changed, 2363 insertions(+), 141 deletions(-) create mode 100644 src/util/virresctrlpriv.h create mode 100644 tests/genericxml2xmlindata/cachetune-cdp.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-tunes.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-types.xml create mode 100644 tests/genericxml2xmlindata/cachetune-small.xml create mode 100644 tests/genericxml2xmlindata/cachetune.xml delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/tasks create mode 100644 tests/virresctrldata/resctrl-cdp.schemata create mode 100644 tests/virresctrldata/resctrl-skx-twocaches.schemata create mode 100644 tests/virresctrldata/resctrl-skx.schemata create mode 100644 tests/virresctrldata/resctrl.schemata create mode 100644 tests/virresctrltest.c -- 2.16.1

Just to ease the review of following patches. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/capabilities.c | 2 +- src/conf/capabilities.h | 2 +- src/util/virresctrl.c | 4 ++-- src/util/virresctrl.h | 8 ++++---- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index 798c9bdaeaf3..e93eaed2f0e0 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -904,7 +904,7 @@ virCapabilitiesFormatCaches(virBufferPtr buf, virBufferSetChildIndent(&controlBuf, buf); for (j = 0; j < bank->ncontrols; j++) { const char *min_unit; - virResctrlInfoPtr controls = bank->controls[j]; + virResctrlInfoPerCachePtr controls = bank->controls[j]; unsigned long long gran_short_size = controls->granularity; unsigned long long min_short_size = controls->min; diff --git a/src/conf/capabilities.h b/src/conf/capabilities.h index 5048fa819d95..27b88cb5edfa 100644 --- a/src/conf/capabilities.h +++ b/src/conf/capabilities.h @@ -148,7 +148,7 @@ struct _virCapsHostCacheBank { virCacheType type; /* Data, Instruction or Unified */ virBitmapPtr cpus; /* All CPUs that share this bank */ size_t ncontrols; - virResctrlInfoPtr *controls; + virResctrlInfoPerCachePtr *controls; }; typedef struct _virCapsHost virCapsHost; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 2a11825a52dc..050a08178e24 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -59,7 +59,7 @@ int virResctrlGetCacheInfo(unsigned int level, unsigned long long size, virCacheType scope, - virResctrlInfoPtr **controls, + virResctrlInfoPerCachePtr **controls, size_t *ncontrols) { int ret = -1; @@ -69,7 +69,7 @@ virResctrlGetCacheInfo(unsigned int level, char *type_upper = NULL; unsigned int bits = 0; unsigned int min_cbm_bits = 0; - virResctrlInfoPtr control; + virResctrlInfoPerCachePtr control; if (VIR_ALLOC(control) < 0) goto cleanup; diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 848b13e98aa3..42e852780318 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -36,9 +36,9 @@ typedef enum { VIR_ENUM_DECL(virCache); -typedef struct _virResctrlInfo virResctrlInfo; -typedef virResctrlInfo *virResctrlInfoPtr; -struct _virResctrlInfo { +typedef struct _virResctrlInfoPerCache virResctrlInfoPerCache; +typedef virResctrlInfoPerCache *virResctrlInfoPerCachePtr; +struct _virResctrlInfoPerCache { /* Smallest possible increase of the allocation size in bytes */ unsigned long long granularity; /* Minimal allocatable size in bytes (if different from granularity) */ @@ -54,7 +54,7 @@ int virResctrlGetCacheInfo(unsigned int level, unsigned long long size, virCacheType scope, - virResctrlInfoPtr **controls, + virResctrlInfoPerCachePtr **controls, size_t *ncontrols); int -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:10PM +0100, Martin Kletzander wrote:
Just to ease the review of following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/capabilities.c | 2 +- src/conf/capabilities.h | 2 +- src/util/virresctrl.c | 4 ++-- src/util/virresctrl.h | 8 ++++---- 4 files changed, 8 insertions(+), 8 deletions(-)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

This will make the current functions obsolete and it will provide more information to the virresctrl module so that it can be used later. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- po/POTFILES.in | 1 + src/libvirt_private.syms | 3 + src/util/virresctrl.c | 305 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 19 +++ 4 files changed, 328 insertions(+) diff --git a/po/POTFILES.in b/po/POTFILES.in index c1fa23427eff..8382ee633621 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -252,6 +252,7 @@ src/util/virportallocator.c src/util/virprocess.c src/util/virqemu.c src/util/virrandom.c +src/util/virresctrl.c src/util/virrotatingfile.c src/util/virscsi.c src/util/virscsihost.c diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc8cc1fba9e3..e951a5495b3f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2554,6 +2554,9 @@ virCacheTypeFromString; virCacheTypeToString; virResctrlGetCacheControlType; virResctrlGetCacheInfo; +virResctrlGetInfo; +virResctrlInfoGetCache; +virResctrlInfoNew; # util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 050a08178e24..25c11a1314dc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -25,12 +25,15 @@ #include "viralloc.h" #include "virfile.h" #include "virlog.h" +#include "virobject.h" #include "virstring.h" #define VIR_FROM_THIS VIR_FROM_RESCTRL VIR_LOG_INIT("util.virresctrl") + +/* Common definitions */ #define SYSFS_RESCTRL_PATH "/sys/fs/resctrl" /* Resctrl is short for Resource Control. It might be implemented for various @@ -55,6 +58,308 @@ VIR_ENUM_IMPL(virResctrl, VIR_CACHE_TYPE_LAST, "CODE", "DATA") + +/* Info-related definitions and InfoClass-related functions */ +typedef struct _virResctrlInfoPerType virResctrlInfoPerType; +typedef virResctrlInfoPerType *virResctrlInfoPerTypePtr; +struct _virResctrlInfoPerType { + /* Kernel-provided information */ + char *cbm_mask; + unsigned int min_cbm_bits; + + /* Our computed information from the above */ + unsigned int bits; + unsigned int max_cache_id; + + /* In order to be self-sufficient we need size information per cache. + * Funnily enough, one of the outcomes of the resctrlfs design is that it + * does not account for different sizes per cache on the same level. So + * for the sake of easiness, let's copy that, for now. */ + unsigned long long size; + + /* Information that we will return upon request (this is public struct) as + * until now all the above is internal to this module */ + virResctrlInfoPerCache control; +}; + +typedef struct _virResctrlInfoPerLevel virResctrlInfoPerLevel; +typedef virResctrlInfoPerLevel *virResctrlInfoPerLevelPtr; +struct _virResctrlInfoPerLevel { + virResctrlInfoPerTypePtr *types; +}; + +struct _virResctrlInfo { + virObject parent; + + virResctrlInfoPerLevelPtr *levels; + size_t nlevels; +}; + +static virClassPtr virResctrlInfoClass; + +static void +virResctrlInfoDispose(void *obj) +{ + size_t i = 0; + size_t j = 0; + + virResctrlInfoPtr resctrl = obj; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlInfoPerLevelPtr level = resctrl->levels[i]; + + if (!level) + continue; + + if (level->types) { + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + if (level->types[j]) + VIR_FREE(level->types[j]->cbm_mask); + VIR_FREE(level->types[j]); + } + } + VIR_FREE(level->types); + VIR_FREE(level); + } + + VIR_FREE(resctrl->levels); +} + + +static int +virResctrlInfoOnceInit(void) +{ + if (!(virResctrlInfoClass = virClassNew(virClassForObject(), + "virResctrlInfo", + sizeof(virResctrlInfo), + virResctrlInfoDispose))) + return -1; + + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virResctrlInfo) + + +virResctrlInfoPtr +virResctrlInfoNew(void) +{ + if (virResctrlInfoInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlInfoClass); +} + + +/* Info-related functions */ +bool +virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) +{ + size_t i = 0; + size_t j = 0; + + if (!resctrl) + return true; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlInfoPerLevelPtr i_level = resctrl->levels[i]; + + if (!i_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + if (i_level->types[j]) + return false; + } + } + + return true; +} + + +int +virResctrlGetInfo(virResctrlInfoPtr resctrl) +{ + DIR *dirp = NULL; + char *info_path = NULL; + char *endptr = NULL; + char *tmp_str = NULL; + int ret = -1; + int rv = -1; + int type = 0; + struct dirent *ent = NULL; + unsigned int level = 0; + virResctrlInfoPerLevelPtr i_level = NULL; + virResctrlInfoPerTypePtr i_type = NULL; + + rv = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); + if (rv <= 0) { + ret = rv; + goto cleanup; + } + + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { + if (ent->d_type != DT_DIR) + continue; + + if (ent->d_name[0] != 'L') + continue; + + if (virStrToLong_uip(ent->d_name + 1, &endptr, 10, &level) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse resctrl cache info level")); + goto cleanup; + } + + type = virResctrlTypeFromString(endptr); + if (type < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse resctrl cache info type")); + goto cleanup; + } + + if (VIR_ALLOC(i_type) < 0) + goto cleanup; + + i_type->control.scope = type; + + rv = virFileReadValueUint(&i_type->control.max_allocation, + SYSFS_RESCTRL_PATH "/info/%s/num_closids", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get num_closids from resctrl cache info")); + if (rv < 0) + goto cleanup; + + rv = virFileReadValueString(&i_type->cbm_mask, + SYSFS_RESCTRL_PATH + "/info/%s/cbm_mask", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get cbm_mask from resctrl cache info")); + if (rv < 0) + goto cleanup; + + virStringTrimOptionalNewline(i_type->cbm_mask); + + rv = virFileReadValueUint(&i_type->min_cbm_bits, + SYSFS_RESCTRL_PATH "/info/%s/min_cbm_bits", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get min_cbm_bits from resctrl cache info")); + if (rv < 0) + goto cleanup; + + if (resctrl->nlevels <= level && + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, + level - resctrl->nlevels + 1) < 0) + goto cleanup; + + if (!resctrl->levels[level] && + (VIR_ALLOC(resctrl->levels[level]) < 0 || + VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0)) + goto cleanup; + i_level = resctrl->levels[level]; + + if (i_level->types[type]) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Duplicate cache type in resctrlfs for level %u"), + level); + goto cleanup; + } + + for (tmp_str = i_type->cbm_mask; *tmp_str != '\0'; tmp_str++) { + if (!c_isxdigit(*tmp_str)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse cbm_mask from resctrl cache info")); + goto cleanup; + } + + i_type->bits += count_one_bits(virHexToBin(*tmp_str)); + } + + VIR_STEAL_PTR(i_level->types[type], i_type); + } + + ret = 0; + cleanup: + VIR_DIR_CLOSE(dirp); + VIR_FREE(info_path); + if (i_type) + VIR_FREE(i_type->cbm_mask); + VIR_FREE(i_type); + return ret; +} + + +int +virResctrlInfoGetCache(virResctrlInfoPtr resctrl, + unsigned int level, + unsigned long long size, + size_t *ncontrols, + virResctrlInfoPerCachePtr **controls) +{ + virResctrlInfoPerLevelPtr i_level = NULL; + virResctrlInfoPerTypePtr i_type = NULL; + size_t i = 0; + int ret = -1; + + if (virResctrlInfoIsEmpty(resctrl)) + return 0; + + if (level >= resctrl->nlevels) + return 0; + + i_level = resctrl->levels[level]; + if (!i_level) + return 0; + + for (i = 0; i < VIR_CACHE_TYPE_LAST; i++) { + i_type = i_level->types[i]; + if (!i_type) + continue; + + /* Let's take the opportunity to update our internal information about + * the cache size */ + if (!i_type->size) { + i_type->size = size; + i_type->control.granularity = size / i_type->bits; + if (i_type->min_cbm_bits != 1) + i_type->control.min = i_type->min_cbm_bits * i_type->control.granularity; + } else { + if (i_type->size != size) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("level %u cache size %llu doesn not match " + "expected size %llu"), + level, i_type->size, size); + goto error; + } + i_type->max_cache_id++; + } + + if (VIR_EXPAND_N(*controls, *ncontrols, 1) < 0) + goto error; + if (VIR_ALLOC((*controls)[*ncontrols - 1]) < 0) + goto error; + + memcpy((*controls)[*ncontrols - 1], &i_type->control, sizeof(i_type->control)); + } + + ret = 0; + cleanup: + return ret; + error: + while (*ncontrols) + VIR_FREE((*controls)[--*ncontrols]); + VIR_FREE(*controls); + goto cleanup; +} + + int virResctrlGetCacheInfo(unsigned int level, unsigned long long size, diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 42e852780318..43063903730b 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -49,7 +49,26 @@ struct _virResctrlInfoPerCache { unsigned int max_allocation; }; +typedef struct _virResctrlInfo virResctrlInfo; +typedef virResctrlInfo *virResctrlInfoPtr; +virResctrlInfoPtr +virResctrlInfoNew(void); + +bool +virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl); + +int +virResctrlGetInfo(virResctrlInfoPtr resctrl); + +int +virResctrlInfoGetCache(virResctrlInfoPtr resctrl, + unsigned int level, + unsigned long long size, + size_t *ncontrols, + virResctrlInfoPerCachePtr **controls); + +/* To be removed */ int virResctrlGetCacheInfo(unsigned int level, unsigned long long size, -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:11PM +0100, Martin Kletzander wrote:
This will make the current functions obsolete and it will provide more information to the virresctrl module so that it can be used later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- po/POTFILES.in | 1 + src/libvirt_private.syms | 3 + src/util/virresctrl.c | 305 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.h | 19 +++ 4 files changed, 328 insertions(+)
diff --git a/po/POTFILES.in b/po/POTFILES.in index c1fa23427eff..8382ee633621 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -252,6 +252,7 @@ src/util/virportallocator.c src/util/virprocess.c src/util/virqemu.c src/util/virrandom.c +src/util/virresctrl.c src/util/virrotatingfile.c src/util/virscsi.c src/util/virscsihost.c diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc8cc1fba9e3..e951a5495b3f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2554,6 +2554,9 @@ virCacheTypeFromString; virCacheTypeToString; virResctrlGetCacheControlType; virResctrlGetCacheInfo; +virResctrlGetInfo; +virResctrlInfoGetCache; +virResctrlInfoNew;
# util/virrotatingfile.h diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 050a08178e24..25c11a1314dc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -25,12 +25,15 @@ #include "viralloc.h" #include "virfile.h" #include "virlog.h" +#include "virobject.h" #include "virstring.h"
#define VIR_FROM_THIS VIR_FROM_RESCTRL
VIR_LOG_INIT("util.virresctrl")
+ +/* Common definitions */ #define SYSFS_RESCTRL_PATH "/sys/fs/resctrl"
/* Resctrl is short for Resource Control. It might be implemented for various @@ -55,6 +58,308 @@ VIR_ENUM_IMPL(virResctrl, VIR_CACHE_TYPE_LAST, "CODE", "DATA")
+ +/* Info-related definitions and InfoClass-related functions */ +typedef struct _virResctrlInfoPerType virResctrlInfoPerType; +typedef virResctrlInfoPerType *virResctrlInfoPerTypePtr; +struct _virResctrlInfoPerType { + /* Kernel-provided information */ + char *cbm_mask; + unsigned int min_cbm_bits; + + /* Our computed information from the above */ + unsigned int bits; + unsigned int max_cache_id; + + /* In order to be self-sufficient we need size information per cache. + * Funnily enough, one of the outcomes of the resctrlfs design is that it + * does not account for different sizes per cache on the same level. So + * for the sake of easiness, let's copy that, for now. */ + unsigned long long size; + + /* Information that we will return upon request (this is public struct) as + * until now all the above is internal to this module */ + virResctrlInfoPerCache control; +}; + +typedef struct _virResctrlInfoPerLevel virResctrlInfoPerLevel; +typedef virResctrlInfoPerLevel *virResctrlInfoPerLevelPtr; +struct _virResctrlInfoPerLevel { + virResctrlInfoPerTypePtr *types; +}; + +struct _virResctrlInfo { + virObject parent; + + virResctrlInfoPerLevelPtr *levels; + size_t nlevels; +}; + +static virClassPtr virResctrlInfoClass; + +static void +virResctrlInfoDispose(void *obj) +{ + size_t i = 0; + size_t j = 0; + + virResctrlInfoPtr resctrl = obj; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlInfoPerLevelPtr level = resctrl->levels[i]; + + if (!level) + continue; + + if (level->types) { + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + if (level->types[j]) + VIR_FREE(level->types[j]->cbm_mask); + VIR_FREE(level->types[j]); + } + } + VIR_FREE(level->types); + VIR_FREE(level); + } + + VIR_FREE(resctrl->levels); +} + + +static int +virResctrlInfoOnceInit(void) +{ + if (!(virResctrlInfoClass = virClassNew(virClassForObject(), + "virResctrlInfo", + sizeof(virResctrlInfo), + virResctrlInfoDispose))) + return -1; + + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virResctrlInfo) + + +virResctrlInfoPtr +virResctrlInfoNew(void) +{ + if (virResctrlInfoInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlInfoClass); +} + + +/* Info-related functions */ +bool +virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) +{
This one can be static since it's not used outside of this file and it's not exported in private symbols.
+ size_t i = 0; + size_t j = 0; + + if (!resctrl) + return true; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlInfoPerLevelPtr i_level = resctrl->levels[i]; + + if (!i_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + if (i_level->types[j]) + return false; + } + } + + return true; +} + + +int +virResctrlGetInfo(virResctrlInfoPtr resctrl) +{ + DIR *dirp = NULL; + char *info_path = NULL; + char *endptr = NULL; + char *tmp_str = NULL; + int ret = -1; + int rv = -1; + int type = 0; + struct dirent *ent = NULL; + unsigned int level = 0; + virResctrlInfoPerLevelPtr i_level = NULL; + virResctrlInfoPerTypePtr i_type = NULL; + + rv = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); + if (rv <= 0) { + ret = rv; + goto cleanup; + } + + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { + if (ent->d_type != DT_DIR) + continue; + + if (ent->d_name[0] != 'L') + continue; + + if (virStrToLong_uip(ent->d_name + 1, &endptr, 10, &level) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse resctrl cache info level")); + goto cleanup; + } + + type = virResctrlTypeFromString(endptr); + if (type < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse resctrl cache info type")); + goto cleanup; + } + + if (VIR_ALLOC(i_type) < 0) + goto cleanup; + + i_type->control.scope = type; + + rv = virFileReadValueUint(&i_type->control.max_allocation, + SYSFS_RESCTRL_PATH "/info/%s/num_closids", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get num_closids from resctrl cache info")); + if (rv < 0) + goto cleanup; + + rv = virFileReadValueString(&i_type->cbm_mask, + SYSFS_RESCTRL_PATH + "/info/%s/cbm_mask", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get cbm_mask from resctrl cache info")); + if (rv < 0) + goto cleanup; + + virStringTrimOptionalNewline(i_type->cbm_mask); + + rv = virFileReadValueUint(&i_type->min_cbm_bits, + SYSFS_RESCTRL_PATH "/info/%s/min_cbm_bits", + ent->d_name); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get min_cbm_bits from resctrl cache info")); + if (rv < 0) + goto cleanup; + + if (resctrl->nlevels <= level && + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, + level - resctrl->nlevels + 1) < 0) + goto cleanup; + + if (!resctrl->levels[level] && + (VIR_ALLOC(resctrl->levels[level]) < 0 || + VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0)) + goto cleanup; + i_level = resctrl->levels[level]; + + if (i_level->types[type]) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Duplicate cache type in resctrlfs for level %u"), + level); + goto cleanup; + } + + for (tmp_str = i_type->cbm_mask; *tmp_str != '\0'; tmp_str++) { + if (!c_isxdigit(*tmp_str)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse cbm_mask from resctrl cache info")); + goto cleanup; + } + + i_type->bits += count_one_bits(virHexToBin(*tmp_str)); + } + + VIR_STEAL_PTR(i_level->types[type], i_type); + } + + ret = 0; + cleanup: + VIR_DIR_CLOSE(dirp); + VIR_FREE(info_path); + if (i_type) + VIR_FREE(i_type->cbm_mask); + VIR_FREE(i_type); + return ret; +} + + +int +virResctrlInfoGetCache(virResctrlInfoPtr resctrl, + unsigned int level, + unsigned long long size, + size_t *ncontrols, + virResctrlInfoPerCachePtr **controls) +{ + virResctrlInfoPerLevelPtr i_level = NULL; + virResctrlInfoPerTypePtr i_type = NULL; + size_t i = 0; + int ret = -1; + + if (virResctrlInfoIsEmpty(resctrl)) + return 0; + + if (level >= resctrl->nlevels) + return 0; + + i_level = resctrl->levels[level]; + if (!i_level) + return 0; + + for (i = 0; i < VIR_CACHE_TYPE_LAST; i++) { + i_type = i_level->types[i]; + if (!i_type) + continue; + + /* Let's take the opportunity to update our internal information about + * the cache size */ + if (!i_type->size) { + i_type->size = size; + i_type->control.granularity = size / i_type->bits; + if (i_type->min_cbm_bits != 1) + i_type->control.min = i_type->min_cbm_bits * i_type->control.granularity; + } else { + if (i_type->size != size) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("level %u cache size %llu doesn not match "
s/doesn/does/ With the issues fixed Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/capabilities.c | 53 ++++++++++++++++++++++++------------------------- src/conf/capabilities.h | 2 ++ 2 files changed, 28 insertions(+), 27 deletions(-) diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index e93eaed2f0e0..edf9f54f7710 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -245,6 +245,7 @@ virCapabilitiesDispose(void *object) VIR_FREE(caps->host.netprefix); VIR_FREE(caps->host.pagesSize); virCPUDefFree(caps->host.cpu); + virObjectUnref(caps->host.resctrl); } /** @@ -1592,6 +1593,20 @@ virCapsHostCacheBankSorter(const void *a, } +static int +virCapabilitiesInitResctrl(virCapsPtr caps) +{ + if (caps->host.resctrl) + return 0; + + caps->host.resctrl = virResctrlInfoNew(); + if (!caps->host.resctrl) + return -1; + + return virResctrlGetInfo(caps->host.resctrl); +} + + int virCapabilitiesInitCaches(virCapsPtr caps) { @@ -1600,7 +1615,6 @@ virCapabilitiesInitCaches(virCapsPtr caps) ssize_t pos = -1; DIR *dirp = NULL; int ret = -1; - int typeret; char *path = NULL; char *type = NULL; struct dirent *ent = NULL; @@ -1611,6 +1625,9 @@ virCapabilitiesInitCaches(virCapsPtr caps) * lose information. */ const int cache_min_level = 3; + if (virCapabilitiesInitResctrl(caps) < 0) + return -1; + /* offline CPUs don't provide cache info */ if (virFileReadValueBitmap(&cpus, "%s/cpu/online", SYSFS_SYSTEM_PATH) < 0) return -1; @@ -1676,32 +1693,6 @@ virCapabilitiesInitCaches(virCapsPtr caps) SYSFS_SYSTEM_PATH, pos, ent->d_name) < 0) goto cleanup; - typeret = virResctrlGetCacheControlType(bank->level); - if (typeret < 0) - goto cleanup; - - if (typeret == 1) { - if (virResctrlGetCacheInfo(bank->level, - bank->size, - VIR_CACHE_TYPE_BOTH, - &bank->controls, - &bank->ncontrols) < 0) - goto cleanup; - } else if (typeret == 2) { - if (virResctrlGetCacheInfo(bank->level, - bank->size, - VIR_CACHE_TYPE_CODE, - &bank->controls, - &bank->ncontrols) < 0) - goto cleanup; - if (virResctrlGetCacheInfo(bank->level, - bank->size, - VIR_CACHE_TYPE_DATA, - &bank->controls, - &bank->ncontrols) < 0) - goto cleanup; - } - kernel_type = virCacheKernelTypeFromString(type); if (kernel_type < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -1717,6 +1708,14 @@ virCapabilitiesInitCaches(virCapsPtr caps) break; } if (i == caps->host.ncaches) { + /* If it is a new cache, then update its resctrl information. */ + if (virResctrlInfoGetCache(caps->host.resctrl, + bank->level, + bank->size, + &bank->ncontrols, + &bank->controls) < 0) + goto cleanup; + if (VIR_APPEND_ELEMENT(caps->host.caches, caps->host.ncaches, bank) < 0) { diff --git a/src/conf/capabilities.h b/src/conf/capabilities.h index 27b88cb5edfa..694a3590bf83 100644 --- a/src/conf/capabilities.h +++ b/src/conf/capabilities.h @@ -170,6 +170,8 @@ struct _virCapsHost { size_t nnumaCell_max; virCapsHostNUMACellPtr *numaCell; + virResctrlInfoPtr resctrl; + size_t ncaches; virCapsHostCacheBankPtr *caches; -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:12PM +0100, Martin Kletzander wrote:
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/capabilities.c | 53 ++++++++++++++++++++++++------------------------- src/conf/capabilities.h | 2 ++ 2 files changed, 28 insertions(+), 27 deletions(-)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 2 - src/util/virresctrl.c | 132 ----------------------------------------------- src/util/virresctrl.h | 11 ---- 3 files changed, 145 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e951a5495b3f..d63474e0fc20 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2552,8 +2552,6 @@ virRandomInt; # util/virresctrl.h virCacheTypeFromString; virCacheTypeToString; -virResctrlGetCacheControlType; -virResctrlGetCacheInfo; virResctrlGetInfo; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 25c11a1314dc..f32929468c80 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -358,135 +358,3 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, VIR_FREE(*controls); goto cleanup; } - - -int -virResctrlGetCacheInfo(unsigned int level, - unsigned long long size, - virCacheType scope, - virResctrlInfoPerCachePtr **controls, - size_t *ncontrols) -{ - int ret = -1; - char *tmp = NULL; - char *path = NULL; - char *cbm_mask = NULL; - char *type_upper = NULL; - unsigned int bits = 0; - unsigned int min_cbm_bits = 0; - virResctrlInfoPerCachePtr control; - - if (VIR_ALLOC(control) < 0) - goto cleanup; - - if (scope != VIR_CACHE_TYPE_BOTH && - virStringToUpper(&type_upper, virCacheTypeToString(scope)) < 0) - goto cleanup; - - if (virFileReadValueUint(&control->max_allocation, - SYSFS_RESCTRL_PATH "/info/L%u%s/num_closids", - level, - type_upper ? type_upper : "") < 0) - goto cleanup; - - if (virFileReadValueString(&cbm_mask, - SYSFS_RESCTRL_PATH - "/info/L%u%s/cbm_mask", - level, - type_upper ? type_upper: "") < 0) - goto cleanup; - - if (virFileReadValueUint(&min_cbm_bits, - SYSFS_RESCTRL_PATH "/info/L%u%s/min_cbm_bits", - level, - type_upper ? type_upper : "") < 0) - goto cleanup; - - virStringTrimOptionalNewline(cbm_mask); - - for (tmp = cbm_mask; *tmp != '\0'; tmp++) { - if (c_isxdigit(*tmp)) - bits += count_one_bits(virHexToBin(*tmp)); - } - - control->granularity = size / bits; - if (min_cbm_bits != 1) - control->min = min_cbm_bits * control->granularity; - - control->scope = scope; - - if (VIR_APPEND_ELEMENT(*controls, *ncontrols, control) < 0) - goto cleanup; - - ret = 0; - - cleanup: - VIR_FREE(path); - VIR_FREE(cbm_mask); - VIR_FREE(type_upper); - VIR_FREE(control); - return ret; -} - - -static inline int -virResctrlGetCacheDir(char **path, - const char *prefix, - unsigned int level, - virCacheType type) -{ - return virAsprintf(path, - SYSFS_RESCTRL_PATH "%s/L%u%s", - prefix ? prefix : "", - level, - virResctrlTypeToString(type)); -} - - -/* - * This function tests whether TYPE of cache control is supported or not. - * - * Returns 0 if not, 1 if yes and negative value on error. - */ -static int -virResctrlGetCacheSupport(unsigned int level, virCacheType type) -{ - int ret = -1; - char *path = NULL; - - if (virResctrlGetCacheDir(&path, "/info", level, type) < 0) - return -1; - - ret = virFileExists(path); - VIR_FREE(path); - return ret; -} - - -/* - * This function tests which TYPE of cache control is supported - * Return values are: - * -1: error - * 0: none - * 1: CAT - * 2: CDP - */ -int -virResctrlGetCacheControlType(unsigned int level) -{ - int rv = -1; - - rv = virResctrlGetCacheSupport(level, VIR_CACHE_TYPE_BOTH); - if (rv < 0) - return -1; - if (rv) - return 1; - - rv = virResctrlGetCacheSupport(level, VIR_CACHE_TYPE_CODE); - if (rv < 0) - return -1; - if (rv) - return 2; - - return 0; -} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 43063903730b..471c02f28dcd 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -68,15 +68,4 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, size_t *ncontrols, virResctrlInfoPerCachePtr **controls); -/* To be removed */ -int -virResctrlGetCacheInfo(unsigned int level, - unsigned long long size, - virCacheType scope, - virResctrlInfoPerCachePtr **controls, - size_t *ncontrols); - -int -virResctrlGetCacheControlType(unsigned int level); - #endif /* __VIR_RESCTRL_H__ */ -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:13PM +0100, Martin Kletzander wrote:
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/libvirt_private.syms | 2 - src/util/virresctrl.c | 132 ----------------------------------------------- src/util/virresctrl.h | 11 ---- 3 files changed, 145 deletions(-)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

With this commit we finally have a way to read and manipulate basic resctrl settings. Locking is done only on exposed functions that read/write from/to resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are only supposed to be used from tests. More information about how resctrl works: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/Makefile.am | 2 +- src/libvirt_private.syms | 10 + src/util/virresctrl.c | 1211 ++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 49 ++ src/util/virresctrlpriv.h | 27 + 5 files changed, 1288 insertions(+), 11 deletions(-) create mode 100644 src/util/virresctrlpriv.h diff --git a/src/Makefile.am b/src/Makefile.am index 166c9a8e9199..289b03747212 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -167,7 +167,7 @@ UTIL_SOURCES = \ util/virprocess.c util/virprocess.h \ util/virqemu.c util/virqemu.h \ util/virrandom.h util/virrandom.c \ - util/virresctrl.h util/virresctrl.c \ + util/virresctrl.h util/virresctrl.c util/virresctrlpriv.h \ util/virrotatingfile.h util/virrotatingfile.c \ util/virscsi.c util/virscsi.h \ util/virscsihost.c util/virscsihost.h \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d63474e0fc20..f2a2c8650d97 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2552,6 +2552,16 @@ virRandomInt; # util/virresctrl.h virCacheTypeFromString; virCacheTypeToString; +virResctrlAllocAddPID; +virResctrlAllocCreate; +virResctrlAllocForeachSize; +virResctrlAllocFormat; +virResctrlAllocGetID; +virResctrlAllocGetUnused; +virResctrlAllocNew; +virResctrlAllocRemove; +virResctrlAllocSetID; +virResctrlAllocSetSize; virResctrlGetInfo; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f32929468c80..b5c0ea338830 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -18,8 +18,12 @@ #include <config.h> -#include "virresctrl.h" +#include <sys/file.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include "virresctrlpriv.h" #include "c-ctype.h" #include "count-one-bits.h" #include "viralloc.h" @@ -152,6 +156,195 @@ virResctrlInfoNew(void) } +/* Alloc-related definitions and AllocClass-related functions */ + +/* + * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and + * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * parts of multiple caches allocated it is represented as bunch of nested + * sparse arrays (by sparse I mean array of pointers so that each might be NULL + * in case there is no allocation for that particular one (level, cache, ...)). + * + * Since one allocation can be made for caches on different levels, the first + * nested sparse array is of types virResctrlAllocPerLevel. For example if you + * have allocation for level 3 cache, there will be three NULL pointers and then + * allocated pointer to virResctrlAllocPerLevel. That way you can access it by + * `alloc[level]` as O(1) is desired instead of crawling through normal arrays + * or lists in three nested loops. The code uses a lot of direct accesses. + * + * Each virResctrlAllocPerLevel can have allocations for different cache + * allocation types. You can allocate instruction cache (VIR_CACHE_TYPE_CODE), + * data cache (VIR_CACHE_TYPE_DATA) or unified cache (VIR_CACHE_TYPE_BOTH). + * Those allocations are kept in sparse array of virResctrlAllocPerType pointers. + * + * For each virResctrlAllocPerType users can request some size of the cache to + * be allocated. That's what the sparse array `sizes` is for. Non-NULL + * pointers represent requested size allocations. The array is indexed by host + * cache id (gotten from `/sys/devices/system/cpu/cpuX/cache/indexY/id`). Users + * can see this information e.g. in the output of `virsh capabilities` (for that + * information there's the other struct, namely `virResctrlInfo`). + * + * When allocation is being created we need to find unused part of the cache for + * all of them. While doing that we store the bitmask in a sparse array of + * virBitmaps named `masks` indexed the same way as `sizes`. The upper bounds + * of the sparse arrays are stored in nmasks or nsizes, respectively. + */ + +/* virResctrlAllocPerType represents */ +typedef struct _virResctrlAllocPerType virResctrlAllocPerType; +typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; +struct _virResctrlAllocPerType { + /* There could be bool saying whether this is set or not, but since everything + * in virResctrlAlloc (and most of libvirt) goes with pointer arrays we would + * have to have one more level of allocation anyway, so this stays faithful to + * the concept */ + unsigned long long **sizes; + size_t nsizes; + + /* Mask for each cache */ + virBitmapPtr *masks; + size_t nmasks; +}; + +typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; +typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; +struct _virResctrlAllocPerLevel { + virResctrlAllocPerTypePtr *types; /* Indexed with enum virCacheType */ + /* There is no `ntypes` member variable as it is always allocated for + * VIR_CACHE_TYPE_LAST number of items */ +}; + +struct _virResctrlAlloc { + virObject parent; + + virResctrlAllocPerLevelPtr *levels; + size_t nlevels; + + /* The identifier (any unique string for now) */ + char *id; + /* libvirt-generated path in /sys/fs/resctrl for this particular + * allocation */ + char *path; +}; + +static virClassPtr virResctrlAllocClass; + +static void +virResctrlAllocDispose(void *obj) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + virResctrlAllocPtr resctrl = obj; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr level = resctrl->levels[i]; + + if (!level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr type = level->types[j]; + + if (!type) + continue; + + for (k = 0; k < type->nsizes; k++) + VIR_FREE(type->sizes[k]); + + for (k = 0; k < type->nmasks; k++) + virBitmapFree(type->masks[k]); + + VIR_FREE(type->sizes); + VIR_FREE(type->masks); + VIR_FREE(type); + } + VIR_FREE(level->types); + VIR_FREE(level); + } + + VIR_FREE(resctrl->id); + VIR_FREE(resctrl->path); + VIR_FREE(resctrl->levels); +} + + +static int +virResctrlAllocOnceInit(void) +{ + if (!(virResctrlAllocClass = virClassNew(virClassForObject(), + "virResctrlAlloc", + sizeof(virResctrlAlloc), + virResctrlAllocDispose))) + return -1; + + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virResctrlAlloc) + + +virResctrlAllocPtr +virResctrlAllocNew(void) +{ + if (virResctrlAllocInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlAllocClass); +} + + +/* Common functions */ +static int +virResctrlLockInternal(int op) +{ + int fd = open(SYSFS_RESCTRL_PATH, O_DIRECTORY | O_CLOEXEC); + + if (fd < 0) { + virReportSystemError(errno, "%s", _("Cannot open resctrlfs")); + return -1; + } + + if (flock(fd, op) < 0) { + virReportSystemError(errno, "%s", _("Cannot lock resctrlfs")); + VIR_FORCE_CLOSE(fd); + return -1; + } + + return fd; +} + + +static inline int +virResctrlLockWrite(void) +{ + return virResctrlLockInternal(LOCK_EX); +} + + +static int +virResctrlUnlock(int fd) +{ + if (fd == -1) + return 0; + + /* The lock gets unlocked by closing the fd, which we need to do anyway in + * order to clean up properly */ + if (VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, "%s", _("Cannot close resctrlfs")); + + /* Trying to save the already broken */ + if (flock(fd, LOCK_UN) < 0) + virReportSystemError(errno, "%s", _("Cannot unlock resctrlfs")); + return -1; + } + + return 0; +} + + /* Info-related functions */ bool virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) @@ -200,6 +393,8 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) } while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { + VIR_DEBUG("Parsing info type '%s'", ent->d_name); + if (ent->d_type != DT_DIR) continue; @@ -207,16 +402,14 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) continue; if (virStrToLong_uip(ent->d_name + 1, &endptr, 10, &level) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info level")); + VIR_DEBUG("Cannot parse resctrl cache info level '%s'", ent->d_name + 1); goto cleanup; } type = virResctrlTypeFromString(endptr); if (type < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info type")); - goto cleanup; + VIR_DEBUG("Cannot parse resctrl cache info type '%s'", endptr); + continue; } if (VIR_ALLOC(i_type) < 0) @@ -259,10 +452,19 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) level - resctrl->nlevels + 1) < 0) goto cleanup; - if (!resctrl->levels[level] && - (VIR_ALLOC(resctrl->levels[level]) < 0 || - VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0)) - goto cleanup; + if (!resctrl->levels[level]) { + virResctrlInfoPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + goto cleanup; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + goto cleanup; + } + resctrl->levels[level]->types = types; + } + i_level = resctrl->levels[level]; if (i_level->types[type]) { @@ -358,3 +560,992 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, VIR_FREE(*controls); goto cleanup; } + + +/* Alloc-related functions */ +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + if (!resctrl) + return true; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[i]; + + if (!a_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr a_type = a_level->types[j]; + + if (!a_type) + continue; + + for (k = 0; k < a_type->nsizes; k++) { + if (a_type->sizes[k]) + return false; + } + + for (k = 0; k < a_type->nmasks; k++) { + if (a_type->masks[k]) + return false; + } + } + } + + return true; +} + + +static virResctrlAllocPerTypePtr +virResctrlAllocGetType(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + + if (resctrl->nlevels <= level && + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, level - resctrl->nlevels + 1) < 0) + return NULL; + + if (!resctrl->levels[level]) { + virResctrlAllocPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + return NULL; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + return NULL; + } + resctrl->levels[level]->types = types; + } + + a_level = resctrl->levels[level]; + + if (!a_level->types[type] && VIR_ALLOC(a_level->types[type]) < 0) + return NULL; + + return a_level->types[type]; +} + + +static int +virResctrlAllocUpdateMask(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + virBitmapPtr mask) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + if (!a_type->masks[cache]) { + a_type->masks[cache] = virBitmapNew(virBitmapSize(mask)); + + if (!a_type->masks[cache]) + return -1; + } + + return virBitmapCopy(a_type->masks[cache], mask); +} + + +static int +virResctrlAllocUpdateSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nsizes <= cache && + VIR_EXPAND_N(a_type->sizes, a_type->nsizes, + cache - a_type->nsizes + 1) < 0) + return -1; + + if (!a_type->sizes[cache] && VIR_ALLOC(a_type->sizes[cache]) < 0) + return -1; + + *(a_type->sizes[cache]) = size; + + return 0; +} + + +/* + * Check if there is an allocation for this level/type/cache already. Called + * before updating the structure. VIR_CACHE_TYPE_BOTH collides with any type, + * the other types collide with itself. This code basically checks if either: + * `alloc[level]->types[type]->sizes[cache]` + * or + * `alloc[level]->types[VIR_CACHE_TYPE_BOTH]->sizes[cache]` + * is non-NULL. All the fuzz around it is checking for NULL pointers along + * the way. + */ +static bool +virResctrlAllocCheckCollision(virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + unsigned int cache) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + virResctrlAllocPerTypePtr a_type = NULL; + + if (!alloc) + return false; + + if (alloc->nlevels <= level) + return false; + + a_level = alloc->levels[level]; + + if (!a_level) + return false; + + a_type = a_level->types[VIR_CACHE_TYPE_BOTH]; + + /* If there is an allocation for type 'both', there can be no other + * allocation for the same cache */ + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + if (type == VIR_CACHE_TYPE_BOTH) { + a_type = a_level->types[VIR_CACHE_TYPE_CODE]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + a_type = a_level->types[VIR_CACHE_TYPE_DATA]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } else { + a_type = a_level->types[type]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } + + return false; +} + + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + if (virResctrlAllocCheckCollision(resctrl, level, type, cache)) { + virReportError(VIR_ERR_XML_ERROR, + _("Colliding cache allocations for cache " + "level '%u' id '%u', type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + return virResctrlAllocUpdateSize(resctrl, level, type, cache, size); +} + + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque) +{ + int ret = 0; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return 0; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + for (cache = 0; cache < a_type->nsizes; cache++) { + unsigned long long *size = a_type->sizes[cache]; + + if (!size) + continue; + + ret = cb(level, type, cache, *size, opaque); + if (ret < 0) + return ret; + } + } + } + + return 0; +} + + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl allocation 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(alloc->id, id); +} + + +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc) +{ + return alloc->id; +} + + +char * +virResctrlAllocFormat(virResctrlAllocPtr resctrl) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return NULL; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + virBufferAsprintf(&buf, "L%u%s:", level, virResctrlTypeToString(type)); + + for (cache = 0; cache < a_type->nmasks; cache++) { + virBitmapPtr mask = a_type->masks[cache]; + char *mask_str = NULL; + + if (!mask) + continue; + + mask_str = virBitmapToString(mask, false, true); + if (!mask_str) { + virBufferFreeAndReset(&buf); + return NULL; + } + + virBufferAsprintf(&buf, "%u=%s;", cache, mask_str); + VIR_FREE(mask_str); + } + + virBufferTrim(&buf, ";", 1); + virBufferAddChar(&buf, '\n'); + } + } + + virBufferCheckError(&buf); + return virBufferContentAndReset(&buf); +} + + +static int +virResctrlAllocParseProcessCache(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + char *cache) +{ + char *tmp = strchr(cache, '='); + unsigned int cache_id = 0; + virBitmapPtr mask = NULL; + int ret = -1; + + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(cache, NULL, 10, &cache_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid cache id '%s'"), cache); + return -1; + } + + mask = virBitmapNewString(tmp); + if (!mask) + return -1; + + virBitmapShrink(mask, resctrl->levels[level]->types[type]->bits); + + /* Guess what! All bits set mean there is no allocation and it uses only + * unallocated parts of the cache. */ + if (!virBitmapIsAllSet(mask) && + virResctrlAllocUpdateMask(alloc, level, type, cache_id, mask) < 0) + goto cleanup; + + ret = 0; + cleanup: + virBitmapFree(mask); + return ret; +} + + +static int +virResctrlAllocParseProcessLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **caches = NULL; + char *tmp = NULL; + unsigned int level = 0; + int type = -1; + size_t ncaches = 0; + size_t i = 0; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + /* Skip lines that don't concern caches, e.g. MB: etc. */ + if (line[0] != 'L') + return 0; + + /* And lines that we can't parse too */ + tmp = strchr(line, ':'); + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(line + 1, &line, 10, &level) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + type = virResctrlTypeFromString(line); + if (type < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + caches = virStringSplitCount(tmp, ";", 0, &ncaches); + if (!caches) + return 0; + + for (i = 0; i < ncaches; i++) { + if (virResctrlAllocParseProcessCache(resctrl, alloc, level, type, caches[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(caches); + return ret; +} + + +static int +virResctrlAllocParse(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *schemata) +{ + char **lines = NULL; + size_t nlines = 0; + size_t i = 0; + int ret = -1; + + lines = virStringSplitCount(schemata, "\n", 0, &nlines); + for (i = 0; i < nlines; i++) { + if (virResctrlAllocParseProcessLine(resctrl, alloc, lines[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(lines); + return ret; +} + + +static void +virResctrlAllocSubtractPerType(virResctrlAllocPerTypePtr dst, + virResctrlAllocPerTypePtr src) +{ + size_t i = 0; + + if (!dst || !src) + return; + + for (i = 0; i < dst->nmasks && i < src->nmasks; i++) { + if (dst->masks[i] && src->masks[i]) + virBitmapSubtract(dst->masks[i], src->masks[i]); + } +} + + +static void +virResctrlAllocSubtract(virResctrlAllocPtr dst, + virResctrlAllocPtr src) +{ + size_t i = 0; + size_t j = 0; + + if (!src) + return; + + for (i = 0; i < dst->nlevels && i < src->nlevels; i++) { + if (dst->levels[i] && src->levels[i]) { + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocSubtractPerType(dst->levels[i]->types[j], + src->levels[i]->types[j]); + } + } + } +} + + +static virResctrlAllocPtr +virResctrlAllocNewFromInfo(virResctrlInfoPtr info) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + virResctrlAllocPtr ret = virResctrlAllocNew(); + virBitmapPtr mask = NULL; + + if (!ret) + return NULL; + + for (i = 0; i < info->nlevels; i++) { + virResctrlInfoPerLevelPtr i_level = info->levels[i]; + + if (!i_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlInfoPerTypePtr i_type = i_level->types[j]; + + if (!i_type) + continue; + + virBitmapFree(mask); + mask = virBitmapNew(i_type->bits); + if (!mask) + goto error; + virBitmapSetAll(mask); + + for (k = 0; k <= i_type->max_cache_id; k++) { + if (virResctrlAllocUpdateMask(ret, i, j, k, mask) < 0) + goto error; + } + } + } + + cleanup: + virBitmapFree(mask); + return ret; + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + +/* + * This function creates an allocation that represents all unused parts of all + * caches in the system. It uses virResctrlInfo for creating a new full + * allocation with all bits set (using virResctrlAllocNewFromInfo()) and then + * scans for all allocations under /sys/fs/resctrl and subtracts each one of + * them from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation. It is used for + * two things, a) calculating the masks when creating allocations and b) from + * tests. + */ +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) +{ + virResctrlAllocPtr ret = NULL; + virResctrlAllocPtr alloc = NULL; + struct dirent *ent = NULL; + DIR *dirp = NULL; + char *schemata = NULL; + int rv = -1; + + if (virResctrlInfoIsEmpty(resctrl)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return NULL; + } + + ret = virResctrlAllocNewFromInfo(resctrl); + if (!ret) + return NULL; + + if (virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/schemata") < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not read schemata file for the default group")); + goto error; + } + + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + if (!virResctrlAllocIsEmpty(alloc)) + virResctrlAllocSubtract(ret, alloc); + + if (virDirOpen(&dirp, SYSFS_RESCTRL_PATH) < 0) + goto error; + + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH)) > 0) { + if (ent->d_type != DT_DIR) + continue; + + if (STREQ(ent->d_name, "info")) + continue; + + VIR_FREE(schemata); + rv = virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/%s/schemata", + ent->d_name); + if (rv == -2) + continue; + + if (rv < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not read schemata file for group %s"), + ent->d_name); + goto error; + } + + virObjectUnref(alloc); + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + + virResctrlAllocSubtract(ret, alloc); + } + if (rv < 0) + goto error; + + cleanup: + virObjectUnref(alloc); + VIR_DIR_CLOSE(dirp); + VIR_FREE(schemata); + return ret; + + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + + + +static int +virResctrlAllocSetMask(virResctrlAllocPerTypePtr a_type, + unsigned int cache, + virBitmapPtr mask) +{ + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + a_type->masks[cache] = mask; + + return 0; +} + + +/* + * Given the information about requested allocation type `a_type`, the host + * cache for a particular type `i_type` and unused bits in the system `f_type` + * this function tries to find the smallest free space in which the allocation + * for cache id `cache` would fit. We're looking for the smallest place in + * order to minimize fragmentation and maximize the possibility of succeeding. + */ +static int +virResctrlAllocFindUnused(virResctrlAllocPerTypePtr a_type, + virResctrlInfoPerTypePtr i_type, + virResctrlAllocPerTypePtr f_type, + unsigned int level, + unsigned int type, + unsigned int cache) +{ + unsigned long long *size = a_type->sizes[cache]; + virBitmapPtr a_mask = NULL; + virBitmapPtr f_mask = NULL; + unsigned long long granularity; + unsigned long long need_bits; + size_t i = 0; + ssize_t pos = -1; + ssize_t last_bits = 0; + ssize_t last_pos = -1; + + /* If there is no reservation requested we need to set all bits. That's due + * to weird interface of the resctrl sysfs. It's also the reason why we + * cannot reserve the whole cache in one allocation. */ + if (!size) { + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + virBitmapSetAll(a_mask); + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; + } + + if (cache >= f_type->nmasks) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache with id %u does not exists for level %d"), + cache, level); + return -1; + } + + f_mask = f_type->masks[cache]; + if (!f_mask) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d id %u does not support tuning for " + "scope type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + if (*size == i_type->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation for the whole cache is not " + "possible, specify size smaller than %llu"), + i_type->size); + return -1; + } + + granularity = i_type->size / i_type->bits; + need_bits = *size / granularity; + + if (*size % granularity) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is not " + "divisible by granularity %llu"), + *size, granularity); + return -1; + } + + if (need_bits < i_type->min_cbm_bits) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is smaller " + "than the minimum allowed allocation %llu"), + *size, granularity * i_type->min_cbm_bits); + return -1; + } + + while ((pos = virBitmapNextSetBit(f_mask, pos)) >= 0) { + ssize_t pos_clear = virBitmapNextClearBit(f_mask, pos); + ssize_t bits; + + if (pos_clear < 0) + pos_clear = virBitmapSize(f_mask); + + bits = pos_clear - pos; + + /* Not enough bits, move on and skip all of them */ + if (bits < need_bits) { + pos = pos_clear; + continue; + } + + /* This fits perfectly */ + if (bits == need_bits) { + last_pos = pos; + break; + } + + /* Remember the smaller region if we already found on before */ + if (last_pos < 0 || (last_bits && bits < last_bits)) { + last_bits = bits; + last_pos = pos; + } + + pos = pos_clear; + } + + if (last_pos < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of " + "%llu bytes for level %u cache %u " + "scope type '%s'"), + *size, level, cache, + virCacheTypeToString(type)); + return -1; + } + + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + for (i = last_pos; i < last_pos + need_bits; i++) { + ignore_value(virBitmapSetBit(a_mask, i)); + ignore_value(virBitmapClearBit(f_mask, i)); + } + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; +} + + +/* + * This function is called when creating an allocation in the system. What it + * does is that it gets all the unused bits using virResctrlAllocGetUnused() and + * then tries to find a proper space for every requested allocation effectively + * transforming `sizes` into `masks`. + */ +static int +virResctrlAllocMasksAssign(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc) +{ + int ret = -1; + unsigned int level = 0; + virResctrlAllocPtr alloc_free = NULL; + + alloc_free = virResctrlAllocGetUnused(resctrl); + if (!alloc_free) + return -1; + + for (level = 0; level < alloc->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = alloc->levels[level]; + virResctrlAllocPerLevelPtr f_level = NULL; + unsigned int type = 0; + + if (!a_level) + continue; + + if (level < alloc_free->nlevels) + f_level = alloc_free->levels[level]; + + if (!f_level) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning"), + level); + goto cleanup; + } + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + virResctrlAllocPerTypePtr f_type = f_level->types[type]; + unsigned int cache = 0; + + if (!a_type) + continue; + + if (!f_type) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning for " + "scope type '%s'"), + level, virCacheTypeToString(type)); + goto cleanup; + } + + for (cache = 0; cache < a_type->nsizes; cache++) { + virResctrlInfoPerLevelPtr i_level = resctrl->levels[level]; + virResctrlInfoPerTypePtr i_type = i_level->types[type]; + + if (virResctrlAllocFindUnused(a_type, i_type, f_type, level, type, cache) < 0) + goto cleanup; + } + } + } + + ret = 0; + cleanup: + virObjectUnref(alloc_free); + return ret; +} + + +/* This checks if the directory for the alloc exists. If not it tries to create + * it and apply appropriate alloc settings. */ +int +virResctrlAllocCreate(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *machinename) +{ + char *schemata_path = NULL; + char *alloc_str = NULL; + int ret = -1; + int lockfd = -1; + + if (!alloc) + return 0; + + if (!resctrl) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return -1; + } + + if (!alloc->id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl Allocation ID must be set before creation")); + return -1; + } + + if (!alloc->path && + virAsprintf(&alloc->path, "%s/%s-%s", + SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + return -1; + + /* Check if this allocation was already created */ + if (virFileIsDir(alloc->path)) + return 0; + + if (virFileExists(alloc->path)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Path '%s' for resctrl allocation exists and is not a " + "directory"), alloc->path); + goto cleanup; + } + + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + goto cleanup; + + if (virResctrlAllocMasksAssign(resctrl, alloc) < 0) + goto cleanup; + + alloc_str = virResctrlAllocFormat(alloc); + if (!alloc_str) + goto cleanup; + + if (virAsprintf(&schemata_path, "%s/schemata", alloc->path) < 0) + goto cleanup; + + if (virFileMakePath(alloc->path) < 0) { + virReportSystemError(errno, + _("Cannot create resctrl directory '%s'"), + alloc->path); + goto cleanup; + } + + VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, schemata_path); + if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) { + rmdir(alloc->path); + virReportSystemError(errno, + _("Cannot write into schemata file '%s'"), + schemata_path); + goto cleanup; + } + + ret = 0; + cleanup: + virResctrlUnlock(lockfd); + VIR_FREE(alloc_str); + VIR_FREE(schemata_path); + return ret; +} + + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid) +{ + char *tasks = NULL; + char *pidstr = NULL; + int ret = 0; + + if (!alloc->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot add pid to non-existing resctrl allocation")); + return -1; + } + + if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0) + return -1; + + if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0) + goto cleanup; + + if (virFileWriteStr(tasks, pidstr, 0) < 0) { + virReportSystemError(errno, + _("Cannot write pid in tasks file '%s'"), + tasks); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(tasks); + VIR_FREE(pidstr); + return ret; +} + + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc) +{ + int ret = 0; + + if (!alloc->path) + return 0; + + VIR_DEBUG("Removing resctrl allocation %s", alloc->path); + if (rmdir(alloc->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), alloc->path, errno); + } + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 471c02f28dcd..27793c86a745 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -68,4 +68,53 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, size_t *ncontrols, virResctrlInfoPerCachePtr **controls); +/* Alloc-related things */ +typedef struct _virResctrlAlloc virResctrlAlloc; +typedef virResctrlAlloc *virResctrlAllocPtr; + +typedef int virResctrlAllocForeachSizeCallback(unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size, + void *opaque); + +virResctrlAllocPtr +virResctrlAllocNew(void); + +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl); + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size); + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque); + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id); +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc); + +char * +virResctrlAllocFormat(virResctrlAllocPtr alloc); + +int +virResctrlAllocCreate(virResctrlInfoPtr r_info, + virResctrlAllocPtr alloc, + const char *machinename); + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid); + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc); + #endif /* __VIR_RESCTRL_H__ */ diff --git a/src/util/virresctrlpriv.h b/src/util/virresctrlpriv.h new file mode 100644 index 000000000000..72b115afd13e --- /dev/null +++ b/src/util/virresctrlpriv.h @@ -0,0 +1,27 @@ +/* + * virresctrlpriv.h: + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_RESCTRL_PRIV_H__ +# define __VIR_RESCTRL_PRIV_H__ + +# include "virresctrl.h" + +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl); + +#endif /* __VIR_RESCTRL_PRIV_H__ */ -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:14PM +0100, Martin Kletzander wrote:
With this commit we finally have a way to read and manipulate basic resctrl settings. Locking is done only on exposed functions that read/write from/to resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/Makefile.am | 2 +- src/libvirt_private.syms | 10 + src/util/virresctrl.c | 1211 ++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 49 ++ src/util/virresctrlpriv.h | 27 + 5 files changed, 1288 insertions(+), 11 deletions(-) create mode 100644 src/util/virresctrlpriv.h
diff --git a/src/Makefile.am b/src/Makefile.am index 166c9a8e9199..289b03747212 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -167,7 +167,7 @@ UTIL_SOURCES = \ util/virprocess.c util/virprocess.h \ util/virqemu.c util/virqemu.h \ util/virrandom.h util/virrandom.c \ - util/virresctrl.h util/virresctrl.c \ + util/virresctrl.h util/virresctrl.c util/virresctrlpriv.h \ util/virrotatingfile.h util/virrotatingfile.c \ util/virscsi.c util/virscsi.h \ util/virscsihost.c util/virscsihost.h \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d63474e0fc20..f2a2c8650d97 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2552,6 +2552,16 @@ virRandomInt; # util/virresctrl.h virCacheTypeFromString; virCacheTypeToString; +virResctrlAllocAddPID; +virResctrlAllocCreate; +virResctrlAllocForeachSize; +virResctrlAllocFormat; +virResctrlAllocGetID; +virResctrlAllocGetUnused; +virResctrlAllocNew; +virResctrlAllocRemove; +virResctrlAllocSetID; +virResctrlAllocSetSize; virResctrlGetInfo; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f32929468c80..b5c0ea338830 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -18,8 +18,12 @@
#include <config.h>
-#include "virresctrl.h" +#include <sys/file.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h>
+#include "virresctrlpriv.h" #include "c-ctype.h" #include "count-one-bits.h" #include "viralloc.h" @@ -152,6 +156,195 @@ virResctrlInfoNew(void) }
+/* Alloc-related definitions and AllocClass-related functions */ + +/* + * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and + * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * parts of multiple caches allocated it is represented as bunch of nested + * sparse arrays (by sparse I mean array of pointers so that each might be NULL + * in case there is no allocation for that particular one (level, cache, ...)). + * + * Since one allocation can be made for caches on different levels, the first + * nested sparse array is of types virResctrlAllocPerLevel. For example if you + * have allocation for level 3 cache, there will be three NULL pointers and then + * allocated pointer to virResctrlAllocPerLevel. That way you can access it by + * `alloc[level]` as O(1) is desired instead of crawling through normal arrays + * or lists in three nested loops. The code uses a lot of direct accesses. + * + * Each virResctrlAllocPerLevel can have allocations for different cache + * allocation types. You can allocate instruction cache (VIR_CACHE_TYPE_CODE), + * data cache (VIR_CACHE_TYPE_DATA) or unified cache (VIR_CACHE_TYPE_BOTH). + * Those allocations are kept in sparse array of virResctrlAllocPerType pointers. + * + * For each virResctrlAllocPerType users can request some size of the cache to + * be allocated. That's what the sparse array `sizes` is for. Non-NULL + * pointers represent requested size allocations. The array is indexed by host + * cache id (gotten from `/sys/devices/system/cpu/cpuX/cache/indexY/id`). Users + * can see this information e.g. in the output of `virsh capabilities` (for that + * information there's the other struct, namely `virResctrlInfo`). + * + * When allocation is being created we need to find unused part of the cache for + * all of them. While doing that we store the bitmask in a sparse array of + * virBitmaps named `masks` indexed the same way as `sizes`. The upper bounds + * of the sparse arrays are stored in nmasks or nsizes, respectively. + */ + +/* virResctrlAllocPerType represents */
Incomplete comment? :)
+typedef struct _virResctrlAllocPerType virResctrlAllocPerType; +typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; +struct _virResctrlAllocPerType { + /* There could be bool saying whether this is set or not, but since everything + * in virResctrlAlloc (and most of libvirt) goes with pointer arrays we would + * have to have one more level of allocation anyway, so this stays faithful to + * the concept */ + unsigned long long **sizes; + size_t nsizes; + + /* Mask for each cache */ + virBitmapPtr *masks; + size_t nmasks; +}; + +typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; +typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; +struct _virResctrlAllocPerLevel { + virResctrlAllocPerTypePtr *types; /* Indexed with enum virCacheType */ + /* There is no `ntypes` member variable as it is always allocated for + * VIR_CACHE_TYPE_LAST number of items */ +}; + +struct _virResctrlAlloc { + virObject parent; + + virResctrlAllocPerLevelPtr *levels; + size_t nlevels; + + /* The identifier (any unique string for now) */ + char *id; + /* libvirt-generated path in /sys/fs/resctrl for this particular + * allocation */ + char *path; +}; + +static virClassPtr virResctrlAllocClass; + +static void +virResctrlAllocDispose(void *obj) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + virResctrlAllocPtr resctrl = obj; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr level = resctrl->levels[i]; + + if (!level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr type = level->types[j]; + + if (!type) + continue; + + for (k = 0; k < type->nsizes; k++) + VIR_FREE(type->sizes[k]); + + for (k = 0; k < type->nmasks; k++) + virBitmapFree(type->masks[k]); + + VIR_FREE(type->sizes); + VIR_FREE(type->masks); + VIR_FREE(type); + } + VIR_FREE(level->types); + VIR_FREE(level); + } + + VIR_FREE(resctrl->id); + VIR_FREE(resctrl->path); + VIR_FREE(resctrl->levels); +} + + +static int +virResctrlAllocOnceInit(void) +{ + if (!(virResctrlAllocClass = virClassNew(virClassForObject(), + "virResctrlAlloc", + sizeof(virResctrlAlloc), + virResctrlAllocDispose))) + return -1; + + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virResctrlAlloc) + + +virResctrlAllocPtr +virResctrlAllocNew(void) +{ + if (virResctrlAllocInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlAllocClass); +} + + +/* Common functions */ +static int +virResctrlLockInternal(int op) +{ + int fd = open(SYSFS_RESCTRL_PATH, O_DIRECTORY | O_CLOEXEC); + + if (fd < 0) { + virReportSystemError(errno, "%s", _("Cannot open resctrlfs")); + return -1; + } + + if (flock(fd, op) < 0) { + virReportSystemError(errno, "%s", _("Cannot lock resctrlfs")); + VIR_FORCE_CLOSE(fd); + return -1; + } + + return fd; +} + + +static inline int +virResctrlLockWrite(void) +{ + return virResctrlLockInternal(LOCK_EX); +} + + +static int +virResctrlUnlock(int fd) +{ + if (fd == -1) + return 0; + + /* The lock gets unlocked by closing the fd, which we need to do anyway in + * order to clean up properly */ + if (VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, "%s", _("Cannot close resctrlfs")); + + /* Trying to save the already broken */ + if (flock(fd, LOCK_UN) < 0) + virReportSystemError(errno, "%s", _("Cannot unlock resctrlfs")); + return -1; + } + + return 0; +} + + /* Info-related functions */ bool virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) @@ -200,6 +393,8 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) }
while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { + VIR_DEBUG("Parsing info type '%s'", ent->d_name); + if (ent->d_type != DT_DIR) continue;
@@ -207,16 +402,14 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) continue;
if (virStrToLong_uip(ent->d_name + 1, &endptr, 10, &level) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info level")); + VIR_DEBUG("Cannot parse resctrl cache info level '%s'", ent->d_name + 1); goto cleanup;
I guess that there should be "continue" instead of "goto cleanup".
}
type = virResctrlTypeFromString(endptr); if (type < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info type")); - goto cleanup; + VIR_DEBUG("Cannot parse resctrl cache info type '%s'", endptr); + continue; }
if (VIR_ALLOC(i_type) < 0) @@ -259,10 +452,19 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) level - resctrl->nlevels + 1) < 0) goto cleanup;
- if (!resctrl->levels[level] && - (VIR_ALLOC(resctrl->levels[level]) < 0 || - VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0)) - goto cleanup; + if (!resctrl->levels[level]) { + virResctrlInfoPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + goto cleanup; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + goto cleanup; + } + resctrl->levels[level]->types = types; + } + i_level = resctrl->levels[level];
if (i_level->types[type]) { @@ -358,3 +560,992 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, VIR_FREE(*controls); goto cleanup; }
This whole info related part should be probably squashed into patch 3.
+ + +/* Alloc-related functions */ +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + if (!resctrl) + return true; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[i]; + + if (!a_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr a_type = a_level->types[j]; + + if (!a_type) + continue; + + for (k = 0; k < a_type->nsizes; k++) { + if (a_type->sizes[k]) + return false; + } + + for (k = 0; k < a_type->nmasks; k++) { + if (a_type->masks[k]) + return false; + } + } + } + + return true; +} + + +static virResctrlAllocPerTypePtr +virResctrlAllocGetType(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + + if (resctrl->nlevels <= level && + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, level - resctrl->nlevels + 1) < 0) + return NULL; + + if (!resctrl->levels[level]) { + virResctrlAllocPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + return NULL; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + return NULL; + } + resctrl->levels[level]->types = types; + } + + a_level = resctrl->levels[level]; + + if (!a_level->types[type] && VIR_ALLOC(a_level->types[type]) < 0) + return NULL; + + return a_level->types[type]; +} + + +static int +virResctrlAllocUpdateMask(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + virBitmapPtr mask) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + if (!a_type->masks[cache]) { + a_type->masks[cache] = virBitmapNew(virBitmapSize(mask)); + + if (!a_type->masks[cache]) + return -1; + } + + return virBitmapCopy(a_type->masks[cache], mask); +} + + +static int +virResctrlAllocUpdateSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nsizes <= cache && + VIR_EXPAND_N(a_type->sizes, a_type->nsizes, + cache - a_type->nsizes + 1) < 0) + return -1; + + if (!a_type->sizes[cache] && VIR_ALLOC(a_type->sizes[cache]) < 0) + return -1; + + *(a_type->sizes[cache]) = size; + + return 0; +} + + +/* + * Check if there is an allocation for this level/type/cache already. Called + * before updating the structure. VIR_CACHE_TYPE_BOTH collides with any type, + * the other types collide with itself. This code basically checks if either: + * `alloc[level]->types[type]->sizes[cache]` + * or + * `alloc[level]->types[VIR_CACHE_TYPE_BOTH]->sizes[cache]` + * is non-NULL. All the fuzz around it is checking for NULL pointers along + * the way. + */ +static bool +virResctrlAllocCheckCollision(virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + unsigned int cache) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + virResctrlAllocPerTypePtr a_type = NULL; + + if (!alloc) + return false; + + if (alloc->nlevels <= level) + return false; + + a_level = alloc->levels[level]; + + if (!a_level) + return false; + + a_type = a_level->types[VIR_CACHE_TYPE_BOTH]; + + /* If there is an allocation for type 'both', there can be no other + * allocation for the same cache */ + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + if (type == VIR_CACHE_TYPE_BOTH) { + a_type = a_level->types[VIR_CACHE_TYPE_CODE]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + a_type = a_level->types[VIR_CACHE_TYPE_DATA]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } else { + a_type = a_level->types[type]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } + + return false; +} + + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + if (virResctrlAllocCheckCollision(resctrl, level, type, cache)) { + virReportError(VIR_ERR_XML_ERROR, + _("Colliding cache allocations for cache " + "level '%u' id '%u', type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + return virResctrlAllocUpdateSize(resctrl, level, type, cache, size); +} + + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque) +{ + int ret = 0; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return 0; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + for (cache = 0; cache < a_type->nsizes; cache++) { + unsigned long long *size = a_type->sizes[cache]; + + if (!size) + continue; + + ret = cb(level, type, cache, *size, opaque); + if (ret < 0) + return ret; + } + } + } + + return 0; +} + + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl allocation 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(alloc->id, id); +} + + +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc) +{ + return alloc->id; +} + + +char * +virResctrlAllocFormat(virResctrlAllocPtr resctrl) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return NULL; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + virBufferAsprintf(&buf, "L%u%s:", level, virResctrlTypeToString(type)); + + for (cache = 0; cache < a_type->nmasks; cache++) { + virBitmapPtr mask = a_type->masks[cache]; + char *mask_str = NULL; + + if (!mask) + continue; + + mask_str = virBitmapToString(mask, false, true); + if (!mask_str) { + virBufferFreeAndReset(&buf); + return NULL; + } + + virBufferAsprintf(&buf, "%u=%s;", cache, mask_str); + VIR_FREE(mask_str); + } + + virBufferTrim(&buf, ";", 1); + virBufferAddChar(&buf, '\n'); + } + } + + virBufferCheckError(&buf); + return virBufferContentAndReset(&buf); +} + + +static int +virResctrlAllocParseProcessCache(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + char *cache) +{ + char *tmp = strchr(cache, '='); + unsigned int cache_id = 0; + virBitmapPtr mask = NULL; + int ret = -1; + + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(cache, NULL, 10, &cache_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid cache id '%s'"), cache); + return -1; + } + + mask = virBitmapNewString(tmp); + if (!mask) + return -1; + + virBitmapShrink(mask, resctrl->levels[level]->types[type]->bits); + + /* Guess what! All bits set mean there is no allocation and it uses only + * unallocated parts of the cache. */
Actually this doesn't seem to be correct from the documentation. It looks like that in order to make some part of the case dedicated to a group of processes you need to create that group, set which part of the cache should be used and make sure that no other group including the default one doesn't use that part of the cache. In other words, all bits set means that the processes can use any part of the cache.
+ if (!virBitmapIsAllSet(mask) && + virResctrlAllocUpdateMask(alloc, level, type, cache_id, mask) < 0) + goto cleanup; + + ret = 0; + cleanup: + virBitmapFree(mask); + return ret; +} + + +static int +virResctrlAllocParseProcessLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **caches = NULL; + char *tmp = NULL; + unsigned int level = 0; + int type = -1; + size_t ncaches = 0; + size_t i = 0; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + /* Skip lines that don't concern caches, e.g. MB: etc. */ + if (line[0] != 'L') + return 0; + + /* And lines that we can't parse too */ + tmp = strchr(line, ':'); + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(line + 1, &line, 10, &level) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + type = virResctrlTypeFromString(line); + if (type < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + caches = virStringSplitCount(tmp, ";", 0, &ncaches); + if (!caches) + return 0; + + for (i = 0; i < ncaches; i++) { + if (virResctrlAllocParseProcessCache(resctrl, alloc, level, type, caches[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(caches); + return ret; +} + + +static int +virResctrlAllocParse(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *schemata) +{ + char **lines = NULL; + size_t nlines = 0; + size_t i = 0; + int ret = -1; + + lines = virStringSplitCount(schemata, "\n", 0, &nlines); + for (i = 0; i < nlines; i++) { + if (virResctrlAllocParseProcessLine(resctrl, alloc, lines[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(lines); + return ret; +} + + +static void +virResctrlAllocSubtractPerType(virResctrlAllocPerTypePtr dst, + virResctrlAllocPerTypePtr src) +{ + size_t i = 0; + + if (!dst || !src) + return; + + for (i = 0; i < dst->nmasks && i < src->nmasks; i++) { + if (dst->masks[i] && src->masks[i]) + virBitmapSubtract(dst->masks[i], src->masks[i]); + } +} + + +static void +virResctrlAllocSubtract(virResctrlAllocPtr dst, + virResctrlAllocPtr src) +{ + size_t i = 0; + size_t j = 0; + + if (!src) + return; + + for (i = 0; i < dst->nlevels && i < src->nlevels; i++) { + if (dst->levels[i] && src->levels[i]) { + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocSubtractPerType(dst->levels[i]->types[j], + src->levels[i]->types[j]); + } + } + } +} + + +static virResctrlAllocPtr +virResctrlAllocNewFromInfo(virResctrlInfoPtr info) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + virResctrlAllocPtr ret = virResctrlAllocNew(); + virBitmapPtr mask = NULL; + + if (!ret) + return NULL; + + for (i = 0; i < info->nlevels; i++) { + virResctrlInfoPerLevelPtr i_level = info->levels[i]; + + if (!i_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlInfoPerTypePtr i_type = i_level->types[j]; + + if (!i_type) + continue; + + virBitmapFree(mask); + mask = virBitmapNew(i_type->bits); + if (!mask) + goto error; + virBitmapSetAll(mask); + + for (k = 0; k <= i_type->max_cache_id; k++) { + if (virResctrlAllocUpdateMask(ret, i, j, k, mask) < 0) + goto error; + } + } + } + + cleanup: + virBitmapFree(mask); + return ret; + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + +/* + * This function creates an allocation that represents all unused parts of all + * caches in the system. It uses virResctrlInfo for creating a new full + * allocation with all bits set (using virResctrlAllocNewFromInfo()) and then + * scans for all allocations under /sys/fs/resctrl and subtracts each one of + * them from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation. It is used for + * two things, a) calculating the masks when creating allocations and b) from + * tests. + */ +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) +{ + virResctrlAllocPtr ret = NULL; + virResctrlAllocPtr alloc = NULL; + struct dirent *ent = NULL; + DIR *dirp = NULL; + char *schemata = NULL; + int rv = -1; + + if (virResctrlInfoIsEmpty(resctrl)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return NULL; + } + + ret = virResctrlAllocNewFromInfo(resctrl); + if (!ret) + return NULL; + + if (virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/schemata") < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not read schemata file for the default group")); + goto error; + } + + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + if (!virResctrlAllocIsEmpty(alloc))
This check is not required if we change the logic in virResctrlAllocParseProcessCache(). There cannot be any cache group that has empty allocation.
+ virResctrlAllocSubtract(ret, alloc); + + if (virDirOpen(&dirp, SYSFS_RESCTRL_PATH) < 0) + goto error; + + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH)) > 0) { + if (ent->d_type != DT_DIR) + continue; + + if (STREQ(ent->d_name, "info")) + continue; + + VIR_FREE(schemata); + rv = virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/%s/schemata", + ent->d_name); + if (rv == -2) + continue; + + if (rv < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not read schemata file for group %s"), + ent->d_name); + goto error; + } + + virObjectUnref(alloc); + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + + virResctrlAllocSubtract(ret, alloc); + } + if (rv < 0) + goto error; + + cleanup: + virObjectUnref(alloc); + VIR_DIR_CLOSE(dirp); + VIR_FREE(schemata); + return ret; + + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + + + +static int +virResctrlAllocSetMask(virResctrlAllocPerTypePtr a_type, + unsigned int cache, + virBitmapPtr mask) +{ + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + a_type->masks[cache] = mask; + + return 0; +} + + +/* + * Given the information about requested allocation type `a_type`, the host + * cache for a particular type `i_type` and unused bits in the system `f_type` + * this function tries to find the smallest free space in which the allocation + * for cache id `cache` would fit. We're looking for the smallest place in + * order to minimize fragmentation and maximize the possibility of succeeding. + */ +static int +virResctrlAllocFindUnused(virResctrlAllocPerTypePtr a_type, + virResctrlInfoPerTypePtr i_type, + virResctrlAllocPerTypePtr f_type, + unsigned int level, + unsigned int type, + unsigned int cache) +{ + unsigned long long *size = a_type->sizes[cache]; + virBitmapPtr a_mask = NULL; + virBitmapPtr f_mask = NULL; + unsigned long long granularity; + unsigned long long need_bits; + size_t i = 0; + ssize_t pos = -1; + ssize_t last_bits = 0; + ssize_t last_pos = -1; + + /* If there is no reservation requested we need to set all bits. That's due + * to weird interface of the resctrl sysfs. It's also the reason why we + * cannot reserve the whole cache in one allocation. */ + if (!size) { + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + virBitmapSetAll(a_mask); + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; + } + + if (cache >= f_type->nmasks) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache with id %u does not exists for level %d"), + cache, level); + return -1; + } + + f_mask = f_type->masks[cache]; + if (!f_mask) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d id %u does not support tuning for " + "scope type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + if (*size == i_type->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation for the whole cache is not " + "possible, specify size smaller than %llu"), + i_type->size); + return -1; + } + + granularity = i_type->size / i_type->bits;
Can we use the already calculated granularity? i_type->control.granularity
+ need_bits = *size / granularity; + + if (*size % granularity) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is not " + "divisible by granularity %llu"), + *size, granularity); + return -1; + } + + if (need_bits < i_type->min_cbm_bits) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is smaller " + "than the minimum allowed allocation %llu"), + *size, granularity * i_type->min_cbm_bits);
Here we could use the already calculated minimum allowed allocation, but we set that value only if it's different from granularity. This makes me think whether we could modify the code to always set the minimum allocation and format it only if it's different from granularity.
+ return -1; + } + + while ((pos = virBitmapNextSetBit(f_mask, pos)) >= 0) { + ssize_t pos_clear = virBitmapNextClearBit(f_mask, pos); + ssize_t bits; + + if (pos_clear < 0) + pos_clear = virBitmapSize(f_mask); + + bits = pos_clear - pos; + + /* Not enough bits, move on and skip all of them */ + if (bits < need_bits) { + pos = pos_clear; + continue; + } + + /* This fits perfectly */ + if (bits == need_bits) { + last_pos = pos; + break; + } + + /* Remember the smaller region if we already found on before */ + if (last_pos < 0 || (last_bits && bits < last_bits)) { + last_bits = bits; + last_pos = pos; + } + + pos = pos_clear; + } + + if (last_pos < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of " + "%llu bytes for level %u cache %u " + "scope type '%s'"), + *size, level, cache, + virCacheTypeToString(type)); + return -1; + } + + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + for (i = last_pos; i < last_pos + need_bits; i++) { + ignore_value(virBitmapSetBit(a_mask, i)); + ignore_value(virBitmapClearBit(f_mask, i)); + } + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; +} + + +/* + * This function is called when creating an allocation in the system. What it + * does is that it gets all the unused bits using virResctrlAllocGetUnused() and + * then tries to find a proper space for every requested allocation effectively + * transforming `sizes` into `masks`. + */ +static int +virResctrlAllocMasksAssign(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc) +{ + int ret = -1; + unsigned int level = 0; + virResctrlAllocPtr alloc_free = NULL; + + alloc_free = virResctrlAllocGetUnused(resctrl); + if (!alloc_free) + return -1; + + for (level = 0; level < alloc->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = alloc->levels[level]; + virResctrlAllocPerLevelPtr f_level = NULL; + unsigned int type = 0; + + if (!a_level) + continue; + + if (level < alloc_free->nlevels) + f_level = alloc_free->levels[level]; + + if (!f_level) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning"), + level); + goto cleanup; + } + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + virResctrlAllocPerTypePtr f_type = f_level->types[type]; + unsigned int cache = 0; + + if (!a_type) + continue; + + if (!f_type) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning for " + "scope type '%s'"), + level, virCacheTypeToString(type)); + goto cleanup; + } + + for (cache = 0; cache < a_type->nsizes; cache++) { + virResctrlInfoPerLevelPtr i_level = resctrl->levels[level]; + virResctrlInfoPerTypePtr i_type = i_level->types[type]; + + if (virResctrlAllocFindUnused(a_type, i_type, f_type, level, type, cache) < 0) + goto cleanup; + } + } + } + + ret = 0; + cleanup: + virObjectUnref(alloc_free); + return ret; +} + + +/* This checks if the directory for the alloc exists. If not it tries to create + * it and apply appropriate alloc settings. */ +int +virResctrlAllocCreate(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *machinename) +{ + char *schemata_path = NULL; + char *alloc_str = NULL; + int ret = -1; + int lockfd = -1; + + if (!alloc) + return 0; + + if (!resctrl) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return -1; + } + + if (!alloc->id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl Allocation ID must be set before creation")); + return -1; + } + + if (!alloc->path && + virAsprintf(&alloc->path, "%s/%s-%s", + SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + return -1; + + /* Check if this allocation was already created */ + if (virFileIsDir(alloc->path)) + return 0;
Shouldn't we fail if the allocation already exists? This function is called when the guest is started therefore the allocation should not exist and if it exists there is something wrong and the allocation will not be applied or wrong allocation may be applied.
+ + if (virFileExists(alloc->path)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Path '%s' for resctrl allocation exists and is not a " + "directory"), alloc->path); + goto cleanup; + } + + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + goto cleanup; + + if (virResctrlAllocMasksAssign(resctrl, alloc) < 0) + goto cleanup; + + alloc_str = virResctrlAllocFormat(alloc); + if (!alloc_str) + goto cleanup; + + if (virAsprintf(&schemata_path, "%s/schemata", alloc->path) < 0) + goto cleanup; + + if (virFileMakePath(alloc->path) < 0) { + virReportSystemError(errno, + _("Cannot create resctrl directory '%s'"), + alloc->path); + goto cleanup; + } + + VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, schemata_path); + if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) { + rmdir(alloc->path); + virReportSystemError(errno, + _("Cannot write into schemata file '%s'"), + schemata_path); + goto cleanup; + } + + ret = 0; + cleanup: + virResctrlUnlock(lockfd); + VIR_FREE(alloc_str); + VIR_FREE(schemata_path); + return ret; +} + + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid) +{ + char *tasks = NULL; + char *pidstr = NULL; + int ret = 0; + + if (!alloc->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot add pid to non-existing resctrl allocation")); + return -1; + } + + if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0) + return -1; + + if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0) + goto cleanup; + + if (virFileWriteStr(tasks, pidstr, 0) < 0) { + virReportSystemError(errno, + _("Cannot write pid in tasks file '%s'"), + tasks); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(tasks); + VIR_FREE(pidstr); + return ret; +} + + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc) +{ + int ret = 0; + + if (!alloc->path) + return 0; + + VIR_DEBUG("Removing resctrl allocation %s", alloc->path); + if (rmdir(alloc->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), alloc->path, errno); + } + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 471c02f28dcd..27793c86a745 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -68,4 +68,53 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, size_t *ncontrols, virResctrlInfoPerCachePtr **controls);
+/* Alloc-related things */ +typedef struct _virResctrlAlloc virResctrlAlloc; +typedef virResctrlAlloc *virResctrlAllocPtr; + +typedef int virResctrlAllocForeachSizeCallback(unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size, + void *opaque); + +virResctrlAllocPtr +virResctrlAllocNew(void); + +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl); + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size); + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque); + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id); +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc); + +char * +virResctrlAllocFormat(virResctrlAllocPtr alloc); + +int +virResctrlAllocCreate(virResctrlInfoPtr r_info, + virResctrlAllocPtr alloc, + const char *machinename); + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid); + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc); + #endif /* __VIR_RESCTRL_H__ */ diff --git a/src/util/virresctrlpriv.h b/src/util/virresctrlpriv.h new file mode 100644 index 000000000000..72b115afd13e --- /dev/null +++ b/src/util/virresctrlpriv.h @@ -0,0 +1,27 @@ +/* + * virresctrlpriv.h: + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_RESCTRL_PRIV_H__ +# define __VIR_RESCTRL_PRIV_H__ + +# include "virresctrl.h" + +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl); + +#endif /* __VIR_RESCTRL_PRIV_H__ */ -- 2.16.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Wed, Jan 24, 2018 at 02:51:46PM +0100, Pavel Hrdina wrote:
On Tue, Jan 23, 2018 at 07:05:14PM +0100, Martin Kletzander wrote:
With this commit we finally have a way to read and manipulate basic resctrl settings. Locking is done only on exposed functions that read/write from/to resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/Makefile.am | 2 +- src/libvirt_private.syms | 10 + src/util/virresctrl.c | 1211 ++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 49 ++ src/util/virresctrlpriv.h | 27 + 5 files changed, 1288 insertions(+), 11 deletions(-) create mode 100644 src/util/virresctrlpriv.h
diff --git a/src/Makefile.am b/src/Makefile.am index 166c9a8e9199..289b03747212 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -167,7 +167,7 @@ UTIL_SOURCES = \ util/virprocess.c util/virprocess.h \ util/virqemu.c util/virqemu.h \ util/virrandom.h util/virrandom.c \ - util/virresctrl.h util/virresctrl.c \ + util/virresctrl.h util/virresctrl.c util/virresctrlpriv.h \ util/virrotatingfile.h util/virrotatingfile.c \ util/virscsi.c util/virscsi.h \ util/virscsihost.c util/virscsihost.h \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d63474e0fc20..f2a2c8650d97 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2552,6 +2552,16 @@ virRandomInt; # util/virresctrl.h virCacheTypeFromString; virCacheTypeToString; +virResctrlAllocAddPID; +virResctrlAllocCreate; +virResctrlAllocForeachSize; +virResctrlAllocFormat; +virResctrlAllocGetID; +virResctrlAllocGetUnused; +virResctrlAllocNew; +virResctrlAllocRemove; +virResctrlAllocSetID; +virResctrlAllocSetSize; virResctrlGetInfo; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f32929468c80..b5c0ea338830 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -18,8 +18,12 @@
#include <config.h>
-#include "virresctrl.h" +#include <sys/file.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h>
+#include "virresctrlpriv.h" #include "c-ctype.h" #include "count-one-bits.h" #include "viralloc.h" @@ -152,6 +156,195 @@ virResctrlInfoNew(void) }
+/* Alloc-related definitions and AllocClass-related functions */ + +/* + * virResctrlAlloc represents one allocation (in XML under cputune/cachetune and + * consequently a directory under /sys/fs/resctrl). Since it can have multiple + * parts of multiple caches allocated it is represented as bunch of nested + * sparse arrays (by sparse I mean array of pointers so that each might be NULL + * in case there is no allocation for that particular one (level, cache, ...)). + * + * Since one allocation can be made for caches on different levels, the first + * nested sparse array is of types virResctrlAllocPerLevel. For example if you + * have allocation for level 3 cache, there will be three NULL pointers and then + * allocated pointer to virResctrlAllocPerLevel. That way you can access it by + * `alloc[level]` as O(1) is desired instead of crawling through normal arrays + * or lists in three nested loops. The code uses a lot of direct accesses. + * + * Each virResctrlAllocPerLevel can have allocations for different cache + * allocation types. You can allocate instruction cache (VIR_CACHE_TYPE_CODE), + * data cache (VIR_CACHE_TYPE_DATA) or unified cache (VIR_CACHE_TYPE_BOTH). + * Those allocations are kept in sparse array of virResctrlAllocPerType pointers. + * + * For each virResctrlAllocPerType users can request some size of the cache to + * be allocated. That's what the sparse array `sizes` is for. Non-NULL + * pointers represent requested size allocations. The array is indexed by host + * cache id (gotten from `/sys/devices/system/cpu/cpuX/cache/indexY/id`). Users + * can see this information e.g. in the output of `virsh capabilities` (for that + * information there's the other struct, namely `virResctrlInfo`). + * + * When allocation is being created we need to find unused part of the cache for + * all of them. While doing that we store the bitmask in a sparse array of + * virBitmaps named `masks` indexed the same way as `sizes`. The upper bounds + * of the sparse arrays are stored in nmasks or nsizes, respectively. + */ + +/* virResctrlAllocPerType represents */
Incomplete comment? :)
Fixed.
+typedef struct _virResctrlAllocPerType virResctrlAllocPerType; +typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; +struct _virResctrlAllocPerType { + /* There could be bool saying whether this is set or not, but since everything + * in virResctrlAlloc (and most of libvirt) goes with pointer arrays we would + * have to have one more level of allocation anyway, so this stays faithful to + * the concept */ + unsigned long long **sizes; + size_t nsizes; + + /* Mask for each cache */ + virBitmapPtr *masks; + size_t nmasks; +}; + +typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; +typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; +struct _virResctrlAllocPerLevel { + virResctrlAllocPerTypePtr *types; /* Indexed with enum virCacheType */ + /* There is no `ntypes` member variable as it is always allocated for + * VIR_CACHE_TYPE_LAST number of items */ +}; + +struct _virResctrlAlloc { + virObject parent; + + virResctrlAllocPerLevelPtr *levels; + size_t nlevels; + + /* The identifier (any unique string for now) */ + char *id; + /* libvirt-generated path in /sys/fs/resctrl for this particular + * allocation */ + char *path; +}; + +static virClassPtr virResctrlAllocClass; + +static void +virResctrlAllocDispose(void *obj) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + virResctrlAllocPtr resctrl = obj; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr level = resctrl->levels[i]; + + if (!level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr type = level->types[j]; + + if (!type) + continue; + + for (k = 0; k < type->nsizes; k++) + VIR_FREE(type->sizes[k]); + + for (k = 0; k < type->nmasks; k++) + virBitmapFree(type->masks[k]); + + VIR_FREE(type->sizes); + VIR_FREE(type->masks); + VIR_FREE(type); + } + VIR_FREE(level->types); + VIR_FREE(level); + } + + VIR_FREE(resctrl->id); + VIR_FREE(resctrl->path); + VIR_FREE(resctrl->levels); +} + + +static int +virResctrlAllocOnceInit(void) +{ + if (!(virResctrlAllocClass = virClassNew(virClassForObject(), + "virResctrlAlloc", + sizeof(virResctrlAlloc), + virResctrlAllocDispose))) + return -1; + + return 0; +} + + +VIR_ONCE_GLOBAL_INIT(virResctrlAlloc) + + +virResctrlAllocPtr +virResctrlAllocNew(void) +{ + if (virResctrlAllocInitialize() < 0) + return NULL; + + return virObjectNew(virResctrlAllocClass); +} + + +/* Common functions */ +static int +virResctrlLockInternal(int op) +{ + int fd = open(SYSFS_RESCTRL_PATH, O_DIRECTORY | O_CLOEXEC); + + if (fd < 0) { + virReportSystemError(errno, "%s", _("Cannot open resctrlfs")); + return -1; + } + + if (flock(fd, op) < 0) { + virReportSystemError(errno, "%s", _("Cannot lock resctrlfs")); + VIR_FORCE_CLOSE(fd); + return -1; + } + + return fd; +} + + +static inline int +virResctrlLockWrite(void) +{ + return virResctrlLockInternal(LOCK_EX); +} + + +static int +virResctrlUnlock(int fd) +{ + if (fd == -1) + return 0; + + /* The lock gets unlocked by closing the fd, which we need to do anyway in + * order to clean up properly */ + if (VIR_CLOSE(fd) < 0) { + virReportSystemError(errno, "%s", _("Cannot close resctrlfs")); + + /* Trying to save the already broken */ + if (flock(fd, LOCK_UN) < 0) + virReportSystemError(errno, "%s", _("Cannot unlock resctrlfs")); + return -1; + } + + return 0; +} + + /* Info-related functions */ bool virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) @@ -200,6 +393,8 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) }
while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { + VIR_DEBUG("Parsing info type '%s'", ent->d_name); + if (ent->d_type != DT_DIR) continue;
@@ -207,16 +402,14 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) continue;
if (virStrToLong_uip(ent->d_name + 1, &endptr, 10, &level) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info level")); + VIR_DEBUG("Cannot parse resctrl cache info level '%s'", ent->d_name + 1); goto cleanup;
I guess that there should be "continue" instead of "goto cleanup".
Good point, fixed.
}
type = virResctrlTypeFromString(endptr); if (type < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Cannot parse resctrl cache info type")); - goto cleanup; + VIR_DEBUG("Cannot parse resctrl cache info type '%s'", endptr); + continue; }
if (VIR_ALLOC(i_type) < 0) @@ -259,10 +452,19 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) level - resctrl->nlevels + 1) < 0) goto cleanup;
- if (!resctrl->levels[level] && - (VIR_ALLOC(resctrl->levels[level]) < 0 || - VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0)) - goto cleanup; + if (!resctrl->levels[level]) { + virResctrlInfoPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + goto cleanup; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + goto cleanup; + } + resctrl->levels[level]->types = types; + } + i_level = resctrl->levels[level];
if (i_level->types[type]) { @@ -358,3 +560,992 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, VIR_FREE(*controls); goto cleanup; }
This whole info related part should be probably squashed into patch 3.
I've done that now. Probably just a silly mistake from addition of lot of fixups.
+ + +/* Alloc-related functions */ +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + + if (!resctrl) + return true; + + for (i = 0; i < resctrl->nlevels; i++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[i]; + + if (!a_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocPerTypePtr a_type = a_level->types[j]; + + if (!a_type) + continue; + + for (k = 0; k < a_type->nsizes; k++) { + if (a_type->sizes[k]) + return false; + } + + for (k = 0; k < a_type->nmasks; k++) { + if (a_type->masks[k]) + return false; + } + } + } + + return true; +} + + +static virResctrlAllocPerTypePtr +virResctrlAllocGetType(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + + if (resctrl->nlevels <= level && + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, level - resctrl->nlevels + 1) < 0) + return NULL; + + if (!resctrl->levels[level]) { + virResctrlAllocPerTypePtr *types = NULL; + + if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0) + return NULL; + + if (VIR_ALLOC(resctrl->levels[level]) < 0) { + VIR_FREE(types); + return NULL; + } + resctrl->levels[level]->types = types; + } + + a_level = resctrl->levels[level]; + + if (!a_level->types[type] && VIR_ALLOC(a_level->types[type]) < 0) + return NULL; + + return a_level->types[type]; +} + + +static int +virResctrlAllocUpdateMask(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + virBitmapPtr mask) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + if (!a_type->masks[cache]) { + a_type->masks[cache] = virBitmapNew(virBitmapSize(mask)); + + if (!a_type->masks[cache]) + return -1; + } + + return virBitmapCopy(a_type->masks[cache], mask); +} + + +static int +virResctrlAllocUpdateSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level, type); + + if (!a_type) + return -1; + + if (a_type->nsizes <= cache && + VIR_EXPAND_N(a_type->sizes, a_type->nsizes, + cache - a_type->nsizes + 1) < 0) + return -1; + + if (!a_type->sizes[cache] && VIR_ALLOC(a_type->sizes[cache]) < 0) + return -1; + + *(a_type->sizes[cache]) = size; + + return 0; +} + + +/* + * Check if there is an allocation for this level/type/cache already. Called + * before updating the structure. VIR_CACHE_TYPE_BOTH collides with any type, + * the other types collide with itself. This code basically checks if either: + * `alloc[level]->types[type]->sizes[cache]` + * or + * `alloc[level]->types[VIR_CACHE_TYPE_BOTH]->sizes[cache]` + * is non-NULL. All the fuzz around it is checking for NULL pointers along + * the way. + */ +static bool +virResctrlAllocCheckCollision(virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + unsigned int cache) +{ + virResctrlAllocPerLevelPtr a_level = NULL; + virResctrlAllocPerTypePtr a_type = NULL; + + if (!alloc) + return false; + + if (alloc->nlevels <= level) + return false; + + a_level = alloc->levels[level]; + + if (!a_level) + return false; + + a_type = a_level->types[VIR_CACHE_TYPE_BOTH]; + + /* If there is an allocation for type 'both', there can be no other + * allocation for the same cache */ + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + if (type == VIR_CACHE_TYPE_BOTH) { + a_type = a_level->types[VIR_CACHE_TYPE_CODE]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + + a_type = a_level->types[VIR_CACHE_TYPE_DATA]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } else { + a_type = a_level->types[type]; + + if (a_type && a_type->nsizes > cache && a_type->sizes[cache]) + return true; + } + + return false; +} + + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size) +{ + if (virResctrlAllocCheckCollision(resctrl, level, type, cache)) { + virReportError(VIR_ERR_XML_ERROR, + _("Colliding cache allocations for cache " + "level '%u' id '%u', type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + return virResctrlAllocUpdateSize(resctrl, level, type, cache, size); +} + + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque) +{ + int ret = 0; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return 0; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + for (cache = 0; cache < a_type->nsizes; cache++) { + unsigned long long *size = a_type->sizes[cache]; + + if (!size) + continue; + + ret = cb(level, type, cache, *size, opaque); + if (ret < 0) + return ret; + } + } + } + + return 0; +} + + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id) +{ + if (!id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl allocation 'id' cannot be NULL")); + return -1; + } + + return VIR_STRDUP(alloc->id, id); +} + + +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc) +{ + return alloc->id; +} + + +char * +virResctrlAllocFormat(virResctrlAllocPtr resctrl) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + unsigned int level = 0; + unsigned int type = 0; + unsigned int cache = 0; + + if (!resctrl) + return NULL; + + for (level = 0; level < resctrl->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level]; + + if (!a_level) + continue; + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + + if (!a_type) + continue; + + virBufferAsprintf(&buf, "L%u%s:", level, virResctrlTypeToString(type)); + + for (cache = 0; cache < a_type->nmasks; cache++) { + virBitmapPtr mask = a_type->masks[cache]; + char *mask_str = NULL; + + if (!mask) + continue; + + mask_str = virBitmapToString(mask, false, true); + if (!mask_str) { + virBufferFreeAndReset(&buf); + return NULL; + } + + virBufferAsprintf(&buf, "%u=%s;", cache, mask_str); + VIR_FREE(mask_str); + } + + virBufferTrim(&buf, ";", 1); + virBufferAddChar(&buf, '\n'); + } + } + + virBufferCheckError(&buf); + return virBufferContentAndReset(&buf); +} + + +static int +virResctrlAllocParseProcessCache(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + unsigned int level, + virCacheType type, + char *cache) +{ + char *tmp = strchr(cache, '='); + unsigned int cache_id = 0; + virBitmapPtr mask = NULL; + int ret = -1; + + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(cache, NULL, 10, &cache_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid cache id '%s'"), cache); + return -1; + } + + mask = virBitmapNewString(tmp); + if (!mask) + return -1; + + virBitmapShrink(mask, resctrl->levels[level]->types[type]->bits); + + /* Guess what! All bits set mean there is no allocation and it uses only + * unallocated parts of the cache. */
Actually this doesn't seem to be correct from the documentation. It looks like that in order to make some part of the case dedicated to a group of processes you need to create that group, set which part of the cache should be used and make sure that no other group including the default one doesn't use that part of the cache.
In other words, all bits set means that the processes can use any part of the cache.
Unfortunately, that's right. I went through the documentation many times and missed that. Also I got confused by other misinformation that's circulating in the air =) I fixed it back to the previous version (it was added in this iteration).
+ if (!virBitmapIsAllSet(mask) && + virResctrlAllocUpdateMask(alloc, level, type, cache_id, mask) < 0) + goto cleanup; + + ret = 0; + cleanup: + virBitmapFree(mask); + return ret; +} + + +static int +virResctrlAllocParseProcessLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **caches = NULL; + char *tmp = NULL; + unsigned int level = 0; + int type = -1; + size_t ncaches = 0; + size_t i = 0; + int ret = -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + /* Skip lines that don't concern caches, e.g. MB: etc. */ + if (line[0] != 'L') + return 0; + + /* And lines that we can't parse too */ + tmp = strchr(line, ':'); + if (!tmp) + return 0; + + *tmp = '\0'; + tmp++; + + if (virStrToLong_uip(line + 1, &line, 10, &level) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + type = virResctrlTypeFromString(line); + if (type < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl schema level '%s'"), + line + 1); + return -1; + } + + caches = virStringSplitCount(tmp, ";", 0, &ncaches); + if (!caches) + return 0; + + for (i = 0; i < ncaches; i++) { + if (virResctrlAllocParseProcessCache(resctrl, alloc, level, type, caches[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(caches); + return ret; +} + + +static int +virResctrlAllocParse(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *schemata) +{ + char **lines = NULL; + size_t nlines = 0; + size_t i = 0; + int ret = -1; + + lines = virStringSplitCount(schemata, "\n", 0, &nlines); + for (i = 0; i < nlines; i++) { + if (virResctrlAllocParseProcessLine(resctrl, alloc, lines[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFree(lines); + return ret; +} + + +static void +virResctrlAllocSubtractPerType(virResctrlAllocPerTypePtr dst, + virResctrlAllocPerTypePtr src) +{ + size_t i = 0; + + if (!dst || !src) + return; + + for (i = 0; i < dst->nmasks && i < src->nmasks; i++) { + if (dst->masks[i] && src->masks[i]) + virBitmapSubtract(dst->masks[i], src->masks[i]); + } +} + + +static void +virResctrlAllocSubtract(virResctrlAllocPtr dst, + virResctrlAllocPtr src) +{ + size_t i = 0; + size_t j = 0; + + if (!src) + return; + + for (i = 0; i < dst->nlevels && i < src->nlevels; i++) { + if (dst->levels[i] && src->levels[i]) { + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlAllocSubtractPerType(dst->levels[i]->types[j], + src->levels[i]->types[j]); + } + } + } +} + + +static virResctrlAllocPtr +virResctrlAllocNewFromInfo(virResctrlInfoPtr info) +{ + size_t i = 0; + size_t j = 0; + size_t k = 0; + virResctrlAllocPtr ret = virResctrlAllocNew(); + virBitmapPtr mask = NULL; + + if (!ret) + return NULL; + + for (i = 0; i < info->nlevels; i++) { + virResctrlInfoPerLevelPtr i_level = info->levels[i]; + + if (!i_level) + continue; + + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) { + virResctrlInfoPerTypePtr i_type = i_level->types[j]; + + if (!i_type) + continue; + + virBitmapFree(mask); + mask = virBitmapNew(i_type->bits); + if (!mask) + goto error; + virBitmapSetAll(mask); + + for (k = 0; k <= i_type->max_cache_id; k++) { + if (virResctrlAllocUpdateMask(ret, i, j, k, mask) < 0) + goto error; + } + } + } + + cleanup: + virBitmapFree(mask); + return ret; + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + +/* + * This function creates an allocation that represents all unused parts of all + * caches in the system. It uses virResctrlInfo for creating a new full + * allocation with all bits set (using virResctrlAllocNewFromInfo()) and then + * scans for all allocations under /sys/fs/resctrl and subtracts each one of + * them from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation. It is used for + * two things, a) calculating the masks when creating allocations and b) from + * tests. + */ +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) +{ + virResctrlAllocPtr ret = NULL; + virResctrlAllocPtr alloc = NULL; + struct dirent *ent = NULL; + DIR *dirp = NULL; + char *schemata = NULL; + int rv = -1; + + if (virResctrlInfoIsEmpty(resctrl)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return NULL; + } + + ret = virResctrlAllocNewFromInfo(resctrl); + if (!ret) + return NULL; + + if (virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/schemata") < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not read schemata file for the default group")); + goto error; + } + + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + if (!virResctrlAllocIsEmpty(alloc))
This check is not required if we change the logic in virResctrlAllocParseProcessCache(). There cannot be any cache group that has empty allocation.
Yeah, fixed.
+ virResctrlAllocSubtract(ret, alloc); + + if (virDirOpen(&dirp, SYSFS_RESCTRL_PATH) < 0) + goto error; + + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH)) > 0) { + if (ent->d_type != DT_DIR) + continue; + + if (STREQ(ent->d_name, "info")) + continue; + + VIR_FREE(schemata); + rv = virFileReadValueString(&schemata, + SYSFS_RESCTRL_PATH + "/%s/schemata", + ent->d_name); + if (rv == -2) + continue; + + if (rv < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Could not read schemata file for group %s"), + ent->d_name); + goto error; + } + + virObjectUnref(alloc); + alloc = virResctrlAllocNew(); + if (!alloc) + goto error; + + if (virResctrlAllocParse(resctrl, alloc, schemata) < 0) + goto error; + + virResctrlAllocSubtract(ret, alloc); + } + if (rv < 0) + goto error; + + cleanup: + virObjectUnref(alloc); + VIR_DIR_CLOSE(dirp); + VIR_FREE(schemata); + return ret; + + error: + virObjectUnref(ret); + ret = NULL; + goto cleanup; +} + + + +static int +virResctrlAllocSetMask(virResctrlAllocPerTypePtr a_type, + unsigned int cache, + virBitmapPtr mask) +{ + if (a_type->nmasks <= cache && + VIR_EXPAND_N(a_type->masks, a_type->nmasks, + cache - a_type->nmasks + 1) < 0) + return -1; + + a_type->masks[cache] = mask; + + return 0; +} + + +/* + * Given the information about requested allocation type `a_type`, the host + * cache for a particular type `i_type` and unused bits in the system `f_type` + * this function tries to find the smallest free space in which the allocation + * for cache id `cache` would fit. We're looking for the smallest place in + * order to minimize fragmentation and maximize the possibility of succeeding. + */ +static int +virResctrlAllocFindUnused(virResctrlAllocPerTypePtr a_type, + virResctrlInfoPerTypePtr i_type, + virResctrlAllocPerTypePtr f_type, + unsigned int level, + unsigned int type, + unsigned int cache) +{ + unsigned long long *size = a_type->sizes[cache]; + virBitmapPtr a_mask = NULL; + virBitmapPtr f_mask = NULL; + unsigned long long granularity; + unsigned long long need_bits; + size_t i = 0; + ssize_t pos = -1; + ssize_t last_bits = 0; + ssize_t last_pos = -1; + + /* If there is no reservation requested we need to set all bits. That's due + * to weird interface of the resctrl sysfs. It's also the reason why we + * cannot reserve the whole cache in one allocation. */ + if (!size) { + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + virBitmapSetAll(a_mask); + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; + } + + if (cache >= f_type->nmasks) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache with id %u does not exists for level %d"), + cache, level); + return -1; + } + + f_mask = f_type->masks[cache]; + if (!f_mask) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d id %u does not support tuning for " + "scope type '%s'"), + level, cache, virCacheTypeToString(type)); + return -1; + } + + if (*size == i_type->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation for the whole cache is not " + "possible, specify size smaller than %llu"), + i_type->size); + return -1; + } + + granularity = i_type->size / i_type->bits;
Can we use the already calculated granularity?
i_type->control.granularity
Good point, modified appropriately now.
+ need_bits = *size / granularity; + + if (*size % granularity) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is not " + "divisible by granularity %llu"), + *size, granularity); + return -1; + } + + if (need_bits < i_type->min_cbm_bits) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache allocation of size %llu is smaller " + "than the minimum allowed allocation %llu"), + *size, granularity * i_type->min_cbm_bits);
Here we could use the already calculated minimum allowed allocation, but we set that value only if it's different from granularity.
This makes me think whether we could modify the code to always set the minimum allocation and format it only if it's different from granularity.
Yeah, we could do that. Since it requires patch in yet another part of the code and the functionality would be unchanged, I'll leave this for a follow-up.
+ return -1; + } + + while ((pos = virBitmapNextSetBit(f_mask, pos)) >= 0) { + ssize_t pos_clear = virBitmapNextClearBit(f_mask, pos); + ssize_t bits; + + if (pos_clear < 0) + pos_clear = virBitmapSize(f_mask); + + bits = pos_clear - pos; + + /* Not enough bits, move on and skip all of them */ + if (bits < need_bits) { + pos = pos_clear; + continue; + } + + /* This fits perfectly */ + if (bits == need_bits) { + last_pos = pos; + break; + } + + /* Remember the smaller region if we already found on before */ + if (last_pos < 0 || (last_bits && bits < last_bits)) { + last_bits = bits; + last_pos = pos; + } + + pos = pos_clear; + } + + if (last_pos < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of " + "%llu bytes for level %u cache %u " + "scope type '%s'"), + *size, level, cache, + virCacheTypeToString(type)); + return -1; + } + + a_mask = virBitmapNew(i_type->bits); + if (!a_mask) + return -1; + + for (i = last_pos; i < last_pos + need_bits; i++) { + ignore_value(virBitmapSetBit(a_mask, i)); + ignore_value(virBitmapClearBit(f_mask, i)); + } + + if (virResctrlAllocSetMask(a_type, cache, a_mask) < 0) { + virBitmapFree(a_mask); + return -1; + } + + return 0; +} + + +/* + * This function is called when creating an allocation in the system. What it + * does is that it gets all the unused bits using virResctrlAllocGetUnused() and + * then tries to find a proper space for every requested allocation effectively + * transforming `sizes` into `masks`. + */ +static int +virResctrlAllocMasksAssign(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc) +{ + int ret = -1; + unsigned int level = 0; + virResctrlAllocPtr alloc_free = NULL; + + alloc_free = virResctrlAllocGetUnused(resctrl); + if (!alloc_free) + return -1; + + for (level = 0; level < alloc->nlevels; level++) { + virResctrlAllocPerLevelPtr a_level = alloc->levels[level]; + virResctrlAllocPerLevelPtr f_level = NULL; + unsigned int type = 0; + + if (!a_level) + continue; + + if (level < alloc_free->nlevels) + f_level = alloc_free->levels[level]; + + if (!f_level) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning"), + level); + goto cleanup; + } + + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) { + virResctrlAllocPerTypePtr a_type = a_level->types[type]; + virResctrlAllocPerTypePtr f_type = f_level->types[type]; + unsigned int cache = 0; + + if (!a_type) + continue; + + if (!f_type) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Cache level %d does not support tuning for " + "scope type '%s'"), + level, virCacheTypeToString(type)); + goto cleanup; + } + + for (cache = 0; cache < a_type->nsizes; cache++) { + virResctrlInfoPerLevelPtr i_level = resctrl->levels[level]; + virResctrlInfoPerTypePtr i_type = i_level->types[type]; + + if (virResctrlAllocFindUnused(a_type, i_type, f_type, level, type, cache) < 0) + goto cleanup; + } + } + } + + ret = 0; + cleanup: + virObjectUnref(alloc_free); + return ret; +} + + +/* This checks if the directory for the alloc exists. If not it tries to create + * it and apply appropriate alloc settings. */ +int +virResctrlAllocCreate(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + const char *machinename) +{ + char *schemata_path = NULL; + char *alloc_str = NULL; + int ret = -1; + int lockfd = -1; + + if (!alloc) + return 0; + + if (!resctrl) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Resource control is not supported on this host")); + return -1; + } + + if (!alloc->id) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Resctrl Allocation ID must be set before creation")); + return -1; + } + + if (!alloc->path && + virAsprintf(&alloc->path, "%s/%s-%s", + SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0) + return -1; + + /* Check if this allocation was already created */ + if (virFileIsDir(alloc->path)) + return 0;
Shouldn't we fail if the allocation already exists? This function is called when the guest is started therefore the allocation should not exist and if it exists there is something wrong and the allocation will not be applied or wrong allocation may be applied.
Yeah, I thought this will get called for each vCPU (that was one of the earlier not-sent-upstream implemetations), but it is not now, so I removed this part and adjusted the next error message that will catch it.
+ + if (virFileExists(alloc->path)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Path '%s' for resctrl allocation exists and is not a " + "directory"), alloc->path); + goto cleanup; + } + + lockfd = virResctrlLockWrite(); + if (lockfd < 0) + goto cleanup; + + if (virResctrlAllocMasksAssign(resctrl, alloc) < 0) + goto cleanup; + + alloc_str = virResctrlAllocFormat(alloc); + if (!alloc_str) + goto cleanup; + + if (virAsprintf(&schemata_path, "%s/schemata", alloc->path) < 0) + goto cleanup; + + if (virFileMakePath(alloc->path) < 0) { + virReportSystemError(errno, + _("Cannot create resctrl directory '%s'"), + alloc->path); + goto cleanup; + } + + VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, schemata_path); + if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) { + rmdir(alloc->path); + virReportSystemError(errno, + _("Cannot write into schemata file '%s'"), + schemata_path); + goto cleanup; + } + + ret = 0; + cleanup: + virResctrlUnlock(lockfd); + VIR_FREE(alloc_str); + VIR_FREE(schemata_path); + return ret; +} + + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid) +{ + char *tasks = NULL; + char *pidstr = NULL; + int ret = 0; + + if (!alloc->path) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot add pid to non-existing resctrl allocation")); + return -1; + } + + if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0) + return -1; + + if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0) + goto cleanup; + + if (virFileWriteStr(tasks, pidstr, 0) < 0) { + virReportSystemError(errno, + _("Cannot write pid in tasks file '%s'"), + tasks); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(tasks); + VIR_FREE(pidstr); + return ret; +} + + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc) +{ + int ret = 0; + + if (!alloc->path) + return 0; + + VIR_DEBUG("Removing resctrl allocation %s", alloc->path); + if (rmdir(alloc->path) != 0 && errno != ENOENT) { + ret = -errno; + VIR_ERROR(_("Unable to remove %s (%d)"), alloc->path, errno); + } + + return ret; +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 471c02f28dcd..27793c86a745 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -68,4 +68,53 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, size_t *ncontrols, virResctrlInfoPerCachePtr **controls);
+/* Alloc-related things */ +typedef struct _virResctrlAlloc virResctrlAlloc; +typedef virResctrlAlloc *virResctrlAllocPtr; + +typedef int virResctrlAllocForeachSizeCallback(unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size, + void *opaque); + +virResctrlAllocPtr +virResctrlAllocNew(void); + +bool +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl); + +int +virResctrlAllocSetSize(virResctrlAllocPtr resctrl, + unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size); + +int +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl, + virResctrlAllocForeachSizeCallback cb, + void *opaque); + +int +virResctrlAllocSetID(virResctrlAllocPtr alloc, + const char *id); +const char * +virResctrlAllocGetID(virResctrlAllocPtr alloc); + +char * +virResctrlAllocFormat(virResctrlAllocPtr alloc); + +int +virResctrlAllocCreate(virResctrlInfoPtr r_info, + virResctrlAllocPtr alloc, + const char *machinename); + +int +virResctrlAllocAddPID(virResctrlAllocPtr alloc, + pid_t pid); + +int +virResctrlAllocRemove(virResctrlAllocPtr alloc); + #endif /* __VIR_RESCTRL_H__ */ diff --git a/src/util/virresctrlpriv.h b/src/util/virresctrlpriv.h new file mode 100644 index 000000000000..72b115afd13e --- /dev/null +++ b/src/util/virresctrlpriv.h @@ -0,0 +1,27 @@ +/* + * virresctrlpriv.h: + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_RESCTRL_PRIV_H__ +# define __VIR_RESCTRL_PRIV_H__ + +# include "virresctrl.h" + +virResctrlAllocPtr +virResctrlAllocGetUnused(virResctrlInfoPtr resctrl); + +#endif /* __VIR_RESCTRL_PRIV_H__ */ -- 2.16.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Wed, Jan 24, 2018 at 10:57:05PM +0100, Martin Kletzander wrote:
On Wed, Jan 24, 2018 at 02:51:46PM +0100, Pavel Hrdina wrote:
On Tue, Jan 23, 2018 at 07:05:14PM +0100, Martin Kletzander wrote:
With this commit we finally have a way to read and manipulate basic resctrl settings. Locking is done only on exposed functions that read/write from/to resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu...
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/Makefile.am | 2 +- src/libvirt_private.syms | 10 + src/util/virresctrl.c | 1211 ++++++++++++++++++++++++++++++++++++++++++++- src/util/virresctrl.h | 49 ++ src/util/virresctrlpriv.h | 27 + 5 files changed, 1288 insertions(+), 11 deletions(-) create mode 100644 src/util/virresctrlpriv.h
With the fixes included Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

More info in the documentation, this is basically the XML parsing/formatting support, schemas, tests and documentation for the new cputune/cachetune element that will get used by following patches. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/formatdomain.html.in | 54 ++++ docs/schemas/domaincommon.rng | 32 +++ src/conf/domain_conf.c | 295 ++++++++++++++++++++- src/conf/domain_conf.h | 13 + tests/genericxml2xmlindata/cachetune-cdp.xml | 36 +++ .../cachetune-colliding-allocs.xml | 30 +++ .../cachetune-colliding-tunes.xml | 32 +++ .../cachetune-colliding-types.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 29 ++ tests/genericxml2xmlindata/cachetune.xml | 33 +++ tests/genericxml2xmltest.c | 10 + 11 files changed, 592 insertions(+), 2 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-cdp.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-tunes.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-types.xml create mode 100644 tests/genericxml2xmlindata/cachetune-small.xml create mode 100644 tests/genericxml2xmlindata/cachetune.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d272cc1ba698..7b4d9051a551 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -689,6 +689,10 @@ <iothread_quota>-1</iothread_quota> <vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/> <iothreadsched iothreads='2' scheduler='batch'/> + <cachetune vcpus='0-3'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + <cache id='1' level='3' type='both' size='3' unit='MiB'/> + </cachetune> </cputune> ... </domain> @@ -834,6 +838,56 @@ <span class="since">Since 1.2.13</span> </dd> + <dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt> + <dd> + Optional <code>cachetune</code> element can control allocations for CPU + caches using the resctrl on the host. Whether or not is this supported + can be gathered from capabilities where some limitations like minimum + size and required granularity are reported as well. The required + attribute <code>vcpus</code> specifies to which vCPUs this allocation + applies. A vCPU can only be member of one <code>cachetune</code> element + allocations. Supported subelements are: + <dl> + <dt><code>cache</code></dt> + <dd> + This element controls the allocation of CPU cache and has the + following attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level from which to allocate. + </dd> + <dt><code>id</code></dt> + <dd> + Host cache id from which to allocate. + </dd> + <dt><code>type</code></dt> + <dd> + Type of allocation. Can be <code>code</code> for code + (instructions), <code>data</code> for data or <code>both</code> + for both code and data (unified). Currently the allocation can + be done only with the same type as the host supports, meaning + you cannot request <code>both</code> for host with CDP + (code/data prioritization) enabled. + </dd> + <dt><code>size</code></dt> + <dd> + The size of the region to allocate. The value by default is in + bytes, but the <code>unit</code> attribute can be used to scale + the value. + </dd> + <dt><code>unit</code> (optional)</dt> + <dd> + If specified it is the unit such as KiB, MiB, GiB, or TiB + (described in the <code>memory</code> element + for <a href="#elementsMemoryAllocation">Memory Allocation</a>) + in which <code>size</code> is specified, defaults to bytes. + </dd> + </dl> + </dd> + </dl> + + </dd> </dl> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f22c932f6c09..252f58f4379c 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -900,6 +900,38 @@ <ref name="schedparam"/> </element> </zeroOrMore> + <zeroOrMore> + <element name="cachetune"> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + <oneOrMore> + <element name="cache"> + <attribute name="id"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="type"> + <choice> + <value>both</value> + <value>code</value> + <value>data</value> + </choice> + </attribute> + <attribute name="size"> + <ref name='unsignedLong'/> + </attribute> + <optional> + <attribute name='unit'> + <ref name='unit'/> + </attribute> + </optional> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a1c25060f9e9..27665d0372a7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2883,6 +2883,19 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) VIR_FREE(loader); } + +static void +virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +{ + if (!cachetune) + return; + + virObjectUnref(cachetune->alloc); + virBitmapFree(cachetune->vcpus); + VIR_FREE(cachetune); +} + + void virDomainDefFree(virDomainDefPtr def) { size_t i; @@ -3055,6 +3068,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems); + for (i = 0; i < def->ncachetunes; i++) + virDomainCachetuneDefFree(def->cachetunes[i]); + VIR_FREE(def->cachetunes); + VIR_FREE(def->keywrap); if (def->namespaceData && def->ns.free) @@ -18247,6 +18264,194 @@ virDomainDefParseBootOptions(virDomainDefPtr def, } +static int +virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ + xmlNodePtr oldnode = ctxt->node; + unsigned int level; + unsigned int cache; + int type; + unsigned long long size; + char *tmp = NULL; + int ret = -1; + + ctxt->node = node; + + tmp = virXMLPropString(node, "id"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'id'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &cache) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'id' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'level'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "type"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'type'")); + goto cleanup; + } + type = virCacheTypeFromString(tmp); + if (type < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'type' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + if (virDomainParseScaledValue("./@size", "./@unit", + ctxt, &size, 1024, + ULLONG_MAX, true) < 0) + goto cleanup; + + if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(tmp); + return ret; +} + + +static int +virDomainCachetuneDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + virBitmapPtr vcpus = NULL; + virResctrlAllocPtr alloc = virResctrlAllocNew(); + virDomainCachetuneDefPtr tmp_cachetune = NULL; + char *tmp = NULL; + char *vcpus_str = NULL; + char *alloc_id = NULL; + ssize_t i = 0; + int n; + int ret = -1; + + ctxt->node = node; + + if (!alloc) + goto cleanup; + + if (VIR_ALLOC(tmp_cachetune) < 0) + goto cleanup; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'vcpus'")); + goto cleanup; + } + if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'vcpus' value '%s'"), + vcpus_str); + goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(vcpus, def->maxvcpus); + if (virBitmapIsAllClear(vcpus)) { + ret = 0; + goto cleanup; + } + + if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract cache nodes under cachetune")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } + + if (virResctrlAllocIsEmpty(alloc)) { + ret = 0; + goto cleanup; + } + + for (i = 0; i < def->ncachetunes; i++) { + if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in cachetunes")); + goto cleanup; + } + } + + /* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ + VIR_FREE(vcpus_str); + vcpus_str = virBitmapFormat(vcpus); + if (!vcpus_str) + goto cleanup; + + if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) + alloc_id = virXMLPropString(node, "id"); + + if (!alloc_id) { + /* The number of allocatios is limited and the directory structure is flat, + * not hierarchical, so we need to have all same allocations in one + * directory, so it's nice to have it named appropriately. For now it's + * 'vcpus_...' but it's designed in order for it to be changeable in the + * future (it's part of the status XML). */ + if (virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) < 0) + goto cleanup; + } + + if (virResctrlAllocSetID(alloc, alloc_id) < 0) + goto cleanup; + + VIR_STEAL_PTR(tmp_cachetune->vcpus, vcpus); + VIR_STEAL_PTR(tmp_cachetune->alloc, alloc); + + if (VIR_APPEND_ELEMENT(def->cachetunes, def->ncachetunes, tmp_cachetune) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + virDomainCachetuneDefFree(tmp_cachetune); + virObjectUnref(alloc); + virBitmapFree(vcpus); + VIR_FREE(alloc_id); + VIR_FREE(vcpus_str); + VIR_FREE(nodes); + VIR_FREE(tmp); + return ret; +} + + static virDomainDefPtr virDomainDefParseXML(xmlDocPtr xml, xmlNodePtr root, @@ -18799,6 +19004,18 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes); + if ((n = virXPathNodeSet("./cputune/cachetune", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("cannot extract cachetune nodes")); + goto error; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParse(def, ctxt, nodes[i], flags) < 0) + goto error; + } + VIR_FREE(nodes); + if (virCPUDefParseXML(ctxt, "./cpu[1]", VIR_CPU_TYPE_GUEST, &def->cpu) < 0) goto error; @@ -25750,9 +25967,80 @@ virDomainSchedulerFormat(virBufferPtr buf, } +static int +virDomainCachetuneDefFormatHelper(unsigned int level, + virCacheType type, + unsigned int cache, + unsigned long long size, + void *opaque) +{ + const char *unit; + virBufferPtr buf = opaque; + unsigned long long short_size = virFormatIntPretty(size, &unit); + + virBufferAsprintf(buf, + "<cache id='%u' level='%u' type='%s' " + "size='%llu' unit='%s'/>\n", + cache, level, virCacheTypeToString(type), + short_size, unit); + + return 0; +} + + +static int +virDomainCachetuneDefFormat(virBufferPtr buf, + virDomainCachetuneDefPtr cachetune, + unsigned int flags) +{ + virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; + char *vcpus = NULL; + int ret = -1; + + virBufferSetChildIndent(&childrenBuf, buf); + virResctrlAllocForeachSize(cachetune->alloc, + virDomainCachetuneDefFormatHelper, + &childrenBuf); + + + if (virBufferCheckError(&childrenBuf) < 0) + goto cleanup; + + if (!virBufferUse(&childrenBuf)) { + ret = 0; + goto cleanup; + } + + vcpus = virBitmapFormat(cachetune->vcpus); + if (!vcpus) + goto cleanup; + + virBufferAsprintf(buf, "<cachetune vcpus='%s'", vcpus); + + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { + const char *alloc_id = virResctrlAllocGetID(cachetune->alloc); + if (!alloc_id) + goto cleanup; + + virBufferAsprintf(buf, " id='%s'", alloc_id); + } + virBufferAddLit(buf, ">\n"); + + virBufferAddBuffer(buf, &childrenBuf); + virBufferAddLit(buf, "</cachetune>\n"); + + ret = 0; + cleanup: + virBufferFreeAndReset(&childrenBuf); + VIR_FREE(vcpus); + return ret; +} + + static int virDomainCputuneDefFormat(virBufferPtr buf, - virDomainDefPtr def) + virDomainDefPtr def, + unsigned int flags) { size_t i; virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; @@ -25851,6 +26139,9 @@ virDomainCputuneDefFormat(virBufferPtr buf, def->iothreadids[i]->iothread_id); } + for (i = 0; i < def->ncachetunes; i++) + virDomainCachetuneDefFormat(&childrenBuf, def->cachetunes[i], flags); + if (virBufferCheckError(&childrenBuf) < 0) return -1; @@ -26188,7 +26479,7 @@ virDomainDefFormatInternal(virDomainDefPtr def, } } - if (virDomainCputuneDefFormat(buf, def) < 0) + if (virDomainCputuneDefFormat(buf, def, flags) < 0) goto error; if (virDomainNumatuneFormatXML(buf, def->numa) < 0) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 6f7f96b3ddea..ed85260926de 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -56,6 +56,7 @@ # include "virperf.h" # include "virtypedparam.h" # include "virsavecookie.h" +# include "virresctrl.h" /* forward declarations of all device types, required by * virDomainDeviceDef @@ -2194,6 +2195,15 @@ struct _virDomainCputune { }; +typedef struct _virDomainCachetuneDef virDomainCachetuneDef; +typedef virDomainCachetuneDef *virDomainCachetuneDefPtr; + +struct _virDomainCachetuneDef { + virBitmapPtr vcpus; + virResctrlAllocPtr alloc; +}; + + typedef struct _virDomainVcpuDef virDomainVcpuDef; typedef virDomainVcpuDef *virDomainVcpuDefPtr; @@ -2322,6 +2332,9 @@ struct _virDomainDef { virDomainCputune cputune; + virDomainCachetuneDefPtr *cachetunes; + size_t ncachetunes; + virDomainNumaPtr numa; virDomainResourceDefPtr resource; virDomainIdMapDef idmap; diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml b/tests/genericxml2xmlindata/cachetune-cdp.xml new file mode 100644 index 000000000000..9718f060987a --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml @@ -0,0 +1,36 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='code' size='7680' unit='KiB'/> + <cache id='1' level='3' type='data' size='3840' unit='KiB'/> + </cachetune> + <cachetune vcpus='2'> + <cache id='1' level='3' type='code' size='6' unit='MiB'/> + </cachetune> + <cachetune vcpus='3'> + <cache id='1' level='3' type='data' size='6912' unit='KiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-allocs.xml b/tests/genericxml2xmlindata/cachetune-colliding-allocs.xml new file mode 100644 index 000000000000..82c9176cbafa --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-allocs.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0'> + <cache id='0' level='3' type='code' size='12' unit='KiB'/> + <cache id='0' level='3' type='code' size='18' unit='KiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-tunes.xml b/tests/genericxml2xmlindata/cachetune-colliding-tunes.xml new file mode 100644 index 000000000000..a0f37028c98c --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-tunes.xml @@ -0,0 +1,32 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0'> + <cache id='0' level='3' type='code' size='12' unit='KiB'/> + </cachetune> + <cachetune vcpus='0'> + <cache id='0' level='3' type='data' size='18' unit='KiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-colliding-types.xml b/tests/genericxml2xmlindata/cachetune-colliding-types.xml new file mode 100644 index 000000000000..c229eccee4b6 --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-colliding-types.xml @@ -0,0 +1,30 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='12' unit='KiB'/> + <cache id='0' level='3' type='code' size='6' unit='KiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml b/tests/genericxml2xmlindata/cachetune-small.xml new file mode 100644 index 000000000000..ab2d9cf8856a --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune-small.xml @@ -0,0 +1,29 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='768' unit='KiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmlindata/cachetune.xml b/tests/genericxml2xmlindata/cachetune.xml new file mode 100644 index 000000000000..645cab7771cf --- /dev/null +++ b/tests/genericxml2xmlindata/cachetune.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <cachetune vcpus='0-1'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + <cache id='1' level='3' type='both' size='3' unit='MiB'/> + </cachetune> + <cachetune vcpus='3'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + </cachetune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i686</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index deaf5bd9b823..496e9260fa86 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -130,6 +130,16 @@ mymain(void) DO_TEST_FULL("chardev-reconnect-invalid-mode", 0, false, TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST("cachetune"); + DO_TEST("cachetune-small"); + DO_TEST("cachetune-cdp"); + DO_TEST_FULL("cachetune-colliding-allocs", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-tunes", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + DO_TEST_FULL("cachetune-colliding-types", false, true, + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE); + virObjectUnref(caps); virObjectUnref(xmlopt); -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:15PM +0100, Martin Kletzander wrote:
More info in the documentation, this is basically the XML parsing/formatting support, schemas, tests and documentation for the new cputune/cachetune element that will get used by following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/formatdomain.html.in | 54 ++++ docs/schemas/domaincommon.rng | 32 +++ src/conf/domain_conf.c | 295 ++++++++++++++++++++- src/conf/domain_conf.h | 13 + tests/genericxml2xmlindata/cachetune-cdp.xml | 36 +++ .../cachetune-colliding-allocs.xml | 30 +++ .../cachetune-colliding-tunes.xml | 32 +++ .../cachetune-colliding-types.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 29 ++ tests/genericxml2xmlindata/cachetune.xml | 33 +++ tests/genericxml2xmltest.c | 10 + 11 files changed, 592 insertions(+), 2 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-cdp.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-tunes.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-types.xml create mode 100644 tests/genericxml2xmlindata/cachetune-small.xml create mode 100644 tests/genericxml2xmlindata/cachetune.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d272cc1ba698..7b4d9051a551 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -689,6 +689,10 @@ <iothread_quota>-1</iothread_quota> <vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/> <iothreadsched iothreads='2' scheduler='batch'/> + <cachetune vcpus='0-3'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + <cache id='1' level='3' type='both' size='3' unit='MiB'/> + </cachetune> </cputune> ... </domain> @@ -834,6 +838,56 @@ <span class="since">Since 1.2.13</span> </dd>
+ <dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt> + <dd> + Optional <code>cachetune</code> element can control allocations for CPU + caches using the resctrl on the host. Whether or not is this supported + can be gathered from capabilities where some limitations like minimum + size and required granularity are reported as well. The required
s/ / /
+ attribute <code>vcpus</code> specifies to which vCPUs this allocation + applies. A vCPU can only be member of one <code>cachetune</code> element + allocations. Supported subelements are: + <dl> + <dt><code>cache</code></dt> + <dd> + This element controls the allocation of CPU cache and has the + following attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level from which to allocate. + </dd> + <dt><code>id</code></dt> + <dd> + Host cache id from which to allocate. + </dd> + <dt><code>type</code></dt> + <dd> + Type of allocation. Can be <code>code</code> for code
s/ / /
+ (instructions), <code>data</code> for data or <code>both</code> + for both code and data (unified). Currently the allocation can
s/ / /
+ be done only with the same type as the host supports, meaning + you cannot request <code>both</code> for host with CDP + (code/data prioritization) enabled. + </dd> + <dt><code>size</code></dt> + <dd> + The size of the region to allocate. The value by default is in + bytes, but the <code>unit</code> attribute can be used to scale + the value. + </dd> + <dt><code>unit</code> (optional)</dt> + <dd> + If specified it is the unit such as KiB, MiB, GiB, or TiB + (described in the <code>memory</code> element + for <a href="#elementsMemoryAllocation">Memory Allocation</a>) + in which <code>size</code> is specified, defaults to bytes. + </dd> + </dl> + </dd> + </dl> + + </dd> </dl>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f22c932f6c09..252f58f4379c 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -900,6 +900,38 @@ <ref name="schedparam"/> </element> </zeroOrMore> + <zeroOrMore> + <element name="cachetune"> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + <oneOrMore> + <element name="cache"> + <attribute name="id"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="type"> + <choice> + <value>both</value> + <value>code</value> + <value>data</value> + </choice> + </attribute> + <attribute name="size"> + <ref name='unsignedLong'/> + </attribute> + <optional> + <attribute name='unit'> + <ref name='unit'/> + </attribute> + </optional> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a1c25060f9e9..27665d0372a7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2883,6 +2883,19 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) VIR_FREE(loader); }
+ +static void +virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +{ + if (!cachetune) + return; + + virObjectUnref(cachetune->alloc); + virBitmapFree(cachetune->vcpus); + VIR_FREE(cachetune); +} + + void virDomainDefFree(virDomainDefPtr def) { size_t i; @@ -3055,6 +3068,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems);
+ for (i = 0; i < def->ncachetunes; i++) + virDomainCachetuneDefFree(def->cachetunes[i]); + VIR_FREE(def->cachetunes); + VIR_FREE(def->keywrap);
if (def->namespaceData && def->ns.free) @@ -18247,6 +18264,194 @@ virDomainDefParseBootOptions(virDomainDefPtr def, }
+static int +virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ + xmlNodePtr oldnode = ctxt->node; + unsigned int level; + unsigned int cache; + int type; + unsigned long long size; + char *tmp = NULL; + int ret = -1; + + ctxt->node = node; + + tmp = virXMLPropString(node, "id"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'id'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &cache) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'id' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'level'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "type"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'type'")); + goto cleanup; + } + type = virCacheTypeFromString(tmp); + if (type < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'type' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + if (virDomainParseScaledValue("./@size", "./@unit", + ctxt, &size, 1024, + ULLONG_MAX, true) < 0) + goto cleanup; + + if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(tmp); + return ret; +} + + +static int +virDomainCachetuneDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + virBitmapPtr vcpus = NULL; + virResctrlAllocPtr alloc = virResctrlAllocNew(); + virDomainCachetuneDefPtr tmp_cachetune = NULL; + char *tmp = NULL; + char *vcpus_str = NULL; + char *alloc_id = NULL; + ssize_t i = 0; + int n; + int ret = -1; + + ctxt->node = node; + + if (!alloc) + goto cleanup; + + if (VIR_ALLOC(tmp_cachetune) < 0) + goto cleanup; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'vcpus'")); + goto cleanup; + } + if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'vcpus' value '%s'"), + vcpus_str); + goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(vcpus, def->maxvcpus); + if (virBitmapIsAllClear(vcpus)) { + ret = 0; + goto cleanup; + } + + if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract cache nodes under cachetune")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } + + if (virResctrlAllocIsEmpty(alloc)) { + ret = 0; + goto cleanup; + }
Can this ever happen? The
+ + for (i = 0; i < def->ncachetunes; i++) { + if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in cachetunes")); + goto cleanup; + } + } + + /* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ + VIR_FREE(vcpus_str); + vcpus_str = virBitmapFormat(vcpus); + if (!vcpus_str) + goto cleanup; + + if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) + alloc_id = virXMLPropString(node, "id"); + + if (!alloc_id) { + /* The number of allocatios is limited and the directory structure is flat,
s/allocatios/allocations/ Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

On Wed, Jan 24, 2018 at 03:04:37PM +0100, Pavel Hrdina wrote:
On Tue, Jan 23, 2018 at 07:05:15PM +0100, Martin Kletzander wrote:
More info in the documentation, this is basically the XML parsing/formatting support, schemas, tests and documentation for the new cputune/cachetune element that will get used by following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/formatdomain.html.in | 54 ++++ docs/schemas/domaincommon.rng | 32 +++ src/conf/domain_conf.c | 295 ++++++++++++++++++++- src/conf/domain_conf.h | 13 + tests/genericxml2xmlindata/cachetune-cdp.xml | 36 +++ .../cachetune-colliding-allocs.xml | 30 +++ .../cachetune-colliding-tunes.xml | 32 +++ .../cachetune-colliding-types.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 29 ++ tests/genericxml2xmlindata/cachetune.xml | 33 +++ tests/genericxml2xmltest.c | 10 + 11 files changed, 592 insertions(+), 2 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-cdp.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-tunes.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-types.xml create mode 100644 tests/genericxml2xmlindata/cachetune-small.xml create mode 100644 tests/genericxml2xmlindata/cachetune.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d272cc1ba698..7b4d9051a551 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -689,6 +689,10 @@ <iothread_quota>-1</iothread_quota> <vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/> <iothreadsched iothreads='2' scheduler='batch'/> + <cachetune vcpus='0-3'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + <cache id='1' level='3' type='both' size='3' unit='MiB'/> + </cachetune> </cputune> ... </domain> @@ -834,6 +838,56 @@ <span class="since">Since 1.2.13</span> </dd>
+ <dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt> + <dd> + Optional <code>cachetune</code> element can control allocations for CPU + caches using the resctrl on the host. Whether or not is this supported + can be gathered from capabilities where some limitations like minimum + size and required granularity are reported as well. The required
s/ / /
+ attribute <code>vcpus</code> specifies to which vCPUs this allocation + applies. A vCPU can only be member of one <code>cachetune</code> element + allocations. Supported subelements are: + <dl> + <dt><code>cache</code></dt> + <dd> + This element controls the allocation of CPU cache and has the + following attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level from which to allocate. + </dd> + <dt><code>id</code></dt> + <dd> + Host cache id from which to allocate. + </dd> + <dt><code>type</code></dt> + <dd> + Type of allocation. Can be <code>code</code> for code
s/ / /
+ (instructions), <code>data</code> for data or <code>both</code> + for both code and data (unified). Currently the allocation can
s/ / /
Fixed all three. /me wonders why people still insist on this. Is someone editing these files with proportional font? Do we have any other output than HTML? I guess this will be yet another thing I have to get used to and surrender.
+ be done only with the same type as the host supports, meaning + you cannot request <code>both</code> for host with CDP + (code/data prioritization) enabled. + </dd> + <dt><code>size</code></dt> + <dd> + The size of the region to allocate. The value by default is in + bytes, but the <code>unit</code> attribute can be used to scale + the value. + </dd> + <dt><code>unit</code> (optional)</dt> + <dd> + If specified it is the unit such as KiB, MiB, GiB, or TiB + (described in the <code>memory</code> element + for <a href="#elementsMemoryAllocation">Memory Allocation</a>) + in which <code>size</code> is specified, defaults to bytes. + </dd> + </dl> + </dd> + </dl> + + </dd> </dl>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f22c932f6c09..252f58f4379c 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -900,6 +900,38 @@ <ref name="schedparam"/> </element> </zeroOrMore> + <zeroOrMore> + <element name="cachetune"> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + <oneOrMore> + <element name="cache"> + <attribute name="id"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="type"> + <choice> + <value>both</value> + <value>code</value> + <value>data</value> + </choice> + </attribute> + <attribute name="size"> + <ref name='unsignedLong'/> + </attribute> + <optional> + <attribute name='unit'> + <ref name='unit'/> + </attribute> + </optional> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a1c25060f9e9..27665d0372a7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2883,6 +2883,19 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) VIR_FREE(loader); }
+ +static void +virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +{ + if (!cachetune) + return; + + virObjectUnref(cachetune->alloc); + virBitmapFree(cachetune->vcpus); + VIR_FREE(cachetune); +} + + void virDomainDefFree(virDomainDefPtr def) { size_t i; @@ -3055,6 +3068,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems);
+ for (i = 0; i < def->ncachetunes; i++) + virDomainCachetuneDefFree(def->cachetunes[i]); + VIR_FREE(def->cachetunes); + VIR_FREE(def->keywrap);
if (def->namespaceData && def->ns.free) @@ -18247,6 +18264,194 @@ virDomainDefParseBootOptions(virDomainDefPtr def, }
+static int +virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ + xmlNodePtr oldnode = ctxt->node; + unsigned int level; + unsigned int cache; + int type; + unsigned long long size; + char *tmp = NULL; + int ret = -1; + + ctxt->node = node; + + tmp = virXMLPropString(node, "id"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'id'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &cache) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'id' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'level'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "type"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'type'")); + goto cleanup; + } + type = virCacheTypeFromString(tmp); + if (type < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'type' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + if (virDomainParseScaledValue("./@size", "./@unit", + ctxt, &size, 1024, + ULLONG_MAX, true) < 0) + goto cleanup; + + if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(tmp); + return ret; +} + + +static int +virDomainCachetuneDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + virBitmapPtr vcpus = NULL; + virResctrlAllocPtr alloc = virResctrlAllocNew(); + virDomainCachetuneDefPtr tmp_cachetune = NULL; + char *tmp = NULL; + char *vcpus_str = NULL; + char *alloc_id = NULL; + ssize_t i = 0; + int n; + int ret = -1; + + ctxt->node = node; + + if (!alloc) + goto cleanup; + + if (VIR_ALLOC(tmp_cachetune) < 0) + goto cleanup; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'vcpus'")); + goto cleanup; + } + if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'vcpus' value '%s'"), + vcpus_str); + goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(vcpus, def->maxvcpus); + if (virBitmapIsAllClear(vcpus)) { + ret = 0; + goto cleanup; + } + + if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract cache nodes under cachetune")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } + + if (virResctrlAllocIsEmpty(alloc)) { + ret = 0; + goto cleanup; + }
Can this ever happen? The
Sure, for example: <cachetune/>
+ + for (i = 0; i < def->ncachetunes; i++) { + if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in cachetunes")); + goto cleanup; + } + } + + /* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ + VIR_FREE(vcpus_str); + vcpus_str = virBitmapFormat(vcpus); + if (!vcpus_str) + goto cleanup; + + if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) + alloc_id = virXMLPropString(node, "id"); + + if (!alloc_id) { + /* The number of allocatios is limited and the directory structure is flat,
s/allocatios/allocations/
Fixed.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

On Wed, Jan 24, 2018 at 10:32:07PM +0100, Martin Kletzander wrote:
On Wed, Jan 24, 2018 at 03:04:37PM +0100, Pavel Hrdina wrote:
On Tue, Jan 23, 2018 at 07:05:15PM +0100, Martin Kletzander wrote:
More info in the documentation, this is basically the XML parsing/formatting support, schemas, tests and documentation for the new cputune/cachetune element that will get used by following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/formatdomain.html.in | 54 ++++ docs/schemas/domaincommon.rng | 32 +++ src/conf/domain_conf.c | 295 ++++++++++++++++++++- src/conf/domain_conf.h | 13 + tests/genericxml2xmlindata/cachetune-cdp.xml | 36 +++ .../cachetune-colliding-allocs.xml | 30 +++ .../cachetune-colliding-tunes.xml | 32 +++ .../cachetune-colliding-types.xml | 30 +++ tests/genericxml2xmlindata/cachetune-small.xml | 29 ++ tests/genericxml2xmlindata/cachetune.xml | 33 +++ tests/genericxml2xmltest.c | 10 + 11 files changed, 592 insertions(+), 2 deletions(-) create mode 100644 tests/genericxml2xmlindata/cachetune-cdp.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-tunes.xml create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-types.xml create mode 100644 tests/genericxml2xmlindata/cachetune-small.xml create mode 100644 tests/genericxml2xmlindata/cachetune.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d272cc1ba698..7b4d9051a551 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -689,6 +689,10 @@ <iothread_quota>-1</iothread_quota> <vcpusched vcpus='0-4,^3' scheduler='fifo' priority='1'/> <iothreadsched iothreads='2' scheduler='batch'/> + <cachetune vcpus='0-3'> + <cache id='0' level='3' type='both' size='3' unit='MiB'/> + <cache id='1' level='3' type='both' size='3' unit='MiB'/> + </cachetune> </cputune> ... </domain> @@ -834,6 +838,56 @@ <span class="since">Since 1.2.13</span> </dd>
+ <dt><code>cachetune</code><span class="since">Since 4.1.0</span></dt> + <dd> + Optional <code>cachetune</code> element can control allocations for CPU + caches using the resctrl on the host. Whether or not is this supported + can be gathered from capabilities where some limitations like minimum + size and required granularity are reported as well. The required
s/ / /
+ attribute <code>vcpus</code> specifies to which vCPUs this allocation + applies. A vCPU can only be member of one <code>cachetune</code> element + allocations. Supported subelements are: + <dl> + <dt><code>cache</code></dt> + <dd> + This element controls the allocation of CPU cache and has the + following attributes: + <dl> + <dt><code>level</code></dt> + <dd> + Host cache level from which to allocate. + </dd> + <dt><code>id</code></dt> + <dd> + Host cache id from which to allocate. + </dd> + <dt><code>type</code></dt> + <dd> + Type of allocation. Can be <code>code</code> for code
s/ / /
+ (instructions), <code>data</code> for data or <code>both</code> + for both code and data (unified). Currently the allocation can
s/ / /
Fixed all three.
/me wonders why people still insist on this. Is someone editing these files with proportional font? Do we have any other output than HTML? I guess this will be yet another thing I have to get used to and surrender.
I personally don't care about the double space or single space, this was just for consistency.
+ be done only with the same type as the host supports, meaning + you cannot request <code>both</code> for host with CDP + (code/data prioritization) enabled. + </dd> + <dt><code>size</code></dt> + <dd> + The size of the region to allocate. The value by default is in + bytes, but the <code>unit</code> attribute can be used to scale + the value. + </dd> + <dt><code>unit</code> (optional)</dt> + <dd> + If specified it is the unit such as KiB, MiB, GiB, or TiB + (described in the <code>memory</code> element + for <a href="#elementsMemoryAllocation">Memory Allocation</a>) + in which <code>size</code> is specified, defaults to bytes. + </dd> + </dl> + </dd> + </dl> + + </dd> </dl>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f22c932f6c09..252f58f4379c 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -900,6 +900,38 @@ <ref name="schedparam"/> </element> </zeroOrMore> + <zeroOrMore> + <element name="cachetune"> + <attribute name="vcpus"> + <ref name='cpuset'/> + </attribute> + <oneOrMore> + <element name="cache"> + <attribute name="id"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="level"> + <ref name='unsignedInt'/> + </attribute> + <attribute name="type"> + <choice> + <value>both</value> + <value>code</value> + <value>data</value> + </choice> + </attribute> + <attribute name="size"> + <ref name='unsignedLong'/> + </attribute> + <optional> + <attribute name='unit'> + <ref name='unit'/> + </attribute> + </optional> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a1c25060f9e9..27665d0372a7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2883,6 +2883,19 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) VIR_FREE(loader); }
+ +static void +virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +{ + if (!cachetune) + return; + + virObjectUnref(cachetune->alloc); + virBitmapFree(cachetune->vcpus); + VIR_FREE(cachetune); +} + + void virDomainDefFree(virDomainDefPtr def) { size_t i; @@ -3055,6 +3068,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems);
+ for (i = 0; i < def->ncachetunes; i++) + virDomainCachetuneDefFree(def->cachetunes[i]); + VIR_FREE(def->cachetunes); + VIR_FREE(def->keywrap);
if (def->namespaceData && def->ns.free) @@ -18247,6 +18264,194 @@ virDomainDefParseBootOptions(virDomainDefPtr def, }
+static int +virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ + xmlNodePtr oldnode = ctxt->node; + unsigned int level; + unsigned int cache; + int type; + unsigned long long size; + char *tmp = NULL; + int ret = -1; + + ctxt->node = node; + + tmp = virXMLPropString(node, "id"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'id'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &cache) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'id' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "level"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'level'")); + goto cleanup; + } + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'level' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + tmp = virXMLPropString(node, "type"); + if (!tmp) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'type'")); + goto cleanup; + } + type = virCacheTypeFromString(tmp); + if (type < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'type' value '%s'"), + tmp); + goto cleanup; + } + VIR_FREE(tmp); + + if (virDomainParseScaledValue("./@size", "./@unit", + ctxt, &size, 1024, + ULLONG_MAX, true) < 0) + goto cleanup; + + if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) + goto cleanup; + + ret = 0; + cleanup: + ctxt->node = oldnode; + VIR_FREE(tmp); + return ret; +} + + +static int +virDomainCachetuneDefParse(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + xmlNodePtr oldnode = ctxt->node; + xmlNodePtr *nodes = NULL; + virBitmapPtr vcpus = NULL; + virResctrlAllocPtr alloc = virResctrlAllocNew(); + virDomainCachetuneDefPtr tmp_cachetune = NULL; + char *tmp = NULL; + char *vcpus_str = NULL; + char *alloc_id = NULL; + ssize_t i = 0; + int n; + int ret = -1; + + ctxt->node = node; + + if (!alloc) + goto cleanup; + + if (VIR_ALLOC(tmp_cachetune) < 0) + goto cleanup; + + vcpus_str = virXMLPropString(node, "vcpus"); + if (!vcpus_str) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing cachetune attribute 'vcpus'")); + goto cleanup; + } + if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid cachetune attribute 'vcpus' value '%s'"), + vcpus_str); + goto cleanup; + } + + /* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ + virBitmapShrink(vcpus, def->maxvcpus); + if (virBitmapIsAllClear(vcpus)) { + ret = 0; + goto cleanup; + } + + if ((n = virXPathNodeSet("./cache", ctxt, &nodes)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot extract cache nodes under cachetune")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) + goto cleanup; + } + + if (virResctrlAllocIsEmpty(alloc)) { + ret = 0; + goto cleanup; + }
Can this ever happen? The
Sure, for example:
<cachetune/>
Right, I figure that out and I thought that I removed this comment.
+ + for (i = 0; i < def->ncachetunes; i++) { + if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in cachetunes")); + goto cleanup; + } + } + + /* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ + VIR_FREE(vcpus_str); + vcpus_str = virBitmapFormat(vcpus); + if (!vcpus_str) + goto cleanup; + + if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) + alloc_id = virXMLPropString(node, "id"); + + if (!alloc_id) { + /* The number of allocatios is limited and the directory structure is flat,
s/allocatios/allocations/
Fixed.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

This test initializes capabilities from vircaps2xmldata (since it exists there already) and then requests list of free bitmaps (all unallocated space) from virresctrl.c Desirable outputs are saved in virresctrldata. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/Makefile.am | 9 +- tests/virresctrldata/resctrl-cdp.schemata | 2 + .../virresctrldata/resctrl-skx-twocaches.schemata | 1 + tests/virresctrldata/resctrl-skx.schemata | 1 + tests/virresctrldata/resctrl.schemata | 1 + tests/virresctrltest.c | 102 +++++++++++++++++++++ 6 files changed, 115 insertions(+), 1 deletion(-) create mode 100644 tests/virresctrldata/resctrl-cdp.schemata create mode 100644 tests/virresctrldata/resctrl-skx-twocaches.schemata create mode 100644 tests/virresctrldata/resctrl-skx.schemata create mode 100644 tests/virresctrldata/resctrl.schemata create mode 100644 tests/virresctrltest.c diff --git a/tests/Makefile.am b/tests/Makefile.am index 3441dab6f6bb..497bd21a2587 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -165,6 +165,7 @@ EXTRA_DIST = \ xml2vmxdata \ virstorageutildata \ virfilecachedata \ + virresctrldata \ $(NULL) test_helpers = commandhelper ssh @@ -232,6 +233,7 @@ if WITH_LINUX test_programs += fchosttest test_programs += scsihosttest test_programs += vircaps2xmltest +test_programs += virresctrltest test_libraries += virusbmock.la \ virnetdevbandwidthmock.la \ virnumamock.la \ @@ -1165,8 +1167,13 @@ virnumamock_la_SOURCES = \ virnumamock_la_LDFLAGS = $(MOCKLIBS_LDFLAGS) virnumamock_la_LIBADD = $(MOCKLIBS_LIBS) +virresctrltest_SOURCES = \ + virresctrltest.c testutils.h testutils.c virfilewrapper.h virfilewrapper.c +virresctrltest_LDADD = $(LDADDS) + else ! WITH_LINUX -EXTRA_DIST += vircaps2xmltest.c virnumamock.c virfilewrapper.c virfilewrapper.h +EXTRA_DIST += vircaps2xmltest.c virnumamock.c virfilewrapper.c \ + virfilewrapper.h virresctrltest.c endif ! WITH_LINUX if WITH_NSS diff --git a/tests/virresctrldata/resctrl-cdp.schemata b/tests/virresctrldata/resctrl-cdp.schemata new file mode 100644 index 000000000000..2c31c3b957d5 --- /dev/null +++ b/tests/virresctrldata/resctrl-cdp.schemata @@ -0,0 +1,2 @@ +L3CODE:0=00ffc;1=0ff00 +L3DATA:0=3ffff;1=03fff diff --git a/tests/virresctrldata/resctrl-skx-twocaches.schemata b/tests/virresctrldata/resctrl-skx-twocaches.schemata new file mode 100644 index 000000000000..86b3801a04c2 --- /dev/null +++ b/tests/virresctrldata/resctrl-skx-twocaches.schemata @@ -0,0 +1 @@ +L3:0=001;1=400 diff --git a/tests/virresctrldata/resctrl-skx.schemata b/tests/virresctrldata/resctrl-skx.schemata new file mode 100644 index 000000000000..5e8b0d636277 --- /dev/null +++ b/tests/virresctrldata/resctrl-skx.schemata @@ -0,0 +1 @@ +L3:0=70f diff --git a/tests/virresctrldata/resctrl.schemata b/tests/virresctrldata/resctrl.schemata new file mode 100644 index 000000000000..fa980e58c9dd --- /dev/null +++ b/tests/virresctrldata/resctrl.schemata @@ -0,0 +1 @@ +L3:0=000ff;1=000f0 diff --git a/tests/virresctrltest.c b/tests/virresctrltest.c new file mode 100644 index 000000000000..99e20d5dec99 --- /dev/null +++ b/tests/virresctrltest.c @@ -0,0 +1,102 @@ +#include <config.h> +#include <stdlib.h> + +#include "testutils.h" +#include "virfilewrapper.h" +#include "virresctrlpriv.h" + + +#define VIR_FROM_THIS VIR_FROM_NONE + +struct virResctrlData { + const char *filename; + bool fail; +}; + + +static int +test_virResctrlGetUnused(const void *opaque) +{ + struct virResctrlData *data = (struct virResctrlData *) opaque; + char *system_dir = NULL; + char *resctrl_dir = NULL; + int ret = -1; + virResctrlAllocPtr alloc = NULL; + char *schemata_str = NULL; + char *schemata_file; + virCapsPtr caps = NULL; + + if (virAsprintf(&system_dir, "%s/vircaps2xmldata/linux-%s/system", + abs_srcdir, data->filename) < 0) + goto cleanup; + + if (virAsprintf(&resctrl_dir, "%s/vircaps2xmldata/linux-%s/resctrl", + abs_srcdir, data->filename) < 0) + goto cleanup; + + if (virAsprintf(&schemata_file, "%s/virresctrldata/%s.schemata", + abs_srcdir, data->filename) < 0) + goto cleanup; + + virFileWrapperAddPrefix("/sys/devices/system", system_dir); + virFileWrapperAddPrefix("/sys/fs/resctrl", resctrl_dir); + + caps = virCapabilitiesNew(VIR_ARCH_X86_64, false, false); + if (!caps || virCapabilitiesInitCaches(caps) < 0) { + fprintf(stderr, "Could not initialize capabilities"); + goto cleanup; + } + + alloc = virResctrlAllocGetUnused(caps->host.resctrl); + + virFileWrapperClearPrefixes(); + + if (!alloc) { + if (data->fail) + ret = 0; + goto cleanup; + } else if (data->fail) { + VIR_TEST_DEBUG("Error expected but there wasn't any.\n"); + ret = -1; + goto cleanup; + } + + schemata_str = virResctrlAllocFormat(alloc); + + if (virTestCompareToFile(schemata_str, schemata_file) < 0) + goto cleanup; + + ret = 0; + cleanup: + virObjectUnref(caps); + virObjectUnref(alloc); + VIR_FREE(system_dir); + VIR_FREE(resctrl_dir); + VIR_FREE(schemata_str); + VIR_FREE(schemata_file); + return ret; +} + + +static int +mymain(void) +{ + struct virResctrlData data = {0}; + int ret = 0; + +#define DO_TEST_UNUSED(_filename) \ + do { \ + data = (struct virResctrlData) { .filename = _filename }; \ + if (virTestRun("Free: " _filename, test_virResctrlGetUnused, &data) < 0) \ + ret = -1; \ + } while (0) + + DO_TEST_UNUSED("resctrl"); + DO_TEST_UNUSED("resctrl-cdp"); + DO_TEST_UNUSED("resctrl-skx"); + DO_TEST_UNUSED("resctrl-skx-twocaches"); + + return ret; +} + +VIR_TEST_MAIN(mymain) -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:16PM +0100, Martin Kletzander wrote:
This test initializes capabilities from vircaps2xmldata (since it exists there already) and then requests list of free bitmaps (all unallocated space) from virresctrl.c
Desirable outputs are saved in virresctrldata.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/Makefile.am | 9 +- tests/virresctrldata/resctrl-cdp.schemata | 2 + .../virresctrldata/resctrl-skx-twocaches.schemata | 1 + tests/virresctrldata/resctrl-skx.schemata | 1 + tests/virresctrldata/resctrl.schemata | 1 + tests/virresctrltest.c | 102 +++++++++++++++++++++ 6 files changed, 115 insertions(+), 1 deletion(-) create mode 100644 tests/virresctrldata/resctrl-cdp.schemata create mode 100644 tests/virresctrldata/resctrl-skx-twocaches.schemata create mode 100644 tests/virresctrldata/resctrl-skx.schemata create mode 100644 tests/virresctrldata/resctrl.schemata create mode 100644 tests/virresctrltest.c
The output data will need to be modified if we change the behavior of what we consider as free cache. Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

We've been building up to this. This adds support for cputune/cachetune settings for domains in the QEMU driver. The addition into qemuProcessSetupVcpu() automatically adds support for hotplug. For hot-unplug we need to remove the allocation only if all the vCPUs were unplugged. But since the threads are left running, we can't really do much about it now. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_process.c | 60 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 55 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 25ec464d3e1c..3a697de037e1 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2507,6 +2507,32 @@ qemuProcessSetupEmulator(virDomainObjPtr vm) } +static int +qemuProcessResctrlCreate(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + int ret = -1; + size_t i = 0; + virCapsPtr caps = virQEMUDriverGetCapabilities(driver, false); + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (!caps) + return -1; + + for (i = 0; i < vm->def->ncachetunes; i++) { + if (virResctrlAllocCreate(caps->host.resctrl, + vm->def->cachetunes[i]->alloc, + priv->machineName) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virObjectUnref(caps); + return ret; +} + + static int qemuProcessInitPasswords(virConnectPtr conn, virQEMUDriverPtr driver, @@ -5018,12 +5044,26 @@ qemuProcessSetupVcpu(virDomainObjPtr vm, { pid_t vcpupid = qemuDomainGetVcpuPid(vm, vcpuid); virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(vm->def, vcpuid); + size_t i = 0; - return qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, - vcpuid, vcpu->cpumask, - vm->def->cputune.period, - vm->def->cputune.quota, - &vcpu->sched); + if (qemuProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU, + vcpuid, vcpu->cpumask, + vm->def->cputune.period, + vm->def->cputune.quota, + &vcpu->sched) < 0) + return -1; + + for (i = 0; i < vm->def->ncachetunes; i++) { + virDomainCachetuneDefPtr ct = vm->def->cachetunes[i]; + + if (virBitmapIsBitSet(ct->vcpus, vcpuid)) { + if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0) + return -1; + break; + } + } + + return 0; } @@ -5896,6 +5936,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessSetupEmulator(vm) < 0) goto cleanup; + VIR_DEBUG("Setting up resctrlfs"); + if (qemuProcessResctrlCreate(driver, vm) < 0) + goto cleanup; + VIR_DEBUG("Setting domain security labels"); if (qemuSecuritySetAllLabel(driver, vm, @@ -6544,6 +6588,12 @@ void qemuProcessStop(virQEMUDriverPtr driver, vm->def->name); } + /* Remove resctrl allocation after cgroups are cleaned up which makes it + * kind of safer (although removing the allocation should work even with + * pids in tasks file */ + for (i = 0; i < vm->def->ncachetunes; i++) + virResctrlAllocRemove(vm->def->cachetunes[i]->alloc); + qemuProcessRemoveDomainStatus(driver, vm); /* Remove VNC and Spice ports from port reservation bitmap, but only if -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:17PM +0100, Martin Kletzander wrote:
We've been building up to this. This adds support for cputune/cachetune settings for domains in the QEMU driver. The addition into qemuProcessSetupVcpu() automatically adds support for hotplug. For hot-unplug we need to remove the allocation only if all the vCPUs were unplugged. But since the threads are left running, we can't really do much about it now.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_process.c | 60 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 55 insertions(+), 5 deletions(-)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/news.xml | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/news.xml b/docs/news.xml index b4d980624956..4ffa736c04c8 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -35,6 +35,15 @@ <libvirt> <release version="v4.1.0" date="unreleased"> <section title="New features"> + <change> + <summary> + Added support for CAT (Cache allocation Technology) + </summary> + <description> + Domain vCPU threads can now have allocated some parts of host cache + using the <code>cachetune</code> element in <code>cputune</code>. + </description> + </change> </section> <section title="Improvements"> </section> -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:18PM +0100, Martin Kletzander wrote:
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- docs/news.xml | 9 +++++++++ 1 file changed, 9 insertions(+)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>

Basically the `cpus` and `tasks` files are not needed, and I've witnessed on a real system that the schemata file may have spaces prepended to a line, so let's adjust at least one test so that it reflects what can happen. Also `000` allocation is invalid and a full mask means it's all free. So adjust for that too. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus | 1 - tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks | 0 tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/cpus | 1 - tests/vircaps2xmldata/linux-resctrl/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/tasks | 0 6 files changed, 2 insertions(+), 4 deletions(-) delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/tasks diff --git a/tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus b/tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus deleted file mode 100644 index b3a79aa9539f..000000000000 --- a/tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus +++ /dev/null @@ -1 +0,0 @@ -ffffff,ffffffff,ffffffff diff --git a/tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks b/tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks deleted file mode 100644 index e69de29bb2d1..000000000000 diff --git a/tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata b/tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata index 691fbaf887d1..23af473be4ed 100644 --- a/tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata +++ b/tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata @@ -1 +1 @@ -L3:0=0f0;1=000 +L3:0=0f0;1=7ff diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/cpus b/tests/vircaps2xmldata/linux-resctrl/resctrl/cpus deleted file mode 100644 index 98d03acc98ed..000000000000 --- a/tests/vircaps2xmldata/linux-resctrl/resctrl/cpus +++ /dev/null @@ -1 +0,0 @@ -fff diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/schemata b/tests/vircaps2xmldata/linux-resctrl/resctrl/schemata index c1a765f1e0a7..78d2d8a9b826 100644 --- a/tests/vircaps2xmldata/linux-resctrl/resctrl/schemata +++ b/tests/vircaps2xmldata/linux-resctrl/resctrl/schemata @@ -1 +1 @@ -L3:0=1ff00;1=1ff0f + L3:0=1ff00;1=1ff0f diff --git a/tests/vircaps2xmldata/linux-resctrl/resctrl/tasks b/tests/vircaps2xmldata/linux-resctrl/resctrl/tasks deleted file mode 100644 index e69de29bb2d1..000000000000 -- 2.16.1

On Tue, Jan 23, 2018 at 07:05:19PM +0100, Martin Kletzander wrote:
Basically the `cpus` and `tasks` files are not needed, and I've witnessed on a real system that the schemata file may have spaces prepended to a line, so let's adjust at least one test so that it reflects what can happen. Also `000` allocation is invalid and a full mask means it's all free. So adjust for that too.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus | 1 - tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks | 0 tests/vircaps2xmldata/linux-resctrl-skx-twocaches/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/cpus | 1 - tests/vircaps2xmldata/linux-resctrl/resctrl/schemata | 2 +- tests/vircaps2xmldata/linux-resctrl/resctrl/tasks | 0 6 files changed, 2 insertions(+), 4 deletions(-) delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl-cdp/resctrl/tasks delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/cpus delete mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/tasks
This will also need some modifications if we change how we handle the "full" allocation. Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
participants (2)
-
Martin Kletzander
-
Pavel Hrdina