On Wed, Jan 03, 2018 at 04:05:38PM -0500, John Ferlan wrote:
On 12/13/2017 10:39 AM, Martin Kletzander wrote:
> With this commit we finally have a way to read and manipulate basic resctrl
> settings. Locking is done only on exposed functions that read/write from/to
> resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are
> only supposed to be used from tests.
>
> Signed-off-by: Martin Kletzander <mkletzan(a)redhat.com>
> ---
> src/Makefile.am | 2 +-
> src/libvirt_private.syms | 11 +
> src/util/virresctrl.c | 1041 ++++++++++++++++++++++++++++++++++++++++++++-
> src/util/virresctrl.h | 62 +++
> src/util/virresctrlpriv.h | 27 ++
> 5 files changed, 1141 insertions(+), 2 deletions(-)
> create mode 100644 src/util/virresctrlpriv.h
>
Geez, as referenced by the cover letter, I'm glad this is the non way
too complicated parts of the code (tongue firmly planted in my cheek
with the obligatory eye roll emoji).
So, IIUC virResctrlInfoPtr is the "host" world and virResctrlAllocPtr is
the "guest" world, right? If so might be something to add to the commit
messages to make things just slightly more clear.
Lots of code here - hopefully another set of eyes can look at this too -
I'm sure I'll miss something as it's been very difficult to review this
in one (long) session...
Note to self, no sense in grep-ing for "alloc" or "free" any more
after
this code is pushed as they're liberally used. Kind of funny to see
"alloc_free" in the same line ;-)
The other usage that was really confusing is @cache w/ @[n]masks and
@[n]sizes. It seems to be used as the array index for both and it's
never really crystal clear over the relationship between masks and sizes.
> diff --git a/src/Makefile.am b/src/Makefile.am
> index 4c022d1e44b9..9b4fd0c1d597 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -167,7 +167,7 @@ UTIL_SOURCES = \
> util/virprocess.c util/virprocess.h \
> util/virqemu.c util/virqemu.h \
> util/virrandom.h util/virrandom.c \
> - util/virresctrl.h util/virresctrl.c \
> + util/virresctrl.h util/virresctrl.c util/virresctrlpriv.h \
> util/virrotatingfile.h util/virrotatingfile.c \
> util/virscsi.c util/virscsi.h \
> util/virscsihost.c util/virscsihost.h \
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 1a4f304f5e1f..a8009d913669 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2548,6 +2548,17 @@ virRandomInt;
> # util/virresctrl.h
> virCacheTypeFromString;
> virCacheTypeToString;
> +virResctrlAllocAddPID;
> +virResctrlAllocCreate;
> +virResctrlAllocForeachSize;
> +virResctrlAllocFormat;
> +virResctrlAllocGetFree;
> +virResctrlAllocNew;
> +virResctrlAllocRemove;
> +virResctrlAllocSetID;
> +virResctrlAllocSetSize;
> +virResctrlAllocUpdateMask;
> +virResctrlAllocUpdateSize;
> virResctrlGetInfo;
> virResctrlInfoGetCache;
> virResctrlInfoNew;
> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> index a681322905be..32ab84b6f121 100644
> --- a/src/util/virresctrl.c
> +++ b/src/util/virresctrl.c
> @@ -18,8 +18,12 @@
>
> #include <config.h>
>
> -#include "virresctrl.h"
> +#include <sys/file.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
>
> +#include "virresctrlpriv.h"
> #include "c-ctype.h"
> #include "count-one-bits.h"
> #include "viralloc.h"
> @@ -151,6 +155,156 @@ virResctrlInfoNew(void)
> }
>
>
> +/* Alloc-related definitions and AllocClass-related functions */
> +typedef struct _virResctrlAllocPerType virResctrlAllocPerType;
> +typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr;
> +struct _virResctrlAllocPerType {
> + /* There could be bool saying whether this is set or not, but since everything
> + * in virResctrlAlloc (and most of libvirt) goes with pointer arrays we would
> + * have to have one more level of allocation anyway, so this stays faithful to
> + * the concept */
> + unsigned long long **sizes;
> + size_t nsizes;
Perhaps something I should have noted in patch 2 - these @sizes are what
exactly? Is it a page size or just a "max size" in ?bytes for something
stored in the cache?
It's an array that stores allocation size per CPU cache, since there
can be multiple CPUs in the host system.
> +
> + /* Mask for each cache */
> + virBitmapPtr *masks;
> + size_t nmasks;
And of course, what does the mask represent?
The mask represents how the cache allocation is written into the
schemata file and it is based on the cache allocation size.
Let's say that you have L3 cache with size 15 MiB where granularity
of cache allocation is 768 KiB but minimal allocation is 1536 KiB.
This means that you have 20bit mask to configure the cache allocation.
So for example, if you would like to configure 3072 KiB of L3 cache
for a guest, the size would be stored in sizes. When the guest is
started, the following code tries to find a free allocation, that can
be used by the guest and if the allocation is possible it will store
the allocation mask for that specified CPU.
This would be the guest XML part:
<cachetune vcpus='0-3'>
<cache id='0' level='3' type='both' size='3'
unit='MiB'/>
</cachetune>
And the resulting mask can be for example:
0x00F00
Which would lead into this schemata file:
L3:0=00F00
Just a quick comment would suffice - for the future readers as well.
> +};
> +
> +typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel;
> +typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr;
> +struct _virResctrlAllocPerLevel {
> + virResctrlAllocPerTypePtr *types; /* Indexed with enum virCacheType */
> +};
> +
> +struct _virResctrlAlloc {
> + virObject parent;
> +
> + virResctrlAllocPerLevelPtr *levels;
> + size_t nlevels;
These represent the "L#" for the directory, right? Examples would be
L1both, L2code, l3data, right?
Just trying to provide enough information so that some future reader
will have a chance to understand the design at it's simplist level.
> +
> + char *id; /* The identifier (any unique string for now) */
we could call this uniqueId too, IDC really though. Just having "id"
makes it a bit difficult to search on (cacheUID also works).
> + char *path;
This is our generated path to the guest cache allocation data.
> +};
> +
> +static virClassPtr virResctrlAllocClass;
> +
> +static void
> +virResctrlAllocDispose(void *obj)
> +{
> + size_t i = 0;
> + size_t j = 0;
> + size_t k = 0;
> +
> + virResctrlAllocPtr resctrl = obj;
> +
> + for (i = 0; i < resctrl->nlevels; i++) {
> + virResctrlAllocPerLevelPtr level = resctrl->levels[i];
> +
> + if (!level)
> + continue;
Looking at virResctrlAllocGetType, it's possible that ->levels is
allocated, but ->levels[level]->types is not, thus potentially causing
problems for the following loop.
I think a " || !level->types" above would do the trick.
Nice catch, however I would rather refactor the virResctrlAllocGetType
function to always allocate it correctly or cleanup after itself it
the allocation fails.
For example instead of this code:
if (!resctrl->levels[level] &&
(VIR_ALLOC(resctrl->levels[level]) < 0 ||
VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) < 0))
return NULL;
we can have something like this:
if (!resctrl->levels[level]) {
virResctrlAllocPerTypePtr *types = NULL;
if (VIR_ALLOC_N(types, VIR_CACHE_TYPE_LAST) < 0)
return NULL;
if (VIR_ALLOC(resctrl->levels[level]) < 0) {
VIR_FREE(types);
return NULL;
}
resctrl->levels[level]->types = types;
}
> +
> + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) {
> + virResctrlAllocPerTypePtr type = level->types[j];
> +
> + if (!type)
> + continue;
> +
> + for (k = 0; k < type->nsizes; k++)
> + VIR_FREE(type->sizes[k]);
> +
> + for (k = 0; k < type->nmasks; k++)
> + virBitmapFree(type->masks[k]);
> +
> + VIR_FREE(type->sizes);
> + VIR_FREE(type->masks);
> + VIR_FREE(type);
> + }
> + VIR_FREE(level->types);
> + VIR_FREE(level);
> + }
> +
> + VIR_FREE(resctrl->id);
> + VIR_FREE(resctrl->path);
> + VIR_FREE(resctrl->levels);
> +}
> +
> +
> +static int
> +virResctrlAllocOnceInit(void)
> +{
> + if (!(virResctrlAllocClass = virClassNew(virClassForObject(),
> + "virResctrlAlloc",
> + sizeof(virResctrlAlloc),
> + virResctrlAllocDispose)))
> + return -1;
> +
> + return 0;
> +}
> +
> +
> +VIR_ONCE_GLOBAL_INIT(virResctrlAlloc)
> +
> +
> +virResctrlAllocPtr
> +virResctrlAllocNew(void)
> +{
> + if (virResctrlAllocInitialize() < 0)
> + return NULL;
> +
> + return virObjectNew(virResctrlAllocClass);
> +}
> +
> +
> +/* Common functions */
> +static int
> +virResctrlLockInternal(int op)
> +{
> + int fd = open(SYSFS_RESCTRL_PATH, O_DIRECTORY | O_CLOEXEC);
I started reviewing bottom up and wondered later on - do we really want
to be writing into /sys/fs/resctrl? See virResctrlAllocCreate comments.
Well, we kind of need to write to /sys/fs/resctrl to configure the
cache allocation in the host for the guest that is about to start.
> +
> + if (fd < 0) {
> + virReportSystemError(errno, "%s", _("Cannot open
resctrlfs"));
> + return -1;
> + }
> +
> + if (flock(fd, op) < 0) {
> + virReportSystemError(errno, "%s", _("Cannot lock
resctrlfs"));
> + VIR_FORCE_CLOSE(fd);
> + return -1;
> + }
> +
> + return fd;
> +}
> +
> +
> +static inline int
> +virResctrlLockWrite(void)
> +{
> + return virResctrlLockInternal(LOCK_EX);
> +}
> +
> +
> +static int
> +virResctrlUnlock(int fd)
> +{
> + if (fd == -1)
> + return 0;
> +
> + /* The lock gets unlocked by closing the fd, which we need to do anyway in
> + * order to clean up properly */
> + if (VIR_CLOSE(fd) < 0) {
> + virReportSystemError(errno, "%s", _("Cannot close
resctrlfs"));
> +
> + /* Trying to save the already broken */
> + if (flock(fd, LOCK_UN) < 0)
> + virReportSystemError(errno, "%s", _("Cannot unlock
resctrlfs"));
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +
> /* Info-related functions */
> bool
> virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl)
> @@ -354,3 +508,888 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl,
> VIR_FREE(*controls);
> goto cleanup;
> }
> +
> +
> +/* Alloc-related functions */
> +bool
> +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl)
> +{
> + size_t i = 0;
> + size_t j = 0;
> + size_t k = 0;
> +
> + if (!resctrl)
> + return true;
> +
Rather than numerous loops - would it perhaps be better to have a single
boolean field that gets set whenever a "sizes" or "masks" entry is
set?
Of course this would all need to be locked, right?
> + for (i = 0; i < resctrl->nlevels; i++) {
> + virResctrlAllocPerLevelPtr a_level = resctrl->levels[i];
> +
> + if (!a_level)
> + continue;
> +
> + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) {
> + virResctrlAllocPerTypePtr a_type = a_level->types[j];
> +
> + if (!a_type)
> + continue;
> +
> + for (k = 0; k < a_type->nsizes; k++) {
> + if (a_type->sizes[k])
> + return false;
> + }
> +
> + for (k = 0; k < a_type->nmasks; k++) {
> + if (a_type->masks[k])
> + return false;
> + }
> + }
> + }
> +
> + return true;
> +}
> +
> +
> +static virResctrlAllocPerTypePtr
> +virResctrlAllocGetType(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type)
> +{
> + virResctrlAllocPerLevelPtr a_level = NULL;
> +
> + if (resctrl->nlevels <= level &&
> + VIR_EXPAND_N(resctrl->levels, resctrl->nlevels, level -
resctrl->nlevels + 1) < 0)
> + return NULL;
> +
> + if (!resctrl->levels[level] &&
> + (VIR_ALLOC(resctrl->levels[level]) < 0 ||
> + VIR_ALLOC_N(resctrl->levels[level]->types, VIR_CACHE_TYPE_LAST) <
0))
> + return NULL;
Could have a problem if ->levels[level] ALLOC succeeds, but
->levels[level]->types ALLOC_N fails vis-a-vis the assumption in Dispose.
> +
> + a_level = resctrl->levels[level];
> +
> + if (!a_level->types[type] && VIR_ALLOC(a_level->types[type]) <
0)
> + return NULL;
> +
> + return a_level->types[type];
> +}
> +
> +
> +int
> +virResctrlAllocUpdateMask(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + virBitmapPtr mask)
So far, only ever called from virresctrl.c, so can this be static? Or
are there future plans?
> +{
> + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level,
type);
> +
> + if (!a_type)
> + return -1;
> +
> + if (a_type->nmasks <= cache &&
> + VIR_EXPAND_N(a_type->masks, a_type->nmasks,
> + cache - a_type->nmasks + 1) < 0)
> + return -1;
> +
> + if (!a_type->masks[cache]) {
> + a_type->masks[cache] = virBitmapNew(virBitmapSize(mask));
> +
> + if (!a_type->masks[cache])
> + return -1;
> + }
Since this is an Update function, is it possible that the incoming mask
is a different size than what was already existing? IOW: In an else
condition, do we need to compare the size of the source and destination
before the Copy? And if so, Free the current and allocate one using the
new size...
This shouldn't ever happen and if it does, something is seriously
wrong since the mask is stored in /sys/fs/resctrl and it should
already ensure that everything is correctly formatted.
> +
> + return virBitmapCopy(a_type->masks[cache], mask);
In order to speed up IsEmpty we could set a boolean indicating the
change to resctrl
> +}
> +
> +
> +int
> +virResctrlAllocUpdateSize(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + unsigned long long size)
Same here, only called from virresctrl.c
> +{
> + virResctrlAllocPerTypePtr a_type = virResctrlAllocGetType(resctrl, level,
type);
> +
> + if (!a_type)
> + return -1;
> +
> + if (a_type->nsizes <= cache &&
> + VIR_EXPAND_N(a_type->sizes, a_type->nsizes,
> + cache - a_type->nsizes + 1) < 0)
> + return -1;
> +
> + if (!a_type->sizes[cache] && VIR_ALLOC(a_type->sizes[cache]) <
0)
> + return -1;
> +
> + *(a_type->sizes[cache]) = size;
> +
Same comment regarding IsEmpty and setting a boolean...
> + return 0;
> +}
> +
> +
> +static bool
> +virResctrlAllocCheckCollision(virResctrlAllocPtr a,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache)
> +{
> + virResctrlAllocPerLevelPtr a_level = NULL;
> + virResctrlAllocPerTypePtr a_type = NULL;
> +
> + if (!a)
> + return false;
> +
> + if (a->nlevels <= level)
> + return false;
> +
> + a_level = a->levels[level];
> +
> + if (!a_level)
> + return false;
> +
> + /* All types should be always allocated */
> + a_type = a_level->types[VIR_CACHE_TYPE_BOTH];
> +
> + /* If there is an allocation for type 'both', there can be no other
> + * allocation for the same cache */
> + if (a_type && a_type->nsizes > cache &&
a_type->sizes[cache])
> + return true;
> +
> + if (type == VIR_CACHE_TYPE_BOTH) {
> + a_type = a_level->types[VIR_CACHE_TYPE_CODE];
> +
> + if (a_type && a_type->nsizes > cache &&
a_type->sizes[cache])
> + return true;
> +
> + a_type = a_level->types[VIR_CACHE_TYPE_DATA];
> +
> + if (a_type && a_type->nsizes > cache &&
a_type->sizes[cache])
> + return true;
> + } else {
> + a_type = a_level->types[type];
> +
> + if (a_type && a_type->nsizes > cache &&
a_type->sizes[cache])
> + return true;
> + }
This is simple yet confusing all in the same pile of code. I'm sure it
helps to have the overall design in mind though.
Would it even matter to check any of this if IsEmpty returns true?
> +
> + return false;
> +}
> +
> +
> +int
> +virResctrlAllocSetSize(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + unsigned long long size)
> +{
> + if (virResctrlAllocCheckCollision(resctrl, level, type, cache)) {
> + virReportError(VIR_ERR_XML_ERROR,
> + _("Colliding cache allocations for cache "
> + "level '%u' id '%u', type
'%s'"),
> + level, cache, virCacheTypeToString(type));
> + return -1;
> + }
> +
> + return virResctrlAllocUpdateSize(resctrl, level, type, cache, size);
> +}
> +
> +
> +int
> +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl,
> + virResctrlAllocForeachSizeCallback cb,
> + void *opaque)
> +{
> + int ret = 0;
> + unsigned int level = 0;
> + unsigned int type = 0;
> + unsigned int cache = 0;
> +
> + if (!resctrl)
> + return 0;
Is it possible there are no sizes set?
> +
> + for (level = 0; level < resctrl->nlevels; level++) {
> + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level];
> +
> + if (!a_level)
> + continue;
> +
> + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) {
> + virResctrlAllocPerTypePtr a_type = a_level->types[type];
> +
> + if (!a_type)
> + continue;
> +
> + for (cache = 0; cache < a_type->nsizes; cache++) {
> + unsigned long long *size = a_type->sizes[cache];
> +
> + if (!size)
> + continue;
> +
> + ret = cb(level, type, cache, *size, opaque);
> + if (ret < 0)
> + return ret;
> + }
> + }
> + }
> +
> + return 0;
> +}
> +
> +
> +int
> +virResctrlAllocSetID(virResctrlAllocPtr alloc,
> + const char *id)
> +{
> + if (!id) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Resctrl allocation 'id' cannot be
NULL"));
> + return -1;
> + }
> +
> + return VIR_STRDUP(alloc->id, id);
> +}
> +
> +
Just a few words here about the expected format you'll be formatting and
parsing would be helpful. Still not 100% clear how [n]sizes comes into
play. When you're saving/re-reading I guess I would have thought (for
some reason) that perhaps sizes would be important
> +char *
> +virResctrlAllocFormat(virResctrlAllocPtr resctrl)
So far only ever called from virresctrl.c
> +{
> + virBuffer buf = VIR_BUFFER_INITIALIZER;
> + unsigned int level = 0;
> + unsigned int type = 0;
> + unsigned int cache = 0;
> +
> + if (!resctrl)
> + return NULL;
> +
> + for (level = 0; level < resctrl->nlevels; level++) {
> + virResctrlAllocPerLevelPtr a_level = resctrl->levels[level];
> +
> + if (!a_level)
> + continue;
> +
> + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) {
> + virResctrlAllocPerTypePtr a_type = a_level->types[type];
> +
> + if (!a_type)
> + continue;
> +
> + virBufferAsprintf(&buf, "L%u%s:", level,
virResctrlTypeToString(type));
> +
> + for (cache = 0; cache < a_type->nmasks; cache++) {
> + virBitmapPtr mask = a_type->masks[cache];
> + char *mask_str = NULL;
> +
> + if (!mask)
> + continue;
> +
> + mask_str = virBitmapToString(mask, false, true);
> + if (!mask_str) {
> + virBufferFreeAndReset(&buf);
> + return NULL;
> + }
> +
> + virBufferAsprintf(&buf, "%u=%s;", cache, mask_str);
> + VIR_FREE(mask_str);
> + }
> +
> + virBufferTrim(&buf, ";", 1);
> + virBufferAddChar(&buf, '\n');
> + }
> + }
> +
> + virBufferCheckError(&buf);
> + return virBufferContentAndReset(&buf);
Joy - some day it'd be nice to get back to the code that was going to
combine the CheckError and ContentAndReset... In the meantime, once this
is pushed I'll have another Coverity whine to filter (since CheckError
doesn't check status like 217 out of 220 calls do).
> +}
> +
> +
> +static int
> +virResctrlAllocParseProcessCache(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + char *cache)
> +{
> + char *tmp = strchr(cache, '=');
> + unsigned int cache_id = 0;
> + virBitmapPtr mask = NULL;
> + int ret = -1;
> +
> + if (!tmp)
> + return 0;
> +
> + *tmp = '\0';
> + tmp++;
> +
> + if (virStrToLong_uip(cache, NULL, 10, &cache_id) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Invalid cache id '%s'"), cache);
> + return -1;
> + }
> +
> + mask = virBitmapNewString(tmp);
> + if (!mask)
> + return -1;
> +
> + if (virResctrlAllocUpdateMask(resctrl, level, type, cache_id, mask) < 0)
> + goto cleanup;
> +
> + ret = 0;
> + cleanup:
> + virBitmapFree(mask);
> + return ret;
> +}
> +
> +
> +static int
> +virResctrlAllocParseProcessLine(virResctrlAllocPtr resctrl,
> + char *line)
> +{
> + char **caches = NULL;
> + char *tmp = NULL;
> + unsigned int level = 0;
> + int type = -1;
> + size_t ncaches = 0;
> + size_t i = 0;
> + int ret = -1;
> +
> + /* Skip lines that don't concern caches, e.g. MB: etc. */
> + if (line[0] != 'L')
> + return 0;
> +
> + /* And lines that we can't parse too */
> + tmp = strchr(line, ':');
> + if (!tmp)
> + return 0;
> +
> + *tmp = '\0';
> + tmp++;
> +
> + if (virStrToLong_uip(line + 1, &line, 10, &level) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Cannot parse resctrl schema level
'%s'"),
> + line + 1);
> + return -1;
> + }
> +
> + type = virResctrlTypeFromString(line);
> + if (type < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Cannot parse resctrl schema level
'%s'"),
> + line + 1);
> + return -1;
> + }
> +
> + caches = virStringSplitCount(tmp, ";", 0, &ncaches);
> + if (!caches)
> + return 0;
> +
> + for (i = 0; i < ncaches; i++) {
> + if (virResctrlAllocParseProcessCache(resctrl, level, type, caches[i]) <
0)
> + goto cleanup;
> + }
> +
> + ret = 0;
> + cleanup:
> + virStringListFree(caches);
> + return ret;
> +}
> +
> +
> +static int
> +virResctrlAllocParse(virResctrlAllocPtr alloc,
> + const char *schemata)
> +{
> + char **lines = NULL;
> + size_t nlines = 0;
> + size_t i = 0;
> + int ret = -1;
> +
> + lines = virStringSplitCount(schemata, "\n", 0, &nlines);
> + for (i = 0; i < nlines; i++) {
> + if (virResctrlAllocParseProcessLine(alloc, lines[i]) < 0)
> + goto cleanup;
> + }
> +
> + ret = 0;
> + cleanup:
> + virStringListFree(lines);
> + return ret;
> +}
> +
> +
> +static void
> +virResctrlAllocSubtractPerType(virResctrlAllocPerTypePtr a,
> + virResctrlAllocPerTypePtr b)
> +{
> + size_t i = 0;
> +
> + if (!a || !b)
> + return;
> +
> + for (i = 0; i < a->nmasks && i < b->nmasks; ++i) {
> + if (a->masks[i] && b->masks[i])
> + virBitmapSubtract(a->masks[i], b->masks[i]);
> + }
Does [n]sizes affect anything? IOW: does it matter.. probably not, but
my brain is overloaded right now!
> +}
> +
> +
> +static void
> +virResctrlAllocSubtract(virResctrlAllocPtr a,
> + virResctrlAllocPtr b)
perhaps easier to "think about" if using dst and src... here and above.
> +{
> + size_t i = 0;
> + size_t j = 0;
> +
> + if (!b)
> + return;
> +
> + for (i = 0; i < a->nlevels && b->nlevels; ++i) {
b->nlevels is not a boolean or pointer... should be "i < " too right?
Any special reason to use ++i over i++?
> + if (a->levels[i] && b->levels[i]) {
> + /* Here we rely on all the system allocations to use the same types.
> + * We kind of _hope_ it's the case. If this is left here until
the
> + * review and someone finds it, then suggest only removing this last
> + * sentence. */
Your golden egg has been discovered and I agree, how do we know? and of
course what role does CheckCollision (eventually) play?
> + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) {
> + virResctrlAllocSubtractPerType(a->levels[i]->types[j],
> + b->levels[i]->types[j]);
> + }
> + }
> + }
> +}
> +
> +
> +static virResctrlAllocPtr
> +virResctrlAllocNewFromInfo(virResctrlInfoPtr info)
> +{
> + size_t i = 0;
> + size_t j = 0;
> + size_t k = 0;
> + virResctrlAllocPtr ret = virResctrlAllocNew();
> + virBitmapPtr mask = NULL;
> +
> + if (!ret)
> + return NULL;
> +
> + for (i = 0; i < info->nlevels; i++) {
> + virResctrlInfoPerLevelPtr i_level = info->levels[i];
> +
> + if (!i_level)
> + continue;
> +
> + for (j = 0; j < VIR_CACHE_TYPE_LAST; j++) {
> + virResctrlInfoPerTypePtr i_type = i_level->types[j];
> +
> + if (!i_type)
> + continue;
> +
> + virBitmapFree(mask);
> + mask = virBitmapNew(i_type->bits);
> + if (!mask)
> + goto error;
> + virBitmapSetAll(mask);
> +
> + for (k = 0; k <= i_type->max_cache_id; k++) {
> + if (virResctrlAllocUpdateMask(ret, i, j, k, mask) < 0)
> + goto error;
> + }
> + }
> + }
> +
> + cleanup:
> + virBitmapFree(mask);
> + return ret;
> + error:
> + virObjectUnref(ret);
> + ret = NULL;
> + goto cleanup;
> +}
> +
> +
This needs a function header - args, what does it do, etc.
> +virResctrlAllocPtr
> +virResctrlAllocGetFree(virResctrlInfoPtr resctrl)
So far only ever called from virresctrl.c and from a static function, so
is it really necessary to be external?
It's also really "odd" to have Free in the name and this isn't
necessarily a Free type function - far from it! Perhaps Unused is closer
to reality with the usage of the Subtract calls. Of course that could
affect variable names selected already...
> +{
> + virResctrlAllocPtr ret = NULL;
> + virResctrlAllocPtr alloc = NULL;
> + virBitmapPtr mask = NULL;
@mask is unused in this context except for the virBitmapFree
> + struct dirent *ent = NULL;
> + DIR *dirp = NULL;
> + char *schemata = NULL;
> + int rv = -1;
> +
> + if (virResctrlInfoIsEmpty(resctrl)) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Resource control is not supported on this
host"));
> + return NULL;
> + }
> +
> + ret = virResctrlAllocNewFromInfo(resctrl);
> + if (!ret)
> + return NULL;
> +
> + if (virFileReadValueString(&schemata,
> + SYSFS_RESCTRL_PATH
> + "/schemata") < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Could not read schemata file for the default
group"));
> + goto error;
> + }
Shall I assume that schemata is a CAT specific file? So is "default
group" the right wording? Just seems strange - consistently used
though... If changed you'd have multiple changes. I guess something is
my mind is wanting to associate it with a cgroup.
> +
> + alloc = virResctrlAllocNew();
> + if (!alloc)
> + goto error;
> +
> + if (virResctrlAllocParse(alloc, schemata) < 0)
> + goto error;
> + if (virResctrlAllocIsEmpty(alloc)) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("No schemata for default resctrlfs group"));
> + goto error;
> + }
> + virResctrlAllocSubtract(ret, alloc);
> +
> + if (virDirOpen(&dirp, SYSFS_RESCTRL_PATH) < 0)
> + goto error;
> +
> + while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH)) > 0) {
> + if (ent->d_type != DT_DIR)
> + continue;
> +
> + if (STREQ(ent->d_name, "info"))
> + continue;
> +
> + VIR_FREE(schemata);
> + if (virFileReadValueString(&schemata,
> + SYSFS_RESCTRL_PATH
> + "/%s/schemata",
> + ent->d_name) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Could not read schemata file for group
%s"),
> + ent->d_name);
> + goto error;
> + }
> +
> + virObjectUnref(alloc);
> + alloc = virResctrlAllocNew();
> + if (!alloc)
> + goto error;
> +
> + if (virResctrlAllocParse(alloc, schemata) < 0)
> + goto error;
> +
> + virResctrlAllocSubtract(ret, alloc);
> +
> + VIR_FREE(schemata);
I think this one could be duplicitous
> + }
> + if (rv < 0)
> + goto error;
> +
> + cleanup:
> + virObjectUnref(alloc);
> + VIR_DIR_CLOSE(dirp);
> + VIR_FREE(schemata);
> + virBitmapFree(mask);
> + return ret;
> +
> + error:
> + virObjectUnref(ret);
> + ret = NULL;
> + goto cleanup;
> +}
> +
> +
Could use a short (haha) description for all the magic that happens here!
> +static int
> +virResctrlAllocMasksAssign(virResctrlInfoPtr r_info,
s/r_info/resctrl/ for consistency?
> + virResctrlAllocPtr alloc)
> +{
> + int ret = -1;
> + unsigned int level = 0;
> + virResctrlAllocPtr alloc_free = NULL;
> +
> + if (!r_info) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Resource control is not supported on this
host"));
> + return -1;
> + }
Caller already checks this... and it's a static function...
> +
> + alloc_free = virResctrlAllocGetFree(r_info);
> + if (!alloc_free)
> + return -1;
> +
> + for (level = 0; level < alloc->nlevels; level++) {
> + virResctrlAllocPerLevelPtr a_level = alloc->levels[level];
> + virResctrlAllocPerLevelPtr f_level = NULL;
> + unsigned int type = 0;
> +
> + if (!a_level)
> + continue;
> +
> + if (level < alloc_free->nlevels)
> + f_level = alloc_free->levels[level];
> +
> + if (!f_level) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache level %d does not support tuning"),
> + level);
> + goto cleanup;
> + }
> +
> + for (type = 0; type < VIR_CACHE_TYPE_LAST; type++) {
> + virResctrlAllocPerTypePtr a_type = a_level->types[type];
> + virResctrlAllocPerTypePtr f_type = f_level->types[type];
> + unsigned int cache = 0;
> +
> + if (!a_type)
> + continue;
> +
> + if (!f_type) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache level %d does not support tuning for
"
> + "scope type '%s'"),
> + level, virCacheTypeToString(type));
> + goto cleanup;
> + }
> +
large wet piles of brain matter all over my keyboard for this following
loop. It is 100% not self documenting - I'm not sure WTF is going on.
I have to agree that following this code is extremely hard,
perhaps splitting it into several functions would help to review
this madness :).
> + for (cache = 0; cache < a_type->nsizes;
cache++) {
> + unsigned long long *size = a_type->sizes[cache];
> + virBitmapPtr a_mask = NULL;
> + virBitmapPtr f_mask = NULL;
> + virResctrlInfoPerLevelPtr i_level = r_info->levels[level];
> + virResctrlInfoPerTypePtr i_type = i_level->types[type];
> + unsigned long long granularity;
> + unsigned long long need_bits;
> + size_t i = 0;
> + ssize_t pos = -1;
> + ssize_t last_bits = 0;
> + ssize_t last_pos = -1;
> +
> + if (!size)
> + continue;
> +
> + if (cache >= f_type->nmasks) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache with id %u does not exists for
level %d"),
> + cache, level);
> + goto cleanup;
> + }
> +
> + f_mask = f_type->masks[cache];
> + if (!f_mask) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache level %d id %u does not support
tuning for "
> + "scope type '%s'"),
> + level, cache, virCacheTypeToString(type));
> + goto cleanup;
> + }
> +
> + granularity = i_type->size / i_type->bits;
> + need_bits = *size / granularity;
> +
> + if (*size % granularity) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache allocation of size %llu is not
"
> + "divisible by granularity %llu"),
> + *size, granularity);
> + goto cleanup;
> + }
> +
> + if (need_bits < i_type->min_cbm_bits) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Cache allocation of size %llu is smaller
"
> + "than the minimum allowed allocation
%llu"),
> + *size, granularity * i_type->min_cbm_bits);
> + goto cleanup;
> + }
> +
> + while ((pos = virBitmapNextSetBit(f_mask, pos)) >= 0) {
> + ssize_t pos_clear = virBitmapNextClearBit(f_mask, pos);
> + ssize_t bits;
> +
> + if (pos_clear < 0)
> + pos_clear = virBitmapSize(f_mask);
> +
> + bits = pos_clear - pos;
> +
> + /* Not enough bits, move on and skip all of them */
> + if (bits < need_bits) {
> + pos = pos_clear;
> + continue;
> + }
> +
> + /* This fits perfectly */
> + if (bits == need_bits) {
> + last_pos = pos;
> + break;
> + }
> +
> + /* Remember the smaller region if we already found on before
*/
> + if (last_pos < 0 || (last_bits && bits <
last_bits)) {
> + last_bits = bits;
> + last_pos = pos;
> + }
> +
> + pos = pos_clear;
> + }
> +
> + if (last_pos < 0) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Not enough room for allocation of
"
> + "%llu bytes for level %u cache %u "
> + "scope type '%s'"),
> + *size, level, cache,
> + virCacheTypeToString(type));
> + goto cleanup;
> + }
> +
> + a_mask = virBitmapNew(i_type->bits);
Coverity let me know @a_mask needs to be checked vs. NULL
> + for (i = last_pos; i < last_pos + need_bits; i++) {
> + ignore_value(virBitmapSetBit(a_mask, i));
> + ignore_value(virBitmapClearBit(f_mask, i));
> + }
> +
> + if (a_type->nmasks <= cache) {
> + if (VIR_EXPAND_N(a_type->masks, a_type->nmasks,
> + cache - a_type->nmasks + 1) < 0) {
> + virBitmapFree(a_mask);
> + goto cleanup;
> + }
> + }
> + a_type->masks[cache] = a_mask;
> + }
> + }
> + }
> +
> + ret = 0;
> + cleanup:
> + virObjectUnref(alloc_free);
> + return ret;
> +}
> +
> +
> +/* This checks if the directory for the alloc exists. If not it tries to create
> + * it and apply appropriate alloc settings. */
An overly simplified description for all that happens here.
> +int
> +virResctrlAllocCreate(virResctrlInfoPtr r_info,
s/r_info/resctrl/ for consistency?
> + virResctrlAllocPtr alloc,
> + const char *drivername,
> + const char *machinename)
> +{
> + char *schemata_path = NULL;
> + char *alloc_str = NULL;
> + int ret = -1;
> + int lockfd = -1;
> +
> + if (!alloc)
> + return 0;
> +
> + if (!r_info) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("Resource control is not supported on this
host"));
> + return -1;
> + }
> +
> + if (!alloc->path &&
> + virAsprintf(&alloc->path, "%s/%s-%s-%s",
> + SYSFS_RESCTRL_PATH, drivername, machinename, alloc->id) <
0)
> + return -1;
FWIW: From one of the .0 responses, this is where you get that
"qemu-qemu..." as @drivername is "qemu". I think the @drivername
could
be superfluous, but it's not 100% clear what the options from the
systemd calls to build @machinename are...
In any case, if I'm reading things correctly it seems we would be saving
the guest data in the /sys/fs/resctrl path - is that the "right" place?
I can see for Info (e.g. host) level data that's read, sure, but for
guest (or Alloc) data that's written (and read) it's less clear in my
mind. Worried a bit about polluting /sys, some permissions issues
there, and a bit of future proofing - similar to cgroup layouts which
haven't remained constant through time. It would seem guests using this
would need to be sufficiently privileged or essentially not a session
guest...
Shouldn't things go in the guest libDir? IOW: vm->privateData->libDir
that could be passed from qemuProcessResctrlCreate and avoid replicating
the build-up logic here?
I would think avoiding /sys for guest data would probably be a good
idea. If the host dies how do we clean up after ourselves? At least if
we use guest directories, we can 'easily' document how to clean up.
As I've wrote, we need to write to /sys/fs/resctrl in order to apply
the cache allocation for a guest. This is not saving the "guest data"
anywhere, that's already done by having it configured in the guest XML.
I was able to find some documentation:
<
https://github.com/intel/intel-cmt-cat/wiki/resctrl>
Pavel
> +
> + /* Check if this allocation was already created */
> + if (virFileIsDir(alloc->path))
> + return 0;
> +
> + if (virFileExists(alloc->path)) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Path '%s' for resctrl allocation exists and
is not a "
> + "directory"), alloc->path);
> + goto cleanup;
> + }
> +
> + lockfd = virResctrlLockWrite();
> + if (lockfd < 0)
> + goto cleanup;
> +
> + if (virResctrlAllocMasksAssign(r_info, alloc) < 0)
> + goto cleanup;
> +
> + alloc_str = virResctrlAllocFormat(alloc);
> + if (!alloc_str)
> + goto cleanup;
> +
There is a *lot* of magic going on here regarding how the above calls
will end up generating alloc_str and then how this schemata file gets
populated with data that I really think is not self documenting.
> + if (virAsprintf(&schemata_path, "%s/schemata", alloc->path)
< 0)
> + goto cleanup;
> +
> + if (virFileMakePath(alloc->path) < 0) {
> + virReportSystemError(errno,
> + _("Cannot create resctrl directory
'%s'"),
> + alloc->path);
> + goto cleanup;
> + }
> +
> + if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) {
> + rmdir(alloc->path);
open coded virResctrlAllocRemove... could go with virFileDeleteTree too
> + virReportSystemError(errno,
> + _("Cannot write into schemata file
'%s'"),
> + schemata_path);
> + goto cleanup;
> + }
> +
> + ret = 0;
> + cleanup:
> + virResctrlUnlock(lockfd);
> + VIR_FREE(alloc_str);
> + VIR_FREE(schemata_path);
> + return ret;
> +}
> +
> +
Might be nice to describe what this is for... Why does this need to be
generated/saved?
> +int
> +virResctrlAllocAddPID(virResctrlAllocPtr alloc,
> + pid_t pid)
> +{
> + char *tasks = NULL;
> + char *pidstr = NULL;
> + int ret = 0;
> +
> + if (!alloc->path) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Cannot add pid to non-existing resctrl
allocation"));
> + return -1;
> + }
> +
Should we do any locking here?
> + if (virAsprintf(&tasks, "%s/tasks", alloc->path) < 0)
> + return -1;
> +
> + if (virAsprintf(&pidstr, "%lld", (long long int) pid) < 0)
> + goto cleanup;
> +
> + if (virFileWriteStr(tasks, pidstr, 0) < 0) {
> + virReportSystemError(errno,
> + _("Cannot write pid in tasks file
'%s'"),
> + tasks);
> + goto cleanup;
> + }
We only ever write to this file a @pidstr, what happens when the vCPU is
unplugged? Can the cachetunes be updated in that manner?
> +
> + ret = 0;
> + cleanup:
> + VIR_FREE(tasks);
> + VIR_FREE(pidstr);
> + return ret;
> +}
> +
> +
> +int
> +virResctrlAllocRemove(virResctrlAllocPtr alloc)
> +{
> + int ret = 0;
> +
> + if (!alloc->path)
> + return 0;
> +
> + VIR_DEBUG("Removing resctrl allocation %s", alloc->path);
> + if (rmdir(alloc->path) != 0 && errno != ENOENT) {
Is it only one level of directory or multiple? e.g. would
virFileDeleteTree be better? In fact overall it may be better.
I hope I found everything ;-)
John
> + ret = -errno;
> + VIR_ERROR(_("Unable to remove %s (%d)"), alloc->path, errno);
> + }
> +
> + return ret;
> +}
> diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> index 471c02f28dcd..0fbd9dd3acfb 100644
> --- a/src/util/virresctrl.h
> +++ b/src/util/virresctrl.h
> @@ -68,4 +68,66 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl,
> size_t *ncontrols,
> virResctrlInfoPerCachePtr **controls);
>
> +/* Alloc-related things */
> +typedef struct _virResctrlAlloc virResctrlAlloc;
> +typedef virResctrlAlloc *virResctrlAllocPtr;
> +
> +typedef int virResctrlAllocForeachSizeCallback(unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + unsigned long long size,
> + void *opaque);
> +
> +virResctrlAllocPtr
> +virResctrlAllocNew(void);
> +
> +bool
> +virResctrlAllocIsEmpty(virResctrlAllocPtr resctrl);
> +
> +int
> +virResctrlAllocUpdateSize(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + unsigned long long size);
> +
> +int
> +virResctrlAllocUpdateMask(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + virBitmapPtr mask);
> +
> +int
> +virResctrlAllocSetSize(virResctrlAllocPtr resctrl,
> + unsigned int level,
> + virCacheType type,
> + unsigned int cache,
> + unsigned long long size);
> +
> +int
> +virResctrlAllocForeachSize(virResctrlAllocPtr resctrl,
> + virResctrlAllocForeachSizeCallback cb,
> + void *opaque);
> +
> +int
> +virResctrlAllocSetID(virResctrlAllocPtr alloc,
> + const char *id);
> +
> +char *
> +virResctrlAllocFormat(virResctrlAllocPtr alloc);
> +
> +int
> +virResctrlAllocCreate(virResctrlInfoPtr r_info,
> + virResctrlAllocPtr alloc,
> + const char *drivername,
> + const char *machinename);
> +
> +int
> +virResctrlAllocAddPID(virResctrlAllocPtr alloc,
> + pid_t pid);
> +
> +int
> +virResctrlAllocRemove(virResctrlAllocPtr alloc);
> +
> #endif /* __VIR_RESCTRL_H__ */
> diff --git a/src/util/virresctrlpriv.h b/src/util/virresctrlpriv.h
> new file mode 100644
> index 000000000000..932ff7329c8c
> --- /dev/null
> +++ b/src/util/virresctrlpriv.h
> @@ -0,0 +1,27 @@
> +/*
> + * virresctrlpriv.h:
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library. If not, see
> + * <
http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __VIR_RESCTRL_PRIV_H__
> +# define __VIR_RESCTRL_PRIV_H__
> +
> +# include "virresctrl.h"
> +
> +virResctrlAllocPtr
> +virResctrlAllocGetFree(virResctrlInfoPtr resctrl);
> +
> +#endif /* __VIR_RESCTRL_PRIV_H__ */
>
--
libvir-list mailing list
libvir-list(a)redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list