[PATCH v2 0/5] resctrl: add energy monitoring via PERF_PKG_MON
Linux kernel 7.0 introduced PERF_PKG_MON support in the resctrl filesystem, which exposes per-workload energy and performance monitoring. This series enables per-VM energy monitoring via the core_energy (Joules) and activity (Farads) counters. Energy monitors are configured through a new <energytune> element under <cputune>, following the existing cachetune and memorytune patterns.

Design notes:

Energy values from resctrl are floating-point. I added a separate dvals/ndvals pair to virResctrlMonitorStats to handle them. I kept both pairs in a single struct for easier integration with performance counters (integers and floats within the same monitor), which might be added in another patch.

The new XML element is <energytune> under <cputune>, following the earlier pattern for resctrl features (cachetune, memorytune). Energytune doesn't currently support the "tuning" part, only monitoring. I named it energytune for consistency with the cache and memory features, keeping all resctrl handling under cputune. This also fits the current resctrl architecture, where all monitoring groups are part of an allocation group. This approach allows easy opt-in/out of the monitoring features.
Changes since v1 (RFC):
- Split into 5 patches (see below)
- Changed VIR_INFO/VIR_WARN to VIR_DEBUG
- Tightened energyMonitorFeature to <choice> of core_energy/activity instead of regex
- Renamed test case energytune-basic -> energytune
- Minor comment cleanups in virresctrl.h

Jedrzej Wasiukiewicz (5):
  util: add PERF_PKG_MON energy monitoring support in virresctrl
  conf: report energy monitoring in host capabilities
  conf: add energytune to domain XML
  qemu: resctrl energy counters via domstats
  NEWS: document resctrl energy monitoring

 NEWS.rst                                  |   6 +
 docs/formatdomain.rst                     |  20 +++
 include/libvirt/libvirt-domain.h          |  65 ++++++++
 src/conf/capabilities.c                   |  42 +++++
 src/conf/capabilities.h                   |   6 +
 src/conf/domain_conf.c                    |  99 ++++++++++++
 src/conf/schemas/capability.rng           |  27 ++++
 src/conf/schemas/domaincommon.rng         |  19 +++
 src/conf/virconftypes.h                   |   2 +
 src/qemu/qemu_driver.c                    |  70 ++++++++-
 src/util/virresctrl.c                     | 180 ++++++++++++++++++----
 src/util/virresctrl.h                     |  10 +-
 tests/genericxml2xmlindata/energytune.xml |  32 ++++
 tests/genericxml2xmltest.c                |   1 +
 14 files changed, 542 insertions(+), 37 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/energytune.xml

--
2.34.1

---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.
Spolka oswiadcza, ze posiada status duzego przedsiebiorcy w rozumieniu ustawy z dnia 8 marca 2013 r. o przeciwdzialaniu nadmiernym opoznieniom w transakcjach handlowych.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.
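To make the design note about the dvals/ndvals pair concrete, here is a minimal sketch of how a consumer might read one record that carries either integer or floating-point values. The struct `monitor_stats` and the helper `monitor_stats_format` are hypothetical stand-ins written for this illustration, not the libvirt types, and the sketch assumes that only one of the two arrays is populated for a given record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for virResctrlMonitorStats (illustration only). */
struct monitor_stats {
    const char **features;     /* NULL-terminated list of feature names */
    unsigned long long *vals;  /* integer counters, or NULL when unused */
    size_t nvals;
    double *dvals;             /* floating-point counters, or NULL when unused */
    size_t ndvals;
};

/* Format the i-th feature as "name=value" into buf.
 * Floating-point values are used when present, integer values otherwise.
 * Returns 0 on success, -1 when no value exists at index i. */
static int
monitor_stats_format(const struct monitor_stats *st, size_t i,
                     char *buf, size_t buflen)
{
    assert(st && buf);

    if (i < st->ndvals)
        snprintf(buf, buflen, "%s=%.6f", st->features[i], st->dvals[i]);
    else if (i < st->nvals)
        snprintf(buf, buflen, "%s=%llu", st->features[i], st->vals[i]);
    else
        return -1;

    return 0;
}
```

Keeping both arrays in one record means a caller only checks which length is non-zero, which is roughly the trade-off the design note describes.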
Linux 7.0 introduced the PERF_PKG_MON interface in resctrl, which exposes per-package energy and performance counters. This patch extends the virresctrl implementation to discover and read energy counters from this new resource type (core_energy in Joules, activity in Farads).

Changes:
- Added the energy feature allow-list virResctrlEnergyFeatures, since PERF_PKG_MON is not prefix-based.
- Added perf_monitor_info to _virResctrlInfo to hold the PERF_PKG_MON capabilities
- Added virResctrlGetPerfMonitorInfo, modelled on the existing virResctrlGetMonitorInfo, to check the new resource capabilities
- Added VIR_RESCTRL_MONITOR_TYPE_ENERGY and mapped it to the energy allow-list
- Added a dvals/ndvals pair to _virResctrlMonitorStats so that floating-point and integer counters can live in a single monitor (to support integer perf counters in the future).
- Added floating-point read + parse in virResctrlMonitorGetStats for energy counters
- Stubbed VIR_RESCTRL_MONITOR_TYPE_ENERGY in qemu_driver

Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com>
Signed-off-by: Christopher M.
Cantalupo <christopher.m.cantalupo@intel.com> --- src/qemu/qemu_driver.c | 1 + src/util/virresctrl.c | 180 ++++++++++++++++++++++++++++++++++------- src/util/virresctrl.h | 10 ++- 3 files changed, 158 insertions(+), 33 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 264799a864..2d509cd2b9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -17085,6 +17085,7 @@ qemuDomainGetResctrlMonData(virQEMUDriver *driver, if (caps->host.memBW.monitor) features = caps->host.memBW.monitor->features; break; + case VIR_RESCTRL_MONITOR_TYPE_ENERGY: case VIR_RESCTRL_MONITOR_TYPE_UNSUPPORT: case VIR_RESCTRL_MONITOR_TYPE_LAST: virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8f33a85a56..66df44fb58 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -79,15 +79,24 @@ VIR_ENUM_IMPL(virResctrl, "DATA", ); -/* Monitor feature name prefix mapping for monitor naming */ +/* Monitor feature prefix/type mapping for monitor naming */ VIR_ENUM_IMPL(virResctrlMonitorPrefix, VIR_RESCTRL_MONITOR_TYPE_LAST, "__unsupported__", "llc_", "mbm_", + "energy", ); +/* PERF_PKG_MON features that report energy data (floating-point) */ +static const char *virResctrlEnergyFeatures[] = { + "core_energy", + "activity", + NULL, +}; + + /* All private typedefs so that they exist for all later definitions. This way * structs can be included in one or another without reorganizing the code every * time. 
*/ @@ -183,6 +192,8 @@ struct _virResctrlInfo { virResctrlInfoMemBW *membw_info; virResctrlInfoMongrp *monitor_info; + + virResctrlInfoMongrp *perf_monitor_info; }; static void @@ -235,10 +246,14 @@ virResctrlInfoDispose(void *obj) if (resctrl->monitor_info) g_strfreev(resctrl->monitor_info->features); + if (resctrl->perf_monitor_info) + g_strfreev(resctrl->perf_monitor_info->features); + virResctrlInfoMemBWFree(resctrl->membw_info); g_free(resctrl->levels); g_free(resctrl->monitor_info); + g_free(resctrl->perf_monitor_info); } @@ -771,6 +786,52 @@ virResctrlGetMonitorInfo(virResctrlInfo *resctrl) } +static int +virResctrlGetPerfMonitorInfo(virResctrlInfo *resctrl) +{ + int rv = -1; + g_autofree char *featurestr = NULL; + g_autofree virResctrlInfoMongrp *info_monitor = NULL; + + info_monitor = g_new0(virResctrlInfoMongrp, 1); + + rv = virFileReadValueUint(&info_monitor->max_monitor, + SYSFS_RESCTRL_PATH + "/info/PERF_PKG_MON/num_rmids"); + if (rv == -2) { + VIR_DEBUG("The file '" SYSFS_RESCTRL_PATH "/info/PERF_PKG_MON/num_rmids' " + "does not exist"); + return 0; + } else if (rv < 0) { + return -1; + } + + rv = virFileReadValueString(&featurestr, + SYSFS_RESCTRL_PATH + "/info/PERF_PKG_MON/mon_features"); + if (rv == -2) + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get mon_features from resctrl PERF_PKG_MON")); + if (rv < 0) + return -1; + + if (!*featurestr) { + VIR_DEBUG("Got empty feature list from PERF_PKG_MON; " + "energy monitoring will not be available"); + return 0; + } + + info_monitor->features = g_strsplit(featurestr, "\n", 0); + info_monitor->nfeatures = g_strv_length(info_monitor->features); + VIR_DEBUG("Resctrl supported %zd PERF_PKG_MON monitoring features", + info_monitor->nfeatures); + + resctrl->perf_monitor_info = g_steal_pointer(&info_monitor); + + return 0; +} + + static int virResctrlGetInfo(virResctrlInfo *resctrl) { @@ -790,6 +851,9 @@ virResctrlGetInfo(virResctrlInfo *resctrl) if ((ret = virResctrlGetMonitorInfo(resctrl)) < 0) 
return -1; + if ((ret = virResctrlGetPerfMonitorInfo(resctrl)) < 0) + return -1; + return 0; } @@ -830,6 +894,9 @@ virResctrlInfoIsEmpty(virResctrlInfo *resctrl) if (resctrl->monitor_info) return false; + if (resctrl->perf_monitor_info) + return false; + for (i = 0; i < resctrl->nlevels; i++) { virResctrlInfoPerLevel *i_level = resctrl->levels[i]; @@ -986,13 +1053,6 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfo *resctrl, if (virResctrlInfoIsEmpty(resctrl)) return 0; - mongrp_info = resctrl->monitor_info; - - if (!mongrp_info) { - VIR_INFO("Monitor is not supported in host"); - return 0; - } - for (i = 0; i < VIR_RESCTRL_MONITOR_TYPE_LAST; i++) { if (STREQ(prefix, virResctrlMonitorPrefixTypeToString(i))) { mon = g_new0(virResctrlInfoMon, 1); @@ -1008,6 +1068,19 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfo *resctrl, return -1; } + if (mon->type == VIR_RESCTRL_MONITOR_TYPE_ENERGY) + mongrp_info = resctrl->perf_monitor_info; + else + mongrp_info = resctrl->monitor_info; + + if (!mongrp_info) { + VIR_DEBUG("Monitor prefix '%s' is not supported in host", prefix); + virResctrlInfoMonFree(*monitor); + *monitor = NULL; + ret = 0; + goto cleanup; + } + mon->max_monitor = mongrp_info->max_monitor; if (mon->type == VIR_RESCTRL_MONITOR_TYPE_CACHE) { @@ -1018,8 +1091,12 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfo *resctrl, mon->features = g_new0(char *, mongrp_info->nfeatures + 1); for (i = 0; i < mongrp_info->nfeatures; i++) { - if (STRPREFIX(mongrp_info->features[i], prefix)) + if (mon->type == VIR_RESCTRL_MONITOR_TYPE_ENERGY) { + if (g_strv_contains(virResctrlEnergyFeatures, mongrp_info->features[i])) + mon->features[mon->nfeatures++] = g_strdup(mongrp_info->features[i]); + } else if (STRPREFIX(mongrp_info->features[i], prefix)) { mon->features[mon->nfeatures++] = g_strdup(mongrp_info->features[i]); + } } mon->features = g_renew(char *, mon->features, mon->nfeatures + 1); @@ -2558,7 +2635,7 @@ virResctrlMonitorStatsSorter(const void *a, * memory bandwidth usage 
data. * @nstats: A size_t pointer to hold the returned array length of @stats * - * Get cache or memory bandwidth utilization information. + * Get cache, memory bandwidth or energy utilization information. * * Returns 0 on success, -1 on error. */ @@ -2593,6 +2670,7 @@ virResctrlMonitorGetStats(virResctrlMonitor *monitor, while (virDirRead(dirp, &ent, datapath) > 0) { g_autofree char *filepath = NULL; char *node_id = NULL; + bool is_energy = false; /* Looking for directory that contains resource utilization * information file. The directory name is arranged in format @@ -2605,18 +2683,17 @@ virResctrlMonitorGetStats(virResctrlMonitor *monitor, if (!virFileIsDir(filepath)) continue; - /* Looking for directory has a prefix 'mon_L' */ - if (!(node_id = STRSKIP(ent->d_name, "mon_L"))) - continue; - - /* Looking for directory has another '_' */ - node_id = strchr(node_id, '_'); - if (!node_id) - continue; - - /* Skip the character '_' */ - if (!(node_id = STRSKIP(node_id, "_"))) + if ((node_id = STRSKIP(ent->d_name, "mon_PERF_PKG_"))) { + is_energy = true; + } else if ((node_id = STRSKIP(ent->d_name, "mon_L"))) { + node_id = strchr(node_id, '_'); + if (!node_id) + continue; + if (!(node_id = STRSKIP(node_id, "_"))) + continue; + } else { continue; + } stat = g_new0(virResctrlMonitorStats, 1); stat->features = g_new0(char *, nresources + 1); @@ -2626,21 +2703,61 @@ virResctrlMonitorGetStats(virResctrlMonitor *monitor, goto cleanup; for (i = 0; resources[i]; i++) { - rv = virFileReadValueUllong(&val, "%s/%s/%s", datapath, - ent->d_name, resources[i]); - if (rv == -2) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("File '%1$s/%2$s/%3$s' does not exist."), - datapath, ent->d_name, resources[i]); + if (is_energy) { + g_autofree char *valstr = NULL; + double dval = 0.0; + char *endp = NULL; + + rv = virFileReadValueString(&valstr, "%s/%s/%s", datapath, + ent->d_name, resources[i]); + if (rv == -2) { + if (i == 0) + break; + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File 
'%1$s/%2$s/%3$s' does not exist."), + datapath, ent->d_name, resources[i]); + goto cleanup; + } + if (rv < 0) + goto cleanup; + + g_strstrip(valstr); + errno = 0; + dval = g_ascii_strtod(valstr, &endp); + if (endp == valstr || *endp != '\0' || errno != 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse resctrl monitor value '%1$s' from '%2$s/%3$s/%4$s'"), + valstr, datapath, ent->d_name, resources[i]); + goto cleanup; + } + + VIR_APPEND_ELEMENT(stat->dvals, stat->ndvals, dval); + } else { + rv = virFileReadValueUllong(&val, "%s/%s/%s", datapath, + ent->d_name, resources[i]); + if (rv == -2) { + if (i == 0) + break; + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%1$s/%2$s/%3$s' does not exist."), + datapath, ent->d_name, resources[i]); + goto cleanup; + } + if (rv < 0) + goto cleanup; + + VIR_APPEND_ELEMENT(stat->vals, stat->nvals, val); } - if (rv < 0) - goto cleanup; - - VIR_APPEND_ELEMENT(stat->vals, stat->nvals, val); stat->features[i] = g_strdup(resources[i]); } + if (resources[i]) { + virResctrlMonitorStatsFree(stat); + stat = NULL; + continue; + } + VIR_APPEND_ELEMENT(*stats, *nstats, stat); } @@ -2665,5 +2782,6 @@ virResctrlMonitorStatsFree(virResctrlMonitorStats *stat) g_strfreev(stat->features); g_free(stat->vals); + g_free(stat->dvals); g_free(stat); } diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index c70b112864..857afe6f1e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -38,6 +38,7 @@ typedef enum { VIR_RESCTRL_MONITOR_TYPE_UNSUPPORT, VIR_RESCTRL_MONITOR_TYPE_CACHE, VIR_RESCTRL_MONITOR_TYPE_MEMBW, + VIR_RESCTRL_MONITOR_TYPE_ENERGY, VIR_RESCTRL_MONITOR_TYPE_LAST } virResctrlMonitorType; @@ -196,11 +197,16 @@ struct _virResctrlMonitorStats { /* @features is a NULL terminal string list tracking the statistical record * name.*/ char **features; - /* @vals store the statistical record values and @val[0] is the value for - * @features[0], @val[1] for@features[1] ... 
respectively */ + /* @vals store the statistical record values for integer-valued resources. Entries correspond 1:1 with + * @features; empty when the resource reports floating-point data. */ unsigned long long *vals; /* The length of @vals array */ size_t nvals; + /* @dvals store double-precision values for floating-point resources. + * Entries correspond 1:1 with @features; empty when the resource reports integer data. */ + double *dvals; + /* The length of @dvals array */ + size_t ndvals; }; virResctrlMonitor * -- 2.34.1
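The floating-point read in virResctrlMonitorGetStats boils down to a strict string-to-double conversion that rejects empty input and trailing garbage. A standalone sketch of that check is given here using plain `strtod` instead of GLib, so it is not the patch's code: `parse_energy_value` is a hypothetical helper, and unlike `g_ascii_strtod` the standard `strtod` follows the current locale, so this version assumes the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a counter string such as "2.888203\n".
 * Returns 0 and stores the value in *out on success.
 * Returns -1 for empty input, trailing garbage, or out-of-range values. */
static int
parse_energy_value(const char *str, double *out)
{
    char *endp = NULL;
    double val;

    assert(str && out);

    errno = 0;
    val = strtod(str, &endp);

    /* No digits consumed, or the value does not fit in a double. */
    if (endp == str || errno == ERANGE)
        return -1;

    /* Allow trailing whitespace (sysfs-style files end with a newline). */
    while (isspace((unsigned char)*endp))
        endp++;

    if (*endp != '\0')
        return -1;

    *out = val;
    return 0;
}
```

Checking both `endp == str` and the remaining characters is what separates "no number at all" from "number followed by junk"; either case is treated as a parse error here.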
On 5/14/26 16:41, Jedrzej Wasiukiewicz wrote:
Linux 7.0 introduced the PERF_PKG_MON interface in resctrl, which exposes per-package energy and performance counters. This patch extends the virresctrl implementation to discover and read energy counters from this new resource type (core_energy in Joules, activity in Farads).
Changes:
- Added the energy feature allow-list virResctrlEnergyFeatures, since PERF_PKG_MON is not prefix-based.
- Added perf_monitor_info to _virResctrlInfo to hold the PERF_PKG_MON capabilities
- Added virResctrlGetPerfMonitorInfo, modelled on the existing virResctrlGetMonitorInfo, to check the new resource capabilities
- Added VIR_RESCTRL_MONITOR_TYPE_ENERGY and mapped it to the energy allow-list
- Added a dvals/ndvals pair to _virResctrlMonitorStats so that floating-point and integer counters can live in a single monitor (to support integer perf counters in the future).
- Added floating-point read + parse in virResctrlMonitorGetStats for energy counters
- Stubbed VIR_RESCTRL_MONITOR_TYPE_ENERGY in qemu_driver
Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com> Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com> --- src/qemu/qemu_driver.c | 1 + src/util/virresctrl.c | 180 ++++++++++++++++++++++++++++++++++------- src/util/virresctrl.h | 10 ++- 3 files changed, 158 insertions(+), 33 deletions(-)
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index c70b112864..857afe6f1e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -38,6 +38,7 @@ typedef enum { VIR_RESCTRL_MONITOR_TYPE_UNSUPPORT, VIR_RESCTRL_MONITOR_TYPE_CACHE, VIR_RESCTRL_MONITOR_TYPE_MEMBW, + VIR_RESCTRL_MONITOR_TYPE_ENERGY,
VIR_RESCTRL_MONITOR_TYPE_LAST } virResctrlMonitorType; @@ -196,11 +197,16 @@ struct _virResctrlMonitorStats { /* @features is a NULL terminal string list tracking the statistical record * name.*/ char **features; - /* @vals store the statistical record values and @val[0] is the value for - * @features[0], @val[1] for@features[1] ... respectively */ + /* @vals store the statistical record values for integer-valued resources. Entries correspond 1:1 with + * @features; empty when the resource reports floating-point data. */
Long lines.
unsigned long long *vals; /* The length of @vals array */ size_t nvals; + /* @dvals store double-precision values for floating-point resources. + * Entries correspond 1:1 with @features; empty when the resource reports integer data. */
Again.
+ double *dvals; + /* The length of @dvals array */ + size_t ndvals; };
virResctrlMonitor *
Michal
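For illustration, one way the two flagged comments could be rewrapped to stay within 80 columns. The wording of the comments is taken from the patch unchanged; only the line breaks are a suggestion, and the struct is reproduced standalone here so that it compiles on its own.

```c
#include <stddef.h>

struct _virResctrlMonitorStats {
    /* @features is a NULL terminal string list tracking the statistical record
     * name.*/
    char **features;
    /* @vals store the statistical record values for integer-valued
     * resources. Entries correspond 1:1 with @features; empty when the
     * resource reports floating-point data. */
    unsigned long long *vals;
    /* The length of @vals array */
    size_t nvals;
    /* @dvals store double-precision values for floating-point resources.
     * Entries correspond 1:1 with @features; empty when the resource
     * reports integer data. */
    double *dvals;
    /* The length of @dvals array */
    size_t ndvals;
};
```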
Expose PERF_PKG_MON energy monitoring capabilities in the host capabilities XML. <energy> <monitor maxMonitors='576'> <feature name='core_energy'/> <feature name='activity'/> </monitor> </energy> Changes: - Add virCapabilitiesFormatEnergy() to emit <energy> XML block - Add virCapabilitiesInitEnergy() to init from PERF_PKG_MON info - Add <energy> element and energyMonitorFeature to capability.rng - Update qemu_driver to support VIR_RESCTRL_MONITOR_TYPE_ENERGY - Add virCapsHostEnergy struct Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com> Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com> --- src/conf/capabilities.c | 42 +++++++++++++++++++++++++++++++++ src/conf/capabilities.h | 6 +++++ src/conf/schemas/capability.rng | 27 +++++++++++++++++++++ src/conf/virconftypes.h | 2 ++ src/qemu/qemu_driver.c | 3 +++ 5 files changed, 80 insertions(+) diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index 1821e36e61..e83de6d1bc 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -263,6 +263,8 @@ virCapsDispose(void *object) virResctrlInfoMonFree(caps->host.memBW.monitor); g_free(caps->host.memBW.nodes); + virResctrlInfoMonFree(caps->host.energy.monitor); + g_free(caps->host.netprefix); g_free(caps->host.pagesSize); virCPUDefFree(caps->host.cpu); @@ -1057,6 +1059,26 @@ virCapabilitiesFormatMemoryBandwidth(virBuffer *buf, } +static int +virCapabilitiesFormatEnergy(virBuffer *buf, + virCapsHostEnergy *energy) +{ + if (!energy->monitor) + return 0; + + virBufferAddLit(buf, "<energy>\n"); + virBufferAdjustIndent(buf, 2); + + if (virCapabilitiesFormatResctrlMonitor(buf, energy->monitor) < 0) + return -1; + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</energy>\n"); + + return 0; +} + + static int virCapabilitiesFormatHostXML(virCapsHost *host, virBuffer *buf) @@ -1156,6 +1178,9 @@ virCapabilitiesFormatHostXML(virCapsHost *host, if (virCapabilitiesFormatMemoryBandwidth(buf, &host->memBW) < 0) 
return -1; + if (virCapabilitiesFormatEnergy(buf, &host->energy) < 0) + return -1; + for (i = 0; i < host->nsecModels; i++) { virBufferAddLit(buf, "<secmodel>\n"); virBufferAdjustIndent(buf, 2); @@ -2143,6 +2168,20 @@ virCapabilitiesInitResctrlMemory(virCaps *caps) } +static int +virCapabilitiesInitEnergy(virCaps *caps) +{ + const char *prefix = virResctrlMonitorPrefixTypeToString( + VIR_RESCTRL_MONITOR_TYPE_ENERGY); + + if (virResctrlInfoGetMonitorPrefix(caps->host.resctrl, prefix, + &caps->host.energy.monitor) < 0) + return -1; + + return 0; +} + + int virCapabilitiesInitCaches(virCaps *caps) { @@ -2294,6 +2333,9 @@ virCapabilitiesInitCaches(virCaps *caps) &caps->host.cache.monitor) < 0) return -1; + if (virCapabilitiesInitEnergy(caps) < 0) + return -1; + return 0; } diff --git a/src/conf/capabilities.h b/src/conf/capabilities.h index daea835817..0482e4297a 100644 --- a/src/conf/capabilities.h +++ b/src/conf/capabilities.h @@ -162,6 +162,10 @@ struct _virCapsHostMemBW { virResctrlInfoMon *monitor; }; +struct _virCapsHostEnergy { + virResctrlInfoMon *monitor; +}; + struct _virCapsHost { virArch arch; size_t nfeatures; @@ -184,6 +188,8 @@ struct _virCapsHost { virCapsHostMemBW memBW; + virCapsHostEnergy energy; + size_t nsecModels; virCapsHostSecModel *secModels; diff --git a/src/conf/schemas/capability.rng b/src/conf/schemas/capability.rng index 8ef6e9a282..6160067dc7 100644 --- a/src/conf/schemas/capability.rng +++ b/src/conf/schemas/capability.rng @@ -45,6 +45,9 @@ <optional> <ref name="memory_bandwidth"/> </optional> + <optional> + <ref name="energy"/> + </optional> <zeroOrMore> <ref name="secmodel"/> </zeroOrMore> @@ -333,6 +336,30 @@ </data> </define> + <define name="energyMonitorFeature"> + <choice> + <value>core_energy</value> + <value>activity</value> + </choice> + </define> + + <define name="energy"> + <element name="energy"> + <element name="monitor"> + <attribute name="maxMonitors"> + <ref name="unsignedInt"/> + </attribute> + <oneOrMore> + <element 
name="feature"> + <attribute name="name"> + <ref name="energyMonitorFeature"/> + </attribute> + </element> + </oneOrMore> + </element> + </element> + </define> + <define name="guestcaps"> <element name="guest"> <ref name="ostype"/> diff --git a/src/conf/virconftypes.h b/src/conf/virconftypes.h index 0596791a4d..f1a200bfe2 100644 --- a/src/conf/virconftypes.h +++ b/src/conf/virconftypes.h @@ -52,6 +52,8 @@ typedef struct _virCapsHostMemBW virCapsHostMemBW; typedef struct _virCapsHostMemBWNode virCapsHostMemBWNode; +typedef struct _virCapsHostEnergy virCapsHostEnergy; + typedef struct _virCapsHostNUMA virCapsHostNUMA; typedef struct _virCapsHostNUMACell virCapsHostNUMACell; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2d509cd2b9..a3d648e268 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -17086,6 +17086,9 @@ qemuDomainGetResctrlMonData(virQEMUDriver *driver, features = caps->host.memBW.monitor->features; break; case VIR_RESCTRL_MONITOR_TYPE_ENERGY: + if (caps->host.energy.monitor) + features = caps->host.energy.monitor->features; + break; case VIR_RESCTRL_MONITOR_TYPE_UNSUPPORT: case VIR_RESCTRL_MONITOR_TYPE_LAST: virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", -- 2.34.1
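As a rough illustration of the <energy> capabilities block this patch emits, the sketch here builds the same XML shape with plain `snprintf`. It does not use libvirt's virBuffer API: `struct energy_monitor` and `format_energy_caps` are hypothetical names invented for this example, and the indentation is hard-coded rather than managed by a buffer helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for virResctrlInfoMon. */
struct energy_monitor {
    unsigned int max_monitor;
    const char **features;  /* NULL-terminated list of feature names */
};

/* Write an <energy> block describing mon into buf.
 * Returns the number of bytes written, 0 when mon is NULL (nothing to
 * report), or -1 when buf is too small. */
static int
format_energy_caps(const struct energy_monitor *mon, char *buf, size_t buflen)
{
    size_t used = 0;
    size_t i;
    int n;

    assert(buf && buflen > 0);
    buf[0] = '\0';

    /* Mirrors the early return when the host has no energy monitor. */
    if (!mon)
        return 0;

    n = snprintf(buf, buflen, "<energy>\n  <monitor maxMonitors='%u'>\n",
                 mon->max_monitor);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    used = (size_t)n;

    for (i = 0; mon->features[i]; i++) {
        n = snprintf(buf + used, buflen - used,
                     "    <feature name='%s'/>\n", mon->features[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, buflen - used, "  </monitor>\n</energy>\n");
    if (n < 0 || (size_t)n >= buflen - used)
        return -1;
    used += (size_t)n;

    return (int)used;
}
```

Returning 0 without output for a missing monitor keeps the element entirely absent on hosts without PERF_PKG_MON, which matches the optional <energy> reference in the schema.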
The new XML element is <energytune> under <cputune> following earlier pattern for resctrl features (cachetune, memorytune). Energytune doesn't currently support the "tuning" part, only monitoring. I added it as energytune for consistency with cache and memory features, keeping all resctrl handling under cputune. This also makes sense with current resctrl architecture - all monitoring groups are part of an allocation group. Changes: - Added <energytune> parsing to domain_conf.c - Added schema definition in domaincommon.rng - Documented the element in formatdomain.rst - Added energytune test Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com> Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com> --- docs/formatdomain.rst | 20 +++++ src/conf/domain_conf.c | 99 +++++++++++++++++++++++ src/conf/schemas/domaincommon.rng | 19 +++++ tests/genericxml2xmlindata/energytune.xml | 32 ++++++++ tests/genericxml2xmltest.c | 1 + 5 files changed, 171 insertions(+) create mode 100644 tests/genericxml2xmlindata/energytune.xml diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 078cd7aa84..db1ca5637a 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -900,6 +900,9 @@ CPU Tuning <memorytune vcpus='0-3'> <node id='0' bandwidth='60'/> </memorytune> + <energytune vcpus='0-3'> + <monitor vcpus='0-3'/> + </energytune> </cputune> ... @@ -1084,6 +1087,23 @@ CPU Tuning responsible for making sure the value makes sense on their system and configuration. +``energytune`` :since:`Since 12.4.0` + Optional ``energytune`` element allows to monitor energy consumption using the + resctrl filesystem on the host. Whether or not is this supported can be + gathered from capabilities where number of monitors and available features are + reported. The required attribute ``vcpus`` specifies to which allocation group + this monitor belongs. A vCPU can only be member of one allocation group and monitor + group. 
The ``vcpus`` specified by ``energytune`` can be identical to those + specified by ``cachetune`` or ``memorytune``. However they are not allowed to + overlap each other. Supported subelements are: + + ``monitor`` + The optional element ``monitor`` creates the energy monitor for + this allocation group and has the following required attribute: + + ``vcpus`` + vCPU list the monitor applies to. + Memory Allocation ----------------- diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d73bac5cc5..2d3e646bcb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19467,6 +19467,57 @@ virDomainMemorytuneDefParse(virDomainDef *def, } +static int +virDomainEnergytuneDefParse(virDomainDef *def, + xmlXPathContextPtr ctxt, + xmlNodePtr node, + unsigned int flags) +{ + VIR_XPATH_NODE_AUTORESTORE(ctxt) + virDomainResctrlDef *resctrl = NULL; + virDomainResctrlDef *newresctrl = NULL; + g_autoptr(virBitmap) vcpus = NULL; + g_autoptr(virResctrlAlloc) alloc = NULL; + size_t nmons; + int ret = -1; + + ctxt->node = node; + + if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) + return -1; + + if (virBitmapIsAllClear(vcpus)) + return 0; + + if (virDomainResctrlVcpuMatch(def, vcpus, &resctrl) < 0) + return -1; + + if (resctrl) { + alloc = virObjectRef(resctrl->alloc); + } else { + if (!(alloc = virResctrlAllocNew())) + return -1; + if (!(newresctrl = virDomainResctrlNew(node, alloc, vcpus, flags))) + return -1; + resctrl = newresctrl; + } + + nmons = resctrl->nmonitors; + if (virDomainResctrlMonDefParse(def, ctxt, node, + VIR_RESCTRL_MONITOR_TYPE_ENERGY, + resctrl) < 0) + goto cleanup; + + if (newresctrl && resctrl->nmonitors > nmons) + VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, newresctrl); + + ret = 0; + cleanup: + virDomainResctrlDefFree(newresctrl); + return ret; +} + + static int virDomainDefTunablesParse(virDomainDef *def, xmlXPathContextPtr ctxt, @@ -19671,6 +19722,15 @@ virDomainDefTunablesParse(virDomainDef *def, } VIR_FREE(nodes); + if ((n 
= virXPathNodeSet("./cputune/energytune", ctxt, &nodes)) < 0) + return -1; + + for (i = 0; i < n; i++) { + if (virDomainEnergytuneDefParse(def, ctxt, nodes[i], flags) < 0) + return -1; + } + VIR_FREE(nodes); + return 0; } @@ -28721,6 +28781,42 @@ virDomainMemorytuneDefFormat(virBuffer *buf, return 0; } + +static int +virDomainEnergytuneDefFormat(virBuffer *buf, + virDomainResctrlDef *resctrl, + unsigned int flags) +{ + g_auto(virBuffer) childrenBuf = VIR_BUFFER_INIT_CHILD(buf); + g_auto(virBuffer) attrBuf = VIR_BUFFER_INITIALIZER; + g_autofree char *vcpus = NULL; + size_t i; + + for (i = 0; i < resctrl->nmonitors; i++) { + if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i], + VIR_RESCTRL_MONITOR_TYPE_ENERGY, + &childrenBuf) < 0) + return -1; + } + + if (!virBufferUse(&childrenBuf)) + return 0; + + vcpus = virBitmapFormat(resctrl->vcpus); + virBufferAsprintf(&attrBuf, " vcpus='%s'", vcpus); + + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { + const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); + if (!alloc_id) + return -1; + + virBufferAsprintf(&attrBuf, " id='%s'", alloc_id); + } + + virXMLFormatElement(buf, "energytune", &attrBuf, &childrenBuf); + return 0; +} + static int virDomainCputuneDefFormat(virBuffer *buf, virDomainDef *def, @@ -28821,6 +28917,9 @@ virDomainCputuneDefFormat(virBuffer *buf, for (i = 0; i < def->nresctrls; i++) virDomainMemorytuneDefFormat(&childrenBuf, def->resctrls[i], flags); + for (i = 0; i < def->nresctrls; i++) + virDomainEnergytuneDefFormat(&childrenBuf, def->resctrls[i], flags); + virXMLFormatElement(buf, "cputune", NULL, &childrenBuf); return 0; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index 8c03e14d37..eb365a83b5 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -1293,6 +1293,25 @@ </oneOrMore> </element> </zeroOrMore> + <zeroOrMore> + <element name="energytune"> + <attribute name="vcpus"> + <ref name="cpuset"/> + </attribute> + 
<optional> + <attribute name="id"> + <data type="string"/> + </attribute> + </optional> + <oneOrMore> + <element name="monitor"> + <attribute name="vcpus"> + <ref name="cpuset"/> + </attribute> + </element> + </oneOrMore> + </element> + </zeroOrMore> </interleave> </element> </define> diff --git a/tests/genericxml2xmlindata/energytune.xml b/tests/genericxml2xmlindata/energytune.xml new file mode 100644 index 0000000000..4ee4bedb68 --- /dev/null +++ b/tests/genericxml2xmlindata/energytune.xml @@ -0,0 +1,32 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>4</vcpu> + <cputune> + <energytune vcpus='0-1'> + <monitor vcpus='0-1'/> + </energytune> + <energytune vcpus='3'> + <monitor vcpus='3'/> + </energytune> + </cputune> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c index 6be694cac5..169c71efa3 100644 --- a/tests/genericxml2xmltest.c +++ b/tests/genericxml2xmltest.c @@ -210,6 +210,7 @@ mymain(void) DO_TEST("cachetune-small"); DO_TEST("cachetune-cdp"); DO_TEST("cachetune"); + DO_TEST("energytune"); DO_TEST_DIFFERENT("cachetune-extra-tunes"); DO_TEST_FAIL_INACTIVE("cachetune-colliding-allocs"); DO_TEST_FAIL_INACTIVE("cachetune-colliding-tunes"); -- 2.34.1
Added energy reporting from resctrl PERF_PKG_MON through domstats:

  cpu.energy.monitor.count=1
  cpu.energy.monitor.0.name=vcpus_0
  cpu.energy.monitor.0.vcpus=0
  cpu.energy.monitor.0.pkg.count=2
  cpu.energy.monitor.0.pkg.0.id=0
  cpu.energy.monitor.0.pkg.0.core_energy=0.000000
  cpu.energy.monitor.0.pkg.0.activity=0.000000
  cpu.energy.monitor.0.pkg.1.id=1
  cpu.energy.monitor.0.pkg.1.core_energy=2.888203
  cpu.energy.monitor.0.pkg.1.activity=1.718601

Changes:
- Added VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_* macros to libvirt-domain.h
- Added qemuDomainGetStatsEnergy() to qemu_driver.c

Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com>
Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com>
---
 include/libvirt/libvirt-domain.h | 65 +++++++++++++++++++++++++++++++
 src/qemu/qemu_driver.c           | 66 ++++++++++++++++++++++++++++++--
 2 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index cf05bfe2b7..1066a0b3f1 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -4420,6 +4420,71 @@ struct _virDomainStatsRecord {
 # define VIR_DOMAIN_STATS_MEMORY_BANDWIDTH_MONITOR_SUFFIX_NODE_SUFFIX_BYTES_TOTAL ".bytes.total"
 
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT:
+ *
+ * The number of energy monitors for this domain, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT "cpu.energy.monitor.count"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX:
+ *
+ * Prefix for an individual energy monitor group. Concatenate
+ * with the monitor index and one of the "cpu.energy.monitor.<i>." suffix
+ * macros below to form a full parameter name.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "cpu.energy.monitor."
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME:
+ *
+ * Name of the monitor group as a string.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME ".name"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS:
+ *
+ * vCPU set covered by the monitor group as a string.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS ".vcpus"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT:
+ *
+ * Number of PERF_PKG nodes the monitor group exposes, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT ".pkg.count"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX:
+ *
+ * Prefix for a single mon_PERF_PKG node inside a monitor group.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX ".pkg."
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID:
+ *
+ * Kernel-assigned mon_PERF_PKG node id, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID ".id"
+
 /**
  * VIR_DOMAIN_STATS_DIRTYRATE_CALC_STATUS:
  *
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a3d648e268..596e6ee7f3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -17051,11 +17051,11 @@ qemuDomainFreeResctrlMonData(virQEMUResctrlMonData *resdata)
  * returns an error, the caller is also required to call
  * qemuDomainFreeResctrlMonData to free each element in the
  * *@resdata array and then the array itself.
- * @tag: Could be VIR_RESCTRL_MONITOR_TYPE_CACHE for getting cache statistics
- *       from @dom cache monitors. VIR_RESCTRL_MONITOR_TYPE_MEMBW for
- *       getting memory bandwidth statistics from memory bandwidth monitors.
+ * @tag: VIR_RESCTRL_MONITOR_TYPE_CACHE for getting cache statistics.
+ *       VIR_RESCTRL_MONITOR_TYPE_MEMBW for getting memory bandwidth statistics.
+ *       VIR_RESCTRL_MONITOR_TYPE_ENERGY for getting energy statistics.
  *
- * Get cache or memory bandwidth statistics from @dom monitors.
+ * Get cache, memory bandwidth or energy statistics from @dom monitors.
  *
  * Returns -1 on failure, or 0 on success.
  */
@@ -17204,6 +17204,62 @@ qemuDomainGetStatsMemoryBandwidth(virQEMUDriver *driver,
 }
 
 
+static void
+qemuDomainGetStatsEnergy(virQEMUDriver *driver,
+                         virDomainObj *dom,
+                         virTypedParamList *params)
+{
+    g_autofree virQEMUResctrlMonData **resdata = NULL;
+    size_t nresdata = 0;
+    size_t i = 0;
+    size_t j = 0;
+    size_t k = 0;
+
+    if (!virDomainObjIsActive(dom))
+        return;
+
+    if (qemuDomainGetResctrlMonData(driver, dom, &resdata, &nresdata,
+                                    VIR_RESCTRL_MONITOR_TYPE_ENERGY) < 0) {
+        virResetLastError();
+        return;
+    }
+
+    if (nresdata == 0)
+        return;
+
+    virTypedParamListAddUInt(params, nresdata,
+                             VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT);
+
+    for (i = 0; i < nresdata; i++) {
+        virTypedParamListAddString(params, resdata[i]->name,
+                                   VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME, i);
+        virTypedParamListAddString(params, resdata[i]->vcpus,
+                                   VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS, i);
+        virTypedParamListAddUInt(params, resdata[i]->nstats,
+                                 VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT, i);
+
+        for (j = 0; j < resdata[i]->nstats; j++) {
+            char **features = resdata[i]->stats[j]->features;
+
+            virTypedParamListAddUInt(params, resdata[i]->stats[j]->id,
+                                     VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID, i, j);
+
+            for (k = 0; features[k]; k++) {
+                if (k >= resdata[i]->stats[j]->ndvals)
+                    break;
+
+                virTypedParamListAddDouble(params, resdata[i]->stats[j]->dvals[k],
+                                           VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX "%zu" ".%s", i, j,
+                                           features[k]);
+            }
+        }
+    }
+
+    for (i = 0; i < nresdata; i++)
+        qemuDomainFreeResctrlMonData(resdata[i]);
+}
+
+
 static void
 qemuDomainGetStatsCpuCache(virQEMUDriver *driver,
                            virDomainObj *dom,
@@ -17415,6 +17471,8 @@ qemuDomainGetStatsCpu(virQEMUDriver *driver,
 
     qemuDomainGetStatsCpuCache(driver, dom, params);
 
+    qemuDomainGetStatsEnergy(driver, dom, params);
+
     qemuDomainGetStatsCpuHaltPollTime(dom, params, privflags);
 }
 
-- 
2.34.1
On 5/14/26 16:41, Jedrzej Wasiukiewicz wrote:
Added energy reporting from resctrl PERF_PKG_MON through domstats:

  cpu.energy.monitor.count=1
  cpu.energy.monitor.0.name=vcpus_0
  cpu.energy.monitor.0.vcpus=0
  cpu.energy.monitor.0.pkg.count=2
  cpu.energy.monitor.0.pkg.0.id=0
  cpu.energy.monitor.0.pkg.0.core_energy=0.000000
  cpu.energy.monitor.0.pkg.0.activity=0.000000
  cpu.energy.monitor.0.pkg.1.id=1
  cpu.energy.monitor.0.pkg.1.core_energy=2.888203
  cpu.energy.monitor.0.pkg.1.activity=1.718601

Changes:
- Added VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_* macros to libvirt-domain.h
- Added qemuDomainGetStatsEnergy() to qemu_driver.c

Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com>
Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com>
---
 include/libvirt/libvirt-domain.h | 65 +++++++++++++++++++++++++++++++
 src/qemu/qemu_driver.c           | 66 ++++++++++++++++++++++++++++++--
 2 files changed, 127 insertions(+), 4 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index cf05bfe2b7..1066a0b3f1 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -4420,6 +4420,71 @@ struct _virDomainStatsRecord {
 # define VIR_DOMAIN_STATS_MEMORY_BANDWIDTH_MONITOR_SUFFIX_NODE_SUFFIX_BYTES_TOTAL ".bytes.total"
 
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT:
+ *
+ * The number of energy monitors for this domain, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT "cpu.energy.monitor.count"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX:
+ *
+ * Prefix for an individual energy monitor group. Concatenate
+ * with the monitor index and one of the "cpu.energy.monitor.<i>." suffix
+ * macros below to form a full parameter name.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "cpu.energy.monitor."
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME:
+ *
+ * Name of the monitor group as a string.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME ".name"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS:
+ *
+ * vCPU set covered by the monitor group as a string.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS ".vcpus"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT:
+ *
+ * Number of PERF_PKG nodes the monitor group exposes, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT ".pkg.count"
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX:
+ *
+ * Prefix for a single mon_PERF_PKG node inside a monitor group.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX ".pkg."
+
+/**
+ * VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID:
+ *
+ * Kernel-assigned mon_PERF_PKG node id, as an unsigned int.
+ *
+ * Since: 12.4.0
+ */
+# define VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID ".id"
+
 /**
  * VIR_DOMAIN_STATS_DIRTYRATE_CALC_STATUS:
  *
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a3d648e268..596e6ee7f3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -17051,11 +17051,11 @@ qemuDomainFreeResctrlMonData(virQEMUResctrlMonData *resdata)
  * returns an error, the caller is also required to call
  * qemuDomainFreeResctrlMonData to free each element in the
  * *@resdata array and then the array itself.
- * @tag: Could be VIR_RESCTRL_MONITOR_TYPE_CACHE for getting cache statistics
- *       from @dom cache monitors. VIR_RESCTRL_MONITOR_TYPE_MEMBW for
- *       getting memory bandwidth statistics from memory bandwidth monitors.
+ * @tag: VIR_RESCTRL_MONITOR_TYPE_CACHE for getting cache statistics.
+ *       VIR_RESCTRL_MONITOR_TYPE_MEMBW for getting memory bandwidth statistics.
+ *       VIR_RESCTRL_MONITOR_TYPE_ENERGY for getting energy statistics.
  *
- * Get cache or memory bandwidth statistics from @dom monitors.
+ * Get cache, memory bandwidth or energy statistics from @dom monitors.
  *
  * Returns -1 on failure, or 0 on success.
  */
@@ -17204,6 +17204,62 @@ qemuDomainGetStatsMemoryBandwidth(virQEMUDriver *driver,
 }
 
+static void
+qemuDomainGetStatsEnergy(virQEMUDriver *driver,
+                         virDomainObj *dom,
+                         virTypedParamList *params)
+{
+    g_autofree virQEMUResctrlMonData **resdata = NULL;
+    size_t nresdata = 0;
+    size_t i = 0;
+    size_t j = 0;
+    size_t k = 0;
We tend to declare variables in their smallest possible scope. IOW, these (j, and k) could be declared inside for() loops.
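A minimal sketch of what the scoping suggestion above could look like, assuming j and k move into the body of the loop that encloses them. The struct types and the energyPrintStats() helper are hypothetical stand-ins for illustration and are not part of the patch; only the loop shape and the parameter-name layout follow the code under review.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the per-package statistics. */
typedef struct {
    unsigned int id;        /* package node id */
    const char **features;  /* NULL-terminated list of feature names */
    const double *dvals;    /* one floating-point value per feature */
    size_t ndvals;          /* number of entries in dvals */
} EnergyPkgStats;

typedef struct {
    const char *name;
    const EnergyPkgStats *stats;
    size_t nstats;
} EnergyMonitor;

/* Print every per-package value in domstats-like form and
 * return how many values were printed. */
size_t
energyPrintStats(FILE *out, const EnergyMonitor *mons, size_t nmons)
{
    size_t i;
    size_t printed = 0;

    for (i = 0; i < nmons; i++) {
        size_t j;   /* only needed while walking one monitor */

        for (j = 0; j < mons[i].nstats; j++) {
            const EnergyPkgStats *pkg = &mons[i].stats[j];
            size_t k;   /* only needed while walking one package */

            for (k = 0; pkg->features[k] && k < pkg->ndvals; k++) {
                fprintf(out, "cpu.energy.monitor.%zu.pkg.%zu.%s=%f\n",
                        i, j, pkg->features[k], pkg->dvals[k]);
                printed++;
            }
        }
    }

    return printed;
}
```

Calling energyPrintStats(stdout, mons, nmons) on a populated array prints one line per feature value. Keeping j and k in the innermost block that uses them makes it harder to reuse a stale index after its loop has ended.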
+
+    if (!virDomainObjIsActive(dom))
+        return;
+
+    if (qemuDomainGetResctrlMonData(driver, dom, &resdata, &nresdata,
+                                    VIR_RESCTRL_MONITOR_TYPE_ENERGY) < 0) {
+        virResetLastError();
+        return;
+    }
+
+    if (nresdata == 0)
+        return;
+
+    virTypedParamListAddUInt(params, nresdata,
+                             VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_COUNT);
+
+    for (i = 0; i < nresdata; i++) {
+        virTypedParamListAddString(params, resdata[i]->name,
+                                   VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_NAME, i);
+        virTypedParamListAddString(params, resdata[i]->vcpus,
+                                   VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_VCPUS, i);
+        virTypedParamListAddUInt(params, resdata[i]->nstats,
+                                 VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_COUNT, i);
+
+        for (j = 0; j < resdata[i]->nstats; j++) {
+            char **features = resdata[i]->stats[j]->features;
+
+            virTypedParamListAddUInt(params, resdata[i]->stats[j]->id,
+                                     VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_SUFFIX_ID, i, j);
+
+            for (k = 0; features[k]; k++) {
+                if (k >= resdata[i]->stats[j]->ndvals)
+                    break;
+
+                virTypedParamListAddDouble(params, resdata[i]->stats[j]->dvals[k],
+                                           VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_PREFIX "%zu" VIR_DOMAIN_STATS_CPU_ENERGY_MONITOR_SUFFIX_PKG_PREFIX "%zu" ".%s", i, j,
+                                           features[k]);
+            }
+        }
+    }
+
+    for (i = 0; i < nresdata; i++)
+        qemuDomainFreeResctrlMonData(resdata[i]);
+}
+
+
 static void
 qemuDomainGetStatsCpuCache(virQEMUDriver *driver,
                            virDomainObj *dom,
@@ -17415,6 +17471,8 @@ qemuDomainGetStatsCpu(virQEMUDriver *driver,
 
     qemuDomainGetStatsCpuCache(driver, dom, params);
 
+    qemuDomainGetStatsEnergy(driver, dom, params);
+
     qemuDomainGetStatsCpuHaltPollTime(dom, params, privflags);
 }
Michal
Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com>
Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com>
---
 NEWS.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/NEWS.rst b/NEWS.rst
index 105a398ca8..02dda023ed 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -17,6 +17,12 @@ v12.4.0 (unreleased)
 
 * **New features**
 
+  * resctrl: Add energy monitoring via resctrl's PERF_PKG_MON
+
+    Add support for Linux kernel 7.0 feature - energy monitoring via resctrl.
+    This allows to monitor per-VM energy consumption on supported platforms.
+    Implemented via ``energytune`` element in ``cputune`` .
+
 * **Improvements**
 
 * **Bug fixes**
-- 
2.34.1
On 5/14/26 16:41, Jedrzej Wasiukiewicz wrote:
Linux kernel 7.0 introduced PERF_PKG_MON support in the resctrl filesystem, which exposes per-workload energy and performance monitoring.
This series enables per-VM energy monitoring via the core_energy (Joules) and activity (Farads) counters. Energy monitors can be configured through a new <energytune> element under <cputune>, following the earlier cachetune and memorytune patterns.
Design notes:
Energy values from resctrl are floating-point. I added a separate dvals/ndvals pair to virResctrlMonitorStats to handle them. I kept them in a single struct for easier integration with performance counters (integers and floats within the same monitor) that might be added in another patch.
The new XML element is <energytune> under <cputune>, following the earlier pattern for resctrl features (cachetune, memorytune). Energytune doesn't currently support the "tuning" part, only monitoring. I added it as energytune for consistency with the cache and memory features, keeping all resctrl handling under cputune. This also fits the current resctrl architecture, where all monitoring groups are part of an allocation group. This approach allows for easy opt-in/out of the monitoring features.
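As a rough illustration of the single-struct design described above, the following sketch keeps integer and floating-point values side by side. Only the dvals/ndvals names come from the cover letter; the struct name, the vals/nvals pair and the helper functions are assumptions made for this example and do not reproduce the actual virResctrlMonitorStats definition.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch: one stats record carrying both value kinds. */
typedef struct {
    unsigned int id;            /* node id the values belong to */
    unsigned long long *vals;   /* integer counters (assumed field name) */
    size_t nvals;
    double *dvals;              /* floating-point counters, e.g. core_energy */
    size_t ndvals;
} MonitorStatsSketch;

/* Append one floating-point value; returns 0 on success, -1 on failure. */
int
monitorStatsAddDouble(MonitorStatsSketch *stats, double value)
{
    double *tmp = realloc(stats->dvals, (stats->ndvals + 1) * sizeof(double));

    if (!tmp)
        return -1;

    tmp[stats->ndvals++] = value;
    stats->dvals = tmp;
    return 0;
}

/* Append one integer value; returns 0 on success, -1 on failure. */
int
monitorStatsAddUInt(MonitorStatsSketch *stats, unsigned long long value)
{
    unsigned long long *tmp = realloc(stats->vals,
                                      (stats->nvals + 1) * sizeof(unsigned long long));

    if (!tmp)
        return -1;

    tmp[stats->nvals++] = value;
    stats->vals = tmp;
    return 0;
}

/* Release both arrays and reset the counts. */
void
monitorStatsClear(MonitorStatsSketch *stats)
{
    free(stats->vals);
    free(stats->dvals);
    stats->vals = NULL;
    stats->dvals = NULL;
    stats->nvals = 0;
    stats->ndvals = 0;
}
```

With separate counts for each array, a reader of the record can tell which kind of value a monitor produced without a type tag, at the cost of two arrays to free.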
Changes since v1 (RFC):
- Split into 5 patches (see below)
- Changed VIR_INFO/VIR_WARN to VIR_DEBUG
- Tightened energyMonitorFeature to <choice> of core_energy/activity instead of regex
- Renamed test case energytune-basic -> energytune
- Minor comment cleanups in virresctrl.h

Jedrzej Wasiukiewicz (5):
  util: add PERF_PKG_MON energy monitoring support in virresctrl
  conf: report energy monitoring in host capabilities
  conf: add energytune to domain XML
  qemu: resctrl energy counters via domstats
  NEWS: document resctrl energy monitoring

 NEWS.rst                                  |   6 +
 docs/formatdomain.rst                     |  20 +++
 include/libvirt/libvirt-domain.h          |  65 ++++++++
 src/conf/capabilities.c                   |  42 +++++
 src/conf/capabilities.h                   |   6 +
 src/conf/domain_conf.c                    |  99 ++++++++++++
 src/conf/schemas/capability.rng           |  27 ++++
 src/conf/schemas/domaincommon.rng         |  19 +++
 src/conf/virconftypes.h                   |   2 +
 src/qemu/qemu_driver.c                    |  70 ++++++++-
 src/util/virresctrl.c                     | 180 ++++++++++++++++++----
 src/util/virresctrl.h                     |  10 +-
 tests/genericxml2xmlindata/energytune.xml |  32 ++++
 tests/genericxml2xmltest.c                |   1 +
 14 files changed, 542 insertions(+), 37 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/energytune.xml
-- 
2.34.1
I'm fixing those small nits I pointed out in my review and merging.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

Congratulations on your first libvirt contribution!

Michal