> Libvirt exposes per-vCPU stats for domains via [6]. I'd like to be able to export those via the exporter. One important metric to me would be the steal time (vcpu.<num>.delay), to determine if domains are starting to get cut short or even starve on CPU time. Apparently those metrics are not / cannot be exposed anymore since the switch to cgroups v2? Reading [7] or [8], others seem to have run into this.

I just tested that upstream libvirt on a system with cgroups v2 reports vcpu.<num>.delay, as this stat is not taken from cgroups at all; we use `/proc` for it. The stats you are asking for can be obtained using the libvirt API virConnectGetAllDomainStats [10]. The bugs you mentioned are about a different stat, which affects a different API, virDomainGetCPUStats [11].

> Is this actually still the case, even for more recent kernels? If so, I am wondering if there is an issue being tracked to implement this functionality?

As far as I know it is still the case that there is no replacement for cpuacct.usage_percpu in cgroups v2, but that should not affect the data you seem to be consuming from libvirt.

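
For readers who want to check this on their own host, here is a minimal sketch of reading vcpu.<num>.delay through virConnectGetAllDomainStats using the libvirt-python bindings. It is an illustration only, not the exporter's code: the helper names `vcpu_delays` and `collect_vcpu_delays` and the default URI `qemu:///system` are assumptions made for this example, and the values are passed through as libvirt reports them, without unit conversion.

```python
import re

# Matches stat keys of the form "vcpu.<num>.delay".
_VCPU_DELAY = re.compile(r"^vcpu\.(\d+)\.delay$")


def vcpu_delays(stats):
    """Return {vcpu_number: delay} from one domain's stats dict.

    Hypothetical helper for this example; it only filters keys and
    does not interpret or convert the reported values.
    """
    delays = {}
    for key, value in stats.items():
        match = _VCPU_DELAY.match(key)
        if match:
            delays[int(match.group(1))] = value
    return delays


def collect_vcpu_delays(uri="qemu:///system"):
    """Return {domain_name: {vcpu_number: delay}} for all domains.

    The URI is an assumption; adjust it to your hypervisor.
    """
    # Imported lazily so vcpu_delays() works without libvirt-python.
    import libvirt

    conn = libvirt.openReadOnly(uri)
    try:
        # Python binding for virConnectGetAllDomainStats, restricted
        # to the vCPU stats group.
        records = conn.getAllDomainStats(libvirt.VIR_DOMAIN_STATS_VCPU)
        return {dom.name(): vcpu_delays(stats) for dom, stats in records}
    finally:
        conn.close()


if __name__ == "__main__":
    for name, delays in collect_vcpu_delays().items():
        for vcpu, delay in sorted(delays.items()):
            print(f"{name} vcpu {vcpu} delay {delay}")
```

If a domain shows no delay entries, that may simply mean the libvirt version in use does not report this field, so an empty result is not necessarily a cgroups v2 problem.
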
Thanks for your time and the helpful answers to my questions!
We have now implemented this in the prometheus-libvirt-exporter
and released version 1.5.1 containing those (and other) new
metrics:
* https://github.com/inovex/prometheus-libvirt-exporter/releases/tag/v1.5.1

Regards
Christian