This is V2 of the RFC and the POC source code for introducing the x86 RDT CMT
feature. Thanks to Martin Kletzander for his review and constructive
suggestions on V1.
This series aims to provide functionality similar to the perf event based
CMT, MBMT and MBML features, which report cache occupancy, total memory
bandwidth utilization and local memory bandwidth utilization in libvirt.
As a first step we focus on CMT.
x86 RDT Cache Monitoring Technology (CMT) provides a method to track
cache occupancy per CPU thread. We leverage the kernel's resctrl
filesystem implementation and build our patches on top of it.
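For readers unfamiliar with resctrl, the following is a minimal sketch (not
code from this series) of how cache occupancy can be read from a monitoring
group. It assumes the kernel's documented layout, in which each group has a
mon_data/mon_L3_<id>/llc_occupancy file per L3 cache domain. The helper names
and the fake directory tree are hypothetical and exist only so the example
runs without RDT hardware.

```python
import os
import tempfile


def read_llc_occupancy(group_dir):
    """Sum llc_occupancy (bytes) over every mon_L3_* domain of one group."""
    mon_data = os.path.join(group_dir, "mon_data")
    total = 0
    for entry in sorted(os.listdir(mon_data)):
        if not entry.startswith("mon_L3_"):
            continue
        with open(os.path.join(mon_data, entry, "llc_occupancy")) as f:
            total += int(f.read().strip())
    return total


def make_fake_group(root, name, per_cache_bytes):
    """Hypothetical helper: build a fake group tree under 'root' for testing."""
    group_dir = os.path.join(root, "mon_groups", name)
    for domain, value in per_cache_bytes.items():
        domain_dir = os.path.join(group_dir, "mon_data", domain)
        os.makedirs(domain_dir)
        with open(os.path.join(domain_dir, "llc_occupancy"), "w") as f:
            f.write("%d\n" % value)
    return group_dir


if __name__ == "__main__":
    # On a real host the root would be the resctrl mount point,
    # typically /sys/fs/resctrl; here a temporary directory stands in.
    with tempfile.TemporaryDirectory() as root:
        group = make_fake_group(root, "demo", {"mon_L3_00": 4096, "mon_L3_01": 8192})
        print(read_llc_occupancy(group))
```

Summing over all cache domains is a choice made for this sketch; whether the
series reports a per-domain or a summed value is not specified here.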
Describing the functionality from a high level:
1. Extend the output of 'domstats' to report CMT information.
Compared with the perf event based CMT implementation in libvirt, this series
extends the output of the 'domstats' command and reports cache occupancy
information like this:
<pre>
[root@dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
Domain: 'vm3'
cpu.cacheoccupancy.vcpus_2.value=4415488
cpu.cacheoccupancy.vcpus_2.vcpus=2
cpu.cacheoccupancy.vcpus_1.value=7839744
cpu.cacheoccupancy.vcpus_1.vcpus=1
cpu.cacheoccupancy.vcpus_0,3.value=53796864
cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
</pre>
The vcpus have been arranged into three monitoring groups, which cover
vcpu 1, vcpu 2 and vcpus 0,3 respectively. For example,
'cpu.cacheoccupancy.vcpus_0,3.value' reports the cache occupancy
for vcpu 0 and vcpu 3, and 'cpu.cacheoccupancy.vcpus_0,3.vcpus'
lists the vcpus that belong to the group.
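As an illustration of how a consumer might read this output, here is a small
sketch that parses the 'cpu.cacheoccupancy.<groupname>.value' and '.vcpus'
lines into a dictionary. The function name is hypothetical and the parsing
rules are assumptions based only on the sample output above.

```python
def parse_cache_occupancy(text):
    """Map group name -> {'value': bytes, 'vcpus': [ids]} from domstats text."""
    prefix = "cpu.cacheoccupancy."
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(prefix) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # The last dot separates the field from the group name, so a group
        # name such as 'vcpus_0,3' is kept intact.
        name, _, field = key[len(prefix):].rpartition(".")
        entry = groups.setdefault(name, {})
        if field == "value":
            entry["value"] = int(value)
        elif field == "vcpus":
            entry["vcpus"] = [int(v) for v in value.split(",")]
    return groups


if __name__ == "__main__":
    sample = """Domain: 'vm3'
    cpu.cacheoccupancy.vcpus_1.value=7839744
    cpu.cacheoccupancy.vcpus_1.vcpus=1
    cpu.cacheoccupancy.vcpus_0,3.value=53796864
    cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
    """
    for name, info in parse_cache_occupancy(sample).items():
        print(name, info)
```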
To address Martin's suggestion "beware as 1-4 is something else than 1,4 so
you need to differentiate that.", the content of 'vcpus'
(cpu.cacheoccupancy.<groupname>.vcpus=xxx) is specially processed: if
vcpus is a continuous range, e.g. 0-2, then the output of
cpu.cacheoccupancy.vcpus_0-2.vcpus will be
'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
instead of
'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
Please note that 'vcpus_0-2' is the name of this monitoring group; it can be
set to any other word in the XML configuration file or changed at run time
with the command introduced in the following part.
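The range expansion described above can be sketched as follows. This is an
illustration of the idea in Python, not the C code in the series, and the
function name is hypothetical.

```python
def expand_vcpus(spec):
    """Expand a vcpu list such as '0-2,5' into the explicit form '0,1,2,5'."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            # A range 'a-b' covers every id from a to b inclusive.
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(part))
    return ",".join(str(i) for i in sorted(ids))


if __name__ == "__main__":
    print(expand_vcpus("0-2"))
    print(expand_vcpus("1-4"))
    print(expand_vcpus("1,4"))
```

With the explicit form, '1-4' and '1,4' can no longer be confused, because
the first expands to four ids and the second stays as two.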
2. A new command 'cpu-resource' for changing CMT groups at run time.
A virsh command has been introduced in this series to dynamically create and
destroy monitoring groups, as well as to show the existing grouping status.
The general command interface is like this:
<pre>
[root@dl-c200 libvirt]# virsh help cpu-resource
NAME
cpu-resource - get or set hardware CPU RDT monitoring group
SYNOPSIS
cpu-resource <domain> [--group-name <string>] [--vcpulist
<string>] [--create] [--destroy] [--live] [--config]
[--current]
DESCRIPTION
Create or destroy CPU resource monitoring group.
To get current CPU resource monitoring group status:
virsh # cpu-resource [domain]
OPTIONS
[--domain] <string> domain name, id or uuid
--group-name <string> group name to manipulate
--vcpulist <string> ids of vcpus to manipulate
--create Create CPU resctrl monitoring group for
functions such as monitoring cache occupancy
--destroy Destroy CPU resctrl monitoring group
--live modify/get running state
--config modify/get persistent configuration
--current affect current domain
</pre>
This command provides a live interface for changing resource monitoring groups
and for keeping the result in the persistent domain XML configuration file.
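To show how the options in the help text combine, here is a sketch of a
helper that builds the argument list for the proposed command. The helper and
its parameter names are hypothetical; the option names are taken from the
help output above, and the command only exists with this series applied. The
sketch only builds the list and does not run virsh.

```python
def cpu_resource_argv(domain, group_name=None, vcpulist=None,
                      action=None, scope=None):
    """Build a 'virsh cpu-resource' argument list (hypothetical helper)."""
    if action not in (None, "create", "destroy"):
        raise ValueError("action must be 'create' or 'destroy'")
    if scope not in (None, "live", "config", "current"):
        raise ValueError("scope must be 'live', 'config' or 'current'")
    argv = ["virsh", "cpu-resource", domain]
    if group_name is not None:
        argv += ["--group-name", group_name]
    if vcpulist is not None:
        argv += ["--vcpulist", vcpulist]
    if action is not None:
        argv.append("--" + action)
    if scope is not None:
        argv.append("--" + scope)
    return argv


if __name__ == "__main__":
    # Query the current grouping status of a domain.
    print(cpu_resource_argv("vm3"))
    # Create a group for vcpus 0-2 on the running domain.
    print(cpu_resource_argv("vm3", group_name="vcpus_0-2", vcpulist="0-2",
                            action="create", scope="live"))
```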
3. XML configuration changes for keeping CMT groups.
To keep the monitoring group information and to monitor CPU cache resource
utilization from launch time, the XML configuration file has been
changed by adding a new element <resmongroup>:
<pre>
# Add a new element
<cputune>
<resmongroup vcpus='0-2'/>
<resmongroup vcpus='3'/>
</cputune>
</pre>
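As a sketch of how the new element could be read back from a domain XML, the
following uses Python's standard xml.etree.ElementTree. The surrounding
<domain> document is a made-up minimal example, and only the <cputune> and
<resmongroup> elements with the 'vcpus' attribute come from this series.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal domain XML; only <resmongroup> is from the series.
DOMAIN_XML = """<domain type='kvm'>
  <name>vm3</name>
  <cputune>
    <resmongroup vcpus='0-2'/>
    <resmongroup vcpus='3'/>
  </cputune>
</domain>"""


def resmongroup_vcpus(xml_text):
    """Return the 'vcpus' attribute of every <resmongroup> under <cputune>."""
    root = ET.fromstring(xml_text)
    return [el.get("vcpus") for el in root.findall("./cputune/resmongroup")]


if __name__ == "__main__":
    print(resmongroup_vcpus(DOMAIN_XML))
```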
4. About the naming used in this series for RDT CMT technology.
Regarding the wording and naming used in this series for Intel RDT CMT
technology: 'RDT', 'CMT' and 'resctrl' are the names currently used in Intel
documents and the kernel namespace in the context of CPU resources, but they
are pretty confusing for system administrators. 'Resource Control' or
'Monitoring' is not a good choice either: the scope of these two phrases
is too broad, as they normally cover many aspects other than CPU cache and
memory bandwidth. Intel 'RDT' is a technology that emphasizes resource
allocation and monitoring within the scope of the CPU, so I would like to use
the term 'cpu-resource' here to describe the technology that these patches
are trying to address.
This series focuses on CPU cache occupancy monitoring (CMT), but the
name has a wider scope than CMT, so we could add similar resource
monitoring for the MBML and MBMT technologies under the framework
introduced in these patches. The name is also applicable to
CPU resource allocation: it would be possible to extend the command with
arguments that allocate cache or memory bandwidth at run time.
5. About emulator and io threads CMT
Currently, it is not possible to allocate a dedicated amount of cache or
memory bandwidth for emulator or io threads, so resource monitoring for
emulator or io threads is not considered in this series.
It could be planned for the next stage.
Changes since v1:
A lot of things changed, mainly:
* report cache occupancy information per vcpu group instead of for the whole
domain
* make it possible to destroy a vcpu group at run time
* XML configuration file changed
* change the naming for describing 'RDT CMT' to 'cpu-resource'
Wang Huaqiang (10):
util: add Intel x86 RDT/CMT support
conf: introduce <resmongroup> element
tests: add tests for validating <resmongroup>
libvirt: add public APIs for resource monitoring group
qemu: enable resctrl monitoring at booting stage
remote: add remote protocol for resctrl monitoring
qemu: add interfaces for dynamically manupulating resctl mon groups
tool: add command cpuresource to interact with cpu resources
tools: show cpu cache occupancy information in domstats
news: add Intel x86 RDT CMT feature
docs/formatdomain.html.in | 17 +
docs/news.xml | 10 +
docs/schemas/domaincommon.rng | 14 +
include/libvirt/libvirt-domain.h | 14 +
src/conf/domain_conf.c | 320 ++++++++++++++++++
src/conf/domain_conf.h | 25 ++
src/driver-hypervisor.h | 13 +
src/libvirt-domain.c | 96 ++++++
src/libvirt_private.syms | 13 +
src/libvirt_public.syms | 6 +
src/qemu/qemu_driver.c | 357 +++++++++++++++++++++
src/qemu/qemu_process.c | 45 ++-
src/remote/remote_daemon_dispatch.c | 45 +++
src/remote/remote_driver.c | 4 +-
src/remote/remote_protocol.x | 31 +-
src/remote_protocol-structs | 16 +
src/util/virresctrl.c | 338 +++++++++++++++++++
src/util/virresctrl.h | 40 +++
tests/genericxml2xmlindata/cachetune-cdp.xml | 3 +
tests/genericxml2xmlindata/cachetune-small.xml | 2 +
tests/genericxml2xmlindata/cachetune.xml | 2 +
.../resmongroup-colliding-cachetune.xml | 34 ++
tests/genericxml2xmltest.c | 3 +
tools/virsh-domain-monitor.c | 7 +
tools/virsh-domain.c | 139 ++++++++
25 files changed, 1588 insertions(+), 6 deletions(-)
create mode 100644 tests/genericxml2xmlindata/resmongroup-colliding-cachetune.xml
--
2.7.4