This series of patches introduces x86 Cache Monitoring Technology
(CMT) to libvirt by interacting with the kernel resource control (resctrl)
interface. CMT is one of the Intel(R) x86 CPU features that belong to
Resource Director Technology (RDT). CMT reports the occupancy of the
last level cache, which is shared by all CPU cores.
We have had several discussions about enabling CMT; please refer to the
following links for the RFCs.
RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
1. Why is CMT necessary in libvirt?
The perf events 'CMT, MBML, MBMT' have been phased out since Linux
kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the
perf-based cmt and mbm support in libvirt does not work with the latest
Linux kernel. These patches add the CMT feature to libvirt through the
kernel resctrlfs interface.
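For orientation, the monitoring information that resctrl exposes can be read straight from the filesystem. The sketch below is illustrative only and is not the code in this series: it assumes resctrl is mounted at /sys/fs/resctrl, and the helper name read_l3_mon_info is hypothetical. The file names match the test data this series adds under tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/.

```python
from pathlib import Path


def read_l3_mon_info(resctrl_root="/sys/fs/resctrl"):
    """Read the L3 monitoring info exposed by the kernel resctrl interface.

    Hypothetical helper for illustration. Returns None when the L3_MON
    directory is absent (no CMT support, or resctrl is not mounted).
    """
    info_dir = Path(resctrl_root) / "info" / "L3_MON"
    if not info_dir.is_dir():
        return None
    return {
        # Number of hardware RMIDs available for monitoring groups.
        "num_rmids": int((info_dir / "num_rmids").read_text().strip()),
        # Occupancy threshold in bytes used when deciding RMID reuse.
        "max_threshold_occupancy": int(
            (info_dir / "max_threshold_occupancy").read_text().strip()
        ),
        # One feature name per line, for example llc_occupancy.
        "mon_features": (info_dir / "mon_features").read_text().split(),
    }


if __name__ == "__main__":
    print(read_l3_mon_info())
```

On a host without CMT or without resctrl mounted, the helper returns None instead of raising.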
2. High-level interfaces for CMT.
2.1 Query the host capability of CMT.
The element 'monitor' represents the host capabilities of CMT.
The CMT attributes involved are:
- 'maxAllocs' denotes the maximum number of monitoring groups that can
be created, which is limited by the number of hardware RMIDs.
- 'threshold' denotes the upper bound of cache occupancy for the current
group, in bytes, used to determine whether an RMID can be reused.
- The element 'feature' denotes a supported monitoring feature.
- 'llc_occupancy' is the feature for reporting last level cache
occupancy information.
# virsh capabilities
...
<cache>
<bank id='0' level='3' type='both' size='15'
unit='MiB' cpus='0-5'>
<control granularity='768' unit='KiB' type='code'
maxAllocs='8'/>
<control granularity='768' unit='KiB' type='data'
maxAllocs='8'/>
+ <monitor threshold='540672' unit='B'
maxAllocs='176'>
+ <feature name='llc_occupancy'/>
+ </monitor>
</bank>
<bank id='1' level='3' type='both' size='15'
unit='MiB' cpus='6-11'>
<control granularity='768' unit='KiB' type='code'
maxAllocs='8'/>
<control granularity='768' unit='KiB' type='data'
maxAllocs='8'/>
+ <monitor threshold='540672' unit='B'
maxAllocs='176'>
+ <feature name='llc_occupancy'/>
+ </monitor>
</bank>
</cache>
...
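A management application could pick the monitor capability out of the capabilities XML roughly as follows. This is a minimal sketch, assuming the element and attribute names proposed in this cover letter ('monitor', 'threshold', 'maxAllocs', 'feature'); the function name cache_monitors and the embedded sample are illustrative, with the 'monitor' element written in well-formed XML.

```python
import xml.etree.ElementTree as ET

# Sample <cache> fragment using the attribute names proposed above.
SAMPLE_CACHE_XML = """
<cache>
  <bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
    <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
    <control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
    <monitor threshold='540672' unit='B' maxAllocs='176'>
      <feature name='llc_occupancy'/>
    </monitor>
  </bank>
  <bank id='1' level='3' type='both' size='15' unit='MiB' cpus='6-11'>
    <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
    <control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
    <monitor threshold='540672' unit='B' maxAllocs='176'>
      <feature name='llc_occupancy'/>
    </monitor>
  </bank>
</cache>
"""


def cache_monitors(xml_text):
    """Map cache bank id -> monitor capability for banks that have one."""
    root = ET.fromstring(xml_text)
    result = {}
    for bank in root.iter("bank"):
        mon = bank.find("monitor")
        if mon is None:
            continue  # this bank reports no monitoring capability
        result[int(bank.get("id"))] = {
            "threshold": int(mon.get("threshold")),
            "maxAllocs": int(mon.get("maxAllocs")),
            "features": [f.get("name") for f in mon.findall("feature")],
        }
    return result


if __name__ == "__main__":
    for bank_id, caps in cache_monitors(SAMPLE_CACHE_XML).items():
        print(bank_id, caps)
```

Because the function walks every 'bank' element below the root, it should also accept the full 'virsh capabilities' document rather than only the 'cache' fragment.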
2.2 Create a cache monitoring group (cache monitor).
The main interface for creating a monitoring group is the domain XML
file. The proposed configuration looks like:
<cputune>
<cachetune vcpus='1'>
<cache id='0' level='3' type='code'
size='7680' unit='KiB'/>
<cache id='1' level='3' type='data'
size='3840' unit='KiB'/>
+ <monitor vcpus='1'/>
</cachetune>
<cachetune vcpus='4-7'>
+ <monitor vcpus='4-6'/>
</cachetune>
</cputune>
The XML above creates 2 resctrl cache allocation groups and 2 resctrl
monitoring groups.
Changes to a cache monitor take effect the next time the VM boots.
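The relationship between allocation groups and monitoring groups in this configuration can be sketched as below. This is a rough illustration, not the parser in this series: the helper names parse_vcpus and monitor_groups are hypothetical, and the check that a monitor's vcpus must be a subset of its enclosing 'cachetune' vcpus is an assumption suggested by the example, not a rule stated here.

```python
import xml.etree.ElementTree as ET

SAMPLE_CPUTUNE_XML = """
<cputune>
  <cachetune vcpus='1'>
    <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
    <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
    <monitor vcpus='1'/>
  </cachetune>
  <cachetune vcpus='4-7'>
    <monitor vcpus='4-6'/>
  </cachetune>
</cputune>
"""


def parse_vcpus(spec):
    """Expand a vcpu list such as '1' or '4-6,8' into a set of ints."""
    vcpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            vcpus.update(range(int(low), int(high) + 1))
        else:
            vcpus.add(int(part))
    return vcpus


def monitor_groups(cputune_xml):
    """Return (allocation vcpus, monitor vcpus) pairs, one per monitor."""
    root = ET.fromstring(cputune_xml)
    groups = []
    for tune in root.iter("cachetune"):
        alloc_vcpus = parse_vcpus(tune.get("vcpus"))
        for mon in tune.findall("monitor"):
            mon_vcpus = parse_vcpus(mon.get("vcpus"))
            # Assumed rule: a monitor may only cover vcpus of its cachetune.
            if not mon_vcpus <= alloc_vcpus:
                raise ValueError("monitor vcpus outside cachetune vcpus")
            groups.append((sorted(alloc_vcpus), sorted(mon_vcpus)))
    return groups


if __name__ == "__main__":
    for alloc, mon in monitor_groups(SAMPLE_CPUTUNE_XML):
        print("allocation vcpus", alloc, "monitor vcpus", mon)
```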
2.3 Show CMT results through the 'domstats' command
An interface is added to the qemu driver to report this information for
resource monitoring groups through the command
'virsh domstats --cpu-total'.
Below is a typical output:
# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
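Tools that consume this output could fold the flat 'cpu.cache.*' keys into one record per monitor. The following is a minimal sketch that assumes the key layout shown above; the function name parse_cache_stats is hypothetical and is not part of libvirt.

```python
import re

SAMPLE_DOMSTATS = """\
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
"""


def parse_cache_stats(text):
    """Group 'cpu.cache.<n>.*' lines into one dict per monitor index."""
    monitors = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a key=value line
        match = re.fullmatch(r"cpu\.cache\.(\d+)\.(.+)", key)
        if not match:
            continue  # e.g. cpu.cache.monitor.count or unrelated stats
        index, field = int(match.group(1)), match.group(2)
        monitor = monitors.setdefault(index, {"banks": {}})
        bank = re.fullmatch(r"bank\.(\d+)\.(id|bytes)", field)
        if bank:
            entry = monitor["banks"].setdefault(int(bank.group(1)), {})
            entry[bank.group(2)] = int(value)
        elif field in ("name", "vcpus"):
            monitor[field] = value
    return monitors


if __name__ == "__main__":
    for index, monitor in parse_cache_stats(SAMPLE_DOMSTATS).items():
        total = sum(b["bytes"] for b in monitor["banks"].values())
        print(index, monitor["name"], "total bytes:", total)
```

Summing the per-bank 'bytes' values gives the total last level cache occupancy of a monitoring group across all cache banks.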
**Changes Since RFCv3**
In the output of 'domstats', added
'cpu.cache.<cmt_group_index>.bank.<bank_index>.id'
to report the OS-assigned cache bank id of the current cache.
Changes are prefixed with a '+':
# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
+ cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
+ cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
+ cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
+ cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
Wang Huaqiang (10):
conf: Renamed 'controlBuf' to 'childrenBuf'
util: add interface retrieving CMT capability
conf: Add CMT capability to host
test: add test case for resctrl monitor
util: resctrl: refactoring some functions
util: Introduce resctrl monitor for CMT
conf: refactor virDomainResctrlAppend
conf: introduce resctrl monitor group in domain
qemu: Introduce resctrl monitoring group
qemu: Report cache occupancy (CMT) with domstats
.gnulib | 1 -
docs/formatdomain.html.in | 14 +-
docs/schemas/capability.rng | 28 +
docs/schemas/domaincommon.rng | 11 +-
src/conf/capabilities.c | 51 +-
src/conf/capabilities.h | 1 +
src/conf/domain_conf.c | 159 +++++-
src/conf/domain_conf.h | 20 +
src/libvirt-domain.c | 9 +
src/libvirt_private.syms | 6 +
src/qemu/qemu_driver.c | 265 ++++++++-
src/qemu/qemu_process.c | 40 +-
src/util/virresctrl.c | 597 +++++++++++++++++++--
src/util/virresctrl.h | 48 +-
tests/genericxml2xmlindata/cachetune-cdp.xml | 2 +
.../cachetune-colliding-monitors.xml | 36 ++
tests/genericxml2xmlindata/cachetune-small.xml | 1 +
tests/genericxml2xmlindata/cachetune.xml | 3 +
tests/genericxml2xmltest.c | 4 +
.../resctrl/info/L3_MON/max_threshold_occupancy | 1 +
.../linux-resctrl/resctrl/info/L3_MON/mon_features | 3 +
.../linux-resctrl/resctrl/info/L3_MON/num_rmids | 1 +
tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 6 +
23 files changed, 1208 insertions(+), 99 deletions(-)
delete mode 160000 .gnulib
create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitors.xml
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/max_threshold_occupancy
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/mon_features
create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/num_rmids
--
2.7.4