[libvirt] [PATCH 0/5] Introduce the support of Intel RDT-MBM

RDT is short for Intel Resource Director Technology, which currently consists of four sub-technologies:

 - CAT, for cache allocation
 - CMT, for cache usage monitoring
 - MBA, for memory bandwidth allocation
 - MBM, for memory bandwidth usage monitoring

The Linux kernel interface is the 'resctrl' file system. We have already implemented support for CAT, CMT and MBA, covering tasks such as allocating part of the shared CPU last level cache to a particular domain vcpu or list of vcpus, monitoring cache usage, and allocating an amount of memory bandwidth to specific domain vcpu(s). This series introduces support for MBM.

Basically the interfaces are:

** Identify host capability **

Similar to identifying the host capability of CMT, it can be read from the output of 'virsh capabilities'; if the following elements are found, then MBM is supported:

    <memory_bandwidth>
      <monitor maxMonitors='176'>
        <feature name='mbm_total_bytes'/>
        <feature name='mbm_local_bytes'/>
      </monitor>
    </memory_bandwidth>

'mbm_total_bytes' means the memory bandwidth used by the vcpu(s) of a specific monitor can be reported across all CPU sockets. 'mbm_local_bytes' means the memory bandwidth used by vcpu(s) passing through the local CPU socket can be reported.

** Create monitor group **

The monitor group for specific domain vcpus, for example vcpus 0-4, is defined in the domain configuration file like this:

    <cputune>
      <memorytune vcpus='0-4'>
        <monitor vcpus='0-4'/>
      </memorytune>
    </cputune>

** Report memory usage **

Introduced an option '--memory' for the 'virsh domstats' command to show the memory bandwidth usage (the format is very similar to that of the CMT result):
    # virsh domstats --memory
    Domain: 'libvirt-vm'
      memory.bandwidth.monitor.count=4
      memory.bandwidth.monitor.0.name=vcpus_0-4
      memory.bandwidth.monitor.0.vcpus=0-4
      memory.bandwidth.monitor.0.node.count=2
      memory.bandwidth.monitor.0.node.0.id=0
      memory.bandwidth.monitor.0.node.0.bytes.total=14201651200
      memory.bandwidth.monitor.0.node.0.bytes.local=7369809920
      memory.bandwidth.monitor.0.node.1.id=1
      memory.bandwidth.monitor.0.node.1.bytes.total=188897640448
      memory.bandwidth.monitor.0.node.1.bytes.local=170044047360

Huaqiang (5):
  util, resctrl: using 64bit interface instead of 32bit for counters
  conf: showing cache/memoryBW monitor features in capabilities
  cachetune schema: a looser check for the order of <cache> and <monitor> element
  conf: Parse domain configure file for memorytune monitors
  virsh: show memoryBW info in 'virsh domstats' command

 docs/schemas/domaincommon.rng              |  91 +++++++++---------
 include/libvirt/libvirt-domain.h           |   1 +
 src/conf/capabilities.c                    |   4 +-
 src/conf/domain_conf.c                     |  44 +++++++--
 src/libvirt-domain.c                       |  21 +++++
 src/qemu/qemu_driver.c                     | 103 ++++++++++++++++++++-
 src/util/virfile.c                         |  40 ++++++++
 src/util/virfile.h                         |   2 +
 src/util/virresctrl.c                      |   6 +-
 src/util/virresctrl.h                      |   2 +-
 tests/genericxml2xmlindata/cachetune.xml   |   1 +
 tests/genericxml2xmlindata/memorytune.xml  |   5 +
 tests/genericxml2xmloutdata/cachetune.xml  |  34 +++++++
 tests/genericxml2xmloutdata/memorytune.xml |  42 +++++++++
 tests/genericxml2xmltest.c                 |   4 +-
 tools/virsh-domain-monitor.c               |   7 ++
 tools/virsh.pod                            |  23 ++++-
 17 files changed, 367 insertions(+), 63 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/cachetune.xml
 create mode 100644 tests/genericxml2xmloutdata/memorytune.xml
--
2.23.0

From: Huaqiang <huaqiang.wang@intel.com>

The underlying resctrl monitoring actually uses 64bit counters, not
32bit ones. Correct this by using 64bit interfaces.

Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 src/qemu/qemu_driver.c |  4 ++--
 src/util/virfile.c     | 40 ++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h     |  2 ++
 src/util/virresctrl.c  |  6 +++---
 src/util/virresctrl.h  |  2 +-
 5 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f4ff2ba292..e396358871 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -20587,8 +20587,8 @@ qemuDomainGetStatsCpuCache(virQEMUDriverPtr driver,
                                        "cpu.cache.monitor.%zu.bank.%zu.id", i, j) < 0)
                 goto cleanup;

-            if (virTypedParamListAddUInt(params, resdata[i]->stats[j]->vals[0],
-                                         "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
+            if (virTypedParamListAddULLong(params, resdata[i]->stats[j]->vals[0],
+                                           "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
                 goto cleanup;
         }
     }
diff --git a/src/util/virfile.c b/src/util/virfile.c
index ced0ea70b7..372498e69e 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -4204,6 +4204,46 @@ virFileReadValueUint(unsigned int *value, const char *format, ...)
 }


+/**
+ * virFileReadValueUllong:
+ * @value: pointer to unsigned long long to be filled in with the value
+ * @format, ...: file to read from
+ *
+ * Read an unsigned long long from @format and put it into @value.
+ *
+ * Return -2 for non-existing file, -1 on other errors and 0 if everything went
+ * fine.
+ */
+int
+virFileReadValueUllong(unsigned long long *value, const char *format, ...)
+{
+    g_autofree char *str = NULL;
+    g_autofree char *path = NULL;
+    va_list ap;
+
+    va_start(ap, format);
+    path = g_strdup_vprintf(format, ap);
+    va_end(ap);
+
+    if (!virFileExists(path))
+        return -2;
+
+    if (virFileReadAll(path, INT_BUFSIZE_BOUND(*value), &str) < 0)
+        return -1;
+
+    virStringTrimOptionalNewline(str);
+
+    if (virStrToLong_ullp(str, NULL, 10, value) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Invalid unsigned long long value '%s' in file '%s'"),
+                       str, path);
+        return -1;
+    }
+
+    return 0;
+}
+
+
 /**
  * virFileReadValueScaledInt:
  * @value: pointer to unsigned long long int to be filled in with the value
diff --git a/src/util/virfile.h b/src/util/virfile.h
index a570529330..f77d8501c3 100644
--- a/src/util/virfile.h
+++ b/src/util/virfile.h
@@ -363,6 +363,8 @@ int virFileReadValueInt(int *value, const char *format, ...)
  G_GNUC_PRINTF(2, 3);
 int virFileReadValueUint(unsigned int *value, const char *format, ...)
  G_GNUC_PRINTF(2, 3);
+int virFileReadValueUllong(unsigned long long *value, const char *format, ...)
+ G_GNUC_PRINTF(2, 3);
 int virFileReadValueBitmap(virBitmapPtr *value, const char *format, ...)
  G_GNUC_PRINTF(2, 3);
 int virFileReadValueScaledInt(unsigned long long *value, const char *format, ...)
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index b78fded026..684d2ce398 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2678,7 +2678,7 @@ virResctrlMonitorGetStats(virResctrlMonitorPtr monitor,
     int rv = -1;
     int ret = -1;
     size_t i = 0;
-    unsigned int val = 0;
+    unsigned long long val = 0;
     DIR *dirp = NULL;
     char *datapath = NULL;
     char *filepath = NULL;
@@ -2734,8 +2734,8 @@ virResctrlMonitorGetStats(virResctrlMonitorPtr monitor,
         goto cleanup;

     for (i = 0; resources[i]; i++) {
-        rv = virFileReadValueUint(&val, "%s/%s/%s", datapath,
-                                  ent->d_name, resources[i]);
+        rv = virFileReadValueUllong(&val, "%s/%s/%s", datapath,
+                                    ent->d_name, resources[i]);
         if (rv == -2) {
             virReportError(VIR_ERR_INTERNAL_ERROR,
                            _("File '%s/%s/%s' does not exist."),
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 3dd7c96348..11f275acf4 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -204,7 +204,7 @@ struct _virResctrlMonitorStats {
     char **features;
     /* @vals store the statistical record values and @val[0] is the value for
      * @features[0], @val[1] for @features[1] ... respectively */
-    unsigned int *vals;
+    unsigned long long *vals;
     /* The length of @vals array */
     size_t nvals;
 };
--
2.23.0

On Thu, Nov 14, 2019 at 01:08:19AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
The underlying resctrl monitoring actually uses 64bit counters, not 32bit ones. Correct this by using 64bit interfaces.
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 src/qemu/qemu_driver.c |  4 ++--
 src/util/virfile.c     | 40 ++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h     |  2 ++
 src/util/virresctrl.c  |  6 +++---
 src/util/virresctrl.h  |  2 +-
 5 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f4ff2ba292..e396358871 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -20587,8 +20587,8 @@ qemuDomainGetStatsCpuCache(virQEMUDriverPtr driver,
                                        "cpu.cache.monitor.%zu.bank.%zu.id", i, j) < 0)
                 goto cleanup;

-            if (virTypedParamListAddUInt(params, resdata[i]->stats[j]->vals[0],
-                                         "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
+            if (virTypedParamListAddULLong(params, resdata[i]->stats[j]->vals[0],
+                                           "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
                 goto cleanup;
         }
     }
Urgh, we cannot do this, as it changes API semantics for applications.
Apps are expecting this field to be encoded as UInt & so the change
will break their decoding.

Is this 32 vs 64-bit difference actually a problem in the real world?
eg can the 32-bit value genuinely overflow in real deployments of this
feature?

If not, then we should not change this at all.

If we do need to change this though, the only option is to leave the
current field unchanged, and document that it can be truncated.

Then introduce a new field with a different name

eg

  cpu.cache.monitor.%zu.bank.%zu.bytes64

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hi Daniel,

Thanks for your review.

About patch 1/5, I understand we should be very cautious when changing an established interface. I'd like to preserve the old 32bit interface and propose a new 64bit interface, just as you suggested, if you still think so after reading my intention.

Actually, what I want here is to fix a bug; maybe I should have titled this patch as a *bug fix*. The reason is that the underlying hardware counters, for the cache monitor and memory bandwidth monitor, are 64 bits wide, so using a 32bit interface to access them is problematic.

This bug was not found earlier because, before this patch, the interface was only used for tracking the amount of cache in use, and the occupied cache will normally not exceed the 4GB range (a 32bit counter can count values up to 4GB). But for memory, the counter records the data passing through the memory controller issued by this CPU, in bytes and cumulatively, so the value can easily exceed the 4GB bound. That is why I don't want to keep the old 32bit interface for users: it will report incorrect values.
-----Original Message-----
From: Daniel P. Berrangé <berrange@redhat.com>
Sent: Friday, December 13, 2019 11:01 PM
To: Wang, Huaqiang <huaqiang.wang@intel.com>
Cc: libvir-list@redhat.com
Subject: Re: [libvirt] [PATCH 1/5] util, resctrl: using 64bit interface instead of 32bit for counters
On Thu, Nov 14, 2019 at 01:08:19AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
The underlying resctrl monitoring actually uses 64bit counters, not 32bit ones. Correct this by using 64bit interfaces.
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 src/qemu/qemu_driver.c |  4 ++--
 src/util/virfile.c     | 40 ++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h     |  2 ++
 src/util/virresctrl.c  |  6 +++---
 src/util/virresctrl.h  |  2 +-
 5 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f4ff2ba292..e396358871 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -20587,8 +20587,8 @@ qemuDomainGetStatsCpuCache(virQEMUDriverPtr driver,
                                        "cpu.cache.monitor.%zu.bank.%zu.id", i, j) < 0)
                 goto cleanup;

-            if (virTypedParamListAddUInt(params, resdata[i]->stats[j]->vals[0],
-                                         "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
+            if (virTypedParamListAddULLong(params, resdata[i]->stats[j]->vals[0],
+                                           "cpu.cache.monitor.%zu.bank.%zu.bytes", i, j) < 0)
                 goto cleanup;
         }
     }
Urgh, we cannot do this, as it changes API semantics for applications.
Apps are expecting this field to be encoded as UInt & so the change will break their decoding.
Is this 32 vs 64-bit difference actually a problem in the real world?
eg can the 32-bit value genuinely overflow in real deployments of this feature?
Yes. The underlying interface is 64bit, and, for the memory monitor, it tracks and adds up the data passing through the memory controller belonging to the monitor, in bytes, so a 32-bit counter can easily overflow in a very short time, for example just one second.

This bug/issue was introduced in commit 3f2214c2cdd9751df92ec9aa801e6161fa1a7009.
If not, then we should not change this at all.
If we do need to change this though, the only option is to leave the current field unchanged, and document that it can be truncated.
Then introduce a new field with a different name
eg cpu.cache.monitor.%zu.bank.%zu.bytes64
Regards,
Daniel
Thanks,
Huaqiang

On Sun, Dec 15, 2019 at 02:39:20AM +0000, Wang, Huaqiang wrote:
Hi Daniel,
Thanks for your review.
About patch 1/5, I understand we should be very cautious when changing an established interface.
I'd like to preserve the old 32bit interface and propose a new 64bit interface, just as you suggested, if you still think so after reading my intention. Actually, what I want here is to fix a bug; maybe I should have titled this patch as a *bug fix*. The reason is that the underlying hardware counters, for the cache monitor and memory bandwidth monitor, are 64 bits wide, so using a 32bit interface to access them is problematic. This bug was not found earlier because, before this patch, the interface was only used for tracking the amount of cache in use, and the occupied cache will normally not exceed the 4GB range (a 32bit counter can count values up to 4GB). But for memory, the counter records the data passing through the memory controller issued by this CPU, in bytes and cumulatively, so the value can easily exceed the 4GB bound. That is why I don't want to keep the old 32bit interface for users: it will report incorrect values.
We simply have to document the limitation of the old interface. We can *NOT* change it, because it WILL break API compatibility for apps that deserialize the current data.

Regards,
Daniel

On 2019/12/16 6:12 PM, Daniel P. Berrangé wrote:
On Sun, Dec 15, 2019 at 02:39:20AM +0000, Wang, Huaqiang wrote:
Hi Daniel,
Thanks for your review.
About patch 1/5, I understand we should be very cautious when changing an established interface.
I'd like to preserve the old 32bit interface and propose a new 64bit interface, just as you suggested, if you still think so after reading my intention. Actually, what I want here is to fix a bug; maybe I should have titled this patch as a *bug fix*. The reason is that the underlying hardware counters, for the cache monitor and memory bandwidth monitor, are 64 bits wide, so using a 32bit interface to access them is problematic. This bug was not found earlier because, before this patch, the interface was only used for tracking the amount of cache in use, and the occupied cache will normally not exceed the 4GB range (a 32bit counter can count values up to 4GB). But for memory, the counter records the data passing through the memory controller issued by this CPU, in bytes and cumulatively, so the value can easily exceed the 4GB bound. That is why I don't want to keep the old 32bit interface for users: it will report incorrect values.

We simply have to document the limitation of the old interface. We can *NOT* change it, because it WILL break API compatibility for apps that deserialize the current data.
Got it. I'll resend new patches for review.

Thanks,
Huaqiang
Regards, Daniel

From: Huaqiang <huaqiang.wang@intel.com>

We learned that the hardware features of CAT, CMT, MBA and MBM are
orthogonal: if CAT or MBA is not supported on the system but CMT or
MBM is, then the cache monitor or memoryBW monitor features may not be
correctly displayed in the host capabilities through the command
'virsh capabilities'.

Show the cache/memoryBW monitor capabilities even if there is no
support for the cache allocation or memoryBW allocation features.

Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 src/conf/capabilities.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c
index 7edec75c31..a048427241 100644
--- a/src/conf/capabilities.c
+++ b/src/conf/capabilities.c
@@ -939,7 +939,7 @@ virCapabilitiesFormatCaches(virBufferPtr buf,
     size_t i = 0;
     size_t j = 0;

-    if (!cache->nbanks)
+    if (!cache->nbanks && !cache->monitor)
         return 0;

     virBufferAddLit(buf, "<cache>\n");
@@ -1025,7 +1025,7 @@ virCapabilitiesFormatMemoryBandwidth(virBufferPtr buf,
 {
     size_t i = 0;

-    if (!memBW->nnodes)
+    if (!memBW->nnodes && !memBW->monitor)
         return 0;

     virBufferAddLit(buf, "<memory_bandwidth>\n");
--
2.23.0

On Thu, Nov 14, 2019 at 01:08:20AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
We learned that the hardware features of CAT, CMT, MBA and MBM are orthogonal: if CAT or MBA is not supported on the system but CMT or MBM is, then the cache monitor or memoryBW monitor features may not be correctly displayed in the host capabilities through the command 'virsh capabilities'.
Show the cache/memoryBW monitor capabilities even if there is no support for the cache allocation or memoryBW allocation features.
Signed-off-by: Huaqiang <huaqiang.wang@intel.com> --- src/conf/capabilities.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

From: Huaqiang <huaqiang.wang@intel.com>

Originally, inside <cputune/cachetune>, the schema required the <cache>
element to be placed before <monitor>, and the following configuration
was not permitted by the schema, but it is better to allow it:

  <cputune>
    <cachetune vcpus='0-1'>
      <monitor level='3' vcpus='0-1'/>
       ^
       |__ Not permitted originally, because it is placed before
           the <cache> element.

      <cache id='0' level='3' type='both' size='3' unit='MiB'/>
      <cache id='1' level='3' type='both' size='3' unit='MiB'/>
    </cachetune>
    ...
  </cputune>

Also, let the schema do a stricter check by identifying the following
configuration as invalid, since a <cachetune> should contain at least
one <cache> or <monitor> element:

  <cputune>
    <cachetune vcpus='0-1'>
     ^
     |__ a <cachetune> SHOULD contain at least one <cache> or <monitor>
    </cachetune>
    ...
  </cputune>

Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 docs/schemas/domaincommon.rng             | 68 +++++++++++------------
 tests/genericxml2xmlindata/cachetune.xml  |  1 +
 tests/genericxml2xmloutdata/cachetune.xml | 34 ++++++++++++
 tests/genericxml2xmltest.c                |  2 +-
 4 files changed, 70 insertions(+), 35 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/cachetune.xml

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index e06f892da3..aa4f512e5c 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -980,41 +980,41 @@
           <attribute name="vcpus">
             <ref name='cpuset'/>
           </attribute>
-          <zeroOrMore>
-            <element name="cache">
-              <attribute name="id">
-                <ref name='unsignedInt'/>
-              </attribute>
-              <attribute name="level">
-                <ref name='unsignedInt'/>
-              </attribute>
-              <attribute name="type">
-                <choice>
-                  <value>both</value>
-                  <value>code</value>
-                  <value>data</value>
-                </choice>
-              </attribute>
-              <attribute name="size">
-                <ref name='unsignedLong'/>
-              </attribute>
-              <optional>
-                <attribute name='unit'>
-                  <ref name='unit'/>
+          <oneOrMore>
+            <choice>
+              <element name="cache">
+                <attribute name="id">
+                  <ref name='unsignedInt'/>
                 </attribute>
-              </optional>
-            </element>
-          </zeroOrMore>
-          <zeroOrMore>
-            <element name="monitor">
-              <attribute name="level">
-                <ref name='unsignedInt'/>
-              </attribute>
-              <attribute name="vcpus">
-                <ref name='cpuset'/>
-              </attribute>
-            </element>
-          </zeroOrMore>
+                <attribute name="level">
+                  <ref name='unsignedInt'/>
+                </attribute>
+                <attribute name="type">
+                  <choice>
+                    <value>both</value>
+                    <value>code</value>
+                    <value>data</value>
+                  </choice>
+                </attribute>
+                <attribute name="size">
+                  <ref name='unsignedLong'/>
+                </attribute>
+                <optional>
+                  <attribute name='unit'>
+                    <ref name='unit'/>
+                  </attribute>
+                </optional>
+              </element>
+              <element name="monitor">
+                <attribute name="level">
+                  <ref name='unsignedInt'/>
+                </attribute>
+                <attribute name="vcpus">
+                  <ref name='cpuset'/>
+                </attribute>
+              </element>
+            </choice>
+          </oneOrMore>
         </element>
       </zeroOrMore>
       <zeroOrMore>
diff --git a/tests/genericxml2xmlindata/cachetune.xml b/tests/genericxml2xmlindata/cachetune.xml
index 645cab7771..eda2ca6fb6 100644
--- a/tests/genericxml2xmlindata/cachetune.xml
+++ b/tests/genericxml2xmlindata/cachetune.xml
@@ -6,6 +6,7 @@
   <vcpu placement='static'>4</vcpu>
   <cputune>
     <cachetune vcpus='0-1'>
+      <monitor level='3' vcpus='0-1'/>
       <cache id='0' level='3' type='both' size='3' unit='MiB'/>
       <cache id='1' level='3' type='both' size='3' unit='MiB'/>
     </cachetune>
diff --git a/tests/genericxml2xmloutdata/cachetune.xml b/tests/genericxml2xmloutdata/cachetune.xml
new file mode 100644
index 0000000000..dcde0ebc2a
--- /dev/null
+++ b/tests/genericxml2xmloutdata/cachetune.xml
@@ -0,0 +1,34 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>4</vcpu>
+  <cputune>
+    <cachetune vcpus='0-1'>
+      <cache id='0' level='3' type='both' size='3' unit='MiB'/>
+      <cache id='1' level='3' type='both' size='3' unit='MiB'/>
+      <monitor level='3' vcpus='0-1'/>
+    </cachetune>
+    <cachetune vcpus='3'>
+      <cache id='0' level='3' type='both' size='3' unit='MiB'/>
+    </cachetune>
+  </cputune>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-i686</emulator>
+    <controller type='usb' index='0'/>
+    <controller type='ide' index='0'/>
+    <controller type='pci' index='0' model='pci-root'/>
+    <input type='mouse' bus='ps2'/>
+    <input type='keyboard' bus='ps2'/>
+    <memballoon model='virtio'/>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
index 0d04413712..62005a5393 100644
--- a/tests/genericxml2xmltest.c
+++ b/tests/genericxml2xmltest.c
@@ -125,9 +125,9 @@ mymain(void)
     DO_TEST_FULL("chardev-reconnect-invalid-mode", 0, false,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);

-    DO_TEST("cachetune");
     DO_TEST("cachetune-small");
     DO_TEST("cachetune-cdp");
+    DO_TEST_DIFFERENT("cachetune");
     DO_TEST_DIFFERENT("cachetune-extra-tunes");
     DO_TEST_FULL("cachetune-colliding-allocs", false, true,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
--
2.23.0

On Thu, Nov 14, 2019 at 01:08:21AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
Originally, inside <cputune/cachetune>, the schema required the <cache> element to be placed before <monitor>, and the following configuration was not permitted by the schema, but it is better to allow it:
  <cputune>
    <cachetune vcpus='0-1'>
      <monitor level='3' vcpus='0-1'/>
       ^
       |__ Not permitted originally, because it is placed before
           the <cache> element.

      <cache id='0' level='3' type='both' size='3' unit='MiB'/>
      <cache id='1' level='3' type='both' size='3' unit='MiB'/>
    </cachetune>
    ...
  </cputune>
Also, let the schema do a stricter check by identifying the following configuration as invalid, since a <cachetune> should contain at least one <cache> or <monitor> element:
  <cputune>
    <cachetune vcpus='0-1'>
     ^
     |__ a <cachetune> SHOULD contain at least one <cache> or <monitor>
    </cachetune>
    ...
  </cputune>
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 docs/schemas/domaincommon.rng             | 68 +++++++++++------------
 tests/genericxml2xmlindata/cachetune.xml  |  1 +
 tests/genericxml2xmloutdata/cachetune.xml | 34 ++++++++++++
 tests/genericxml2xmltest.c                |  2 +-
 4 files changed, 70 insertions(+), 35 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/cachetune.xml
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel

From: Huaqiang <huaqiang.wang@intel.com>

Create memory bandwidth monitor.

The following domain configuration changes create two memory bandwidth
monitors: one monitoring the bandwidth consumed by vCPU 0, another for
vCPU 5.

```
  <cputune>
    <memorytune vcpus='0-4'>
      <node id='0' bandwidth='20'/>
      <node id='1' bandwidth='30'/>
+     <monitor vcpus='0'/>
    </memorytune>
+   <memorytune vcpus='5'>
+     <monitor vcpus='5'/>
+   </memorytune>
  </cputune>
```

Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 docs/schemas/domaincommon.rng              | 23 +++++++----
 src/conf/domain_conf.c                     | 44 +++++++++++++++-----
 tests/genericxml2xmlindata/memorytune.xml  |  5 +++
 tests/genericxml2xmloutdata/memorytune.xml | 42 +++++++++++++++++++
 tests/genericxml2xmltest.c                 |  2 +-
 5 files changed, 98 insertions(+), 18 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/memorytune.xml

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index aa4f512e5c..adf1fcb36d 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1023,14 +1023,21 @@
             <ref name='cpuset'/>
           </attribute>
           <oneOrMore>
-            <element name="node">
-              <attribute name="id">
-                <ref name='unsignedInt'/>
-              </attribute>
-              <attribute name="bandwidth">
-                <ref name='unsignedInt'/>
-              </attribute>
-            </element>
+            <choice>
+              <element name="node">
+                <attribute name="id">
+                  <ref name='unsignedInt'/>
+                </attribute>
+                <attribute name="bandwidth">
+                  <ref name='unsignedInt'/>
+                </attribute>
+              </element>
+              <element name="monitor">
+                <attribute name="vcpus">
+                  <ref name='cpuset'/>
+                </attribute>
+              </element>
+            </choice>
           </oneOrMore>
         </element>
       </zeroOrMore>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 875490dd2b..51bf18a42b 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19639,10 +19639,14 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
 {
     VIR_XPATH_NODE_AUTORESTORE(ctxt);
     virDomainResctrlDefPtr resctrl = NULL;
+    virDomainResctrlDefPtr newresctrl = NULL;
     g_autoptr(virBitmap) vcpus = NULL;
     g_autofree xmlNodePtr *nodes = NULL;
     g_autoptr(virResctrlAlloc) alloc = NULL;
     ssize_t i = 0;
+    size_t nmons = 0;
+    size_t ret = -1;
+    int n;

     ctxt->node = node;

@@ -19669,29 +19673,44 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
         return -1;
     }

+    /* First, parse <memorytune/node> element if any <node> element exists */
     for (i = 0; i < n; i++) {
         if (virDomainMemorytuneDefParseMemory(ctxt, nodes[i], alloc) < 0)
             return -1;
     }
-    if (n == 0)
-        return 0;
-
     /*
      * If this is a new allocation, format ID and append to resctrl, otherwise
      * just update the existing alloc information, which is done in above
      * virDomainMemorytuneDefParseMemory
      */
     if (!resctrl) {
-        if (!(resctrl = virDomainResctrlNew(node, alloc, vcpus, flags)))
+        if (!(newresctrl = virDomainResctrlNew(node, alloc, vcpus, flags)))
             return -1;

-        if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0) {
-            virDomainResctrlDefFree(resctrl);
-            return -1;
-        }
+        resctrl = newresctrl;
     }

-    return 0;
+    /* Next, parse <memorytune/monitor> element */
+    nmons = resctrl->nmonitors;
+    if (virDomainResctrlMonDefParse(def, ctxt, node,
+                                    VIR_RESCTRL_MONITOR_TYPE_MEMBW,
+                                    resctrl) < 0)
+        goto cleanup;
+
+    nmons = resctrl->nmonitors - nmons;
+    /* Now @nmons contains the new <monitor> element number found in current
+     * <memorytune> element, and @n holds the number of new <node> element,
+     * only append the new @newresctrl object to domain if any of them is
+     * not zero. */
+    if (newresctrl && (nmons || n)) {
+        if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, newresctrl) < 0)
+            goto cleanup;
+    }
+
+    ret = 0;
+ cleanup:
+    virDomainResctrlDefFree(newresctrl);
+    return ret;
 }


@@ -27607,6 +27626,7 @@ virDomainMemorytuneDefFormat(virBufferPtr buf,
 {
     g_auto(virBuffer) childrenBuf = VIR_BUFFER_INITIALIZER;
     g_autofree char *vcpus = NULL;
+    size_t i = 0;

     virBufferSetChildIndent(&childrenBuf, buf);
     if (virResctrlAllocForeachMemory(resctrl->alloc,
                                      &childrenBuf) < 0)
         return -1;

+    for (i = 0; i < resctrl->nmonitors; i++) {
+        if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
+                                               VIR_RESCTRL_MONITOR_TYPE_MEMBW,
+                                               &childrenBuf) < 0)
+            return -1;
+    }

     if (!virBufferUse(&childrenBuf))
         return 0;
diff --git a/tests/genericxml2xmlindata/memorytune.xml b/tests/genericxml2xmlindata/memorytune.xml
index 7486b542c5..8d86ce4282 100644
--- a/tests/genericxml2xmlindata/memorytune.xml
+++ b/tests/genericxml2xmlindata/memorytune.xml
@@ -10,12 +10,17 @@
       <cache id='1' level='3' type='both' size='768' unit='KiB'/>
     </cachetune>
     <memorytune vcpus='0-1'>
+      <monitor vcpus='0-1'/>
       <node id='0' bandwidth='20'/>
+      <monitor vcpus='0'/>
       <node id='1' bandwidth='30'/>
     </memorytune>
     <memorytune vcpus='3'>
       <node id='0' bandwidth='50'/>
     </memorytune>
+    <memorytune vcpus='2'>
+      <monitor vcpus='2'/>
+    </memorytune>
   </cputune>
   <os>
     <type arch='i686' machine='pc'>hvm</type>
diff --git a/tests/genericxml2xmloutdata/memorytune.xml b/tests/genericxml2xmloutdata/memorytune.xml
new file mode 100644
index 0000000000..f996435a6e
--- /dev/null
+++ b/tests/genericxml2xmloutdata/memorytune.xml
@@ -0,0 +1,42 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>4</vcpu>
+  <cputune>
+    <cachetune vcpus='0-1'>
+      <cache id='0' level='3' type='both' size='768' unit='KiB'/>
+      <cache id='1' level='3' type='both' size='768' unit='KiB'/>
+    </cachetune>
+    <memorytune vcpus='0-1'>
+      <node id='0' bandwidth='20'/>
+      <node id='1' bandwidth='30'/>
+      <monitor vcpus='0-1'/>
+      <monitor vcpus='0'/>
+    </memorytune>
+    <memorytune vcpus='3'>
+      <node id='0' bandwidth='50'/>
+    </memorytune>
+    <memorytune vcpus='2'>
+      <monitor vcpus='2'/>
+    </memorytune>
+  </cputune>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-i686</emulator>
+    <controller type='usb' index='0'/>
+    <controller type='ide' index='0'/>
+    <controller type='pci' index='0' model='pci-root'/>
+    <input type='mouse' bus='ps2'/>
+    <input type='keyboard' bus='ps2'/>
+    <memballoon model='virtio'/>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
index 62005a5393..93f90f7cfd 100644
--- a/tests/genericxml2xmltest.c
+++ b/tests/genericxml2xmltest.c
@@ -137,7 +137,7 @@ mymain(void)
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
     DO_TEST_FULL("cachetune-colliding-monitor", false, true,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
-    DO_TEST("memorytune");
+    DO_TEST_DIFFERENT("memorytune");
     DO_TEST_FULL("memorytune-colliding-allocs", false, true,
                  TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
     DO_TEST_FULL("memorytune-colliding-cachetune", false, true,
--
2.23.0

On Thu, Nov 14, 2019 at 01:08:22AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
Create memory bandwidth monitor.
Following domain configuration changes create two memory bandwidth monitors: one is monitoring the bandwidth consumed by vCPU 0, another is for vCPU 5.
```
 <cputune>
   <memorytune vcpus='0-4'>
     <node id='0' bandwidth='20'/>
     <node id='1' bandwidth='30'/>
+    <monitor vcpus='0'/>
   </memorytune>
+  <memorytune vcpus='5'>
+    <monitor vcpus='5'/>
+  </memorytune>
 </cputune>
```
Signed-off-by: Huaqiang <huaqiang.wang@intel.com>
---
 docs/schemas/domaincommon.rng | 23 +++++++----
 src/conf/domain_conf.c | 44 +++++++++++++++++-----
 tests/genericxml2xmlindata/memorytune.xml | 5 +++
 tests/genericxml2xmloutdata/memorytune.xml | 42 +++++++++++++++++++++
 tests/genericxml2xmltest.c | 2 +-
 5 files changed, 98 insertions(+), 18 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/memorytune.xml
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

From: Huaqiang <huaqiang.wang@intel.com> Introduce an option '--memory' for showing memory related information. The memory bandwidth infomatio is listed as: Domain: 'libvirt-vm' memory.bandwidth.monitor.count=4 memory.bandwidth.monitor.0.name=vcpus_0-4 memory.bandwidth.monitor.0.vcpus=0-4 memory.bandwidth.monitor.0.node.count=2 memory.bandwidth.monitor.0.node.0.id=0 memory.bandwidth.monitor.0.node.0.bytes.total=10208067584 memory.bandwidth.monitor.0.node.0.bytes.local=4807114752 memory.bandwidth.monitor.0.node.1.id=1 memory.bandwidth.monitor.0.node.1.bytes.total=8693735424 memory.bandwidth.monitor.0.node.1.bytes.local=5850161152 memory.bandwidth.monitor.1.name=vcpus_7 memory.bandwidth.monitor.1.vcpus=7 memory.bandwidth.monitor.1.node.count=2 memory.bandwidth.monitor.1.node.0.id=0 memory.bandwidth.monitor.1.node.0.bytes.total=853811200 memory.bandwidth.monitor.1.node.0.bytes.local=290701312 memory.bandwidth.monitor.1.node.1.id=1 memory.bandwidth.monitor.1.node.1.bytes.total=406044672 memory.bandwidth.monitor.1.node.1.bytes.local=229425152 Signed-off-by: Huaqiang <huaqiang.wang@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/libvirt-domain.c | 21 +++++++ src/qemu/qemu_driver.c | 99 ++++++++++++++++++++++++++++++++ tools/virsh-domain-monitor.c | 7 +++ tools/virsh.pod | 23 +++++++- 5 files changed, 149 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 22277b0a84..2b621ff162 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -2146,6 +2146,7 @@ typedef enum { VIR_DOMAIN_STATS_BLOCK = (1 << 5), /* return domain block info */ VIR_DOMAIN_STATS_PERF = (1 << 6), /* return domain perf event info */ VIR_DOMAIN_STATS_IOTHREAD = (1 << 7), /* return iothread poll info */ + VIR_DOMAIN_STATS_MEMORY= (1 << 8), /* return domain memory info */ } virDomainStatsTypes; typedef enum { diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index dcab179e6e..c8c543ccde 100644 
--- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11641,6 +11641,27 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * hypervisor to choose how to shrink the * polling time. * + * VIR_DOMAIN_STATS_MEMORY: + * Return memory bandwidth statistics and the usage information. The typed + * parameter keys are in this format: + * + * "memory.bandwidth.monitor.count" - the number of memory bandwidth + * monitors for this domain + * "memory.bandwidth.monitor.<num>.name" - the name of monitor <num> + * "memory.bandwidth.monitor.<num>.vcpus" - the vcpu list of monitor <num> + * "memory.bandwidth.monitor.<num>.node.count" - the number of memory + * controller in monitor <num> + * "memory.bandwidth.monitor.<num>.node.<index>.id" - host allocated memory + * controller id for controller + * <index> of monitor <num> + * "memory.bandwidth.monitor.<num>.node.<index>.bytes.local" - the + * accumulative bytes consumed by @vcpus that passing + * through the memory controller in the same processor + * that the scheduled host CPU belongs to. + * "memory.bandwidth.monitor.<num>.node.<index>.bytes.total" - the total + * bytes consumed by @vcpus that passing through all + * memory controllers, either local or remote controller. 
+ * * Note that entire stats groups or individual stat fields may be missing from * the output in case they are not supported by the given hypervisor, are not * applicable for the current state of the guest domain, or their retrieval diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e396358871..37a986a1bd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -20496,6 +20496,9 @@ qemuDomainGetResctrlMonData(virQEMUDriverPtr driver, features = caps->host.cache.monitor->features; break; case VIR_RESCTRL_MONITOR_TYPE_MEMBW: + if (caps->host.memBW.monitor) + features = caps->host.memBW.monitor->features; + break; case VIR_RESCTRL_MONITOR_TYPE_UNSUPPORT: case VIR_RESCTRL_MONITOR_TYPE_LAST: virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", @@ -20548,6 +20551,90 @@ qemuDomainGetResctrlMonData(virQEMUDriverPtr driver, } +static int +qemuDomainGetStatsMemoryBandwidth(virQEMUDriverPtr driver, + virDomainObjPtr dom, + virTypedParamListPtr params) +{ + virQEMUResctrlMonDataPtr *resdata = NULL; + char **features = NULL; + size_t nresdata = 0; + size_t i = 0; + size_t j = 0; + size_t k = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (qemuDomainGetResctrlMonData(driver, dom, &resdata, &nresdata, + VIR_RESCTRL_MONITOR_TYPE_MEMBW) < 0) + goto cleanup; + + if (nresdata == 0) + return 0; + + if (virTypedParamListAddUInt(params, nresdata, + "memory.bandwidth.monitor.count") < 0) + goto cleanup; + + for (i = 0; i < nresdata; i++) { + if (virTypedParamListAddString(params, resdata[i]->name, + "memory.bandwidth.monitor.%zu.name", + i) < 0) + goto cleanup; + + if (virTypedParamListAddString(params, resdata[i]->vcpus, + "memory.bandwidth.monitor.%zu.vcpus", + i) < 0) + goto cleanup; + + if (virTypedParamListAddUInt(params, resdata[i]->nstats, + "memory.bandwidth.monitor.%zu.node.count", + i) < 0) + goto cleanup; + + + for (j = 0; j < resdata[i]->nstats; j++) { + if (virTypedParamListAddUInt(params, resdata[i]->stats[j]->id, + 
"memory.bandwidth.monitor.%zu." + "node.%zu.id", + i, j) < 0) + goto cleanup; + + + features = resdata[i]->stats[j]->features; + for (k = 0; features[k]; k++) { + if (STREQ(features[k], "mbm_local_bytes")) { + if (virTypedParamListAddULLong(params, + resdata[i]->stats[j]->vals[k], + "memory.bandwidth.monitor." + "%zu.node.%zu.bytes.local", + i, j) < 0) + goto cleanup; + } + + if (STREQ(features[k], "mbm_total_bytes")) { + if (virTypedParamListAddULLong(params, + resdata[i]->stats[j]->vals[k], + "memory.bandwidth.monitor." + "%zu.node.%zu.bytes.total", + i, j) < 0) + goto cleanup; + } + } + } + } + + ret = 0; + cleanup: + for (i = 0; i < nresdata; i++) + qemuDomainFreeResctrlMonData(resdata[i]); + VIR_FREE(resdata); + return ret; +} + + static int qemuDomainGetStatsCpuCache(virQEMUDriverPtr driver, virDomainObjPtr dom, @@ -20645,6 +20732,17 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver, } +static int +qemuDomainGetStatsMemory(virQEMUDriverPtr driver, + virDomainObjPtr dom, + virTypedParamListPtr params, + unsigned int privflags G_GNUC_UNUSED) + +{ + return qemuDomainGetStatsMemoryBandwidth(driver, dom, params); +} + + static int qemuDomainGetStatsBalloon(virQEMUDriverPtr driver, virDomainObjPtr dom, @@ -21314,6 +21412,7 @@ static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = { { qemuDomainGetStatsBlock, VIR_DOMAIN_STATS_BLOCK, true }, { qemuDomainGetStatsPerf, VIR_DOMAIN_STATS_PERF, false }, { qemuDomainGetStatsIOThread, VIR_DOMAIN_STATS_IOTHREAD, true }, + { qemuDomainGetStatsMemory, VIR_DOMAIN_STATS_MEMORY, false }, { NULL, 0, false } }; diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index 034c913d5e..8abd0f2d0b 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -2111,6 +2111,10 @@ static const vshCmdOptDef opts_domstats[] = { .type = VSH_OT_BOOL, .help = N_("report domain IOThread information"), }, + {.name = "memory", + .type = VSH_OT_BOOL, + .help = N_("report domain memory usage"), + }, 
{.name = "list-active", .type = VSH_OT_BOOL, .help = N_("list only active domains"), @@ -2227,6 +2231,9 @@ cmdDomstats(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptBool(cmd, "iothread")) stats |= VIR_DOMAIN_STATS_IOTHREAD; + if (vshCommandOptBool(cmd, "memory")) + stats |= VIR_DOMAIN_STATS_MEMORY; + if (vshCommandOptBool(cmd, "list-active")) flags |= VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE; diff --git a/tools/virsh.pod b/tools/virsh.pod index cf2798e71a..30effffcba 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1483,7 +1483,7 @@ reason for the state. =item B<domstats> [I<--raw>] [I<--enforce>] [I<--backing>] [I<--nowait>] [I<--state>] [I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] -[I<--block>] [I<--perf>] [I<--iothread>] +[I<--block>] [I<--perf>] [I<--iothread>] [I<--memory>] [[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] [I<--list-transient>] [I<--list-running>] [I<--list-paused>] [I<--list-shutoff>] [I<--list-other>]] | [I<domain> ...] @@ -1502,7 +1502,7 @@ behavior use the I<--raw> flag. The individual statistics groups are selectable via specific flags. By default all supported statistics groups are returned. Supported statistics groups flags are: I<--state>, I<--cpu-total>, I<--balloon>, -I<--vcpu>, I<--interface>, I<--block>, I<--perf>, I<--iothread>. +I<--vcpu>, I<--interface>, I<--block>, I<--perf>, I<--iothread>, I<--memory>. Note that - depending on the hypervisor type and version or the domain state - not all of the following statistics may be returned. @@ -1670,6 +1670,25 @@ not available for statistical purposes. 0 (zero) indicates shrink is managed by the hypervisor. 
+I<--memory> returns: + + "memory.bandwidth.monitor.count" - the number of memory bandwidth + monitors for this domain + "memory.bandwidth.monitor.<num>.name" - the name of monitor <num> + "memory.bandwidth.monitor.<num>.vcpus" - the vcpu list of monitor <num> + "memory.bandwidth.monitor.<num>.node.count" - the number of memory + controller in monitor <num> + "memory.bandwidth.monitor.<num>.node.<index>.id" - host allocated memory + controller id for controller + <index> of monitor <num> + "memory.bandwidth.monitor.<num>.node.<index>.bytes.local" - the accumulative + bytes consumed by @vcpus that passing through + the memory controller in the same processor + that the scheduled host CPU belongs to. + "memory.bandwidth.monitor.<num>.node.<index>.bytes.total" - the total + bytes consumed by @vcpus that passing through all + memory controllers, either local or remote controller. + Selecting a specific statistics groups doesn't guarantee that the daemon supports the selected group of stats. Flag I<--enforce> forces the command to fail if the daemon doesn't support the -- 2.23.0

On Thu, Nov 14, 2019 at 01:08:23AM +0800, Wang Huaqiang wrote:
From: Huaqiang <huaqiang.wang@intel.com>
Introduce an option '--memory' for showing memory-related information. The memory bandwidth information is listed as:
Domain: 'libvirt-vm'
memory.bandwidth.monitor.count=4
memory.bandwidth.monitor.0.name=vcpus_0-4
memory.bandwidth.monitor.0.vcpus=0-4
memory.bandwidth.monitor.0.node.count=2
memory.bandwidth.monitor.0.node.0.id=0
memory.bandwidth.monitor.0.node.0.bytes.total=10208067584
memory.bandwidth.monitor.0.node.0.bytes.local=4807114752
memory.bandwidth.monitor.0.node.1.id=1
memory.bandwidth.monitor.0.node.1.bytes.total=8693735424
memory.bandwidth.monitor.0.node.1.bytes.local=5850161152
memory.bandwidth.monitor.1.name=vcpus_7
memory.bandwidth.monitor.1.vcpus=7
memory.bandwidth.monitor.1.node.count=2
memory.bandwidth.monitor.1.node.0.id=0
memory.bandwidth.monitor.1.node.0.bytes.total=853811200
memory.bandwidth.monitor.1.node.0.bytes.local=290701312
memory.bandwidth.monitor.1.node.1.id=1
memory.bandwidth.monitor.1.node.1.bytes.total=406044672
memory.bandwidth.monitor.1.node.1.bytes.local=229425152
Signed-off-by: Huaqiang <huaqiang.wang@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/libvirt-domain.c | 21 +++++++ src/qemu/qemu_driver.c | 99 ++++++++++++++++++++++++++++++++ tools/virsh-domain-monitor.c | 7 +++ tools/virsh.pod | 23 +++++++-
Unfortunately this doesn't apply since I converted the manpages from POD to RST. Can you resend with an update to docs/manpages/virsh.rst instead?
5 files changed, 149 insertions(+), 2 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 22277b0a84..2b621ff162 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -2146,6 +2146,7 @@ typedef enum { VIR_DOMAIN_STATS_BLOCK = (1 << 5), /* return domain block info */ VIR_DOMAIN_STATS_PERF = (1 << 6), /* return domain perf event info */ VIR_DOMAIN_STATS_IOTHREAD = (1 << 7), /* return iothread poll info */ + VIR_DOMAIN_STATS_MEMORY= (1 << 8), /* return domain memory info */
Nitpick, whitespace before '='
} virDomainStatsTypes;
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

-----Original Message----- From: Daniel P. Berrangé <berrange@redhat.com> Sent: Friday, December 13, 2019 11:56 PM To: Wang, Huaqiang <huaqiang.wang@intel.com> Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 5/5] virsh: show memoryBW info in 'virsh domstats' command
Unfortunately this doesn't apply since I converted the manpages from POD to RST.
Can you resend with update to docs/manpages/virsh.rst instead
Got it. I'll make an update and resend after we fix patch 1/5. Thanks for the review! Huaqiang

Just a reminder that the libvirt bindings need to be updated after these patches are introduced. Refer to the libvirt Python and Perl bindings:

commit b0a7747
Author: Pavel Hrdina <phrdina@redhat.com>
Date:   Fri Sep 20 11:14:35 2019 +0200

    virDomainMemoryStats: include disk caches

    Introduced in libvirt 4.6.0 by commit <aee04655089>.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1683516

    Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
    Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

commit e562e58
Author: Daniel P. Berrangé <berrange@redhat.com>
Date:   Wed May 22 14:07:57 2019 +0100

    Add new hugetlb memory stats constants

    Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

On Wed, Nov 13, 2019 at 6:34 PM Wang Huaqiang <huaqiang.wang@intel.com> wrote:
RDT is short for Intel Resource Director Technology, which consists of four sub-technologies so far:
- CAT for cache allocation
- CMT for cache usage monitoring
- MBA for memory bandwidth allocation
- MBM for memory bandwidth usage monitoring
The Linux kernel interface is the 'resctrl' file system, and we have already implemented support for CAT, CMT and MBA, covering tasks such as allocating a part of the shared CPU last-level cache to a particular domain vcpu or list of vcpus, monitoring cache usage, and allocating an amount of memory bandwidth to specific domain vcpu(s).
This series introduces support for MBM.
Basically the interfaces are:
** Identify host capability **
Similar to identifying the host's CMT capability, MBM support can be determined from the output of 'virsh capabilities'; if the following elements are found, MBM is supported:
<memory_bandwidth>
  <monitor maxMonitors='176'>
    <feature name='mbm_total_bytes'/>
    <feature name='mbm_local_bytes'/>
  </monitor>
</memory_bandwidth>
'mbm_total_bytes' indicates support for reporting the memory bandwidth used by the vcpu(s) of a specific monitor across all CPU sockets.
'mbm_local_bytes' indicates support for reporting the memory bandwidth used by the vcpu(s) that passes through the local CPU socket.
** Create monitor group **
The monitor group for specific domain vcpus, for example vcpus 0-4, is defined in the domain configuration file as follows:
<cputune>
  <memorytune vcpus='0-4'>
    <monitor vcpus='0-4'/>
  </memorytune>
</cputune>
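For reference, on Linux such a monitor group corresponds to a directory under the resctrl file system; a sketch of the layout the configuration above is expected to map to (paths and the allocation group name are illustrative):

```
/sys/fs/resctrl/
└── <allocation-group>/             # created for the domain's resctrl allocation
    └── mon_groups/
        └── vcpus_0-4/              # monitor group, named after the vcpu list
            ├── cpus_list           # host CPUs the monitored vcpus are pinned to
            └── mon_data/
                ├── mon_L3_00/
                │   ├── mbm_total_bytes
                │   └── mbm_local_bytes
                └── mon_L3_01/
                    ├── mbm_total_bytes
                    └── mbm_local_bytes
```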
** Report memory usage **
A '--memory' option is introduced for the 'virsh domstats' command to show memory bandwidth usage as follows (very similar to the format of the CMT result):
# virsh domstats --memory
Domain: 'libvirt-vm'
memory.bandwidth.monitor.count=4
memory.bandwidth.monitor.0.name=vcpus_0-4
memory.bandwidth.monitor.0.vcpus=0-4
memory.bandwidth.monitor.0.node.count=2
memory.bandwidth.monitor.0.node.0.id=0
memory.bandwidth.monitor.0.node.0.bytes.total=14201651200
memory.bandwidth.monitor.0.node.0.bytes.local=7369809920
memory.bandwidth.monitor.0.node.1.id=1
memory.bandwidth.monitor.0.node.1.bytes.total=188897640448
memory.bandwidth.monitor.0.node.1.bytes.local=170044047360
Huaqiang (5):
  util, resctrl: using 64bit interface instead of 32bit for counters
  conf: showing cache/memoryBW monitor features in capabilities
  cachetune schema: a looser check for the order of <cache> and <monitor> element
  conf: Parse dommon configure file for memorytune monitors
  virsh: show memoryBW info in 'virsh domstats' command
 docs/schemas/domaincommon.rng | 91 +++++++++---------
 include/libvirt/libvirt-domain.h | 1 +
 src/conf/capabilities.c | 4 +-
 src/conf/domain_conf.c | 44 +++++++--
 src/libvirt-domain.c | 21 +++++
 src/qemu/qemu_driver.c | 103 ++++++++++++++++++++-
 src/util/virfile.c | 40 ++++++++
 src/util/virfile.h | 2 +
 src/util/virresctrl.c | 6 +-
 src/util/virresctrl.h | 2 +-
 tests/genericxml2xmlindata/cachetune.xml | 1 +
 tests/genericxml2xmlindata/memorytune.xml | 5 +
 tests/genericxml2xmloutdata/cachetune.xml | 34 +++++++
 tests/genericxml2xmloutdata/memorytune.xml | 42 +++++++++
 tests/genericxml2xmltest.c | 4 +-
 tools/virsh-domain-monitor.c | 7 ++
 tools/virsh.pod | 23 ++++-
 17 files changed, 367 insertions(+), 63 deletions(-)
 create mode 100644 tests/genericxml2xmloutdata/cachetune.xml
 create mode 100644 tests/genericxml2xmloutdata/memorytune.xml
-- 2.23.0
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
--
Best regards,
-----------------------------------
Han Han
Quality Engineer
Redhat.
Email: hhan@redhat.com
Phone: +861065339333

Hi Han,

Thanks for your kind reminder. I hadn't used the 'virsh dommemstat' command for reporting the memory bandwidth that the domain's vcpus consume on the host; what I implemented is a new '--memory' option under the 'virsh domstats' command. The reason for this implementation is that I hadn't realized there was already an interface, the 'dommemstat' command, intended for showing domain memory related statistics. But after examining how block device statistics and network traffic statistics are reported, I found they follow a similar pattern; for example, block device statistics are available from both the 'virsh domblkstat' command and 'virsh domstats --block'. So I tend to keep the approach taken in patch 1/5 and let this memory information be shown by 'virsh domstats --memory'. The reason is that the memory bandwidth information is associated with a memory bandwidth monitor (a hardware feature from the CPU manufacturer), and each monitor can be applied to one or several vcpus. This is very similar to the cases of 'virsh domstats --interface' and 'virsh domstats --block'. I hope more reviewers will join this discussion.

Thanks,
Huaqiang

On 2019/11/14 2:43 PM, Han Han wrote:

On Thu, Nov 14, 2019 at 01:08:18AM +0800, Wang Huaqiang wrote:
Huaqiang (5):
  util, resctrl: using 64bit interface instead of 32bit for counters
  conf: showing cache/memoryBW monitor features in capabilities
  cachetune schema: a looser check for the order of <cache> and <monitor> element
  conf: Parse dommon configure file for memorytune monitors
  virsh: show memoryBW info in 'virsh domstats' command
I've pushed patches 2, 3, & 4, so you only need to update 1 & 5 and resend those two.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (4)
- Daniel P. Berrangé
- Han Han
- Wang Huaqiang
- Wang, Huaqiang