[libvirt] [PATCH 0/8] Add perf and Intel CMT feature support

The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform. Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support. TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted. Qiaowei Ren (8): perf: add new public APIs for perf event perf: define internal driver API for perf event perf: implement the public APIs for perf event perf: implement the remote protocol for perf event perf: implement a set of util functions for perf event qemu_driver: add support to perf event virsh: extend domstats command virsh: implement new command to support perf daemon/remote.c | 60 +++++++++++ include/libvirt/libvirt-domain.h | 19 ++++ include/libvirt/virterror.h | 2 + src/Makefile.am | 1 + src/driver-hypervisor.h | 12 +++ src/libvirt-domain.c | 106 +++++++++++++++++++ src/libvirt_private.syms | 8 ++ src/libvirt_public.syms | 6 ++ src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 6 ++ src/remote/remote_driver.c | 49 +++++++++ src/remote/remote_protocol.x | 32 +++++- src/remote_protocol-structs | 20 ++++ src/util/virerror.c | 2 + src/util/virperf.c | 222 +++++++++++++++++++++++++++++++++++++++ src/util/virperf.h | 60 +++++++++++ tools/virsh-domain-monitor.c | 7 ++ tools/virsh-domain.c | 139 ++++++++++++++++++++++++ tools/virsh.pod | 15 ++- 20 files changed, 947 insertions(+), 3 deletions(-) create mode 100644 src/util/virperf.c create mode 100644 src/util/virperf.h -- 1.9.1

API agreed on in https://www.redhat.com/archives/libvir-list/2015-October/msg00872.html * include/libvirt/libvirt-domain.h (virDomainGetPerfEvents, virDomainSetPerfEvents): New declarations. * src/libvirt_public.syms: Export new symbols. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 18 ++++++++++++++++++ src/libvirt_public.syms | 6 ++++++ 2 files changed, 24 insertions(+) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1ea6a5..69965e6 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1802,6 +1802,24 @@ int virDomainListGetStats(virDomainPtr *doms, void virDomainStatsRecordListFree(virDomainStatsRecordPtr *stats); /* + * Perf Event API + */ + +/** + * VIR_DOMAIN_PERF_CMT: + * + * Macro for typed parameter name that represents CMT perf event. + */ +# define VIR_DOMAIN_PERF_CMT "cmt" + +int virDomainGetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams); +int virDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams); + +/* * BlockJob API */ diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index dd94191..03206e7 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -725,4 +725,10 @@ LIBVIRT_1.2.19 { virDomainRename; } LIBVIRT_1.2.17; +LIBVIRT_1.2.23 { + global: + virDomainGetPerfEvents; + virDomainSetPerfEvents; +} LIBVIRT_1.2.19; + # .... define new API here using predicted next version number .... -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:41 +0800, Qiaowei Ren wrote:
API agreed on in https://www.redhat.com/archives/libvir-list/2015-October/msg00872.html
* include/libvirt/libvirt-domain.h (virDomainGetPerfEvents, virDomainSetPerfEvents): New declarations. * src/libvirt_public.syms: Export new symbols.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 18 ++++++++++++++++++ src/libvirt_public.syms | 6 ++++++ 2 files changed, 24 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1ea6a5..69965e6 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1802,6 +1802,24 @@ int virDomainListGetStats(virDomainPtr *doms, void virDomainStatsRecordListFree(virDomainStatsRecordPtr *stats);
/* + * Perf Event API + */ + +/** + * VIR_DOMAIN_PERF_CMT: + * + * Macro for typed parameter name that represents CMT perf event. + */ +# define VIR_DOMAIN_PERF_CMT "cmt" + +int virDomainGetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams);
We don't put a space between * and the name.
+int virDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams); + +/* * BlockJob API */
diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index dd94191..03206e7 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -725,4 +725,10 @@ LIBVIRT_1.2.19 { virDomainRename; } LIBVIRT_1.2.17;
+LIBVIRT_1.2.23 { + global: + virDomainGetPerfEvents; + virDomainSetPerfEvents; +} LIBVIRT_1.2.19; + # .... define new API here using predicted next version number ....
This patch does not compile, you should squash patches 1, 2, and 3 together, since 'make all check syntax-check' should pass after each patch in the series. Jirka

-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Monday, November 23, 2015 11:26 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 1/8] perf: add new public APIs for perf event
On Tue, Nov 17, 2015 at 16:00:41 +0800, Qiaowei Ren wrote:
API agreed on in https://www.redhat.com/archives/libvir-list/2015-October/msg00872.html
* include/libvirt/libvirt-domain.h (virDomainGetPerfEvents, virDomainSetPerfEvents): New declarations. * src/libvirt_public.syms: Export new symbols.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 18 ++++++++++++++++++ src/libvirt_public.syms | 6 ++++++ 2 files changed, 24 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1ea6a5..69965e6 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1802,6 +1802,24 @@ int virDomainListGetStats(virDomainPtr *doms, void virDomainStatsRecordListFree(virDomainStatsRecordPtr *stats);
/* + * Perf Event API + */ + +/** + * VIR_DOMAIN_PERF_CMT: + * + * Macro for typed parameter name that represents CMT perf event. + */ +# define VIR_DOMAIN_PERF_CMT "cmt" + +int virDomainGetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams);
We don't put a space between * and the name.
+int virDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams); + +/* * BlockJob API */
diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index dd94191..03206e7 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -725,4 +725,10 @@ LIBVIRT_1.2.19 { virDomainRename; } LIBVIRT_1.2.17;
+LIBVIRT_1.2.23 { + global: + virDomainGetPerfEvents; + virDomainSetPerfEvents; +} LIBVIRT_1.2.19; + # .... define new API here using predicted next version number ....
This patch does not compile, you should squash patches 1, 2, and 3 together, since 'make all check syntax-check' should pass after each patch in the series.
Ok. This is just from the sample about "Implementing a new API in Libvirt" (https://libvirt.org/api_extension.html ). ^-^ I will make these 3 patches together in next version. Thanks, Qiaowei

* src/driver-hypervisor.h (virDrvDomainGetPerfEvents, virDrvDomainSetPerfEvents): New typedefs. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- src/driver-hypervisor.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h index ae2ec4d..2b72e23 100644 --- a/src/driver-hypervisor.h +++ b/src/driver-hypervisor.h @@ -958,6 +958,16 @@ typedef int unsigned int flags); typedef int +(*virDrvDomainGetPerfEvents)(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams); + +typedef int +(*virDrvDomainSetPerfEvents)(virDomainPtr dom, + virTypedParameterPtr params, + int nparams); + +typedef int (*virDrvDomainBlockJobAbort)(virDomainPtr dom, const char *path, unsigned int flags); @@ -1413,6 +1423,8 @@ struct _virHypervisorDriver { virDrvConnectSetKeepAlive connectSetKeepAlive; virDrvConnectIsAlive connectIsAlive; virDrvNodeSuspendForDuration nodeSuspendForDuration; + virDrvDomainGetPerfEvents domainGetPerfEvents; + virDrvDomainSetPerfEvents domainSetPerfEvents; virDrvDomainSetBlockIoTune domainSetBlockIoTune; virDrvDomainGetBlockIoTune domainGetBlockIoTune; virDrvDomainGetCPUStats domainGetCPUStats; -- 1.9.1

* src/libvirt-domain.c: Implement virDomainGetPerfEvents and virDomainSetPerfEvents. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- src/libvirt-domain.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 106 insertions(+) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index de7eb04..e767606 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -9572,6 +9572,112 @@ virDomainOpenChannel(virDomainPtr dom, /** + * virDomainGetPerfEvents: + * @domain: pointer to domain object + * @params: pointer to perf events parameter object + * (return value, allocated by the caller) + * @nparams: pointer to number of perf event parameters + * + * Get all perf events setting. On input, @nparams gives the size of the + * @params array; on output, @nparams gives how many slots were filled + * with parameter information, which might be less but will not exceed + * the input value. + * + * As a special case, calling with @params as NULL and @nparams as 0 on + * input will cause @nparams on output to contain the number of parameters + * supported by the hypervisor. The caller should then allocate @params + * array, i.e. (sizeof(@virTypedParameter) * @nparams) bytes and call the + * API again. + * + * See virDomainGetMemoryParameters() for an equivalent usage example. + * + * Returns -1 in case of error, 0 in case of success. + */ +int virDomainGetPerfEvents(virDomainPtr domain, + virTypedParameterPtr params, + int * nparams) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "params=%p, nparams=%d", + params, (nparams) ? *nparams : -1); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + virCheckNonNullArgGoto(nparams, error); + virCheckNonNegativeArgGoto(*nparams, error); + if (*nparams != 0) + virCheckNonNullArgGoto(params, error); + + conn = domain->conn; + + if (conn->driver->domainGetPerfEvents) { + int ret; + ret = conn->driver->domainGetPerfEvents(domain, params, nparams); + if (ret < 0) + goto error; + return ret; + } + virReportUnsupportedError(); + + error: + virDispatchError(domain->conn); + return -1; +} + + +/** + * virDomainSetPerfEvents: + * @domain: pointer to domain object + * @params: pointer to perf events parameter object + * @nparams: number of perf event parameters (this value can be the same + * less than the number of parameters supported) + * + * Enable or disable the particular list of perf events you care about. + * + * Returns -1 in case of error, 0 in case of success. + */ +int virDomainSetPerfEvents(virDomainPtr domain, + virTypedParameterPtr params, + int nparams) +{ + virConnectPtr conn; + + VIR_DOMAIN_DEBUG(domain, "params=%p, nparams=%d", + params, nparams); + VIR_TYPED_PARAMS_DEBUG(params, nparams); + + virResetLastError(); + + virCheckDomainReturn(domain, -1); + conn = domain->conn; + + virCheckReadOnlyGoto(conn->flags, error); + virCheckNonNullArgGoto(params, error); + virCheckPositiveArgGoto(nparams, error); + + if (virTypedParameterValidateSet(conn, params, nparams) < 0) + goto error; + + if (conn->driver->domainSetPerfEvents) { + int ret; + ret = conn->driver->domainSetPerfEvents(domain, params, nparams); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + + error: + virDispatchError(domain->conn); + return -1; + +} + + +/** * virDomainBlockJobAbort: * @dom: pointer to domain object * @disk: path to the block device, or device shorthand -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:43 +0800, Qiaowei Ren wrote:
* src/libvirt-domain.c: Implement virDomainGetPerfEvents and virDomainSetPerfEvents.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- src/libvirt-domain.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 106 insertions(+)
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index de7eb04..e767606 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -9572,6 +9572,112 @@ virDomainOpenChannel(virDomainPtr dom,
/** + * virDomainGetPerfEvents: + * @domain: pointer to domain object + * @params: pointer to perf events parameter object + * (return value, allocated by the caller) + * @nparams: pointer to number of perf event parameters + * + * Get all perf events setting. On input, @nparams gives the size of the + * @params array; on output, @nparams gives how many slots were filled + * with parameter information, which might be less but will not exceed + * the input value. + * + * As a special case, calling with @params as NULL and @nparams as 0 on + * input will cause @nparams on output to contain the number of parameters + * supported by the hypervisor. The caller should then allocate @params + * array, i.e. (sizeof(@virTypedParameter) * @nparams) bytes and call the + * API again. + * + * See virDomainGetMemoryParameters() for an equivalent usage example.
You took a wrong API to copy. Requiring the caller to run the API with @params == NULL to get the number of parameters and then pass the allocated @params to get the values is the original design which we do not use for new APIs. The API should just allocate @params, fill it with all the values it knows about, and return the count in @nparams. In other words, no caller allocated arrays, please. You can look at, e.g., virDomainGetJobStats to see how it works. Jirka

-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 12:25 AM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 3/8] perf: implement the public APIs for perf event
On Tue, Nov 17, 2015 at 16:00:43 +0800, Qiaowei Ren wrote:
* src/libvirt-domain.c: Implement virDomainGetPerfEvents and virDomainSetPerfEvents.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- src/libvirt-domain.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 106 insertions(+)
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index de7eb04..e767606 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -9572,6 +9572,112 @@ virDomainOpenChannel(virDomainPtr dom,
/** + * virDomainGetPerfEvents: + * @domain: pointer to domain object + * @params: pointer to perf events parameter object + * (return value, allocated by the caller) + * @nparams: pointer to number of perf event parameters + * + * Get all perf events setting. On input, @nparams gives the size of + the + * @params array; on output, @nparams gives how many slots were + filled + * with parameter information, which might be less but will not + exceed + * the input value. + * + * As a special case, calling with @params as NULL and @nparams as 0 + on + * input will cause @nparams on output to contain the number of + parameters + * supported by the hypervisor. The caller should then allocate + @params + * array, i.e. (sizeof(@virTypedParameter) * @nparams) bytes and call + the + * API again. + * + * See virDomainGetMemoryParameters() for an equivalent usage example.
You took a wrong API to copy. Requiring the caller to run the API with @params == NULL to get the number of parameters and then pass the allocated @params to get the values is the original design which we do not use for new APIs. The API should just allocate @params, fill it with all the values it knows about, and return the count in @nparams. In other words, no caller allocated arrays, please. You can look at, e.g., virDomainGetJobStats to see how it works.
Thanks, Jirka. I will update this patch according to new API design. Thanks, Qiaowei

Add remote support for perf event. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- daemon/remote.c | 60 ++++++++++++++++++++++++++++++++++++++++++++ src/remote/remote_driver.c | 49 ++++++++++++++++++++++++++++++++++++ src/remote/remote_protocol.x | 32 ++++++++++++++++++++++- src/remote_protocol-structs | 20 +++++++++++++++ 4 files changed, 160 insertions(+), 1 deletion(-) diff --git a/daemon/remote.c b/daemon/remote.c index 3a3eb09..68a319c 100644 --- a/daemon/remote.c +++ b/daemon/remote.c @@ -2810,6 +2810,66 @@ remoteDispatchNodeGetMemoryStats(virNetServerPtr server ATTRIBUTE_UNUSED, } static int +remoteDispatchDomainGetPerfEvents(virNetServerPtr server ATTRIBUTE_UNUSED, + virNetServerClientPtr client ATTRIBUTE_UNUSED, + virNetMessagePtr msg ATTRIBUTE_UNUSED, + virNetMessageErrorPtr rerr, + remote_domain_get_perf_events_args *args, + remote_domain_get_perf_events_ret *ret) +{ + virDomainPtr dom = NULL; + virTypedParameterPtr params = NULL; + int nparams = 0; + int rv = -1; + struct daemonClientPrivate *priv = + virNetServerClientGetPrivateData(client); + + if (!priv->conn) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("connection not open")); + goto cleanup; + } + + if (args->nparams > REMOTE_DOMAIN_MEMORY_PARAMETERS_MAX) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("nparams too large")); + goto cleanup; + } + if (args->nparams && VIR_ALLOC_N(params, args->nparams) < 0) + goto cleanup; + nparams = args->nparams; + + if (!(dom = get_nonnull_domain(priv->conn, args->dom))) + goto cleanup; + + if (virDomainGetPerfEvents(dom, params, &nparams) < 0) + goto cleanup; + + /* In this case, we need to send back the number of parameters + * supported + */ + if (args->nparams == 0) { + ret->nparams = nparams; + goto success; + } + + if (remoteSerializeTypedParameters(params, nparams, + &ret->params.params_val, + &ret->params.params_len, + 0) < 0) + goto cleanup; + + success: + rv = 0; + + cleanup: + if (rv < 0) + virNetMessageSaveError(rerr); + virTypedParamsFree(params, nparams); + virObjectUnref(dom); + return rv; + +} + +static int remoteDispatchDomainGetBlockJobInfo(virNetServerPtr server ATTRIBUTE_UNUSED, virNetServerClientPtr client ATTRIBUTE_UNUSED, virNetMessagePtr msg ATTRIBUTE_UNUSED, diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index a1dd640..30297c7 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -2028,6 +2028,53 @@ remoteDomainGetNumaParameters(virDomainPtr domain, } static int +remoteDomainGetPerfEvents(virDomainPtr domain, + virTypedParameterPtr params, + int *nparams) +{ + int rv = -1; + remote_domain_get_perf_events_args args; + remote_domain_get_perf_events_ret ret; + struct private_data *priv = domain->conn->privateData; + + remoteDriverLock(priv); + + make_nonnull_domain(&args.dom, domain); + args.nparams = *nparams; + + memset(&ret, 0, sizeof(ret)); + if (call(domain->conn, priv, 0, REMOTE_PROC_DOMAIN_GET_PERF_EVENTS, + (xdrproc_t) xdr_remote_domain_get_perf_events_args, (char *) &args, + (xdrproc_t) xdr_remote_domain_get_perf_events_ret, (char *) &ret) == -1) + goto done; + + /* Handle the case when the caller does not know the number of parameters + * and is asking for the number of parameters supported + */ + if (*nparams == 0) { + *nparams = ret.nparams; + rv = 0; + goto cleanup; + } + + if (remoteDeserializeTypedParameters(ret.params.params_val, + ret.params.params_len, + REMOTE_DOMAIN_NUMA_PARAMETERS_MAX, + ¶ms, + nparams) < 0) + goto cleanup; + + rv = 0; + + cleanup: + xdr_free((xdrproc_t) xdr_remote_domain_get_perf_events_ret, + (char *) &ret); + done: + remoteDriverUnlock(priv); + return rv; +} + +static int remoteDomainGetBlkioParameters(virDomainPtr domain, virTypedParameterPtr params, int *nparams, unsigned int flags) @@ -8253,6 +8300,8 @@ static virHypervisorDriver hypervisor_driver = { .domainGetMemoryParameters = remoteDomainGetMemoryParameters, /* 0.8.5 */ .domainSetBlkioParameters = remoteDomainSetBlkioParameters, /* 0.9.0 */ .domainGetBlkioParameters = remoteDomainGetBlkioParameters, /* 0.9.0 */ + .domainGetPerfEvents = remoteDomainGetPerfEvents, /* 1.2.23 */ + .domainSetPerfEvents = remoteDomainSetPerfEvents, /* 1.2.23 */ .domainGetInfo = remoteDomainGetInfo, /* 0.3.0 */ .domainGetState = remoteDomainGetState, /* 0.9.2 */ .domainGetControlInfo = remoteDomainGetControlInfo, /* 0.9.3 */ diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 80f4a8b..b6b9494 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -127,6 +127,9 @@ const REMOTE_DOMAIN_BLOCK_IO_TUNE_PARAMETERS_MAX = 16; /* Upper limit on list of numa parameters. */ const REMOTE_DOMAIN_NUMA_PARAMETERS_MAX = 16; +/* Upper limit on list of perf events. */ +const REMOTE_DOMAIN_PERF_EVENTS_MAX = 64; + /* Upper limit on block copy tunable parameters. */ const REMOTE_DOMAIN_BLOCK_COPY_PARAMETERS_MAX = 16; @@ -646,6 +649,21 @@ struct remote_domain_get_numa_parameters_ret { int nparams; }; +struct remote_domain_set_perf_events_args { + remote_nonnull_domain dom; + remote_typed_param params<REMOTE_DOMAIN_PERF_EVENTS_MAX>; +}; + +struct remote_domain_get_perf_events_args { + remote_nonnull_domain dom; + int nparams; +}; + +struct remote_domain_get_perf_events_ret { + remote_typed_param params<REMOTE_DOMAIN_PERF_EVENTS_MAX>; + int nparams; +}; + struct remote_domain_block_stats_args { remote_nonnull_domain dom; remote_nonnull_string path; @@ -5712,5 +5730,17 @@ enum remote_procedure { * @acl: domain:write * @acl: domain:save */ - REMOTE_PROC_DOMAIN_RENAME = 358 + REMOTE_PROC_DOMAIN_RENAME = 358, + + /** + * @generate: none + * @acl: domain:read + */ + REMOTE_PROC_DOMAIN_GET_PERF_EVENTS = 359, + + /** + * @generate: both + * @acl: domain:write + */ + REMOTE_PROC_DOMAIN_SET_PERF_EVENTS = 360 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index ff99c00..b9fcd69 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -304,6 +304,24 @@ struct remote_domain_get_numa_parameters_ret { } params; int nparams; }; +struct remote_domain_set_perf_events_args { + remote_nonnull_domain dom; + struct { + u_int params_len; + remote_typed_param * params_val; + } params; +}; +struct remote_domain_get_perf_events_args { + remote_nonnull_domain dom; + int nparams; +}; +struct remote_domain_get_perf_events_ret { + struct { + u_int params_len; + remote_typed_param * params_val; + } params; + int nparams; +}; struct remote_domain_block_stats_args { remote_nonnull_domain dom; remote_nonnull_string path; @@ -3051,4 +3069,6 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_DEL_IOTHREAD = 356, REMOTE_PROC_DOMAIN_SET_USER_PASSWORD = 357, REMOTE_PROC_DOMAIN_RENAME = 358, + REMOTE_PROC_DOMAIN_GET_PERF_EVENTS = 359, + REMOTE_PROC_DOMAIN_SET_PERF_EVENTS = 360, }; -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:44 +0800, Qiaowei Ren wrote:
Add remote support for perf event.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- daemon/remote.c | 60 ++++++++++++++++++++++++++++++++++++++++++++ src/remote/remote_driver.c | 49 ++++++++++++++++++++++++++++++++++++ src/remote/remote_protocol.x | 32 ++++++++++++++++++++++- src/remote_protocol-structs | 20 +++++++++++++++ 4 files changed, 160 insertions(+), 1 deletion(-)
This will need to be changed to match the new style of the API. But, since you are adding a new API, would you mind creating a patch for libvirt-python to add this new API to python bindings? Jirka

-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 5:43 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 4/8] perf: implement the remote protocol for perf event
On Tue, Nov 17, 2015 at 16:00:44 +0800, Qiaowei Ren wrote:
Add remote support for perf event.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- daemon/remote.c | 60 ++++++++++++++++++++++++++++++++++++++++++++ src/remote/remote_driver.c | 49 ++++++++++++++++++++++++++++++++++++ src/remote/remote_protocol.x | 32 ++++++++++++++++++++++- src/remote_protocol-structs | 20 +++++++++++++++ 4 files changed, 160 insertions(+), 1 deletion(-)
This will need to be changed to match the new style of the API. But, since you are adding a new API, would you mind creating a patch for libvirt-python to add this new API to python bindings?
Sure. The patch for libvirt-python should be also sent this maillist with this series, right? Thanks, Qiaowei

On Wed, Nov 25, 2015 at 01:22:22 +0000, Ren, Qiaowei wrote:
-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 5:43 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 4/8] perf: implement the remote protocol for perf event
On Tue, Nov 17, 2015 at 16:00:44 +0800, Qiaowei Ren wrote:
Add remote support for perf event.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- daemon/remote.c | 60 ++++++++++++++++++++++++++++++++++++++++++++ src/remote/remote_driver.c | 49 ++++++++++++++++++++++++++++++++++++ src/remote/remote_protocol.x | 32 ++++++++++++++++++++++- src/remote_protocol-structs | 20 +++++++++++++++ 4 files changed, 160 insertions(+), 1 deletion(-)
This will need to be changed to match the new style of the API. But, since you are adding a new API, would you mind creating a patch for libvirt-python to add this new API to python bindings?
Sure. The patch for libvirt-python should be also sent this maillist with this series, right?
Well, it's a different repository so sending it in the same series might be a bit tricky. So it's up to you if you send it in one series or send it as a separate patch. And yes, they should all be sent to this mailing list. Jirka

This patch implement a set of interfaces for perf event. Based on these interfaces, we can implement internal driver API for perf, and get the results of perf conuter you care about. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/virterror.h | 2 + src/Makefile.am | 1 + src/libvirt_private.syms | 8 ++ src/util/virerror.c | 2 + src/util/virperf.c | 222 ++++++++++++++++++++++++++++++++++++++++++++ src/util/virperf.h | 60 ++++++++++++ 6 files changed, 295 insertions(+) create mode 100644 src/util/virperf.c create mode 100644 src/util/virperf.h diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index f716cb9..59dd8e7 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -128,6 +128,8 @@ typedef enum { VIR_FROM_THREAD = 61, /* Error from thread utils */ VIR_FROM_ADMIN = 62, /* Error from admin backend */ + VIR_FROM_PERF = 63, /* Error from perf */ + # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST # endif diff --git a/src/Makefile.am b/src/Makefile.am index 99b4993..afc6cf0 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -98,6 +98,7 @@ UTIL_SOURCES = \ util/virauthconfig.c util/virauthconfig.h \ util/virbitmap.c util/virbitmap.h \ util/virbuffer.c util/virbuffer.h \ + util/virperf.c util/virperf.h \ util/vircgroup.c util/vircgroup.h util/vircgrouppriv.h \ util/virclosecallbacks.c util/virclosecallbacks.h \ util/vircommand.c util/vircommand.h util/vircommandpriv.h \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a835f18..f0e2bd5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1988,6 +1988,14 @@ virPCIGetVirtualFunctions; virPCIIsVirtualFunction; +# util/vircgroup.h +virPerfInitialize; +virPerfFree; +virPerfEventOfType; +virPerfCmtEnable; +virPerfEventDisable; + + # util/virpidfile.h virPidFileAcquire; virPidFileAcquirePath; diff --git a/src/util/virerror.c b/src/util/virerror.c index 6dc05f4..12a5218 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -134,6 +134,8 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Polkit", /* 60 */ "Thread jobs", "Admin Interface", + + "Perf", ) diff --git a/src/util/virperf.c b/src/util/virperf.c new file mode 100644 index 0000000..9c71858 --- /dev/null +++ b/src/util/virperf.c @@ -0,0 +1,222 @@ +/* + * virperf.c: methods for managing perf events + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Ren Qiaowei <qiaowei.ren@intel.com> + */ +#include <config.h> + +#include <sys/ioctl.h> +#if defined HAVE_SYS_SYSCALL_H +# include <sys/syscall.h> +#endif + +#include "virperf.h" +#include "viralloc.h" +#include "virerror.h" +#include "virlog.h" +#include "virfile.h" +#include "virstring.h" +#include "virtypedparam.h" + +VIR_LOG_INIT("util.perf"); + +#define VIR_FROM_THIS VIR_FROM_PERF + +#if defined(__linux__) && defined(HAVE_SYS_SYSCALL_H) +# define VIR_PERF_SUPPORTED +#endif + +VIR_ENUM_IMPL(virPerfEvent, VIR_PERF_EVENT_LAST, + "cmt"); + + +#ifdef VIR_PERF_SUPPORTED + +#include <linux/perf_event.h> + +int +virPerfInitialize(virPerfPtr *perf) +{ + size_t i; + + if (VIR_ALLOC((*perf)) < 0) + return -1; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + (*perf)->events[i].type = i; + (*perf)->events[i].fd = -1; + (*perf)->events[i].enabled = false; + } + + return 0; +} + +void +virPerfFree(virPerfPtr *perf) +{ + size_t i; + + if (*perf == NULL) + return; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + if ((*perf)->events[i].enabled) + virPerfEventDisable(i, *perf); + } + + VIR_FREE(*perf); +} + +virPerfEventPtr +virPerfEventOfType(virPerfPtr perf, + int type) +{ + size_t i; + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + if (perf->events[i].type == type) + return &perf->events[i]; + } + + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Event '%d' is not supported"), + type); + + return NULL; +} + +static int +sys_perf_event_open(struct perf_event_attr *hw_event, + pid_t pid, + int cpu, + int group_fd, + unsigned long flags) +{ + return syscall(__NR_perf_event_open, hw_event, pid, cpu, + group_fd, flags); +} + +int +virPerfCmtEnable(pid_t pid, + virPerfPtr perf) +{ + struct perf_event_attr cmt_attr; + int event_type; + FILE *fp = NULL; + virPerfEventPtr event = NULL; + + fp = fopen("/sys/devices/intel_cqm/type", "r"); + if (!fp) { + virReportSystemError(errno, "%s", + _("CMT is not available on this host")); + return -1; + } + if (fscanf(fp, "%d", &event_type) != 1) { + virReportSystemError(errno, "%s", + _("Unable to read event type file.")); + VIR_FORCE_FCLOSE(fp); + return -1; + } + VIR_FORCE_FCLOSE(fp); + + event = virPerfEventOfType(perf, VIR_PERF_EVENT_CMT); + + memset(&cmt_attr, 0, sizeof(struct perf_event_attr)); + cmt_attr.size = sizeof(struct perf_event_attr); + cmt_attr.type = event_type; + cmt_attr.config = 1; + cmt_attr.inherit = 1; + cmt_attr.disabled = 1; + cmt_attr.enable_on_exec = 0; + + event->fd = sys_perf_event_open(&cmt_attr, pid, -1, -1, 0); + if (event->fd < 0) { + virReportSystemError(errno, + _("Unable to open perf type=%d for pid=%d"), + event_type, pid); + return -1; + } + + if (ioctl(event->fd, PERF_EVENT_IOC_ENABLE) < 0) { + virReportSystemError(errno, "%s", + _("Unable to enable perf event for CMT")); + return -1; + } + + event->enabled = true; + return 0; +} + +int +virPerfEventDisable(int type, + virPerfPtr perf) +{ + virPerfEventPtr event = virPerfEventOfType(perf, type); + + if (event && ioctl(event->fd, PERF_EVENT_IOC_DISABLE) < 0) { + virReportSystemError(errno, "%s", + _("Unable to disable perf event for CMT")); + return -1; + } + + event->enabled = false; + VIR_FORCE_CLOSE(event->fd); + return 0; +} + +#else /* !VIR_PERF_SUPPORTED */ +int +virPerfInitialize(virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return 0; +} + +void +virPerfFree(virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); +} + +virPerfEventPtr +virPerfEventOfType(virPerfPtr perf ATTRIBUTE_UNUSED, + int type ATTRIBUTE_UNUSED) +{ + return NULL; +} + +int +virPerfCmtEnable(pid_t pid ATTRIBUTE_UNUSED, + virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return -1; +} + +int +virPerfEventDisable(int event ATTRIBUTE_UNUSED, + virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return -1; +} + +#endif /* !VIR_PERF_SUPPORTED */ diff --git a/src/util/virperf.h b/src/util/virperf.h new file mode 100644 index 0000000..bab6597 --- /dev/null +++ b/src/util/virperf.h @@ -0,0 +1,60 @@ +/* + * virperf.h: methods for managing perf events + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Ren Qiaowei <qiaowei.ren@intel.com> + */ + +#ifndef __VIR_PERF_H__ +# define __VIR_PERF_H__ + +# include "virutil.h" + +enum { + VIR_PERF_EVENT_CMT, + + VIR_PERF_EVENT_LAST +}; + +VIR_ENUM_DECL(virPerfEvent); + +struct virPerfEvent { + int type; + int fd; + bool enabled; +}; +typedef struct virPerfEvent *virPerfEventPtr; + +struct virPerf { + struct virPerfEvent events[VIR_PERF_EVENT_LAST]; +}; +typedef struct virPerf *virPerfPtr; + +int virPerfInitialize(virPerfPtr *perf); + +void virPerfFree(virPerfPtr *perf); + +virPerfEventPtr virPerfEventOfType(virPerfPtr perf, + int type); + +int virPerfCmtEnable(pid_t pid, + virPerfPtr perf); + +int virPerfEventDisable(int type, + virPerfPtr perf); + +#endif /* __VIR_PERF_H__ */ -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:45 +0800, Qiaowei Ren wrote:
This patch implement a set of interfaces for perf event. Based on these interfaces, we can implement internal driver API for perf, and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/virterror.h | 2 + src/Makefile.am | 1 + src/libvirt_private.syms | 8 ++ src/util/virerror.c | 2 + src/util/virperf.c | 222 ++++++++++++++++++++++++++++++++++++++++++++ src/util/virperf.h | 60 ++++++++++++ 6 files changed, 295 insertions(+) create mode 100644 src/util/virperf.c create mode 100644 src/util/virperf.h
diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index f716cb9..59dd8e7 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -128,6 +128,8 @@ typedef enum { VIR_FROM_THREAD = 61, /* Error from thread utils */ VIR_FROM_ADMIN = 62, /* Error from admin backend */
Extra empty line.
+ VIR_FROM_PERF = 63, /* Error from perf */ + # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST # endif diff --git a/src/Makefile.am b/src/Makefile.am index 99b4993..afc6cf0 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -98,6 +98,7 @@ UTIL_SOURCES = \ util/virauthconfig.c util/virauthconfig.h \ util/virbitmap.c util/virbitmap.h \ util/virbuffer.c util/virbuffer.h \ + util/virperf.c util/virperf.h \ util/vircgroup.c util/vircgroup.h util/vircgrouppriv.h \ util/virclosecallbacks.c util/virclosecallbacks.h \ util/vircommand.c util/vircommand.h util/vircommandpriv.h \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a835f18..f0e2bd5 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1988,6 +1988,14 @@ virPCIGetVirtualFunctions; virPCIIsVirtualFunction;
+# util/vircgroup.h
s/vircgroup/virperf/
+virPerfInitialize; +virPerfFree; +virPerfEventOfType; +virPerfCmtEnable; +virPerfEventDisable; + + # util/virpidfile.h virPidFileAcquire; virPidFileAcquirePath; diff --git a/src/util/virerror.c b/src/util/virerror.c index 6dc05f4..12a5218 100644 --- a/src/util/virerror.c +++ b/src/util/virerror.c @@ -134,6 +134,8 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST, "Polkit", /* 60 */ "Thread jobs", "Admin Interface", +
Extra empty line.
+ "Perf", )
diff --git a/src/util/virperf.c b/src/util/virperf.c new file mode 100644 index 0000000..9c71858 --- /dev/null +++ b/src/util/virperf.c @@ -0,0 +1,222 @@ +/* + * virperf.c: methods for managing perf events + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Ren Qiaowei <qiaowei.ren@intel.com> + */ +#include <config.h> + +#include <sys/ioctl.h> +#if defined HAVE_SYS_SYSCALL_H +# include <sys/syscall.h> +#endif + +#include "virperf.h" +#include "viralloc.h" +#include "virerror.h" +#include "virlog.h" +#include "virfile.h" +#include "virstring.h" +#include "virtypedparam.h" + +VIR_LOG_INIT("util.perf"); + +#define VIR_FROM_THIS VIR_FROM_PERF + +#if defined(__linux__) && defined(HAVE_SYS_SYSCALL_H) +# define VIR_PERF_SUPPORTED +#endif + +VIR_ENUM_IMPL(virPerfEvent, VIR_PERF_EVENT_LAST, + "cmt"); + + +#ifdef VIR_PERF_SUPPORTED
Why don't you use defined(__linux__) && defined(HAVE_SYS_SYSCALL_H) directly here?
+ +#include <linux/perf_event.h>
This should be "# include <linux/perf_event.h>".
+ +int +virPerfInitialize(virPerfPtr *perf)
We usually create such functions with virPerfPtr virPerfNew(void) prototype. You can then use it if (!(perf = virPerfNew())) goto error;
+{ + size_t i; + + if (VIR_ALLOC((*perf)) < 0) + return -1; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + (*perf)->events[i].type = i; + (*perf)->events[i].fd = -1; + (*perf)->events[i].enabled = false; + } + + return 0; +}
+ +void +virPerfFree(virPerfPtr *perf)
void virPerFree(virPerfPtr perf) is the prototype we usually use for this kind of functions.
+{ + size_t i; + + if (*perf == NULL) + return; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + if ((*perf)->events[i].enabled) + virPerfEventDisable(i, *perf); + } + + VIR_FREE(*perf); +} + +virPerfEventPtr +virPerfEventOfType(virPerfPtr perf, + int type)
This is a private API, using enum type [1] is OK. I think virPerfGetEvent would be a better name for this API.
+{ + size_t i; + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + if (perf->events[i].type == type) + return &perf->events[i]; + } + + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Event '%d' is not supported"), + type); + + return NULL; +}
The perf->events array is indexed by event type so why do we walk through all of them instead of just: if (type >= VIR_PERF_EVENT_LAST) { virReportError(VIR_ERR_INTERNAL_ERROR, ...); return NULL; } return perf->events + type;
+ +static int +sys_perf_event_open(struct perf_event_attr *hw_event, + pid_t pid, + int cpu, + int group_fd, + unsigned long flags) +{ + return syscall(__NR_perf_event_open, hw_event, pid, cpu, + group_fd, flags); +}
The name of this wrapper does not really match the name convention used for all other functions in this file. But, I don't see a real reason for this one-line wrapper since it is only used once.
+ +int +virPerfCmtEnable(pid_t pid, + virPerfPtr perf)
I think 'perf' should always be the first parameter of all virPerf functions, since it is the object all the functions operate on.
From a caller perspective, I think it would be a bit better to create a generic int virPerfEventEnable(perf, type, pid) entry point, and make virPerfCmtEnable static. Unless each type requires different input data.
+{ + struct perf_event_attr cmt_attr; + int event_type; + FILE *fp = NULL; + virPerfEventPtr event = NULL; + + fp = fopen("/sys/devices/intel_cqm/type", "r"); + if (!fp) { + virReportSystemError(errno, "%s", + _("CMT is not available on this host")); + return -1; + } + if (fscanf(fp, "%d", &event_type) != 1) { + virReportSystemError(errno, "%s", + _("Unable to read event type file.")); + VIR_FORCE_FCLOSE(fp); + return -1; + } + VIR_FORCE_FCLOSE(fp); + + event = virPerfEventOfType(perf, VIR_PERF_EVENT_CMT); + + memset(&cmt_attr, 0, sizeof(struct perf_event_attr)); + cmt_attr.size = sizeof(struct perf_event_attr);
Using sizeof(cmt_attr) instead of sizeof(struct perf_event_attr) is a bit more robust and it is the preferred way in libvirt.
+ cmt_attr.type = event_type; + cmt_attr.config = 1; + cmt_attr.inherit = 1; + cmt_attr.disabled = 1; + cmt_attr.enable_on_exec = 0; + + event->fd = sys_perf_event_open(&cmt_attr, pid, -1, -1, 0); + if (event->fd < 0) { + virReportSystemError(errno, + _("Unable to open perf type=%d for pid=%d"), + event_type, pid); + return -1; + } + + if (ioctl(event->fd, PERF_EVENT_IOC_ENABLE) < 0) { + virReportSystemError(errno, "%s", + _("Unable to enable perf event for CMT"));
You should close and reset event->fd here.
+ return -1; + } + + event->enabled = true; + return 0;
Overall, this function would benefit from introducing error and cleanup labels to group all cleanups at one place. And you should avoid touching event structure until you know everything succeeded to make the cleanups cleaner.
+} + +int +virPerfEventDisable(int type, + virPerfPtr perf) +{ + virPerfEventPtr event = virPerfEventOfType(perf, type);
You should return -1 if event is NULL...
+ + if (event && ioctl(event->fd, PERF_EVENT_IOC_DISABLE) < 0) { + virReportSystemError(errno, "%s", + _("Unable to disable perf event for CMT")); + return -1; + } + + event->enabled = false;
otherwise libvirt will crash here.
+ VIR_FORCE_CLOSE(event->fd); + return 0; +} + +#else /* !VIR_PERF_SUPPORTED */ +int +virPerfInitialize(virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return 0; +} + +void +virPerfFree(virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); +}
At least virPerfNew and virPerfFree should work even if perf is not supported so that the caller can initialize and free its structures without thinking about this.
+ +virPerfEventPtr +virPerfEventOfType(virPerfPtr perf ATTRIBUTE_UNUSED, + int type ATTRIBUTE_UNUSED) +{
No error?
+ return NULL; +} + +int +virPerfCmtEnable(pid_t pid ATTRIBUTE_UNUSED, + virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return -1; +} + +int +virPerfEventDisable(int event ATTRIBUTE_UNUSED, + virPerfPtr perf ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENXIO, "%s", + _("Perf not supported on this platform")); + return -1; +} + +#endif /* !VIR_PERF_SUPPORTED */ diff --git a/src/util/virperf.h b/src/util/virperf.h new file mode 100644 index 0000000..bab6597 --- /dev/null +++ b/src/util/virperf.h @@ -0,0 +1,60 @@ +/* + * virperf.h: methods for managing perf events + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + * + * Authors: + * Ren Qiaowei <qiaowei.ren@intel.com> + */ + +#ifndef __VIR_PERF_H__ +# define __VIR_PERF_H__ + +# include "virutil.h" + +enum { + VIR_PERF_EVENT_CMT, + + VIR_PERF_EVENT_LAST +};
This enum should be a typedef so that it can be used for declaring variables [1].
+ +VIR_ENUM_DECL(virPerfEvent); + +struct virPerfEvent { + int type; + int fd; + bool enabled; +}; +typedef struct virPerfEvent *virPerfEventPtr; + +struct virPerf { + struct virPerfEvent events[VIR_PERF_EVENT_LAST]; +}; +typedef struct virPerf *virPerfPtr;
We usually try to hide the internals of such structures in .c and provide accessors.
+ +int virPerfInitialize(virPerfPtr *perf); + +void virPerfFree(virPerfPtr *perf); + +virPerfEventPtr virPerfEventOfType(virPerfPtr perf, + int type); + +int virPerfCmtEnable(pid_t pid, + virPerfPtr perf); + +int virPerfEventDisable(int type, + virPerfPtr perf); + +#endif /* __VIR_PERF_H__ */ -- 1.9.1
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, Nov 24, 2015 at 13:23:49 +0100, Jiri Denemark wrote:
On Tue, Nov 17, 2015 at 16:00:45 +0800, Qiaowei Ren wrote:
This patch implement a set of interfaces for perf event. Based on these interfaces, we can implement internal driver API for perf, and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> ... diff --git a/src/util/virperf.c b/src/util/virperf.c new file mode 100644 index 0000000..9c71858 --- /dev/null +++ b/src/util/virperf.c ... + +int +virPerfCmtEnable(pid_t pid, + virPerfPtr perf)
I think 'perf' should always be the first parameter of all virPerf functions, since it is the object all the functions operate on.
From a caller perspective, I think it would be a bit better to create a generic int virPerfEventEnable(perf, type, pid) entry point, and make virPerfCmtEnable static. Unless each type requires different input data.
Actually, you can just pass vm to virPerfEventEnable, that should given enough input data to any perf event. Jirka

On Tue, Nov 24, 2015 at 01:44:53PM +0100, Jiri Denemark wrote:
On Tue, Nov 24, 2015 at 13:23:49 +0100, Jiri Denemark wrote:
On Tue, Nov 17, 2015 at 16:00:45 +0800, Qiaowei Ren wrote:
This patch implement a set of interfaces for perf event. Based on these interfaces, we can implement internal driver API for perf, and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> ... diff --git a/src/util/virperf.c b/src/util/virperf.c new file mode 100644 index 0000000..9c71858 --- /dev/null +++ b/src/util/virperf.c ... + +int +virPerfCmtEnable(pid_t pid, + virPerfPtr perf)
I think 'perf' should always be the first parameter of all virPerf functions, since it is the object all the functions operate on.
From a caller perspective, I think it would be a bit better to create a generic int virPerfEventEnable(perf, type, pid) entry point, and make virPerfCmtEnable static. Unless each type requires different input data.
Actually, you can just pass vm to virPerfEventEnable, that should given enough input data to any perf event.
The virPerfEventEnable code lives in util which is supposed to be isolated from any conf classes, so passing 'vm' is not appropriate Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

This patch implement the internal driver API for perf event into qemu driver. In addition, this patch extend virDomainListGetStats API to get the statistics for perf event. To do so, we add a 'VIR_DOMAIN_STATS_PERF' enum to causes reporting of all previously enabled perf events. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 6 ++ 4 files changed, 191 insertions(+) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 69965e6..5e74b29 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1771,6 +1771,7 @@ typedef enum { VIR_DOMAIN_STATS_VCPU = (1 << 3), /* return domain virtual CPU info */ VIR_DOMAIN_STATS_INTERFACE = (1 << 4), /* return domain interfaces info */ VIR_DOMAIN_STATS_BLOCK = (1 << 5), /* return domain block info */ + VIR_DOMAIN_STATS_PERF = (1 << 6), /* return domain perf event info */ } virDomainStatsTypes; typedef enum { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 4be998c..0348d4a 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -26,6 +26,7 @@ # include "virthread.h" # include "vircgroup.h" +# include "virperf.h" # include "domain_addr.h" # include "domain_conf.h" # include "snapshot_conf.h" @@ -190,6 +191,8 @@ struct _qemuDomainObjPrivate { virCgroupPtr cgroup; + virPerfPtr perf; + virCond unplugFinished; /* signals that unpluggingDevice was unplugged */ const char *unpluggingDevice; /* alias of the device that is being unplugged */ char **qemuDevices; /* NULL-terminated list of devices aliases known to QEMU */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 92a9961..5ad4d79 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -98,6 +98,7 @@ #include "virhostdev.h" #include "domain_capabilities.h" #include "vircgroup.h" +#include "virperf.h" #include "virnuma.h" #include "dirname.h" #include "network/bridge_driver.h" @@ -113,6 +114,8 @@ VIR_LOG_INIT("qemu.qemu_driver"); #define QEMU_NB_NUMA_PARAM 2 +#define QEMU_NB_PERF_PARAM VIR_PERF_EVENT_LAST + #define QEMU_SCHED_MIN_PERIOD 1000LL #define QEMU_SCHED_MAX_PERIOD 1000000LL #define QEMU_SCHED_MIN_QUOTA 1000LL @@ -10244,6 +10247,106 @@ qemuDomainGetNumaParameters(virDomainPtr dom, } static int +qemuDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + bool enabled; + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + NULL) < 0) + return -1; + + if (!(vm = qemuDomObjFromDomain(dom))) + return -1; + + priv = vm->privateData; + + if (virDomainGetPerfEventsEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + enabled = params[i].value.b; + + if (STREQ(param->field, VIR_DOMAIN_PERF_CMT)) { + if (!enabled && virPerfEventDisable(VIR_PERF_EVENT_CMT, priv->perf)) { + virReportError(VIR_ERR_NO_SUPPORT, + _("can't disable perf event: %s"), + VIR_DOMAIN_PERF_CMT); + goto cleanup; + } + if (enabled && virPerfCmtEnable(vm->pid, priv->perf)) { + virReportError(VIR_ERR_NO_SUPPORT, + _("can't enable perf event: %s"), + VIR_DOMAIN_PERF_CMT); + goto cleanup; + } + } + } + + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int +qemuDomainGetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + + if (!(vm = qemuDomObjFromDomain(dom))) + goto cleanup; + + priv = vm->privateData; + + if (virDomainGetPerfEventsEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if ((*nparams) == 0) { + *nparams = QEMU_NB_PERF_PARAM; + ret = 0; + goto cleanup; + } + + for (i = 0; i < QEMU_NB_PERF_PARAM && i < *nparams; i++) { + virMemoryParameterPtr param = ¶ms[i]; + + switch (i) { + case VIR_PERF_EVENT_CMT: + if (virTypedParameterAssign(param, VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + priv->perf->events[VIR_PERF_EVENT_CMT].enabled) < 0) + goto cleanup; + break; + default: + break; + } + } + + if (*nparams > QEMU_NB_PERF_PARAM) + *nparams = QEMU_NB_PERF_PARAM; + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, unsigned long long period, long long quota) { @@ -19419,6 +19522,81 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver, #undef QEMU_ADD_COUNT_PARAM +static int +qemuDomainGetStatsPerfCmt(virPerfEventPtr eventPtr, + virDomainStatsRecordPtr record, + int *maxparams) +{ + FILE *fd; + unsigned long long cache = 0; + int scaling_factor = 0; + + if (eventPtr->fd <= 0 || !eventPtr->enabled) + return -1; + + if (read(eventPtr->fd, &cache, sizeof(uint64_t)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to read cache data")); + return -1; + } + + fd = fopen("/sys/devices/intel_cqm/events/llc_occupancy.scale", "r"); + if (!fd) { + virReportSystemError(errno, "%s", + _("Unable to open CMT scale file")); + return -1; + } + if (fscanf(fd, "%d", &scaling_factor) != 1) { + virReportSystemError(errno, "%s", + _("Unable to read CMT scale file")); + VIR_FORCE_FCLOSE(fd); + return -1; + } + VIR_FORCE_FCLOSE(fd); + + cache *= scaling_factor; + + if (virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "perf.cache", + cache) < 0) + return -1; + + return 0; +} + +static int +qemuDomainGetStatsPerf(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED) +{ + size_t i; + qemuDomainObjPrivatePtr priv = dom->privateData; + int ret = -1; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + virPerfEventPtr event = virPerfEventOfType(priv->perf, i); + if (!event->enabled) + continue; + + switch (i) { + case VIR_PERF_EVENT_CMT: + if (qemuDomainGetStatsPerfCmt(event, record, maxparams)) + goto cleanup; + break; + default: + break; + } + } + + ret = 0; + cleanup: + return ret; +} + typedef int (*qemuDomainGetStatsFunc)(virQEMUDriverPtr driver, virDomainObjPtr dom, @@ -19439,6 +19617,7 @@ static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = { { qemuDomainGetStatsVcpu, VIR_DOMAIN_STATS_VCPU, false }, { qemuDomainGetStatsInterface, VIR_DOMAIN_STATS_INTERFACE, false }, { qemuDomainGetStatsBlock, VIR_DOMAIN_STATS_BLOCK, true }, + { qemuDomainGetStatsPerf, VIR_DOMAIN_STATS_PERF, false }, { NULL, 0, false } }; @@ -20227,6 +20406,8 @@ static virHypervisorDriver qemuHypervisorDriver = { .domainMigrateFinish3 = qemuDomainMigrateFinish3, /* 0.9.2 */ .domainMigrateConfirm3 = qemuDomainMigrateConfirm3, /* 0.9.2 */ .domainSendKey = qemuDomainSendKey, /* 0.9.4 */ + .domainGetPerfEvents = qemuDomainGetPerfEvents, /* 1.2.23 */ + .domainSetPerfEvents = qemuDomainSetPerfEvents, /* 1.2.23 */ .domainBlockJobAbort = qemuDomainBlockJobAbort, /* 0.9.4 */ .domainGetBlockJobInfo = qemuDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 4545f77..2d81069 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4680,6 +4680,10 @@ int qemuProcessStart(virConnectPtr conn, if (qemuSetupCgroup(driver, vm, nnicindexes, nicindexes) < 0) goto error; + VIR_DEBUG("Initializing perf event"); + if (virPerfInitialize(&(priv->perf)) < 0) + goto cleanup; + /* This must be done after cgroup placement to avoid resetting CPU * affinity */ if (!vm->def->cputune.emulatorpin && @@ -5187,6 +5191,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, } virCgroupFree(&priv->cgroup); + virPerfFree(&(priv->perf)); + qemuProcessRemoveDomainStatus(driver, vm); /* Remove VNC and Spice ports from port reservation bitmap, but only if -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:46 +0800, Qiaowei Ren wrote:
This patch implement the internal driver API for perf event into qemu driver.
In addition, this patch extend virDomainListGetStats API to get the statistics for perf event. To do so, we add a 'VIR_DOMAIN_STATS_PERF' enum to causes reporting of all previously enabled perf events.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_process.c | 6 ++ 4 files changed, 191 insertions(+)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 69965e6..5e74b29 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1771,6 +1771,7 @@ typedef enum { VIR_DOMAIN_STATS_VCPU = (1 << 3), /* return domain virtual CPU info */ VIR_DOMAIN_STATS_INTERFACE = (1 << 4), /* return domain interfaces info */ VIR_DOMAIN_STATS_BLOCK = (1 << 5), /* return domain block info */ + VIR_DOMAIN_STATS_PERF = (1 << 6), /* return domain perf event info */ } virDomainStatsTypes;
typedef enum { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 4be998c..0348d4a 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -26,6 +26,7 @@
# include "virthread.h" # include "vircgroup.h" +# include "virperf.h" # include "domain_addr.h" # include "domain_conf.h" # include "snapshot_conf.h" @@ -190,6 +191,8 @@ struct _qemuDomainObjPrivate {
virCgroupPtr cgroup;
+ virPerfPtr perf; + virCond unplugFinished; /* signals that unpluggingDevice was unplugged */ const char *unpluggingDevice; /* alias of the device that is being unplugged */ char **qemuDevices; /* NULL-terminated list of devices aliases known to QEMU */ diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 92a9961..5ad4d79 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -98,6 +98,7 @@ #include "virhostdev.h" #include "domain_capabilities.h" #include "vircgroup.h" +#include "virperf.h" #include "virnuma.h" #include "dirname.h" #include "network/bridge_driver.h" @@ -113,6 +114,8 @@ VIR_LOG_INIT("qemu.qemu_driver");
#define QEMU_NB_NUMA_PARAM 2
+#define QEMU_NB_PERF_PARAM VIR_PERF_EVENT_LAST +
VIR_PERF_EVENT_LAST is good enough, no need to define another symbol for it.
#define QEMU_SCHED_MIN_PERIOD 1000LL #define QEMU_SCHED_MAX_PERIOD 1000000LL #define QEMU_SCHED_MIN_QUOTA 1000LL @@ -10244,6 +10247,106 @@ qemuDomainGetNumaParameters(virDomainPtr dom, }
static int +qemuDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + bool enabled; + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + NULL) < 0) + return -1;
Use virTypedParamsCheck and define the data for it in virperf.h so that you don't need to change the code here if new event type is added.
+ + if (!(vm = qemuDomObjFromDomain(dom))) + return -1; + + priv = vm->privateData; + + if (virDomainGetPerfEventsEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + for (i = 0; i < nparams; i++) { + virTypedParameterPtr param = ¶ms[i]; + enabled = params[i].value.b; + + if (STREQ(param->field, VIR_DOMAIN_PERF_CMT)) {
Just use the appropriate FromString to convert the param field into the perf type enum and call virPerfEvent{Disable,Enable} depending on the value. Thus no change will be needed here if you add more perf events.
+ if (!enabled && virPerfEventDisable(VIR_PERF_EVENT_CMT, priv->perf)) { + virReportError(VIR_ERR_NO_SUPPORT, + _("can't disable perf event: %s"), + VIR_DOMAIN_PERF_CMT); + goto cleanup; + } + if (enabled && virPerfCmtEnable(vm->pid, priv->perf)) { + virReportError(VIR_ERR_NO_SUPPORT, + _("can't enable perf event: %s"), + VIR_DOMAIN_PERF_CMT); + goto cleanup; + }
Both virPerfEventDisable and virPerfEventEnable should have reported the error, no need to mask it with another one.
+ } + } + + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int +qemuDomainGetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int * nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + + if (!(vm = qemuDomObjFromDomain(dom))) + goto cleanup; + + priv = vm->privateData; + + if (virDomainGetPerfEventsEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if ((*nparams) == 0) { + *nparams = QEMU_NB_PERF_PARAM; + ret = 0; + goto cleanup; + } + + for (i = 0; i < QEMU_NB_PERF_PARAM && i < *nparams; i++) { + virMemoryParameterPtr param = ¶ms[i];
Comments similar to SetPerfEvents apply here as well.
+ + switch (i) { + case VIR_PERF_EVENT_CMT: + if (virTypedParameterAssign(param, VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + priv->perf->events[VIR_PERF_EVENT_CMT].enabled) < 0)
This should rather you an access function.
+ goto cleanup; + break; + default: + break; + } + } + + if (*nparams > QEMU_NB_PERF_PARAM) + *nparams = QEMU_NB_PERF_PARAM; + ret = 0; + + cleanup: + virDomainObjEndAPI(&vm); + return ret; +} + +static int qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, unsigned long long period, long long quota) { @@ -19419,6 +19522,81 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr driver,
#undef QEMU_ADD_COUNT_PARAM
+static int +qemuDomainGetStatsPerfCmt(virPerfEventPtr eventPtr, + virDomainStatsRecordPtr record, + int *maxparams) +{ + FILE *fd; + unsigned long long cache = 0; + int scaling_factor = 0; + + if (eventPtr->fd <= 0 || !eventPtr->enabled) + return -1; + + if (read(eventPtr->fd, &cache, sizeof(uint64_t)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to read cache data")); + return -1; + } + + fd = fopen("/sys/devices/intel_cqm/events/llc_occupancy.scale", "r");
Will this value change in time? If not, we could just read it and store within virPerfEvent when enabling it. It may be necessary to turn virPerfEvent into union to do that.
+ if (!fd) { + virReportSystemError(errno, "%s", + _("Unable to open CMT scale file")); + return -1; + } + if (fscanf(fd, "%d", &scaling_factor) != 1) { + virReportSystemError(errno, "%s", + _("Unable to read CMT scale file")); + VIR_FORCE_FCLOSE(fd); + return -1; + } + VIR_FORCE_FCLOSE(fd);
Anyway, it looks like this would deserve to be moved to virperf.c (and use cleanup lables).
+ cache *= scaling_factor; + + if (virTypedParamsAddULLong(&record->params, + &record->nparams, + maxparams, + "perf.cache", + cache) < 0) + return -1; + + return 0; +} + +static int +qemuDomainGetStatsPerf(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED) +{ + size_t i; + qemuDomainObjPrivatePtr priv = dom->privateData; + int ret = -1; + + for (i = 0; i < VIR_PERF_EVENT_LAST; i++) { + virPerfEventPtr event = virPerfEventOfType(priv->perf, i); + if (!event->enabled) + continue; + + switch (i) {
Typecast 'i' to the perf event type...
+ case VIR_PERF_EVENT_CMT: + if (qemuDomainGetStatsPerfCmt(event, record, maxparams)) + goto cleanup; + break; + default: + break;
and remove this default branch so that the compiler reminds you to change this code in case new perf event is added.
+ } + } + + ret = 0; + cleanup: + return ret; +} + typedef int (*qemuDomainGetStatsFunc)(virQEMUDriverPtr driver, virDomainObjPtr dom, @@ -19439,6 +19617,7 @@ static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = { { qemuDomainGetStatsVcpu, VIR_DOMAIN_STATS_VCPU, false }, { qemuDomainGetStatsInterface, VIR_DOMAIN_STATS_INTERFACE, false }, { qemuDomainGetStatsBlock, VIR_DOMAIN_STATS_BLOCK, true }, + { qemuDomainGetStatsPerf, VIR_DOMAIN_STATS_PERF, false }, { NULL, 0, false } };
@@ -20227,6 +20406,8 @@ static virHypervisorDriver qemuHypervisorDriver = { .domainMigrateFinish3 = qemuDomainMigrateFinish3, /* 0.9.2 */ .domainMigrateConfirm3 = qemuDomainMigrateConfirm3, /* 0.9.2 */ .domainSendKey = qemuDomainSendKey, /* 0.9.4 */ + .domainGetPerfEvents = qemuDomainGetPerfEvents, /* 1.2.23 */ + .domainSetPerfEvents = qemuDomainSetPerfEvents, /* 1.2.23 */ .domainBlockJobAbort = qemuDomainBlockJobAbort, /* 0.9.4 */ .domainGetBlockJobInfo = qemuDomainGetBlockJobInfo, /* 0.9.4 */ .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 4545f77..2d81069 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4680,6 +4680,10 @@ int qemuProcessStart(virConnectPtr conn, if (qemuSetupCgroup(driver, vm, nnicindexes, nicindexes) < 0) goto error;
+ VIR_DEBUG("Initializing perf event"); + if (virPerfInitialize(&(priv->perf)) < 0) + goto cleanup;
This is why virPerfNew should not return a failure in case perf events are not supported.
+ /* This must be done after cgroup placement to avoid resetting CPU * affinity */ if (!vm->def->cputune.emulatorpin && @@ -5187,6 +5191,8 @@ void qemuProcessStop(virQEMUDriverPtr driver, } virCgroupFree(&priv->cgroup);
+ virPerfFree(&(priv->perf)); + qemuProcessRemoveDomainStatus(driver, vm);
/* Remove VNC and Spice ports from port reservation bitmap, but only if
Jirka

-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 9:01 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 6/8] qemu_driver: add support to perf event
On Tue, Nov 17, 2015 at 16:00:46 +0800, Qiaowei Ren wrote:
This patch implement the internal driver API for perf event into qemu driver.
In addition, this patch extend virDomainListGetStats API to get the statistics for perf event. To do so, we add a 'VIR_DOMAIN_STATS_PERF' enum to causes reporting of all previously enabled perf events.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++++++++++ #define QEMU_SCHED_MIN_PERIOD 1000LL #define QEMU_SCHED_MAX_PERIOD 1000000LL #define QEMU_SCHED_MIN_QUOTA 1000LL @@ -10244,6 +10247,106 @@ qemuDomainGetNumaParameters(virDomainPtr dom, }
static int +qemuDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + bool enabled; + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + NULL) < 0) + return -1;
Use virTypedParamsCheck and define the data for it in virperf.h so that you don't need to change the code here if new event type is added.
Jirka, I checked the difference between virTypedParamsValidate and virTypedParamsCheck, and found that virTypedParamsCheck just check whether the param is valid through param name, but virTypedParamsValidate will check whether the param type is right. In addition, virTypedParamsCheck is just used in three functions in libvirt-domain.c, and all driver APIs use virTypedParamsValidate to validate the parms like this patch. What do you think about it? Thanks, Qiaowei

On Mon, Nov 30, 2015 at 08:17:05 +0000, Ren, Qiaowei wrote:
-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 9:01 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 6/8] qemu_driver: add support to perf event
On Tue, Nov 17, 2015 at 16:00:46 +0800, Qiaowei Ren wrote:
This patch implement the internal driver API for perf event into qemu driver.
In addition, this patch extend virDomainListGetStats API to get the statistics for perf event. To do so, we add a 'VIR_DOMAIN_STATS_PERF' enum to causes reporting of all previously enabled perf events.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++++++++++ #define QEMU_SCHED_MIN_PERIOD 1000LL #define QEMU_SCHED_MAX_PERIOD 1000000LL #define QEMU_SCHED_MIN_QUOTA 1000LL @@ -10244,6 +10247,106 @@ qemuDomainGetNumaParameters(virDomainPtr dom, }
static int +qemuDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams) +{ + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + bool enabled; + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + NULL) < 0) + return -1;
Use virTypedParamsCheck and define the data for it in virperf.h so that you don't need to change the code here if new event type is added.
Jirka, I checked the difference between virTypedParamsValidate and virTypedParamsCheck, and found that virTypedParamsCheck just check whether the param is valid through param name, but virTypedParamsValidate will check whether the param type is right.
Oh, sorry for the confusion then. I actually wanted the code to be something like if (virTypedParamsValidate(params, nparams, VIR_PERF_PARAMS) < 0) return -1; and VIR_PERF_PARAMS would be a macro defined in virparams.h: # define VIR_PERF_PARAMS \ VIR_DOMAIN_PERF_CMT, VIR_TYPED_PARAM_BOOLEAN, \ NULL Similar to QEMU_MIGRATION_PARAMETERS and few others. Jirka

-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Monday, November 30, 2015 7:09 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 6/8] qemu_driver: add support to perf event
On Mon, Nov 30, 2015 at 08:17:05 +0000, Ren, Qiaowei wrote:
-----Original Message----- From: Jiri Denemark [mailto:jdenemar@redhat.com] Sent: Tuesday, November 24, 2015 9:01 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 6/8] qemu_driver: add support to perf event
On Tue, Nov 17, 2015 at 16:00:46 +0800, Qiaowei Ren wrote:
This patch implement the internal driver API for perf event into qemu driver.
In addition, this patch extend virDomainListGetStats API to get the statistics for perf event. To do so, we add a
'VIR_DOMAIN_STATS_PERF'
enum to causes reporting of all previously enabled perf events.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- include/libvirt/libvirt-domain.h | 1 + src/qemu/qemu_domain.h | 3 + src/qemu/qemu_driver.c | 181 +++++++++++++++++++++++++++++++++++++++ #define QEMU_SCHED_MIN_PERIOD 1000LL #define QEMU_SCHED_MAX_PERIOD 1000000LL #define QEMU_SCHED_MIN_QUOTA 1000LL @@ -10244,6 +10247,106 @@ qemuDomainGetNumaParameters(virDomainPtr dom, }
static int +qemuDomainSetPerfEvents(virDomainPtr dom, + virTypedParameterPtr params, + int nparams) { + size_t i; + virDomainObjPtr vm = NULL; + qemuDomainObjPrivatePtr priv; + int ret = -1; + bool enabled; + + if (virTypedParamsValidate(params, nparams, + VIR_DOMAIN_PERF_CMT, + VIR_TYPED_PARAM_BOOLEAN, + NULL) < 0) + return -1;
Use virTypedParamsCheck and define the data for it in virperf.h so that you don't need to change the code here if new event type is added.
Jirka, I checked the difference between virTypedParamsValidate and virTypedParamsCheck, and found that virTypedParamsCheck just check whether the param is valid through param name, but virTypedParamsValidate will check whether the param type is right.
Oh, sorry for the confusion then. I actually wanted the code to be something like
if (virTypedParamsValidate(params, nparams, VIR_PERF_PARAMS) < 0) return -1;
and VIR_PERF_PARAMS would be a macro defined in virparams.h:
# define VIR_PERF_PARAMS \ VIR_DOMAIN_PERF_CMT, VIR_TYPED_PARAM_BOOLEAN, \ NULL
Similar to QEMU_MIGRATION_PARAMETERS and few others.
Got it! Thanks, Jirka. Qiaowei

This patch extend domstats command to match extended virDomainListGetStats API in previous patch. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- tools/virsh-domain-monitor.c | 7 +++++++ tools/virsh.pod | 7 +++++-- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index d4e500b..33df079 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -2016,6 +2016,10 @@ static const vshCmdOptDef opts_domstats[] = { .type = VSH_OT_BOOL, .help = N_("report domain block device statistics"), }, + {.name = "perf", + .type = VSH_OT_BOOL, + .help = N_("report domain perf event statistics"), + }, {.name = "list-active", .type = VSH_OT_BOOL, .help = N_("list only active domains"), @@ -2127,6 +2131,9 @@ cmdDomstats(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptBool(cmd, "block")) stats |= VIR_DOMAIN_STATS_BLOCK; + if (vshCommandOptBool(cmd, "perf")) + stats |= VIR_DOMAIN_STATS_PERF; + if (vshCommandOptBool(cmd, "list-active")) flags |= VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE; diff --git a/tools/virsh.pod b/tools/virsh.pod index 21ae4a3..935d017 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -845,8 +845,8 @@ I<snapshot-create> for disk snapshots) will accept either target or unique source names printed by this command. =item B<domstats> [I<--raw>] [I<--enforce>] [I<--backing>] [I<--state>] -[I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--block>] -[[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] +[I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--perf>] +[I<--block>] [[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] [I<--list-transient>] [I<--list-running>] [I<--list-paused>] [I<--list-shutoff>] [I<--list-other>]] | [I<domain> ...] @@ -899,6 +899,9 @@ I<--interface> returns: "net.<num>.tx.errs" - number of transmission errors, "net.<num>.tx.drop" - number of transmit packets dropped +I<--perf> returns: +"perf.cache" - the cache usage in Byte currently used + I<--block> returns information about disks associated with each domain. Using the I<--backing> flag extends this information to cover all resources in the backing chain, rather than the default -- 1.9.1

On Tue, Nov 17, 2015 at 16:00:47 +0800, Qiaowei Ren wrote:
This patch extend domstats command to match extended virDomainListGetStats API in previous patch.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- tools/virsh-domain-monitor.c | 7 +++++++ tools/virsh.pod | 7 +++++-- 2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index d4e500b..33df079 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -2016,6 +2016,10 @@ static const vshCmdOptDef opts_domstats[] = { .type = VSH_OT_BOOL, .help = N_("report domain block device statistics"), }, + {.name = "perf", + .type = VSH_OT_BOOL, + .help = N_("report domain perf event statistics"), + }, {.name = "list-active", .type = VSH_OT_BOOL, .help = N_("list only active domains"), @@ -2127,6 +2131,9 @@ cmdDomstats(vshControl *ctl, const vshCmd *cmd) if (vshCommandOptBool(cmd, "block")) stats |= VIR_DOMAIN_STATS_BLOCK;
+ if (vshCommandOptBool(cmd, "perf")) + stats |= VIR_DOMAIN_STATS_PERF; + if (vshCommandOptBool(cmd, "list-active")) flags |= VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE;
diff --git a/tools/virsh.pod b/tools/virsh.pod index 21ae4a3..935d017 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -845,8 +845,8 @@ I<snapshot-create> for disk snapshots) will accept either target or unique source names printed by this command.
=item B<domstats> [I<--raw>] [I<--enforce>] [I<--backing>] [I<--state>] -[I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--block>] -[[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] +[I<--cpu-total>] [I<--balloon>] [I<--vcpu>] [I<--interface>] [I<--perf>] +[I<--block>] [[I<--list-active>] [I<--list-inactive>] [I<--list-persistent>] [I<--list-transient>] [I<--list-running>] [I<--list-paused>] [I<--list-shutoff>] [I<--list-other>]] | [I<domain> ...]
@@ -899,6 +899,9 @@ I<--interface> returns: "net.<num>.tx.errs" - number of transmission errors, "net.<num>.tx.drop" - number of transmit packets dropped
+I<--perf> returns: +"perf.cache" - the cache usage in Byte currently used
I think this feature would deserve to be documented a bit more. It would be nice to provide more details about what perf events are. And there are a lot of caches so describing what cache usage is measured by perf.cache would be useful too. Jirka

This patch add new perf command to enable/disable perf event for a guest domain. For example: $ virsh perf domain --cmt 1 // enable CMT perf event for domain $ virsh perf domain --cmt 0 // disable CMT perf event for domain $ virsh perf domain // list the state (enabled or disabled) of all supported perf event for domain Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> --- tools/virsh-domain.c | 139 +++++++++++++++++++++++++++++++++++++++++++++++++++ tools/virsh.pod | 8 +++ 2 files changed, 147 insertions(+) diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index bd00785..e17747e 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -8890,6 +8890,139 @@ cmdMemtune(vshControl *ctl, const vshCmd *cmd) } /* + * "perf" command + */ +static const vshCmdInfo info_perf[] = { + {.name = "help", + .data = N_("Get or set perf event") + }, + {.name = "desc", + .data = N_("Get or set the current perf events for a guest" + " domain.\n" + " To get the perf events list use following command: \n\n" + " virsh # perf <domain>") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_perf[] = { + {.name = "domain", + .type = VSH_OT_DATA, + .flags = VSH_OFLAG_REQ, + .help = N_("domain name, id or uuid") + }, + {.name = "cmt", + .type = VSH_OT_INT, + .help = N_("CMT event, as integer (default 0, disable)") + }, + {.name = NULL} +}; + +static int +virshPerfGetEventEnabled(vshControl *ctl, const vshCmd *cmd, + const char *name, bool *value) +{ + int ret, tmp; + const char *str; + char *end; + + ret = vshCommandOptString(ctl, cmd, name, &str); + if (ret <= 0) + return ret; + if (virStrToLong_i(str, &end, 0, &tmp) < 0) + return -1; + if (tmp == 1) { + *value = true; + ret = 1; + } else if (tmp == 0) { + *value = false; + ret = 1; + } else { + ret = -1; + } + + return ret; +} + +static bool +cmdPerf(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom; + bool tmpVal; + int nparams = 0; + int maxparams = 0; + int rc; + size_t i; + virTypedParameterPtr params = NULL; + bool ret = false; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + return false; + +#define PARSE_PERF_PARAM(NAME, FIELD) \ + if ((rc = virshPerfGetEventEnabled(ctl, cmd, NAME, &tmpVal)) < 0) { \ + vshError(ctl, _("Unable to parse boolean parameter %s"), NAME); \ + goto cleanup; \ + } \ + if (rc == 1) { \ + if (virTypedParamsAddBoolean(¶ms, &nparams, &maxparams, \ + FIELD, tmpVal) < 0) \ + goto save_error; \ + } \ + + + PARSE_PERF_PARAM("cmt", VIR_DOMAIN_PERF_CMT); + +#undef PARSE_PERF_PARAM + + if (nparams == 0) { + /* get the number of perf events */ + if (virDomainGetPerfEvents(dom, NULL, &nparams) != 0) { + vshError(ctl, "%s", + _("Unable to get number of perf events")); + goto cleanup; + } + + if (nparams == 0) { + /* nothing to output */ + ret = true; + goto cleanup; + } + + params = vshCalloc(ctl, nparams, sizeof(*params)); + if (virDomainGetPerfEvents(dom, params, &nparams) != 0) { + vshError(ctl, "%s", _("Unable to get perf events")); + goto cleanup; + } + for (i = 0; i < nparams; i++) { + if (params[i].type == VIR_TYPED_PARAM_BOOLEAN && + params[i].value.b) { + vshPrint(ctl, "%-15s: %s\n", params[i].field, _("enabled")); + } else { + vshPrint(ctl, "%-15s: %s\n", params[i].field, _("disabled")); + } + } + } else { + if (virDomainSetPerfEvents(dom, params, nparams) != 0) + goto error; + } + + ret = true; + + cleanup: + virTypedParamsFree(params, nparams); + virDomainFree(dom); + return ret; + + save_error: + vshSaveLibvirtError(); + error: + vshError(ctl, "%s", _("Unable to enable/disable perf events")); + goto cleanup; +} + + +/* * "numatune" command */ static const vshCmdInfo info_numatune[] = { @@ -13368,6 +13501,12 @@ const vshCmdDef domManagementCmds[] = { .info = info_memtune, .flags = 0 }, + {.name = "perf", + .handler = cmdPerf, + .opts = opts_perf, + .info = info_perf, + .flags = 0 + }, {.name = "metadata", .handler = cmdMetadata, .opts = opts_metadata, diff --git a/tools/virsh.pod b/tools/virsh.pod index 935d017..c674b58 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -2116,6 +2116,14 @@ The guaranteed minimum memory allocation for the guest. Specifying -1 as a value for these limits is interpreted as unlimited. +=item B<perf> I<domain> [I<--cmt> B<enable-disable>] + +Get the current perf events setting or enable/disable specific perf +event for a guest domain. Currently only QEMU/KVM supports I<--cmt>. + +B<enable-disable> may be 0 or 1. 0 means cmt perf event will be +disabled, and 1 means cmt perf event will be enabled. + =item B<blkiotune> I<domain> [I<--weight> B<weight>] [I<--device-weights> B<device-weights>] [I<--device-read-iops-sec> B<device-read-iops-sec>] -- 1.9.1

On Tue, Nov 17, 2015 at 04:00:40PM +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
I've not done a detailed patch review, but overall I think this design is pretty good now, so thanks for taking time to re-write this. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Tuesday, November 17, 2015 6:47 PM To: Ren, Qiaowei Cc: libvir-list@redhat.com; Jiri Denemark Subject: Re: [PATCH 0/8] Add perf and Intel CMT feature support
On Tue, Nov 17, 2015 at 04:00:40PM +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
I've not done a detailed patch review, but overall I think this design is pretty good now, so thanks for taking time to re-write this.
Daniel, thanks for your feedback. ^-^ Thanks, Qiaowei

On Tue, Nov 17, 2015 at 16:00:40 +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
Yes, we should reenable perf events when reconnecting to running domains. Will we need to remember what events were enabled (in domain status XML) or is it something we can read back from the kernel? Jirka

On Tue, Nov 24, 2015 at 02:24:22PM +0100, Jiri Denemark wrote:
On Tue, Nov 17, 2015 at 16:00:40 +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
Yes, we should reenable perf events when reconnecting to running domains. Will we need to remember what events were enabled (in domain status XML) or is it something we can read back from the kernel?
We need to record it somewhere. I guess this raises the question of whether we should hide it in status XML, or persist the list of desired perf events in the real domain XML, so we can have the option of also having them enabled immediately at startup, without requiring separate API call. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Tuesday, November 24, 2015 9:29 PM To: Ren, Qiaowei; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 0/8] Add perf and Intel CMT feature support
On Tue, Nov 24, 2015 at 02:24:22PM +0100, Jiri Denemark wrote:
On Tue, Nov 17, 2015 at 16:00:40 +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
Yes, we should reenable perf events when reconnecting to running domains. Will we need to remember what events were enabled (in domain status XML) or is it something we can read back from the kernel?
We need to record it somewhere. I guess this raises the question of whether we should hide it in status XML, or persist the list of desired perf events in the real domain XML, so we can have the option of also having them enabled immediately at startup, without requiring separate API call.
I checked the kernel perf interface and could not find the way to read back whether events were enabled from the kernel. So we have to record it somewhere according to Daniel. If we just persist the list of desired perf events in the real domain XML, users can have the option of having them enabled at startup. But it is also possible that users enable more events at runtime through API call for perf event. So should we record it in both real domain XML and status XML? Thanks, Qiaowei

On Wed, Nov 25, 2015 at 13:16:34 +0000, Ren, Qiaowei wrote:
-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Tuesday, November 24, 2015 9:29 PM To: Ren, Qiaowei; libvir-list@redhat.com Subject: Re: [libvirt] [PATCH 0/8] Add perf and Intel CMT feature support
On Tue, Nov 24, 2015 at 02:24:22PM +0100, Jiri Denemark wrote:
On Tue, Nov 17, 2015 at 16:00:40 +0800, Qiaowei Ren wrote:
The series mainly adds Intel CMT feature support into libvirt. CMT is new introduced PQos (Platform Qos) feature to monitor the usage of cache by applications running on the platform.
Currently CMT patches has been merged into Linux kernel mainline. The CMT implementation in Linux kernel is based on perf mechanism and there is no support for perf in libvirt, and so this series firstly add perf support into libvirt, including two public API and a set of util interfaces. And based on these APIs and interfaces, thie series implements CMT perf event support.
TODO: 1. This series relys on keeping an open file descriptor for the guest. We should add one patch to call sys_perf_event_open again iff libvirtd is restarted.
Yes, we should reenable perf events when reconnecting to running domains. Will we need to remember what events were enabled (in domain status XML) or is it something we can read back from the kernel?
We need to record it somewhere. I guess this raises the question of whether we should hide it in status XML, or persist the list of desired perf events in the real domain XML, so we can have the option of also having them enabled immediately at startup, without requiring separate API call.
I checked the kernel perf interface and could not find the way to read back whether events were enabled from the kernel. So we have to record it somewhere according to Daniel.
If we just persist the list of desired perf events in the real domain XML, users can have the option of having them enabled at startup. But it is also possible that users enable more events at runtime through API call for perf event. So should we record it in both real domain XML and status XML?
The status XML always contains full domain XML of the running domain. Thus if you add to code to handle all this in domain XML, you don't need to do anything extra to store the info in status XML. However, you'll need to update live domain definition from the APIs which enable/disable the perf events. Jirka
participants (4)
-
Daniel P. Berrange
-
Jiri Denemark
-
Qiaowei Ren
-
Ren, Qiaowei