[PATCH v2 0/1] Add support for RAPL MSRs feature in QEMU

Hi, First of all, kudos to Peter Krempa for his fast review! In this v2, I've addressed the following points: - The socket is *not* mandatory and my code totally confused Peter. Sorry about that! here a snippet of the QEMU code to understand: /* Compute the socket path if necessary */ if (s->msr_energy.socket_path == NULL) { s->msr_energy.socket_path = vmsr_compute_default_paths(); } So I made all the modification to make it not necessary. - Change the socket name to "rapl_helper_socket" - Change the socket to be absFilePath - I did not add anything to honour the _OFF state, because it is not necessary to explicitly disable it. That's about it. Regards, Anthony Anthony Harivel (1): qemu: Add support for RAPL MSRs feature docs/formatdomain.rst | 2 ++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 2 ++ src/conf/schemas/domaincommon.rng | 10 ++++++++++ src/qemu/qemu_command.c | 11 +++++++++++ tests/qemuxmlconfdata/kvm-features-off.xml | 1 + .../kvm-features.x86_64-latest.args | 2 +- tests/qemuxmlconfdata/kvm-features.xml | 1 + 8 files changed, 46 insertions(+), 1 deletion(-) -- 2.46.0

Add the support in libvirt to activate the RAPL feature in QEMU. This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock in QEMU. The feature is activated in libvirt with the following XML configuration: <kvm> [...] <rapl state ='on' socket='/run/qemu-vmsr-helper.sock'/> [...] </kvm> See: https://gitlab.com/qemu-project/qemu/-/commit/0418f90809aea5b375c859e744c8e8... Signed-off-by: Anthony Harivel <aharivel@redhat.com> --- docs/formatdomain.rst | 2 ++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 2 ++ src/conf/schemas/domaincommon.rng | 10 ++++++++++ src/qemu/qemu_command.c | 11 +++++++++++ tests/qemuxmlconfdata/kvm-features-off.xml | 1 + .../kvm-features.x86_64-latest.args | 2 +- tests/qemuxmlconfdata/kvm-features.xml | 1 + 8 files changed, 46 insertions(+), 1 deletion(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index e1e219bcb5c5..5c66e41e940f 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1996,6 +1996,7 @@ Hypervisors may allow certain CPU / machine features to be toggled on/off. <poll-control state='on'/> <pv-ipi state='off'/> <dirty-ring state='on' size='4096'/> + <rapl state ='on' socket='/run/qemu-vmsr-helper.sock'/> </kvm> <xen> <e820_host state='on'/> @@ -2111,6 +2112,7 @@ are: poll-control Decrease IO completion latency by introducing a grace period of busy waiting on, off :since:`6.10.0 (QEMU 4.2)` pv-ipi Paravirtualized send IPIs on, off :since:`7.10.0 (QEMU 3.1)` dirty-ring Enable dirty ring feature on, off; size - must be power of 2, range [1024,65536] :since:`8.0.0 (QEMU 6.1)` + rapl Enable rapl feature on, off; socket - rapl helper socket :since:`10.8.0 (QEMU 9.1)` ============== ============================================================================ ====================================================== ============================ ``xen`` diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d950921667a2..32a3d09fe175 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -220,6 +220,7 @@ VIR_ENUM_IMPL(virDomainKVM, "poll-control", "pv-ipi", "dirty-ring", + "rapl", ); VIR_ENUM_IMPL(virDomainXen, @@ -16693,6 +16694,11 @@ virDomainFeaturesKVMDefParse(virDomainDef *def, return -1; } } + + /* rapl feature should parse socket property */ + if (feature == VIR_DOMAIN_KVM_RAPL && value == VIR_TRISTATE_SWITCH_ON) { + kvm->rapl_helper_socket = virXMLPropString(feat, "socket"); + } } def->features[VIR_DOMAIN_FEATURE_KVM] = VIR_TRISTATE_SWITCH_ON; @@ -21108,6 +21114,7 @@ virDomainDefFeaturesCheckABIStability(virDomainDef *src, case VIR_DOMAIN_KVM_POLLCONTROL: case VIR_DOMAIN_KVM_PVIPI: case VIR_DOMAIN_KVM_DIRTY_RING: + case VIR_DOMAIN_KVM_RAPL: if (src->kvm_features->features[i] != dst->kvm_features->features[i]) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("State of KVM feature '%1$s' differs: source: '%2$s', destination: '%3$s'"), @@ -27840,6 +27847,17 @@ virDomainDefFormatFeatures(virBuffer *buf, } break; + case VIR_DOMAIN_KVM_RAPL: + if (def->kvm_features->features[j] != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&childBuf, "<%s state='%s'", + virDomainKVMTypeToString(j), + virTristateSwitchTypeToString(def->kvm_features->features[j])); + + virBufferEscapeString(&childBuf, " socket='%s'", def->kvm_features->rapl_helper_socket); + virBufferAddLit(&childBuf, "/>\n"); + } + break; + case VIR_DOMAIN_KVM_LAST: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index eae621f900b9..37295f1cfddb 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2213,6 +2213,7 @@ typedef enum { VIR_DOMAIN_KVM_POLLCONTROL, VIR_DOMAIN_KVM_PVIPI, VIR_DOMAIN_KVM_DIRTY_RING, + VIR_DOMAIN_KVM_RAPL, VIR_DOMAIN_KVM_LAST } virDomainKVM; @@ -2400,6 +2401,7 @@ struct _virDomainFeatureKVM { int features[VIR_DOMAIN_KVM_LAST]; unsigned int dirty_ring_size; /* size of dirty ring for each vCPU, no units */ + char *rapl_helper_socket; /* rapl helper socket */ }; typedef struct _virDomainFeatureTCG virDomainFeatureTCG; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index 05ba69792470..20d0ce556ecf 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -8016,6 +8016,16 @@ </optional> </element> </optional> + <optional> + <element name="rapl"> + <ref name="featurestate"/> + <optional> + <attribute name="socket"> + <ref name="absFilePath"/> + </attribute> + </optional> + </element> + </optional> </interleave> </element> </define> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 28914c9c346a..b1b0bb9c63a6 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6562,6 +6562,9 @@ qemuBuildCpuCommandLine(virCommand *cmd, case VIR_DOMAIN_KVM_DIRTY_RING: break; + case VIR_DOMAIN_KVM_RAPL: + break; + case VIR_DOMAIN_KVM_LAST: break; } @@ -7176,6 +7179,14 @@ qemuBuildAccelCommandLine(virCommand *cmd, def->kvm_features->features[VIR_DOMAIN_KVM_DIRTY_RING] == VIR_TRISTATE_SWITCH_ON) { virBufferAsprintf(&buf, ",dirty-ring-size=%d", def->kvm_features->dirty_ring_size); } + if (def->features[VIR_DOMAIN_FEATURE_KVM] == VIR_TRISTATE_SWITCH_ON && + def->kvm_features->features[VIR_DOMAIN_KVM_RAPL] == VIR_TRISTATE_SWITCH_ON) { + virBufferAddLit(&buf, ",rapl=true"); + + if (def->kvm_features->rapl_helper_socket != NULL) { + virBufferAsprintf(&buf, ",rapl-helper-socket=%s", def->kvm_features->rapl_helper_socket); + } + } break; case VIR_DOMAIN_VIRT_HVF: diff --git a/tests/qemuxmlconfdata/kvm-features-off.xml b/tests/qemuxmlconfdata/kvm-features-off.xml index 3cd4ff18f283..3e8dbea4b177 100644 --- a/tests/qemuxmlconfdata/kvm-features-off.xml +++ b/tests/qemuxmlconfdata/kvm-features-off.xml @@ -16,6 +16,7 @@ <poll-control state='off'/> <pv-ipi state='off'/> <dirty-ring state='off'/> + <rapl state='off'/> </kvm> </features> <cpu mode='host-passthrough' check='none' migratable='off'/> diff --git a/tests/qemuxmlconfdata/kvm-features.x86_64-latest.args b/tests/qemuxmlconfdata/kvm-features.x86_64-latest.args index 955db67eb4b7..2dd613ab338d 100644 --- a/tests/qemuxmlconfdata/kvm-features.x86_64-latest.args +++ b/tests/qemuxmlconfdata/kvm-features.x86_64-latest.args @@ -11,7 +11,7 @@ XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.config \ -S \ -object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain--1-QEMUGuest1/master-key.aes"}' \ -machine pc,usb=off,dump-guest-core=off,memory-backend=pc.ram,acpi=on \ --accel kvm,dirty-ring-size=4096 \ +-accel kvm,dirty-ring-size=4096,rapl=true,rapl-helper-socket=/run/qemu-vmsr-helper.sock \ -cpu host,migratable=off,kvm=off,kvm-hint-dedicated=on,kvm-poll-control=on \ -m size=219136k \ -object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264}' \ diff --git a/tests/qemuxmlconfdata/kvm-features.xml b/tests/qemuxmlconfdata/kvm-features.xml index 78091064b109..ee601e06de75 100644 --- a/tests/qemuxmlconfdata/kvm-features.xml +++ b/tests/qemuxmlconfdata/kvm-features.xml @@ -16,6 +16,7 @@ <poll-control state='on'/> <pv-ipi state='on'/> <dirty-ring state='on' size='4096'/> + <rapl state='on' socket='/run/qemu-vmsr-helper.sock'/> </kvm> </features> <cpu mode='host-passthrough' check='none' migratable='off'/> -- 2.46.0

On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock in QEMU.
The feature is activated in libvirt with the following XML configuration:
<kvm> [...] <rapl state ='on' socket='/run/qemu-vmsr-helper.sock'/> [...] </kvm>
See: https://gitlab.com/qemu-project/qemu/-/commit/0418f90809aea5b375c859e744c8e8...
Signed-off-by: Anthony Harivel <aharivel@redhat.com> --- docs/formatdomain.rst | 2 ++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 2 ++ src/conf/schemas/domaincommon.rng | 10 ++++++++++ src/qemu/qemu_command.c | 11 +++++++++++ tests/qemuxmlconfdata/kvm-features-off.xml | 1 + .../kvm-features.x86_64-latest.args | 2 +- tests/qemuxmlconfdata/kvm-features.xml | 1 + 8 files changed, 46 insertions(+), 1 deletion(-)
[...]
@@ -7176,6 +7179,14 @@ qemuBuildAccelCommandLine(virCommand *cmd, def->kvm_features->features[VIR_DOMAIN_KVM_DIRTY_RING] == VIR_TRISTATE_SWITCH_ON) { virBufferAsprintf(&buf, ",dirty-ring-size=%d", def->kvm_features->dirty_ring_size); }
This should be separated via a newline.
+ if (def->features[VIR_DOMAIN_FEATURE_KVM] == VIR_TRISTATE_SWITCH_ON && + def->kvm_features->features[VIR_DOMAIN_KVM_RAPL] == VIR_TRISTATE_SWITCH_ON) { + virBufferAddLit(&buf, ",rapl=true"); + + if (def->kvm_features->rapl_helper_socket != NULL) { + virBufferAsprintf(&buf, ",rapl-helper-socket=%s", def->kvm_features->rapl_helper_socket); + } + } break;
case VIR_DOMAIN_VIRT_HVF:
Apart from that the rest looks good providing the following: I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right? If that is the case that means the lifecycle of the daemon and permissions (including selinux) for accessing the socket are not something that libvirt needs to deal with. If either of them isn't true please outline how that socket is to be used to see how libvirt will need to approach it.

Peter Krempa, Sep 02, 2024 at 15:09:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock in QEMU.
The feature is activated in libvirt with the following XML configuration:
<kvm> [...] <rapl state ='on' socket='/run/qemu-vmsr-helper.sock'/> [...] </kvm>
See: https://gitlab.com/qemu-project/qemu/-/commit/0418f90809aea5b375c859e744c8e8...
Signed-off-by: Anthony Harivel <aharivel@redhat.com> --- docs/formatdomain.rst | 2 ++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 2 ++ src/conf/schemas/domaincommon.rng | 10 ++++++++++ src/qemu/qemu_command.c | 11 +++++++++++ tests/qemuxmlconfdata/kvm-features-off.xml | 1 + .../kvm-features.x86_64-latest.args | 2 +- tests/qemuxmlconfdata/kvm-features.xml | 1 + 8 files changed, 46 insertions(+), 1 deletion(-)
[...]
@@ -7176,6 +7179,14 @@ qemuBuildAccelCommandLine(virCommand *cmd, def->kvm_features->features[VIR_DOMAIN_KVM_DIRTY_RING] == VIR_TRISTATE_SWITCH_ON) { virBufferAsprintf(&buf, ",dirty-ring-size=%d", def->kvm_features->dirty_ring_size); }
This should be separated via a newline.
Will be done for next iteration thanks.
+ if (def->features[VIR_DOMAIN_FEATURE_KVM] == VIR_TRISTATE_SWITCH_ON && + def->kvm_features->features[VIR_DOMAIN_KVM_RAPL] == VIR_TRISTATE_SWITCH_ON) { + virBufferAddLit(&buf, ",rapl=true"); + + if (def->kvm_features->rapl_helper_socket != NULL) { + virBufferAsprintf(&buf, ",rapl-helper-socket=%s", def->kvm_features->rapl_helper_socket); + } + } break;
case VIR_DOMAIN_VIRT_HVF:
Apart from that the rest looks good providing the following:
I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
Indeed this is shared between all qemu instances.
If that is the case that means the lifecycle of the daemon and permissions (including selinux) for accessing the socket are not something that libvirt needs to deal with.
If either of them isn't true please outline how that socket is to be used to see how libvirt will need to approach it.
This is a very good question, I'm trying to solve at the moment. The daemon is called "qemu-vmsr-helper". The source is available in qemu/tools/i386/* (qemu 9.1) and is built with projet options "tools" enable. There is also a systemd files available in qemu source code qemu/contrib/systemd/ with the file qemu-vmsr-helper.service and qemu-vmsr-helper.socket that respectively start the daemon and start the socket to listen to. If libvirt can solve the installation of this daemon, then I'll be happy to add a patch to my patch queue. Looking forward you thought about it. Many thanks, Anthony

On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock in QEMU.
The feature is activated in libvirt with the following XML configuration:
<kvm> [...] <rapl state ='on' socket='/run/qemu-vmsr-helper.sock'/> [...] </kvm>
See: https://gitlab.com/qemu-project/qemu/-/commit/0418f90809aea5b375c859e744c8e8...
Signed-off-by: Anthony Harivel <aharivel@redhat.com> --- docs/formatdomain.rst | 2 ++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 2 ++ src/conf/schemas/domaincommon.rng | 10 ++++++++++ src/qemu/qemu_command.c | 11 +++++++++++ tests/qemuxmlconfdata/kvm-features-off.xml | 1 + .../kvm-features.x86_64-latest.args | 2 +- tests/qemuxmlconfdata/kvm-features.xml | 1 + 8 files changed, 46 insertions(+), 1 deletion(-)
[...]
@@ -7176,6 +7179,14 @@ qemuBuildAccelCommandLine(virCommand *cmd, def->kvm_features->features[VIR_DOMAIN_KVM_DIRTY_RING] == VIR_TRISTATE_SWITCH_ON) { virBufferAsprintf(&buf, ",dirty-ring-size=%d", def->kvm_features->dirty_ring_size); }
This should be separated via a newline.
+ if (def->features[VIR_DOMAIN_FEATURE_KVM] == VIR_TRISTATE_SWITCH_ON && + def->kvm_features->features[VIR_DOMAIN_KVM_RAPL] == VIR_TRISTATE_SWITCH_ON) { + virBufferAddLit(&buf, ",rapl=true"); + + if (def->kvm_features->rapl_helper_socket != NULL) { + virBufferAsprintf(&buf, ",rapl-helper-socket=%s", def->kvm_features->rapl_helper_socket); + } + } break;
case VIR_DOMAIN_VIRT_HVF:
Apart from that the rest looks good providing the following:
I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
If that is the case that means the lifecycle of the daemon and permissions (including selinux) for accessing the socket are not something that libvirt needs to deal with.
If either of them isn't true please outline how that socket is to be used to see how libvirt will need to approach it.
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU. I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group. Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU. When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature. Thanks, Anthony
If that is the case that means the lifecycle of the daemon and permissions (including selinux) for accessing the socket are not something that libvirt needs to deal with.
If either of them isn't true please outline how that socket is to be used to see how libvirt will need to approach it.
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU. I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
So when I was originally skimming trhough the docs I didn't properly understand that the reason for the helper daemon is that there was a security issue with accessing the measurement counters and thought it was strictly for performance reasons.
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group.
Another alternative for a shared instance to be used by multiple qemu instances is that libvirt can open the socket and pass it to qemu, which avoids the potential issue at-least with DAC security labels as the socket can be owned by root:root with mode 600. I'm not sure how the selinux policy for that daemon looks though.
Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU.
When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature.
Libvirt can obviously manage also a per VM instance, which should be straightforward, but not as simple as this patch. This can theoretically also be added later, e.g. by adding a 'managed' property enabling the libvirt-managed daemon.

On Tue, Sep 03, 2024 at 02:16:58PM +0200, Peter Krempa wrote:
On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU. I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
So when I was originally skimming trhough the docs I didn't properly understand that the reason for the helper daemon is that there was a security issue with accessing the measurement counters and thought it was strictly for performance reasons.
Yep, this is only for security reasons.
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group.
Another alternative for a shared instance to be used by multiple qemu instances is that libvirt can open the socket and pass it to qemu, which avoids the potential issue at-least with DAC security labels as the socket can be owned by root:root with mode 600.
I'm not sure how the selinux policy for that daemon looks though.
I don't believe any policy exists yet, since this is a brand new service.
Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU.
When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature.
Libvirt can obviously manage also a per VM instance, which should be straightforward, but not as simple as this patch. This can theoretically also be added later, e.g. by adding a 'managed' property enabling the libvirt-managed daemon.
A parallel would be the qemu-pr-helper which this new service was initially derived from in terms of archicture. It also runs privileged for security reasons and can be single shared instance, or per-VM instance. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Daniel P. Berrangé, Sep 03, 2024 at 14:24:
On Tue, Sep 03, 2024 at 02:16:58PM +0200, Peter Krempa wrote:
On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
Add the support in libvirt to activate the RAPL feature in QEMU. I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
So when I was originally skimming trhough the docs I didn't properly understand that the reason for the helper daemon is that there was a security issue with accessing the measurement counters and thought it was strictly for performance reasons.
Yep, this is only for security reasons.
TL;DR: "A malicious user application may use RAPL to infer confidential information of another unprivileged application running on the same or a different CPU core by observing the power-related behavior of the victim application." The answer from Intel was to put the interface being privileged permission. Link: https://www.intel.com/content/www/us/en/developer/articles/technical/softwar...
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group.
Another alternative for a shared instance to be used by multiple qemu instances is that libvirt can open the socket and pass it to qemu, which avoids the potential issue at-least with DAC security labels as the socket can be owned by root:root with mode 600.
I'm not sure how the selinux policy for that daemon looks though.
I don't believe any policy exists yet, since this is a brand new service.
Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU.
When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature.
Libvirt can obviously manage also a per VM instance, which should be straightforward, but not as simple as this patch. This can theoretically also be added later, e.g. by adding a 'managed' property enabling the libvirt-managed daemon.
A parallel would be the qemu-pr-helper which this new service was initially derived from in terms of archicture. It also runs privileged for security reasons and can be single shared instance, or per-VM instance.
Which actually leads to the question to the following: Why do I "simply" not do like the qemu-pr-helper in libvirt ? see qemuProcessStartManagedPRDaemon() in src/qemu/qemu_process.c +2808
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hi, Anthony Harivel, Sep 03, 2024 at 14:41:
Daniel P. Berrangé, Sep 03, 2024 at 14:24:
On Tue, Sep 03, 2024 at 02:16:58PM +0200, Peter Krempa wrote:
On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote:
On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote: > Add the support in libvirt to activate the RAPL feature in QEMU. I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
So when I was originally skimming trhough the docs I didn't properly understand that the reason for the helper daemon is that there was a security issue with accessing the measurement counters and thought it was strictly for performance reasons.
Yep, this is only for security reasons.
TL;DR: "A malicious user application may use RAPL to infer confidential information of another unprivileged application running on the same or a different CPU core by observing the power-related behavior of the victim application."
The answer from Intel was to put the interface being privileged permission.
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/softwar...
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group.
Another alternative for a shared instance to be used by multiple qemu instances is that libvirt can open the socket and pass it to qemu, which avoids the potential issue at-least with DAC security labels as the socket can be owned by root:root with mode 600.
I'm not sure how the selinux policy for that daemon looks though.
I don't believe any policy exists yet, since this is a brand new service.
Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU.
When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature.
Libvirt can obviously manage also a per VM instance, which should be straightforward, but not as simple as this patch. This can theoretically also be added later, e.g. by adding a 'managed' property enabling the libvirt-managed daemon.
A parallel would be the qemu-pr-helper which this new service was initially derived from in terms of archicture. It also runs privileged for security reasons and can be single shared instance, or per-VM instance.
Which actually leads to the question to the following: Why do I "simply" not do like the qemu-pr-helper in libvirt ?
see qemuProcessStartManagedPRDaemon() in src/qemu/qemu_process.c +2808
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
If I may resume the conversation: 1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL. 2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high. 3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that. 4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance. I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO. If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution. Thanks, Anthony

Hi, Anthony Harivel, Sep 05, 2024 at 13:01:
Hi,
Anthony Harivel, Sep 03, 2024 at 14:41:
Daniel P. Berrangé, Sep 03, 2024 at 14:24:
On Tue, Sep 03, 2024 at 02:16:58PM +0200, Peter Krempa wrote:
On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote: > On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote: > > Add the support in libvirt to activate the RAPL feature in QEMU. > I suppose that the 'rapl-helper-socket' is a shared (multiple qemu's use > it) resource set up beforehand by the admin. Right?
The qemu-pr-helper could be run as a single instnce, or it could be run per-QEMU instance. The latter would give us better security isolation, for what is a privileged daemon. On the other hand, I wonder about the CPU overhead of having 100's of copies of the process running on a host.
So when I was originally skimming trhough the docs I didn't properly understand that the reason for the helper daemon is that there was a security issue with accessing the measurement counters and thought it was strictly for performance reasons.
Yep, this is only for security reasons.
TL;DR: "A malicious user application may use RAPL to infer confidential information of another unprivileged application running on the same or a different CPU core by observing the power-related behavior of the victim application."
The answer from Intel was to put the interface being privileged permission.
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/softwar...
If it runs on a single instance, then the socket needs to be chmod/chown to something like qemu / libvirt group with access only to root and group.
Another alternative for a shared instance to be used by multiple qemu instances is that libvirt can open the socket and pass it to qemu, which avoids the potential issue at-least with DAC security labels as the socket can be owned by root:root with mode 600.
I'm not sure how the selinux policy for that daemon looks though.
I don't believe any policy exists yet, since this is a brand new service.
Running one helper instance per-QEMU instance would mean that every instance read 1 MSR / Package every second. The socket is left open (thanks to Daniel suggestion in QEMU review). The impact would be quite low I guess on the housekeeping CPU.
When I designed the daemon with Paolo, the first solution was the main idea but I'm open to any solution that leads to a better adoption of the feature.
Libvirt can obviously manage also a per VM instance, which should be straightforward, but not as simple as this patch. This can theoretically also be added later, e.g. by adding a 'managed' property enabling the libvirt-managed daemon.
A parallel would be the qemu-pr-helper which this new service was initially derived from in terms of archicture. It also runs privileged for security reasons and can be single shared instance, or per-VM instance.
Which actually leads to the question to the following: Why do I "simply" not do like the qemu-pr-helper in libvirt ?
see qemuProcessStartManagedPRDaemon() in src/qemu/qemu_process.c +2808
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
If I may resume the conversation:
1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL.
2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high.
3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that.
4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance.
I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO.
If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution.
Thanks, Anthony
Gentle ping for the above discussion. Thanks, Anthony

On Thu, Sep 05, 2024 at 13:01:53 +0200, Anthony Harivel wrote:
Hi,
Anthony Harivel, Sep 03, 2024 at 14:41:
Daniel P. Berrangé, Sep 03, 2024 at 14:24:
On Tue, Sep 03, 2024 at 02:16:58PM +0200, Peter Krempa wrote:
On Tue, Sep 03, 2024 at 13:29:28 +0200, Anthony Harivel wrote:
Daniel P. Berrangé, Sep 03, 2024 at 12:08:
On Mon, Sep 02, 2024 at 03:09:42PM +0200, Peter Krempa wrote: > On Thu, Aug 22, 2024 at 17:59:47 +0200, Anthony Harivel wrote:
[...]
If I may resume the conversation:
1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL.
2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high.
3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that.
The question is also which of the approaches will actually be used. Implementing both is possible, it's just whether the managed approach will be used in the end.
4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance.
I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO.
If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution.
The qemu-pr-helper was introduced in libvirt in commits b0cd8045f012af78e863cd19f74e9db6c1b5dfdd and few prior ones. Note that it was a very long time ago so the code will likely no longer be state of the art. Also note that for 'qemu-pr-helper' it's much more complicated as it needs to be handled also for hotplug of disks when a new instance might need to be started. Here you'll only ever have one instance, that shares the lifetime with the VM, which makes few things much simpler (much less wiring up weird corner cases). In case we do in fact want a libvirt-managed version of this the question is also how to handle potential cases e.g. when the libvirt-managed daemon gets killed/crashes. The PR manager daemon has some form of event handlign which will restart and reconnect it. Is that needed here as well?

Hi, Peter Krempa, Oct 03, 2024 at 17:33:
Anthony Harivel, Sep 03, 2024 at 14:41: [...]
If I may resume the conversation:
1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL.
2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high.
3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that.
The question is also which of the approaches will actually be used. Implementing both is possible, it's just whether the managed approach will be used in the end.
In the end, the one that will be used is the one that makes things the more transparent/easier for the user. If libvirt does not manage the daemon, who is going to do it ? The vmsr daemon could also be managed by systemd. But again who is going to setup this ? My goal is to make this feature the easiest possible to deploy: - Install qemu (the daemon is installed on the system) - Install libvirt (the daemon is managed by it) - add the feature to the VM XML And it works. Users need this ideally as simple as this.
4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance.
I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO.
If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution.
The qemu-pr-helper was introduced in libvirt in commits b0cd8045f012af78e863cd19f74e9db6c1b5dfdd and few prior ones. Note that it was a very long time ago so the code will likely no longer be state of the art. Also note that for 'qemu-pr-helper' it's much more complicated as it needs to be handled also for hotplug of disks when a new instance might need to be started.
Here you'll only ever have one instance, that shares the lifetime with the VM, which makes few things much simpler (much less wiring up weird corner cases).
In case we do in fact want a libvirt-managed version of this the question is also how to handle potential cases e.g. when the libvirt-managed daemon gets killed/crashes.
The PR manager daemon has some form of event handlign which will restart and reconnect it. Is that needed here as well?
It's a stateless daemon so it should be pretty simple. - The daemon crash: the MSR value will be 0 during the reboot of the daemon, the VM does not crash. - The VM crash: it doesn't matter for the daemon. The VM reboot, request the MSR, it works. I found the commit 053d9e30e7a515e7fbddc598e91fc08158fa1329 quite interesting to continue the dev. Thanks, Anthony

On Mon, Oct 07, 2024 at 12:03:17 +0200, Anthony Harivel wrote:
Hi,
Peter Krempa, Oct 03, 2024 at 17:33:
Anthony Harivel, Sep 03, 2024 at 14:41: [...]
If I may resume the conversation:
1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL.
2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high.
3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that.
The question is also which of the approaches will actually be used. Implementing both is possible, it's just whether the managed approach will be used in the end.
In the end, the one that will be used is the one that makes things the more transparent/easier for the user.
If libvirt does not manage the daemon, who is going to do it ? The vmsr daemon could also be managed by systemd. But again who is going to setup this ?
My goal is to make this feature the easiest possible to deploy: - Install qemu (the daemon is installed on the system) - Install libvirt (the daemon is managed by it) - add the feature to the VM XML
And it works.
Users need this ideally as simple as this.
Okay that was really not obvious from the initial patch that you wanted to achieve the above. As noted it makes sense to have the managed version if it's going to be used this way, but the initial implementation didn't make it seem as such.
4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance.
I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO.
If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution.
The qemu-pr-helper was introduced in libvirt in commits b0cd8045f012af78e863cd19f74e9db6c1b5dfdd and few prior ones. Note that it was a very long time ago so the code will likely no longer be state of the art. Also note that for 'qemu-pr-helper' it's much more complicated as it needs to be handled also for hotplug of disks when a new instance might need to be started.
Here you'll only ever have one instance, that shares the lifetime with the VM, which makes few things much simpler (much less wiring up weird corner cases).
In case we do in fact want a libvirt-managed version of this the question is also how to handle potential cases e.g. when the libvirt-managed daemon gets killed/crashes.
The PR manager daemon has some form of event handlign which will restart and reconnect it. Is that needed here as well?
It's a stateless daemon so it should be pretty simple.
- The daemon crash: the MSR value will be 0 during the reboot of the daemon, the VM does not crash.
- The VM crash: it doesn't matter for the daemon. The VM reboot, request the MSR, it works.
Fair enough. If the VM keeps running and it just breaks this feature which looks mostly like statistics it should be fine if it simply stops working in such cases.

Peter Krempa, Oct 07, 2024 at 15:42:
On Mon, Oct 07, 2024 at 12:03:17 +0200, Anthony Harivel wrote:
Hi,
Peter Krempa, Oct 03, 2024 at 17:33:
Anthony Harivel, Sep 03, 2024 at 14:41: [...]
If I may resume the conversation:
1) The helper daemon is primarily needed for security reasons to prevent potential leaks of confidential information through RAPL.
2) There are two main ways to handle the helper daemon: > Single instance for all QEMU processes: This requires adjusting socket permissions (chmod/chown) to control access securely. > One instance per QEMU process: This offers better isolation but might increase CPU overhead slightly if the number of instances running get high.
3) Libvirt can manage both approaches, but no existing SELinux policy covers this yet, so we’ll need to consider that.
The question is also which of the approaches will actually be used. Implementing both is possible, it's just whether the managed approach will be used in the end.
In the end, the one that will be used is the one that makes things the more transparent/easier for the user.
If libvirt does not manage the daemon, who is going to do it ? The vmsr daemon could also be managed by systemd. But again who is going to setup this ?
My goal is to make this feature the easiest possible to deploy: - Install qemu (the daemon is installed on the system) - Install libvirt (the daemon is managed by it) - add the feature to the VM XML
And it works.
Users need this ideally as simple as this.
Okay that was really not obvious from the initial patch that you wanted to achieve the above.
As noted it makes sense to have the managed version if it's going to be used this way, but the initial implementation didn't make it seem as such.
And you are absolutely right ! Sorry about the confusion. My initial patch was only about enabling the feature in the XML file that creates the right QEMU command line. Daniel came into the discussion and gave me some thought about the daemon. Now I've decided that it should be better to have it this way: one commit to enable the feature in libvirt and another one to manage the daemon, both coming from the same patch queue. Bear with me while I'm doing this for the v3. Thanks, Anthony
4) qemu-pr-helper is very similar to the qemu-vmsr-helper in terms of architecture and in terms of privileged needs. It can be single shared instance or per-vm instance.
I cannot find the original patch that has added qemu-pr-helper into libvirt. The latest commit in src/qemu/qemu_process is 7eead248c65f and it doesn't look right IMHO.
If you have a reference to the original patch, I can use it as a guide to implement this helper in the same way, if you believe that is the best solution.
The qemu-pr-helper was introduced in libvirt in commits b0cd8045f012af78e863cd19f74e9db6c1b5dfdd and few prior ones. Note that it was a very long time ago so the code will likely no longer be state of the art. Also note that for 'qemu-pr-helper' it's much more complicated as it needs to be handled also for hotplug of disks when a new instance might need to be started.
Here you'll only ever have one instance, that shares the lifetime with the VM, which makes few things much simpler (much less wiring up weird corner cases).
In case we do in fact want a libvirt-managed version of this the question is also how to handle potential cases e.g. when the libvirt-managed daemon gets killed/crashes.
The PR manager daemon has some form of event handlign which will restart and reconnect it. Is that needed here as well?
It's a stateless daemon so it should be pretty simple.
- The daemon crash: the MSR value will be 0 during the reboot of the daemon, the VM does not crash.
- The VM crash: it doesn't matter for the daemon. The VM reboot, request the MSR, it works.
Fair enough. If the VM keeps running and it just breaks this feature which looks mostly like statistics it should be fine if it simply stops working in such cases.
participants (3)
-
Anthony Harivel
-
Daniel P. Berrangé
-
Peter Krempa