On 5/12/24 21:45, Andrew Melnychenko wrote:
This series of RFC patches adds support for loading the RSS eBPF
program and passing it to QEMU.
Comments and suggestions are welcome.
QEMU with vhost can do RSS through eBPF. Loading eBPF requires
capabilities that Libvirt can provide.
The eBPF program and maps may be unique to a particular QEMU, so
Libvirt retrieves the eBPF object through QAPI.
For now, there is only the "RSS" eBPF object in QEMU; in the future,
there may be others (e.g. network filters).
That's why Libvirt gains logic to load and store any
eBPF object that QEMU provides through the QAPI schema.
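As a rough illustration of the retrieval step, the sketch below builds a "request-ebpf" QMP command and decodes the reply. It is not taken from the patches: the argument name "id", the reply field "object", and the base64 encoding are assumptions about QEMU's QAPI schema, and both helper names are hypothetical.

```python
import base64
import json


def build_request_ebpf(program_id: str = "rss") -> str:
    """Serialize a QMP 'request-ebpf' command (argument name 'id' is assumed)."""
    return json.dumps({"execute": "request-ebpf",
                       "arguments": {"id": program_id}})


def decode_ebpf_reply(reply_text: str) -> bytes:
    """Extract the eBPF object from a QMP reply.

    Assumes the reply looks like {"return": {"object": "<base64>"}}.
    """
    reply = json.loads(reply_text)
    return base64.b64decode(reply["return"]["object"])


if __name__ == "__main__":
    # Fake reply carrying only an ELF magic, purely for demonstration.
    fake_reply = json.dumps(
        {"return": {"object": base64.b64encode(b"\x7fELF").decode("ascii")}})
    print(build_request_ebpf())
    print(decode_ebpf_reply(fake_reply))
```

The decoded bytes would then be what Libvirt stores in its capabilities cache and later loads on behalf of QEMU.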
One of the reasons why this series is still an RFC is the tests.
Tests were added to this series.
For now, the tests are synthetic; the proper "reply" file should
be generated with a new "caps" file. Currently, there are changes
in the caps_9.0.0* files: support was added for the ebpf_rss_fds feature
and the request-ebpf command.
Also, a new option was added to qemuConfig - allowEBPF.
This option allows enabling/disabling eBPF blob loading explicitly.
It is required for the qemuxmlconf tests, where some tests expect that
RSS would not support eBPF.
So, overall, the tests need review, comments, and discussion of
how we want them to be implemented in the future.
For virtio-net RSS, the XML document has not changed:
```
<interface type="network">
  <model type="virtio"/>
  <driver queues="4" rss="on"
          rss_hash_report="off"/>
</interface>
```
Simplified routine for RSS:
* Libvirt retrieves the eBPF "RSS" object and loads it.
* Libvirt passes the file descriptors to virtio-net with the "ebpf_rss_fds" property
(the "rss" property must be "on" too).
* If fds were provided - QEMU uses the eBPF RSS implementation.
* If fds were not provided - QEMU tries to load eBPF RSS in its own context and use it.
* If eBPF RSS was not loaded - QEMU uses "in-qemu" RSS (vhost not supported).
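The fallback order in the list above can be sketched as a small decision function. This is only a model of the described behaviour, not QEMU or Libvirt code; the enum, the function name, and both boolean parameters are hypothetical.

```python
from enum import Enum


class RssBackend(Enum):
    EBPF_FROM_FDS = "eBPF RSS using descriptors passed by Libvirt"
    EBPF_SELF_LOADED = "eBPF RSS loaded by QEMU in its own context"
    IN_QEMU = "in-qemu RSS (vhost not supported)"


def select_rss_backend(fds_provided: bool, qemu_can_load_ebpf: bool) -> RssBackend:
    """Pick the RSS implementation in the order described in the cover letter."""
    if fds_provided:
        # Libvirt already loaded the program and handed over the descriptors.
        return RssBackend.EBPF_FROM_FDS
    if qemu_can_load_ebpf:
        # No descriptors given: QEMU attempts the load itself.
        return RssBackend.EBPF_SELF_LOADED
    # Neither path worked: fall back to the software implementation.
    return RssBackend.IN_QEMU


if __name__ == "__main__":
    for fds in (True, False):
        for can_load in (True, False):
            print(fds, can_load, select_rss_backend(fds, can_load).value)
```

Only the last branch loses vhost support, which is why handing the descriptors over from Libvirt matters when QEMU itself lacks the capabilities to load eBPF.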
Changes since RFC v2:
* refactored and rebased.
* adjusted to match the changes in QEMU.
* added basic test.
Changes since RFC v1:
* changed eBPF format saved in the XML cache.
* refactored and checked with syntax test.
* refactored patch hunks.
Andrew Melnychenko (6):
qemu_monitor: Added QEMU's "request-ebpf" support.
qemu_capabilities: Added logic for retrieving eBPF objects from QEMU.
qemu_interface: Added routine for loading the eBPF objects.
qemu_command: Added "ebpf_rss_fds" support for virtio-net.
qemu_conf: Added configuration to optionally disable eBPF loading.
tests: Added tests for eBPF blob loading.
meson.build | 7 +
meson_options.txt | 1 +
src/qemu/meson.build | 1 +
src/qemu/qemu_capabilities.c | 126 +++++++++++
src/qemu/qemu_capabilities.h | 6 +
src/qemu/qemu_command.c | 63 ++++++
src/qemu/qemu_conf.c | 2 +
src/qemu/qemu_conf.h | 2 +
src/qemu/qemu_domain.c | 4 +
src/qemu/qemu_domain.h | 3 +
src/qemu/qemu_interface.c | 83 ++++++++
src/qemu/qemu_interface.h | 4 +
src/qemu/qemu_monitor.c | 13 ++
src/qemu/qemu_monitor.h | 3 +
src/qemu/qemu_monitor_json.c | 26 +++
src/qemu/qemu_monitor_json.h | 3 +
.../caps_9.0.0_x86_64.replies | 199 ++++++++++--------
.../caps_9.0.0_x86_64.xml | 4 +
tests/qemuxml2argvmock.c | 21 ++
.../net-virtio-rss-bpf.x86_64-latest.args | 37 ++++
.../net-virtio-rss-bpf.x86_64-latest.xml | 46 ++++
tests/qemuxmlconfdata/net-virtio-rss-bpf.xml | 46 ++++
tests/qemuxmlconftest.c | 4 +
23 files changed, 612 insertions(+), 92 deletions(-)
create mode 100644 tests/qemuxmlconfdata/net-virtio-rss-bpf.x86_64-latest.args
create mode 100644 tests/qemuxmlconfdata/net-virtio-rss-bpf.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/net-virtio-rss-bpf.xml
Couple of thoughts:
1) we mandate that the code and test suite pass after each individual
commit. In practice, this means that parts of the last patch should be
squashed into the second one.
2) linking with ebpf - is there a way to avoid that? I mean, I tried to
avoid using libebpf when rewriting the devices CGroupV1 controller for
CGroupsV2.
3) trust issues - libvirt will load a "random" eBPF program into the kernel.
Can it be trusted? Because on one hand libvirt does a lot to restrict
QEMU (because it can't be trusted) but then it'd trust QEMU and load
a program it provided into the kernel. But to be fair - before QEMU runs any
guest code it can be trusted, probably. Or better - it is as vulnerable
to malicious sysadmin attacks as any other binary on the system.
Michal