Re: [PATCH] vfio/pci: Propagate ACPI notifications to the user-space

9 Mar 2023


All other ACPI events that are available to userspace are on netlink already.
As for translation of ACPI paths, it is more or less a requirement for the
VMM to translate the PCI path from host to guest, because the PCI device
tree in the guest is already totally different. The same follows for
ACPI paths.

What would you propose instead of netlink?
Sysfs entry for VFIO PCI device that accepts eventfd and signals the
events via eventfd? Or moving it into ACPI layer entirely and adding
eventfd sysfs interface for all ACPI devices?
--
Dominik


On Wed, Mar 8, 2023 at 3:38 PM Alex Williamson
<alex.williamson@redhat.com> wrote:
...
On Wed, 8 Mar 2023 14:44:28 -0800
Dominik Behr <dbehr@google.com> wrote:
...
On Wed, Mar 8, 2023 at 12:06 PM Alex Williamson
<alex.williamson@redhat.com> wrote:
...
On Wed, 8 Mar 2023 10:45:51 -0800
Dominik Behr <dbehr@chromium.org> wrote:
...
It is the same interface through which other ACPI events (AC adapter,
LID, etc.) are forwarded to user-space.
ACPI events are not particularly high frequency like interrupts.
I'm not sure that's relevant, these interfaces don't proclaim to
provide isolation among host processes which manage behavior relative
to accessories.  These are effectively system level services.  It's only
a very, very specialized use case that places a VMM as peers among these
processes.  Generally we don't want to grant a VMM any privileges beyond
what it absolutely needs, so a VMM managing an assigned NIC really ought
not to be able to snoop host events related to anything other than the
NIC.
How is that related to the fact that we are forwarding VFIO-PCI events
to netlink? Kernel does not grant any privileges to VMM.
There are already other ACPI events on netlink. The implementer of the
VMM can choose to allow VMM to snoop them or not.
In our case our VMM (crosvm) does already snoop LID, battery and AC
adapter events so the guest can adjust its behavior accordingly.
This change just adds another class of ACPI events that are forwarded
to netlink.
That's true, it is the VMM choice whether to allow snooping netlink,
but this is being proposed as THE solution to allow VMMs to receive
ACPI events related to vfio assigned devices.  If the solution
inherently requires escalating the VMM privileges to see all netlink
events, that's a weakness in the proposal.  As noted previously,
there's also no introspection here, the VMM can't know whether it
should listen to netlink for ACPI events or include AML related to a
GPE for the device.  It cannot determine if either the kernel supports
this feature or if the device has an ACPI companion that can generate
these events.
...
> What sort of ACPI events are we expecting to see here and what does user space do with them?
The use case we are looking at right now is D-notifier events about the
GPU power available to mobile discrete GPUs.
The firmware notifies the GPU driver and resource daemon to
dynamically adjust the amount of power that can be used by the GPU.
...
The proposed interface really has no introspection, how does the VMM
know which devices need ACPI tables added "upfront"?  How do these
events factor into hotplug device support, where we may not be able to
dynamically inject ACPI code into the VM?
The VMM can examine PCI IDs and the associated firmware node of the
PCI device to figure out what events to expect and what ACPI table to
generate to support it but that should not be necessary.
I'm not entirely sure where your VMM is drawing the line between the VM
and management tools, but I think this is another case where the
hypervisor itself should not have privileges to examine the host
firmware tables to build its own.  Something like libvirt would be
responsible for that.
Yes, but that depends on the design of hypervisor and VMM and is not
related to this patch.
It is very much related to this patch if it proposes an interface to
solve a problem which is likely not compatible with the security model
of other VMMs.  We need a single solution to support all VMMs.
...
A generic GPE based ACPI event forwarder as Grzegorz proposed can be
injected at VM init time and handle any notification that comes later,
even from hotplug devices.
It appears that forwarder is sending the notify to a specific ACPI
device node, so it's unclear to me how that becomes boilerplate AML
added to all VMs.  We'll need to notify different devices based on
different events, right?
Valid point. The notifications have a "scope" ACPI path.
In my experience these events are consumed without looking at where they
came from, but I believe the patch can be extended to provide the ACPI
path (in your example "_SB.PCI0.GPP0.PEGP") instead of the generic
vfio_pci. The VMM could use it to translate to an equivalent ACPI path in
the guest and pass that to a generic ACPI GPE based notifier via shared
memory. Grzegorz, could you chime in on whether that would be possible?
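
To make the host-to-guest path translation concrete, here is a minimal
sketch of a lookup table a VMM might keep. The struct, the function name
and the guest path "_SB.PCI0.S08" are hypothetical; only the host path
comes from the example in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One host ACPI path and the guest path the VMM chose for the same device. */
struct acpi_path_map {
    const char *host_path;
    const char *guest_path;
};

/* Return the guest path for host_path, or NULL if the device is not mapped. */
const char *translate_acpi_path(const struct acpi_path_map *map, size_t n,
                                const char *host_path)
{
    assert(map != NULL && host_path != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(map[i].host_path, host_path) == 0)
            return map[i].guest_path;
    }
    return NULL;
}
```

An unmapped path returns NULL, so notifications for devices the VMM does
not own would simply be dropped in this sketch.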
So effectively we're imposing the host ACPI namespace on the VM, or at
least a mapping between the host and VM namespace?  The generality of
this is not improving.
...
The acpi_bus_generate_netlink_event() below really only seems to form a
u8 event type from the u32 event.  Is this something that could be
provided directly from the vfio device uAPI with an ioeventfd, thus
providing introspection that a device supports ACPI event notifications
and the ability for the VMM to exclusively monitor those events, and
only those events for the device, without additional privileges?
From what I can see these events are 8 bit as they come from ACPI.
They also do not carry any payload and it is up to the receiving
driver to query any additional context/state from the device.
This will work the same in the VM where driver can query the same
information from the passed through PCI device.
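
For reference, a rough sketch of what consuming such payload-free
notifications over an eventfd looks like from user space. eventfd(2) is
the existing Linux API; how a vfio device would hand the descriptor to
the kernel is not shown and is not defined anywhere in this thread. Both
helper names are made up, and signal_notification() only stands in for
whatever the kernel side would do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Stand-in for the kernel side: add one pending notification to efd. */
int signal_notification(int efd)
{
    uint64_t one = 1;
    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Return how many notifications accumulated since the last call, 0 if
 * none are pending, or -1 on error. Expects an EFD_NONBLOCK eventfd.
 */
int64_t drain_notifications(int efd)
{
    uint64_t count = 0;
    ssize_t n;

    assert(efd >= 0);
    n = read(efd, &count, sizeof(count));
    if (n == (ssize_t)sizeof(count))
        return (int64_t)count;   /* reading resets the counter to zero */
    if (n < 0 && errno == EAGAIN)
        return 0;                /* nothing pending */
    return -1;
}
```

An eventfd carries only a counter, so the 8 bit event value would have to
be conveyed some other way, for example one eventfd per event value. That
is an assumption of this sketch, not something proposed above.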
There are multiple other netlink-based ACPI event forwarders which
do exactly the same thing for other devices like AC adapter, lid/power
button, ACPI thermal notifications, etc.
They all use the same mechanism and can be received by user-space
programs whether VMMs or others.
But again, those other receivers are potentially system services, not
an isolated VM instance operating in a limited privilege environment.
IMO, it's very different if the host display server has access to lid
or power events than it is to allow some arbitrary VM that happens to
have an unrelated assigned device that same privilege.
Therefore these VFIO related ACPI events could be received by a system
service via this netlink event and selectively forwarded to VMM if
such is a desire of whoever implements the userspace.
This is outside the scope of this patch. In our case our VMM does
receive these LID, AC or battery events.
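
A minimal sketch of the selective forwarding mentioned above: a
privileged service reads every ACPI netlink event and passes on only
those matching the device assigned to a given VMM. The struct layout is
modelled on the kernel's ACPI generic netlink event payload, but treat
the field sizes as an assumption. The function name, the "vfio_pci"
class string and the use of a PCI address as bus_id are illustrative
only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed shape of one ACPI event as received over netlink. */
struct acpi_event_msg {
    char device_class[20];
    char bus_id[15];
    uint32_t type;
    uint32_t data;
};

/*
 * Forward an event only if both its device class and its bus id match
 * the device assigned to this VMM; everything else stays with the host.
 */
bool should_forward(const struct acpi_event_msg *ev,
                    const char *device_class, const char *bus_id)
{
    assert(ev != NULL && device_class != NULL && bus_id != NULL);
    return strncmp(ev->device_class, device_class,
                   sizeof(ev->device_class)) == 0 &&
           strncmp(ev->bus_id, bus_id, sizeof(ev->bus_id)) == 0;
}
```

With a filter like this the VMM itself never opens the netlink socket, so
lid or battery events do not reach it unless the service chooses to
forward them.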
But this is backwards, we're presupposing the choice to use netlink
based on the convenience of one VMM, which potentially creates
obstacles, maybe even security isolation issues for other VMMs.  The
method of delivering ACPI events to a VMM is very much within the scope
of this proposal.  Thanks,
Alex