On Wed, Mar 27, 2024 at 10:53 AM -0500, Jonathon Jongsma <jjongsma(a)redhat.com>
wrote:
On 3/20/24 10:46 AM, Boris Fiuczynski wrote:
> When an udev add event occurs the mdev active config data requires an
> update via mdevctl as the udev does not contain all config data.
> This update needs to occur immediate and to be finished before the
> libvirt CREATE event is issued to keep the API usage reliable.
The problem that you're trying to solve isn't spelled out very
explicitly here. My understanding is that the issue occurs when you
start a mediated device that has attributes (and the device is supported
by callout scripts in mdevctl). With the current code, my assumption is
that we observe the following behavior:
- udev event is emitted
- libvirt handles the udev event and emits a
VIR_NODE_DEVICE_EVENT_CREATED event for the new mdev. At this point the
mdev only information that is provided by udev (such as uuid, mdev type)
and doesn't include any attributes, for example.
- we schedule a thread to query mdevctl for the full information for the
device. The mdev device gets update to include the full mdevctl data,
including attributes
- libvirt emits a VIR_NODE_DEVICE_EVENT_ID_UPDATE event to indicate that
the device information was updated
And you're trying to skip the last step and make sure that the
attributes are set at the time that the first CREATED event is
emitted.
Boris is currently on vacation so I cannot discuss this with him, but
that’s my understanding as well. For me, this assumption also makes
sense… because otherwise it’s awkward to use the libvirt events API for
mdev devices (wait for a CREATED event and then for an UPDATE event for
the same device - not a really good UX pattern IMO).
[…snip…]
>
> + /* The added mdev needs an immediate active config update before
> + * the event is issues to allow sane API usage. */
> + if (is_mdev && (nodeDeviceUpdateMediatedDevices() < 0))
> + VIR_WARN("Update of mediated device %s failed",
> + def ? NULLSTR(def->sysfs_path) : "");
> +
> ret = 0;
>
> cleanup:
``
Right now libvirt has a dedicated thread for handling udev updates, and
we also spawn a separate thread to query mdevctl whenever we detect that
mdev state has changed. This patch makes it so that mdevctl is now
executed directly from the udev thread, which changes that separation of
responsibilities. It shouldn't cause any memory corruption since
nodeDeviceUpdateMediatedDevices() is already threadsafe, but it makes
the code harder to reason about. When we introduced mdevctl support to
libvirt, the first versions of my patch series did in fact query mdevctl
from the udev thread. But based on concerns during code review, it was
moved out to a separate thread. So I don't think that we should
necessarily reintroduce that here.
Yep.
Right now we already have in node_device_udev:
+ init thread for initialization
+ udev thread, which creates and then enqueues the libvirt events
+ mdevctl update thread, which calls `nodeDeviceUpdateMediateDevices()‘
(executing 'mdevctl' and updating the internal data structures with
the result)
+ main thread for the timeout mechanism to launch the mdevCtlUpdateThread
+ another thread (not sure if it’s the main thread): virNodeDeviceUpdate ->
nodeDeviceUpdateMediateDevices()
How about:
+ instead of having a mdevctl update thread and an init thread we could
have only one worker pool (similar to qemu_process.c)
+ one udev thread that dispatches the events and then submits jobs to
the worker pool
+ main thread for the timeout mechanism to schedule the “mdev update worker pool job”
+ another thread (not sure if it’s the main thread):
virNodeDeviceUpdate: -> nodeDeviceUpdateMediateDevices() <- maybe we
can leave it as is.
In addition, I notice that udevHandleOneDevice() already schedules an
mdevctl update thread to be run after this function (udevAddOneDevice())
is called for a mediated device. So after this patch, a new udev event
for a mediated device would result in the udev thread querying mdevctl
immediately and then also spawning a new thread to query mdevctl.
This can probably be removed after the patch from Boris.
Jonathon
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294