On Fri, Dec 15, 2023 at 02:32:19AM +0800, Fima Shevrin wrote:
> RPC client implementation uses the following paradigm. The critical
> section is organized via virObjectLock(client)/virObjectUnlock(client)
> braces. Though this is potentially problematic as
>     main thread:                            side thread:
>         virObjectUnlock(client);
>                                         virObjectLock(client);
>                                         g_main_loop_quit(client->eventLoop);
>                                         virObjectUnlock(client);
>         g_main_loop_run(client->eventLoop);
>
> This means in particular that if the main thread is executing a very long
> request like VM migration, the wakeup from the side thread could be
> stuck until the main request has fully completed.
Can you explain this in more detail, with call traces illustrating
the two threads? You're not saying where the main thread is
doing work with the 'client' lock held for a long time. Generally
the goal should be for the main thread to only hold the lock for
a short time. Also, if the side thread is already holding a reference
on 'client', then we should consider whether it is possible
to terminate the event loop without acquiring the mutex, as GMainLoop
already protects itself against concurrent usage, provided all threads
hold a reference directly or indirectly.
>
> The described case is easily reproducible with simple python scripts doing slow
> and fast requests in parallel from two different threads.
>
> Our idea is to release the lock at the prepare stage and so avoid libvirt getting
> stuck during the interaction between main and side threads.
>
> Co-authored-by: Denis V. Lunev <den@openvz.org>
> Co-authored-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
>
> Signed-off-by: Fima Shevrin <efim.shevrin@virtuozzo.com>
> ---
> src/rpc/virnetclient.c | 17 ++++++++++++-----
> src/util/vireventglibwatch.c | 28 ++++++++++++++++++++++++++--
> 2 files changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
> index de8ebc2da9..63bd42ed3a 100644
> --- a/src/rpc/virnetclient.c
> +++ b/src/rpc/virnetclient.c
> @@ -987,6 +987,9 @@ int virNetClientSetTLSSession(virNetClient *client,
> * etc. If we make the grade, it will send us a '\1' byte.
> */
>
> + /* Here we are not passing the client to virEventGLibAddSocketWatch,
> + * since the entire virNetClientSetTLSSession function requires a lock.
> + */
> source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
> G_IO_IN,
> client->eventCtx,
> @@ -1692,14 +1695,18 @@ static int virNetClientIOEventLoop(virNetClient *client,
> if (client->nstreams)
> ev |= G_IO_IN;
>
> + /*
> + * We don't need to call virObjectLock(client) here,
> + * since the .prepare function inside glib Main Loop
> + * will do this. virEventGLibAddSocketWatch is responsible
> + * for passing client var in glib .prepare
> + */
> source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
> ev,
> client->eventCtx,
> - virNetClientIOEventFD, &data, NULL, NULL);
> -
> - /* Release lock while poll'ing so other threads
> - * can stuff themselves on the queue */
> - virObjectUnlock(client);
> + virNetClientIOEventFD, &data,
> + (virObjectLockable *)client,
> + NULL);
>
> #ifndef WIN32
> /* Block SIGWINCH from interrupting poll in curses programs,
> diff --git a/src/util/vireventglibwatch.c b/src/util/vireventglibwatch.c
> index 7680656ba2..641b772995 100644
> --- a/src/util/vireventglibwatch.c
> +++ b/src/util/vireventglibwatch.c
> @@ -34,11 +34,23 @@ struct virEventGLibFDSource {
>
>
> static gboolean
> -virEventGLibFDSourcePrepare(GSource *source G_GNUC_UNUSED,
> +virEventGLibFDSourcePrepare(GSource *source,
> gint *timeout)
> {
> + virEventGLibFDSource *ssource = (virEventGLibFDSource *)source;
> *timeout = -1;
>
> + if (ssource->client != NULL)
> + virObjectUnlock(ssource->client);
> +
> + /*
> + * Prepare function may be called multiple times
> + * in glib Main Loop, thus we assign source->client
> + * a null pointer to avoid calling pthread_mutex_unlock
> + * on an already unlocked mutex.
> + * */
> + ssource->client = NULL;
> +
> return FALSE;
> }
>
> @@ -123,11 +135,23 @@ struct virEventGLibSocketSource {
>
>
> static gboolean
> -virEventGLibSocketSourcePrepare(GSource *source G_GNUC_UNUSED,
> +virEventGLibSocketSourcePrepare(GSource *source,
> gint *timeout)
> {
> + virEventGLibSocketSource *ssource = (virEventGLibSocketSource *)source;
> *timeout = -1;
>
> + if (ssource->client != NULL)
> + virObjectUnlock(ssource->client);
> +
> + /*
> + * Prepare function may be called multiple times
> + * in glib Main Loop, thus we assign source->client
> + * a null pointer to avoid calling pthread_mutex_unlock
> + * on an already unlocked mutex.
> + * */
> + ssource->client = NULL;
> +
> return FALSE;
> }
>
> --
> 2.34.1
With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|