On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé
<berrange(a)redhat.com>
wrote:
> On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.huang(a)smartx.com wrote:
> > From: Hyman Huang <yong.huang(a)smartx.com>
> >
> > As advised by the GNU TLS, the caller should attempt again
> > if the gnutls_record_{recv,send} return EAGAIN or EINTR;
> > check the following link to view the details:
> >
>
https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
> >
> > Add the retry parameter for virNetTLSSession{Read,Write}
> > functions in accordance with this guideline.
> >
> > This prevents the upper application from encountering the
> > following error message when it calls the virConnectOpenAuth API:
> > Unable to read TLS confirmation: Resource temporarily unavailable
> >
> > Signed-off-by: Hyman Huang <yong.huang(a)smartx.com>
> > ---
> > src/rpc/virnetclient.c | 2 +-
> > src/rpc/virnetsocket.c | 4 ++--
> > src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++--
> > src/rpc/virnettlscontext.h | 6 ++++--
> > 4 files changed, 33 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
> > index 92933220e2..5340db4211 100644
> > --- a/src/rpc/virnetclient.c
> > +++ b/src/rpc/virnetclient.c
> > @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client,
> > ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL));
> > #endif /* !WIN32 */
> >
> > - len = virNetTLSSessionRead(client->tls, buf, 1);
> > + len = virNetTLSSessionRead(client->tls, buf, 1, true);
> > if (len < 0 && errno != ENOMSG) {
> > virReportSystemError(errno, "%s",
> > _("Unable to read TLS confirmation"));
> > diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
> > index e8fc2d5f7d..6774dd4a4b 100644
> > --- a/src/rpc/virnetsocket.c
> > +++ b/src/rpc/virnetsocket.c
> > @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket
> *sock, char *buf, size_t len)
> > if (sock->tlsSession &&
> > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
> > VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> > - ret = virNetTLSSessionRead(sock->tlsSession, buf, len);
> > + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false);
> > } else {
> > ret = read(sock->fd, buf, len);
> > }
> > @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket
> *sock, const char *buf, size_t
> > if (sock->tlsSession &&
> > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
> > VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> > - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len);
> > + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false);
> > } else {
> > ret = write(sock->fd, buf, len); /* sc_avoid_write */
> > }
> > diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c
> > index e8023133b4..92d909c0b7 100644
> > --- a/src/rpc/virnettlscontext.c
> > +++ b/src/rpc/virnettlscontext.c
> > @@ -679,16 +679,28 @@ void
> virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,
> >
> >
> > ssize_t virNetTLSSessionWrite(virNetTLSSession *sess,
> > - const char *buf, size_t len)
> > + const char *buf, size_t len,
> > + bool retry)
> > {
> > ssize_t ret;
> >
> > + rewrite:
> > virObjectLock(sess);
> > ret = gnutls_record_send(sess->session, buf, len);
> >
> > if (ret >= 0)
> > goto cleanup;
> >
> > + if (retry && (ret == GNUTLS_E_AGAIN || ret ==
> GNUTLS_E_INTERRUPTED)) {
> > + /*
> > + * GNU TLS advises calling the function again to obtain the
> data if EAGAIN is returned.
> > + * See reference:
>
https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
> > + * */
>
> To repeat my feedback on v1.
>
> This would surely be wrong for E_AGAIN - if the socket is blocking we must
> return to the main loop and wait for it to indicate that the socket has
> new data. If we don't wait in the main loop, this code will busy-loop on
a non-blocking socket
Agree, for the EAGAIN error, trying to repoll could be a solution like the
following pieces of code?
repoll:
source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
G_IO_IN,
client->eventCtx,
virNetClientIOEventTLSConfirm,
client, NULL);
g_main_loop_run(client->eventLoop);
retry:
len = virNetTLSSessionRead(client->tls, buf, 1, true);
if (len < 0 && errno == EINTR) {
goto retry;
}
if (len < 0 && errno == EAGAIN) {
goto repoll;
}
Yes, and virNetSocketReadWire is already handlnig both EINTR
and EAGAIN after it calls virNetTLSSessionRead.
The other caller is virNetClientSetTLSSession and I see that
is failed to handle EINTR/EGAIN, though EGAIN seems like it
ought to be unlikely given that caller waited for G_IO_IN...
> A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the
> socket to be readable. This is something higher level libvirt
> code should already be doing correctly. Do you have any real
> world example of a problem being hit ?
>
Yes, when our higher management app try to connect Libvirtd using
"/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line
148, in openAuth raise libvirtError('virConnectOpenAuth() failed')
libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily
unavailable
....this error message comes from virNetClientSetTLSSession
and suggests the wait for G_IO_IN wasn't sufficient, so we
should retry I guess
With regards,
Daniel
--
|: