On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.huang@smartx.com wrote:
> From: Hyman Huang <yong.huang@smartx.com>
>
> As advised by the GNU TLS, the caller should attempt again
> if the gnutls_record_{recv,send} return EAGAIN or EINTR;
> check the following link to view the details:
> https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
>
> Add the retry parameter for virNetTLSSession{Read,Write}
> functions in accordance with this guideline.
>
> This prevents the upper application from encountering the
> following error message when it calls the virConnectOpenAuth API:
> Unable to read TLS confirmation: Resource temporarily unavailable
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
>  src/rpc/virnetclient.c     |  2 +-
>  src/rpc/virnetsocket.c     |  4 ++--
>  src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++--
>  src/rpc/virnettlscontext.h |  6 ++++--
>  4 files changed, 33 insertions(+), 7 deletions(-)
>
> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
> index 92933220e2..5340db4211 100644
> --- a/src/rpc/virnetclient.c
> +++ b/src/rpc/virnetclient.c
> @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client,
>      ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL));
>  #endif /* !WIN32 */

> -    len = virNetTLSSessionRead(client->tls, buf, 1);
> +    len = virNetTLSSessionRead(client->tls, buf, 1, true);
>      if (len < 0 && errno != ENOMSG) {
>          virReportSystemError(errno, "%s",
>                               _("Unable to read TLS confirmation"));
> diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
> index e8fc2d5f7d..6774dd4a4b 100644
> --- a/src/rpc/virnetsocket.c
> +++ b/src/rpc/virnetsocket.c
> @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket *sock, char *buf, size_t len)
>      if (sock->tlsSession &&
>          virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
>          VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> -        ret = virNetTLSSessionRead(sock->tlsSession, buf, len);
> +        ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false);
>      } else {
>          ret = read(sock->fd, buf, len);
>      }
> @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket *sock, const char *buf, size_t
>      if (sock->tlsSession &&
>          virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
>          VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> -        ret = virNetTLSSessionWrite(sock->tlsSession, buf, len);
> +        ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false);
>      } else {
>          ret = write(sock->fd, buf, len); /* sc_avoid_write */
>      }
> diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c
> index e8023133b4..92d909c0b7 100644
> --- a/src/rpc/virnettlscontext.c
> +++ b/src/rpc/virnettlscontext.c
> @@ -679,16 +679,28 @@ void virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,


>  ssize_t virNetTLSSessionWrite(virNetTLSSession *sess,
> -                              const char *buf, size_t len)
> +                              const char *buf, size_t len,
> +                              bool retry)
>  {
>      ssize_t ret;

> + rewrite:
>      virObjectLock(sess);
>      ret = gnutls_record_send(sess->session, buf, len);

>      if (ret >= 0)
>          goto cleanup;

> +    if (retry && (ret == GNUTLS_E_AGAIN || ret == GNUTLS_E_INTERRUPTED)) {
> +        /*
> +         * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned.
> +         * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
> +         * */

To repeat my feedback on v1.

This would surely be wrong for E_AGAIN - if the socket is blocking we must
return to the main loop and wait for it to indicate that the socket has
new data. If we don't wait in the main loop, this code will busy-loop on
a non-blocking socket

Agree, for the EAGAIN error, trying to repoll could be a solution like the
following pieces of code?

 repoll:
    source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
                                        G_IO_IN,
                                        client->eventCtx,
                                        virNetClientIOEventTLSConfirm, client, NULL);

    g_main_loop_run(client->eventLoop);

 retry:
    len = virNetTLSSessionRead(client->tls, buf, 1, true);
    if (len < 0 && errno == EINTR) {
        goto retry;
    }
    if (len < 0 && errno == EAGAIN) {
        goto repoll;
    }
 


The GNUTLS docs do NOT say you must call it again /immediately/.

A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the
socket to be readable. This is something higher level libvirt
code should already be doing correctly. Do you have any real
world example of a problem being hit ?

Yes, when our higher management app try to connect Libvirtd using
uri: qemu+tls://10.111.222.11/system?no_verify=1 with the following
python code:

def libvirt_connection(uri=None, auth_info=None):
    conn = None
    _auth_info = auth_info

    def _request_cred(credentials, user_data):
        if not _auth_info:
            return 0

        for credential in credentials:
            if credential[0] == libvirt.VIR_CRED_AUTHNAME:
                credential[4] = _auth_info["username"]
            elif credential[0] == libvirt.VIR_CRED_PASSPHRASE:
                credential[4] = _auth_info["password"]
        return 0

    auth = [[libvirt.VIR_CRED_AUTHNAME, libvirt.VIR_CRED_PASSPHRASE], _request_cred, None]

    if uri and "no_verify" not in uri:
        uri = "{}?no_verify=1".format(uri)

    try:
        conn = libvirt.openAuth(uri or "qemu:///system", auth)
        yield conn
    except libvirt.libvirtError as e:
        logging.exception("libvirtError: %s" % str(e))
        raise
    finally:
        if conn:
          conn.close()

We encountered the following error message:

[2025-04-01 12:54:42,339: ERROR/MainProcess trace_id=79a42e2d-9ee1-4381-9a76-c96a05c88a66] libvirtError: Unable to read TLS confirmation: Resource temporarily unavailable Traceback (most recent call last): File "/usr/local/venv/job-center/lib/python3.10/site-packages/smartx_app/elf/common/utils/libvirt_driver.py", line 58, in libvirt_connection conn = libvirt.openAuth(uri or "qemu:///system", auth) File "/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line 148, in openAuth raise libvirtError('virConnectOpenAuth() failed') libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily unavailable



> +        VIR_DEBUG("Try writing data from the TLS session again");
> +        virObjectUnlock(sess);
> +        goto rewrite;
> +    }
> +

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Thanks for the comments.

Yong
--
Best regards