[PATCH v1 0/1] rpc: Re-read the data if EAGAIN or EINTR were captured

From: Hyman Huang <yong.huang@smartx.com> v1: 1. Encapsulate the retry logic inside virNetTLSSession{Read,Write} 2. Use VIR_DEBUG instead of VIR_WARN to log the retry operation rfc: https://patchew.org/Libvirt/d716a59dc2c61916917c6d2e07d62055745606d5.1744044... Please review, thanks. Yong Hyman Huang (1): rpc: Add the retry argument for virNetTLSSession{Read,Write} src/rpc/virnetclient.c | 2 +- src/rpc/virnetsocket.c | 4 ++-- src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- src/rpc/virnettlscontext.h | 6 ++++-- 4 files changed, 33 insertions(+), 7 deletions(-) -- 2.27.0

From: Hyman Huang <yong.huang@smartx.com> As advised by the GNU TLS, the caller should attempt again if the gnutls_record_{recv,send} return EAGAIN or EINTR; check the following link to view the details: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html Add the retry parameter for virNetTLSSession{Read,Write} functions in accordance with this guideline. This prevents the upper application from encountering the following error message when it calls the virConnectOpenAuth API: Unable to read TLS confirmation: Resource temporarily unavailable Signed-off-by: Hyman Huang <yong.huang@smartx.com> --- src/rpc/virnetclient.c | 2 +- src/rpc/virnetsocket.c | 4 ++-- src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- src/rpc/virnettlscontext.h | 6 ++++-- 4 files changed, 33 insertions(+), 7 deletions(-) diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 92933220e2..5340db4211 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client, ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL)); #endif /* !WIN32 */ - len = virNetTLSSessionRead(client->tls, buf, 1); + len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno != ENOMSG) { virReportSystemError(errno, "%s", _("Unable to read TLS confirmation")); diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index e8fc2d5f7d..6774dd4a4b 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket *sock, char *buf, size_t len) if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionRead(sock->tlsSession, buf, len); + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false); } else { ret = read(sock->fd, buf, len); } @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket *sock, const char *buf, size_t if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len); + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false); } else { ret = write(sock->fd, buf, len); /* sc_avoid_write */ } diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c index e8023133b4..92d909c0b7 100644 --- a/src/rpc/virnettlscontext.c +++ b/src/rpc/virnettlscontext.c @@ -679,16 +679,28 @@ void virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess, ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, - const char *buf, size_t len) + const char *buf, size_t len, + bool retry) { ssize_t ret; + rewrite: virObjectLock(sess); ret = gnutls_record_send(sess->session, buf, len); if (ret >= 0) goto cleanup; + if (retry && (ret == GNUTLS_E_AGAIN || ret == GNUTLS_E_INTERRUPTED)) { + /* + * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned. + * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html + * */ + VIR_DEBUG("Try writing data from the TLS session again"); + virObjectUnlock(sess); + goto rewrite; + } + switch (ret) { case GNUTLS_E_AGAIN: errno = EAGAIN; @@ -712,16 +724,28 @@ ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, } ssize_t virNetTLSSessionRead(virNetTLSSession *sess, - char *buf, size_t len) + char *buf, size_t len, + bool retry) { ssize_t ret; + reread: virObjectLock(sess); ret = gnutls_record_recv(sess->session, buf, len); if (ret >= 0) goto cleanup; + if (retry && (ret == GNUTLS_E_AGAIN || ret == GNUTLS_E_INTERRUPTED)) { + /* + * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned. + * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html + * */ + VIR_DEBUG("Try reading data from the TLS session again"); + virObjectUnlock(sess); + goto reread; + } + switch (ret) { case GNUTLS_E_AGAIN: errno = EAGAIN; diff --git a/src/rpc/virnettlscontext.h b/src/rpc/virnettlscontext.h index 11c954ce4b..da26d7836b 100644 --- a/src/rpc/virnettlscontext.h +++ b/src/rpc/virnettlscontext.h @@ -81,9 +81,11 @@ void virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess, void *opaque); ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, - const char *buf, size_t len); + const char *buf, size_t len, + bool retry); ssize_t virNetTLSSessionRead(virNetTLSSession *sess, - char *buf, size_t len); + char *buf, size_t len, + bool retry); int virNetTLSSessionHandshake(virNetTLSSession *sess); -- 2.27.0

On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.huang@smartx.com wrote:
From: Hyman Huang <yong.huang@smartx.com>
As advised by the GNU TLS, the caller should attempt again if the gnutls_record_{recv,send} return EAGAIN or EINTR; check the following link to view the details: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
Add the retry parameter for virNetTLSSession{Read,Write} functions in accordance with this guideline.
This prevents the upper application from encountering the following error message when it calls the virConnectOpenAuth API: Unable to read TLS confirmation: Resource temporarily unavailable
Signed-off-by: Hyman Huang <yong.huang@smartx.com> --- src/rpc/virnetclient.c | 2 +- src/rpc/virnetsocket.c | 4 ++-- src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- src/rpc/virnettlscontext.h | 6 ++++-- 4 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 92933220e2..5340db4211 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client, ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL)); #endif /* !WIN32 */
- len = virNetTLSSessionRead(client->tls, buf, 1); + len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno != ENOMSG) { virReportSystemError(errno, "%s", _("Unable to read TLS confirmation")); diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index e8fc2d5f7d..6774dd4a4b 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket *sock, char *buf, size_t len) if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionRead(sock->tlsSession, buf, len); + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false); } else { ret = read(sock->fd, buf, len); } @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket *sock, const char *buf, size_t if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len); + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false); } else { ret = write(sock->fd, buf, len); /* sc_avoid_write */ } diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c index e8023133b4..92d909c0b7 100644 --- a/src/rpc/virnettlscontext.c +++ b/src/rpc/virnettlscontext.c @@ -679,16 +679,28 @@ void virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,
ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, - const char *buf, size_t len) + const char *buf, size_t len, + bool retry) { ssize_t ret;
+ rewrite: virObjectLock(sess); ret = gnutls_record_send(sess->session, buf, len);
if (ret >= 0) goto cleanup;
+ if (retry && (ret == GNUTLS_E_AGAIN || ret == GNUTLS_E_INTERRUPTED)) { + /* + * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned. + * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html + * */
To repeat my feedback on v1. This would surely be wrong for E_AGAIN - if the socket is blocking we must return to the main loop and wait for it to indicate that the socket has new data. If we don't wait in the main loop, this code will busy-loop on a non-blocking socket The GNUTLS docs do NOT say you must call it again /immediately/. A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the socket to be readable. This is something higher level libvirt code should already be doing correctly. Do you have any real world example of a problem being hit ?
+ VIR_DEBUG("Try writing data from the TLS session again"); + virObjectUnlock(sess); + goto rewrite; + } +
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.huang@smartx.com wrote:
From: Hyman Huang <yong.huang@smartx.com>
As advised by the GNU TLS, the caller should attempt again if the gnutls_record_{recv,send} return EAGAIN or EINTR; check the following link to view the details:
https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
Add the retry parameter for virNetTLSSession{Read,Write} functions in accordance with this guideline.
This prevents the upper application from encountering the following error message when it calls the virConnectOpenAuth API: Unable to read TLS confirmation: Resource temporarily unavailable
Signed-off-by: Hyman Huang <yong.huang@smartx.com> --- src/rpc/virnetclient.c | 2 +- src/rpc/virnetsocket.c | 4 ++-- src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- src/rpc/virnettlscontext.h | 6 ++++-- 4 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 92933220e2..5340db4211 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client, ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL)); #endif /* !WIN32 */
- len = virNetTLSSessionRead(client->tls, buf, 1); + len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno != ENOMSG) { virReportSystemError(errno, "%s", _("Unable to read TLS confirmation")); diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index e8fc2d5f7d..6774dd4a4b 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket
*sock, char *buf, size_t len)
if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionRead(sock->tlsSession, buf, len); + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false); } else { ret = read(sock->fd, buf, len); } @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket
*sock, const char *buf, size_t
if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len); + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false); } else { ret = write(sock->fd, buf, len); /* sc_avoid_write */ } diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c index e8023133b4..92d909c0b7 100644 --- a/src/rpc/virnettlscontext.c +++ b/src/rpc/virnettlscontext.c @@ -679,16 +679,28 @@ void
virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,
ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, - const char *buf, size_t len) + const char *buf, size_t len, + bool retry) { ssize_t ret;
+ rewrite: virObjectLock(sess); ret = gnutls_record_send(sess->session, buf, len);
if (ret >= 0) goto cleanup;
+ if (retry && (ret == GNUTLS_E_AGAIN || ret ==
GNUTLS_E_INTERRUPTED)) {
+ /* + * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned. + * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html + * */
To repeat my feedback on v1.
This would surely be wrong for E_AGAIN - if the socket is blocking we must return to the main loop and wait for it to indicate that the socket has new data. If we don't wait in the main loop, this code will busy-loop on
a non-blocking socket Agree, for the EAGAIN error, trying to repoll could be a solution like the following pieces of code? repoll: source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock), G_IO_IN, client->eventCtx, virNetClientIOEventTLSConfirm, client, NULL); g_main_loop_run(client->eventLoop); retry: len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno == EINTR) { goto retry; } if (len < 0 && errno == EAGAIN) { goto repoll; }
The GNUTLS docs do NOT say you must call it again /immediately/.
A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the socket to be readable. This is something higher level libvirt code should already be doing correctly. Do you have any real world example of a problem being hit ?
Yes, when our higher management app try to connect Libvirtd using uri: qemu+tls://10.111.222.11/system?no_verify=1 with the following python code: def libvirt_connection(uri=None, auth_info=None): conn = None _auth_info = auth_info def _request_cred(credentials, user_data): if not _auth_info: return 0 for credential in credentials: if credential[0] == libvirt.VIR_CRED_AUTHNAME: credential[4] = _auth_info["username"] elif credential[0] == libvirt.VIR_CRED_PASSPHRASE: credential[4] = _auth_info["password"] return 0 auth = [[libvirt.VIR_CRED_AUTHNAME, libvirt.VIR_CRED_PASSPHRASE], _request_cred, None] if uri and "no_verify" not in uri: uri = "{}?no_verify=1".format(uri) try: conn = libvirt.openAuth(uri or "qemu:///system", auth) yield conn except libvirt.libvirtError as e: logging.exception("libvirtError: %s" % str(e)) raise finally: if conn: conn.close() We encountered the following error message: [2025-04-01 12:54:42,339: ERROR/MainProcess trace_id=79a42e2d-9ee1-4381-9a76-c96a05c88a66] libvirtError: Unable to read TLS confirmation: Resource temporarily unavailable Traceback (most recent call last): File "/usr/local/venv/job-center/lib/python3.10/site-packages/smartx_app/elf/common/utils/libvirt_driver.py", line 58, in libvirt_connection conn = libvirt.openAuth(uri or "qemu:///system", auth) File "/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line 148, in openAuth raise libvirtError('virConnectOpenAuth() failed') libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily unavailable
+ VIR_DEBUG("Try writing data from the TLS session again"); + virObjectUnlock(sess); + goto rewrite; + } +
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Thanks for the comments. Yong -- Best regards

On Mon, Apr 14, 2025 at 09:51:32AM +0800, Yong Huang wrote:
On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.huang@smartx.com wrote:
From: Hyman Huang <yong.huang@smartx.com>
As advised by the GNU TLS, the caller should attempt again if the gnutls_record_{recv,send} return EAGAIN or EINTR; check the following link to view the details:
https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
Add the retry parameter for virNetTLSSession{Read,Write} functions in accordance with this guideline.
This prevents the upper application from encountering the following error message when it calls the virConnectOpenAuth API: Unable to read TLS confirmation: Resource temporarily unavailable
Signed-off-by: Hyman Huang <yong.huang@smartx.com> --- src/rpc/virnetclient.c | 2 +- src/rpc/virnetsocket.c | 4 ++-- src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- src/rpc/virnettlscontext.h | 6 ++++-- 4 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 92933220e2..5340db4211 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client, ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL)); #endif /* !WIN32 */
- len = virNetTLSSessionRead(client->tls, buf, 1); + len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno != ENOMSG) { virReportSystemError(errno, "%s", _("Unable to read TLS confirmation")); diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c index e8fc2d5f7d..6774dd4a4b 100644 --- a/src/rpc/virnetsocket.c +++ b/src/rpc/virnetsocket.c @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket
*sock, char *buf, size_t len)
if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionRead(sock->tlsSession, buf, len); + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false); } else { ret = read(sock->fd, buf, len); } @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket
*sock, const char *buf, size_t
if (sock->tlsSession && virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == VIR_NET_TLS_HANDSHAKE_COMPLETE) { - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len); + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false); } else { ret = write(sock->fd, buf, len); /* sc_avoid_write */ } diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c index e8023133b4..92d909c0b7 100644 --- a/src/rpc/virnettlscontext.c +++ b/src/rpc/virnettlscontext.c @@ -679,16 +679,28 @@ void
virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,
ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, - const char *buf, size_t len) + const char *buf, size_t len, + bool retry) { ssize_t ret;
+ rewrite: virObjectLock(sess); ret = gnutls_record_send(sess->session, buf, len);
if (ret >= 0) goto cleanup;
+ if (retry && (ret == GNUTLS_E_AGAIN || ret ==
GNUTLS_E_INTERRUPTED)) {
+ /* + * GNU TLS advises calling the function again to obtain the data if EAGAIN is returned. + * See reference: https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html + * */
To repeat my feedback on v1.
This would surely be wrong for E_AGAIN - if the socket is blocking we must return to the main loop and wait for it to indicate that the socket has new data. If we don't wait in the main loop, this code will busy-loop on
a non-blocking socket
Agree, for the EAGAIN error, trying to repoll could be a solution like the following pieces of code?
repoll: source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock), G_IO_IN, client->eventCtx, virNetClientIOEventTLSConfirm, client, NULL);
g_main_loop_run(client->eventLoop);
retry: len = virNetTLSSessionRead(client->tls, buf, 1, true); if (len < 0 && errno == EINTR) { goto retry; } if (len < 0 && errno == EAGAIN) { goto repoll; }
Yes, and virNetSocketReadWire is already handlnig both EINTR and EAGAIN after it calls virNetTLSSessionRead. The other caller is virNetClientSetTLSSession and I see that is failed to handle EINTR/EGAIN, though EGAIN seems like it ought to be unlikely given that caller waited for G_IO_IN...
A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the socket to be readable. This is something higher level libvirt code should already be doing correctly. Do you have any real world example of a problem being hit ?
snip
Yes, when our higher management app try to connect Libvirtd using "/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line 148, in openAuth raise libvirtError('virConnectOpenAuth() failed') libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily unavailable
....this error message comes from virNetClientSetTLSSession and suggests the wait for G_IO_IN wasn't sufficient, so we should retry I guess With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (3)
-
Daniel P. Berrangé
-
Yong Huang
-
yong.huang@smartx.com