Until now, we have been dropping non-blocking calls whenever sending them would
block. When a client sends lots of stream calls (which are not supposed to
generate any reply), the assumption that having other calls in the queue is
sufficient to get a reply from the server doesn't hold. I tried to fix this in
b1e374a7ac56927cfe62435179bf0bba1e08b372 but failed and reverted that commit.
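
To illustrate the general idea of queueing a non-blocking call instead of dropping it when the socket is not writable, here is a minimal standalone sketch. It is not the code from this series: every name in it (struct call, struct call_queue, struct fake_socket, queue_call, flush_queue, pending_calls) is hypothetical, and the socket is simulated by a simple counter rather than a real file descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical and exist only for this sketch;
 * none of them are libvirt APIs. */
enum { QUEUE_MAX = 8 };

struct call {
    int id;
    bool sent;      /* true once the call has been written out */
};

struct call_queue {
    struct call calls[QUEUE_MAX];
    size_t len;
};

/* Stand-in for a non-blocking socket: "room" is the number of calls
 * that can still be written before a write would block. */
struct fake_socket {
    int room;
};

/* Simulated non-blocking write; false means "would block". */
static bool try_write(struct fake_socket *sock, const struct call *c)
{
    (void)c;
    if (sock->room <= 0)
        return false;
    sock->room--;
    return true;
}

/* Queue a call without touching the socket. The call is refused only
 * when the queue itself is full, never because a write would block. */
static bool queue_call(struct call_queue *q, int id)
{
    assert(q != NULL);
    if (q->len == QUEUE_MAX)
        return false;
    q->calls[q->len].id = id;
    q->calls[q->len].sent = false;
    q->len++;
    return true;
}

/* Meant to run when an event loop reports the socket as writable:
 * write unsent calls in order and stop at the first one that would
 * block. Returns the number of calls written in this pass. */
static size_t flush_queue(struct call_queue *q, struct fake_socket *sock)
{
    size_t written = 0;
    for (size_t i = 0; i < q->len; i++) {
        if (q->calls[i].sent)
            continue;
        if (!try_write(sock, &q->calls[i]))
            break;
        q->calls[i].sent = true;
        written++;
    }
    return written;
}

/* Number of queued calls that have not been written yet. */
static size_t pending_calls(const struct call_queue *q)
{
    size_t n = 0;
    for (size_t i = 0; i < q->len; i++)
        if (!q->calls[i].sent)
            n++;
    return n;
}
```

In this sketch a call that cannot be written right away simply stays in the queue and goes out on a later flush_queue() pass, so nothing is lost when the socket is temporarily full. Sent calls are left in the queue for simplicity; a real client would remove them once they complete.
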
While working on the proper fix, I discovered several other issues in how the
client RPC code handles keepalive messages. See individual patches for more
details.

As a nice bonus, the fixed version is shorter by one line than the current
broken version :-)
Jiri Denemark (12):
  client rpc: Improve debug messages in virNetClientIO
  client rpc: Use event loop for writing
  client rpc: Don't drop non-blocking calls
  client rpc: Just queue non-blocking call if another thread has the
    buck
  client rpc: Drop unused return value of virNetClientSendNonBlock
  rpc: Refactor keepalive timer code
  rpc: Add APIs for direct triggering of keepalive timer
  client rpc: Separate call creation from running IO loop
  rpc: Do not use timer for sending keepalive responses
  rpc: Remove unused parameter in virKeepAliveStopInternal
  server rpc: Remove APIs for manipulating filters on locked client
  client rpc: Send keepalive requests from IO event loop

 src/libvirt_probes.d         |    2 +-
 src/rpc/virkeepalive.c       |  233 ++++++++++++--------------
 src/rpc/virkeepalive.h       |    7 +-
 src/rpc/virnetclient.c       |  368 +++++++++++++++++++++++-------------------
 src/rpc/virnetserverclient.c |  127 +++++++--------
 5 files changed, 368 insertions(+), 369 deletions(-)
--
1.7.10.2