"Daniel P. Berrange" <berrange(a)redhat.com> wrote:
I noticed a odd error message randomly appearing from virsh after
all
commands had been run
# virsh dominfo VirtTest
Id: -
Name: VirtTest
UUID: 82038f2a-1344-aaf7-1a85-2a7250be2076
OS Type: hvm
State: shut off
CPU(s): 3
Max memory: 256000 kB
Used memory: 256000 kB
Autostart: disable
libvir: Remote error : server closed connection
It turns out that inthe for(;;) loop where I'm procesing incoming data,
it was forgetting to break out of the loop when a completed RPC call
had been received. Most of the time this wasn't problem, because there'd
rarely be more data following, so it'd get EAGAIN next iteration & quit
the loop. When shutting down though, you'd occassionally see the SIGHUP
from read() before you get an EAGAIN, causing this error to be raised
even though the RPC call had been successfully received.
In addition, if there was a 2nd RPC reply already pending, we'd read and
loose some of its data. Though I never saw this happen, it is a definite
theoretical possibility, particularly over a TCP link with bad latency
or fragementation or bursty traffic.
Daniel
diff -r e582072116a3 src/remote_internal.c
--- a/src/remote_internal.c Mon Jan 26 16:21:42 2009 +0000
+++ b/src/remote_internal.c Mon Jan 26 17:02:15 2009 +0000
@@ -6135,12 +6135,27 @@ processCallRecv(virConnectPtr conn, stru
if (priv->bufferOffset == priv->bufferLength) {
if (priv->bufferOffset == 4) {
ret = processCallRecvLen(conn, priv, in_open);
+ if (ret < 0)
+ return -1;
This part is no net change. Just hoisting the test+return
that used to follow both calls.
+ /*
+ * We'll carry on around the loop to immediately
+ * process the message body, because it has probably
+ * already arrived. Worst case, we'll get EAGAIN on
+ * next iteration.
+ */
} else {
ret = processCallRecvMsg(conn, priv, in_open);
priv->bufferOffset = priv->bufferLength = 0;
- }
- if (ret < 0)
- return -1;
+ /*
+ * We've completed one call, so return even
+ * though there might still be more data on
+ * the wire. We need to actually let the caller
+ * deal with this arrived message to keep good
+ * response, and also to correctly handle EOF.
+ */
+ return ret;
This seems like a good idea for all the reasons you give.
Suggestions, while you're in the area:
- s/EGAIN/EAGAIN/ in a comment just above, in the same function.
- less important: I'd move the declaration of "ret" down into the
while loop, since it's used only therein. Save 2 lines (yeah, yeah,
out of 6800+ -- gotta start somewhere)
ACK