On Thu, Nov 26, 2015 at 04:10:40PM +0000, Ben Gray wrote:
Hi,
Occasionally when trying to start LXC containers with fds I get
the following error:
Wow, this is here for a while... Sorry for the delay.
virNetMessageDupFD:562 : Unable to duplicate FD -1: Bad file
descriptor
It's weird that there's '-1', though. But I might just missed something
when re-tracking what you did.
I tracked it down to the code that handles EAGAIN errors from
recvfd. In such cases the virNetMessageDecodeNumFDs function may be
called multiple times from virNetServerClientDispatchRead and each
time it overwrites the msg->fds array. In the best case (when
msg->donefds == 0) this results in a memory leak, in the worse case it
will leak any fd's already in msg->fds and cause subsequent failures
when dup is called.
It is a problem indeed. I'm testing your patch now, but without
reproducing the issue it's going to be a bit blind test. I don't
suppose there is easier to reproduce this other than connecting through
pv or socat that limits the block size and speed sent and then try
starting a container with bunch of FDs, right?
A very similar problem is mention here:
https://www.redhat.com/archives/libvir-list/2012-December/msg01306.html
Below is my patch to fix the issue.
--- a/src/rpc/virnetserverclient.c 2015-01-23 11:46:24.000000000 +0000
+++ b/src/rpc/virnetserverclient.c 2015-11-26 15:30:51.214462290 +0000
@@ -1107,36 +1107,40 @@
/* Now figure out if we need to read more data to get some
* file descriptors */
- if (msg->header.type == VIR_NET_CALL_WITH_FDS &&
- virNetMessageDecodeNumFDs(msg) < 0) {
- virNetMessageQueueServe(&client->rx);
- virNetMessageFree(msg);
- client->wantClose = true;
- return; /* Error */
- }
+ if (msg->header.type == VIR_NET_CALL_WITH_FDS) {
+ size_t i;
This is already here, it will shadow another local declaration.
Anyway, it's worth to mention that this patch does very little
functional change and is good to be viewed with '-w'.
I'm fairly sure we want to have this fixed and even though the freeze
for 1.3.1 started today, this fixes a bug and was sent long ago, so I
think it's fine if it goes in. But since this does not apply to current
master, I moved the code in a similar fashion (actually the same way, I
don't know why it didn't apply). I'll also reword the message and then
I'll push it. Thanks for the fix and congratulations to your first
patch in libvirt ;)
Martin