Hi Folks,
Using libvirt python bindings we are creating an lxc container.Here is the problem that we see sometimes (say 20 % of the time) when we create a new container.
1.container gets created, and also starts.However the we are not able to enter the namepace of the container.It throws an error initPid not available.We see that the using netstat command
, socket connection is closed.
2.To get around this problem we have to stop and start the container again.We see that socket under (/var/run/libvirt/*) connection is established and we are able to enter the namespace.
Enabled the libvirtd debug logs to debug this issue.
For success case we see that for new client connection gets created and is able to handle async incoming events,
2016-03-12 08:18:55.748+0000: 1247: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed54005460 classname=virLXCMonitor
2016-03-12 08:18:55.748+0000: 1247: debug : virNetSocketNew:159 : localAddr=0x7fed7cd1d170 remoteAddr=0x7fed7cd1d200 fd=28 errfd=-1 pid=0
2016-03-12 08:18:55.749+0000: 1247: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed54009040 classname=virNetSocket
2016-03-12 08:18:55.749+0000: 1247: info : virNetSocketNew:209 : RPC_SOCKET_NEW: sock=0x7fed54009040 fd=28 errfd=-1 pid=0 localAddr=127.0.0.1;0, remoteAddr=127.0.0.1;0
2016-03-12 08:18:55.749+0000: 1247: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed54009d10 classname=virNetClient
2016-03-12 08:18:55.749+0000: 1247: info : virNetClientNew:327 : RPC_CLIENT_NEW: client=0x7fed54009d10 sock=0x7fed54009040
2016-03-12 08:18:55.749+0000: 1247: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed54009d10
2016-03-12 08:18:55.749+0000: 1247: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed54009040
2016-03-12 08:18:55.750+0000: 1247: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed540009a0 classname=virNetClientProgram
2016-03-12 08:18:55.750+0000: 1247: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed540009a0
2016-03-12 08:18:55.750+0000: 1247: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed54005460
2016-03-12 08:18:55.750+0000: 1247: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7fed5c168eb0
2016-03-12 08:18:55.750+0000: 1247: debug : virLXCProcessCleanInterfaces:475 : Cleared net names: eth0
2016-03-12 08:18:55.750+0000: 1247: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7fed5c168eb0
2016-03-12 08:18:55.750+0000: 1247: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7fed5c169600
2016-03-12 08:18:55.755+0000: 1244: debug : virNetClientIncomingEvent:1808 : client=0x7fed54009d10 wantclose=0
2016-03-12 08:18:55.755+0000: 1244: debug : virNetClientIncomingEvent:1816 : Event fired 0x7fed54009040 1
2016-03-12 08:18:55.755+0000: 1244: debug : virNetMessageDecodeLength:151 : Got length, now need 36 total (32 more)
2016-03-12 08:18:55.756+0000: 1244: info : virNetClientCallDispatch:1116 : RPC_CLIENT_MSG_RX: client=0x7fed54009d10 len=36 prog=305402420 vers=1 proc=2 type=2 status=0 serial=1
2016-03-12 08:18:55.756+0000: 1244: debug : virKeepAliveCheckMessage:377 : ka=(nil), client=0x7fed81fc5ed4, msg=0x7fed54009d78
2016-03-12 08:18:55.756+0000: 1244: debug : virNetClientProgramDispatch:220 : prog=305402420 ver=1 type=2 status=0 serial=1 proc=2
2016-03-12 08:18:55.756+0000: 1244: debug : virLXCMonitorHandleEventInit:109 : Event init 1420
For failure case ,we see that the client socket connection is initiated and gets closed immediately after receiving an incoming event.In this case, I don’t see an object for virNetClientProgram
being created.
Incoming event comes in and since the its unable to find client->prog it bails out and closes the connection.
Snaphost of the code,
static int virNetClientCallDispatchMessage(virNetClientPtr client)
{
size_t i;
virNetClientProgramPtr prog = NULL;
for (i = 0; i < client->nprograms; i++) {
if (virNetClientProgramMatches(client->programs[i],
&client->msg)) {
prog = client->programs[i];
break;
}
}
if (!prog) {
VIR_DEBUG("No program found for event with prog=%d vers=%d",
client->msg.header.prog, client->msg.header.vers);
return -1;
}
2016-03-12 08:19:15.935+0000: 1246: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed5c168eb0
2016-03-12 08:19:15.935+0000: 1246: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed82bd7bc0
2016-03-12 08:19:15.935+0000: 1246: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed82bd8120 classname=virLXCMonitor
2016-03-12 08:19:15.935+0000: 1246: debug : virNetSocketNew:159 : localAddr=0x7fed7d51e170 remoteAddr=0x7fed7d51e200 fd=31 errfd=-1 pid=0
2016-03-12 08:19:15.936+0000: 1246: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed82bd8660 classname=virNetSocket
2016-03-12 08:19:15.936+0000: 1246: info : virNetSocketNew:209 : RPC_SOCKET_NEW: sock=0x7fed82bd8660 fd=31 errfd=-1 pid=0 localAddr=127.0.0.1;0, remoteAddr=127.0.0.1;0
2016-03-12 08:19:15.936+0000: 1246: info : virObjectNew:202 : OBJECT_NEW: obj=0x7fed82bd8ca0 classname=virNetClient
2016-03-12 08:19:15.936+0000: 1246: info : virNetClientNew:327 : RPC_CLIENT_NEW: client=0x7fed82bd8ca0 sock=0x7fed82bd8660
2016-03-12 08:19:15.936+0000: 1246: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed82bd8ca0
2016-03-12 08:19:15.936+0000: 1246: info : virObjectRef:296 : OBJECT_REF: obj=0x7fed82bd8660
2016-03-12 08:19:15.942+0000: 1244: debug : virNetClientIncomingEvent:1808 : client=0x7fed82bd8ca0 wantclose=0
2016-03-12 08:19:15.942+0000: 1244: debug : virNetClientIncomingEvent:1816 : Event fired 0x7fed82bd8660 1
2016-03-12 08:19:15.942+0000: 1244: debug : virNetMessageDecodeLength:151 : Got length, now need 36 total (32 more)
2016-03-12 08:19:15.942+0000: 1244: info : virNetClientCallDispatch:1116 : RPC_CLIENT_MSG_RX: client=0x7fed82bd8ca0 len=36 prog=305402420 vers=1 proc=2 type=2 status=0 serial=1
2016-03-12 08:19:15.942+0000: 1244: debug : virKeepAliveCheckMessage:377 : ka=(nil), client=0x7fed81fc5ed4, msg=0x7fed82bd8d08
2016-03-12 08:19:15.942+0000: 1244: debug : virNetClientCallDispatchMessage:1008 : No program found for event with prog=305402420 vers=1
2016-03-12 08:19:15.942+0000: 1244: debug : virNetMessageClear:57 : msg=0x7fed82bd8d08 nfds=0
2016-03-12 08:19:15.942+0000: 1244: debug : virNetClientMarkClose:632 : client=0x7fed82bd8ca0, reason=0
2016-03-12 08:19:15.942+0000: 1244: debug : virNetClientCloseLocked:648 : client=0x7fed82bd8ca0, sock=0x7fed82bd8660, reason=0
Here is the snapshot of code ,
virLXCMonitorPtr virLXCMonitorNew(virDomainObjPtr vm,
const char *socketdir,
virLXCMonitorCallbacksPtr cb)
{
virLXCMonitorPtr mon;
char *sockpath = NULL;
if (virLXCMonitorInitialize() < 0)
return NULL;
if (!(mon = virObjectLockableNew(virLXCMonitorClass)))
return NULL;
if (virAsprintf(&sockpath, "%s/%s.sock",
socketdir, vm->def->name) < 0)
goto error;
if (!(mon->client = virNetClientNewUNIX(sockpath, false, NULL)))
goto error;
if (virNetClientRegisterAsyncIO(mon->client) < 0)
goto error;
if (!(mon->program = virNetClientProgramNew(VIR_LXC_MONITOR_PROGRAM,
VIR_LXC_MONITOR_PROGRAM_VERSION,
virLXCMonitorEvents,
ARRAY_CARDINALITY(virLXCMonitorEvents),
mon)))
goto error;
if (virNetClientAddProgram(mon->client,
mon->program) < 0)
goto error;
mon->vm = vm;
memcpy(&mon->cb, cb, sizeof(mon->cb));
virObjectRef(mon);
virNetClientSetCloseCallback(mon->client, virLXCMonitorEOFNotify, mon,
virLXCMonitorCloseFreeCallback);
Is the problem occurring due to invocation of “virNetClientRegisterAsyncIO" api before the virNetClientAddProgram.Probably once we register for aysnc IO , immediately an event comes in and that thread
takes priority and bails out since it does not find the client->prog?Also the client is not retrying to establish a new connection.
Please let me any thoughts/comments.Is there any patch already available which has fixed this issue?
-Thanks,
Rammohan