Hi Daniel,
On Sun, Dec 21, 2008 at 6:15 PM, Daniel P. Berrange <berrange(a)redhat.com> wrote:
>
> However, I still have the crash in the dmesg output, as before, with errors like:
>
>
>
> sysfs: duplicate filename '0' can not be created
>
> ------------[ cut here ]------------
>
> WARNING: at fs/sysfs/dir.c:424 sysfs_add_one+0x34/0xa6()
> ...
> Pid: 2616, comm: libvirtd Tainted: P 2.6.25.20-113 #1
> ...
> *kobject_add_internal failed for 0 with -EEXIST, don't try to register
> things with the same name in the same directory.*
Any of these messages in the dmesg output are kernel problems, not
libvirt problems. The process listing you show above indicates that
libvirtd itself is running, and has not crashed.
Daniel
Using Eclipse and gdb, I've traced the problem to one line in the source
code. In lxc_container.c, line 654, this call:
cpid = clone(lxcContainerDummyChild, childStack, flags, NULL);
is crashing something inside my kernel, which produces the messages that
I sent previously in this thread. Sometimes the crash occurs even though
the "cpid" value is 0, and on the second pass (libvirtd continues to run
despite the crash messages) cpid is -1 and the system prints the debug
message from this function:
"DEBUG("clone call returned %s, container support is not enabled",
strerror(errno));"
The full function causing the error, with the exact line marked in *bold*,
is:
int lxcContainerAvailable(int features)
{
    int flags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWUSER|
        CLONE_NEWIPC|SIGCHLD;
    int cpid;
    char *childStack;
    char *stack;
    int childStatus;

    if (features & LXC_CONTAINER_FEATURE_NET)
        flags |= CLONE_NEWNET;

    if (VIR_ALLOC_N(stack, getpagesize() * 4) < 0) {
        DEBUG0("Unable to allocate stack");
        return -1;
    }

    childStack = stack + (getpagesize() * 4);

    *cpid = clone(lxcContainerDummyChild, childStack, flags, NULL);*
    VIR_FREE(stack);
    if (cpid < 0) {
        DEBUG("clone call returned %s, container support is not enabled",
              strerror(errno));
        return -1;
    } else {
        waitpid(cpid, &childStatus, 0);
    }

    return 0;
}
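One way to narrow this down might be to call clone() with a single namespace flag at a time and check dmesg after each call. The following is only a minimal sketch under that assumption, not libvirt code: the names dummy_child, probe_clone_flag and probe_all_flags are hypothetical, and it assumes a glibc whose <sched.h> defines the CLONE_NEW* constants.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child that exits immediately, in the spirit of lxcContainerDummyChild. */
static int dummy_child(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Hypothetical helper: call clone() with one extra flag plus SIGCHLD.
 * Returns 0 if clone() succeeded, otherwise the errno value it set.
 */
int probe_clone_flag(int extra_flag)
{
    size_t stack_size = (size_t)getpagesize() * 4;
    char *stack = malloc(stack_size);
    int saved_errno = 0;
    pid_t cpid;

    if (stack == NULL)
        return ENOMEM;

    /* The stack grows downwards on most architectures, so pass its top. */
    cpid = clone(dummy_child, stack + stack_size, extra_flag | SIGCHLD, NULL);
    if (cpid < 0)
        saved_errno = errno;
    else
        waitpid(cpid, NULL, 0);

    free(stack);
    return saved_errno;
}

/* Probe each namespace flag from lxcContainerAvailable() on its own. */
void probe_all_flags(void)
{
    static const struct { int flag; const char *name; } flags[] = {
        { CLONE_NEWPID,  "CLONE_NEWPID"  },
        { CLONE_NEWNS,   "CLONE_NEWNS"   },
        { CLONE_NEWUTS,  "CLONE_NEWUTS"  },
        { CLONE_NEWUSER, "CLONE_NEWUSER" },
        { CLONE_NEWIPC,  "CLONE_NEWIPC"  },
        { CLONE_NEWNET,  "CLONE_NEWNET"  },
    };
    size_t i;

    for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        int err = probe_clone_flag(flags[i].flag);
        printf("%-14s %s\n", flags[i].name, err ? strerror(err) : "ok");
    }
}
```

Calling probe_all_flags() from a small main() as root, and looking at dmesg after each printed line, should show which flag (if any) is associated with the sysfs warning on this kernel.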
I'd appreciate any clues about why this happens, and what I should change in
the host kernel to prevent it.
Thank you very very much.
Emre