On Thu, Jun 10, 2010 at 10:35 PM, Matthias Bolte <matthias.bolte@googlemail.com> wrote:

2010/6/10 Emre Erenoglu <erenoglu@gmail.com>:
> On Thu, Jun 10, 2010 at 9:07 PM, Daniel P. Berrange <berrange@redhat.com>
> wrote:
>>
>> On Thu, Jun 10, 2010 at 08:57:15PM +0300, Emre Erenoglu wrote:
>> > On Thu, Jun 10, 2010 at 5:02 PM, Matthias Bolte <
>> > matthias.bolte@googlemail.com> wrote:
>> >
>> > > 2010/6/10 Emre Erenoglu <erenoglu@gmail.com>:
>> > > The initscript explicitly starts the one in /usr/sbin. If you just
>> > > start libvirtd manually without an absolute path then you'll start the
>> > > one in /usr/local/sbin. This might explain why you cannot reproduce
>> > > the segfault manually, but it doesn't explain why the segfault
>> > > happens.
>> > >
>> >
>> > There's no other installation of libvirt in the system. I can also
>> > reproduce the same thing on all Pardus machines, so I believe
>> > something in libvirt doesn't get along with something in our service
>> > init mechanisms.
>>
>> I guess I'd put money on some environment variable causing trouble.
>> It could be a *missing* environment variable that we expect to always
>> be set, or something like that
>
> Hi Daniel, thanks for your message. Yes, I wrote a small script as you
> suggested and found that libvirtd was being run with this environment:
>
> DBUS_STARTER_ADDRESS=unix:path=/var/run/dbus/system_bus_socket,guid=6c515f612162b05d554b59cd4c112d43
> KRB5_KTNAME=/etc/libvirt/krb5.tab
> PWD=/
> DBUS_STARTER_BUS_TYPE=system
> SHLVL=1
> _=/usr/bin/env
>
> This looks very sparse compared to the standard root environment that I
> pasted in my earlier message.
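
For reference, an environment dump like this can be captured with a wrapper along the following lines. This is only a sketch: the LIBVIRTD_ENV_LOG variable and its default path are hypothetical names picked for the example, and the daemon path assumes the libvirtd.real rename from Daniel's wrapper.

```shell
#!/bin/sh
# Sketch of an environment-dumping wrapper. LIBVIRTD_ENV_LOG and its default
# path are hypothetical names chosen for this example.
env_log=${LIBVIRTD_ENV_LOG:-/tmp/libvirtd.env}

# Record the environment exactly as the init mechanism handed it over.
/usr/bin/env > "$env_log"

# In a real wrapper, hand over to the daemon afterwards, for example:
# exec /usr/sbin/libvirtd.real "$@"
```

Comparing that file with the output of env in a root shell shows which variables the service environment lacks.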

No PATH? I bet there is code in libvirt that assumes getenv("PATH")
will be != NULL.

Could you try to add PATH to the environment. It can be empty, doesn't
matter. Just make sure it's there, so getenv("PATH") returns an empty
string instead of NULL.
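
The difference between the two cases can be checked from a shell with env -i, which starts a command with an empty environment. This is a rough sketch, assuming env lives at /usr/bin/env as in the dump above; the commented-out libvirtd line is only a guess at how the crash might be reproduced by hand.

```shell
#!/bin/sh
# env -i clears the environment, so the child sees no PATH at all and
# getenv("PATH") returns NULL there.
no_path=$(env -i /usr/bin/env)

# Here PATH exists but is empty, so getenv("PATH") returns "".
empty_path=$(env -i PATH= /usr/bin/env)

printf 'without PATH:    [%s]\n' "$no_path"
printf 'with empty PATH: [%s]\n' "$empty_path"

# Possible manual reproduction (untested assumption):
# env -i KRB5_KTNAME=/etc/libvirt/krb5.tab /usr/sbin/libvirtd
```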

Following the same instinct, I had just done exactly that, i.e. added the PATH environment variable, and that nailed it down! It works! Wow!

>>
>> > > >> Could you provide a GDB backtrace of the segfault? The syslog entry
>> > > >> only says that it crashed in libc, that's not enough information to
>> > > >> debug the segfault.
>> > > >
>> > > > Unfortunately, I can't find a related core file in the system. In
>> > > > fact, no core file is generated. I'll also try to sort this out and
>> > > > come back to the list.
>> > > >
>> > >
>> > > Getting a backtrace would be simpler if you could reproduce the
>> > > problem manually. In that case you could just start libvirtd in GDB.
>> > > But getting a backtrace from a coredump will work too.
>> > >
>> > I can't reproduce the segfault when I run it manually. It only happens
>> > when it's run from this Python script. I will try to start gdb inside
>> > the script and connect remotely to the gdb session, but it's getting a
>> > bit beyond my debugging capabilities :) For example, I don't know how
>> > to point gdb at the symbols and source code in the package build
>> > directory.
>>
>> Try creating a wrapper script, eg
>>
>> mv /usr/sbin/libvirtd /usr/sbin/libvirtd.real
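>> cat > /usr/sbin/libvirtd <<'EOF'
>> #!/bin/sh
>> # Run from /tmp so the core file is written there
>> cd /tmp
>> # Remove the core file size limit
>> ulimit -c unlimited
>> # Replace the wrapper with the real daemon, passing on all arguments
>> exec /usr/sbin/libvirtd.real "$@"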
>> EOF
>> chmod +x /usr/sbin/libvirtd
>>
>> That will hopefully give you a core dump in /tmp that you can get a
>> stack trace from
>
> Yes, I got the core file with the script. However, when I open the core file
> with gdb and use the bt command to get the backtrace, the only thing it
> tells me is this:
>
> Core was generated by `/usr/sbin/libvirtd --daemon'.
> Program terminated with signal 11, Segmentation fault.
> #0 0xb73ed8f3 in ?? ()
> (gdb) bt
> Cannot access memory at address 0x810b9db
>
> Maybe I don't know enough about debugging. I know I should (somehow) be able
> to see the source lines at the point of the segfault. Could you guide me on
> that?
>
> Thanks,
>
> Br,
> Emre
>

Strange backtrace. Maybe there is heap corruption going on so that GDB
can't make sense out of it anymore.
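
Before assuming corruption, it may be worth ruling out a simpler cause: gdb can only resolve frames when it is given the binary that produced the core, and debug symbols need to be available for it. The following is a minimal sketch, assuming the real daemon is at /usr/sbin/libvirtd.real as in the wrapper above; the function name and the core file path are hypothetical.

```shell
#!/bin/sh
# backtrace_from_core is a hypothetical helper name for this example.
# $1: the real daemon binary (not the wrapper script)
# $2: the core file written by the crashed daemon
backtrace_from_core() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: backtrace_from_core BINARY COREFILE" >&2
        return 1
    fi
    # -batch makes gdb exit after running the commands given with -ex.
    gdb -batch -ex 'bt full' "$binary" "$core"
}

# Example call (the core file name is a guess, check /tmp for the actual one):
# backtrace_from_core /usr/sbin/libvirtd.real /tmp/core
```

If the frames still show up as ?? () after that, installing the distribution's debug symbols for libvirt and libc, if such packages exist, would be the next thing to try.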

I'll do some research about the PATH usage in libvirt now.

OK. I guess it's used to find the DHCP daemon, iptables, etc. Other service scripts seem to work happily without PATH, but I'll ask the developers to add it to the Python service environment to make sure libvirtd works fine.

Thanks again Matthias, Daniel! I'm a happy guy now :)

Emre Erenoglu