On Thu, Jun 10, 2010 at 9:07 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Thu, Jun 10, 2010 at 08:57:15PM +0300, Emre Erenoglu wrote:
> On Thu, Jun 10, 2010 at 5:02 PM, Matthias Bolte <
> matthias.bolte@googlemail.com> wrote:
>
> > 2010/6/10 Emre Erenoglu <erenoglu@gmail.com>:
> > The initscript explicitly starts the one in /usr/sbin. If you just
> > start libvirtd manually without an absolute path then you'll start the
> > one in /usr/local/sbin. This might explain why you cannot reproduce
> > the segfault manually, but it doesn't explain why the segfault
> > happens.
> >
>
> There's no other installation of libvirt in the system. I can also reproduce
> the same thing in all Pardus machines, so I believe it's something in
> libvirt not doing well with something else in our service init mechanisms.

I guess I'd put money on some environment variable causing trouble.
It could be a *missing* environment variable that we expect to always
be set, or something like that.

Hi Daniel, thanks for your message. Yes, I wrote a small script as you suggested and found that libvirtd is started with this environment:

DBUS_STARTER_ADDRESS=unix:path=/var/run/dbus/system_bus_socket,guid=6c515f612162b05d554b59cd4c112d43
KRB5_KTNAME=/etc/libvirt/krb5.tab
PWD=/
DBUS_STARTER_BUS_TYPE=system
SHLVL=1
_=/usr/bin/env
 
This looks very sparse compared to the standard root environment that I pasted in my earlier message.
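
To check whether this stripped-down environment is what triggers the crash, it should be possible to recreate it by hand. A rough sketch, assuming the variables listed above are the only ones that matter; run_with_init_env is just a name made up for this example, and PWD, SHLVL and _ are left out because the shell sets them itself:

```shell
#!/bin/sh
# Hypothetical helper: run a command with only the variables that were
# captured from the init-script environment above.
# "env -i" starts from an empty environment, so nothing from the
# interactive root shell (HOME, PATH, LANG, ...) leaks into the command.
run_with_init_env() {
    (
        # The captured environment had PWD=/, so start from the root directory.
        cd / &&
        env -i \
            DBUS_STARTER_ADDRESS='unix:path=/var/run/dbus/system_bus_socket,guid=6c515f612162b05d554b59cd4c112d43' \
            DBUS_STARTER_BUS_TYPE=system \
            KRB5_KTNAME=/etc/libvirt/krb5.tab \
            "$@"
    )
}
```

Since PATH is empty inside the helper, the command is best given with an absolute path, e.g. run_with_init_env /usr/sbin/libvirtd. If the segfault shows up this way too, the same call can be repeated with variables added back one at a time to find the one that makes the difference.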


> > >> Could you provide a GDB backtrace of the segfault? The syslog entry only
> > >> says that it crashed in libc, that's not enough information to
> > >> debug the segfault.
> > >
> > > Unfortunately, I can't find a related core file in the system. In fact,
> > > no core file is generated. I'll also try to figure this out and come
> > > back to the list.
> > >
> >
> > Getting a backtrace would be simpler if you could reproduce the
> > problem manually. In that case you could just start libvirtd in GDB.
> > But getting a backtrace from a coredump will work too.
> >
> I can't reproduce the segfault when I run it manually. It only happens when
> it's run from this python script. I will try to initialize gdb inside the
> script and connect remotely to the gdb session, but it's getting a bit over
> my debugging capabilities :)  For example, I don't know how to assign the
> symbols and source code etc from the package build directory to gdb.

Try creating a wrapper script, eg

  mv /usr/sbin/libvirtd /usr/sbin/libvirtd.real
  cat > /usr/sbin/libvirtd <<'EOF'
  #!/bin/sh
  cd /tmp
  ulimit -c unlimited
  exec /usr/sbin/libvirtd.real "$@"
  EOF
  chmod +x /usr/sbin/libvirtd

That will hopefully give you a core dump in /tmp that you can get a
stack trace from.
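
Once a core file exists, something like the following might help pull a backtrace out of it. This is only a sketch: it assumes gdb is installed and that the kernel writes the dump as core or core.<pid> in the working directory (the actual name depends on the system's core pattern). newest_core and core_backtrace are names invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified file named
# "core" or "core.*" in the given directory (default: /tmp).
# Returns non-zero if no such file exists.
newest_core() {
    dir=${1:-/tmp}
    newest=
    for f in "$dir"/core "$dir"/core.*; do
        [ -f "$f" ] || continue              # skip unmatched globs and non-files
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Hypothetical helper: print backtraces for all threads from a core file.
# $1 must be the real executable (here libvirtd.real), not the wrapper
# script, otherwise gdb cannot map addresses in the core to the program.
core_backtrace() {
    gdb --batch -ex 'bt' -ex 'thread apply all bt' "$1" "$2"
}
```

It would be used as core_backtrace /usr/sbin/libvirtd.real "$(newest_core /tmp)". Function names and line numbers only appear if the binary was built with debug information and has not been stripped.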

Yes, I got the core file with the script. However, when I open the core file with gdb and use the bt command to get the backtrace, this is all it tells me:

Core was generated by `/usr/sbin/libvirtd --daemon'.
Program terminated with signal 11, Segmentation fault.
#0  0xb73ed8f3 in ?? ()
(gdb) bt
Cannot access memory at address 0x810b9db

I probably don't know enough about debugging: I understand that I should somehow be seeing the source lines at the point of the segfault. Could you guide me on that?

Thanks,

Br,
Emre