Re: [libvirt] Segfault in libvirtd when run as a service

11 Jun 2010

      2010/6/10 Emre Erenoglu <erenoglu@gmail.com>:
...
On Thu, Jun 10, 2010 at 9:07 PM, Daniel P. Berrange <berrange@redhat.com>
wrote:
...
On Thu, Jun 10, 2010 at 08:57:15PM +0300, Emre Erenoglu wrote:
...
On Thu, Jun 10, 2010 at 5:02 PM, Matthias Bolte <
matthias.bolte@googlemail.com> wrote:
...
2010/6/10 Emre Erenoglu <erenoglu@gmail.com>:
The initscript explicitly starts the one in /usr/sbin. If you just
start libvirtd manually without an absolute path then you'll start the
one in /usr/local/sbin. This might explain why you cannot reproduce
the segfault manually, but it doesn't explain why the segfault
happens.
There's no other installation of libvirt in the system. I can also
reproduce the same thing in all Pardus machines, so I believe it's
something in libvirt not doing well with something else in our service
init mechanisms.
I guess I'd put money on some environment variable causing trouble.
It could be a *missing* environment variable that we expect to always
be set, or something like that.
Hi Daniel, thanks for your message. Yes, I wrote a small script as you
suggested and found that libvirtd was run with this environment:
DBUS_STARTER_ADDRESS=unix:path=/var/run/dbus/system_bus_socket,guid=6c515f612162b05d554b59cd4c112d43
KRB5_KTNAME=/etc/libvirt/krb5.tab
PWD=/
DBUS_STARTER_BUS_TYPE=system
SHLVL=1
_=/usr/bin/env
This looks very weak compared to the standard root environment that I pasted
in my earlier message.
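
For anyone repeating this check, a minimal sketch of such an environment dump follows. The helper name dump_env and the output file name are assumptions for illustration, not the script that was actually used here.

```shell
#!/bin/sh
# Hypothetical helper: write the environment this process inherited,
# sorted for easy diffing, to the file named by $1.
dump_env() {
    /usr/bin/env | sort > "$1"
}

# In a wrapper around the daemon one might call, for example:
#   dump_env /tmp/libvirtd-env.txt
#   exec /usr/sbin/libvirtd.real "$@"
```

Sorting the output makes it easier to diff against the environment of an interactive root shell.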
No PATH? I bet there is code in libvirt that assumes getenv("PATH")
will be != NULL.

Could you try adding PATH to the environment? It can be empty, that
doesn't matter. Just make sure it's there, so getenv("PATH") returns an
empty string instead of NULL.
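
A rough way to compare the two cases from a shell, assuming env lives at /usr/bin/env as in the dump above. The function names are made up for this sketch.

```shell
#!/bin/sh
# Run a command with a completely empty environment, so that
# getenv("PATH") in the child returns NULL.
run_without_path() {
    env -i "$@"
}

# Run a command with PATH present but empty, so that
# getenv("PATH") in the child returns "" instead of NULL.
run_with_empty_path() {
    env -i PATH= "$@"
}

# Show what the child process actually sees in each case.
run_without_path /usr/bin/env      # prints nothing
run_with_empty_path /usr/bin/env   # prints PATH=
```

If the missing PATH is the trigger, something like run_without_path /usr/sbin/libvirtd --daemon might reproduce the crash by hand, while the run_with_empty_path variant should not. The command has to be given as an absolute path, since there is no PATH left to search.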
...
Could you provide a GDB backtrace of the segfault? The syslog entry
only says that it crashed in libc, that's not enough information to
debug the segfault.
Unfortunately, I can't find a related core file in the system. In fact,
no core file is generated. I'll also try to figure this out and come
back to the list.
Getting a backtrace would be simpler if you could reproduce the
problem manually. In that case you could just start libvirtd in GDB.
But getting a backtrace from a coredump will work too.
I can't reproduce the segfault when I run it manually. It only happens
when it's run from this python script. I will try to initialize gdb
inside the script and connect remotely to the gdb session, but it's
getting a bit over my debugging capabilities :)  For example, I don't
know how to assign the symbols and source code etc from the package
build directory to gdb.
Try creating a wrapper script, eg
  mv /usr/sbin/libvirtd /usr/sbin/libvirtd.real
  cat > /usr/sbin/libvirtd <<'EOF'
  #!/bin/sh
  cd /tmp
  ulimit -c unlimited
  exec /usr/sbin/libvirtd.real "$@"
  EOF
  chmod +x /usr/sbin/libvirtd
That will hopefully give you a core dump in /tmp that you can get a
stack trace from.
Yes, I got the core file with the script. However, when I open the core file
with gdb, and use bt command to get the backtrace, the only thing it tells
me is this:
Core was generated by `/usr/sbin/libvirtd --daemon'.
Program terminated with signal 11, Segmentation fault.
#0  0xb73ed8f3 in ?? ()
(gdb) bt
Cannot access memory at address 0x810b9db
Maybe I don't know enough of debugging as I know I have to see the code
lines (somehow) at this segfault point. Could you guide me on that?
Thanks,
Br,
Emre
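
As a rough sketch of reading the core non-interactively: gdb needs the ELF binary that actually crashed alongside the core file. Assuming the wrapper from earlier in the thread is still installed, that binary would be /usr/sbin/libvirtd.real rather than the /usr/sbin/libvirtd script. The helper name and the core file name are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print backtraces from a core file in batch mode.
#   $1 = path to the ELF binary that crashed (not a wrapper script)
#   $2 = path to the core file
core_backtrace() {
    gdb -batch -ex 'bt full' -ex 'thread apply all bt' "$1" "$2"
}

# With the wrapper in place the binary would be libvirtd.real; the core
# file name under /tmp depends on the system's core pattern, e.g.:
#   core_backtrace /usr/sbin/libvirtd.real /tmp/core
```

Without debug symbols for libvirt and libc installed, frames may still show up as ?? even when the right binary is given.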
Strange backtrace. Maybe there is heap corruption going on so that GDB
can't make sense out of it anymore.

I'll do some research about the PATH usage in libvirt now.

Matthias