[libvirt] Libvirt segfault in qemuMonitorSend() with multi-threaded API use

4 Mar 2010

I have a multi-threaded Python program that shares a single libvirt
connection object among several threads (one thread per active domain on
the system, plus a management thread).  On a heavily loaded host with 8
running domains, I consistently get a libvirtd segfault in the qemu
monitor handling code.  This happens with both libvirt-0.7.6 and current
git.

Mar  4 12:23:13 bc1cn7-mgmt kernel: [ 3947.836151] libvirtd[7716]:
segfault at 24 ip 000000000045de5c sp 00007fe5aa7d2b20 error 4 in
libvirtd[400000+b3000]

Using addr2line, this translates to: libvirt/src/qemu/qemu_monitor.c:698

Which is in qemuMonitorSend():

--> while (!mon->msg->finished) { 
        if (virCondWait(&mon->notify, &mon->lock) < 0)
            goto cleanup;
    }

It seems that mon->msg is being reset to NULL while this loop is still
running (the fault address of 0x24 is consistent with reading a field
through a NULL pointer).  I suspect that is because qemuMonitorSend() is
not reentrant and multiple threads in my program are racing here.  I would
guess the 'mon->msg = NULL;' on line 707, executed by one thread, produces
the NULL that trips up the other racer when it wakes from virCondWait().
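
To make the suspected race concrete, here is a minimal sketch of one way a
send routine could serialize callers on a single in-flight message slot.
It is not libvirt's code: the struct layouts and the names monitor_init,
monitor_send and monitor_complete are hypothetical stand-ins, and plain
pthreads calls replace the vir* wrappers.  The two ideas it shows are
waiting until the slot is free before claiming it, and testing the
caller's own msg pointer rather than re-reading mon->msg after each wakeup.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the monitor types; not libvirt's real structs. */
struct message {
    int finished;              /* set by the I/O side once the reply is in */
};

struct monitor {
    pthread_mutex_t lock;
    pthread_cond_t notify;
    struct message *msg;       /* the single in-flight message, or NULL */
};

void monitor_init(struct monitor *mon)
{
    pthread_mutex_init(&mon->lock, NULL);
    pthread_cond_init(&mon->notify, NULL);
    mon->msg = NULL;
}

/* Send one message, blocking until it has been marked finished.
 * Return values of the pthread calls are ignored to keep the sketch short. */
int monitor_send(struct monitor *mon, struct message *msg)
{
    pthread_mutex_lock(&mon->lock);

    /* Wait until no other sender owns the slot.  pthread_cond_wait()
     * drops the lock while sleeping, so this check must be a loop. */
    while (mon->msg != NULL)
        pthread_cond_wait(&mon->notify, &mon->lock);

    mon->msg = msg;

    /* Test our own pointer, never mon->msg, so a change to the slot
     * made by another thread cannot be dereferenced here. */
    while (!msg->finished)
        pthread_cond_wait(&mon->notify, &mon->lock);

    assert(mon->msg == msg);   /* nobody may have taken the slot meanwhile */
    mon->msg = NULL;

    /* Wake every waiter: senders queued for the slot share this condition. */
    pthread_cond_broadcast(&mon->notify);
    pthread_mutex_unlock(&mon->lock);
    return 0;
}

/* I/O side: mark the in-flight message finished.
 * Returns 1 if a message was completed, 0 if there was nothing to do. */
int monitor_complete(struct monitor *mon)
{
    int done = 0;

    pthread_mutex_lock(&mon->lock);
    if (mon->msg != NULL && !mon->msg->finished) {
        mon->msg->finished = 1;
        done = 1;
        pthread_cond_broadcast(&mon->notify);
    }
    pthread_mutex_unlock(&mon->lock);
    return done;
}
```

With a loop like this, a second thread entering monitor_send() while the
first is asleep in its wait simply queues for the slot instead of
overwriting it.  Whether the actual problem is best fixed inside the
monitor code or by locking at a higher level in the driver is a separate
question that this sketch does not answer.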

I presume the Monitor interface has some locking protection around it to
ensure that only one thread can use it at a time?

Is there an easy way to fix this?  I am not familiar with the measures
employed to make libvirt thread-safe.  Thanks!

-- 
Thanks,
Adam

Adam Litke
