Richard W.M. Jones wrote:
Program terminated with signal 11, Segmentation fault.
#0 0x0000003d8b472a1b in free () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003d8b472a1b in free () from /lib64/libc.so.6
#1 0x00002aaaaaae8dd7 in virResetError (err=0x33535c8) at virterror.c:111
#2 0x00002aaaaaae8fce in __virRaiseError (conn=0x33535a0, dom=0x0,
net=0x0,
domain=0, code=6, level=VIR_ERR_ERROR,
str1=0x2aaaaab0c678 "invalid connection pointer in %s",
str2=0x2aaaaab08560 "virConnectNumOfDomains", str3=0x0, int1=0, int2=0,
msg=0x2aaaaab0c678 "invalid connection pointer in %s") at
virterror.c:358
#3 0x00002aaaaaacfa8e in virLibConnError (conn=0x33535a0,
error=VIR_ERR_INVALID_CONN, info=0x2aaaaab08560
"virConnectNumOfDomains")
at libvirt.c:127
#4 0x00002aaaaaad1052 in virConnectNumOfDomains (conn=0x736e6961)
at libvirt.c:758
#5 0x000000000043fa4e in ?? ()
A preliminary look at the code seems to indicate a fault in this logic:
int
virConnectNumOfDomains(virConnectPtr conn)
{
DEBUG("conn=%p", conn);
if (!VIR_IS_CONNECT(conn)) {
virLibConnError(conn, VIR_ERR_INVALID_CONN, __FUNCTION__);
The VIR_IS_CONNECT macro is defined as:
#define VIR_CONNECT_MAGIC 0x4F23DEAD
#define VIR_IS_CONNECT(obj) ((obj) && (obj)->magic==VIR_CONNECT_MAGIC)
Obviously if VIR_IS_CONNECT fails then "conn" should not be used
further, so calling virLibConnError (conn, ...) is wrong. Personally I
think when we detect memory corruption in a C program we should just
call abort().
I'll see if I can come up with a patch to fix this later ... at the
moment I'm more interested in why my program is passing an invalid
connection pointer in the first place :-(
So a couple of things to say about this.
Firstly attached is a patch which fixes the libvirt problem, by passing
NULL when we find that a connection is invalid, rather than passing the
known invalid connection to virLibConnError. (It also fixes the same
problem with virDomainPtr and virNetworkPtr objects).
Secondly I worked out why my program was passing an invalid connection
pointer, and it may act as a salutary lesson for anyone writing libvirt
bindings for a language which has real garbage collection (hello, Java).
In the normal case I wrap objects like virConnectPtr, virDomainPtr in an
OCaml object which has an attached finaliser. Thus the OCaml user
doesn't need to worry about manually freeing these objects - the garbage
collector will collect them when they become unreachable and the
finaliser will call virConnectClose, virDomainFree, etc. as appropriate.
The problem I was hitting was when a libvirt function raised an error
(ie. virterror), which contains a virConnectPtr, virDomainPtr and
virNetworkPtr. I was using my normal wrapping functions to attach
finalisers, but that was the wrong thing to do in this case because
virterror does not increment the reference counts of these objects.
When my OCaml virterror exception was garbage collected, that would
cause virConnectClose in this case to be called, but that closed the
connection that I was still using, resulting in the main program
continuing to use a closed virConnectPtr.
Note that the same double-free or invalid pointer use could also happen
with the domain pointer or the network pointer in the virterror. It was
just my good luck that it happened to be the connection pointer that got
closed first which made the error rather more obvious and easier to find.
The partial solution on the OCaml side is to use a special wrapper
function for the virterror objects which doesn't attach a finaliser.
This is not a complete solution because an OCaml program might save the
exception and use it later, after the virConnectPtr, virDomainPtr or
virNetworkPtr that it contains have been freed elsewhere. So the OCaml
program becomes unsafe, and we have to tell users not to save these
exceptions. Unfortunately there is no good solution on the C side
either: if we change virterror so it increments the reference counts
then any C program which uses virterror will have to be changed.
(The above is fixed in ocaml-libvirt bindings >= 0.3.2.6).
Rich.
--
Emerging Technologies, Red Hat -
http://et.redhat.com/~rjones/
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod
Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in
England and Wales under Company Registration No. 03798903