----- "Matthias Bolte" <matthias.bolte(a)googlemail.com> wrote:
2010/1/10 <pspreadborough(a)comcast.net>:
>
> ----- "Matthias Bolte" <matthias.bolte(a)googlemail.com> wrote:
>
>> 2010/1/10 <pspreadborough(a)comcast.net>:
>> >
>> > Hello,
>> >
>> > I have been trying to use the domain event C code example but
>> > unfortunately it segfaults (signal 11) every time I run it:
>> >
>> > [root@Spring events-c]# ./event-test
>> > myEventAddHandleFunc:221: Add handle 5 1 0xf081a0 0x8f727f8
>> > myEventAddHandleFunc:221: Add handle 7 1 0xf09990 0x8f727f8
>> > myEventAddHandleFunc:221: Add handle 8 1 0xed7940 0x8f727f8
>> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
>> > myEventAddHandleFunc:221: Add handle 11 1 0xed7940 0x8f727f8
>> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
>> > main:322 :: Registering domain event cbs
>> > Segmentation fault (core dumped)
>> >
>> > Core was generated by
>> > `/root/libvirt-0.7.5/examples/domain-events/events-c/.libs/lt-event-test'.
>> > Program terminated with signal 11, Segmentation fault.
>> > [New process 21806]
>> > [New process 21822]
>> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at
>> > remote/remote_driver.c:8720
>> > 8720 tempQueue.count = priv->domainEvents->count;
>> > (gdb) bt
>> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at
>> > remote/remote_driver.c:8720
>> > #1 0x080490d3 in main (argc=Cannot access memory at address 0x1
>> > ) at event-test.c:347
>> >
>> > The stack looks corrupted, so I'm doubtful that this trace is of much
>> > value. I have built and installed libvirt-0.7.5, and it and its tools
>> > seem to be operating correctly.
>>
>> I tried the event-test with libvirt-0.7.5 and QEMU/Xen and both are
>> working as expected. No segfaults.
>>
>> Could you inspect the values of priv and priv->domainEvents in GDB
>> using 'p priv' to see if they are NULL and try to dereference them in
>> GDB using 'p *priv' to see if they point to valid memory areas?
>>
>> Yes, the backtrace looks corrupted. If there is stack/heap corruption
>> involved, valgrind may reveal it, so try to run the event-test in
>> valgrind and see if that gives any hints.
>>
>> You can also try the GIT version of libvirt. There was an invalid free
>> call (resulting in heap corruption) in the node device code fixed
>> after the 0.7.5 release. But that should have no effect on the
>> event-test.
>>
>> Matthias
>
> Matthias,
>
> priv->domainEvents is NULL, here's the gdb output:
This explains the segfault. The next question is, why is it NULL?
> (gdb) p *priv
> $1 = {lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 21806, __kind = 0, __nusers = 1, {__spins = 0, __list = {
> __next = 0x0}}}, __size = "\001\000\000\000\000\000\000\000.U\000\000\000\000\000\000\001\000\000\000\000\000\000",
> __align = 1}}, sock = 150469168, watch = 3, pid = 4, uses_tls = 1982791681, is_secure = 1815048801, session = 0x782f6269,
Seeing uses_tls and is_secure being large numbers, and knowing that
both are used as boolean values in the code and should have values of
0 or 1, makes me think that priv points to already freed memory here.
> type = 0x2f646e65 <Address 0x2f646e65 out of bounds>, counter = 1684956536, localUses = 1668248365,
> hostname = 0x74656b <Address 0x74656b out of bounds>, debugLog = 0x0, saslconn = 0x0, saslDecoded = 0x0, saslDecodedLength = 0,
type and hostname are char pointers, but they seem to point into
nowhere. That confirms that this is either freed memory or that priv
itself got overwritten due to heap corruption.
> saslDecodedOffset = 0, saslEncoded = 0x0, saslEncodedLength = 0, saslEncodedOffset = 0,
> buffer = '\0' <repeats 68 times>,
"n\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\bQ�\b����\030\034�\b\000\000\000\000�\033�\b\001\000\000\000\002\000\000\000�\033�\b\000\000\000\000\025|�\000\a",
'\0' <repeats 11 times>, "X\000�\b", '\0' <repeats 12
times>,
"\021\000\000\000\002\000\000\000P��\b\000\000\000\000\021", '\0'
<repeats 15 times>,
"\021\000\000\0008\036�\b\f\000\000\000\020\000\000\000\021\000\000\000\a\000\000\000\b\000\000\000\t\000\000\000\021\000\000\000\002\000\000\000\230\034�\b\000\000\000\000A\000\000\000\003\000\000\000\001\000\000\000\001\000"...,
> bufferLength = 0,
> bufferOffset = 0, callbackList = 0x0, domainEvents = 0x0, eventFlushTimer = 0, domainEventDispatching = 1, wakeupSendFD = 0,
> wakeupReadFD = 0, waitDispatch = 0x0, streams = 0x0}
>
> I'll try a run with valgrind and post the results.
>
> Pete
>
Could you test the Python version of this example found in
examples/domain-events/events-python/event-test.py? Does this work?
Otherwise let's see if valgrind gives any hints.
Matthias
During initialization I notice that myEventAddHandleFunc() is called
multiple times, each time with a different fd value (5, 7, 8 and 11).
The way the code is written, only the last fd value is recorded and then
used in the poll() call. Is this intended? If so, why are the preceding
fds ignored?
myEventAddHandleFunc:223: Add handle 5 1 0xf13480 0x97b97d8
myEventAddHandleFunc:223: Add handle 7 1 0xf14c70 0x97b97d8
Allocating domainEvents:0x97c6b10
myEventAddHandleFunc:223: Add handle 8 1 0xee2940 0x97b97d8
myEventAddTimeoutFunc:260: Adding Timeout -1 0xee9fc0 0x97b97d8
Allocating domainEvents:0x97c5780
myEventAddHandleFunc:223: Add handle 11 1 0xee2940 0x97b97d8
myEventAddTimeoutFunc:260: Adding Timeout -1 0xee9fc0 0x97b97d8
main:333 :: Registering domain event cbs
Regards,
Pete