2010/1/12 <pspreadborough(a)comcast.net>:
>
> ----- pspreadborough(a)comcast.net wrote:
>
>> ----- "Matthias Bolte" <matthias.bolte(a)googlemail.com> wrote:
>>
>> > 2010/1/11 <pspreadborough(a)comcast.net>:
>> > >
>> > > ----- "Matthias Bolte" <matthias.bolte(a)googlemail.com> wrote:
>> > >
>> > >> 2010/1/10 <pspreadborough(a)comcast.net>:
>> > >> >
>> > >> > ----- "Matthias Bolte" <matthias.bolte(a)googlemail.com> wrote:
>> > >> >
>> > >> >> 2010/1/10 <pspreadborough(a)comcast.net>:
>> > >> >> >
>> > >> >> > Hello,
>> > >> >> >
>> > >> >> > I have been trying to use the domain event C code example, but
>> > >> >> > unfortunately it segfaults (signal 11) every time I run it:
>> > >> >> >
>> > >> >> > [root@Spring events-c]# ./event-test
>> > >> >> > myEventAddHandleFunc:221: Add handle 5 1 0xf081a0 0x8f727f8
>> > >> >> > myEventAddHandleFunc:221: Add handle 7 1 0xf09990 0x8f727f8
>> > >> >> > myEventAddHandleFunc:221: Add handle 8 1 0xed7940 0x8f727f8
>> > >> >> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
>> > >> >> > myEventAddHandleFunc:221: Add handle 11 1 0xed7940 0x8f727f8
>> > >> >> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
>> > >> >> > main:322 :: Registering domain event cbs
>> > >> >> > Segmentation fault (core dumped)
>> > >> >> >
>> > >> >> > Core was generated by `/root/libvirt-0.7.5/examples/domain-events/events-c/.libs/lt-event-test'.
>> > >> >> > Program terminated with signal 11, Segmentation fault.
>> > >> >> > [New process 21806]
>> > >> >> > [New process 21822]
>> > >> >> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
>> > >> >> > 8720 tempQueue.count = priv->domainEvents->count;
>> > >> >> > (gdb) bt
>> > >> >> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
>> > >> >> > #1 0x080490d3 in main (argc=Cannot access memory at address 0x1
>> > >> >> > ) at event-test.c:347
>> > >> >> >
>> > >> >> > The stack looks corrupted, so I'm doubtful that this trace is of
>> > >> >> > much value. I have built and installed libvirt-0.7.5, and it and
>> > >> >> > its tools seem to be operating correctly.
>> > >> >>
>> > >> >> I tried the event-test with libvirt-0.7.5 and QEMU/Xen, and both are
>> > >> >> working as expected. No segfaults.
>> > >> >>
>> > >> >> Could you inspect the values of priv and priv->domainEvents in GDB
>> > >> >> using 'p priv' to see if they are NULL, and try to dereference them in
>> > >> >> GDB using 'p *priv' to see if they point to valid memory areas?
>> > >> >>
>> > >> >> Yes, the backtrace looks corrupted. If there is stack/heap corruption
>> > >> >> involved, valgrind may reveal it, so try to run the event-test in
>> > >> >> valgrind and see if that gives any hints.
>> > >> >>
>> > >> >> You can also try the GIT version of libvirt. There was an invalid free
>> > >> >> call (resulting in heap corruption) in the node device code, fixed
>> > >> >> after the 0.7.5 release. But that should have no effect on the
>> > >> >> event-test.
>> > >> >>
>> > >> >> Matthias
>> > >> >
>> > >> > Matthias,
>> > >> >
>> > >> > priv->domainEvents is NULL, here's the gdb output:
>> > >>
>> > >> This explains the segfault. The next question is, why is it NULL?
>> > >>
>> > >> > (gdb) p *priv
>> > >> > $1 = {lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 21806, __kind = 0, __nusers = 1, {__spins = 0, __list = {
>> > >> > __next = 0x0}}}, __size = "\001\000\000\000\000\000\000\000.U\000\000\000\000\000\000\001\000\000\000\000\000\000",
>> > >> > __align = 1}}, sock = 150469168, watch = 3, pid = 4, uses_tls = 1982791681, is_secure = 1815048801, session = 0x782f6269,
>> > >>
>> > >> Seeing uses_tls and is_secure being large numbers, and knowing that
>> > >> both are used as boolean values in the code and should have values of
>> > >> 0 or 1, makes me think that priv points to already freed memory here.
>> > >>
>> > >> > type = 0x2f646e65 <Address 0x2f646e65 out of bounds>, counter = 1684956536, localUses = 1668248365,
>> > >> > hostname = 0x74656b <Address 0x74656b out of bounds>, debugLog = 0x0, saslconn = 0x0, saslDecoded = 0x0, saslDecodedLength = 0,
>> > >>
>> > >> type and hostname are char pointers, but they seem to point into
>> > >> nowhere, which confirms that this is either freed memory or that priv
>> > >> itself got overwritten due to heap corruption.
>> > >>
>> > >> > saslDecodedOffset = 0, saslEncoded = 0x0, saslEncodedLength = 0, saslEncodedOffset = 0,
>> > >> > buffer = '\0' <repeats 68 times>,
>> > >> > "n\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\bQ�\b����\030\034�\b\000\000\000\000�\033�\b\001\000\000\000\002\000\000\000�\033�\b\000\000\000\000\025|�\000\a",
>> > >> > '\0' <repeats 11 times>, "X\000�\b", '\0' <repeats 12 times>,
>> > >> > "\021\000\000\000\002\000\000\000P��\b\000\000\000\000\021", '\0' <repeats 15 times>,
>> > >> > "\021\000\000\0008\036�\b\f\000\000\000\020\000\000\000\021\000\000\000\a\000\000\000\b\000\000\000\t\000\000\000\021\000\000\000\002\000\000\000\230\034�\b\000\000\000\000A\000\000\000\003\000\000\000\001\000\000\000\001\000"...,
>> > >> > bufferLength = 0, bufferOffset = 0, callbackList = 0x0, domainEvents = 0x0, eventFlushTimer = 0, domainEventDispatching = 1, wakeupSendFD = 0,
>> > >> > wakeupReadFD = 0, waitDispatch = 0x0, streams = 0x0}
>> > >> >
>> > >> > I'll try a run with valgrind and post the results.
>> > >> >
>> > >> > Pete
>> > >> >
>> > >>
>> > >> Could you test the Python version of this example found in
>> > >> examples/domain-events/events-python/event-test.py? Does this work?
>> > >>
>> > >> Otherwise let's see if valgrind gives any hints.
>> > >>
>> > >> Matthias
>> > >
>> > >
>> > > During initialization I notice that the myEventAddHandleFunc() method is
>> > > called multiple times, each time with a different fd value (5, 7, 8 and 11).
>> > > The way the code is written, only the last fd value is recorded and then
>> > > used in the poll() call. Is this intended? If so, why are the preceding
>> > > fds ignored?
>> > >
>> > > myEventAddHandleFunc:223: Add handle 5 1 0xf13480 0x97b97d8
>> > > myEventAddHandleFunc:223: Add handle 7 1 0xf14c70 0x97b97d8
>> > > Allocating domainEvents:0x97c6b10
>> > > myEventAddHandleFunc:223: Add handle 8 1 0xee2940 0x97b97d8
>> > > myEventAddTimeoutFunc:260: Adding Timeout -1 0xee9fc0 0x97b97d8
>> > > Allocating domainEvents:0x97c5780
>> > > myEventAddHandleFunc:223: Add handle 11 1 0xee2940 0x97b97d8
>> > > myEventAddTimeoutFunc:260: Adding Timeout -1 0xee9fc0 0x97b97d8
>> > > main:333 :: Registering domain event cbs
>> > >
>> > >
>> > > Regards,
>> > >
>> > > Pete
>> > >
>> >
>> > That's strange. I can't reproduce this either. I always get exactly
>> > one call to myEventAddHandleFunc:
>> >
>> > myEventAddHandleFunc:221: Add handle 3 1 0x7ff116f68b00 0x1d97f00
>> > myEventAddTimeoutFunc:251: Adding Timeout -1 0x7ff116f68750 0x1d97f00
>> > main:322 :: Registering domain event cbs
>> > myEventUpdateHandleFunc:232: Updated Handle 0 0
>> > myEventUpdateHandleFunc:232: Updated Handle 0 1
>> >
>> > You could try to run the event-test in GDB and set a breakpoint on
>> > myEventAddHandleFunc to see where 4 additional calls to
>> > myEventAddHandleFunc come from.
>> >
>> > In my case I get this backtrace when setting a breakpoint on
>> > myEventAddHandleFunc:
>> >
>> > (gdb) bt
>> > #0 myEventAddHandleFunc (fd=6, event=1, cb=0x7f1be6ec3b00 <remoteDomainEventFired>, opaque=0xc68f00, ff=0) at event-test.c:220
>> > #1 0x00007f1be6ecbaaf in doRemoteOpen (conn=0xc68f00, priv=0x7f1be7350010, auth=0x0, flags=0) at remote/remote_driver.c:893
>> > #2 0x00007f1be6ece053 in remoteOpen (conn=0xc68f00, auth=0x0, flags=13007744) at remote/remote_driver.c:1076
>> > #3 0x00007f1be6eb155d in do_open (name=0x7fff48400968 "qemu:///system", auth=0x0, flags=0) at libvirt.c:1117
>> > #4 0x0000000000400eb3 in main (argc=<value optimized out>, argv=<value optimized out>) at event-test.c:313
>> >
>> > Matthias
>>
>>
>> Matthias
>>
>> Here are the four stack traces, one for each time
>> myEventAddHandleFunc() was called.
>>
>> #0 myEventAddHandleFunc (fd=8, event=1, cb=0x85b480 <xenStoreWatchEvent>, opaque=0x824a7d8, ff=0) at event-test.c:223
>> #1 0x007ccf55 in virEventAddHandle (fd=8, events=1, cb=0x85b480 <xenStoreWatchEvent>, opaque=0x824a7d8, ff=0) at util/event.c:45
>> #2 0x0085b291 in xenStoreOpen (conn=0x824a7d8, auth=0x0, flags=<value optimized out>) at xen/xs_internal.c:339
>> #3 0x00844287 in xenUnifiedOpen (conn=0x824a7d8, auth=0x0, flags=0) at xen/xen_driver.c:352
>> #4 0x00811d05 in do_open (name=0xbf8c49d4 "xen:///", auth=0x0, flags=0) at libvirt.c:1117
>> #5 0x08048e92 in main (argc=Cannot access memory at address 0x2
>> ) at event-test.c:325
>> (gdb) c
>> Continuing.
>> (gdb) b
>> Note: breakpoint 1 also set at pc 0x8048bc9.
>> Breakpoint 2 at 0x8048bc9: file event-test.c, line 223.
>> (gdb) bt
>> #0 myEventAddHandleFunc (fd=10, event=1, cb=0x85cc70 <xenInotifyEvent>, opaque=0x824a7d8, ff=0) at event-test.c:223
>> #1 0x007ccf55 in virEventAddHandle (fd=10, events=1, cb=0x85cc70 <xenInotifyEvent>, opaque=0x824a7d8, ff=0) at util/event.c:45
>> #2 0x0085c827 in xenInotifyOpen (conn=0x824a7d8, auth=0x0, flags=0) at xen/xen_inotify.c:460
>> #3 0x008444f1 in xenUnifiedOpen (conn=0x824a7d8, auth=0x0, flags=0) at xen/xen_driver.c:391
>> #4 0x00811d05 in do_open (name=0xbf8c49d4 "xen:///", auth=0x0, flags=0) at libvirt.c:1117
>> #5 0x08048e92 in main (argc=Cannot access memory at address 0x1
>> ) at event-test.c:325
>> (gdb) c
>> Continuing.
>> (gdb) bt
>> #0 myEventAddHandleFunc (fd=11, event=1, cb=0x82a940 <remoteDomainEventFired>, opaque=0x824a7d8, ff=0) at event-test.c:223
>> #1 0x007ccf55 in virEventAddHandle (fd=11, events=1, cb=0x82a940 <remoteDomainEventFired>, opaque=0x824a7d8, ff=0) at util/event.c:45
>> #2 0x0082c478 in doRemoteOpen (conn=0x824a7d8, priv=0xb7534008, auth=0x0, flags=0) at remote/remote_driver.c:894
>> #3 0x00830448 in remoteOpenSecondaryDriver (conn=0x824a7d8, auth=0x0, flags=0, priv=0xbf8c2668) at remote/remote_driver.c:1006
>> #4 0x0083082a in remoteNetworkOpen (conn=0x824a7d8, auth=0x0, flags=0) at remote/remote_driver.c:3549
>> #5 0x00811e2f in do_open (name=0xbf8c49d4 "xen:///", auth=0x0, flags=0) at libvirt.c:1137
>> #6 0x08048e92 in main (argc=1, argv=0xbf8c28b4) at event-test.c:325
>> (gdb) c
>> Continuing.
>> (gdb) bt
>> #0 myEventAddHandleFunc (fd=14, event=1, cb=0x82a940 <remoteDomainEventFired>, opaque=0x824a7d8, ff=0) at event-test.c:223
>> #1 0x007ccf55 in virEventAddHandle (fd=14, events=1, cb=0x82a940 <remoteDomainEventFired>, opaque=0x824a7d8, ff=0) at util/event.c:45
>> #2 0x0082c478 in doRemoteOpen (conn=0x824a7d8, priv=0x8258ab0, auth=0x0, flags=0) at remote/remote_driver.c:894
>> #3 0x00830448 in remoteOpenSecondaryDriver (conn=0x824a7d8, auth=0x0, flags=0, priv=0xbf8c2668) at remote/remote_driver.c:1006
>> #4 0x0083078a in remoteInterfaceOpen (conn=0x824a7d8, auth=0x0, flags=0) at remote/remote_driver.c:4104
>> #5 0x00811f4e in do_open (name=0xbf8c49d4 "xen:///", auth=0x0, flags=0) at libvirt.c:1156
>> #6 0x08048e92 in main (argc=1, argv=0xbf8c28b4) at event-test.c:325
>>
>> Regards,
>>
>> Pete
>>
>>
>
> Matthias,
>
> Using the URI xen+unix:/// works! I'm curious to know why multiple
> myEventAddHandleFunc() calls occurred when using the URI xen:///.
> Are there any fields I could use to identify and ignore
> the spurious calls?
>
> I'd like to use the event monitoring for several remote Xen hosts.
> I assume I'll have to modify the code to connect remotely to each
> host and hope for just one myEventAddHandleFunc() call per host.
>
> Thanks for your assistance, it's much appreciated.
>
> Regards,
>
> Pete
>
>
Ah, you are using the event-test from within a dom0. If I do that, I can
reproduce the segfault. And I also understand why the Python version works.

The C version is basically built to handle a single call to
myEventAddHandleFunc. This works with xen+unix:/// because then the
remote driver is involved and only one event handle is added by the
remote driver. With xen:/// several Xen subdrivers register event
handles, and myEventAddHandleFunc will just overwrite the values stored
from the previously added event handle. For some reason this results
in the segfault you see. The Python version works because it handles
multiple added handles properly.

I tried to understand why this triggers a segfault, but no success yet.

Matthias

Matthias,

Yes, I'm currently working within Dom0, although my project will
ultimately work remotely. Thank you for your assistance.

Regards,

Pete
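
For reference, the explanation above (the C example keeps only the most
recently added handle, so earlier registrations are overwritten) suggests
keeping every registration in a small table and polling all of them. The
following is only a rough sketch of that idea, not the code from
event-test.c and not libvirt's API: the names handle_add, handle_remove,
handle_poll_once and count_cb, the callback typedef, and the fixed table
size are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_HANDLES 16  /* arbitrary limit for this sketch */

/* Hypothetical callback shape, loosely modelled on the (fd, event, opaque)
 * arguments visible in the backtraces; not libvirt's real typedef. */
typedef void (*handle_cb)(int fd, int events, void *opaque);

struct handle {
    int in_use;
    int fd;
    int events;      /* poll() event mask, e.g. POLLIN */
    handle_cb cb;
    void *opaque;
};

static struct handle handles[MAX_HANDLES];

/* Store the registration in a free slot instead of overwriting one global.
 * Returns the slot index (used as the watch id), or -1 if the table is full. */
static int handle_add(int fd, int events, handle_cb cb, void *opaque)
{
    assert(cb != NULL);
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (!handles[i].in_use) {
            handles[i] = (struct handle){1, fd, events, cb, opaque};
            return i;
        }
    }
    return -1;
}

/* Free a slot so that its fd is no longer polled. */
static void handle_remove(int watch)
{
    if (watch >= 0 && watch < MAX_HANDLES)
        handles[watch].in_use = 0;
}

/* Poll every registered fd once and dispatch each ready one to its own
 * callback. Returns the number of callbacks invoked, or -1 on poll() error. */
static int handle_poll_once(int timeout_ms)
{
    struct pollfd pfds[MAX_HANDLES];
    int slot[MAX_HANDLES];
    nfds_t n = 0;

    for (int i = 0; i < MAX_HANDLES; i++) {
        if (handles[i].in_use) {
            pfds[n].fd = handles[i].fd;
            pfds[n].events = (short)handles[i].events;
            pfds[n].revents = 0;
            slot[n++] = i;
        }
    }
    if (poll(pfds, n, timeout_ms) < 0)
        return -1;

    int fired = 0;
    for (nfds_t j = 0; j < n; j++) {
        struct handle *h = &handles[slot[j]];
        /* Re-check in_use: an earlier callback may have removed this slot. */
        if (pfds[j].revents && h->in_use) {
            h->cb(h->fd, pfds[j].revents, h->opaque);
            fired++;
        }
    }
    return fired;
}

/* Example callback: consume one byte and count the invocation. */
static void count_cb(int fd, int events, void *opaque)
{
    char c;
    (void)events;
    if (read(fd, &c, 1) == 1)
        ++*(int *)opaque;
}
```

With a table like this, each subdriver's registration keeps its own
callback and opaque pointer, so a later call to the add function cannot
clobber an earlier one. Whether this alone would avoid the segfault
discussed above is not established in this thread.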