----- "Matthias Bolte" <matthias.bolte@googlemail.com> wrote:
> 2010/1/10 <pspreadborough@comcast.net>:
> >
> > ----- "Matthias Bolte" <matthias.bolte@googlemail.com> wrote:
> >
> >> 2010/1/10 <pspreadborough@comcast.net>:
> >> >
> >> > Hello,
> >> >
> >> > I have been trying to use the domain event C code example but
> >> > unfortunately it segfaults (signal 11) every time I run it:
> >> >
> >> > [root@Spring events-c]# ./event-test
> >> > myEventAddHandleFunc:221: Add handle 5 1 0xf081a0 0x8f727f8
> >> > myEventAddHandleFunc:221: Add handle 7 1 0xf09990 0x8f727f8
> >> > myEventAddHandleFunc:221: Add handle 8 1 0xed7940 0x8f727f8
> >> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
> >> > myEventAddHandleFunc:221: Add handle 11 1 0xed7940 0x8f727f8
> >> > myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
> >> > main:322 :: Registering domain event cbs
> >> > Segmentation fault (core dumped)
> >> >
> >> > Core was generated by
> >> > `/root/libvirt-0.7.5/examples/domain-events/events-c/.libs/lt-event-test'.
> >> > Program terminated with signal 11, Segmentation fault.
> >> > [New process 21806]
> >> > [New process 21822]
> >> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
> >> > 8720 tempQueue.count = priv->domainEvents->count;
> >> > (gdb) bt
> >> > #0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
> >> > #1 0x080490d3 in main (argc=Cannot access memory at address 0x1
> >> > ) at event-test.c:347
> >> >
> >> > The stack looks corrupted, so I'm doubtful that this trace is of much
> >> > value. I have built and installed libvirt-0.7.5, and it and its tools
> >> > seem to be operating correctly.
> >>
> >> I tried the event-test with libvirt-0.7.5 and QEMU/Xen and both are
> >> working as expected. No segfaults.
> >>
> >> Could you inspect the values of priv and priv->domainEvents in GDB
> >> using 'p priv' to see if they are NULL and try to dereference them in
> >> GDB using 'p *priv' to see if they point to valid memory areas?
> >>
> >> Yes the backtrace looks corrupted. If there is stack/heap corruption
> >> involved valgrind may reveal it, so try to run the event-test in
> >> valgrind and see if that gives any hints.
> >>
> >> You can also try the Git version of libvirt. There was an invalid free
> >> call (resulting in heap corruption) in the node device code fixed
> >> after the 0.7.5 release. But that should have no effect on the
> >> event-test.
> >>
> >> Matthias
> >
> > Matthias,
> >
> > priv->domainEvents is NULL, here's the gdb output:
>
> This explains the segfault. The next question is, why is it NULL?
>
> > (gdb) p *priv
> > $1 = {lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 21806, __kind = 0, __nusers = 1, {__spins = 0, __list = {
> > __next = 0x0}}}, __size = "\001\000\000\000\000\000\000\000.U\000\000\000\000\000\000\001\000\000\000\000\000\000",
> > __align = 1}}, sock = 150469168, watch = 3, pid = 4, uses_tls = 1982791681, is_secure = 1815048801, session = 0x782f6269,
>
> Seeing uses_tls and is_secure as large numbers, and knowing that
> both are used as boolean values in the code and should be 0 or 1,
> makes me think that priv points to already freed memory here.
>
> > type = 0x2f646e65 <Address 0x2f646e65 out of bounds>, counter = 1684956536, localUses = 1668248365,
> > hostname = 0x74656b <Address 0x74656b out of bounds>, debugLog = 0x0, saslconn = 0x0, saslDecoded = 0x0, saslDecodedLength = 0,
>
> type and hostname are char pointers, but they seem to point
> nowhere, which confirms that this is either freed memory or that priv
> itself got overwritten due to heap corruption.
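
The odd values in uses_tls through hostname may not be random. Below is a minimal sketch that reinterprets them as raw bytes. It assumes a 32-bit little-endian x86 process, so that ints and pointers are 4 bytes each, and it assumes the seven fields sit next to each other in memory in the order gdb prints them. The variable and function names are made up for illustration.

```python
import struct

# Field values copied from the gdb 'p *priv' output, in the order gdb printed them.
fields = [
    ("uses_tls", 1982791681),
    ("is_secure", 1815048801),
    ("session", 0x782F6269),
    ("type", 0x2F646E65),
    ("counter", 1684956536),
    ("localUses", 1668248365),
    ("hostname", 0x74656B),
]


def as_le_bytes(value):
    """Pack a 32-bit value as it would be laid out in little-endian memory."""
    return struct.pack("<I", value)


# Assumption: the fields are contiguous, so joining them reproduces the memory.
raw = b"".join(as_le_bytes(value) for _, value in fields)

for name, value in fields:
    print(f"{name:10s} 0x{value:08x} {as_le_bytes(value)!r}")
print(raw)
```

Under those assumptions the joined bytes are b'\x01\x00/var/lib/xend/xend-socket\x00'. That looks like a UNIX socket address (the leading 01 00 matches AF_UNIX, which is 1 on Linux) followed by a path. This might mean priv is pointing at some other structure rather than at freed memory, but that is only a guess from the dump.
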
>
> > saslDecodedOffset = 0, saslEncoded = 0x0, saslEncodedLength = 0, saslEncodedOffset = 0,
> > buffer = '\0' <repeats 68 times>, "n\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\bQ�\b����\030\034�\b\000\000\000\000�\033�\b\001\000\000\000\002\000\000\000�\033�\b\000\000\000\000\025|�\000\a", '\0' <repeats 11 times>, "X\000�\b", '\0' <repeats 12 times>, "\021\000\000\000\002\000\000\000P��\b\000\000\000\000\021", '\0' <repeats 15 times>, "\021\000\000\0008\036�\b\f\000\000\000\020\000\000\000\021\000\000\000\a\000\000\000\b\000\000\000\t\000\000\000\021\000\000\000\002\000\000\000\230\034�\b\000\000\000\000A\000\000\000\003\000\000\000\001\000\000\000\001\000"..., bufferLength = 0,
> > bufferOffset = 0, callbackList = 0x0, domainEvents = 0x0, eventFlushTimer = 0, domainEventDispatching = 1, wakeupSendFD = 0,
> > wakeupReadFD = 0, waitDispatch = 0x0, streams = 0x0}
> >
> > I'll try a run with valgrind and post the results.
> >
> > Pete
> >
>
> Could you test the Python version of this example found in
> examples/domain-events/events-python/event-test.py? Does this work?
>
> Otherwise let's see if valgrind gives any hints.
>
> Matthias
>
Matthias,
I have built the latest Git version and run it through valgrind. Here is the valgrind output:
[root@Spring .libs]# valgrind -v ./event-test xen:///
==19881== Memcheck, a memory error detector.
==19881== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==19881== Using LibVEX rev 1658, a library for dynamic binary translation.
==19881== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==19881== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==19881== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==19881==
--19881-- Command line
--19881-- ./event-test
--19881-- xen:///
--19881-- Startup, with flags:
--19881-- -v
--19881-- Contents of /proc/version:
--19881-- Linux version 2.6.18-164.10.1.el5xen (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Jan 7 21:14:48 EST 2010
--19881-- Arch and hwcaps: X86, x86-sse1-sse2
--19881-- Valgrind library directory: /usr/lib/valgrind
--19881-- Reading syms from /lib/ld-2.5.so (0x163000)
--19881-- Reading syms from /root/libvirt/libvirt/examples/domain-events/events-c/.libs/event-test (0x8048000)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/memcheck (0x38000000)
--19881-- object doesn't have a dynamic symbol table
--19881-- Reading suppressions file: /usr/lib/valgrind/default.supp
--19881-- REDIR: 0x178790 (index) redirected to 0x38027D0F (vgPlain_x86_linux_REDIR_FOR_index)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_core.so (0x4001000)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so (0x4003000)
==19881== WARNING: new redirection conflicts with existing -- ignoring it
--19881-- new: 0x00178790 (index ) R-> 0x04006080 index
--19881-- REDIR: 0x178930 (strlen) redirected to 0x4006250 (strlen)
--19881-- Reading syms from /usr/local/lib/libvirt.so.0.7.5 (0x4009000)
--19881-- Reading syms from /usr/lib/libxml2.so.2.6.26 (0x6254000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libz.so.1.2.3 (0xCD2000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libm-2.5.so (0x413B000)
--19881-- Reading syms from /usr/lib/libgnutls.so.13.0.6 (0x670D000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libgcrypt.so.11.5.2 (0x6590000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libsasl2.so.2.0.22 (0x6536000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libxenstore.so.3.0.0 (0x4162000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libpthread-2.5.so (0x416A000)
--19881-- Reading syms from /lib/libc-2.5.so (0x4183000)
--19881-- Reading syms from /lib/libdl-2.5.so (0xCA7000)
--19881-- Reading syms from /usr/lib/libgpg-error.so.0.3.0 (0xB28000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libresolv-2.5.so (0x26E000)
--19881-- Reading syms from /lib/libcrypt-2.5.so (0x64CE000)
--19881-- REDIR: 0x41F42D0 (memset) redirected to 0x4006540 (memset)
--19881-- REDIR: 0x41F47C0 (memcpy) redirected to 0x4006C20 (memcpy)
--19881-- REDIR: 0x41F3430 (rindex) redirected to 0x4005F60 (rindex)
--19881-- REDIR: 0x41F3090 (strlen) redirected to 0x4006230 (strlen)
--19881-- REDIR: 0x41EED20 (malloc) redirected to 0x400533B (malloc)
--19881-- REDIR: 0x41F2B30 (strcmp) redirected to 0x4006300 (strcmp)
--19881-- REDIR: 0x41EE9E0 (calloc) redirected to 0x4004668 (calloc)
--19881-- REDIR: 0x41F3DD0 (memchr) redirected to 0x4006420 (memchr)
--19881-- REDIR: 0x41EC980 (free) redirected to 0x4004F55 (free)
--19881-- REDIR: 0x41EF190 (realloc) redirected to 0x40053EA (realloc)
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
--19881-- REDIR: 0x41F29C0 (index) redirected to 0x4006050 (index)
--19881-- REDIR: 0x41F3280 (strncmp) redirected to 0x4006290 (strncmp)
--19881-- REDIR: 0x41F44C0 (stpcpy) redirected to 0x40068D0 (stpcpy)
--19881-- REDIR: 0x41F3140 (strnlen) redirected to 0x4006200 (strnlen)
--19881-- REDIR: 0x41F3380 (strncpy) redirected to 0x4006DA0 (strncpy)
--19881-- REDIR: 0x41F2BA0 (strcpy) redirected to 0x40069B0 (strcpy)
myEventAddHandleFunc:221: Add handle 5 1 0x40ad4a0 0x42d6188
myEventAddHandleFunc:221: Add handle 7 1 0x40aec90 0x42d6188
Allocating domainEvents:0x43bea68
myEventAddHandleFunc:221: Add handle 8 1 0x407c940 0x42d6188
myEventAddTimeoutFunc:251: Adding Timeout -1 0x4083fe0 0x42d6188
Allocating domainEvents:0x4e8c320
myEventAddHandleFunc:221: Add handle 11 1 0x407c940 0x42d6188
myEventAddTimeoutFunc:251: Adding Timeout -1 0x4083fe0 0x42d6188
main:322 :: Registering domain event cbs
poll
==19881== Invalid read of size 4
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19881==
==19881== Process terminating with default action of signal 11 (SIGSEGV)
==19881== Access not within mapped region at address 0x0
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881==
==19881== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 30 from 1)
==19881==
==19881== 1 errors in context 1 of 1:
==19881== Invalid read of size 4
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881== Address 0x0 is not stack'd, malloc'd or (recently) free'd
--19881--
--19881-- supp: 30 Fedora-Core-6-hack3-ld25
==19881==
==19881== IN SUMMARY: 1 errors from 1 contexts (suppressed: 30 from 1)
==19881==
==19881== malloc/free: in use at exit: 590,340 bytes in 686 blocks.
==19881== malloc/free: 1,695 allocs, 1,009 frees, 1,700,423 bytes allocated.
==19881==
==19881== searching for pointers to 686 not-freed blocks.
==19881== checked 11,495,892 bytes.
==19881==
==19881== LEAK SUMMARY:
==19881== definitely lost: 0 bytes in 0 blocks.
==19881== possibly lost: 136 bytes in 1 blocks.
==19881== still reachable: 590,204 bytes in 685 blocks.
==19881== suppressed: 0 bytes in 0 blocks.
==19881== Reachable blocks (those to which a pointer was found) are not shown.
==19881== To see them, rerun with: --show-reachable=yes
--19881-- memcheck: sanity checks: 3 cheap, 1 expensive
--19881-- memcheck: auxmaps: 0 auxmap entries (0k, 0M) in use
--19881-- memcheck: auxmaps: 0 searches, 0 comparisons
--19881-- memcheck: SMs: n_issued = 36 (576k, 0M)
--19881-- memcheck: SMs: n_deissued = 0 (0k, 0M)
--19881-- memcheck: SMs: max_noaccess = 65535 (1048560k, 1023M)
--19881-- memcheck: SMs: max_undefined = 0 (0k, 0M)
--19881-- memcheck: SMs: max_defined = 244 (3904k, 3M)
--19881-- memcheck: SMs: max_non_DSM = 36 (576k, 0M)
--19881-- memcheck: max sec V bit nodes: 115 (5k, 0M)
--19881-- memcheck: set_sec_vbits8 calls: 679 (new: 115, updates: 564)
--19881-- memcheck: max shadow mem size: 885k, 0M
--19881-- translate: fast SP updates identified: 7,064 ( 89.2%)
--19881-- translate: generic_known SP updates identified: 543 ( 6.8%)
--19881-- translate: generic_unknown SP updates identified: 304 ( 3.8%)
--19881-- tt/tc: 13,590 tt lookups requiring 14,331 probes
--19881-- tt/tc: 13,590 fast-cache updates, 3 flushes
--19881-- transtab: new 6,511 (139,232 -> 2,275,366; ratio 163:10) [0 scs]
--19881-- transtab: dumped 0 (0 -> ??)
--19881-- transtab: discarded 8 (187 -> ??)
--19881-- scheduler: 374,982 jumps (bb entries).
--19881-- scheduler: 3/10,454 major/minor sched events.
--19881-- sanity: 4 cheap, 1 expensive checks.
--19881-- exectx: 30,011 lists, 566 contexts (avg 0 per list)
--19881-- exectx: 2,732 searches, 2,171 full compares (794 per 1000)
--19881-- exectx: 0 cmp2, 67 cmp4, 0 cmpAll
Killed
[root@Spring .libs]#
I tried the Python version and had some success, but not 100%, since I didn't
see events for pausing the VM:
[root@Spring events-python]# python event-test.py xen:///
Using uri:xen:///
myDomainEventCallback2 EVENT: Domain vm-full-1(1) Started 0
myDomainEventCallback1 EVENT: Domain vm-full-1(1) Started 0
I noticed that the C version looks for the libvirt socket in one place and the
Python version in a different place. I don't know whether this is significant.
I tweaked the /etc/libvirt/libvirtd.conf file to alter the location.
C: /usr/local/var/run/libvirt
python: /var/run/libvirt/
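
To see which of the two directories actually holds a live socket, something like the following sketch could help. The two directories are the ones listed above. The socket file name "libvirt-sock" is an assumption and is not taken from this thread, and the helper names are made up for illustration.

```python
import os
import stat

# Directories reported above, keyed by the example that uses them.
SOCKET_DIRS = {
    "C example": "/usr/local/var/run/libvirt",
    "Python example": "/var/run/libvirt",
}
SOCKET_NAME = "libvirt-sock"  # assumed file name, adjust if your setup differs


def is_unix_socket(path):
    """Return True if path exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def check_sockets(dirs=SOCKET_DIRS, name=SOCKET_NAME):
    """Map each label to (full socket path, whether a socket exists there)."""
    result = {}
    for label, directory in dirs.items():
        path = os.path.join(directory, name)
        result[label] = (path, is_unix_socket(path))
    return result


if __name__ == "__main__":
    for label, (path, present) in check_sockets().items():
        state = "socket present" if present else "no socket"
        print(f"{label}: {path} -> {state}")
```

If only one of the two paths reports a socket, the two examples are probably linked against different libvirt builds (one under /usr/local, one from the distribution).
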
Regards,
Pete