Re: [libvirt] Segfault in event-test.c example

----- "Matthias Bolte" <matthias.bolte@googlemail.com> wrote:
2010/1/10 <pspreadborough@comcast.net>:
Hello,
I have been trying to use the domain event C code example but unfortunately it segfaults (signal 11) every time I run it:
[root@Spring events-c]# ./event-test
myEventAddHandleFunc:221: Add handle 5 1 0xf081a0 0x8f727f8
myEventAddHandleFunc:221: Add handle 7 1 0xf09990 0x8f727f8
myEventAddHandleFunc:221: Add handle 8 1 0xed7940 0x8f727f8
myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
myEventAddHandleFunc:221: Add handle 11 1 0xed7940 0x8f727f8
myEventAddTimeoutFunc:251: Adding Timeout -1 0xedefa0 0x8f727f8
main:322 :: Registering domain event cbs
Segmentation fault (core dumped)
Core was generated by `/root/libvirt-0.7.5/examples/domain-events/events-c/.libs/lt-event-test'.
Program terminated with signal 11, Segmentation fault.
[New process 21806]
[New process 21822]
#0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
8720 tempQueue.count = priv->domainEvents->count;
(gdb) bt
#0 remoteDomainEventQueueFlush (timer=-1, opaque=0x8f727f8) at remote/remote_driver.c:8720
#1 0x080490d3 in main (argc=Cannot access memory at address 0x1 ) at event-test.c:347
The stack looks corrupted, so I'm doubtful that this trace is of much value. I have built and installed libvirt-0.7.5, and it and its tools seem to be operating correctly.
I tried the event-test with libvirt-0.7.5 and QEMU/Xen and both are working as expected. No segfaults.
Could you inspect the values of priv and priv->domainEvents in GDB using 'p priv' to see if they are NULL and try to dereference them in GDB using 'p *priv' to see if they point to valid memory areas?
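The statement gdb reports at remote_driver.c:8720, `tempQueue.count = priv->domainEvents->count;`, dereferences two pointers in a row, so a NULL in either one would end in signal 11. A minimal sketch of that failure mode with a guard follows. The types `fake_priv` and `fake_queue` and the function `flush_count` are hypothetical stand-ins that carry only the fields named in this thread; this is not libvirt's code and not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: only the fields named in the thread. */
struct fake_queue {
    unsigned int count;
};

struct fake_priv {
    struct fake_queue *domainEvents;
};

/*
 * Copy the pending-event count out of priv, as the faulting
 * statement does, but refuse to dereference a NULL pointer.
 * Returns 0 on success, -1 if priv or priv->domainEvents is NULL.
 */
int flush_count(const struct fake_priv *priv, unsigned int *out)
{
    assert(out != NULL);
    if (priv == NULL || priv->domainEvents == NULL)
        return -1;  /* an unguarded dereference would fault here */
    *out = priv->domainEvents->count;
    return 0;
}
```

A guard like this would only turn the crash into an error return. It cannot detect a pointer that is non-NULL but refers to the wrong or freed memory, which is why inspecting `*priv` in gdb is still worthwhile.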
Yes, the backtrace looks corrupted. If there is stack/heap corruption involved, valgrind may reveal it, so try to run the event-test in valgrind and see if that gives any hints.
You can also try the git version of libvirt. An invalid free call (resulting in heap corruption) in the node device code was fixed after the 0.7.5 release, but that should have no effect on the event-test.
Matthias
Matthias,
priv->domainEvents is NULL, here's the gdb output:
This explains the segfault. The next question is, why is it NULL?
(gdb) p *priv
$1 = {lock = {lock = {__data = {__lock = 1, __count = 0, __owner = 21806, __kind = 0, __nusers = 1, {__spins = 0, __list = {__next = 0x0}}}, __size = "\001\000\000\000\000\000\000\000.U\000\000\000\000\000\000\001\000\000\000\000\000\000", __align = 1}}, sock = 150469168, watch = 3, pid = 4, uses_tls = 1982791681, is_secure = 1815048801, session = 0x782f6269,
uses_tls and is_secure are used as boolean values in the code and should be 0 or 1. Seeing large numbers in both makes me think that priv points to already freed memory here.
type = 0x2f646e65 <Address 0x2f646e65 out of bounds>, counter = 1684956536, localUses = 1668248365, hostname = 0x74656b <Address 0x74656b out of bounds>, debugLog = 0x0, saslconn = 0x0, saslDecoded = 0x0, saslDecodedLength = 0,
type and hostname are char pointers, but they seem to point nowhere. This confirms that either this is freed memory or priv itself got overwritten due to heap corruption.
saslDecodedOffset = 0, saslEncoded = 0x0, saslEncodedLength = 0, saslEncodedOffset = 0, buffer = '\0' <repeats 68 times>, "n\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\bQ�\b����\030\034�\b\000\000\000\000�\033�\b\001\000\000\000\002\000\000\000�\033�\b\000\000\000\000\025|�\000\a", '\0' <repeats 11 times>, "X\000�\b", '\0' <repeats 12 times>, "\021\000\000\000\002\000\000\000P��\b\000\000\000\000\021", '\0' <repeats 15 times>, "\021\000\000\0008\036�\b\f\000\000\000\020\000\000\000\021\000\000\000\a\000\000\000\b\000\000\000\t\000\000\000\021\000\000\000\002\000\000\000\230\034�\b\000\000\000\000A\000\000\000\003\000\000\000\001\000\000\000\001\000"..., bufferLength = 0, bufferOffset = 0, callbackList = 0x0, domainEvents = 0x0, eventFlushTimer = 0, domainEventDispatching = 1, wakeupSendFD = 0, wakeupReadFD = 0, waitDispatch = 0x0, streams = 0x0}
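The odd values in uses_tls through hostname are not random. A minimal sketch for decoding them follows, assuming the dump comes from a 32-bit little-endian x86 process (the valgrind output in this thread reports Arch X86) and that these seven fields are adjacent 4-byte words. The names `words_to_bytes` and `dumped_fields` are hypothetical; the numbers are copied from the gdb output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write the bytes of each 32-bit word to out in little-endian order
 * (least significant byte first), as they would sit in x86 memory.
 * out must have room for 4 * n bytes.
 */
void words_to_bytes(const uint32_t *words, size_t n, unsigned char *out)
{
    assert(words != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (unsigned char)((words[i] >> (8 * b)) & 0xffu);
    }
}

/* Field values copied from the gdb dump, in struct order:
 * uses_tls, is_secure, session, type, counter, localUses, hostname. */
const uint32_t dumped_fields[7] = {
    1982791681u, 1815048801u, 0x782f6269u, 0x2f646e65u,
    1684956536u, 1668248365u, 0x0074656bu
};
```

Decoded this way, the 28 bytes are 0x01 0x00 followed by the NUL-terminated string "/var/lib/xend/xend-socket". That layout resembles a Unix-domain socket address (family 1 is AF_UNIX on Linux, followed by the path), which would suggest priv points at some other live structure holding the xend socket address rather than at arbitrary freed memory. This is an interpretation of the dump, not something confirmed in the thread.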
I'll try a run with valgrind and post the results.
Pete
Could you test the Python version of this example found in examples/domain-events/events-python/event-test.py? Does this work?
Otherwise let's see if valgrind gives any hints.
Matthias
Matthias,

I have built the latest git image and run it through valgrind. Here's the valgrind output:

[root@Spring .libs]# valgrind -v ./event-test xen:///
==19881== Memcheck, a memory error detector.
==19881== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==19881== Using LibVEX rev 1658, a library for dynamic binary translation.
==19881== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==19881== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==19881== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==19881==
--19881-- Command line
--19881-- ./event-test
--19881-- xen:///
--19881-- Startup, with flags:
--19881-- -v
--19881-- Contents of /proc/version:
--19881-- Linux version 2.6.18-164.10.1.el5xen (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Jan 7 21:14:48 EST 2010
--19881-- Arch and hwcaps: X86, x86-sse1-sse2
--19881-- Valgrind library directory: /usr/lib/valgrind
--19881-- Reading syms from /lib/ld-2.5.so (0x163000)
--19881-- Reading syms from /root/libvirt/libvirt/examples/domain-events/events-c/.libs/event-test (0x8048000)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/memcheck (0x38000000)
--19881-- object doesn't have a dynamic symbol table
--19881-- Reading suppressions file: /usr/lib/valgrind/default.supp
--19881-- REDIR: 0x178790 (index) redirected to 0x38027D0F (vgPlain_x86_linux_REDIR_FOR_index)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_core.so (0x4001000)
--19881-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so (0x4003000)
==19881== WARNING: new redirection conflicts with existing -- ignoring it
--19881-- new: 0x00178790 (index ) R-> 0x04006080 index
--19881-- REDIR: 0x178930 (strlen) redirected to 0x4006250 (strlen)
--19881-- Reading syms from /usr/local/lib/libvirt.so.0.7.5 (0x4009000)
--19881-- Reading syms from /usr/lib/libxml2.so.2.6.26 (0x6254000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libz.so.1.2.3 (0xCD2000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libm-2.5.so (0x413B000)
--19881-- Reading syms from /usr/lib/libgnutls.so.13.0.6 (0x670D000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libgcrypt.so.11.5.2 (0x6590000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libsasl2.so.2.0.22 (0x6536000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /usr/lib/libxenstore.so.3.0.0 (0x4162000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libpthread-2.5.so (0x416A000)
--19881-- Reading syms from /lib/libc-2.5.so (0x4183000)
--19881-- Reading syms from /lib/libdl-2.5.so (0xCA7000)
--19881-- Reading syms from /usr/lib/libgpg-error.so.0.3.0 (0xB28000)
--19881-- object doesn't have a symbol table
--19881-- Reading syms from /lib/libresolv-2.5.so (0x26E000)
--19881-- Reading syms from /lib/libcrypt-2.5.so (0x64CE000)
--19881-- REDIR: 0x41F42D0 (memset) redirected to 0x4006540 (memset)
--19881-- REDIR: 0x41F47C0 (memcpy) redirected to 0x4006C20 (memcpy)
--19881-- REDIR: 0x41F3430 (rindex) redirected to 0x4005F60 (rindex)
--19881-- REDIR: 0x41F3090 (strlen) redirected to 0x4006230 (strlen)
--19881-- REDIR: 0x41EED20 (malloc) redirected to 0x400533B (malloc)
--19881-- REDIR: 0x41F2B30 (strcmp) redirected to 0x4006300 (strcmp)
--19881-- REDIR: 0x41EE9E0 (calloc) redirected to 0x4004668 (calloc)
--19881-- REDIR: 0x41F3DD0 (memchr) redirected to 0x4006420 (memchr)
--19881-- REDIR: 0x41EC980 (free) redirected to 0x4004F55 (free)
--19881-- REDIR: 0x41EF190 (realloc) redirected to 0x40053EA (realloc)
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==19881== Warning: noted but unhandled ioctl 0x305000 with no size/direction hints
==19881== This could cause spurious value errors to appear.
==19881== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
--19881-- REDIR: 0x41F29C0 (index) redirected to 0x4006050 (index)
--19881-- REDIR: 0x41F3280 (strncmp) redirected to 0x4006290 (strncmp)
--19881-- REDIR: 0x41F44C0 (stpcpy) redirected to 0x40068D0 (stpcpy)
--19881-- REDIR: 0x41F3140 (strnlen) redirected to 0x4006200 (strnlen)
--19881-- REDIR: 0x41F3380 (strncpy) redirected to 0x4006DA0 (strncpy)
--19881-- REDIR: 0x41F2BA0 (strcpy) redirected to 0x40069B0 (strcpy)
myEventAddHandleFunc:221: Add handle 5 1 0x40ad4a0 0x42d6188
myEventAddHandleFunc:221: Add handle 7 1 0x40aec90 0x42d6188
Allocating domainEvents:0x43bea68
myEventAddHandleFunc:221: Add handle 8 1 0x407c940 0x42d6188
myEventAddTimeoutFunc:251: Adding Timeout -1 0x4083fe0 0x42d6188
Allocating domainEvents:0x4e8c320
myEventAddHandleFunc:221: Add handle 11 1 0x407c940 0x42d6188
myEventAddTimeoutFunc:251: Adding Timeout -1 0x4083fe0 0x42d6188
main:322 :: Registering domain event cbs
poll
==19881== Invalid read of size 4
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19881==
==19881== Process terminating with default action of signal 11 (SIGSEGV)
==19881== Access not within mapped region at address 0x0
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881==
==19881== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 30 from 1)
==19881==
==19881== 1 errors in context 1 of 1:
==19881== Invalid read of size 4
==19881== at 0x4084011: remoteDomainEventQueueFlush (remote_driver.c:8722)
==19881== by 0x80490C2: main (event-test.c:341)
==19881== Address 0x0 is not stack'd, malloc'd or (recently) free'd
--19881--
--19881-- supp: 30 Fedora-Core-6-hack3-ld25
==19881==
==19881== IN SUMMARY: 1 errors from 1 contexts (suppressed: 30 from 1)
==19881==
==19881== malloc/free: in use at exit: 590,340 bytes in 686 blocks.
==19881== malloc/free: 1,695 allocs, 1,009 frees, 1,700,423 bytes allocated.
==19881==
==19881== searching for pointers to 686 not-freed blocks.
==19881== checked 11,495,892 bytes.
==19881==
==19881== LEAK SUMMARY:
==19881== definitely lost: 0 bytes in 0 blocks.
==19881== possibly lost: 136 bytes in 1 blocks.
==19881== still reachable: 590,204 bytes in 685 blocks.
==19881== suppressed: 0 bytes in 0 blocks.
==19881== Reachable blocks (those to which a pointer was found) are not shown.
==19881== To see them, rerun with: --show-reachable=yes
--19881-- memcheck: sanity checks: 3 cheap, 1 expensive
--19881-- memcheck: auxmaps: 0 auxmap entries (0k, 0M) in use
--19881-- memcheck: auxmaps: 0 searches, 0 comparisons
--19881-- memcheck: SMs: n_issued = 36 (576k, 0M)
--19881-- memcheck: SMs: n_deissued = 0 (0k, 0M)
--19881-- memcheck: SMs: max_noaccess = 65535 (1048560k, 1023M)
--19881-- memcheck: SMs: max_undefined = 0 (0k, 0M)
--19881-- memcheck: SMs: max_defined = 244 (3904k, 3M)
--19881-- memcheck: SMs: max_non_DSM = 36 (576k, 0M)
--19881-- memcheck: max sec V bit nodes: 115 (5k, 0M)
--19881-- memcheck: set_sec_vbits8 calls: 679 (new: 115, updates: 564)
--19881-- memcheck: max shadow mem size: 885k, 0M
--19881-- translate: fast SP updates identified: 7,064 ( 89.2%)
--19881-- translate: generic_known SP updates identified: 543 ( 6.8%)
--19881-- translate: generic_unknown SP updates identified: 304 ( 3.8%)
--19881-- tt/tc: 13,590 tt lookups requiring 14,331 probes
--19881-- tt/tc: 13,590 fast-cache updates, 3 flushes
--19881-- transtab: new 6,511 (139,232 -> 2,275,366; ratio 163:10) [0 scs]
--19881-- transtab: dumped 0 (0 -> ??)
--19881-- transtab: discarded 8 (187 -> ??)
--19881-- scheduler: 374,982 jumps (bb entries).
--19881-- scheduler: 3/10,454 major/minor sched events.
--19881-- sanity: 4 cheap, 1 expensive checks.
--19881-- exectx: 30,011 lists, 566 contexts (avg 0 per list)
--19881-- exectx: 2,732 searches, 2,171 full compares (794 per 1000)
--19881-- exectx: 0 cmp2, 67 cmp4, 0 cmpAll
Killed
[root@Spring .libs]#

I tried the Python version and did have some success, but not 100%, since I didn't see events for pausing the VM:

[root@Spring events-python]# python event-test.py xen:///
Using uri:xen:///
myDomainEventCallback2 EVENT: Domain vm-full-1(1) Started 0
myDomainEventCallback1 EVENT: Domain vm-full-1(1) Started 0

I noticed that the C version looks in one place for the libvirt socket and the Python version in a different place. I don't know if this is significant or not. I tweaked the /etc/libvirt/libvirtd.conf file to alter the location.

C: /usr/local/var/run/libvirt
python: /var/run/libvirt/

Regards,
Pete

On Mon, Jan 11, 2010 at 12:52:26AM +0000, pspreadborough@comcast.net wrote:
I tried the python version and did have some success, but not 100% since I didn't see events for pausing the VM:
Pause/resume events are not available for Xen, since it provides no way to get notified of them. The other events (stop/start/define/undefine) are available for Xen. The QEMU/LXC drivers support all events.
[root@Spring events-python]# python event-test.py xen:///
Using uri:xen:///
myDomainEventCallback2 EVENT: Domain vm-full-1(1) Started 0
myDomainEventCallback1 EVENT: Domain vm-full-1(1) Started 0
I noticed that the C version looks in one place for the libvirt socket and the Python version in a different place. I don't know if this is significant or not. I tweaked the /etc/libvirt/libvirtd.conf file to alter the location.
C: /usr/local/var/run/libvirt
That looks like you have built your own custom libvirt binary with the default install prefix. If you do a custom build and want it to interoperate with the installed system version, then make sure to pass

--prefix=/usr --sysconfdir=/etc --localstatedir=/var

to the ./configure script.

Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
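The two socket directories reported in this thread differ only in the localstatedir part. With autoconf's defaults the prefix is /usr/local and localstatedir is ${prefix}/var, which would give /usr/local/var/run/libvirt, while --localstatedir=/var gives /var/run/libvirt. A minimal sketch of that composition follows. The helper name `socket_dir` is hypothetical, and the "/run/libvirt" suffix is taken from the two paths quoted above rather than from libvirt's sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<localstatedir>/run/libvirt" into buf.
 * Returns 0 on success, -1 if the result does not fit in len bytes.
 */
int socket_dir(const char *localstatedir, char *buf, size_t len)
{
    assert(localstatedir != NULL && buf != NULL);
    int n = snprintf(buf, len, "%s/run/libvirt", localstatedir);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling `socket_dir("/usr/local/var", ...)` yields the directory the C example used, and `socket_dir("/var", ...)` yields the one the Python example used, which matches a client library and a daemon built with different configure settings.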

Daniel,

Is the Migrate event supported by Xen?

Regards,
Pete