
On Wed, Jul 04, 2012 at 10:24:38AM +0200, Wido den Hollander wrote:
On 03-07-12 17:21, Daniel P. Berrange wrote:
On Tue, Jul 03, 2012 at 05:11:48PM +0200, Wido den Hollander wrote:
Yes, there is memory corruption somewhere. I never used valgrind before, but the output seems to show.
I ran libvirtd inside a screen, I've attached the screenlog with all the output.
At the end you'll see there is a misread. I was storing the base64 encoded value of "wido".
Thanks so there are two errors shown by valgrind
==6825== Invalid read of size 1 ==6825== at 0x5769DBA: vfprintf (vfprintf.c:1624) ==6825== by 0x58289D0: __vasprintf_chk (vasprintf_chk.c:68) ==6825== by 0x509C727: virVasprintf (stdio2.h:199) ==6825== by 0x508CFEA: virLogVMessage (logging.c:749) ==6825== by 0x508D349: virLogMessage (logging.c:696) ==6825== by 0x127AB0BC: secretSaveValue (secret_driver.c:297) ==6825== by 0x127AB30A: secretSetValue (secret_driver.c:861) ==6825== by 0x5130CF4: virSecretSetValue (libvirt.c:14457) ==6825== by 0x41CC8F: remoteDispatchSecretSetValueHelper (remote_dispatch.h:10962) ==6825== by 0x5174F84: virNetServerProgramDispatch (virnetserverprogram.c:416) ==6825== by 0x5171260: virNetServerHandleJob (virnetserver.c:161) ==6825== by 0x50990AD: virThreadPoolWorker (threadpool.c:143) ==6825== Address 0x19cb3c64 is 0 bytes after a block of size 4 alloc'd ==6825== at 0x4C29DB4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==6825== by 0x508E32F: virAllocN (memory.c:128) ==6825== by 0x127AB21B: secretSetValue (secret_driver.c:838) ==6825== by 0x5130CF4: virSecretSetValue (libvirt.c:14457) ==6825== by 0x41CC8F: remoteDispatchSecretSetValueHelper (remote_dispatch.h:10962) ==6825== by 0x5174F84: virNetServerProgramDispatch (virnetserverprogram.c:416) ==6825== by 0x5171260: virNetServerHandleJob (virnetserver.c:161) ==6825== by 0x50990AD: virThreadPoolWorker (threadpool.c:143) ==6825== by 0x5098755: virThreadHelper (threads-pthread.c:161) ==6825== by 0x550AE99: start_thread (pthread_create.c:308) ==6825== by 0x58124BC: clone (clone.S:112)
This one is harmless - this is because the VIR_DEBUG line you added uses '%s' to print secret_value, but this is not a NULL terminated string. So we can ignore this.
==6825== ==6825== Invalid read of size 4 ==6825== at 0xA57E4B9: base64_encode (in /usr/lib/x86_64-linux-gnu/libroken.so.18.1.0) ==6825== by 0x10DDBC98: base64_encode_alloc (base64.c:140) ==6825== by 0x127AB0EF: secretSaveValue (secret_driver.c:302) ==6825== by 0x127AB30A: secretSetValue (secret_driver.c:861) ==6825== by 0x5130CF4: virSecretSetValue (libvirt.c:14457) ==6825== by 0x41CC8F: remoteDispatchSecretSetValueHelper (remote_dispatch.h:10962) ==6825== by 0x5174F84: virNetServerProgramDispatch (virnetserverprogram.c:416) ==6825== by 0x5171260: virNetServerHandleJob (virnetserver.c:161) ==6825== by 0x50990AD: virThreadPoolWorker (threadpool.c:143) ==6825== by 0x5098755: virThreadHelper (threads-pthread.c:161) ==6825== by 0x550AE99: start_thread (pthread_create.c:308) ==6825== by 0x58124BC: clone (clone.S:112) ==6825== Address 0xb7786c8 is 8 bytes inside a block of size 9 alloc'd ==6825== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==6825== by 0xA57E3D7: base64_encode (in /usr/lib/x86_64-linux-gnu/libroken.so.18.1.0) ==6825== by 0x10DDBC98: base64_encode_alloc (base64.c:140) ==6825== by 0x127AB0EF: secretSaveValue (secret_driver.c:302) ==6825== by 0x127AB30A: secretSetValue (secret_driver.c:861) ==6825== by 0x5130CF4: virSecretSetValue (libvirt.c:14457) ==6825== by 0x41CC8F: remoteDispatchSecretSetValueHelper (remote_dispatch.h:10962) ==6825== by 0x5174F84: virNetServerProgramDispatch (virnetserverprogram.c:416) ==6825== by 0x5171260: virNetServerHandleJob (virnetserver.c:161) ==6825== by 0x50990AD: virThreadPoolWorker (threadpool.c:143)
This one is very interesting. It shows that the 'base64_encode' function is doing an out-of-bounds read. More tellingly though is that it is reporting 'base64_encode' function is in a wierd library:
/usr/lib/x86_64-linux-gnu/libroken.so.18.1.
If this were normal, we should expect to see that function present in 'base64.c' since this function code is provided by gnulib itself.
So something else libvirt is linking to, directly or indirectly is using libroken.so which also has a 'base64_encode'symbol defined. This is overriding gnulib's symbol of the same name.
I'm willing to bet the API contract of this libroken.so base64_encode. differs from GNULIBS, with crashtastic results
Can you try and find how this libroken.so is getting linked to libvirt ?
Or indeed what this library does, or what package it is part of.
The library is libroken18-heimdal under Ubuntu 12.04: http://packages.ubuntu.com/precise/libroken18-heimdal
When installing ubuntu-virt-server libraries like gnutls depend on this library.
I'm not sure why libvirt gets linked to libroken, but on Ubuntu systems it's installed on most system which use libvirt.
I haven't found a header file which as the symbol 'base64_encode' in it for libroken, only gnutls and glib seem to have something that has 'base64_encode' in it, but those are prefixed with g_
I expect that this is an internal symbol from libroken.so which they leak into the public namespace.
For now I have no clues to why it gets linked to libroken, but it seems the problem has been found.
If gnutls links to libroken.so, then libvirt gets linked to it indirectly It sounds like we might need to have a workaround in gnulib to avoid this problem. With other cases where gnulib replaces existing symbols they use some magic such that the gnulib replacement gets prefixed with 'rpl_'. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|