[libvirt] [PATCH] Avoid crash in shunloadtest

From: "Daniel P. Berrange" <berrange@redhat.com>

For unknown reasons, the shunloadtest will crash on Fedora 16
inside dlopen()

(gdb) bt
#0  0x00000000000050e6 in ?? ()
#1  0x00007ff61a77b9d5 in floor () from /lib64/libm.so.6
#2  0x00007ff61e522963 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ff61e5297e6 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ff61e525006 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ff61e52917a in _dl_open () from /lib64/ld-linux-x86-64.so.2
#6  0x00007ff61e0f6f26 in dlopen_doit () from /lib64/libdl.so.2
#7  0x00007ff61e525006 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ff61e0f752f in _dlerror_run () from /lib64/libdl.so.2
#9  0x00007ff61e0f6fc1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#10 0x0000000000400a15 in main (argc=<optimized out>, argv=<optimized out>) at shunloadtest.c:105

Changing from RTLD_NOW to RTLD_LAZY avoids this problem,
but quite possibly does not fix the root cause.

* shunloadtest.c: s/NOW/LAZY/
---
 tests/shunloadtest.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tests/shunloadtest.c b/tests/shunloadtest.c
index 2cdb8b8..ab6e56f 100644
--- a/tests/shunloadtest.c
+++ b/tests/shunloadtest.c
@@ -102,7 +102,7 @@ int main(int argc ATTRIBUTE_UNUSED, char **argv)
     fprintf(stderr, " .%*s 1 ", 39, "");
     signal(SIGSEGV, sigHandler);
 
-    if (!(lib = dlopen("./.libs/libshunload.so", RTLD_NOW))) {
+    if (!(lib = dlopen("./.libs/libshunload.so", RTLD_LAZY))) {
         fprintf(stderr, "Cannot load ./.libs/libshunload.so %s\n", dlerror());
         return 1;
     }
-- 
1.7.7.3

On 12/01/2011 09:33 AM, Daniel P. Berrange wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> For unknown reasons, the shunloadtest will crash on Fedora 16
> inside dlopen()
>
> (gdb) bt
> #0  0x00000000000050e6 in ?? ()
> #1  0x00007ff61a77b9d5 in floor () from /lib64/libm.so.6
> #2  0x00007ff61e522963 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
> #3  0x00007ff61e5297e6 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
> #4  0x00007ff61e525006 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #5  0x00007ff61e52917a in _dl_open () from /lib64/ld-linux-x86-64.so.2
> #6  0x00007ff61e0f6f26 in dlopen_doit () from /lib64/libdl.so.2
> #7  0x00007ff61e525006 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #8  0x00007ff61e0f752f in _dlerror_run () from /lib64/libdl.so.2
> #9  0x00007ff61e0f6fc1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
> #10 0x0000000000400a15 in main (argc=<optimized out>, argv=<optimized out>) at shunloadtest.c:105
>
> Changing from RTLD_NOW to RTLD_LAZY avoids this problem,
> but quite possibly does not fix the root cause.
>
> * shunloadtest.c: s/NOW/LAZY/
> ---

ACK - I've seen the problem as well, and this avoids the crash.

I don't know if we should be opening a bug against glibc for regressing
on dlopen() behavior under RTLD_NOW, or if it was a bug in our
assumptions for how RTLD_NOW works. I also don't know if the change
would have caught the original bug that shunloadtest was designed to
test, although it's on my list of things to check out. But we might as
well push this now, as it is certainly making my development on F16
painful while the crash continues to happen.

-- 
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
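
For readers comparing the two binding modes discussed above: RTLD_NOW asks the dynamic linker to resolve every undefined symbol before dlopen() returns, while RTLD_LAZY defers resolving function symbols until they are first called. The sketch below is only an illustration and does not explain the Fedora 16 crash. The helper name `try_load` is hypothetical and is not part of libvirt or shunloadtest.c; on older glibc it needs to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from shunloadtest.c): try to load `path`
 * with the given binding mode, RTLD_NOW or RTLD_LAZY.
 * Returns 0 if the library loaded, -1 if dlopen() failed. */
int try_load(const char *path, int flags)
{
    void *lib = dlopen(path, flags);

    if (!lib) {
        /* dlerror() describes the most recent dl* failure */
        fprintf(stderr, "Cannot load %s: %s\n", path, dlerror());
        return -1;
    }

    /* Drop the reference again; only the load itself is of interest here */
    dlclose(lib);
    return 0;
}
```

Calling `try_load("./.libs/libshunload.so", RTLD_NOW)` and then the same path with `RTLD_LAZY` would show whether only the eager-binding load fails cleanly. A crash inside the dynamic linker, as in the backtrace above, would of course still terminate the process rather than return -1.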

On Thu, Dec 01, 2011 at 09:45:44AM -0700, Eric Blake wrote:
> ACK - I've seen the problem as well, and this avoids the crash.
>
> I don't know if we should be opening a bug against glibc for regressing
> on dlopen() behavior under RTLD_NOW, or if it was a bug in our
> assumptions for how RTLD_NOW works. I also don't know if the change
> would have caught the original bug that shunloadtest was designed to
> test, although it's on my list of things to check out. But we might as
> well push this now, as it is certainly making my development on F16
> painful while the crash continues to happen.

The test should still be valid, even with RTLD_LAZY, because it is
the dlclose() behaviour we're validating. We want to make sure the
library isn't unloaded from memory upon dlclose(). So whether we
did lazy symbol resolution doesn't matter.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
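
One way to observe the dlclose() behaviour described above is to probe for the library after closing it. This is a minimal sketch, not the method shunloadtest.c itself uses: it assumes glibc's RTLD_NOLOAD extension, which makes dlopen() return a handle only if the library is already resident. The helper name `resident_after_close` is hypothetical.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from shunloadtest.c).
 * Returns 1 if `path` is still mapped after dlclose(), 0 if it was
 * unloaded, and -1 if it could not be loaded in the first place. */
int resident_after_close(const char *path, int flags)
{
    void *lib = dlopen(path, flags);
    void *probe;

    if (!lib) {
        fprintf(stderr, "Cannot load %s: %s\n", path, dlerror());
        return -1;
    }
    dlclose(lib);

    /* RTLD_NOLOAD (glibc extension) never loads anything; it only
     * succeeds when the library is already in memory. */
    probe = dlopen(path, RTLD_NOLOAD | RTLD_LAZY);
    if (!probe)
        return 0;

    /* Release the extra reference taken by the probe */
    dlclose(probe);
    return 1;
}
```

The `flags` argument is passed straight to the first dlopen(), so the same check can be run with RTLD_NOW or RTLD_LAZY. That matches the point made above: the binding mode affects when symbols are resolved, not whether dlclose() unmaps the library.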

On 12/01/2011 10:44 AM, Daniel P. Berrange wrote:
> On Thu, Dec 01, 2011 at 09:45:44AM -0700, Eric Blake wrote:
>>> Changing from RTLD_NOW to RTLD_LAZY avoids this problem,
>>> but quite possibly does not fix the root cause.
>>>
>>> * shunloadtest.c: s/NOW/LAZY/
>>> ---
>>
>> ACK - I've seen the problem as well, and this avoids the crash.

I'm pushing this under the build-breaker rule.

-- 
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org