[libvirt] conftest segfault

One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock. Gene

On 03/15/2013 09:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
I went back and looked at my logs more closely. This started with libvirt-1.0.3-1. Gene

On 15.03.2013 14:56, Gene Czarcinski wrote:
On 03/15/2013 09:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
I went back and looked at my logs more closely. This started with libvirt-1.0.3-1.
Gene
Do you have a coredump? What does it say? Michal

On 03/15/2013 10:17 AM, Michal Privoznik wrote:
On 15.03.2013 14:56, Gene Czarcinski wrote:
On 03/15/2013 09:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
I went back and looked at my logs more closely. This started with libvirt-1.0.3-1.
Gene
Do you have a coredump? What does it say?
In a word: no. See the attached excerpt from syslog which may explain why. I have explored the problem a bit.
1. It occurs when ./autogen is run and will likely also occur when ./configure is run ... this is something internal to autogen. To eliminate the rpmbuild, I ran things from a git repository.
2. git checkout v1.0.2-maint does *not* have the problem.
3. git checkout v1.0.3-maint does have the problem.
4. So does git checkout v1.0.3-rc1.
One approach to identify this may be to do a binary search through the commits.
Gene
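A binary search through the commits is what `git bisect` automates. As a rough illustration of the idea only, here is a minimal Python sketch. It assumes the history is ordered oldest to newest, that the first entry is known good and the last known bad, and that every commit after the first bad one is also bad. The commit names and the `is_bad` check are hypothetical placeholders, not real libvirt commits.

```python
def first_bad(commits, is_bad):
    """Return the earliest commit for which is_bad() is true.

    Assumes commits[0] is good, commits[-1] is bad, and that the
    history flips from good to bad exactly once.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]


# Hypothetical history; the names are placeholders for illustration.
history = ["v1.0.2", "c1", "c2", "gnulib-update", "c4", "v1.0.3-rc1"]
bad_from = history.index("gnulib-update")
culprit = first_bad(history, lambda c: history.index(c) >= bad_from)
```

In practice the `is_bad` step would be "run ./autogen and look for the segfault message", so each probe is slow; halving the range each time keeps the number of builds logarithmic in the number of commits.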

On 03/15/2013 07:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
Generally, this is not an issue. Autoconf tests INTENTIONALLY try to probe for broken systems, in order to work around brokenness, so a segfaulting conftest during ./configure just says that things are probing as expected. About the only thing that could be done to avoid a segfault during ./configure is fixing the underlying broken system that the probe was detecting in the first place, but that's more likely to be a glibc or kernel fix, not a libvirt fix. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 03/15/2013 11:38 AM, Eric Blake wrote:
On 03/15/2013 07:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
Generally, this is not an issue. Autoconf tests INTENTIONALLY try to probe for broken systems, in order to work around brokenness, so a segfaulting conftest during ./configure just says that things are probing as expected. About the only thing that could be done to avoid a segfault during ./configure is fixing the underlying broken system that the probe was detecting in the first place, but that's more likely to be a glibc or kernel fix, not a libvirt fix.
Isn't it a bit strange that it is broken under v1.0.3-maint but not under v1.0.2-maint? The underlying system is the same.

On 03/15/2013 10:16 AM, Gene Czarcinski wrote:
On 03/15/2013 11:38 AM, Eric Blake wrote:
On 03/15/2013 07:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
Generally, this is not an issue. Autoconf tests INTENTIONALLY try to probe for broken systems, in order to work around brokenness, so a segfaulting conftest during ./configure just says that things are probing as expected. About the only thing that could be done to avoid a segfault during ./configure is fixing the underlying broken system that the probe was detecting in the first place, but that's more likely to be a glibc or kernel fix, not a libvirt fix.
Isn't it a bit strange that it is broken under v1.0.3-maint but not under v1.0.2-maint? The underlying system is the same.
It's not broken. If instead you mean, "Isn't it strange that a configure triggers a conftest segfault under v1.0.3-maint but not under v1.0.2-maint?", the answer is no, it's not strange - v1.0.3 runs more probes than v1.0.2, so the new segfault is probably due to one of those added probes.
Knowing WHICH configure.ac probe is segfaulting might be nice, for diagnosing why it is faulting and where to send a bug report upstream. For example, if the bug is in glibc, then fixing glibc to not segfault will fix every program that relies on glibc without probing for the bug - but it will make no difference to libvirt, which is already prepared to work around the glibc bug.
Look through the config.log of your 1.0.3 build to see if you can quickly spot a test that failed with a segfault. There's a wealth of information in that file (maybe as simple as finding the string '$? = 139' or a message about a SEGV). But again, a segfault of conftest during configure is NORMAL, and nothing to panic about - it means that the package is PROPERLY going to avoid tickling that bug later on when the package is compiled.
In fact, just in typing that, I thought of at least one glibc CVE bug in re_compile_pattern that still has not been fixed as of the glibc currently in Fedora 18. Furthermore, I know that this bug was reported only recently, and that the version of gnulib used in v1.0.2 did not check for this bug, while the version in v1.0.3 does (see libvirt commit d09949e29 for where I updated to newer gnulib, precisely to work around that glibc CVE). So sure enough, I looked through my config.log and found:
configure:31277: checking for working re_compile_pattern
configure:31456: gcc -std=gnu99 -o conftest -g conftest.c >&5
configure:31456: $? = 0
configure:31456: ./conftest
*** glibc detected *** ./conftest: malloc(): memory corruption: 0x0000000000b31fc0 ***
./configure: line 3528: 15097 Segmentation fault (core dumped) ./conftest$ac_exeext
configure:31456: $? = 139
configure: program exited with status 139
Libvirt is immune to that bug, but the CVE still affects other glibc clients that use re_compile_pattern without being aware of the problems, so you may want to put some pressure on the right people to fix https://bugzilla.redhat.com/show_bug.cgi?id=905877.
-- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
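The search for '$? = 139' can be scripted. Status 139 is how a shell reports a child killed by signal 11 (128 + 11, SIGSEGV on Linux). The following is a minimal Python sketch, not part of libvirt or Autoconf: the function name `find_crashed_probes` is made up for illustration, and it assumes config.log lines look like the excerpt quoted in this message. It remembers the most recent "checking ..." line and reports it when a matching exit status follows.

```python
import re

# Line shapes assumed from the config.log excerpt quoted in this thread.
CHECKING = re.compile(r"^configure:\d+: (checking .*)$")
STATUS = re.compile(r"^configure:\d+: \$\? = (\d+)$")

SAMPLE = r"""configure:31277: checking for working re_compile_pattern
configure:31456: gcc -std=gnu99 -o conftest -g conftest.c >&5
configure:31456: $? = 0
configure:31456: ./conftest
*** glibc detected *** ./conftest: malloc(): memory corruption: 0x0000000000b31fc0 ***
./configure: line 3528: 15097 Segmentation fault (core dumped) ./conftest$ac_exeext
configure:31456: $? = 139
configure: program exited with status 139
"""


def find_crashed_probes(log_text, status=139):
    """Return the 'checking ...' descriptions whose probe exited with status."""
    crashed = []
    current = None  # most recent "checking ..." line seen so far
    for raw in log_text.splitlines():
        line = raw.strip()
        m = CHECKING.match(line)
        if m:
            current = m.group(1)
            continue
        m = STATUS.match(line)
        if m and int(m.group(1)) == status and current is not None:
            if current not in crashed:
                crashed.append(current)
    return crashed


probes = find_crashed_probes(SAMPLE)
```

To use it on a real build, read the file with `open("config.log").read()` and pass the text in; the probe description then points at which upstream project to report the bug to.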

On 03/15/2013 01:26 PM, Eric Blake wrote:
On 03/15/2013 11:38 AM, Eric Blake wrote:
On 03/15/2013 07:48 AM, Gene Czarcinski wrote:
One of the libvirt tests (conftest) has been segfaulting for some time with no indication of a test failure other than a message in syslog. I verified this by building libvirt-1.0.3-1 with mock.
Generally, this is not an issue. Autoconf tests INTENTIONALLY try to probe for broken systems, in order to work around brokenness, so a segfaulting conftest during ./configure just says that things are probing as expected. About the only thing that could be done to avoid a segfault during ./configure is fixing the underlying broken system that the probe was detecting in the first place, but that's more likely to be a glibc or kernel fix, not a libvirt fix.
On 03/15/2013 10:16 AM, Gene Czarcinski wrote:
Isn't it a bit strange that it is broken under v1.0.3-maint but not under v1.0.2-maint? The underlying system is the same.
It's not broken. If instead you mean, "Isn't it strange that a configure triggers a conftest segfault under v1.0.3-maint but not under v1.0.2-maint?", the answer is no, it's not strange - v1.0.3 runs more probes than v1.0.2, so the new segfault is probably due to one of those added probes.
Knowing WHICH configure.ac probe is segfaulting might be nice, for diagnosing why it is faulting and where to send a bug report upstream. For example, if the bug is in glibc, then fixing glibc to not segfault will fix every program that relies on glibc without probing for the bug - but it will make no difference to libvirt, which is already prepared to work around the glibc bug.
Look through the config.log of your 1.0.3 build to see if you can quickly spot a test that failed with a segfault. There's a wealth of information in that file (maybe as simple as finding the string '$? = 139' or a message about a SEGV). But again, a segfault of conftest during configure is NORMAL, and nothing to panic about - it means that the package is PROPERLY going to avoid tickling that bug later on when the package is compiled.
In fact, just in typing that, I thought of at least one glibc CVE bug in re_compile_pattern that still has not been fixed as of the glibc currently in Fedora 18. Furthermore, I know that this bug was reported only recently, and that the version of gnulib used in v1.0.2 did not check for this bug, while the version in v1.0.3 does (see libvirt commit d09949e29 for where I updated to newer gnulib, precisely to work around that glibc CVE). So sure enough, I looked through my config.log and found:
configure:31277: checking for working re_compile_pattern
configure:31456: gcc -std=gnu99 -o conftest -g conftest.c >&5
configure:31456: $? = 0
configure:31456: ./conftest
*** glibc detected *** ./conftest: malloc(): memory corruption: 0x0000000000b31fc0 ***
./configure: line 3528: 15097 Segmentation fault (core dumped) ./conftest$ac_exeext
configure:31456: $? = 139
configure: program exited with status 139
Libvirt is immune to that bug, but the CVE still affects other glibc clients that use re_compile_pattern without being aware of the problems, so you may want to put some pressure on the right people to fix https://bugzilla.redhat.com/show_bug.cgi?id=905877.
I did a little testing and the problem begins occurring with this commit:
commit d09949e29386c38443c82a2231240cc1e3954a5d
Author: Eric Blake <eblake@redhat.com>
Date: Sat Jan 26 13:41:31 2013 -0700
maint: update to latest gnulib
CVE-2013-0242 in glibc's regex() can cause a DoS in any daemon that runs a regex search on user input while in a multibyte locale. I'm not sure how hard it would be to trigger such a setup for libvirtd, but rather than risk things, we can avoid the issue: gnulib has worked around the problem, and by updating to the latest gnulib, we can avoid the bug even on platforms where glibc has yet to be patched.
* .gnulib: Update to latest, for various fixes, including regex.
* bootstrap: Resync from upstream.
If I do a ./autogen with the previous commit, no problem. Move along to this commit and "bang" ... the segfault message in syslog. So, it appears that this version of gnulib fixes something important but also causes a segfault when ./configure is run with the new gnulib. What I do not know is if this segfault has any meaning.
Gene

On 03/16/2013 02:02 PM, Gene Czarcinski wrote:
In fact, just in typing that, I thought of at least one glibc CVE bug in re_compile_pattern that still has not been fixed as of the glibc currently in Fedora 18. Furthermore, I know that this bug was reported only recently, and that the version of gnulib used in v1.0.2 did not check for this bug, while the version in v1.0.3 does (see libvirt commit d09949e29 for where I updated to newer gnulib, precisely to work around that glibc CVE). So sure enough, I looked through my config.log and found:
Libvirt is immune to that bug, but the CVE still affects other glibc clients that use re_compile_pattern without being aware of the problems, so you may want to put some pressure on the right people to fix https://bugzilla.redhat.com/show_bug.cgi?id=905877.
I did a little testing and the problem begins occurring with this commit:
commit d09949e29386c38443c82a2231240cc1e3954a5d Author: Eric Blake <eblake@redhat.com> Date: Sat Jan 26 13:41:31 2013 -0700
maint: update to latest gnulib
CVE-2013-0242 in glibc's regex() can cause a DoS in any daemon
Bingo - you found the very commit that I was just describing above.
If I do a ./autogen with the previous commit, no problem. Move along to this commit and "bang" ... the segfault message in syslog.
And as I've already told you, this is GOOD, and will continue to happen as long as glibc is vulnerable to the CVE. The segfault in configure is INTENTIONAL - it is gnulib probing whether glibc is broken, in order to decide whether to rely on glibc or provide its own alternate regex implementation. If glibc coredumps, then gnulib uses its alternate, and libvirtd is a few kilobytes larger but immune to the glibc bug. If glibc does not core dump, then gnulib uses glibc, and libvirtd will be slightly smaller. Either way, libvirtd is immune to the glibc CVE, _because_ the configure test was intentionally trying to tickle a segfault. Libvirt 1.0.2, on the other hand, is vulnerable to the glibc bug, because it was not probing for that particular failure.
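To make the decision described here concrete, this is a rough Python sketch of a run-time probe. It is not gnulib's actual test, which is a C program run from configure (Autoconf's AC_RUN_IFELSE macro is the usual mechanism for run-time probes like this). The function names and the two probe snippets are hypothetical; the "broken" probe simply sends itself SIGSEGV to stand in for a library call that crashes, and it assumes a POSIX system.

```python
import subprocess
import sys


def probe_crashes(probe_source):
    """Run probe_source in a child interpreter; True if it did not exit cleanly.

    The child is a separate process, so a crash there cannot take down the
    caller - the same reason configure runs conftest as its own program.
    """
    result = subprocess.run(
        [sys.executable, "-c", probe_source],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode != 0


def choose_regex(probe_source):
    """Pick an implementation based on the probe outcome."""
    if probe_crashes(probe_source):
        return "bundled replacement"  # slightly larger binary, immune to the bug
    return "system libc"  # smaller binary, the system function was safe


# Hypothetical stand-ins for a healthy and a buggy system library.
HEALTHY = "pass"
BROKEN = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

healthy_choice = choose_regex(HEALTHY)
broken_choice = choose_regex(BROKEN)
```

Either branch yields a working program; the crash in the child is the measurement, not a failure of the build.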
So, it appears that this version of gnulib fixes something important but also causes a segfault when ./configure is run with the new gnulib. What I do not know is if this segfault has any meaning.
The segfault DOES have meaning - it means that glibc is broken, but that the brokenness of glibc will not impact the rest of libvirtd. Quit worrying about it. Meanwhile, if you are wondering about glibc, just read https://bugzilla.redhat.com/show_bug.cgi?id=905877 (which you can easily find by searching for CVE-2013-0242, the very CVE mentioned in the commit message that you bisected as being the source where ./configure first tickles a segfault). It looks like an updated glibc will be hitting updates-testing of Fedora 18 early next week, at which point, the segv during ./configure should disappear. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 03/16/2013 11:08 PM, Eric Blake wrote:
On 03/16/2013 02:02 PM, Gene Czarcinski wrote:
In fact, just in typing that, I thought of at least one glibc CVE bug in re_compile_pattern that still has not been fixed as of the glibc currently in Fedora 18. Furthermore, I know that this bug was reported only recently, and that the version of gnulib used in v1.0.2 did not check for this bug, while the version in v1.0.3 does (see libvirt commit d09949e29 for where I updated to newer gnulib, precisely to work around that glibc CVE). So sure enough, I looked through my config.log and found:
Libvirt is immune to that bug, but the CVE still affects other glibc clients that use re_compile_pattern without being aware of the problems, so you may want to put some pressure on the right people to fix https://bugzilla.redhat.com/show_bug.cgi?id=905877.
I did a little testing and the problem begins occurring with this commit:
commit d09949e29386c38443c82a2231240cc1e3954a5d Author: Eric Blake <eblake@redhat.com> Date: Sat Jan 26 13:41:31 2013 -0700
maint: update to latest gnulib
CVE-2013-0242 in glibc's regex() can cause a DoS in any daemon
Bingo - you found the very commit that I was just describing above.
If I do a ./autogen with the previous commit, no problem. Move along to this commit and "bang" ... the segfault message in syslog.
And as I've already told you, this is GOOD, and will continue to happen as long as glibc is vulnerable to the CVE. The segfault in configure is INTENTIONAL - it is gnulib probing whether glibc is broken, in order to decide whether to rely on glibc or provide its own alternate regex implementation. If glibc coredumps, then gnulib uses its alternate, and libvirtd is a few kilobytes larger but immune to the glibc bug. If glibc does not core dump, then gnulib uses glibc, and libvirtd will be slightly smaller. Either way, libvirtd is immune to the glibc CVE, _because_ the configure test was intentionally trying to tickle a segfault. Libvirt 1.0.2, on the other hand, is vulnerable to the glibc bug, because it was not probing for that particular failure.
So, it appears that this version of gnulib fixes something important but also causes a segfault when ./configure is run with the new gnulib. What I do not know is if this segfault has any meaning.
The segfault DOES have meaning - it means that glibc is broken, but that the brokenness of glibc will not impact the rest of libvirtd. Quit worrying about it.
Meanwhile, if you are wondering about glibc, just read https://bugzilla.redhat.com/show_bug.cgi?id=905877 (which you can easily find by searching for CVE-2013-0242, the very CVE mentioned in the commit message that you bisected as being the source where ./configure first tickles a segfault). It looks like an updated glibc will be hitting updates-testing of Fedora 18 early next week, at which point, the segv during ./configure should disappear.
I thought I had checked bugzilla but I searched the comment fields rather than the subject line for CVE-2013-0242 ... I will need to remember to do both next time.
Since libvirt uses gnulib for some/many/most/all of what it needs from libc and gnulib specifically has its own copy/version of regexec.c which has exactly the same patch as the one that goes against glibc, you are correct in saying that libvirt does not suffer from the problem. Maybe something should be done to have configure use gnulib instead of glibc for its tests. One other little thing ... libvirt has a test named "conftest" ... very confusing in this case. I just may submit a patch to rename that particular test. Gene

On 03/16/2013 11:08 PM, Eric Blake wrote:
So, it appears that this version of gnulib fixes something important but also causes a segfault when ./configure is run with the new gnulib. What I do not know is if this segfault has any meaning.
The segfault DOES have meaning - it means that glibc is broken, but that the brokenness of glibc will not impact the rest of libvirtd. Quit worrying about it.
Since libvirt uses gnulib for some/many/most/all of what it needs from libc and gnulib specifically has its own copy/version of regexec.c which has exactly the same patch as the one that goes against glibc, you are correct in saying that libvirt does not suffer from the problem. Maybe something should be done to have configure use gnulib instead of glibc for its tests.
You are misunderstanding HOW gnulib works. Gnulib works by injecting tests that run at configure time, in order to determine whether to stick with glibc or gnulib at compile time. You MUST test glibc, at least once, in order to determine whether glibc is safe to use elsewhere. Getting a SEGV during configure is SUPPOSED to happen if glibc is buggy, and it is gnulib that injected the code into configure that forces the glibc coredump. As for your suggestion of using gnulib at configure time, it IS gnulib that is doing the probe of glibc; we can't link against gnulib until it is built, but we don't know what to build until the gnulib configure-time tests have probed what works and what needs working around. There is nothing to fix in libvirt. Gnulib is working, as designed.
One other little thing ... libvirt has a test named "conftest" ... very confusing in this case. I just may submit a patch to rename that particular test.
Yes, a patch to rename the libvirt test would be worthwhile. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 03/18/2013 01:42 PM, Eric Blake wrote:
On 03/16/2013 11:08 PM, Eric Blake wrote:
So, it appears that this version of gnulib fixes something important but also causes a segfault when ./configure is run with the new gnulib. What I do not know is if this segfault has any meaning.
The segfault DOES have meaning - it means that glibc is broken, but that the brokenness of glibc will not impact the rest of libvirtd. Quit worrying about it.
Since libvirt uses gnulib for some/many/most/all of what it needs from libc and gnulib specifically has its own copy/version of regexec.c which has exactly the same patch as the one that goes against glibc, you are correct in saying that libvirt does not suffer from the problem. Maybe something should be done to have configure use gnulib instead of glibc for its tests.
You are misunderstanding HOW gnulib works. Gnulib works by injecting tests that run at configure time, in order to determine whether to stick with glibc or gnulib at compile time. You MUST test glibc, at least once, in order to determine whether glibc is safe to use elsewhere. Getting a SEGV during configure is SUPPOSED to happen if glibc is buggy, and it is gnulib that injected the code into configure that forces the glibc coredump. As for your suggestion of using gnulib at configure time, it IS gnulib that is doing the probe of glibc; we can't link against gnulib until it is built, but we don't know what to build until the gnulib configure-time tests have probed what works and what needs working around.
There is nothing to fix in libvirt. Gnulib is working, as designed.
OK, understand (finally) ;)
One other little thing ... libvirt has a test named "conftest" ... very confusing in this case. I just may submit a patch to rename that particular test.
Yes, a patch to rename the libvirt test would be worthwhile.
Done.
BTW, if you get a chance could you look at the patch to add ip-route for virtual networks? Gene
participants (3): Eric Blake, Gene Czarcinski, Michal Privoznik