[libvirt] Likely build race, "/usr/bin/ld: cannot find -lvirt"

tl;dr: I think there is a bug in libvirt's build system which, with low probability, causes a build failure containing this message:

  /usr/bin/ld: cannot find -lvirt

Complete build logs of two attempts:
  http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6...
  http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6...

Snippet from 123046 containing the error is enclosed below.

Longer explanation:

I have two new machines for the Xen Project CI, which I am trying to commission. As part of commissioning I run a complete test run (a "flight" in osstest terminology) on just those new hosts.

The i386 libvirt build failed:
  http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6...

Everything else that would be expected to work was fine. The test programme was identical to flight 122815, except that that ran on other hosts in the test farm (and, there, it passed).

The error is the kind of error one sees with missing dependencies in parallel builds, etc.

I wanted to have some 32-bit libvirt tests actually run, so I reran a new flight containing the relevant parts. That failed too in a very similar way:
  http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6...

The two machines are Dell R230s (and therefore hardly unusual). The main novelty of these machines is that the firmware is UEFI, booting in UEFI mode. I doubt that has anything to do with it.

The host, including compiler, is Debian jessie i386.

As you can see from the log, we were trying to build libvirt 764a7483f189e6de841163647c14296e693dbb2e

What may be less obvious is that we were trying to build it against xen.git#0306a1311d02ea52b4a9a9bc339f8bab9354c5e3.
  http://logs.test-lab.xenproject.org/osstest/logs/123064/build-i386-libvirt/i...
  http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386/info.html

Does this seem like a likely explanation ? Have other people experienced occasional problems with make -j ?
If someone wants to suggest a patch that might fix it I can test it. In the meantime I have set off a number of new attempts, to try to guess the failure probability, and also one attempt on other hosts to check that nothing unexpected was broken.

Ian.

/usr/bin/ld: cannot find -lvirt
/usr/bin/ld: cannot find -lvirt
/bin/mkdir -p '/home/osstest/build.123046.build-i386-libvirt/dist/usr/local/lib/libvirt/storage-backend'
/bin/bash ../libtool --mode=install /usr/bin/install -c libvirt_storage_backend_fs.la libvirt_storage_backend_logical.la libvirt_storage_backend_scsi.la libvirt_storage_backend_mpath.la '/home/osstest/build.123046.build-i386-libvirt/dist/usr/local/lib/libvirt/storage-backend'
libtool: install: warning: relinking `libvirt_storage_backend_fs.la'
libtool: install: (cd /home/osstest/build.123046.build-i386-libvirt/libvirt/src; /bin/bash /home/osstest/build.123046.build-i386-libvirt/libvirt/libtool --silent --tag CC --mode=relink gcc -std=gnu99 -I./conf -I/usr/include/libxml2 -fno-common -W -Waddress -Waggressive-loop-optimizations -Wall -Wattributes -Wbad-function-cast -Wbuiltin-macro-redefined -Wcast-align -Wchar-subscripts -Wclobbered -Wcomment -Wcomments -Wcoverage-mismatch -Wcpp -Wdate-time -Wdeprecated-declarations -Wdiv-by-zero -Wdouble-promotion -Wempty-body -Wendif-labels -Wextra -Wformat-contains-nul -Wformat-extra-args -Wformat-security -Wformat-y2k -Wformat-zero-length -Wfree-nonheap-object -Wignored-qualifiers -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Winit-self -Winline -Wint-to-pointer-cast -Winvalid-memory-model -Winvalid-pch -Wjump-misses-init -Wlogical-op -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args -Wmissing-braces -Wmissing-declarations -Wmissing-field-initializers -Wmissing-include-dirs -Wmissing-parameter-type -Wmissing-prototypes -Wmultichar -Wnarrowing -Wnested-externs -Wnonnull -Wold-style-declaration -Wold-style-definition -Wopenmp-simd -Woverflow -Woverride-init -Wpacked-bitfield-compat -Wparentheses -Wpointer-arith -Wpointer-sign -Wpointer-to-int-cast -Wpragmas -Wpsabi -Wreturn-local-addr -Wreturn-type -Wsequence-point -Wshadow -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstrict-prototypes -Wsuggest-attribute=const -Wsuggest-attribute=format -Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wswitch -Wsync-nand -Wtrampolines -Wtrigraphs -Wtype-limits -Wuninitialized -Wunknown-pragmas -Wunused -Wunused-but-set-parameter -Wunused-but-set-variable -Wunused-function -Wunused-label -Wunused-local-typedefs -Wunused-parameter -Wunused-result -Wunused-value -Wunused-variable -Wvarargs -Wvariadic-macros -Wvector-operation-performance -Wvolatile-register-var -Wwrite-strings -Wnormalized=nfc -Wno-sign-compare -Wjump-misses-init -Wswitch-enum -Wno-format-nonliteral -fstack-protector-strong -fexceptions -fasynchronous-unwind-tables -fipa-pure-const -Wno-suggest-attribute=pure -Wno-suggest-attribute=const -Werror -Wframe-larger-than=4096 -g -I/home/osstest/build.123046.build-i386-libvirt/xendist/usr/local/include/ -DLIBXL_API_VERSION=0x040400 -module -avoid-version -Wl,-z -Wl,nodelete -export-dynamic -Wl,-z -Wl,relro -Wl,-z -Wl,now -Wl,--no-copy-dt-needed-entries -g -L/home/osstest/build.123046.build-i386-libvirt/xendist/usr/local/lib/ -Wl,-rpath-link=/home/osstest/build.123046.build-i386-libvirt/xendist/usr/local/lib/ -o libvirt_storage_backend_fs.la -rpath /usr/local/lib/libvirt/storage-backend storage/libvirt_storage_backend_fs_la-storage_backend_fs.lo libvirt.la ../gnulib/lib/libgnu.la -ldl -inst-prefix-dir /home/osstest/build.123046.build-i386-libvirt/dist)
collect2: error: ld returned 1 exit status
Makefile:6410: recipe for target 'install-lockdriverLTLIBRARIES' failed
libtool: install: error: relink `lockd.la' with the above command before installing it
make[3]: *** [install-lockdriverLTLIBRARIES] Error 1
make[3]: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lvirt

Ian Jackson writes ("Likely build race, "/usr/bin/ld: cannot find -lvirt""):
tl;dr:
I think there is a bug in libvirt's build system which, with low probability, causes a build failure containing this message: /usr/bin/ld: cannot find -lvirt
Complete build logs of two attempts:
http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6...
http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6...
I have run a number of attempts. Out of 5 more, 1 succeeded. So out of a total of 7 attempts, 1 succeeded. This repro rate is, IMO, an excellent opportunity to debug this race :-).

Ian.
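As a rough illustration of how much 7 attempts say about the failure probability, the sketch below computes the observed failure rate and a 95% Wilson score interval for it. The function name wilson_interval and the choice of the Wilson method are assumptions made for this example; nothing like this was run in the thread.

```python
import math


def wilson_interval(failures, attempts, z=1.96):
    """Return (low, high) bounds of the Wilson score interval.

    z=1.96 corresponds to roughly 95% confidence. This is a generic
    textbook formula, not something taken from osstest.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = failures / attempts
    denom = 1 + z * z / attempts
    center = (p + z * z / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts
                         + z * z / (4 * attempts ** 2)) / denom
    return center - half, center + half


# 7 attempts in total, 1 succeeded, so 6 failed.
attempts, failures = 7, 6
low, high = wilson_interval(failures, attempts)
print(f"observed failure rate: {failures / attempts:.2f}")
print(f"95% Wilson interval:   {low:.2f} .. {high:.2f}")
```

With so few runs the interval is wide (roughly 0.49 to 0.97), but even its lower end is far from "low probability" on these hosts.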

On 05/24/2018 04:27 AM, Ian Jackson wrote:
I have run a number of attempts. Out of 5 more, 1 succeeded. So out of a total of 7 attempts, 1 succeeded. This repro rate is an IMO excellent opportunity to debug this race :-).
There appears to be a missing dependency between the lockd library and libvirt library, but my autotools skills lack the savvy to find it. Here we see the install command and relinking of lockd.la

/bin/bash ../libtool --mode=install /usr/bin/install -c lockd.la '/home/osstest/build.123096.build-i386-libvirt/dist/usr/local/lib/libvirt/lock-driver'
libtool: install: warning: relinking `lockd.la'
libtool: install: (cd /home/osstest/build.123096.build-i386-libvirt/libvirt/src; /bin/bash /home/osstest/build.123096.build-i386-libvirt/libvirt/libtool --silent --tag CC --mode=relink gcc -std=gnu99 -I./conf -I/usr/include/libxml2 -fno-common -W -Waddress -Waggressive-loop-optimizations -Wall -Wattributes -Wbad-function-cast -Wbuiltin-macro-redefined -Wcast-align -Wchar-subscripts -Wclobbered -Wcomment -Wcomments -Wcoverage-mismatch -Wcpp -Wdate-time -Wdeprecated-declarations -Wdiv-by-zero -Wdouble-promotion -Wempty-body -Wendif-labels -Wextra -Wformat-contains-nul -Wformat-extra-args -Wformat-security -Wformat-y2k -Wformat-zero-length -Wfree-nonheap-object -Wignored-qualifiers -Wimplicit -Wimplicit-function-declaration -Wimplicit-int -Winit-self -Winline -Wint-to-pointer-cast -Winvalid-memory-model -Winvalid-pch -Wjump-misses-init -Wlogical-op -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args -Wmissing-braces -Wmissing-declarations -Wmissing-field-initializers -Wmissing-include-dirs -Wmissing-parameter-type -Wmissing-prototypes -Wmultichar -Wnarrowing -Wnested-externs -Wnonnull -Wold-style-declaration -Wold-style-definition -Wopenmp-simd -Woverflow -Woverride-init -Wpacked-bitfield-compat -Wparentheses -Wpointer-arith -Wpointer-sign -Wpointer-to-int-cast -Wpragmas -Wpsabi -Wreturn-local-addr -Wreturn-type -Wsequence-point -Wshadow -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstrict-prototypes -Wsuggest-attribute=const -Wsuggest-attribute=format -Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wswitch -Wsync-nand -Wtrampolines -Wtrigraphs -Wtype-limits -Wuninitialized -Wunknown-pragmas -Wunused -Wunused-but-set-parameter -Wunused-but-set-variable -Wunused-function -Wunused-label -Wunused-local-typedefs -Wunused-parameter -Wunused-result -Wunused-value -Wunused-variable -Wvarargs -Wvariadic-macros -Wvector-operation-performance -Wvolatile-register-var -Wwrite-strings -Wnormalized=nfc -Wno-sign-compare -Wjump-misses-init -Wswitch-enum -Wno-format-nonliteral -fstack-protector-strong -fexceptions -fasynchronous-unwind-tables -fipa-pure-const -Wno-suggest-attribute=pure -Wno-suggest-attribute=const -Werror -Wframe-larger-than=4096 -g -I/home/osstest/build.123096.build-i386-libvirt/xendist/usr/local/include/ -DLIBXL_API_VERSION=0x040400 -module -avoid-version -Wl,-z -Wl,nodelete -export-dynamic -Wl,-z -Wl,relro -Wl,-z -Wl,now -Wl,--no-copy-dt-needed-entries -Wl,-z -Wl,defs -g -L/home/osstest/build.123096.build-i386-libvirt/xendist/usr/local/lib/ -Wl,-rpath-link=/home/osstest/build.123096.build-i386-libvirt/xendist/usr/local/lib/ -o lockd.la -rpath /usr/local/lib/libvirt/lock-driver locking/lockd_la-lock_driver_lockd.lo locking/lockd_la-lock_protocol.lo libvirt.la ../gnulib/lib/libgnu.la -ldl -inst-prefix-dir /home/osstest/build.123096.build-i386-libvirt/dist)
/usr/bin/ld: cannot find -lvirt
collect2: error: ld returned 1 exit status
libtool: install: error: relink `lockd.la' with the above command before installing it
Makefile:6410: recipe for target 'install-lockdriverLTLIBRARIES' failed

and several lines later it seems another thread finally finishes libvirt.la

libtool: install: /usr/bin/install -c .libs/libvirt.lai /home/osstest/build.123096.build-i386-libvirt/dist/usr/local/lib/libvirt.la

I've stared at the various Makefile.{,inc.}am files but can't spot the problem. Perhaps other libvirt maintainers with better autotools skills can give some hints.

Regards,
Jim
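The ordering observed in the log above can be checked mechanically. The following is a minimal sketch, assuming the log is available as a list of lines; the helper names (first_match, relink_ran_before_install) and the two patterns are made up for this example and are not part of osstest or libvirt. The sample lines are shortened copies of lines from the 123096 log quoted in this thread.

```python
import re

# Patterns chosen for illustration only.
RELINK_FAILURE = re.compile(r"cannot find -lvirt")
LIBVIRT_INSTALLED = re.compile(r"install -c \.libs/libvirt\.lai ")


def first_match(lines, pattern):
    """Return the 1-based number of the first line matching pattern, or None."""
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            return number
    return None


def relink_ran_before_install(lines):
    """True if a relink failed before libvirt.la was installed (or it never was)."""
    failure = first_match(lines, RELINK_FAILURE)
    installed = first_match(lines, LIBVIRT_INSTALLED)
    if failure is None:
        return False
    return installed is None or failure < installed


sample_log = """\
libtool: install: warning: relinking `lockd.la'
/usr/bin/ld: cannot find -lvirt
collect2: error: ld returned 1 exit status
libtool: install: error: relink `lockd.la' with the above command before installing it
Makefile:6410: recipe for target 'install-lockdriverLTLIBRARIES' failed
libtool: install: /usr/bin/install -c .libs/libvirt.lai /home/osstest/build.123096.build-i386-libvirt/dist/usr/local/lib/libvirt.la
""".splitlines()

print(relink_ran_before_install(sample_log))
```

Running the same check over the logs of passing builds would show whether they differ only in this ordering, which is what a race in the install phase would predict.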

On Thu, May 24, 2018 at 15:52:55 -0600, Jim Fehlig wrote:
There appears to be a missing dependency between the lockd library and libvirt library, but my autotools skills lack the savvy to find it. Here we see the install command and relinking of lockd.la
I hit the same race twice on aarch64 and ppc64 and I can confirm the installation phase fails if libvirt.la is installed later than libraries which link to it. However, the dependencies seem to be set correctly in the Makefiles. But it looks like they are only honored when linking the library during the build phase. During make install libvirt.la and libraries which link to it are installed independently. That is, install-modLTLIBRARIES does not depend on anything except for the mod_LTLIBRARIES themselves. Thus when libtool decides to relink the libraries libvirt.la may still be missing at this point. Manually changing

    install-modLTLIBRARIES: $(mod_LTLIBRARIES)

to

    install-modLTLIBRARIES: $(mod_LTLIBRARIES) install-libLTLIBRARIES

fixed the problem for me (tested with an artificial delay added to install-libLTLIBRARIES target), but I have no idea how to persuade automake to generate something like that for us.

Eric, is my investigation correct and do you have any ideas on how to fix the race?

Jirka
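The analysis above can be illustrated with a toy model of the dependency graph. This is only a sketch: the node names build-libvirt.la and build-modules, and the assumption that install-libLTLIBRARIES depends only on the built libvirt.la, are hypothetical simplifications, and "any order that respects the prerequisites" is a coarse stand-in for what make -j may do.

```python
from itertools import permutations

TARGETS = [
    "build-libvirt.la",        # hypothetical name for building libvirt.la
    "build-modules",           # hypothetical name for building $(mod_LTLIBRARIES)
    "install-libLTLIBRARIES",
    "install-modLTLIBRARIES",
]


def valid_orders(deps):
    """Yield every ordering of TARGETS in which prerequisites come first."""
    for order in permutations(TARGETS):
        position = {target: i for i, target in enumerate(order)}
        if all(position[pre] < position[target]
               for target, pres in deps.items() for pre in pres):
            yield order


def race_possible(deps):
    """True if some valid order installs the modules before libvirt.la."""
    return any(
        order.index("install-modLTLIBRARIES")
        < order.index("install-libLTLIBRARIES")
        for order in valid_orders(deps)
    )


# As described in the message: the build-phase dependency is honored,
# but the two install targets are independent of each other.
generated = {
    "build-modules": ["build-libvirt.la"],
    "install-libLTLIBRARIES": ["build-libvirt.la"],
    "install-modLTLIBRARIES": ["build-modules"],
}

# The manual change: install-modLTLIBRARIES also waits for install-libLTLIBRARIES.
patched = dict(generated)
patched["install-modLTLIBRARIES"] = ["build-modules", "install-libLTLIBRARIES"]

print("race possible as generated:", race_possible(generated))
print("race possible when patched:", race_possible(patched))
```

In this model the extra prerequisite removes every ordering in which the modules are relinked before libvirt.la is in place, which matches the result of the manual experiment.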

On 06/12/2018 06:11 AM, Jiri Denemark wrote:
I hit the same race twice on aarch64 and ppc64 and I can confirm the installation phase fails if libvirt.la is installed later than libraries which link to it. However, the dependencies seem to be set correctly in the Makefiles. But it looks like they are only honored when linking the library during the build phase. During make install libvirt.la and libraries which link to it are installed independently. That is, install-modLTLIBRARIES does not depend on anything except for the mod_LTLIBRARIES themselves. Thus when libtool decides to relink the libraries libvirt.la may still be missing at this point. Manually changing
install-modLTLIBRARIES: $(mod_LTLIBRARIES)
to
install-modLTLIBRARIES: $(mod_LTLIBRARIES) install-libLTLIBRARIES
fixed the problem for me (tested with an artificial delay added to install-libLTLIBRARIES target), but I have no idea how to persuade automake to generate something like that for us.
Eric, is my investigation correct and do you have any ideas on how to fix the race?
Can you add that line directly into Makefile.am, or does doing that cause automake to complain and/or omit its normal rules because it thinks you are overriding its defaults? I know that getting automake to add a dependency is not always trivial, but it should be possible (my strengths lie more on autoconf than on automake).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

On Tue, Jun 12, 2018 at 07:57:40 -0500, Eric Blake wrote:
Can you add that line directly into Makefile.am, or does doing that cause automake to complain and/or omit its normal rules because it thinks you are overriding its defaults?
Yeah. It doesn't complain, but it omits its normal install-modLTLIBRARIES rule, which means nothing will be installed. However, the error is still reported, so there are other libraries which are not in mod_LTLIBRARIES affected too.

I also tried adding install-modLTLIBRARIES-local target, but it didn't work either since automake doesn't use this target (well I didn't really hope it would work, but I tried it anyway).

It's not really surprising that bisecting found the following commit which introduced the race, but I'm not really sure how to fix it. Isn't this a bug in automake? :-)

commit 21639744f6371db0bfa1bd0d21fe5c51c6d6878a
Author: Daniel P. Berrangé <berrange@redhat.com>
Date:   Thu Jan 25 09:35:56 2018 +0000

    build: explicitly link all modules with libvirt.so

    The dlopened modules we currently build all use various symbols from libvirt.so, but don't actually link to it. They rely on the libvirtd daemon re-exporting the libvirt.so symbols. This means that at the time the modules are linked, they contain a huge number of undefined symbols. It also means that these undefined symbols are not versioned, so despite us providing a LIBVIRT_PRIVATE_XXXX version that intentionally changes on every release, the loadable modules could actually be loaded into any libvirtd regardless of version.

    This change explicitly links all modules against libvirt.so so that they don't rely on the re-export behave and can be fully resolved at build time. This will give us a stronger guarantee modules will actually be loadable at runtime and that we're using modules from the matched build.

    Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Jirka

On Tue, Jun 12, 2018 at 04:16:53PM +0200, Jiri Denemark wrote:
On Tue, Jun 12, 2018 at 07:57:40 -0500, Eric Blake wrote:
Can you add that line directly into Makefile.am, or does doing that cause automake to complain and/or omit its normal rules because it thinks you are overriding its defaults?
Yeah. It doesn't complain, but it omits its normal install-modLTLIBRARIES rule, which means nothing will be installed. However, the error is still reported, so there are other libraries which are not in mod_LTLIBRARIES affected too.
What I find strange is that automake has chosen to wire up install-modLTLIBRARIES to the install-data-am target, instead of the install-exec-am target.

    mod_LTLIBRARIES = ....
    moddir = $(libdir)/libvirt/connection-driver
    ...
    mod_LTLIBRARIES += libvirt_driver_lxc.la

I would have expected the _LTLIBRARIES suffix to cause it to be wired into the install-exec-am target.
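One possible explanation, as far as I recall the Automake manual's description of the two parts of install: the phase is picked from the directory prefix (here "mod", from moddir) rather than from the _LTLIBRARIES primary. The sketch below restates that rule; the function name install_phase is hypothetical and the prefix list is quoted from memory, so treat it as an assumption to be checked against the manual.

```python
# Standard directory prefixes that, per the Automake manual as I recall it,
# are handled by install-exec. Everything else goes to install-data unless
# the prefix name contains "exec".
EXEC_PREFIXES = {
    "bin", "sbin", "libexec", "sysconf", "localstate", "lib", "pkglib",
}


def install_phase(prefix):
    """Return the install phase assumed for a directory prefix such as 'lib'."""
    if prefix in EXEC_PREFIXES or "exec" in prefix:
        return "install-exec"
    return "install-data"


for prefix in ("lib", "mod", "modexec"):
    print(f"{prefix}_LTLIBRARIES -> {install_phase(prefix)}")
```

Under that reading, lib_LTLIBRARIES and mod_LTLIBRARIES end up in different halves of the install, which is consistent with the wiring described above.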
I also tried adding install-modLTLIBRARIES-local target, but it didn't work either since automake doesn't use this target (well I didn't really hope it would work, but I tried it anyway).
It's not really surprising that bisecting found the following commit which introduced the race, but I'm not really sure how to fix it. Isn't this a bug in automake? :-)
The attractive big hammer solution is to stop using libtool entirely and create shared libraries directly with gcc -shared, thus getting rid of the stupid shell wrapper scripts & relinking that libtool does....
Regards,
Daniel

Jiri Denemark writes ("Re: [libvirt] Likely build race, "/usr/bin/ld: cannot find -lvirt""):
On Tue, Jun 12, 2018 at 07:57:40 -0500, Eric Blake wrote:
Can you add that line directly into Makefile.am, or does doing that cause automake to complain and/or omit its normal rules because it thinks you are overriding its defaults?
Yeah. It doesn't complain, but it omits its normal install-modLTLIBRARIES rule, which means nothing will be installed. However, the error is still reported, so there are other libraries which are not in mod_LTLIBRARIES affected too.
I did a web search for "automake add dependency" and ended up at a stackmumble page which referred to EXTRA_a_DEPENDENCIES. See (automake-1) Program and Library Variables which says

  'maude_DEPENDENCIES'
  'EXTRA_maude_DEPENDENCIES'
     It is also occasionally useful to have a target (program or library) depend on some other file that is not actually part of that target. This can be done using the '_DEPENDENCIES' variable. Each target depends on the contents of such a variable, but no further interpretation is done.

     Since these dependencies are associated to the link rule used to create the programs they should normally list files used by the link command. That is '*.$(OBJEXT)', '*.a', or '*.la' files for programs; '*.lo' and '*.la' files for Libtool libraries; and '*.$(OBJEXT)' files for static libraries. In rare cases you may need to add other kinds of files such as linker scripts, but _listing a source file in '_DEPENDENCIES' is wrong_. If some source file needs to be built before all the components of a program are built, consider using the 'BUILT_SOURCES' variable (*note Sources::).

     If '_DEPENDENCIES' is not supplied, it is computed by Automake. The automatically-assigned value is the contents of '_LDADD' or '_LIBADD', with most configure substitutions, '-l', '-L', '-dlopen' and '-dlpreopen' options removed. The configure substitutions that are left in are only '$(LIBOBJS)' and '$(ALLOCA)'; these are left because it is known that they will not cause an invalid value for '_DEPENDENCIES' to be generated.

     '_DEPENDENCIES' is more likely used to perform conditional compilation using an 'AC_SUBST' variable that contains a list of objects. *Note Conditional Sources::, and *note Conditional Libtool Sources::.

     The 'EXTRA_*_DEPENDENCIES' variable may be useful for cases where you merely want to augment the 'automake'-generated '_DEPENDENCIES' variable rather than replacing it.

Ian.
participants (5)
- Daniel P. Berrangé
- Eric Blake
- Ian Jackson
- Jim Fehlig
- Jiri Denemark