Has anyone tried to use gold instead of ld?
gold promises to be much faster than ld, so it may help speed up test cycles for large C++ applications, but can it be used as a drop-in replacement for ld?
Can gcc/g++ directly call gold?
Are there any known bugs or problems?
Although gold has been part of GNU binutils for a while, I have found almost no "success stories" or even "Howtos" on the Web.
(Update: added links to gold and blog entry explaining it)
At the moment I am using it to compile bigger projects on Ubuntu 10.04. There you can install and integrate it easily with the binutils-gold package (if you remove that package, you get your old ld back). GCC will then use gold automatically.
Some experiences:
gold doesn't search in /usr/local/lib
gold doesn't assume libs like pthread or rt, had to add them by hand
it is faster and needs less memory (the latter is important on big C++ projects with a lot of boost etc.)
What does not work: it cannot compile kernel code and therefore no kernel modules. Ubuntu does this automatically via DKMS when it updates proprietary drivers like fglrx. This fails with ld-gold (you have to remove gold, restart DKMS, then reinstall ld-gold).
As it took me a little while to find out how to selectively use gold (i.e. not system-wide using a symlink), I'll post the solution here. It's based on http://code.google.com/p/chromium/wiki/LinuxFasterBuilds#Linking_using_gold .
Make a directory where you can put a gold glue script. I am using ~/bin/gold/.
Put the following glue script there and name it ~/bin/gold/ld:
#!/bin/bash
gold "$@"
Obviously, make it executable, chmod a+x ~/bin/gold/ld.
Change your calls to gcc to gcc -B$HOME/bin/gold which makes gcc look in the given directory for helper programs like ld and thus uses the glue script instead of the system-default ld.
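As a minimal sketch of the steps above, the glue script can also be generated by a small shell function. The function name make_ld_shim and its optional second argument are assumptions made for illustration; they are not part of the Chromium instructions.

```shell
# make_ld_shim DIR [LINKER]
# Writes DIR/ld, a wrapper that forwards every argument unchanged to LINKER
# (default: gold). The function name is made up for this example.
make_ld_shim() {
    local dir="$1"
    local linker="${2:-gold}"
    mkdir -p "$dir"
    # "$@" expands to all arguments, each kept as a separate word
    # ("$#" would only be the argument count).
    printf '#!/bin/bash\nexec %s "$@"\n' "$linker" > "$dir/ld"
    chmod a+x "$dir/ld"
}
```

Typical use would be make_ld_shim "$HOME/bin/gold", followed by compiling with gcc -B$HOME/bin/gold as described above.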
Can gcc/g++ directly call gold?
Just to complement the other answers: there is a gcc option, -fuse-ld=gold (see the gcc docs). However, AFAIK, gcc can be configured at build time in such a way that the option has no effect.
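Since the option may not work with every gcc build, a build script can probe for it before relying on it. The following is only a sketch: the function name pick_linker_flag is made up, and it assumes the compiler driver forwards -Wl,--version to the selected linker and exits with status 0 when that linker is usable.

```shell
# pick_linker_flag CC
# Prints the first -fuse-ld= flag that CC accepts, preferring lld over gold.
# Prints nothing and returns 1 if neither works.
# Assumption: "CC -fuse-ld=NAME -Wl,--version" succeeds only if NAME is usable.
pick_linker_flag() {
    local cc="$1" name
    for name in lld gold; do
        if "$cc" "-fuse-ld=$name" -Wl,--version >/dev/null 2>&1; then
            printf '%s\n' "-fuse-ld=$name"
            return 0
        fi
    done
    return 1
}
```

A makefile or script could then use something like LDFLAGS="$(pick_linker_flag gcc)" and fall back to the default linker when the result is empty.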
Minimal synthetic benchmark: LD vs gold vs LLVM LLD
Outcome:
gold was about 3x to 4x faster for all values I've tried when using -Wl,--threads -Wl,--thread-count=$(nproc) to enable multithreading
LLD was about 2x faster than gold!
Tested on:
Ubuntu 20.04, GCC 9.3.0, binutils 2.34, sudo apt install lld LLD 10
Lenovo ThinkPad P51 laptop, Intel Core i7-7820HQ CPU (4 cores / 8 threads), 2x Samsung M471A2K43BB1-CRC RAM (2x 16GiB), Samsung MZVLB512HAJQ-000L7 SSD (3,000 MB/s).
Simplified description of the benchmark parameters:
1: number of object files providing symbols
2: number of symbols per symbol provider object file
3: number of object files using all provided symbols
Results for different benchmark parameters:
10000 10 10
nogold: wall=4.35s user=3.45s system=0.88s 876820kB
gold: wall=1.35s user=1.72s system=0.46s 739760kB
lld: wall=0.73s user=1.20s system=0.24s 625208kB
1000 100 10
nogold: wall=5.08s user=4.17s system=0.89s 924040kB
gold: wall=1.57s user=2.18s system=0.54s 922712kB
lld: wall=0.75s user=1.28s system=0.27s 664804kB
100 1000 10
nogold: wall=5.53s user=4.53s system=0.95s 962440kB
gold: wall=1.65s user=2.39s system=0.61s 987148kB
lld: wall=0.75s user=1.30s system=0.25s 704820kB
10000 10 100
nogold: wall=11.45s user=10.14s system=1.28s 1735224kB
gold: wall=4.88s user=8.21s system=0.95s 2180432kB
lld: wall=2.41s user=5.58s system=0.74s 2308672kB
1000 100 100
nogold: wall=13.58s user=12.01s system=1.54s 1767832kB
gold: wall=5.17s user=8.55s system=1.05s 2333432kB
lld: wall=2.79s user=6.01s system=0.85s 2347664kB
100 1000 100
nogold: wall=13.31s user=11.64s system=1.62s 1799664kB
gold: wall=5.22s user=8.62s system=1.03s 2393516kB
lld: wall=3.11s user=6.26s system=0.66s 2386392kB
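To make the speedups concrete, the wall-clock ratios can be computed directly from the numbers above. This is a small sketch using awk for the floating-point division; the helper name ratio is made up for the example.

```shell
# ratio A B: print A/B rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Wall times taken from the "10000 10 10" run above.
echo "ld/gold:  $(ratio 4.35 1.35)"   # ld/gold:  3.22
echo "gold/lld: $(ratio 1.35 0.73)"   # gold/lld: 1.85
```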
This is the script that generates all the objects for the link tests:
generate-objects
#!/usr/bin/env bash
set -eu
# CLI args.
# Each of those files contains n_ints_per_file ints.
n_int_files="${1:-10}"
n_ints_per_file="${2:-10}"
# Each function adds all ints from all files.
# This leads to n_int_files x n_ints_per_file x n_funcs relocations.
n_funcs="${3:-10}"
# Do a debug build, since it is for debug builds that link time matters the most,
# as the user will be recompiling often.
cflags='-ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic'
# Clean up previously generated files and objects.
./clean
# Generate i_*.c, ints.h and int_sum.h
rm -f ints.h
echo 'return' > int_sum.h
int_file_i=0
while [ "$int_file_i" -lt "$n_int_files" ]; do
int_i=0
int_file="${int_file_i}.c"
rm -f "$int_file"
while [ "$int_i" -lt "$n_ints_per_file" ]; do
echo "${int_file_i} ${int_i}"
int_sym="i_${int_file_i}_${int_i}"
echo "unsigned int ${int_sym} = ${int_file_i};" >> "$int_file"
echo "extern unsigned int ${int_sym};" >> ints.h
echo "${int_sym} +" >> int_sum.h
int_i=$((int_i + 1))
done
int_file_i=$((int_file_i + 1))
done
echo '1;' >> int_sum.h
# Generate funcs.h and main.c.
rm -f funcs.h
cat <<EOF >main.c
#include "funcs.h"
int main(void) {
return
EOF
i=0
while [ "$i" -lt "$n_funcs" ]; do
func_sym="f_${i}"
echo "${func_sym}() +" >> main.c
echo "int ${func_sym}(void);" >> funcs.h
cat <<EOF >"${func_sym}.c"
#include "ints.h"
int ${func_sym}(void) {
#include "int_sum.h"
}
EOF
i=$((i + 1))
done
cat <<EOF >>main.c
1;
}
EOF
# Generate *.o
ls | grep -E '\.c$' | parallel --halt now,fail=1 -t --will-cite "gcc $cflags -c -o '{.}.o' '{}'"
GitHub upstream.
Note that the object file generation can be quite slow, since each C file can be quite large.
Given an input of type:
./generate-objects [n_int_files [n_ints_per_file [n_funcs]]]
it generates:
main.c
#include "funcs.h"
int main(void) {
return f_0() + f_1() + ... + f_<n_funcs>();
}
f_0.c, f_1.c, ..., f_<n_funcs>.c
extern unsigned int i_0_0;
extern unsigned int i_0_1;
...
extern unsigned int i_1_0;
extern unsigned int i_1_1;
...
extern unsigned int i_<n_int_files>_<n_ints_per_file>;
int f_0(void) {
return
i_0_0 +
i_0_1 +
...
i_1_0 +
i_1_1 +
...
i_<n_int_files>_<n_ints_per_file>
}
0.c, 1.c, ..., <n_int_files>.c
unsigned int i_0_0 = 0;
unsigned int i_0_1 = 0;
...
unsigned int i_0_<n_ints_per_file> = 0;
which leads to:
n_int_files x n_ints_per_file x n_funcs
relocations on the link.
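As a quick worked example of that product, here is the relocation count for two of the parameter sets used in the results. The helper name relocation_count is made up for the example.

```shell
# relocation_count N_INT_FILES N_INTS_PER_FILE N_FUNCS
relocation_count() {
    echo $(( $1 * $2 * $3 ))
}

relocation_count 10000 10 10     # 1000000
relocation_count 10000 10 100    # 10000000
```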
Then I compared:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main *.o
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -fuse-ld=gold -Wl,--threads -Wl,--thread-count=`nproc` -o main *.o
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -fuse-ld=lld -o main *.o
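The timing harness that produced the wall/user/system lines is not shown here. As a rough sketch, assuming bash, its built-in time keyword and the TIMEFORMAT variable can produce lines of a similar shape (peak memory is not reported by this approach). The function name time_link is made up.

```shell
# time_link LABEL CMD [ARGS...]
# Runs CMD with its own output discarded and prints one line such as
#   gold: wall=1.35s user=1.72s system=0.46s
# Requires bash: the `time` keyword and TIMEFORMAT are bash features.
time_link() {
    local label="$1"
    shift
    local TIMEFORMAT="${label}: wall=%2Rs user=%2Us system=%2Ss"
    # The timing report goes to the shell's stderr, so redirect the group.
    { time "$@" >/dev/null 2>&1; } 2>&1
}
```

For example, time_link gold gcc -fuse-ld=gold -o main *.o would time the gold link from the list above.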
Some limits I've been trying to mitigate when selecting the test parameters:
at 100k C files, both methods get failed mallocs occasionally
GCC cannot compile a function with 1M additions
I have also observed a 2x speedup in the debug build of gem5: https://gem5.googlesource.com/public/gem5/+/fafe4e80b76e93e3d0d05797904c19928587f5b5
Similar question: https://unix.stackexchange.com/questions/545699/what-is-the-gold-linker
Phoronix benchmarks
Phoronix did some benchmarking in 2017 for some real world projects, but for the projects they examined, the gold gains were not so significant: https://www.phoronix.com/scan.php?page=article&item=lld4-linux-tests&num=2 (archive).
Known incompatibilities
gold
https://sourceware.org/bugzilla/show_bug.cgi?id=23869 gold failed if I do a partial link with LD and then try the final link with gold. lld worked on the same test case.
https://github.com/cirosantilli/linux-kernel-module-cheat/issues/109 my debug symbols appeared broken in some places
LLD benchmarks
At https://lld.llvm.org/ they give build times for a few well-known projects, with results similar to my synthetic benchmarks. Unfortunately, project and linker versions are not given. In their results:
gold was about 3x/4x faster than LD
LLD was 3x/4x faster than gold, so a greater speedup than in my synthetic benchmark
They comment:
This is a link time comparison on a 2-socket 20-core 40-thread Xeon E5-2680 2.80 GHz machine with an SSD drive. We ran gold and lld with or without multi-threading support. To disable multi-threading, we added -no-threads to the command lines.
and results look like:
Program | Size | GNU ld | gold -j1 | gold | lld -j1 | lld
-------------|----------|---------|----------|---------|---------|-------
ffmpeg dbg | 92 MiB | 1.72s | 1.16s | 1.01s | 0.60s | 0.35s
mysqld dbg | 154 MiB | 8.50s | 2.96s | 2.68s | 1.06s | 0.68s
clang dbg | 1.67 GiB | 104.03s | 34.18s | 23.49s | 14.82s | 5.28s
chromium dbg | 1.14 GiB | 209.05s | 64.70s | 60.82s | 27.60s | 16.70s
As a Samba developer, I have been using the gold linker almost exclusively on Ubuntu, Debian, and Fedora for several years now. My assessment:
gold is many times (felt: 5-10 times) faster than the classical linker.
Initially, there were a few problems, but they have been gone since roughly Ubuntu 12.04.
The gold linker even found some dependency problems in our code, since it seems to be more correct than the classical one with respect to some details. See, e.g. this Samba commit.
I have not used gold selectively, but have been using symlinks or the alternatives mechanism if the distribution provides it.
You could link ld to gold (in a local binary directory if you have ld installed to avoid overwriting):
ln -s `which gold` ~/bin/ld
or
ln -s `which gold` /usr/local/bin/ld
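A slightly more defensive variant of the symlink approach is sketched below. It resolves the target through PATH and refuses to replace an existing ld in the chosen directory. The function name link_as_ld is made up, and the directory must come before the system linker's directory in PATH for the symlink to take effect.

```shell
# link_as_ld TARGET DIR
# Symlinks DIR/ld to the full path of TARGET, refusing to replace an
# existing DIR/ld. The function name is made up for this example.
link_as_ld() {
    local target dir="$2"
    target="$(command -v "$1")" || return 1
    case "$target" in
        /*) ;;             # found as a file on PATH
        *) return 1 ;;     # a builtin, alias, or function: nothing to link
    esac
    if [ -e "$dir/ld" ] || [ -L "$dir/ld" ]; then
        echo "refusing to overwrite $dir/ld" >&2
        return 1
    fi
    mkdir -p "$dir"
    ln -s "$target" "$dir/ld"
}
```

For example, link_as_ld gold "$HOME/bin" corresponds to the first ln command above.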
Some projects seem to be incompatible with gold, because of some incompatible differences between ld and gold. Example: OpenFOAM, see http://www.openfoam.org/mantisbt/view.php?id=685 .
DragonFlyBSD switched over to gold as their default linker. So it seems to be ready for a variety of tools.
More details:
http://phoronix.com/scan.php?page=news_item&px=DragonFlyBSD-Gold-Linker
I'm trying to use packages that require Rcpp in R on my M1 Mac, which I was never able to get up and running after purchasing this computer. I updated it to Monterey in the hope that this would fix some installation issues but it hasn't. I tried running the Rcpp check from this page but I get the following error:
> Rcpp::sourceCpp("~/github/helloworld.cpp")
ld: warning: directory not found for option '-L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.2.0/11.0.0'
ld: warning: directory not found for option '-L/opt/R/arm64/gfortran/lib'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [sourceCpp_4.so] Error 1
clang++ -arch arm64 -std=gnu++14 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include -I"/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/RcppArmadillo/include" -I"/Users/afredston/github" -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c helloworld.cpp -o helloworld.o
clang++ -arch arm64 -std=gnu++14 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o sourceCpp_4.so helloworld.o -L/Library/Frameworks/R.framework/Resources/lib -lRlapack -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.2.0/11.0.0 -L/opt/R/arm64/gfortran/lib -lgfortran -lemutls_w -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
Error in Rcpp::sourceCpp("~/github/helloworld.cpp") :
Error 1 occurred building shared library.
I get that it can't "find" gfortran. I installed this release of gfortran for Monterey. When I type which gfortran into Terminal, it returns /opt/homebrew/bin/gfortran. (Maybe this version of gfortran requires Xcode tools that are too new—it says something about 13.2 and when I run clang --version it says 13.0—but I don't see another release of gfortran for Monterey?)
I also appended /opt/homebrew/bin: to PATH in R so it looks like this now:
> Sys.getenv("PATH")
[1] "/opt/homebrew/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/Library/TeX/texbin:/Applications/RStudio.app/Contents/MacOS/postback"
Other things I checked:
Xcode command line tools is installed (which clang returns /usr/bin/clang).
Files ~/.R/Makevars and ~/.Renviron don't exist.
Here's my session info:
R version 4.1.1 (2021-08-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.1.1 tools_4.1.1 RcppArmadillo_0.10.7.5.0
[4] Rcpp_1.0.7
Background
Currently (2023-02-20), CRAN builds R 4.2 binaries for Apple silicon using Apple Clang from Command Line Tools for Xcode 13.1 and using an experimental fork of GNU Fortran 12.
If you obtain R from CRAN (i.e., here), then you need to replicate CRAN's compiler setup on your system before building R packages that contain C/C++/Fortran code from their sources (and before using Rcpp, etc.). This requirement ensures that your package builds are compatible with R itself.
A further complication is the fact that Apple Clang doesn't support OpenMP, so you need to do even more work to compile programs that make use of multithreading. You could circumvent the issue by building R itself, all R packages, and all external libraries from sources with LLVM Clang, which does support OpenMP, but that approach is onerous and "for experts only".
There is another approach that has been tested by a few people, including Simon Urbanek, the maintainer of R for macOS. It is experimental and also "for experts only", but it works on my machine and is much simpler than learning to build R and other libraries yourself.
Instructions for obtaining a working toolchain
Warning: These come with no warranty and could break at any time. Some level of familiarity with C/C++/Fortran program compilation, Makefile syntax, and Unix shells is assumed. Everyone is encouraged to consult official documentation, which is more likely to be maintained than answers on SO. As usual, sudo at your own risk.
I will try to address compilers and OpenMP support at the same time. I am going to assume that you are starting from nothing. Feel free to skip steps you've already taken, though you might find a fresh start helpful.
I've tested these instructions on a machine running Big Sur, but they should also work on Monterey and Ventura.
Download an R 4.2 binary from CRAN here and install. Be sure to select the binary built for Apple silicon.
Run
$ sudo xcode-select --install
in Terminal to install the latest release version of Apple's Command Line Tools for Xcode, which includes Apple Clang. You can obtain earlier versions from your browser here. However, the version that you install should not be older than the one that CRAN used to build your R binary.
Download the GNU Fortran binary provided here and install by unpacking to root:
$ curl -LO https://mac.r-project.org/tools/gfortran-12.0.1-20220312-is-darwin20-arm64.tar.xz
$ sudo tar xvf gfortran-12.0.1-20220312-is-darwin20-arm64.tar.xz -C /
$ sudo ln -sfn $(xcrun --show-sdk-path) /opt/R/arm64/gfortran/SDK
The last command updates a symlink inside of the installation so that it points to the SDK inside of your Command Line Tools installation.
Download an OpenMP runtime suitable for your Apple Clang version here and install by unpacking to root. You can query your Apple Clang version with clang --version. For example, I have version 1300.0.29.3, so I did:
$ curl -LO https://mac.r-project.org/openmp/openmp-12.0.1-darwin20-Release.tar.gz
$ sudo tar xvf openmp-12.0.1-darwin20-Release.tar.gz -C /
After unpacking, you should find these files on your system:
/usr/local/lib/libomp.dylib
/usr/local/include/ompt.h
/usr/local/include/omp.h
/usr/local/include/omp-tools.h
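To confirm that the unpacking step worked, the four paths can be checked in one go. This is a sketch only: the function name missing_files and its optional ROOT prefix (useful for trying it out against a scratch directory) are assumptions, not part of the official instructions.

```shell
# missing_files [ROOT]
# Prints each expected OpenMP file that is absent under ROOT (default: /).
missing_files() {
    local root="${1:-}" f
    for f in /usr/local/lib/libomp.dylib \
             /usr/local/include/ompt.h \
             /usr/local/include/omp.h \
             /usr/local/include/omp-tools.h; do
        [ -e "$root$f" ] || printf '%s\n' "$f"
    done
}
```

Running missing_files with no argument should print nothing if the runtime was unpacked correctly.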
Add the following lines to ~/.R/Makevars, creating the file if necessary.
CPPFLAGS += -I/usr/local/include -Xclang -fopenmp
LDFLAGS += -L/usr/local/lib -lomp
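If you prefer to script this step, the sketch below appends a line to a file only when that exact line is not already present, so rerunning it does not duplicate the flags. The function name append_once is made up for the example.

```shell
# append_once FILE LINE
# Creates FILE (and its directory) if needed, then appends LINE unless an
# identical line already exists.
append_once() {
    local file="$1" line="$2"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For the two lines above this would be append_once ~/.R/Makevars 'CPPFLAGS += -I/usr/local/include -Xclang -fopenmp' and append_once ~/.R/Makevars 'LDFLAGS += -L/usr/local/lib -lomp'.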
Test that you are able to use R to compile a C or C++ program with OpenMP support while linking relevant libraries from the GNU Fortran installation (indicated by the -l flags in the output of R CMD CONFIG FLIBS).
The most transparent approach is to use R CMD SHLIB directly. In a temporary directory, create an empty source file omp_test.c and add the following lines:
#ifdef _OPENMP
# include <omp.h>
#endif
#include <Rinternals.h>
SEXP omp_test(void)
{
#ifdef _OPENMP
Rprintf("OpenMP threads available: %d\n", omp_get_max_threads());
#else
Rprintf("OpenMP not supported\n");
#endif
return R_NilValue;
}
Compile it:
$ R CMD SHLIB omp_test.c $(R CMD CONFIG FLIBS)
Then call the compiled C function from R:
$ R -e 'dyn.load("omp_test.so"); invisible(.Call("omp_test"))'
OpenMP threads available: 8
If the compiler or linker throws an error, or if you find that OpenMP is still not supported, then one of us has made a mistake. Please report any issues.
Note that you can implement the same test using Rcpp, if you don't mind installing it:
library(Rcpp)
registerPlugin("flibs", Rcpp.plugin.maker(libs = "$(FLIBS)"))
sourceCpp(code = '
#ifdef _OPENMP
# include <omp.h>
#endif
#include <Rcpp.h>
// [[Rcpp::plugins(flibs)]]
// [[Rcpp::export]]
void omp_test()
{
#ifdef _OPENMP
Rprintf("OpenMP threads available: %d\\n", omp_get_max_threads());
#else
Rprintf("OpenMP not supported\\n");
#endif
return;
}
')
omp_test()
OpenMP threads available: 8
References
Everything is a bit scattered:
R Installation and Administration manual [link]
Writing R Extensions manual [link]
R for macOS Developers web page [link]
I resolved this issue by adding a path to the homebrew installation of gfortran to my ~/.R/Makevars following these instructions: https://pat-s.me/transitioning-from-x86-to-arm64-on-macos-experiences-of-an-r-user/#gfortran
I just avoided the issue until macOS had things working more smoothly, so I either use a Windows Developer Virtual Machine (VM) or do my code development in another environment. I'm not too impressed with the updated and "faster" chipset, given that it doesn't work with much. Support is slow to arrive, and workarounds are often a must.
I tested the following process for making multithreaded data.table work on an M2 MacBook Pro (macOS Monterey).
The steps are mostly the same as in this answer by the user inferator.
Download and install R from CRAN
Download and install RStudio with developer tools
Run the following commands in terminal to install OpenMP
curl -O https://mac.r-project.org/openmp/openmp-12.0.1-darwin20-Release.tar.gz
sudo tar fvxz openmp-12.0.1-darwin20-Release.tar.gz -C /
Add compiler flags to connect clang with OpenMP. In Terminal, run the following:
cd ~
mkdir .R
nano .R/Makevars
Inside the opened Makevars file, paste the following lines. Once finished, press Control+O and then Enter to save, then Control+X to close the editor.
CPPFLAGS += -Xclang -fopenmp
LDFLAGS += -lomp
Download and run the installer for gfortran by downloading gfortran-ARM-12.1-Monterey.dmg from the respective GitHub repo
This concludes the steps for enabling OpenMP and (hopefully) Rcpp in R on an M2 system.
Now, for testing that everything works with data.table I did the following
Open RStudio and run
install.packages("data.table", type = "source")
If everything is done correctly, the package should compile without any errors and return the following when running getDTthreads(verbose = TRUE):
OpenMP version (_OPENMP) 201811
omp_get_num_procs() 8
R_DATATABLE_NUM_PROCS_PERCENT unset (default 50)
R_DATATABLE_NUM_THREADS unset
R_DATATABLE_THROTTLE unset (default 1024)
omp_get_thread_limit() 2147483647
omp_get_max_threads() 8
OMP_THREAD_LIMIT unset
OMP_NUM_THREADS unset
RestoreAfterFork true
data.table is using 4 threads with throttle==1024. See ?setDTthreads.
[1] 4
I am currently trying to install NIST's sclite, which is part of SCTK 2.4.0 (github or current version). I am attempting the install on Cygwin in bash. The installation is done using make.
What I've Done
I made a directory for the install and navigated to that directory
mkdir sctk2.4.0
cd sctk2.4.0
( You'll possibly need $ cd /path/to/dir/sctk2.4.0 .)
I cloned the project from github
git clone https://github.com/chinshr/sctk.git
navigated into the base folder
cd sctk
then I started following the instructions in the INSTALL file.
Running
make config
worked fine, but after typing
make all
I got the output that follows
(mkdir -p bin)
(cd src; make all)
make[1]: Entering directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src'
(cd asclite; make all)
make[2]: Entering directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src/asclite'
(cd core; make all)
make[3]: Entering directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src/asclite/core'
g++ -o asclite -g -Os alignment.o segment.o sgml_reportgenerator.o alignedsegmentiterator.o reportgenerator.o speechset.o segmentsgroup.o logger.o tokenalignment.o sgml_generic_reportgenerator.o recording.o statistics.o compressedlevenshteinmatrix.o segmentor.o id.o trntrn_segmentor.o linestyle_inputparser.o inputparser.o levenshteinmatrix.o levenshtein.o uemfilter.o speakermatch.o spkrautooverlap.o graphalignedsegment.o rawsys_reportgenerator.o graphalignedtoken.o timedobject.o stt_scorer.o aligner.o arraylevenshteinmatrix.o graph.o main.o trn_inputparser.o alignedspeech.o token.o alignedsegment.o graph_coordinate.o rttm_inputparser.o scorer.o properties.o ctmstmrttm_segmentor.o filter.o speech.o alignedspeechiterator.o stm_inputparser.o checker.o ctm_inputparser.o lzma/LzFind.o lzma/LzmaEnc.o lzma/Alloc.o lzma/LzmaLib.o lzma/LzmaDec.o -lm
alignment.o: file not recognized: File format not recognized
collect2: error: ld returned 1 exit status
make[3]: *** [makefile:62: asclite] Error 1
make[3]: Leaving directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src/asclite/core'
make[2]: *** [makefile:12: all] Error 2
make[2]: Leaving directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src/asclite'
make[2]: *** [makefile:12: all] Error 2
make[2]: Leaving directory '/cygdrive/c/David/programs/sctk2.4.0/sctk/src'
make: *** [makefile:20: all] Error 2
I've looked at this SO post, but I've determined that the alignment.o file is not corrupted. Just in case, I tried a few make clean and even re-cloned the project from github, but I still get the same error.
Can anyone help me complete this installation, or at least get to the next step?
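One quick way to see what format an object file is actually in (for example, whether it was built for a different platform) is to look at its first bytes. This is only a diagnostic sketch with a made-up function name, obj_format. It recognizes a few common magic numbers and is not a substitute for the file or objdump tools.

```shell
# obj_format FILE
# Prints a rough guess at the object format based on the leading bytes.
obj_format() {
    local magic
    magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
    case "$magic" in
        7f454c46)          echo elf ;;           # 0x7f 'E' 'L' 'F'
        cffaedfe|cefaedfe) echo mach-o ;;        # little-endian Mach-O
        6486*)             echo coff-x86-64 ;;   # COFF machine 0x8664
        4c01*)             echo coff-i386 ;;     # COFF machine 0x014c
        *)                 echo unknown ;;
    esac
}
```

For example, obj_format alignment.o would show whether the file matches the COFF format that the Cygwin linker expects.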
System Details
$ uname -a
CYGWIN_NT-6.1 MyMachine 2.10.0(0.325/5/3) 2018-02-02 15:16 x86_64 Cygwin
$ bash --version
GNU bash, version 4.4.12(3)-release (x86_64-unknown-cygwin) ...
$ gcc --version
gcc (GCC) 6.4.0 ...
$ g++ --version
g++ (GCC) 6.4.0 ...
$ make --version
GNU Make 4.2.1
Built for x86_64-unknown-cygwin ...
$ systeminfo | sed -n 's/^OS\ *//p'
Name: Microsoft Windows 7 Enterprise
Version: 6.1.7601 Service Pack 1 Build 7601
Manufacturer: Microsoft Corporation
Configuration: Member Workstation
Build Type: Multiprocessor Free
Note
I'm asking about this problem and then giving an answer to my own question. (I like that StackOverflow is allowing us to do that.) Hopefully, this will make it easier for people to help me with the problems I ran into further in the installation.
The next problem I ran into is discussed here. You can see this next problem in the answer to this problem.
This is the 'EASIER' solution.
Here are the details on what I called "the kaldi solution". Right now, it's just a list of commands without details. As shown here, these commands will install a sclite-2.4.10 directory under the $HOME (~) directory:
$ cd
$ git clone https://github.com/kaldi-asr/kaldi.git
$ cd kaldi/tools
$ extras/check_dependencies.sh
$ make -j $(nproc --all)
$ cp -R sctk-2.4.10 ~/
$ cd
$ rm -rf kaldi
$ cd sctk-2.4.10/
$ cp $HOME/.bashrc "${HOME}/.bashrc.$(date +%Y%m%d-%H%M%S).bak"
$ echo -e "\n\n## Allow access to sclite, rfilter, etc" >> $HOME/.bashrc
$ echo 'export PATH='"$(pwd)/bin"':$PATH' >> $HOME/.bashrc
$ source ~/.bashrc
See this question/answer for details on how to use it on Windows.
(See my comment under the question for the kaldi solution.)
The solution to this problem was in the README, as solutions often are. Note: There was another problem which came up after this problem was solved. See the bottom of this answer for help with that.
Here is the command I used to get the pertinent info from the README.
cat README | tail -13
and here is the pertinent info
64 bits Compilation
With big alignments, sctk needs to be compiled in 64 bits.
By default, the C/C++ software are compiled in 32 bits with the options (-Os)
but can be compiled in 64 bits, -m64 is added to the CFLAGS variable in:
src/asclite/core/makefile
src/asclite/test/makefile
src/rfilter1/makefile
src/sclite/makefile
Example of CFLAGS:
For OSX 10.4+: -fast -m64 -arch x86_64 -arch ppc64
So, I went to the makefiles listed (except rfilter1, see below) and changed the code there, replacing each -Os with -m64. Do this ONLY for the makefiles that are listed. I'll give an example for one of the listed files, but note you will have to do it for the others.
cd sctk
vim src/asclite/core/makefile
When the file was open, I found the line:
CFLAGS = -g -Os
which I changed to
CFLAGS = -g -m64
(pressed "i" to get into INSERT mode, made the change, pressed "Esc", then pressed ":wq" (Write and Quit) followed by "Enter")
I made the changes in all the listed files EXCEPT src/rfilter1/makefile, because that file had no -Os in it. This ended up being important, as the install wouldn't work if I had changed this file at all.
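The manual edit can also be scripted. The sketch below (function name use_m64 is made up) replaces -Os with -m64 in each file it is given and leaves any file without -Os untouched, which matches the rule of not changing src/rfilter1/makefile.

```shell
# use_m64 FILE...
# Replaces every -Os with -m64 in each FILE; files without -Os are skipped.
use_m64() {
    local f tmp
    for f in "$@"; do
        grep -q -e '-Os' "$f" || continue
        tmp="$(mktemp)"
        # Write through a temporary file instead of relying on sed -i,
        # whose syntax differs between sed implementations.
        sed 's/-Os/-m64/g' "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

From the sctk directory this would be called as use_m64 src/asclite/core/makefile src/asclite/test/makefile src/sclite/makefile.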
After this was completed, I ran make clean, but I DID NOT run make config, because this would have undone the changes I had just made. I went directly to:
make all
This gets us past where we were before.
This problem was taken care of, but there was another problem:
In file included from main.cpp:20:0:
recording.h:122:36: error: template argument 2 is invalid
map<string, Filter::Filter*> filters;
^
recording.h:122:36: error: template argument 4 is invalid
which I asked about here.
AN EVEN EASIER SOLUTION - Taking advantage of a new, edited version of sclite
This is in case someone finds the answer here useful. I know there are no votes here, but I just got a trophy telling me this is my first question to get 1000 views, so I'll update my answer to show the easiest way to get things done.
TL;DR
https://www.nist.gov/itl/iad/mig/tools
https://github.com/usnistgov/SCTK
% cd /the/dir/where/i/want/to/install
% git clone https://github.com/usnistgov/SCTK.git
% cd SCTK
From the git master README (quoted) with some comments I've put in.
% make config
% sed -i 's#[-]Os#-m64#g' src/asclite/core/makefile
% sed -i 's#[-]Os#-m64#g' src/asclite/test/makefile
% sed -i 's#[-]Os#-m64#g' src/sclite/makefile
% make clean
% ## Possible edits to the `rfilter1` makefile are
% ## described at the end of the answer, but they were not necessary
% ## for me.
% make all
% make check
% make install
% make doc
I also add the executables' directory to my PATH and make the documentation available via the man command.
% pwd
/the/dir/where/i/want/to/install/SCTK
% # back up your `.bashrc`
% cp $HOME/.bashrc "${HOME}/.bashrc.$(date +%Y%m%d-%H%M%S).bak"
% # persistent path changes
% echo -e "\n\n## Allow access to sclite, rfilter, etc" >> $HOME/.bashrc
% # your machine might use something other than `export` for this. CHECK!
% echo 'export PATH='"$(pwd)/bin"':$PATH' >> $HOME/.bashrc
% # make changes available this session
% source ~/.bashrc
% # man stuff
% cd doc
% cp -r ./* /usr/man/man1
END OF THE TL;DR SECTION
Details
Since I posted my question in May 2018, the software has finally been updated (in Fall 2018). The updates partly fix the problems I ran into here, but some of the information in the README and in some of the makefiles is still worth noting.
The 32- to 64- bit issue (changing -Os to -m64, as done above) was find-able via the README.
% cat -n README.md | grep -A 4 "64 bits Compilation"
61 **64 bits Compilation**:
62 With big alignments, sctk needs to be compiled in 64 bits. By default, the C/C++ software are compiled in 32 bits with the options (`-Os`) but can be compiled in 64 bits. To do so, `-m64` is added to the CFLAGS variable in `src/asclite/core/makefile`, `src/asclite/test/makefile`, `src/rfilter1/makefile` and `src/sclite/makefile`.
63
64 Example of `CFLAGS` for OSX 10.4+: `-fast -m64 -arch x86_64 -arch ppc64`
65
Here is line 62 with word wrap
With big alignments, sctk needs to be compiled in 64 bits. By default, the C/C++ software are compiled in 32 bits with the options (-Os) but can be compiled in 64 bits. To do so, -m64 is added to the CFLAGS variable in
src/asclite/core/makefile,
src/asclite/test/makefile,
src/rfilter1/makefile and
src/sclite/makefile.
Since there was no -Os in src/rfilter1/makefile, I didn't make any changes.
I was able to finish the installation with no problem (including no failed tests). Here is my system info.
$ uname -a
CYGWIN_NT-10.0 MyMachine 3.0.7(0.338/5/3) 2019-04-30 18:08 x86_64 Cygwin
$ bash --version | head -n 1
GNU bash, version 4.4.12(3)-release (x86_64-unknown-cygwin)
$ gcc --version | head -n 1
gcc (GCC) 7.4.0
$ g++ --version | head -n 1
g++ (GCC) 7.4.0
$ make --version | head -n 2
GNU Make 4.2.1
Built for x86_64-unknown-cygwin
$ systeminfo | sed -n 's/^OS\ *//p'
Name: Microsoft Windows 10 Enterprise
Version: 10.0.17134 N/A Build 17134
Manufacturer: Microsoft Corporation
Configuration: Member Workstation
Build Type: Multiprocessor Free
However, it seems that some other people trying to compile on Cygwin have had issues. Here is some more info from the README
% grep "Special Note to Cygwin users" README.md
*Special Note to Cygwin users:* it has been reported that compilation of `rfilter1` can fail in some case, please read the OPTIONS part of the `rfilter1/makefile` and adapt accordingly before retrying compilation.
Well, let's look at the makefile for rfilter1, and see what some of you might need to do.
% head -n 15 src/rfilter1/makefile | tail -7
########################### Options for compilation #########################
####### If you have an very new version of GCC, the strcmp* family of functions
####### is included in the distribution. Changing the value of OPTIONS to
####### be blank will diable the use of supplied versions of these functions.
####### In particular, this behavior has been noted on some versions of cygwin
OPTIONS=-DNEED_STRCMP=1 $(CPPFLAGS) $(CFLAGS) $(LDFLAGS)
So, if you have rfilter1 compilation problems, change the non-commented line to
OPTIONS= $(CPPFLAGS) $(CFLAGS) $(LDFLAGS)
I have a GNUmakefile that respects CXX and CXXFLAGS. It also performs some platform and architecture tests. Currently, the makefile assumes the host and target are the same:
IS_X86 = $(shell uname -m | $(EGREP) -c "i.86|x86|i86|amd64")
In an effort to improve robustness, I want to ask the tools what it is compiling for. I've come up with the following, but I'm not sure it is correct.
$ export CXX=clang++
$ export CXXFLAGS="-DNDEBUG -g2 -O3 -m32"
$ $CXX $CXXFLAGS -dM -E - < /dev/null | egrep "(i386|x86_64)"
#define __i386 1
#define __i386__ 1
#define i386 1
$ export CXX=clang++
$ export CXXFLAGS="-DNDEBUG -g2 -O3"
$ $CXX $CXXFLAGS -dM -E - < /dev/null | egrep "(i386|x86_64)"
#define __x86_64 1
#define __x86_64__ 1
My question is, will the above - with CXX and CXXFLAGS - work reliably to detect a target? Or do I need something else?
Here are the two reasons I ask. First, my experience with Autotools indicates something different. When Autotools performs a test like the one above, it tests CPP, and sometimes CPP or CXX needs to include --isysroot (or other hacks) to get things configured properly.
Second, some toolchains, like Clang, integrate other components (like a preprocessor or assembler), so I can't use CPP directly under all circumstances.
In fact, doing something as simple as $CXX -Wa,-v - </dev/null (ask assembler for its version) results in an "unsupported option" error under Clang when using its integrated assembler. (Cf., With integrated assembler enabled, fail to fetch version string of assembler).
And just in case: this is not an Autotools or CMake project. It does not use Boost or any other libraries. It's a standalone C++03 project.
My question is, will the above - with CXX and CXXFLAGS - work reliably to detect a target?
The answer is yes, it will. The preprocessor, or the compiler driver passing the options down to the preprocessor, will mostly yield the expected target defines, all else being equal. A notable exception is GCC on ARMv8/AArch64, which is missing a slew of expected defines.
The thing to avoid is uname -m (and friends). Uname reports information on the host, and not the target.
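As a rough sketch of how a makefile or script could consume those defines, the shell function below classifies the output of $CXX $CXXFLAGS -dM -E by the macros shown in the question. The function name classify_target and the set of architectures it recognizes are assumptions made for illustration; they are not part of any tool.

```shell
# classify_target: read "#define ..." lines (the output of "-dM -E") on stdin
# and print a coarse architecture name based on well-known predefined macros.
# The name and the list of architectures are illustrative assumptions.
classify_target() {
    defines=$(cat)
    case "$defines" in
        *"#define __x86_64__ "*)  echo x86_64 ;;
        *"#define __i386__ "*)    echo i386 ;;
        *"#define __aarch64__ "*) echo aarch64 ;;
        *"#define __arm__ "*)     echo arm ;;
        *)                        echo unknown ;;
    esac
}

# Typical use, with CXX and CXXFLAGS taken from the environment:
#   $CXX $CXXFLAGS -dM -E - </dev/null | classify_target
```

Because the function only looks at what the compiler reports, it follows -m32 or a cross compiler automatically, which is exactly what uname -m cannot do.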
I downloaded and built gcc 4.8.1 on my desktop, running 64-bit Ubuntu 12.04. I built it out of source, like the docs recommend, and with the commands
../../gcc-4.8.1/configure --prefix=$HOME --program-suffix=-4.8
make
make -k check
make install
It seemed to pass all the tests, and I installed everything into my home directory w/ the suffix -4.8 to distinguish from the system gcc, which is version 4.6.3.
Unfortunately, when I compile C++ programs using g++-4.8, they link to the system libc and libstdc++ rather than the newer ones built from gcc-4.8.1. I downloaded and built gcc 4.8 because I wanted to play around with the new C++11 features in the standard library, so this behaviour is definitely not what I wanted. What can I do to get gcc-4.8 to automatically link to the standard libraries that came with it rather than the system standard libraries?
When you link with your own gcc, you need to add an extra run-time linker search path with -Wl,-rpath,$(PREFIX)/lib64 so that at run-time the program finds the shared libraries corresponding to your gcc.
I normally create a wrapper named gcc and g++ in the same directory as gcc-4.8 and g++-4.8 which I invoke instead of gcc-4.8 and g++-4.8, as prescribed in Dynamic linker is unable to find GCC libraries:
#!/bin/bash
exec ${0}SUFFIX -Wl,-rpath,PREFIX/lib64 "$@"
When installing SUFFIX and PREFIX should be replaced with what was passed to configure:
cd ${PREFIX}/bin && rm -f gcc g++ c++ gfortran
sed -e 's#PREFIX#${PREFIX}#g' -e 's#SUFFIX#${SUFFIX}#g' gcc-wrapper.sh > ${PREFIX}/bin/gcc
chmod +x ${PREFIX}/bin/gcc
cd ${PREFIX}/bin && ln gcc g++ && ln gcc c++ && ln gcc gfortran
(gcc-wrapper.sh is that bash snippet).
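The install lines above appear to be written as makefile recipe lines, where make expands ${PREFIX} and ${SUFFIX} before the shell sees them. In a plain shell script the same substitution needs double quotes. A minimal sketch, assuming a hypothetical helper named render_wrapper and hypothetical values /opt/gcc-4.8 and -4.8:

```shell
# render_wrapper PREFIX SUFFIX: read the wrapper template on stdin and write
# it to stdout with the PREFIX and SUFFIX placeholders filled in.
# Assumes neither value contains '#' or '&', which are special to this sed call.
render_wrapper() {
    sed -e "s#PREFIX#$1#g" -e "s#SUFFIX#$2#g"
}

# The template is the two-line wrapper script; single quotes keep ${0} and
# "$@" literal so that they are evaluated when the wrapper itself runs.
wrapper_template='#!/bin/bash
exec ${0}SUFFIX -Wl,-rpath,PREFIX/lib64 "$@"'

# Example values are hypothetical; use whatever was passed to configure.
printf '%s\n' "$wrapper_template" | render_wrapper /opt/gcc-4.8 -4.8
```

Redirect the output to ${PREFIX}/bin/gcc and mark it executable, as in the recipe above.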
The above solution does not work with some versions of libtool because g++ -Wl,... -v assumes linking mode and fails with an error.
A better solution is to use a specs file. Once gcc/g++ is built, invoke the following command to make gcc/g++ add -rpath to the linker command line (replace ${PREFIX}/lib64 as necessary):
g++ -dumpspecs | awk '/^\*link:/ { print; getline; print "-rpath=${PREFIX}/lib64", $0; next } { print }' > $(dirname $(g++ -print-libgcc-file-name))/specs
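Note that inside single quotes a plain shell leaves ${PREFIX} unexpanded, so the directory has to be typed in literally or passed to awk some other way. The sketch below does the same rewrite but takes the directory as an argument through awk -v; the function name add_rpath_to_specs is an assumption for illustration.

```shell
# add_rpath_to_specs DIR: read "g++ -dumpspecs" output on stdin and write it
# to stdout with "-rpath=DIR" prepended to the body of the *link: section.
add_rpath_to_specs() {
    awk -v dir="$1" '
        /^\*link:/ { print; getline; print "-rpath=" dir, $0; next }
        { print }
    '
}

# Typical use (PREFIX is whatever was passed to configure):
#   g++ -dumpspecs | add_rpath_to_specs "$PREFIX/lib64" \
#       > "$(dirname "$(g++ -print-libgcc-file-name)")/specs"
```

Every section other than *link: passes through unchanged, so the result can be compared against the original dump with diff before it is installed.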
I just had the same problem when building gcc-4.8.2. I don't have root access on that machine and therefore need to install to my home directory. It took several attempts before I figured out the magic required to get this to work so I will reproduce it here so other people will have an easier time. These are the commands that I used to configure gcc:
prefix=/user/grc/packages
export LDFLAGS=-Wl,-rpath,$prefix/lib
export LD_RUN_PATH=$prefix/lib
export LD_LIBRARY_PATH=$prefix/lib
../../src/gmp-4.3.2/configure --prefix=$prefix
../../src/mpfr-2.4.2/configure --prefix=$prefix
../../src/mpc-0.8.1/configure --prefix=$prefix --with-mpfr=$prefix --with-gmp=$prefix
../../src/gcc-4.8.2/configure --prefix=$prefix --with-mpfr=$prefix --with-gmp=$prefix --with-mpc=$prefix --enable-languages=c,c++
That got me a working binary but any program I built with that version of g++ wouldn't run correctly unless I built it with the -Wl,-rpath,$prefix/lib64 option. It is possible to get g++ to automatically add that option by providing a specs file. If you run
strace g++ 2>&1 | grep specs
you can see which directories it checks for a specs file. In my case it was $prefix/lib/gcc/x86_64-unknown-linux-gnu/4.8.2/specs so I ran g++ -dumpspecs to create a new specs file:
cd $prefix/lib/gcc/x86_64-unknown-linux-gnu/4.8.2
$prefix/bin/g++ -dumpspecs > xx
mv xx specs
and then edited that file to provide the -rpath option. Search for the lines like this:
*link_libgcc:
%D
and edit to add the rpath option:
*link_libgcc:
%D -rpath /user/grc/packages/lib/%M
The %M expands to either ../lib or ../lib64 depending on whether you are building a 32-bit or a 64-bit executable.
Note that when I tried this same trick on an older gcc-4.7 build it didn't work because it didn't expand the %M. For older versions you can remove the %M and just hardcode lib or lib64 but that is only a viable solution if you only ever build 32-bit executables (with lib) or only ever build 64-bit executables (with lib64).
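One way to confirm that the specs change took effect is to check which libstdc++ the dynamic loader resolves for a freshly built program. The helper below is a sketch that filters ldd output; the name resolved_libstdcxx is hypothetical.

```shell
# resolved_libstdcxx: read "ldd" output on stdin and print the path that
# libstdc++ resolves to (prints nothing if it is absent or "not found").
resolved_libstdcxx() {
    awk '$1 ~ /^libstdc\+\+\.so/ && $2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Typical use (assumes ./a.out was built with the new g++):
#   ldd ./a.out | resolved_libstdcxx
```

If the printed path is under the system library directory rather than under $prefix, the rpath did not make it into the executable.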
gcc -print-search-dirs will tell you where your compiler is looking for runtime libraries, etc. You can override this with the -B<prefix> option.
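The library part of that output is a single colon-separated line, which is hard to read. A small sketch for splitting it into one directory per line follows; the function name list_library_dirs is an assumption for illustration.

```shell
# list_library_dirs: read "gcc -print-search-dirs" output on stdin and print
# the entries of the "libraries:" line, one directory per line.
list_library_dirs() {
    sed -n 's/^libraries: =//p' | tr ':' '\n'
}

# Typical use:
#   g++-4.8 -print-search-dirs | list_library_dirs
```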
Ok, this is just a bit of a fun exercise, but it can't be too hard compiling programmes for some older linux systems, or can it?
I have access to a couple of ancient systems all running linux and maybe it'd be interesting to see how they perform under load. Say as an example we want to do some linear algebra using Eigen which is a nice header-only library. Any chance to compile it on the target system?
user@ancient:~ $ uname -a
Linux local 2.2.16 #5 Sat Jul 8 20:36:25 MEST 2000 i586 unknown
user@ancient:~ $ gcc --version
egcs-2.91.66
Maybe not... So let's compile it on a current system. Below are my attempts, mainly failed ones. Any more ideas very welcome.
1. Compile with -m32 -march=i386
user@ancient:~ $ ./a.out
BUG IN DYNAMIC LINKER ld.so: dynamic-link.h: 53: elf_get_dynamic_info: Assertion `! "bad dynamic tag"' failed!
2. Compile with -m32 -march=i386 -static: Runs on all fairly recent kernel versions but fails if they are slightly older with the well known error message
user@ancient:~ $ ./a.out
FATAL: kernel too old
Segmentation fault
This error comes from glibc, which is built with a minimum supported kernel version, e.g. kernel 2.6.4 on my system:
$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
statically linked, for GNU/Linux 2.6.4, not stripped
3. Compile glibc myself with support for the oldest kernel possible. This post describes it in more detail but essentially it goes like this
wget ftp://ftp.gnu.org/gnu/glibc/glibc-2.14.tar.bz2
tar -xjf glibc-2.14.tar.bz2
cd glibc-2.14
mkdir build; cd build
../configure --prefix=/usr/local/glibc_32 \
--enable-kernel=2.0.0 \
--with-cpu=i486 --host=i486-linux-gnu \
CC="gcc -m32 -march=i486" CXX="g++ -m32 -march=i486"
make -j 4
make install
I'm not sure whether the --with-cpu and --host options do anything. The most important things are to force the compiler flags -m32 -march=i486 for 32-bit builds (unfortunately -march=i386 bails out with errors after a while) and to pass --enable-kernel=2.0.0 to make the library compatible with older kernels. Incidentally, during configure I got the warning
WARNING: minimum kernel version reset to 2.0.10
which is still acceptable, I suppose. For a list of things which change with different kernels see ./sysdeps/unix/sysv/linux/kernel-features.h.
Ok, so let's link against the newly compiled glibc library, slightly messy but here it goes:
$ export LIBC_PATH=/usr/local/glibc_32
$ export LIBC_FLAGS="-nostdlib -L${LIBC_PATH} \
${LIBC_PATH}/crt1.o ${LIBC_PATH}/crti.o \
-lm -lc -lgcc -lgcc_eh -lstdc++ -lc \
${LIBC_PATH}/crtn.o"
$ g++ -m32 -static prog.o ${LIBC_FLAGS} -o prog
Since we're doing a static compile the link order is important and may well require some trial and error, but basically we learn from what options gcc gives to the linker:
$ g++ -m32 -static -Wl,-v file.o
Note that gcc also links crtbeginT.o and crtend.o, which I didn't need for my programmes, so I left them out. The output also includes a line like --start-group -lgcc -lgcc_eh -lc --end-group which indicates inter-dependence between the libraries, see this post. I simply listed -lc twice on the gcc command line, which also resolves the inter-dependence.
Right, the hard work has paid off and now I get
$ file ./prog
./prog: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
statically linked, for GNU/Linux 2.0.10, not stripped
Brilliant I thought, now try it on the old system:
user@ancient:~ $ ./prog
set_thread_area failed when setting up thread-local storage
Segmentation fault
This, again, is a glibc error message from ./nptl/sysdeps/i386/tls.h. I fail to understand the details and give up.
4. Compile on the new system with g++ -c -m32 -march=i386 and link on the old one. Wow, that actually works for C and simple C++ programmes (not using C++ objects), at least for the few I've tested. This is not too surprising, as all I need from libc is printf (and maybe some maths), whose interface hasn't changed, whereas the interface to libstdc++ is very different now.
5. Set up a virtual box with an old linux system and gcc version 2.95. Then compile gcc version 4.x.x ... sorry, but too lazy for that right now ...
???
Have found the reason for the error message:
user@ancient $ ./prog
set_thread_area failed when setting up thread-local storage
Segmentation fault
It's because glibc makes a system call that has only been available since kernel 2.4.20. In a way it can be seen as a bug in glibc, as it wrongly claims to be compatible with kernel 2.0.10 when it requires at least kernel 2.4.20.
The details:
./glibc-2.14/nptl/sysdeps/i386/tls.h
[...]
/* Install the TLS. */ \
asm volatile (TLS_LOAD_EBX \
"int $0x80\n\t" \
TLS_LOAD_EBX \
: "=a" (_result), "=m" (_segdescr.desc.entry_number) \
: "0" (__NR_set_thread_area), \
TLS_EBX_ARG (&_segdescr.desc), "m" (_segdescr.desc)); \
[...]
_result == 0 ? NULL \
: "set_thread_area failed when setting up thread-local storage\n"; })
[...]
The main thing here is that it executes the assembly instruction int $0x80, which triggers a system call into the Linux kernel. The kernel decides what to do based on the value of eax, which is set to __NR_set_thread_area in this case and is defined in
$ grep __NR_set_thread_area /usr/src/linux-2.4.20/include/asm-i386/unistd.h
#define __NR_set_thread_area 243
but not in any earlier kernel versions.
So the good news is that point "3. Compiling glibc with --enable-kernel=2.0.0" will probably produce executables which run on all linux kernels >= 2.4.20.
The only chance to make this work with older kernels would be to disable TLS (thread-local storage), but that is not possible with glibc 2.14, even though it is offered as a configure option.
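The two version checks used in this answer can be sketched in shell: reading the minimum kernel version that file reports for a binary, and comparing a kernel version against the 2.4.20 threshold found above. The function names abi_min_kernel and version_ge are hypothetical, and the comparison is purely numeric, field by field.

```shell
# abi_min_kernel: read "file" output on stdin and print the kernel version
# from the "for GNU/Linux X.Y.Z" note, if there is one.
abi_min_kernel() {
    sed -n 's/.*for GNU\/Linux \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Non-numeric suffixes such as "-generic" are treated as 0 by awk.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Typical use on the target machine (2.4.20 is the threshold from this answer):
#   file ./prog | abi_min_kernel
#   version_ge "$(uname -r)" 2.4.20 || echo "set_thread_area is likely missing"
```

As the answer shows, the version printed by abi_min_kernel is only what glibc claims, so the explicit comparison against 2.4.20 is the more useful check here.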
The reason you can't compile it on the original system likely has nothing to do with kernel version (it could, but 2.2 isn't generally old enough for that to be a stumbling block for most code). The problem is that the toolchain is ancient (at the very least, the compiler). However, nothing stops you from building a newer version of G++ with the egcs that is installed. You may also encounter problems with glibc once you've done that, but you should at least get that far.
What you should do will look something like this:
1. Build the latest GCC with egcs
2. Rebuild the latest GCC with the gcc you just built
3. Build the latest binutils and ld with your new compiler
Now you have a well-built modern compiler and (most of a) toolchain with which to build your sample application. If luck is not on your side you may also need to build a newer version of glibc, but this is your problem - the toolchain - not the kernel.