Fully statically build application with all dependencies (libgcc, etc.)? - c++

I am currently trying to compile all my applications' dependencies as a static library. My motivation:
Not to rely on any OS provided libraries in order to have a perfectly reproducible code base
Avoid issues when deploying on other systems caused by dynamic linking
Avoid run-time clashes when linking against different versions of a library
Being able to cross-compile for other OS
However, as I initially dreaded I had to go down the rabbit hole quite fast. I am currently stuck with OpenCV and I'm sure there is more to come. However, my main questions are:
Is it possible to build an entirely statically build app (e.g. libc, ligcc, etc. ?)
Is it possible to link all libraries statically but link major components (libgcc, etc.) dynamically?
If not, is it possible to link against statically built libraries (e.g. OpenCV) but to satisfy their dependencies by linking dynamically (zlib, libc, etc.)?
I did research on the internet but couldn't find a comprehensive guide that dwells on the internals of linking (static vs. dynamic). Do you know about a good book / tutorial? Does a book about gcc get me further?
Is this a very stupid idea?

My motivation:
Not to rely on any OS provided libraries in order to have a perfectly reproducible code base
Avoid issues when deploying on other systems caused by dynamic linking
Avoid run-time clashes when linking against different versions of a library
Being able to cross-compile for other OS
Your motivations are all wrong.
For #1, you do not need a fully-static binary. You just need to link against a set of version-controlled libraries using --sysroot facility provided by GNU linkers
For #2, you motivation is misguided.
On Linux, a fully-static binary may crash in mysterious ways if the libc installed on a target system is different from (static) libc the program was built on. That is, a fully-static binary on Linux is (contrary to popular belief) significantly less portable than a dynamically linked one. One should simply never statically link libc.a on Linux.
This alone should make you abandon this approach (at least for any GLIBC based systems).
For #3, don't link against different versions of a library (at program build time), and no clashes will result.
For #4, the same solution as for #1 just works.

Related

How does musl's GCC wrapper differ from musl's cross-compiler?

I am trying to compile various programs such as MariaDB with a musl toolchain. That is, I don't want any dependencies on glibc or GNU's linker after compilation has finished.
Thus far, I have been using musl's GCC wrapper, musl-gcc to compile things. But, with larger programs like MariaDB I struggle to get all the necessary libraries and headers and symlinking or adding environment variables for the compilation doesn't really help.
I see mention of building a cross-compiler targeting musl libc with additional documentation and code at this GitHub repo. From the documentation on the cross-compiler:
This gives you a full, relocatable musl-targeting toolchain, with C++ support and its own library paths you can install third-party libraries into.
It sounds like this could help me, but I am not sure how this is very different from musl's GCC wrapper, which as I understand, just alters where GCC looks for libraries and headers etc.
Ultimately, I am unsure how different this cross-compiler really is from the GCC wrapper and if it would be useful in my case. Why would I need my own library paths to install third-party libraries into when I can just symlink to existing libraries and use the GCC wrapper? Is the cross-compiler the way I should be compiling things, especially bigger code bases?
All the wrapper does is remove the default library and include paths and add replacement ones. Otherwise, it relies on the compiler's view of the ABI being sufficiently matched that a GCC that thinks it's targeting glibc (*-linux-gnu) works with a musl (*-linux-musl) target.
A full GCC toolchain has a number of "target libraries" - libraries that are linked into the output program to provide some functionality the compiler offers. This includes libgcc (software multiply or divide, software floating point, etc. according to whether the target arch needs these things, and unwinding support for exception handling), libstd++ (the C++ standard library), and a number of other things like libgomp (GNU OpenMP runtime implementation for use with #pragma OMP ...). In theory all of these possibly depend on the specific target ABI, including the libc, and potentially link to symbols from libc. In practice, for the most part libgcc doesn't, and is "safely" reusable with a different libc as long as the basic type definitions and a few other ABI things match.
For the other libraries with more complex dependencies on libc, it's generally not expected that a build against one libc would work with a different one. musl libc provides some degree of ABI-compat for using glibc-linked libraries, so there's some hope that they'd work anyway, but you'd need to copy them to the new library path. However, GCC's C++ standard library headers also seem to have some heavy dependency on the (libc) system headers for the target, and when setup for glibc do not seem to work with musl. There is probably some way to make this work (i.e. to add C++ support to the wrapper), but it's an open problem exactly how to do it.
Even if this could be made to work, though, the deeper you go, the more reasons you're likely to find that you'd rather just have a proper cross-compiler toolchain that knows it's targeting musl. And nowadays these are easy enough to build. But if you find you're needing additonal third party libraries too, and don't want to build them yourself, it probably makes more sense to just use a Docker image (or other container) with a distro that provides library binaries for you.

How to build self-contained executable with gcc? [duplicate]

I wrote a program, that uses a shared library installed on my system. This library is seldom installed on other systems. How do I compile my program so that the library doesn't need to be installed on other systems? I have the source code for the library available. What's the best way?
The other systems of course have the same architecture and OS.
Compile it as a static library and link that into the executable.
Though the OP had solved his problem by answering a different question, there are (at least) two ways to wedge a shared library into your binary in case
there is no source code available
there is no compiler (or build-chain) available
static link does not work or it's not obvious how do it
to preserve memory layout - static link will change it and may "wake-up" hidden bugs
for "permanent link" LD_PRELOAD library into executable
The first is statifier (open source but limited to x86 and x86_64 and only object code)
The second that I know of is magic ermine (by the same developer). It is closed source, but the developer is friendly to opensource projects and ermine has the advantage of supporting more platforms as well as the ability to include all necessary data files within its virtual file system.
http://statifier.sourceforge.net/ and http://www.magicermine.com/

Static linking - working with GTKmm application? - revised

Is it possible to make a static linking (compilation) on Gtk(mm) program? I need the program to be less relaying on dependences in user's system.
I try:
g++ -static data/Area.h data/Picture.cpp data/GLScene.cpp data/KBDialog.cpp data/Dialogs.h data/FilePreview.cpp data/MainWindow.cpp prog.cpp -o prog `pkg-config --cflags --libs gtkmm-2.4 gtkglextmm-1.2 exiv2`
but It fails:
/usr/bin/ld: cannot find -lgtkmm-2.4
/usr/bin/ld: cannot find -lGL
/usr/bin/ld: cannot find -latkmm-1.6
/usr/bin/ld: cannot find -lgdkmm-2.4
/usr/bin/ld: cannot find -lpangomm-1.4
/usr/bin/ld: cannot find -lgdk_pixbuf-2.0
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_gid_name':
(.text+0x207a): warning: Using 'getgrgid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalvfs.o): In function `g_local_vfs_parse_name':
(.text+0x26c): warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1244): warning: Using 'getpwuid' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1237): warning: Using 'setpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x124f): warning: Using 'endpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0xf6e): warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_uid_data':
(.text+0x1eea): warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libX11.a(xim_trans.o): In function `_XimXTransSocketUNIXConnect':
(.text+0xe23): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe3c): warning: Using 'getservbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe4c): warning: Using 'endservent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
collect2: ld returned 1 exit status
I would rather avoid doing that, because GTK depends on tricky low level libraries which are really very system specific (perhaps libfontconfig.so etc), and contains system-specific information (e.g. builtin paths for fonts...).
I also think that GTK needs dynamic shared libraries to implement theming or styling (so GTK itself is calling dlopen, and having a statically linked libdl is not reasonable).
I suggest at least linking dynamically gtk and all its dependencies.
(Your question asked twice pisses me off, so here is a more detailed answer that I might edit and complete further)
Why dynamically linked shared libraries are useful?
First, almost every binary is dynamically linked on today's Linux systems. On my Debian/Sid system, I only have /sbin/ldconfig /bin/sash and /usr/bin/rar statically linked executables, but about seven thousand other dynamically linked executables (under /bin & /usr/bin). Even essential programs like /sbin/init are today dynamically linked.
There are several wins in having mostly dynamically linked ELF executables using shared libraries
Avoid wasting disk space. When dynamically linked executables did not exist (1986 era, SunOS3.5, because the kernel was not able to mmap file segments), people took a lot of time mixing several binaries in a single one (I remember textedit and cmdtool being the same binary, a mix of several programs on SunOS3.5) to win disk space. Ok, disk space is cheaper today, but if my seven thousand programs each had to link statically libc that would consume several gigabytes of disk space (and that would mean an extra DVD or hours of networking upload when installing a Linux distribution).
Enabling an easier update. When the packaging system (apt-get, dpkg and friends on Debian) upgrades a common shared library (like the GLibC or Gtk), it replaces the dynamically linked shared libraries (*.so files, called ELF shared objects) and all the future executions of binary using them take profit. So if /usr/lib/libgtk-3.so is updated, there is no need to update /usr/bin/gedit to take advantage of the bug fixes inside libgtk-3.so; just restarting gedit will make it profit of improvements in libgtk-3.so
More efficient overall RAM usage. A file like libc.so is used by almost every process, and even libgtk-3.so is used by dozens of processes. Most of it is mmap-ed read-only "text" segment (notably containing the executable binary machine code and read-only constants like string); this mapping is using the same RAM cells for every process using it. So the memory is shared
Legal compliance with LGPL license
The LGPL-2.1 license of GTK libraries is the only reason why you are legally allowed to use GTK (i.e. run GTK programs, and link your own program with GTK). This license gives you rights, in particular the one to improve GTK or take advantage of GTK improvements, but you should not prohibit users of your (e.g. proprietary) program linking /usr/lib/gtk-3.so to take advantage of improvements inside GTK itself. The section 6 of LGPL2.1 mention explicitly dynamic linking. You are not allowed to distribute statically linked GTK binaries without giving the user the mean to upgrade his GTK library. The most convenient way is having your GTK program dynamically linked against libgtk-3.so. A less easy alternative would be to distribute your statically linked executable with its object *.o files and instructions on how to re-link it statically against an hypothetical improved libgtk.a (which don't exist).
Plugin ability to dynamically load other library modules
A program can load some shared object at run time using the dlopen function (based upon the mmap system call, thru the -ldl library). This is how plugins are possible on Linux. GTK uses itself very actively this ability: theming, styling, and perhaps fonts are using dlopen and implemented by dlopen-ing appropriate stuff. Since dlopen is a public interface to the dynamic loader /lib64/ld-linux-x86-64.so.2, the -ldl library is a dynamically shared object libdl.so.2 sharing functionality and code with the dynamic loader (itself referenced in every dynamically linked executable as the "ELF interpreter"). It is uncommon and unwise to link -ldl statically. Even the libc.so library may load other modules (perhaps for DNS support, etc...); some functionalities are restricted in statically linked executables (see file /etc/nsswitch.conf etc.).
dynamic linking is slightly slower at startup time, since a program has to initiate and dynamically load (this is the role of ld-linux-x86-64.so.2) at startup all the dynamic libraries it needs. Code inside a dynamic library needs to be position independent code otherwise the relocation part of dynamically loaded libraries would be too big (and the relocation effort at start-up too long), which may cost an extra register (and this is mostly true on 32 bits x86 processors, much less on x86-64 or AMD64 64 bits ones) so makes up slightly bigger machine code (on 32 bits x86 machines, we are speaking of a few percents of size increase and runtime slowdown; on 64 bits machines, it is negligible). Of course, relocating hundred of thousands of external calls may take some time (and happens more with C++ code than with C code, perhaps because of name mangling issues).
Why you (Marco) should not statically link your GTK binary?
The five first points above should convince you that linking statically GTK is an evil thing to do. In particular, take attention of the legal aspects (LGPL): making knowingly an LGPL violation is a huge professional mistake, don't do that.
If you really wanted, with weeks of effort, you probably could be technically able (by recompiling and hacking GTK source code) to link statically your binary with GTK (with some reduced functionalities, like no theming), but that is probably unethical and useless. If your boss is stupid enough to require you that, try to convince him (or else find another job). And the very fact that you've asked on a public forum how to link statically GTK (which I am understanding as "how to violate the LGPL license") put you at risk. There are organizations -like gpl-violations- which take attention to that.
I don't see any useful reason to statically link a GTK program. Even proprietary programs using a GUI library are dynamically linked (a good example is AMD FGLRX driver and its companion programs like amdccle providing a Qt based graphical interface for installation).
Of course, you may want to deal with dependencies. Leave that to the package manager of your linux distribution.
If you want more help, please explain much more what you really want to do, and convince us that you don't ask help in violating a license. Better yet, try to distribute your software with a free license like e.g. GPLv3

Linking Statically with glibc and libstdc++

I'm writing a cross-platform application which is not GNU GPL compatible. The major problem I'm currently facing is that the application is linked dynamically with glibc and libstdc++, and almost every new major update to the libraries are not backwards compatible. Hence, random crashes are seen in my application.
As a workaround, I distribute binaries of my application compiled on several different systems (with different C/C++ runtime versions). But I want to do without this. So my question is, keeping licensing and everything in mind, can I link against glibc and libstdc++ statically? Also, will this cause issues with rtld?
You don't need to.
Copy the original libraries you linked against to a directory (../lib in this example) in your application folder.
Like:
my_app_install_path
.bin
lib
documentation
Rename you app for something like app.bin. Substitute your app for a little shell script that sets the enviroment variable LD_LIBRARY_PATH to the library path (and concatenate the previous LD_LIBRARY_PATH contents, if any). Now ld should be able to find the dynamic libraries you linked against and you don't need to compile them statically to your executable.
Remember to comply with the LGPL adding the given attribution to the libraries and pointing in the documentation where the source can be downloaded.
glibc is under the LGPL. Under section 6. of LGPL 2.1, you can distribute your program linked to the library provided you comply with one of five options. The first is to provide the source code of the library, along with the object code (source is optional, not required) of your own program, so it can be relinked with the library. You can alternatively provide a written offer of the same. Your own code does not have to be under the LGPL, and you don't have to release source.
libstdc++ is under the GPL, but with a major exception. You can basically just distribute under the license of your choice without providing source for either your own code or libstdc++. The only condition is that you compile normally, without e.g. proprietary modifications or plugins to GCC.
IANAL, and you should consider consulting one if you need real legal advice.
Specifying the option -static-libgcc to the linker would cause it to link against a static version of the C library, if available on the system. Otherwise it is ignored.
I must question what the heck you are doing with the poor library functions?
I have some cross platform software as well. It runs fine on Linux systems of all sorts. Build with the oldest version of software that you want to support. The glibc and libstdc++ libraries are really very backward compatible.
I have built on CentOS 4 and run it on RHEL 6 beta. No problems.
I can build on stable Debian and run it on testing.
Now, I do sometimes have trouble with some libraries if I try to build on, say old Debian and try to run it on CentOS 5.4. That is usually due to distribution configuration choices that are different, like choosing threading or non-threading.

problem with different linux distribution with c++ executable

I have a c++ code that runs perfect on my linux machine (Ubuntu Karmic).
When I try to run it on another version, I have all sort of shared libraries missing.
Is there any way to merge all shared libraries into single executable?
Edit:
I think I've asked the wrong question. I should have ask for a way to static-link my executable when it is already built.
I found the answer in ermine & statifier
There are 3 possible reasons you have shared libraries missing:
you are using shared libraries which do not exist by default on the other distribution, or you have installed them on your host, but not the other one, e.g. libDBI.so
you have over-specified the version at link time, e.g. libz.so.1.2.3 and the other machine has an API compatible (major version 1) but different minor version 2.3, which would probably work with your program if only it would link
the major version of the library has changed, which means it is incompatible libc.so.2 vs libc.so.1.
The fixes are:
don't link libraries which you don't need that may not be on different distros, OR, install the additional libraries on the other machines, either manually or make them dependencies of your installer package (e.g. use RPM)
don't specify the versions so tightly on the command line - link libz.so.1 instead of libz.so.1.2.3.
compile multiple versions against different libc versions.
What you are describing is the use of static libraries instead of shared libraries.
There have been several technical solutions to the original problem noted here, e.g.
compile multiple versions against
different libc versions.
or
install the additional libraries on the
other machines
but if you're in the position of an ISV, there is really just one sane solution:
Get a clean install of an older system, (e.g. Ubuntu 6.x if you're targeting desktops, perhaps as far back as Red Hat 9 if you're targeting servers) and build your software on that. Generally libraries (and definitely libc) are backwards compatible, so you you won't have problems running on newer systems.
Of course if you have non-standard or recent-version lib dependencies this doesn't completely solve the problem. In that case, as other's have suggested, if you want to be robust it's better to dlopen() and report the problems (or run with reduced functionality).
I am not too sure, but you may want to create your executable by statically linking all the libraries.
One alternative is to dynamically load shared libraries using dlopen() and if it fails to load, exit gracefully with the message that the dependent library is required for the executable to work.
The user then may install the appropriate library.
Another possible solution is using statifier (http://statifier.sf.net) or Ermine (http://magicErmine.com)
Both of them are able pack dynamic executable and all of it's needed libraries into one self-containing executable