How can I statically link standard library to my C++ program? - c++

I'm using Code::Blocks IDE(v13.12) with GNU GCC Compiler.
I want to the linker to link static versions of required runtime libraries for my programs, how may I do this?
I already know that my executable size will increase. Would you please tell me other the downsides?
What about doing this in Visual C++ Express?

Since nobody else has come up with an answer yet, I will give it a try. Unfortunately, I don't know that Code::Blocks IDE so my answer will only be partial.
1 How to Create a Statically Linked Executable with GCC
This is not IDE specific but holds for GCC (and many other compilers) in general. Assume you have a simplistic “hello, world” program in main.cpp (no external dependencies except for the standard library and runtime library). You'd compile and statically link it via:
Compile main.cpp to main.o (the output file name is implicit):
$ g++ -c -Wall main.cpp
The -c tells GCC to stop after the compilation step (not run the linker). The -Wall turns on most diagnostic messages. If novice programmers would use it more often and pay more attention to it, many questions on this site would not have been asked. ;-)
Link main.o (could list more than one object file) statically pulling in the standard and runtime library and put the executable in the file main:
$ g++ -o main main.o -static
Without using the -o main switch, GCC would have put the final executable in the not so well-named file a.out (which once eventually stood for “assembly output”).
Especially at the beginning, I strongly recommend doing such things “by hand” as it will help get a better understanding of the build tool-chain.
As a matter of fact, the above two commands could have been combined into just one:
$ g++ -Wall -o main main.cpp -static
Any reasonable IDE should have options for specifying such compiler / linker flags.
2 Pros and Cons of Static Linking
Reasons for static linking:
You have a single file that can be copied to any machine with a compatible architecture and operating system and it will just work, no matter what version of what library is installed.
You can execute the program in an environment where the shared libraries are not available. For example, putting a statically linked CGI executable into a chroot() jail might help reduce the attack surface on a web server.
Since no dynamic linking is needed, program startup might be faster. (I'm sure there are situations where the opposite is true, especially if the shared library was already loaded for another process.)
Since the linker can hard-code function addresses, function calls might be faster.
On systems that have more than one version of a common library (LAPACK, for example) installed, static linking can help make sure that a specific version is always used without worrying about setting the LD_LIBRARY_PATH correctly. Obviously, this is also a disadvantage since now you cannot select the library any more without recompiling. If you always wanted the same version, why would you have installed more than one in the first place?
Reasons against static linking:
As you have already mentioned, the size of the executable might grow dramatically. This depends of course heavily on what libraries you link in.
The operating system might be smart enough to load the text section of a shared library into the RAM only once if several processes need the library at the same time. By linking statically, you void this advantage and the system might run short of memory more quickly.
Your program no longer profits from library upgrades. Instead of simply replacing one shared library with a (hopefully ABI compatible) newer release, a system administrator will have to recompile and reinstall every program that uses it. This is the most severe drawback in my opinion.
Consider for example the OpenSSL library. When the Heartbleed bug was discovered and fixed earlier this year, system administrators could install a patched version of OpenSSL and restart all services to fix the vulnerability within a day as soon as the patch was out. That is, if their services were linking dynamically against OpenSSL. For those that have been linked statically, it would have taken weeks until the last one was fixed and I'm pretty sure that there is still proprietary “all in one” software out in the wild that did not see a fix up to the present day.
Your users cannot replace a shared library on the fly. For example, the torsocks script (and associated library) allows users to replace (via setting LD_PRELOAD appropriately) the networking system library by one that routes their traffic through the Tor network. And this even works for programs whose developers never even thought of that possibility. (Whether this is secure and a good idea is subject of an unrelated debate.) An other common use-case is debugging or “hardening” applications by replacing malloc and the like with specialized versions.
In my opinion, the disadvantages of static linking outweigh the advantages in all but very special cases. As a rule of thumb: link dynamically if you can and statically if you have to.
A Addendum
As Alf has pointed out (see comments), there is a special GCC option to selectively link in the C++ standard library statically but not link the whole program statically. From the GCC manual:
-static-libstdc++
When the g++ program is used to link a C++ program, it normally automatically links against libstdc++. If libstdc++ is available as a shared library, and the -static option is not used, then this links against the shared version of libstdc++. That is normally fine. However, it is sometimes useful to freeze the version of libstdc++ used by the program without going all the way to a fully static link. The -static-libstdc++ option directs the g++ driver to link libstdc++ statically, without necessarily linking other libraries statically.

In Visual C++, the /MT option does a static link and the /MD option does a dynamic link. (see http://msdn.microsoft.com/en-us/library/2kzt1wy3.aspx)
I'd recommend using /MD and redistributing the C++ runtime, which is freely available from Microsoft. Once the C++ runtime is installed, than any program requiring the run time will continue to work. You would need to pass the proper option to tell the compiler which runtime to use. There is a good explanation here, Should I compile with /MD or /MT?
On Linux, I'd recommend redistributing libstdc++ instead of a static link. If their system libstdc++ works, I'd let the user just use that. System libraries, such as libpthread and libgcc should just use the system default. This requires compiling the program on a system with symbols compatible with all linux versions you are distributing for.
On Mac OS X, just redistribute the app with dynamic linking to libstdc++. Anyone using the same OS version should be able to use your program.

Related

How does the GNU linker decide what C/C++ library files are needed?

I'm building PHP7 on an OpenWRT machine (an ARM router). I wanted to include MySQL, so I had to build that as well. OpenWRT is 99.5% ordinary linux, but there are some weird building / shared library things that probably don't get exercised often, so I've run into some difficulties.
MySQL builds OK (after some screwing around) and I have a libmysqlclient.so that works. However, the configure process for PHP7 fails when trying to link the MySQL test program, because libmysqlclient.so must be linked with the C++ standard libraries, not the C standard libs. (MySQL is apparently at least partially C++, and it uses std::...stuff....) Configure tries to compile the test program with gcc, which doesn't include the C++ libraries in the link, so the test fails.
I bodged over this by making a simple C/C++ switching script: if the command line includes -lmysqlclient then I exec g++ $* else exec gcc $*. Then I told configure to use my script as the C compiler.
It occurs to me that there must be a better way to handle this, though. It seems like libmysqlclient.so should have some way to tell the linker that it also needs libstdc++.so, so that even if gcc is used to link, all the necessary libraries would get pulled in.
Is there some way to mark dependencies in libmysqlclient.so? Or to make configure smarter about running test programs?
You should virtually never try to link with the C++ standard library manually. Use g++ for linking C++ programs. gcc knows the minute details of what library to use and where it lives, so you don't have to.
Now the question is, when to use g++, and when not to. One possible answer to that question is "always use g++". There is no harm in it. g++ can link C programs just fine. There is no overhead in the produced program. There might be some performance loss in the link process itself, but it probably won't be noticeable for any but the most humongous of programs.

How do I pull in unexpected build dependencies of standard libraries

I feel somewhat ridiculous, but I'm trying to import the OpenBLAS libraries into a project. They were built with gfortran as the Fortran compiler. My early builds had no issue just pulling libopenblas.so in, but on another system, it's choking on libgfortran.so when I try to run our program, which doesn't exist there. My impression has been that this is a standard library on most, if not all, Linux systems. I could probably add a copy of libgfortran.so to Artifactory and let Apache Ivy pull it in, but it seems like it would make more sense to use the standard version if possible. Is there a good way to pull it in via Ivy when doing an ant resolve command if it doesn't exist on the system?
An alternate solution may be to statically link libgfortran.a in on the compiling system, but my attempts to do so by adding -static RELATIVE_PATH_TO_LIBS/libgfortran.a compile and link fine, but I still get errors when running said program on the system which lacks the library.
Thank you for whatever help you can provide.
If the executable file format is the "ELF" file format (default on Linux systems) you can use "readelf" to display the dynamic section of the executable:
readelf -d my_executable_file
It should contain a list of all shared libraries required. This is a possibility to check if the executable still requires this library.
If "libgfortran.so" is the problem and "libgfortran.a" is available I would rename "libgfortran.a" to "libxxxx.a" and use the linker switches:
-Lpath_containing_libxxxx.a -lxxxx
instead of "-lgfortran". I would not use the "-static" switch because in this case the linker also tries to link all the other libraries statically. The linker should automatically link "-lxxxx" statically because no dynamic library with this name is available.

position independent executable (-pie) for arm(cortex-m3)

I'm programming for stm32 (Cortex-m3) with codesourcery g++ lite(based on gcc4.7.2 version). And I want the executables to be loaded dynamically.
I knew I have two options available:
1. relocatable elf, which needs a elf parser.
2. position independent code (PIC) with a global offset register
I prefer PIC with global offset register, because it seems it's easier to implement and I'm not familiar with elf or any elf library. Also, It's easy to generate a .bin file from an elf file with some tools.
I've tried building my program with "-msingle-pic-base -fpic" compiling options and "-pie" linking options, but then I got a linking error:
...path...ld.exe: ...path...thumb2\libstdc++.a(pure.o): relocation
R_ARM_THM_MOVW_ABS_NC against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
I don't quite understand the error message. It seems the default standard c/c++ library can't go with my options and I need to get the source of the library and rebuild for my own purpose.
So,
1. Could anyone provide me any useful information/link on how to work with the position independent executable ?
2. with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
Note: Without the "-pie" linking option I can build the program. But the program fails when calling a c++ virtual function (when I'm using the IDE(keil)'s simulator to debug my program). I don't understand what's going on and what I've been missing.
----------------------------------------------------------------------
-- added 20130314
with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
From my experiments, the register (r9 is used in my program) should point to the beginning of the got.plt sections. Delete the "-pie" option, the linking will success, (with r9 properly set) then the c++ virtual function is called successfully. However, I still think the "-pie" option is important, which may ensure that the current standard library is position independent. Could anyone explain this for me?
----------------------------------------------------------------------
-- added 20130315
I took a look at the documents on ABI from ARM's website. But it was of little help because they are not targeting a specific platform. There seems to be a concept of EABI (I'm using sourcery's arm-none-eabi edition), but I couldn't find any documentation on "EABI" from arm's website. I can't neither find documentation on this topic from sourcery and gcc's. There're more than one implementation of PIC, so which one is the sourcery g++ using in the none-eabi case? I think the behaviors of the "-msingle-pic-base", "-fpie", "-pie" options are so poorly documented !
-----------------------------------------------------------------------
From the dis-assembly code, I just figured out that, whit the "-msingle-pic-base", the r9 should point to the base address of the ".got" section, the pointers in the .got sections are absolute pointer and the addressing of variable is similar to the description in the article : Position Independent Code (PIC) in shared libraries. So I still need to modify the ".got" sections on loading. I don't know what is the ".got.plt" section used for in my program. It seems that function calls are using PC-relative addressing.
How to build with the "-pie" or how to link a standard library compiled with "-fpic" is still a problem for me.
The error message tells you to recompile the libstdc++ library, which is most often built, when the gcc compiler is built.
Thus you must recompile your standard libraries (libstdc++, libgcc_*, libc, libm and the all) with -fPIC and link your project against them.
If you rely on prebuilt compiler packages, you're mostly out of the game in the microcontroller world. If you build your compiler yourself (which is, by the way, not too difficult, but an advanced/expert task) you are on the go.
It is also possible to compile your stdandard libraries yourself with the compiler you have. You will need the sources of libraries and figure out, how the compiler package build system builds them and you have to mimic this. Perhaps here are some experts, who can advise you on this way.
There's a nice blog post on this topic, eight years after asking the question initially, but it's there: https://mcuoneclipse.com/2021/06/05/position-independent-code-with-gcc-for-arm-cortex-m/
The general outline is that you have to:
Set up GOT from linker-generated information
Set up PLT from Program Header information
Implement a binder based on the GOT entries
Compile your library as a shared relocatable binary: -msingle-pic-base -mpic-register=r9 -mno-pic-data-is-text-relative -fPIC
Set R9 accordingly

Static linking of Glibc

How can i compile my app linking statically glibc library, but only the code needed for my app? (Not all lib)
Now my compile command:
g++ -o newserver test.cpp ... -lboost_system -lboost_thread -std=c++0x
Thanks!
That's what -static does (as described in another answer): unneeded modules won't get linked into your program. But your expectations on the amount of stuff which is needed (in a sense that we can't convince linker to the contrary) may be too optimistic.
If you trying to do it for portability (running an executable on other machines with older glibc or something like that), there is one easy test question to see if you're going to get what you want:
Did you think of the problem with libnss, and are you sure it is not going to bite you?
If your answer is yes, maybe it makes sense to go on. If the answer is no, or the question seems too obscure and there is no answer, just quit your expirements with statically linked glibc: it has more chance to hurt than help.
Add -static to the compile line. It will only add what your application needs [and of course, any functions the functions you application calls, and any functions those functions call, including a bunch of startup code and some other bits and pieces], so it will be around 800K (for a simple "hello world" program) on an x86 machine. Other architectures vary. Since boost probably also calls the standard library at least a little bit, it's likely that you will have more than 800K added to your appliciation. But it only applies functions used by any of the code in the final binary, not the entire library [about 2MB as a shared library].
If you ONLY want link glibc, you will need to modify the linking line to your compile to:
-Wl,-Bstatic -libc -Wl,-Bdynamic. This will prevent any other library from being linked statically [you sometimes need to have more than one of these statements, as sometimes something pulled in by another library requires "more" from glibc to be pulled in - don't worry, it won't bring in anything more than the linker thinks is necessary].

Static linking - working with GTKmm application? - revised

Is it possible to make a static linking (compilation) on Gtk(mm) program? I need the program to be less relaying on dependences in user's system.
I try:
g++ -static data/Area.h data/Picture.cpp data/GLScene.cpp data/KBDialog.cpp data/Dialogs.h data/FilePreview.cpp data/MainWindow.cpp prog.cpp -o prog `pkg-config --cflags --libs gtkmm-2.4 gtkglextmm-1.2 exiv2`
but It fails:
/usr/bin/ld: cannot find -lgtkmm-2.4
/usr/bin/ld: cannot find -lGL
/usr/bin/ld: cannot find -latkmm-1.6
/usr/bin/ld: cannot find -lgdkmm-2.4
/usr/bin/ld: cannot find -lpangomm-1.4
/usr/bin/ld: cannot find -lgdk_pixbuf-2.0
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_gid_name':
(.text+0x207a): warning: Using 'getgrgid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalvfs.o): In function `g_local_vfs_parse_name':
(.text+0x26c): warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1244): warning: Using 'getpwuid' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1237): warning: Using 'setpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x124f): warning: Using 'endpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0xf6e): warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_uid_data':
(.text+0x1eea): warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libX11.a(xim_trans.o): In function `_XimXTransSocketUNIXConnect':
(.text+0xe23): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe3c): warning: Using 'getservbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe4c): warning: Using 'endservent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
collect2: ld returned 1 exit status
I would rather avoid doing that, because GTK depends on tricky low level libraries which are really very system specific (perhaps libfontconfig.so etc), and contains system-specific information (e.g. builtin paths for fonts...).
I also think that GTK needs dynamic shared libraries to implement theming or styling (so GTK itself is calling dlopen, and having a statically linked libdl is not reasonable).
I suggest at least linking dynamically gtk and all its dependencies.
(Your question asked twice pisses me off, so here is a more detailed answer that I might edit and complete further)
Why dynamically linked shared libraries are useful?
First, almost every binary is dynamically linked on today's Linux systems. On my Debian/Sid system, I only have /sbin/ldconfig /bin/sash and /usr/bin/rar statically linked executables, but about seven thousand other dynamically linked executables (under /bin & /usr/bin). Even essential programs like /sbin/init are today dynamically linked.
There are several wins in having mostly dynamically linked ELF executables using shared libraries
Avoid wasting disk space. When dynamically linked executables did not exist (1986 era, SunOS3.5, because the kernel was not able to mmap file segments), people took a lot of time mixing several binaries in a single one (I remember textedit and cmdtool being the same binary, a mix of several programs on SunOS3.5) to win disk space. Ok, disk space is cheaper today, but if my seven thousand programs each had to link statically libc that would consume several gigabytes of disk space (and that would mean an extra DVD or hours of networking upload when installing a Linux distribution).
Enabling an easier update. When the packaging system (apt-get, dpkg and friends on Debian) upgrades a common shared library (like the GLibC or Gtk), it replaces the dynamically linked shared libraries (*.so files, called ELF shared objects) and all the future executions of binary using them take profit. So if /usr/lib/libgtk-3.so is updated, there is no need to update /usr/bin/gedit to take advantage of the bug fixes inside libgtk-3.so; just restarting gedit will make it profit of improvements in libgtk-3.so
More efficient overall RAM usage. A file like libc.so is used by almost every process, and even libgtk-3.so is used by dozens of processes. Most of it is mmap-ed read-only "text" segment (notably containing the executable binary machine code and read-only constants like string); this mapping is using the same RAM cells for every process using it. So the memory is shared
Legal compliance with LGPL license
The LGPL-2.1 license of GTK libraries is the only reason why you are legally allowed to use GTK (i.e. run GTK programs, and link your own program with GTK). This license gives you rights, in particular the one to improve GTK or take advantage of GTK improvements, but you should not prohibit users of your (e.g. proprietary) program linking /usr/lib/gtk-3.so to take advantage of improvements inside GTK itself. The section 6 of LGPL2.1 mention explicitly dynamic linking. You are not allowed to distribute statically linked GTK binaries without giving the user the mean to upgrade his GTK library. The most convenient way is having your GTK program dynamically linked against libgtk-3.so. A less easy alternative would be to distribute your statically linked executable with its object *.o files and instructions on how to re-link it statically against an hypothetical improved libgtk.a (which don't exist).
Plugin ability to dynamically load other library modules
A program can load some shared object at run time using the dlopen function (based upon the mmap system call, thru the -ldl library). This is how plugins are possible on Linux. GTK uses itself very actively this ability: theming, styling, and perhaps fonts are using dlopen and implemented by dlopen-ing appropriate stuff. Since dlopen is a public interface to the dynamic loader /lib64/ld-linux-x86-64.so.2, the -ldl library is a dynamically shared object libdl.so.2 sharing functionality and code with the dynamic loader (itself referenced in every dynamically linked executable as the "ELF interpreter"). It is uncommon and unwise to link -ldl statically. Even the libc.so library may load other modules (perhaps for DNS support, etc...); some functionalities are restricted in statically linked executables (see file /etc/nsswitch.conf etc.).
dynamic linking is slightly slower at startup time, since a program has to initiate and dynamically load (this is the role of ld-linux-x86-64.so.2) at startup all the dynamic libraries it needs. Code inside a dynamic library needs to be position independent code otherwise the relocation part of dynamically loaded libraries would be too big (and the relocation effort at start-up too long), which may cost an extra register (and this is mostly true on 32 bits x86 processors, much less on x86-64 or AMD64 64 bits ones) so makes up slightly bigger machine code (on 32 bits x86 machines, we are speaking of a few percents of size increase and runtime slowdown; on 64 bits machines, it is negligible). Of course, relocating hundred of thousands of external calls may take some time (and happens more with C++ code than with C code, perhaps because of name mangling issues).
Why you (Marco) should not statically link your GTK binary?
The five first points above should convince you that linking statically GTK is an evil thing to do. In particular, take attention of the legal aspects (LGPL): making knowingly an LGPL violation is a huge professional mistake, don't do that.
If you really wanted, with weeks of effort, you probably could be technically able (by recompiling and hacking GTK source code) to link statically your binary with GTK (with some reduced functionalities, like no theming), but that is probably unethical and useless. If your boss is stupid enough to require you that, try to convince him (or else find another job). And the very fact that you've asked on a public forum how to link statically GTK (which I am understanding as "how to violate the LGPL license") put you at risk. There are organizations -like gpl-violations- which take attention to that.
I don't see any useful reason to statically link a GTK program. Even proprietary programs using a GUI library are dynamically linked (a good example is AMD FGLRX driver and its companion programs like amdccle providing a Qt based graphical interface for installation).
Of course, you may want to deal with dependencies. Leave that to the package manager of your linux distribution.
If you want more help, please explain much more what you really want to do, and convince us that you don't ask help in violating a license. Better yet, try to distribute your software with a free license like e.g. GPLv3