I have an application that use dlopen() to load additional modules. The application and modules are built on Ubuntu 12.04 x86_64 using gcc 4.6 but for i386 arch. The binaries are then copied to another machine with exactly same OS and work fine.
However if they are copied to Ubuntu 12.04 i386 then some (but not all) modules fail to load with the following message:
dlopen: cannot load any more object with static TLS
I would suspect that this is caused by the usage of __thread variables. However such variables are not used in the loaded modules - only in the loader module itself.
Can someone provide any additional info, what can be the reason?
I am reducing number of __thread variables and optimizing them (with -ftls-model etc), I'm just curious why it doesn't work on almost same system.
I would suspect that this is caused by the usage of __thread variables.
Correct.
However such variables are not used in the loaded modules - only in the loader module itself.
Incorrect. You may not be using __thread yourself, but some library you statically link in into your module is using them. You can confirm this with:
readelf -l /path/to/foo.so | grep TLS
what can be the reason?
The module is using -ftls-model=initial-exec, but should be using -ftls-model=global-dynamic. This most often happens when (some of) the code linked into foo.so is built without -fPIC.
Linking non-fPIC code into a shared library is impossible on x86_64, but is allowed on ix86 (and leads to many subtle problems, like this one).
Update:
I have 1 module compiled without -fPIC, but I do not set tls-model at all, as far as I remember the default value is not initial-exec
There could only be one tls model for each ELF image (executable or shared library).
TLS model defaults to initial-exec for non-fPIC code.
It follows that if you link even one non-fPIC object that uses __thread into foo.so, then foo.so gets initial-exec for all of its TLS.
So why it causes problems - because if initial-exec is used then the number of tls variables is limited (because they are not dynamically allocated)?
Correct.
Related
I am building some code that needs to be a shared object (.so).
The problem that the libc on my building machine may be newer than the published machines, so I want to link with it statically to avoid compatibilities issues. (My program uses memcpy which is apparently a GLIBC_2.14 thing when it can go as low as 2.5).
Compiling with both -shared and -static doesn't work since crtbeginT.o wasn't compiled with -fPIC.
Edit: Probably not a duplicate of GCC linking libc static and some other library dynamically, revisited? since that question talking about the main elf linking libc statically and this is about a shared object linking libc statically.
You want to statically link glibc in your shared library.
You should not do this.
If you try, you will end up with a C++ One Definition Rule (ODR) violation. This is because some parts of glibc will be from the "old" version of your target machine, and some will be from the "new" version of your library. The result is undefined behavior.
The right solution is simple: build with an older glibc (as old as your oldest target for deployment). Or build multiple times, once for each version of glibc you need (if you truly need new glibc features). Even if you think you need a new glibc feature, consider just copy-pasting that one feature into your library under a different name to avoid collisions.
Regarding memcpy in particular, see: https://stackoverflow.com/a/8823419/4323 - and for a manual fix: https://stackoverflow.com/a/5977518/4323
I'm using Code::Blocks IDE(v13.12) with GNU GCC Compiler.
I want to the linker to link static versions of required runtime libraries for my programs, how may I do this?
I already know that my executable size will increase. Would you please tell me other the downsides?
What about doing this in Visual C++ Express?
Since nobody else has come up with an answer yet, I will give it a try. Unfortunately, I don't know that Code::Blocks IDE so my answer will only be partial.
1 How to Create a Statically Linked Executable with GCC
This is not IDE specific but holds for GCC (and many other compilers) in general. Assume you have a simplistic “hello, world” program in main.cpp (no external dependencies except for the standard library and runtime library). You'd compile and statically link it via:
Compile main.cpp to main.o (the output file name is implicit):
$ g++ -c -Wall main.cpp
The -c tells GCC to stop after the compilation step (not run the linker). The -Wall turns on most diagnostic messages. If novice programmers would use it more often and pay more attention to it, many questions on this site would not have been asked. ;-)
Link main.o (could list more than one object file) statically pulling in the standard and runtime library and put the executable in the file main:
$ g++ -o main main.o -static
Without using the -o main switch, GCC would have put the final executable in the not so well-named file a.out (which once eventually stood for “assembly output”).
Especially at the beginning, I strongly recommend doing such things “by hand” as it will help get a better understanding of the build tool-chain.
As a matter of fact, the above two commands could have been combined into just one:
$ g++ -Wall -o main main.cpp -static
Any reasonable IDE should have options for specifying such compiler / linker flags.
2 Pros and Cons of Static Linking
Reasons for static linking:
You have a single file that can be copied to any machine with a compatible architecture and operating system and it will just work, no matter what version of what library is installed.
You can execute the program in an environment where the shared libraries are not available. For example, putting a statically linked CGI executable into a chroot() jail might help reduce the attack surface on a web server.
Since no dynamic linking is needed, program startup might be faster. (I'm sure there are situations where the opposite is true, especially if the shared library was already loaded for another process.)
Since the linker can hard-code function addresses, function calls might be faster.
On systems that have more than one version of a common library (LAPACK, for example) installed, static linking can help make sure that a specific version is always used without worrying about setting the LD_LIBRARY_PATH correctly. Obviously, this is also a disadvantage since now you cannot select the library any more without recompiling. If you always wanted the same version, why would you have installed more than one in the first place?
Reasons against static linking:
As you have already mentioned, the size of the executable might grow dramatically. This depends of course heavily on what libraries you link in.
The operating system might be smart enough to load the text section of a shared library into the RAM only once if several processes need the library at the same time. By linking statically, you void this advantage and the system might run short of memory more quickly.
Your program no longer profits from library upgrades. Instead of simply replacing one shared library with a (hopefully ABI compatible) newer release, a system administrator will have to recompile and reinstall every program that uses it. This is the most severe drawback in my opinion.
Consider for example the OpenSSL library. When the Heartbleed bug was discovered and fixed earlier this year, system administrators could install a patched version of OpenSSL and restart all services to fix the vulnerability within a day as soon as the patch was out. That is, if their services were linking dynamically against OpenSSL. For those that have been linked statically, it would have taken weeks until the last one was fixed and I'm pretty sure that there is still proprietary “all in one” software out in the wild that did not see a fix up to the present day.
Your users cannot replace a shared library on the fly. For example, the torsocks script (and associated library) allows users to replace (via setting LD_PRELOAD appropriately) the networking system library by one that routes their traffic through the Tor network. And this even works for programs whose developers never even thought of that possibility. (Whether this is secure and a good idea is subject of an unrelated debate.) An other common use-case is debugging or “hardening” applications by replacing malloc and the like with specialized versions.
In my opinion, the disadvantages of static linking outweigh the advantages in all but very special cases. As a rule of thumb: link dynamically if you can and statically if you have to.
A Addendum
As Alf has pointed out (see comments), there is a special GCC option to selectively link in the C++ standard library statically but not link the whole program statically. From the GCC manual:
-static-libstdc++
When the g++ program is used to link a C++ program, it normally automatically links against libstdc++. If libstdc++ is available as a shared library, and the -static option is not used, then this links against the shared version of libstdc++. That is normally fine. However, it is sometimes useful to freeze the version of libstdc++ used by the program without going all the way to a fully static link. The -static-libstdc++ option directs the g++ driver to link libstdc++ statically, without necessarily linking other libraries statically.
In Visual C++, the /MT option does a static link and the /MD option does a dynamic link. (see http://msdn.microsoft.com/en-us/library/2kzt1wy3.aspx)
I'd recommend using /MD and redistributing the C++ runtime, which is freely available from Microsoft. Once the C++ runtime is installed, than any program requiring the run time will continue to work. You would need to pass the proper option to tell the compiler which runtime to use. There is a good explanation here, Should I compile with /MD or /MT?
On Linux, I'd recommend redistributing libstdc++ instead of a static link. If their system libstdc++ works, I'd let the user just use that. System libraries, such as libpthread and libgcc should just use the system default. This requires compiling the program on a system with symbols compatible with all linux versions you are distributing for.
On Mac OS X, just redistribute the app with dynamic linking to libstdc++. Anyone using the same OS version should be able to use your program.
Some linux program for example mongodb binary file can run on different version linux whatever the host machine gcc version and glibc version.
How to do that? static link all libs? But I heard of glibc is not supposed to be static linked.
To make an executable that is independent of the installed libraries, you must statically link it.
However, if the application isn't very large/complex to build, it's often better to either distribute the source and build on/for the target system, or pre-build for the most popular variants.
The reason that you don't want to statically link glibc (and all other libs that the application may use) is that even the most simple application becomes about 700K-1MB. Given that my distribution has 1900 entries in /usr/bin, that would make it around 2GB minimum, where now it is 400MB (and that includes beasts like clang, emacs and skype, all weighing in at over 7MB in non-statically linked form - they probably have more than a dozen library dependencies each - clang, for example, grows from under 10MB to around 100-120MB if you compile it with static linking).
And of course, with static linkage, all the code for each application needs to be loaded into memory as a separate copy. So the overall memory usage goes up quite dramatically.
Is it possible to make a static linking (compilation) on Gtk(mm) program? I need the program to be less relaying on dependences in user's system.
I try:
g++ -static data/Area.h data/Picture.cpp data/GLScene.cpp data/KBDialog.cpp data/Dialogs.h data/FilePreview.cpp data/MainWindow.cpp prog.cpp -o prog `pkg-config --cflags --libs gtkmm-2.4 gtkglextmm-1.2 exiv2`
but It fails:
/usr/bin/ld: cannot find -lgtkmm-2.4
/usr/bin/ld: cannot find -lGL
/usr/bin/ld: cannot find -latkmm-1.6
/usr/bin/ld: cannot find -lgdkmm-2.4
/usr/bin/ld: cannot find -lpangomm-1.4
/usr/bin/ld: cannot find -lgdk_pixbuf-2.0
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_gid_name':
(.text+0x207a): warning: Using 'getgrgid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalvfs.o): In function `g_local_vfs_parse_name':
(.text+0x26c): warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1244): warning: Using 'getpwuid' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x1237): warning: Using 'setpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0x124f): warning: Using 'endpwent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libglib-2.0.a(gutils.o): In function `g_get_any_init_do':
(.text+0xf6e): warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(glocalfileinfo.o): In function `lookup_uid_data':
(.text+0x1eea): warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libX11.a(xim_trans.o): In function `_XimXTransSocketUNIXConnect':
(.text+0xe23): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe3c): warning: Using 'getservbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib/gcc/i686-linux-gnu/4.4.5/../../../../lib/libgio-2.0.a(gnetworkaddress.o): In function `g_network_address_parse':
(.text+0xe4c): warning: Using 'endservent' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
collect2: ld returned 1 exit status
I would rather avoid doing that, because GTK depends on tricky low level libraries which are really very system specific (perhaps libfontconfig.so etc), and contains system-specific information (e.g. builtin paths for fonts...).
I also think that GTK needs dynamic shared libraries to implement theming or styling (so GTK itself is calling dlopen, and having a statically linked libdl is not reasonable).
I suggest at least linking dynamically gtk and all its dependencies.
(Your question asked twice pisses me off, so here is a more detailed answer that I might edit and complete further)
Why dynamically linked shared libraries are useful?
First, almost every binary is dynamically linked on today's Linux systems. On my Debian/Sid system, I only have /sbin/ldconfig /bin/sash and /usr/bin/rar statically linked executables, but about seven thousand other dynamically linked executables (under /bin & /usr/bin). Even essential programs like /sbin/init are today dynamically linked.
There are several wins in having mostly dynamically linked ELF executables using shared libraries
Avoid wasting disk space. When dynamically linked executables did not exist (1986 era, SunOS3.5, because the kernel was not able to mmap file segments), people took a lot of time mixing several binaries in a single one (I remember textedit and cmdtool being the same binary, a mix of several programs on SunOS3.5) to win disk space. Ok, disk space is cheaper today, but if my seven thousand programs each had to link statically libc that would consume several gigabytes of disk space (and that would mean an extra DVD or hours of networking upload when installing a Linux distribution).
Enabling an easier update. When the packaging system (apt-get, dpkg and friends on Debian) upgrades a common shared library (like the GLibC or Gtk), it replaces the dynamically linked shared libraries (*.so files, called ELF shared objects) and all the future executions of binary using them take profit. So if /usr/lib/libgtk-3.so is updated, there is no need to update /usr/bin/gedit to take advantage of the bug fixes inside libgtk-3.so; just restarting gedit will make it profit of improvements in libgtk-3.so
More efficient overall RAM usage. A file like libc.so is used by almost every process, and even libgtk-3.so is used by dozens of processes. Most of it is mmap-ed read-only "text" segment (notably containing the executable binary machine code and read-only constants like string); this mapping is using the same RAM cells for every process using it. So the memory is shared
Legal compliance with LGPL license
The LGPL-2.1 license of GTK libraries is the only reason why you are legally allowed to use GTK (i.e. run GTK programs, and link your own program with GTK). This license gives you rights, in particular the one to improve GTK or take advantage of GTK improvements, but you should not prohibit users of your (e.g. proprietary) program linking /usr/lib/gtk-3.so to take advantage of improvements inside GTK itself. The section 6 of LGPL2.1 mention explicitly dynamic linking. You are not allowed to distribute statically linked GTK binaries without giving the user the mean to upgrade his GTK library. The most convenient way is having your GTK program dynamically linked against libgtk-3.so. A less easy alternative would be to distribute your statically linked executable with its object *.o files and instructions on how to re-link it statically against an hypothetical improved libgtk.a (which don't exist).
Plugin ability to dynamically load other library modules
A program can load some shared object at run time using the dlopen function (based upon the mmap system call, thru the -ldl library). This is how plugins are possible on Linux. GTK uses itself very actively this ability: theming, styling, and perhaps fonts are using dlopen and implemented by dlopen-ing appropriate stuff. Since dlopen is a public interface to the dynamic loader /lib64/ld-linux-x86-64.so.2, the -ldl library is a dynamically shared object libdl.so.2 sharing functionality and code with the dynamic loader (itself referenced in every dynamically linked executable as the "ELF interpreter"). It is uncommon and unwise to link -ldl statically. Even the libc.so library may load other modules (perhaps for DNS support, etc...); some functionalities are restricted in statically linked executables (see file /etc/nsswitch.conf etc.).
dynamic linking is slightly slower at startup time, since a program has to initiate and dynamically load (this is the role of ld-linux-x86-64.so.2) at startup all the dynamic libraries it needs. Code inside a dynamic library needs to be position independent code otherwise the relocation part of dynamically loaded libraries would be too big (and the relocation effort at start-up too long), which may cost an extra register (and this is mostly true on 32 bits x86 processors, much less on x86-64 or AMD64 64 bits ones) so makes up slightly bigger machine code (on 32 bits x86 machines, we are speaking of a few percents of size increase and runtime slowdown; on 64 bits machines, it is negligible). Of course, relocating hundred of thousands of external calls may take some time (and happens more with C++ code than with C code, perhaps because of name mangling issues).
Why you (Marco) should not statically link your GTK binary?
The five first points above should convince you that linking statically GTK is an evil thing to do. In particular, take attention of the legal aspects (LGPL): making knowingly an LGPL violation is a huge professional mistake, don't do that.
If you really wanted, with weeks of effort, you probably could be technically able (by recompiling and hacking GTK source code) to link statically your binary with GTK (with some reduced functionalities, like no theming), but that is probably unethical and useless. If your boss is stupid enough to require you that, try to convince him (or else find another job). And the very fact that you've asked on a public forum how to link statically GTK (which I am understanding as "how to violate the LGPL license") put you at risk. There are organizations -like gpl-violations- which take attention to that.
I don't see any useful reason to statically link a GTK program. Even proprietary programs using a GUI library are dynamically linked (a good example is AMD FGLRX driver and its companion programs like amdccle providing a Qt based graphical interface for installation).
Of course, you may want to deal with dependencies. Leave that to the package manager of your linux distribution.
If you want more help, please explain much more what you really want to do, and convince us that you don't ask help in violating a license. Better yet, try to distribute your software with a free license like e.g. GPLv3
I know the '-fPIC' option has something to do with resolving addresses and independence between individual modules, but I'm not sure what it really means. Can you explain?
PIC stands for Position Independent Code.
To quote man gcc:
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC.
Use this when building shared objects (*.so) on those mentioned architectures.
The f is the gcc prefix for options that "control the interface conventions used
in code generation"
The PIC stands for "Position Independent Code", it is a specialization of the fpic for m68K and SPARC.
Edit: After reading page 11 of the document referenced by 0x6adb015, and the comment by coryan, I made a few changes:
This option only makes sense for shared libraries and you're telling the OS you're using a Global Offset Table, GOT. This means all your address references are relative to the GOT, and the code can be shared accross multiple processes.
Otherwise, without this option, the loader would have to modify all the offsets itself.
Needless to say, we almost always use -fpic/PIC.
man gcc says:
-fpic
Generate position-independent code (PIC) suitable for use in a shared
library, if supported for the target machine. Such code accesses all
constant addresses through a global offset table (GOT). The dynamic
loader resolves the GOT entries when the program starts (the dynamic
loader is not part of GCC; it is part of the operating system). If
the GOT size for the linked executable exceeds a machine-specific
maximum size, you get an error message from the linker indicating
that -fpic does not work; in that case, recompile with -fPIC instead.
(These maximums are 8k on the SPARC and 32k on the m68k and RS/6000.
The 386 has no such limit.)
Position-independent code requires special support, and therefore
works only on certain machines. For the 386, GCC supports PIC for
System V but not for the Sun 386i. Code generated for the
IBM RS/6000 is always position-independent.
-fPIC
If supported for the target machine, emit position-independent code,
suitable for dynamic linking and avoiding any limit on the size of
the global offset table. This option makes a difference on the m68k
and the SPARC.
Position-independent code requires special support, and therefore
works only on certain machines.