Static linking of Glibc - c++

How can I compile my app so that glibc is linked statically, but only the code my app actually needs (not the whole library)?
My current compile command:
g++ -o newserver test.cpp ... -lboost_system -lboost_thread -std=c++0x
Thanks!

That's what -static does (as described in another answer): unneeded modules won't get linked into your program. But your expectations about how much stuff is needed (in the sense that we can't convince the linker otherwise) may be too optimistic.
If you are trying to do this for portability (running an executable on other machines with an older glibc or something like that), there is one easy test question to see if you're going to get what you want:
Did you think of the problem with libnss, and are you sure it is not going to bite you?
If your answer is yes, maybe it makes sense to go on. If the answer is no, or the question seems too obscure and you have no answer, just quit your experiments with statically linked glibc: it is more likely to hurt than to help.

Add -static to the compile line. It will only add what your application needs [and of course, any functions that the functions your application calls, and any functions those functions call, including a bunch of startup code and some other bits and pieces], so it will be around 800K (for a simple "hello world" program) on an x86 machine. Other architectures vary. Since boost probably also calls the standard library at least a little bit, it's likely that more than 800K will be added to your application. But it only adds functions used by code in the final binary, not the entire library [about 2MB as a shared library].
If you ONLY want to link glibc statically, you will need to change the linking part of your compile line to:
-Wl,-Bstatic -lc -Wl,-Bdynamic. This will prevent any other library from being linked statically [you sometimes need more than one of these statements, as sometimes something pulled in by another library requires "more" from glibc to be pulled in - don't worry, it won't bring in anything more than the linker thinks is necessary].

Related

How does the GNU linker decide what C/C++ library files are needed?

I'm building PHP7 on an OpenWRT machine (an ARM router). I wanted to include MySQL, so I had to build that as well. OpenWRT is 99.5% ordinary linux, but there are some weird building / shared library things that probably don't get exercised often, so I've run into some difficulties.
MySQL builds OK (after some screwing around) and I have a libmysqlclient.so that works. However, the configure process for PHP7 fails when trying to link the MySQL test program, because libmysqlclient.so must be linked with the C++ standard libraries, not the C standard libs. (MySQL is apparently at least partially C++, and it uses std::...stuff....) Configure tries to compile the test program with gcc, which doesn't include the C++ libraries in the link, so the test fails.
I bodged over this by making a simple C/C++ switching script: if the command line includes -lmysqlclient then I exec g++ $* else exec gcc $*. Then I told configure to use my script as the C compiler.
It occurs to me that there must be a better way to handle this, though. It seems like libmysqlclient.so should have some way to tell the linker that it also needs libstdc++.so, so that even if gcc is used to link, all the necessary libraries would get pulled in.
Is there some way to mark dependencies in libmysqlclient.so? Or to make configure smarter about running test programs?
You should virtually never try to link with the C++ standard library manually. Use g++ for linking C++ programs. The g++ driver knows the minute details of which libraries to use and where they live, so you don't have to.
Now the question is, when to use g++, and when not to. One possible answer to that question is "always use g++". There is no harm in it. g++ can link C programs just fine. There is no overhead in the produced program. There might be some performance loss in the link process itself, but it probably won't be noticeable for any but the most humongous of programs.

How can I statically link standard library to my C++ program?

I'm using Code::Blocks IDE(v13.12) with GNU GCC Compiler.
I want the linker to link static versions of the required runtime libraries for my programs. How can I do this?
I already know that my executable size will increase. Would you please tell me the other downsides?
What about doing this in Visual C++ Express?
Since nobody else has come up with an answer yet, I will give it a try. Unfortunately, I don't know that Code::Blocks IDE so my answer will only be partial.
1 How to Create a Statically Linked Executable with GCC
This is not IDE specific but holds for GCC (and many other compilers) in general. Assume you have a simplistic “hello, world” program in main.cpp (no external dependencies except for the standard library and runtime library). You'd compile and statically link it via:
Compile main.cpp to main.o (the output file name is implicit):
$ g++ -c -Wall main.cpp
The -c tells GCC to stop after the compilation step (not run the linker). The -Wall turns on most diagnostic messages. If novice programmers would use it more often and pay more attention to it, many questions on this site would not have been asked. ;-)
Link main.o (could list more than one object file) statically pulling in the standard and runtime library and put the executable in the file main:
$ g++ -o main main.o -static
Without the -o main switch, GCC would have put the final executable in the not so well-named file a.out (which originally stood for “assembler output”).
Especially at the beginning, I strongly recommend doing such things “by hand” as it will help get a better understanding of the build tool-chain.
As a matter of fact, the above two commands could have been combined into just one:
$ g++ -Wall -o main main.cpp -static
Any reasonable IDE should have options for specifying such compiler / linker flags.
2 Pros and Cons of Static Linking
Reasons for static linking:
You have a single file that can be copied to any machine with a compatible architecture and operating system and it will just work, no matter what version of what library is installed.
You can execute the program in an environment where the shared libraries are not available. For example, putting a statically linked CGI executable into a chroot() jail might help reduce the attack surface on a web server.
Since no dynamic linking is needed, program startup might be faster. (I'm sure there are situations where the opposite is true, especially if the shared library was already loaded for another process.)
Since the linker can hard-code function addresses, function calls might be faster.
On systems that have more than one version of a common library (LAPACK, for example) installed, static linking can help make sure that a specific version is always used without worrying about setting the LD_LIBRARY_PATH correctly. Obviously, this is also a disadvantage since now you cannot select the library any more without recompiling. If you always wanted the same version, why would you have installed more than one in the first place?
Reasons against static linking:
As you have already mentioned, the size of the executable might grow dramatically. This depends of course heavily on what libraries you link in.
The operating system might be smart enough to load the text section of a shared library into the RAM only once if several processes need the library at the same time. By linking statically, you void this advantage and the system might run short of memory more quickly.
Your program no longer profits from library upgrades. Instead of simply replacing one shared library with a (hopefully ABI compatible) newer release, a system administrator will have to recompile and reinstall every program that uses it. This is the most severe drawback in my opinion.
Consider for example the OpenSSL library. When the Heartbleed bug was discovered and fixed earlier this year, system administrators could install a patched version of OpenSSL and restart all services to fix the vulnerability within a day as soon as the patch was out. That is, if their services were linking dynamically against OpenSSL. For those that have been linked statically, it would have taken weeks until the last one was fixed and I'm pretty sure that there is still proprietary “all in one” software out in the wild that did not see a fix up to the present day.
Your users cannot replace a shared library on the fly. For example, the torsocks script (and associated library) allows users to replace (via setting LD_PRELOAD appropriately) the networking system library by one that routes their traffic through the Tor network. And this even works for programs whose developers never even thought of that possibility. (Whether this is secure and a good idea is subject of an unrelated debate.) Another common use-case is debugging or “hardening” applications by replacing malloc and the like with specialized versions.
In my opinion, the disadvantages of static linking outweigh the advantages in all but very special cases. As a rule of thumb: link dynamically if you can and statically if you have to.
A Addendum
As Alf has pointed out (see comments), there is a special GCC option to selectively link in the C++ standard library statically but not link the whole program statically. From the GCC manual:
-static-libstdc++
When the g++ program is used to link a C++ program, it normally automatically links against libstdc++. If libstdc++ is available as a shared library, and the -static option is not used, then this links against the shared version of libstdc++. That is normally fine. However, it is sometimes useful to freeze the version of libstdc++ used by the program without going all the way to a fully static link. The -static-libstdc++ option directs the g++ driver to link libstdc++ statically, without necessarily linking other libraries statically.
In Visual C++, the /MT option does a static link and the /MD option does a dynamic link. (see http://msdn.microsoft.com/en-us/library/2kzt1wy3.aspx)
I'd recommend using /MD and redistributing the C++ runtime, which is freely available from Microsoft. Once the C++ runtime is installed, then any program requiring the runtime will continue to work. You would need to pass the proper option to tell the compiler which runtime to use. There is a good explanation here: Should I compile with /MD or /MT?
On Linux, I'd recommend redistributing libstdc++ instead of a static link. If their system libstdc++ works, I'd let the user just use that. System libraries, such as libpthread and libgcc should just use the system default. This requires compiling the program on a system with symbols compatible with all linux versions you are distributing for.
On Mac OS X, just redistribute the app with dynamic linking to libstdc++. Anyone using the same OS version should be able to use your program.

position independent executable (-pie) for arm(cortex-m3)

I'm programming for stm32 (Cortex-m3) with codesourcery g++ lite(based on gcc4.7.2 version). And I want the executables to be loaded dynamically.
I know I have two options available:
1. relocatable ELF, which needs an ELF parser.
2. position independent code (PIC) with a global offset register
I prefer PIC with a global offset register, because it seems easier to implement and I'm not familiar with ELF or any ELF library. Also, it's easy to generate a .bin file from an ELF file with some tools.
I've tried building my program with "-msingle-pic-base -fpic" compiling options and "-pie" linking options, but then I got a linking error:
...path...ld.exe: ...path...thumb2\libstdc++.a(pure.o): relocation
R_ARM_THM_MOVW_ABS_NC against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
I don't quite understand the error message. It seems the default standard c/c++ library can't go with my options and I need to get the source of the library and rebuild for my own purpose.
So,
1. Could anyone provide me any useful information/link on how to work with the position independent executable ?
2. with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
Note: Without the "-pie" linking option I can build the program. But the program fails when calling a c++ virtual function (when I'm using the IDE(keil)'s simulator to debug my program). I don't understand what's going on and what I've been missing.
----------------------------------------------------------------------
-- added 20130314
with the -msingle-pic-base option, I don't need to care too much about the GOT and ld script anymore, right?
From my experiments, the register (r9 is used in my program) should point to the beginning of the got.plt section. If I delete the "-pie" option, linking succeeds, and (with r9 properly set) the C++ virtual function is called successfully. However, I still think the "-pie" option is important, as it may ensure that the current standard library is position independent. Could anyone explain this to me?
----------------------------------------------------------------------
-- added 20130315
I took a look at the ABI documents on ARM's website, but they were of little help because they do not target a specific platform. There seems to be a concept of EABI (I'm using Sourcery's arm-none-eabi edition), but I couldn't find any documentation on "EABI" on ARM's website. I can't find documentation on this topic from Sourcery or gcc either. There is more than one implementation of PIC, so which one does Sourcery G++ use in the none-eabi case? I think the behaviors of the "-msingle-pic-base", "-fpie" and "-pie" options are very poorly documented!
-----------------------------------------------------------------------
From the disassembly, I just figured out that, with "-msingle-pic-base", r9 should point to the base address of the ".got" section. The pointers in the .got section are absolute pointers, and the addressing of variables is similar to the description in the article: Position Independent Code (PIC) in shared libraries. So I still need to modify the ".got" section on loading. I don't know what the ".got.plt" section is used for in my program. It seems that function calls use PC-relative addressing.
How to build with the "-pie" or how to link a standard library compiled with "-fpic" is still a problem for me.
The error message tells you to recompile the libstdc++ library, which is usually built when the gcc compiler itself is built.
Thus you must recompile your standard libraries (libstdc++, libgcc_*, libc, libm and all the rest) with -fPIC and link your project against them.
If you rely on prebuilt compiler packages, you're mostly out of luck in the microcontroller world. If you build your compiler yourself (which is, by the way, not too difficult, but an advanced/expert task), you are good to go.
It is also possible to compile your standard libraries yourself with the compiler you have. You will need the library sources, and you will have to figure out how the compiler package's build system builds them and mimic it. Perhaps there are some experts here who can advise you on this.
There's a nice blog post on this topic, eight years after asking the question initially, but it's there: https://mcuoneclipse.com/2021/06/05/position-independent-code-with-gcc-for-arm-cortex-m/
The general outline is that you have to:
Set up GOT from linker-generated information
Set up PLT from Program Header information
Implement a binder based on the GOT entries
Compile your library as a shared relocatable binary: -msingle-pic-base -mpic-register=r9 -mno-pic-data-is-text-relative -fPIC
Set R9 accordingly

Man Bites Dog: symbol resolved *without* linking library? clock_gettime() -lrt

I have a C++ source tree developed under Ubuntu 12.04 using clang++ 3.2 that builds some libraries, then compiles some applications with these libraries and the usual collection of other various system libraries. Two puzzles. Client reports that code fails to build with undefined reference to clock_gettime(). Sure enough, I did not include the obligatory "-lrt" in the build logic (scons).
First puzzle: This compiles, links and executes correctly and without complaint on my system even though I do not specify "-lrt" anywhere! How does this symbol get correctly resolved? I suspect it is because the application links against a dynamic library that itself requires librt but I don't understand the logic behind why this would happen?
Second puzzle: Assuming a satisfactory explanation of how clock_gettime() gets resolved without "-lrt", why would this happen on my system but not on client's very similar setup?
"... a riddle, wrapped in a mystery, inside an enigma" --- Winston Churchill
Suggestions for tools to reveal what is really going on here would be most welcomed.
From SUSv4 (Utilities/c99):
-l rt
This option shall make available all interfaces referenced in <aio.h>, <mqueue.h>, <sched.h>, <semaphore.h>, and <spawn.h>, interfaces marked as optional in <sys/mman.h>, interfaces marked as ADV (Advisory Information) in <fcntl.h>, and interfaces beginning with the prefix clock_ and time_ in <time.h>. An implementation may search this library in the absence of this option.
I presume the above suffices to at least justify why this behavior is allowed by POSIX.
Most likely the involved systems are different in this regard. For example, clock_gettime() might be implemented in libc for you, but in librt for your client. Don't take chances: use a portable -l rt and forget about the issue.
A more obvious example is -l xnet, which behaves similarly according to POSIX, but, at least on Gentoo, Debian and Ubuntu Linux systems, compiling with -l xnet actually yields an error. (libxnet purportedly contains the implementation for the UNIX sockets interface.)
If you want to further investigate the issue, try ldd in GNU/Linux systems. ldd should display the dynamic dependencies for your binary. I'd bet that clock_gettime() is simply implemented in libc.so.

g++ produces big binaries despite small project

Probably this is a common question. In fact I think I asked it years ago... but I can't remember the answer.
The problem is: I have a project that is composed of 6 source files. All of them no more than 200 lines of code. It uses many STL containers, stdlib.h and iostream. Now the executable is around 800kb in size.... I guess I shouldn't statically link libraries. How to do this with GCC? And in Eclipse CDT?
EDIT:
As the responses are drifting away from what I want, I think a clarification is in order. What I want to know is why such a small program is so big, how that relates to static and shared libraries, and what the difference between them is. If it's too long a story to tell, feel free to give pointers to docs. Thank you
If you give g++ dynamic library names, and don't pass the -static flag, it should link dynamically.
To reduce size, you could of course strip the binary, and pass the -Os (optimize for size) optimization flag to g++.
One thing to remember is that using the STL results in having that extra code in your executable even if you are dynamically linking with the C++ library. This is by virtue of the fact that the STL is a bunch of templates that aren't actually compiled until you write and compile your code. Since the library can't anticipate what you might store in a container, there's no way for the library to already contain the code for that particular usage of the container. Same goes with algorithms and everything else in the STL.
I'm not saying this is definitely the reason your executable is so much larger than you expect. But it may be a factor.
Use -O3 and -s flags to produce the most optimized binary. Also see this link for some more information.
If you are building for Windows, consider using the Microsoft compiler. It always produces the smallest binary on that platform.
Eclipse should be linking dynamically by default, unless you've set the static flag on the linker in your makefile.
In response to your EDIT :
-when you link statically, the executable contains its own copy of the library code it uses (the linker pulls the needed object files out of each static library).
-when you link dynamically, the executable only contains references and hooks to the linked libraries, which is a much much smaller amount of code.
The executable has to contain more than just your code.
At the very least, it contains some startup code, setting up the environment and if necessary, loading any external libraries, before the program launches.
If you've statically linked the runtime library, you also get that included in your executable. Otherwise you only get a small stub, just big enough to redirect system calls to the external runtime.
It may, depending on compiler settings also include a lot of debugging info and other non-essential data. If optimizations are enabled, that may have increased code size as well.
The real question is why does this matter? 800KB still fits easily on a floppy disk!
Most of this is a one-time cost. It doesn't mean that if you write twice as much code, it'll take up 1600KB. More likely, it'll take 810KB or something like that.
Don't worry about one-time startup costs.
The size usually results from static libraries being linked into your application.
You can reduce the size of the compiled binary by compiling to RELEASE versions, with optimizations to binary size.
Another source of executable size is the libraries. You said that you don't use external libraries except for the standard library, so I believe you're including the C runtime in your executable, i.e. linking statically. Check whether you can link dynamically instead.
IMO you shouldn't really worry about that, but if you're really paranoid, check this: Smallest x86 ELF Hello World
Use Visual C++ 6.0.
It is supported on Windows 95 through Windows 7,
and it can target x86, but only on Windows.
So if you are a Windows user, just stick with Windows compilers rather than GCC, which actually sucks. Most people who say Visual C++ sucks say so because they are anti-Microsoft.
Also remember to use "Visual C++ 6.0": if you use a newer version, you probably won't be able to run your files on Windows 95. I have tested all of this, which is why I say so.
GCC produces the largest binaries; Visual C++ does not. The Intel compiler can save more than 30% of the space, but it requires an Intel processor, otherwise performance will be horrible.
Another thing you need to remember is that when you use templates, even though you only see a few lines of source,
those functions are expanded when you compile, so the result is larger binaries.
If you need smaller binaries, I suggest moving to C, because C is widely used, though not object-oriented.
In fact, C is easier to use than C++.
This makes more sense than the C++ version, for example:
cout << "Hello World" << endl;
printf("%s","Hello World");
The second one says "print formatted"; %s means you are passing a string, so it's easy. :P