I have a static library (lib.a) and a program that links to it. The library doesn't have any entry point that would always be called before using it, but I need to execute a piece of code very early in the program (preferably before main() starts). Therefore I thought I would use static variable of my own class. I added new source file that contains something like:
#include <MyClass.h>
static MyClass myVar;
The constructor of MyClass would then execute my code. When I link lib.a and try executing "nm" on it I get information that myVar is there. However, when I link my program and try "nm" on it I do not see myVar. When I put this piece of code into an existing file then the symbol is visible in the final executable. Why is that? Can linker omit object file from lib.a library in this case? I know that the variable is not referenced from outside (it cannot be as it is static) but it should execute code on it's own and therefore I don't get why should it be removed.
In case it makes a difference I'm using some old SunPro compiler.
Technically speaking, the linker should be forced to include that object file while compiling your program. However, support for this is buggy in many compilers, such as MSVC++. Adding an external reference somewhere in your main program should force that object file to be included.
Also note that in the case of nm, it's possible that your static initializer was inlined, and therefore the symbol need not exist in your final binary. Try something with side effects (such as a std::cout statement) in your static, and make sure it doesn't run before blaming the compiler :)
It turns out that what the linker does is pretty standard (I don't mean C++ standard, just generally observer behaviour) and you can work around it. In GNU ld it is --whole-archive option, in my case of Sun tools it is -z allextract. Which didn't actually work as expected for my project, so I used some magic with weak symbols an -z weakextract to achieve what I wanted.
Related
This code:
void undefined_fcn();
void defined_fcn() {}
struct api_t {
void (*first)();
void (*second)();
};
api_t api = {undefined_fcn, defined_fcn};
defines a global variable api with a pointer to a non-existent function. However, it compiles, and to my surprise, links with absolutely no complaints from GCC, even with all those -Wall -Wextra -Werror -pedantic flags.
This code is part of a shared library. Only when I load the library, at run-time, it finally fails. How do I check, at library link-time, that I did't forget to define any function?
Update: this question mentions the same problem, and the answer is the same: -Wl,--no-undefined. (by the way, I guess this could even be marked as duplicate). However, according to the accepted answer below, you should be careful when using -Wl,--no-undefined.
This code is part of a shared library.
That's the key. The whole purpose of having a shared library is to have an "incomplete" shared object, with undefined symbols that must be resolved when the main executable loads it and all other shared libraries it gets linked with. At that time, the runtime loader attempts to resolve all undefined symbols; and all undefined symbols must be resolved, otherwise the executable will not start.
You stated you're using gcc, so you are likely using GNU ld. For the reason stated above, ld will link a shared library with undefined symbols, but will fail to link an executable unless all undefined symbols are resolved against the shared libraries the executable gets linked with. So, at runtime, the expected behavior is that the runtime loader is expected to successfully resolve all symbols too; so the only situation when the runtime loader fails to start the executable will indicate a fatal runtime environment failure (such as a shared library getting replaced with an incompatible version).
There are some options that can be used to override this behavior. The --no-undefined option instructs ld to report a link failure for undefined symbols when linking a shared libraries, just like executables. When invoking ld indirectly via gcc this becomes -Wl,--no-undefined.
However, you are likely to discover that this is going to be a losing proposition. You better hope that none of the code in your shared library uses any class in the standard C++ or C library. Because, guess what? -- those references will be undefined symbols, and you will fail to link your shared library!
In other words, this is a necessary evil that you need to deal with.
You can't have the compiler tell you whether you forgot to define the function in that implementation file. And the reason is when you define a function it is implicitly marked extern in C++. And you cannot tell what is in a shared library until after it is linked (the compiler's linker does not know if the reference is defined)
If you are not familiar with what extern means. Things marked extern signal external linkage, so if you have a variable that is extern the compiler doesn't require a definition for that variable to be in the translation unit that uses it. The definition can be in another implementation file and the reference is resolved at link time (when you link with a translation unit that defines the variable). The same applies for functions, which are essentially variables of a function type.
To get the behavior you want make the function static which tells the compiler that the function is not extern and is a part of the current translation unit, in which case it must be defined -Wundefined-internal picks up on this (-Wundefined-internal is a part of -Werror so just compile with that)
I found one question about compiling and linking in C++ and I don't know which answer is correct. It was discussed with my friends and opinions are divided. Here is a question:
In order to run program written in C++ language its source code is:
(A) compiled to machine code,
(B) compiled and linked to machine code
In my opinion the correct answer is A but I don't have any source to prove it.
Google, first hit.
Linkage is needed as well to create a standalone executable.
You need to link the code you have produced to make it into an executable file. For simple programs, the compiler does this for you, by calling the linker at the end of the compilation process.
The compiler proper simply translates C code to either assembler (classic C compiler) which is then assembled with an assembler or directly to machine code (many modern compilers). The machine code is usually produced as "object files", which are not "executable", because they refer to external units - such as when you call printf(). It is possible to write C code that is completely standalone, but you still typically need to combine more than one object file, and it certainly needs to be "formatted" to the right way to make an executable file - which is a different file-format than an object file [although typically fairly SIMILAR].
Compilation does nothing except creation of object files which means converting C/C++ source code to machine codes.
Linking process is the creation of executable file from multiple obj files. So for running an application/executable you have to also link it.
During compilation, compiler doesn't complain about non existing functions or broken functions, because it will assume it might be defined in another object (source code file). Linker verifies all functions and their existance, so if you have a broken function, you'll get error in linking process
Compiling: Takes input C/C++-code and produces machinecode (object file)
gcc –c MyProgram.c
Note that the object file does not contain all external references!
Linking: Combines object file with external references into an executable file
gcc MyProgram.o –o MyProgram
Note that no unresolved references!
Illustration:
Where libc.a is the standard C library and it's automatically linked into your programs by the gcc.
I've just noticed that your question was about c++, the same concept is in c++ too, if you understand this, you'll understand how it works in c++ too
strictly speaking. Answer A.
But for you to see the whole picture, lets say you have defined some function. Then the compiler writes the machine code code of that function at some address, and puts that address and the name of the function in the object ".o" file where the linker can find it. The linker then take this "machine code" and resolve the symbols as you might heard in some previous error.
Motivation
I have 2 static libraries, libStatic1.a and libStatic2.a. In addition, I have many SOs (Shared Objects) that compile with libStatic1.a.
Up until now, libStatic1.a and libStatic2.a were independent and everything was okay. But now I added to the code that generates libStatic1.a a dependency on the code that generates libStatic2.a. Therefore, any SO that depends on libStatic1.a now needs to be compiled with libStatic2.a. This is undesirable, because it adds a dependency on libStatic2.a to every build targets that depends on libStatic1.a.
only on libStatic1.a now need to compile their code with libStatic2.a in order for the compilation/runtime to succeed/not crash. This creates an unnecessary coupling and I would like to avoid it.
Therefore, I need to somehow "embed" the object code of libStatic2.a in libStatic1.a. If I would just compile libStatic1.a with all the object files of libStatic2.a (in addition to its own), it will basically contain it but this creates another problem- If some user of libStatic1.a will decide to use libStatic2.a and will link it, he will get a weird "multiple definitions" error. If I could somehow tell the compiler to generate the object files of libStatic2.a with weak symbols (only for the use in libStatic1.a) this would solve the problem- no one will get multiple definitions, and no makefile of all the many SOs that use libStatic1.a will need to change.
My thinking: I know that it is possible (using GCC/g++ extensions to the C language) to declare a function with the keyword __attribute__ and the weak attribute as following:
void __attribute__((weak)) foo(int j);
Is there a way to tell the compiler (g++) to compile an entire compilation unit as "weak", meaning all its global symbols in the symbol table will be considered weak when linking?
Alternatively, is there a way to tell the linker (ld) to consider all the symbols of some object file/library as if they are weak?
If your library is small, the simplest way is still to change the declarations by adding manually the __attribute__((weak)).
Another possibility might be to ask g++ to spill the assembly code (with -S) and have some (perhaps awkor ed) script work on it.
You could also code a GCC plugin (assuming your g++ is a 4.6 version) or a GCC MELT extension for that.
Compile it normally and then objcopy the object file with --weaken.
No, there doesn't appear to be; are there so many weak external functions that it's not practical to set their attributes individually?
Under Solaris 10, I'm creating a library A.so that calls a function f() which is defined in library B.so. To compile the library A.so, I declare in my code f() as extern.
Unfortunately, I "forgot" to declare in A's makefile that it has to link with B.
However, "make A" causes no warning, no error, and the library A.so is created.
Of course, when executing A's code, the call of f() crashes because it is undefined.
Is there a way (linker option, code trick...) to make the compilation of library A fail ?
How can I be sure that all symbols refered to in library A are defined at compile time ?
Thanks for any suggestions.
Simplest way: add a "test_lib" target in the makefile, that will produce a binary using all the symbols expored from libraryA. (doesn't have to be anything meaningful... just take the address, no need to call the function or anything, it just needs to be referenced).
Tanks
I think I found something interesting and even simpler in the linker's manual (d'ho)
The -z defs option and the --no-undefined option force a fatal error if any undefined symbols remain at the end of the link. This mode is the default when an executable is built. For historic reasons, this mode is not the default when building a shared object. Use of the -z defs option is recommended, as this mode assures the object being built is self-contained. A self-contained object has all symbolic references resolved internally, or to the object's immediate dependencies.
I am writing a hello world c++ application, in the instruction #include help the compiler or linker to import the c++ library. My " cout << "hello world"; " use a cout in the library. The question is after compile and generated exe is about 96k in size, so what instructions are actually contained in this exe file, does this file also contains the iostream library?
Thanks
In the general case, the linker will only bring in what it needs. Once the compiler phase has turned your source code into an object file, it's treated much the same as all other object files. You have:
the C start-up code which prepares the execution environment (sets up argv, argv and so on) then calls your main or equivalent.
your code itself.
whatever object files need to be dragged in from libraries (dynamic linking is a special case of linking that happens at runtime and I won't cover that here since you asked specifically about static linking).
The linker will include all the object files you explicitly specify (unless it's a particularly smart linker and can tell you're not using the object file).
With libraries, it's a little different. Basically, you start with a list of unresolved symbols (like cout). The linker will search all the object files in all the libraries you specify and, when it finds an object file that satisfies that symbol, it will drag it in and fix up the symbol references.
This may, of course, add even more unresolved symbols if, for example, there was something in the object file that relies on the C printf function (unlikely but possible).
The linker continues like this until all symbols are satisfied (when it gives you an executable) or one cannot be satisfied (when it complains to you bitterly about your coding practices).
So as to what is in your executable, it may be the entire iostream library or it may just be the minimum required to do what you asked. It will usually depend on how many object files the iostream library was built into.
I've seen code where an entire subsystem went into one object file so, that if you wanted to just use one tiny bit, you still got the lot. Alternatively, you can put every single function into its own object file and the linker will probably create an executable as small as possible.
There are options to the linker which can produce a link map which will show you how things are organised. You probably won't generally see it if you're using the IDE but it'll be buried deep within the compile-time options dialogs under MSVC.
And, in terms of your added comment, the code:
cout << "hello";
will quite possibly bring in sizeable chunks of both the iostream and string processing code.
Use cl /EHsc hello.cpp -link /MAP. The .map file generated will give you a rough idea which pieces of the static library are present in the .exe.
Some of the space is used by C++ startup code, and the portions of the static library that you use.
In windows, the library or part of the libraries (which are used) are also usually included in the .exe, the case is different in case of Linux. However, there are optimization options.
I guess this Wiki link will be useful : Static Libraries