How to check why some symbol is required for linkage? - c++

This question got cumbersome, so let's try a short version:
Usually, when a link fails with an unresolved symbol reference, the cause is quite straightforward: you call something that the linker can't find. You feed the linker the right library and it just works. Sometimes, though, you end up banging your head against the wall because you can't see why the linker wants a given symbol at all: it is not called, at least not directly. Is there a tool or linker switch that can explain why the linker thinks the symbol is needed "here"?
The original question:
It is all about static linkage. I have a small utility: a couple of lines of code and a couple of includes. The utility is linked statically with a library named lib1. Let's say lib1 depends on another library, lib2, because lib1 uses the symbol sym1 from lib2. However, the utility neither uses sym1 nor calls anything from lib1 that might depend on lib2. Nevertheless, the tiny utility fails to link with an unresolved symbol error for sym1. The first question is: why? Since sym1 is not required anywhere in the utility, and the utility uses no symbol from lib1 that uses sym1, why does the linker bother looking for this symbol in the first place? The second question: it is possible that the inclusion chain introduces sym1 into my utility, which would answer the "why", but it should not (at least there is no obvious reason for it). So how do I find out why the linker thinks the utility needs sym1 from lib2?
What/where/why: Linux, C/C++, GCC-9/Clang-9

Well, apparently I managed to answer the question without seeing either the code or the error message. Time to open my psi-consultancy.
Concerning linking for a Linux/ELF target, it is important to remember that the linker, while resolving symbols, merges sections (aka segments) and copies them into the final executable. Typically an app has .text (code), .rodata (read-only data), .data (read/write initialized data), .bss (uninitialized data), etc. So if the needed symbol is one of, say, three functions in one compiled file, the whole .text section of that file gets picked. And if the functions that are present in the section but unused call something else, the linker will start searching for that "something else" to satisfy the reference, even if it is irrelevant to the application.
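A minimal sketch of the situation described above, assuming a hypothetical source file that ends up inside lib1. The names used_by_app and unused_helper are illustrative and not from the original code, and the stand-in definition of sym1 exists only so that the sketch links by itself.

```cpp
// lib1_part.cpp: hypothetical source file that ends up inside lib1.
// Compiled without -ffunction-sections, both functions below share
// one .text input section of the resulting object file.

// In the real scenario sym1 is only declared here and defined in lib2.
// This stand-in definition is an assumption made so the sketch links
// by itself; remove it to get an undefined reference to sym1.
extern "C" int sym1() { return 42; }

// The only function the utility actually calls.
int used_by_app(int x) { return x + 1; }

// Never called by the utility, but it lives in the same object file,
// so its reference to sym1 still has to be resolved at link time.
int unused_helper() { return sym1() * 2; }
```

Calling only used_by_app from the utility is enough for the linker to pull in this whole object file, and with it the reference to sym1.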
Plus, there is a C++-specific twist: for a class with virtual functions the compiler generates a vtable with pointers to each virtual function and places this table in the .rodata section. Note that something we think of as code actually ends up in a (read-only) DATA section.
If all but one of the virtual functions are defined, the linker will most likely complain with an error message like
/tmp/cc5YTcBb.o:(.rodata._ZTV3CL1[_ZTV3CL1]+0x18): undefined reference to `CL1::fnc2()'
where you can see that the problem is with .rodata, not .text.
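A small sketch of how such an error can arise, assuming a class shaped like the CL1 in the message above. The class body and the helper call_fnc1_only are guesses made for illustration, not the original code.

```cpp
// Hypothetical class mirroring the CL1 named in the error message.
struct CL1 {
    virtual int fnc1();
    virtual int fnc2();
};

int CL1::fnc1() { return 1; }

// The vtable of CL1 stores the address of every virtual function.
// Deleting this definition leaves the vtable entry for fnc2 unresolved,
// even though nothing in the program calls fnc2 directly.
int CL1::fnc2() { return 2; }

// Only fnc1 is ever called, but creating a CL1 object needs the vtable.
int call_fnc1_only() {
    CL1 obj;
    CL1& ref = obj;
    return ref.fnc1();
}
```

The point of the sketch is that the reference to fnc2 comes from the vtable data, not from any call in the code.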
Moral of the story: chop your code and data into a large number of the smallest possible sections, your atoms of linking. Ideally, each function goes into its own section, as does each piece of initialized or read-only data.
The final step is to instruct the linker (via the -Wl option) to discard (garbage-collect) all unused sections.
In general, expect the linker to use more RAM and the link stage to be somewhat slower, probably in exchange for a smaller and faster app.
Here is the command line to use; see the GCC manual for the meaning of each option.
g++ -fdata-sections -ffunction-sections -fipa-pta main.cpp -Wl,--gc-sections -Wl,-O1 -Wl,--as-needed

Related

Understanding and resolving the linker warning "ld: warning: direct access in function 'Foo' to global weak symbol 'Bar'

I'm working on multiple VST3 audio plugin projects for macOS. Technically, an audio plugin is a shared library (.dylib) loaded by a host application at runtime via dlopen. I'm linking both plugins against a self-built static version of protobuf 3.20.0 and I'm building my plugins using CMake.
I now had a bug where both plugins defined a protobuf message with the same name, so both plugins defined symbols with the same name but different content and as soon as I loaded the second one crashes occurred. The call stacks revealed that both protobuf messages interfered with each other, as e.g. destructors invoked in plugin a ended up calling the destructor symbol in plugin b.
Specifying CXX_VISIBILITY_PRESET hidden on my CMake target that encapsulates the protobuf library dependency and the plugin-specific protobuf messages, solves the problem, but leaves me with linker warnings like
ld: warning: direct access in function 'google::protobuf::internal::InternalMetadata::Container<google::protobuf::UnknownFieldSet>* google::protobuf::Arena::Create<google::protobuf::internal::InternalMetadata::Container<google::protobuf::UnknownFieldSet> >(google::protobuf::Arena*)' from file 'plugin_artefacts/Debug/libplugin_SharedCode.a(Message.pb.cc.o)' to global weak symbol 'typeinfo for google::protobuf::internal::InternalMetadata::Container<google::protobuf::UnknownFieldSet>' from file '/path/to/lib/libprotobuf.a(descriptor.pb.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.
I found a lot of posts that simply suggest to build everything with -fvisibility=hidden but none that actually has an in-depth explanation about the actual problem that this warning is about.
What I got so far is that libprotobuf.a defines the symbol in question as a weak symbol, that means a symbol that can be overridden at runtime. Right? This aligns with the output of
objdump --syms --demangle libprotobuf.a
which reveals something like
0000000000010470 w O __DATA,__const typeinfo for google::protobuf::internal::InternalMetadata::Container<google::protobuf::UnknownFieldSet>
so, yes there is a weak symbol in there.
However, I don't quite get if and why this is problematic? As far as I get it, this symbol should only be relevant internally to my plugin and does not need to be exposed public anyway, so is there any need/benefit in it being declared weak? And why is it a weak symbol at all? If I get it right, this is a compiler generated symbol. Can I even alter the visibility of it? Should I?
I tried re-compiling libprotobuf.a with -fvisibility=hidden as additional entry in the CXXFLAGS environment variable before running the protobuf configure script, but inspecting the output still revealed the symbol being declared weak. Now I'm not sure if this is just because I passed the flag in a wrong way to the configure script (other flags handled the same way seem to have an effect though) or if this is just not the right thing to do.
In any case, after experiencing the bug described above, I would prefer to gain a deep understanding of the issue and then pick the right solution based on that understanding, rather than just applying some compiler option to make it work without knowing what happened and what other unexpected side effects that could entail.
Links to blog posts etc. which discuss this topic in depth are also highly welcome – I rarely found anything like that during my own research.

Why does the program work after removing the symbol information?

I built a shared object (.so) file with clang, passing the option "-Xlinker --strip-all" to hinder reverse engineering.
Thanks to this, most symbols other than the functions directly exposed to the outside no longer appear (objdump -TC test.so). My question: if a symbol is deleted like this, I would expect that it can no longer be used inside the program, yet the program still works. What am I missing?
You're right, debugging symbols aren't needed by the program itself to execute; the linker computes (and therefore knows at link-time) what the memory-address of each function/global-variable/etc will be at run-time, so it can just place that memory-address directly into the executable where necessary.
The symbols are there for a debugger to use, to make the debugging output easier for a human (or a debugging tool) to use and understand.

Garbage from other linking units

I asked myself the following question when I was discussing this topic.
Are there cases when some unused code from translation units will be linked into the final executable (in release mode, of course) by popular compilers like GCC and VC++?
For example suppose we have 2 compilation units:
//A.hpp
//Here are declarations of some classes, functions, extern variables etc.
And source file
//A.cpp
//definitions of the A.hpp declarations
And finally main
//main.cpp
//including A.hpp library
#include "A.hpp"
//here we will use some stuff from A.hpp library, but not everything
My question is: what if main.cpp does not use all the stuff from A.hpp? Will the linker remove all the unused code, or are there cases where some unused code gets linked into the executable file?
Edit: I'm interested in G++ and VC++ linkers.
Edit: Of course I mean in release mode.
Edit: I'm starting a bounty on this question to get a good and complete answer. I'm expecting an answer that explains in which cases the g++ and VC++ linkers link junk, what kind of code they are able to remove from the executable file (unneeded functions, unneeded global variables, unneeded class definitions, etc.), and why they aren't able to remove some kinds of unneeded stuff.
As other posters have indicated, the linker typically does not remove dead code before building the final executable. However, there are often Optimization settings you can use to force the linker to try extra hard to do this.
For GCC, this is accomplished in two stages:
First compile the source, but tell the compiler to place the code into separate sections within the translation unit. This is done for functions, classes, and external variables by using the following two compiler flags:
-fdata-sections -ffunction-sections
Link the translation units together using the linker optimization flag (this causes the linker to discard unreferenced sections):
-Wl,--gc-sections
So if you had one file called test.cpp that had two functions defined in it, but one of them was unused, you could omit the unused one with the following command to gcc (g++):
g++ -Os -fdata-sections -ffunction-sections test.cpp -o test -Wl,--gc-sections
(Note that -Os is an additional compiler flag that tells GCC to optimize for size)
I have also read somewhere that linking static libraries is different though. That GCC automatically omits unused symbols in this case. Perhaps another poster can confirm/disprove this.
As for MSVC, as others have mentioned, function level linking accomplishes the same thing.
I believe the compiler flag for this is (to sort things into sections):
/Gy
And then the linker flag (to discard unused sections):
/OPT:REF
EDIT: After further research, I think that bit about GCC automatically doing this for static libraries is false.
The linker will not remove code.
You can still access it via dlsym dynamically in your code.
In general, linkers tend to include everything from the object files explicitly passed on the command line, but only pull in those object files from a static library that contain symbols needed to resolve external references from object files already linked.
However, a linker may decide to discard functions that are never called, or data which is never referenced. The precise details will depend on the compiler and linker switches.
In C++ code, if a source file is explicitly compiled and linked in to your application then I would expect that the objects with static storage duration that have constructors and/or destructors will be included, and their constructors/destructors run at the appropriate times. Consequently, any code called from those constructors or destructors must be in the final executable. However, if the code is not called from anywhere then you cannot write a program to tell whether or not the code is included without using things like dlsym, so the linker may well omit to include it in the final executable.
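To illustrate the point about objects with static storage duration, here is a minimal sketch. The names registry, Registrar and plugin_a_registrar are hypothetical and chosen only for this example. Nothing refers to the object by name, yet its constructor has an observable side effect, so the constructor and everything it calls have to be present in the executable once this object file is linked in.

```cpp
#include <string>
#include <vector>

// Registry of names that gets filled in during static initialization.
std::vector<std::string>& registry() {
    static std::vector<std::string> names;  // constructed on first use
    return names;
}

// Hypothetical helper type whose constructor has a side effect.
struct Registrar {
    explicit Registrar(const std::string& name) {
        registry().push_back(name);
    }
};

// No code refers to this object, but its constructor must still run
// during static initialization, so the linker has to keep it.
static Registrar plugin_a_registrar{"plugin_a"};
```

By the time main starts, registry() already contains "plugin_a". Note that this holds when the object file is passed to the linker explicitly; if it sits in a static library and nothing references it, the whole object file may never be pulled in.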
I would also expect that any symbols defined with global visibility such that they could be found via dlsym (as opposed to "hidden" symbols which are only visible within the executable) would be present in the final executable. However, this is an expectation rather than something I have confirmed by testing or reading the docs.
If you want to ensure code is in your executable even if it isn't called from inside it, you could load it as a statically aware dynamic link library (a statically aware library is one that is loaded into memory automatically when the program is loaded, as opposed to one you load by passing a string to a function and then manually searching for hooks).

Detect duplicate definitions of a variable in shared library

It appears the GCC linker doesn't care about one variable being defined in two files. I suspect this is the cause of the trouble a 3rd-party library is causing us.
Take this:
File a.cpp contains:
int foo;
//do things with it.
File b.cpp contains:
int foo;
//do other things with it.
File c.cpp contains:
extern int foo;
//do other things with it.
They are all compiled by gcc to .o files, then linked as shared object.
gcc -fPIC -c a.cpp
gcc -fPIC -c b.cpp
gcc -fPIC -c c.cpp
ld *.o -shared -soname mylib -o mylib
The linker doesn't complain at all, but the resulting binary misbehaves. We suspect there are at least a few conflicts of this kind and would like to locate them. What kind of linker options would let us detect them?
(interestingly, if the variables are initialized (int foo=0) in both files, it produces an error).
Hold on now -- are you using foo for two different purposes in the two files? That would certainly lead to run-time errors. If foo needs to be global, then it should be defined in just one module -- the linker may accept it, but you will still only get one copy of foo. If it doesn't need to be global, it should be declared 'static int foo;'
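A short sketch of the "make it static" suggestion, assuming foo is only needed inside one file. The function name bump_foo_in_a is made up for the example. An unnamed namespace has the same effect as 'static int foo;': the variable gets internal linkage, so a second foo in another file can no longer be merged with it.

```cpp
// a.cpp (sketch): this foo is private to the translation unit.
namespace {
int foo = 0;  // internal linkage, same effect as 'static int foo;'
}

// Only code in this file can touch this particular foo.
int bump_foo_in_a() { return ++foo; }
```

With this change, b.cpp can keep its own foo without the two definitions interfering, and c.cpp's 'extern int foo;' will only bind to a foo that is really meant to be shared.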
This is a serious design bug in gcc/ld; it does not occur with MSVC. It won't happen when linking programs, only shared libraries. When you link a program, the linker ensures all external references are satisfied, at least at link time. When you link a shared library, it does not. Instead, external references for which there are no definitions are just left dangling, the argument (given in the ld man page) being that the symbol has to be resolved dynamically at load time anyhow, so there's no point in checking it. It's also hard if you use the stupid feature of shared libraries grabbing symbols from executables.
Your program will not misbehave. If you specify that symbols in a library must be satisfied on loading, you will get a load-time error; if you specify lazy linkage, the error will still occur, but only on the first use of the symbol (AFAIK!)
Some older OS, I believe BSD for example, allowed unsatisfied external pointers to be left as NULL so that you could write an "in program" check to see if the symbol was linked or not. Linux ld at least does not support this AFAIK.
There is a linker switch to force satisfaction of external references for shared libraries, but it is hard to use correctly in portable builds because it requires you to explicitly link the startup library for your processor.
I consider this a very serious design bug, and tried to file a bug report. In my own product we were happy to be able to build under Cygwin because underneath it uses MSVC linker which does not permit this behaviour, quite a lot of bugs were found in my code this way.
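It seems the compiler option -fno-common helps here: it makes GCC place uninitialized global variables in the BSS section of the object file instead of emitting them as common symbols, so a variable defined in two files triggers a multiple-definition error at link time, just as it does for initialized variables.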

static link library

I am writing a hello world C++ application. In it, the #include instruction helps the compiler or linker import the C++ library, and my cout << "hello world"; uses cout from that library. After compiling, the generated exe is about 96k in size. So what instructions are actually contained in this exe file? Does the file also contain the iostream library?
Thanks
In the general case, the linker will only bring in what it needs. Once the compiler phase has turned your source code into an object file, it's treated much the same as all other object files. You have:
the C start-up code which prepares the execution environment (sets up argc, argv and so on) then calls your main or equivalent.
your code itself.
whatever object files need to be dragged in from libraries (dynamic linking is a special case of linking that happens at runtime and I won't cover that here since you asked specifically about static linking).
The linker will include all the object files you explicitly specify (unless it's a particularly smart linker and can tell you're not using the object file).
With libraries, it's a little different. Basically, you start with a list of unresolved symbols (like cout). The linker will search all the object files in all the libraries you specify and, when it finds an object file that satisfies that symbol, it will drag it in and fix up the symbol references.
This may, of course, add even more unresolved symbols if, for example, there was something in the object file that relies on the C printf function (unlikely but possible).
The linker continues like this until all symbols are satisfied (when it gives you an executable) or one cannot be satisfied (when it complains to you bitterly about your coding practices).
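The resolution loop described above can be sketched as a toy model. This is not how any real linker is implemented; ObjectFile, LinkResult and resolve are names invented for this illustration, and real linkers also care about archive order, which is ignored here.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of an object file: symbols it defines and symbols it needs.
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

struct LinkResult {
    std::vector<std::string> pulled;   // library members dragged in, in order
    std::set<std::string> unresolved;  // symbols that nothing defines
};

// Resolve the undefined symbols of `roots` against the members of `library`.
LinkResult resolve(const std::vector<ObjectFile>& roots,
                   const std::vector<ObjectFile>& library) {
    std::set<std::string> defined, wanted;
    std::set<std::size_t> used;
    LinkResult result;

    auto add = [&](const ObjectFile& o) {
        defined.insert(o.defined.begin(), o.defined.end());
        wanted.insert(o.undefined.begin(), o.undefined.end());
    };
    for (const auto& o : roots) add(o);  // explicit objects always go in

    bool progress = true;
    while (progress) {
        progress = false;
        for (const auto& sym : wanted) {
            if (defined.count(sym)) continue;  // already satisfied
            for (std::size_t i = 0; i < library.size(); ++i) {
                if (used.count(i) || !library[i].defined.count(sym)) continue;
                used.insert(i);
                result.pulled.push_back(library[i].name);
                add(library[i]);  // may add new unresolved symbols
                progress = true;
                break;
            }
            if (progress) break;  // rescan, the wanted set has grown
        }
    }
    for (const auto& sym : wanted)
        if (!defined.count(sym)) result.unresolved.insert(sym);
    return result;
}
```

In this model, pulling in one library member for a single function also brings along every other undefined symbol of that member, which is how an apparently unrelated symbol can end up unresolved.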
So as to what is in your executable, it may be the entire iostream library or it may just be the minimum required to do what you asked. It will usually depend on how many object files the iostream library was built into.
I've seen code where an entire subsystem went into one object file so, that if you wanted to just use one tiny bit, you still got the lot. Alternatively, you can put every single function into its own object file and the linker will probably create an executable as small as possible.
There are options to the linker which can produce a link map which will show you how things are organised. You probably won't generally see it if you're using the IDE but it'll be buried deep within the compile-time options dialogs under MSVC.
And, in terms of your added comment, the code:
cout << "hello";
will quite possibly bring in sizeable chunks of both the iostream and string processing code.
Use cl /EHsc hello.cpp -link /MAP. The .map file generated will give you a rough idea which pieces of the static library are present in the .exe.
Some of the space is used by C++ startup code, and the portions of the static library that you use.
On Windows, the library, or the parts of the libraries that are used, are usually also included in the .exe; the situation is different on Linux. However, there are optimization options.
I guess this Wiki link will be useful: Static Libraries