Remove C++-STL/Boost debug symbols (... or do not create them) - c++

Toolchain: Linux, GCC, ld.
I would like to remove STL/Boost debug symbols from libraries and executables, for two reasons:
1. Linking gets very slow for big programs.
2. Debugging jumps into STL/Boost code, which is annoying.
For 1., incremental linking would be a big improvement, but AFAIK ld does not support incremental linking. There is a workaround, "pseudo incremental linking", described in a 1999 Dr. Dobb's Journal article (no longer on the web, but available at archive.org). The idea is to put everything in a dynamic library and all updated object files in a second one that is loaded first, but this is not really a general solution.
For 2., there is a script here, but a) it did not work for me (it did not remove the symbols), and b) it is very slow because it works at the end of the pipeline, while it would be more efficient to remove the symbols earlier.
Obviously, the other debug symbols should stay in place.

GNU strip accepts wildcard patterns as arguments to --strip-symbol= when --wildcard is given.
The STL and Boost symbols are name-mangled, and the mangled names encode the namespaces they are in. I don't have GNU binutils handy at the moment, but just look at the name mangling used for namespaces, construct the pattern for 'symbols from namespace X', and pass it to --strip-symbol=.

As far as I know, there's no real option in gcc to do what you want. The main problem is that all the code you want to strip debug symbols for is defined in headers.
Otherwise it would be possible to build a library separately, strip that, and link with the stripped version.
But getting debug symbols for only certain parts of a compilation unit while building and linking (for your desired link-time speedup) is not possible in gcc, as far as I know.

You probably don't want to strip the debug symbols from the shared libraries, as you may need them at some point.
If you are using GDB or DDD to debug, you may be able to get away with removing the Boost source files from the Source Path so it can't trace into the functions. (Or just don't trace into them, trace over!)
You can remove the option to compile the program with debug symbols, which will speed the link time.
Like the script you link to, you can consult the strip program ("man strip") to remove all or certain symbols.

You may want to use strip.
strip --strip-unneeded --strip-debug libfoo.so
Why don't you just build without debugging in the first place though?

This answer provides some specifics that I needed to make MSalters' answer work for removing STL symbols.
The STL symbol names are mangled. The trick is to find a regular expression that covers these names. I looked these symbols up with GNU's Binutils:
> nm --debug-syms <objectfile>
I basically searched on STL functions, like resize. If this is difficult, the output becomes readable when using the following command:
> nm --debug-syms --demangle <objectfile>
Look up a line number containing an STL function call, then look up its mangled name on that same line number using the first command. This showed me that all STL symbol names began with _ZNSt[0-9]+ or _ZSt[0-9]+, etc.
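As a cross-check on those prefixes, the mangled names can also be decoded programmatically. The following is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI (the usual case on Linux). The helper name demangle is made up for illustration; abi::__cxa_demangle is the demangling entry point declared in <cxxabi.h>.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI symbol name (hypothetical helper).
// Returns an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc().
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);  // the caller owns the malloc()-allocated buffer
    return result;
}
```

For example, demangle("_ZSt4cout") yields std::cout. The St part is the mangling abbreviation for the std namespace, which is why the strip patterns key on it.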
To allow GNU Strip to remove these symbols I used:
> strip --wildcard \
--strip-symbol='_ZNKSt*' \
--strip-symbol='_ZNSt*' \
--strip-symbol='_ZSt*' \
--strip-symbol='_ZNSa*' \
<objectfile>
I used these commands directly on the compiled/linked binary. I verified the removal of these symbols by comparing the output of nm before and after the removal (I wrote the output to files and used vimdiff). Note that the --wildcard option enables shell-style wildcard patterns (?, *, [] and \), not full regular expressions. That is why [0-9]* does not mean zero or more digits; it means one digit followed by any number of arbitrary characters (until the end of the name).
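The [0-9]* behaviour described above is ordinary shell wildcard matching. Here is a small sketch that reproduces it with POSIX fnmatch(3). Using fnmatch as a stand-in for strip's own matcher is an assumption, and glob_matches is a hypothetical helper name.

```cpp
#include <cassert>
#include <fnmatch.h>

// True if `name` matches the shell-style wildcard `pattern`.
// fnmatch() returns 0 on a match and FNM_NOMATCH otherwise.
bool glob_matches(const char* pattern, const char* name) {
    assert(pattern != nullptr && name != nullptr);
    return fnmatch(pattern, name, 0) == 0;
}
```

With this helper, "_ZSt[0-9]*" matches "_ZSt4cout" (one digit, then anything) but not "_ZSt" alone, because the bracket expression must consume exactly one character. A plain "_ZNSt*" matches every name that starts with _ZNSt.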
If you are looking to not step into STL code this can be achieved by gdb's skip file command, as done here.
Hope it helps

Which compiler are you using? For example, if I understand your question correctly, this is a trivial matter in MS Visual Studio.

Related

Why does the program work after removing the symbol information?

I built a shared object (.so) file with clang, passing the option "-Xlinker --strip-all" to hinder reverse engineering.
Thanks to this, most function symbols, other than those of functions directly exposed to the outside, no longer appear (objdump -TC test.so). The question is: if a symbol is deleted like this, I would expect it to be unusable inside the program, yet the program still works normally. What am I missing?
You're right, debugging symbols aren't needed by the program itself to execute; the linker computes (and therefore knows at link-time) what the memory-address of each function/global-variable/etc will be at run-time, so it can just place that memory-address directly into the executable where necessary.
The symbols are there for a debugger to use, to make the debugging output easier for a human (or a debugging tool) to use and understand.

Checking value of preprocessor symbol(#define)

I'm trying to cross-compile an application for another system. I built all dependencies and started compiling. The build then stops in one of my dependency libraries, namely Qt3, with compiler errors:
Error: expected class-name before "{" token
and
Error: "QMutex" does not name a type
I suspect the Q_EXPORT symbol is defined incorrectly because I forgot to simulate some environment settings. But because its definition depends on symbols which depend on symbols which depend on symbols, and so on, it's hard to check.
Just printing it in a test program doesn't work either, because the value of Q_EXPORT is not always convertible to a string.
My question is:
How do I check the value of a preprocessor symbol (while compiling/preprocessing) with the GNU compiler?
I thought there would be an option for this, but I haven't found anything while searching the web.
Debugging of macro symbols can be tricky, because it happens before the actual compilation [1]. Running the build system in such a way that you print the actual compilation command is a good starting point.
Then you can grab the actual compile command and replace -c with -E (or something similar) to inspect the generated preprocessor output. Then locate the place in the source that you are compiling. Expect the output from -E to be HUGE; a million lines of output is not unusual. Use the linemarkers in the output (lines of the form # <line number> "<file name>") to track which file you are in and what line you are at.
[1] Not strictly true in all compilers: precisely because macros make it hard to follow what the code is actually doing, modern compilers expand macros during the proper parsing of the code. However, that apparently does not help in this particular case.
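If wading through the -E output is too heavy, a macro's expansion can also be printed during compilation. The sketch below uses two-level stringification together with #pragma message, which GCC and Clang support. All DEMO_ names are hypothetical stand-ins, not Qt's definitions; in the real build the argument would be Q_EXPORT.

```cpp
#include <string>

// Two-level stringification: the outer macro expands its argument first,
// the inner one turns the resulting tokens into a string literal.
// The variadic form tolerates expansions that contain commas.
#define DEMO_STRINGIFY_IMPL(...) #__VA_ARGS__
#define DEMO_STRINGIFY(...) DEMO_STRINGIFY_IMPL(__VA_ARGS__)

// Hypothetical stand-ins for a macro that is defined through several layers.
#define DEMO_PLATFORM_EXPORT __attribute__((visibility("default")))
#define DEMO_EXPORT DEMO_PLATFORM_EXPORT
#define DEMO_EMPTY

// Emitted as a diagnostic while this file is being compiled.
#pragma message("DEMO_EXPORT expands to: " DEMO_STRINGIFY(DEMO_EXPORT))

// The same text is available at run time, which is handy for a test program.
std::string demo_export_text() { return DEMO_STRINGIFY(DEMO_EXPORT); }
std::string demo_empty_text() { return DEMO_STRINGIFY(DEMO_EMPTY); }
std::string demo_undefined_text() { return DEMO_STRINGIFY(DEMO_NOT_DEFINED); }
```

A macro that is defined but empty stringifies to an empty string, while an undefined name comes back unchanged, so the two cases can be told apart. GCC can also list every macro definition in effect when the compile command is run with -dM -E.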

cpp: How to find symbols removed in function level linking

I have a very large executable built on IBM AIX. When I enable function-level linking, the size of the task is 2.8GB, whereas when I disable function-level linking, the task size goes up to 3.50GB.
This would most likely mean that additional object files are pulled in that my application doesn't need, right? If so, how can I find the symbols that are removed with function-level linking?
I tried to look at the nm output for both tasks, but was clueless about what to look for and what to diff.
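You need to add -Wl,--print-gc-sections to LDFLAGS. This is a GNU ld option that lists the sections discarded by --gc-sections, so it applies only if you link with GNU ld.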

Can a Visual Studio produced static library, be stripped of symbols?

I'll divide this question into 3 parts:
1. I would like to produce a static library and strip its symbols (debug info is already excluded), similar to the strip command on Linux. Can it be done?
2. Is there an equivalent in the Windows environment to the nm tool on Linux?
3. When creating a static library using VS2008, is it possible to define a script that excludes some of the produced .obj files from the build and from the static lib? Can it be dynamic? I mean, I'd define a compilation mode in the script, and this would result in specific object files being excluded from the build.
If anything is visible that you feel should not be, try declaring it with the "static" keyword. This tells the compiler that it is accessible only to the current module.
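A minimal sketch of that suggestion, with made-up function names: the first two helpers have internal linkage and cannot be referenced from other object files, while the last function keeps external linkage.

```cpp
// Internal linkage via the static keyword: not visible to other translation units.
static int scale(int value) { return value * 3; }

namespace {
// An unnamed namespace has the same effect and is the usual C++ idiom.
int offset(int value) { return value + 1; }
}  // namespace

// External linkage: the only name here that other object files can link against.
int public_entry(int value) { return offset(scale(value)); }
```

Only public_entry has to remain resolvable by name at link time. Whether the names of the internal helpers still show up in the .obj file depends on the compiler and its settings.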
There are cases where it would be convenient to be able to strip out all but a small number of "exported" public symbols, but it's not really feasible.
A static library is little more than a collection of .obj files. The internal dependencies haven't been resolved yet, and they won't be resolved until link time.
For example, if your .lib consists of foo.obj and bar.obj, and there's a call in foo.obj to a function defined in bar.obj, then that symbol must be available at link time, even if nothing outside of the library should be able to see it.
For that reason, you cannot strip the symbols (with the possible exception of file-scope static symbols). Even class methods that are protected or private (in the C++-sense) will exist in the symbol table, since the enforcement of the visibility is a compile-time issue, not a link-time one.
In contrast, a dynamic library is a standalone binary that has already been linked. References from foo.obj to bar.obj have already been resolved. Thus a DLL can be stripped of symbols except for the ones that must be exported (and even those can be renamed or replaced by ordinals).
If your DLL exposes a simple C API, then you're all set. But if you want to expose a C++ class, you're probably going to end up exporting all of its methods, even the protected and private ones (since inlining in the external application might result in direct calls to private methods).
1. No. How do you think the users of the static library would link to it without knowing where the symbols they use are defined?
2. Yes, try the DUMPBIN utility.
3. Well, yes. You can run the LIB utility with /REMOVE:foo.
That said, I think you are doing something that either is not worth doing or could be done a lot simpler than with removing library members.
I kept finding the names of certain (but not all) static functions in .obj files produced by VS2010. Interestingly, they were visible in my Release .obj files but not the Debug .obj files. I just used cygwin strings to perform the search:
$ strings myObjectFile.obj | grep myStaticFunctionName
I tracked it down to the "Whole Program Optimization = Yes" setting ("/GL"). When I switched this to "No" the function names no longer appear.
Update: As a followup test I opened the "cleansed" myObjectFile.obj in vim and I can still find them (with either :set encoding=utf-8 or :set encoding=latin1). I'm not sure why strings was missing the matches. Oh well.

gcc's fvisibility at compile time or link time

I am trying to limit the ABI of a shared library using gcc's -fvisibility feature. However, I am confused about the correct way to do it.
My makefile organizes the build process in two stages. In the first step, all .cpp files are compiled to object files using some gcc options. Then all the object files are linked together using another set of gcc and ld options. From what I have read, -fvisibility is relevant to the second step. However, this contradicts the results I observe. If I add -fvisibility=hidden to the compile-time options, the result is as expected: nm -D reports a much smaller set of exported symbols. In contrast, if I add it to the link-time options, it does not seem to affect the build.
While looking for an explanation, I compared the object files produced with and without -fvisibility. The difference seems to be in the addresses of the symbols inside the object file. However, I do not understand how that difference in addresses carries the message to the linker, so that it hides the symbols in one case and exposes them in the other.
Could anyone please explain this to me? Thank you for your time.
Compile time, as the visibility is placed in the object (.o) files, and then used by the linker when creating the complete executable/shared object. When using it at link time, but not compile time, it will have no effect, as the visibility in the object files is still default. There's also no need to use it at link time at all I've found.
As for how the visibility is stored: in ELF object files it is recorded per symbol, in the st_other field of each symbol table entry, rather than being derived from the segment a symbol lives in.
You may find http://gcc.gnu.org/wiki/Visibility to be helpful
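To make the compile-time nature concrete, here is a minimal sketch assuming GCC or Clang targeting ELF. The DEMO_API and DEMO_LOCAL macros and both function names are hypothetical; the visibility attribute itself is the per-symbol counterpart of the -fvisibility option.

```cpp
// Per-symbol visibility markers; they expand to nothing on other compilers.
#if defined(__GNUC__)
#define DEMO_API __attribute__((visibility("default")))
#define DEMO_LOCAL __attribute__((visibility("hidden")))
#else
#define DEMO_API
#define DEMO_LOCAL
#endif

// Hidden: usable anywhere inside the shared library, but not exported from it.
DEMO_LOCAL int demo_helper(int x) { return x * x; }

// Default visibility: stays exported even when compiling with -fvisibility=hidden.
DEMO_API int demo_square_plus_one(int x) { return demo_helper(x) + 1; }
```

Compiling this with -fPIC -fvisibility=hidden and linking with -shared should leave only demo_square_plus_one in the nm -D output. The attribute is recorded in the object file by the compiler, which is why passing the flag only at link time changes nothing.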