hide private symbols automatically - c++

I have a C++ project with public and private header files.
To increase encapsulation and decrease symbol clashes in a larger project I would like to export only the minimal set of symbols.
Although we could manually annotate each function with visibility attributes, I'd prefer an approach that does not require changing the source code.
Given the following project structure:
LibA
    include
        *.h
    src
        *.h
        *.cpp
Is there a way to automatically hide all the symbols that don't appear in include/*.h?
Is there an elegant way of instrumenting the compiler/linker?
Could we automatically generate a version-script?

With gcc and clang, this is as simple as building with -fvisibility=hidden. Then you only have to explicitly export the few public symbols you want exposed.
For more details, there's a gcc article on symbol visibility that you may want to read.
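A minimal sketch of the annotation approach, assuming GCC or clang and a build with -fvisibility=hidden. The macro name LIBA_API and the functions liba_version and detail_helper are hypothetical and not taken from the project in the question.

```cpp
// liba_example.cpp (hypothetical file). One possible build line:
//   g++ -fvisibility=hidden -fPIC -shared liba_example.cpp -o libA.so

#if defined(__GNUC__)
#  define LIBA_API __attribute__((visibility("default")))
#else
#  define LIBA_API  // other compilers: no annotation in this sketch
#endif

// Not annotated: hidden when compiled with -fvisibility=hidden,
// but still callable from other translation units of the same library.
int detail_helper(int x) { return x * 2; }

// Annotated: stays in the dynamic symbol table of the shared library.
LIBA_API int liba_version() { return detail_helper(21); }
```

Running nm -D on the resulting library should list liba_version but not detail_helper.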

Could we automatically generate a version-script?
You sure could: run nm -C *.o | egrep ' [TDBW] ' to get the list of global symbols, then look in include/*.h to see which ones should be exported. This will likely be fragile: if you, for example, use macros to generate symbol names, it will probably not work at all.
It may be worth it to generate the list once, hand-curate it, and then maintain it together with the sources by hand in the revision control system.
If the number of symbols to be exported is relatively small, compiling with -fvisibility=hidden and annotating just the public symbols is a much more robust solution.
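One way to automate the generation step is to filter the nm output against a hand-maintained list of public names and emit a version script. The sketch below assumes plain nm output (address, type letter, name on each line) with mangled names rather than nm -C, because a version script matches mangled names unless the entries sit in an extern "C++" block. The function name make_version_script is hypothetical.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: builds a linker version script from `nm` output.
// nm_output:    text produced by `nm *.o` (address, type letter, name per line).
// public_names: mangled names that should stay exported.
std::string make_version_script(const std::string& nm_output,
                                const std::set<std::string>& public_names) {
    std::istringstream lines(nm_output);
    std::string line;
    std::set<std::string> exported;  // sorted and de-duplicated
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string address, type, name;
        // Undefined symbols and file headers have fewer than three columns.
        if (!(fields >> address >> type >> name)) continue;
        // Keep defined global symbols only: text, data, BSS, weak.
        if (type.size() != 1 ||
            std::string("TDBW").find(type) == std::string::npos) continue;
        if (public_names.count(name)) exported.insert(name);
    }
    std::string script = "{\n";
    if (!exported.empty()) {
        script += "global:\n";
        for (const std::string& name : exported) script += "    " + name + ";\n";
    }
    script += "local: *;\n};\n";
    return script;
}
```

The list of public names still has to come from somewhere, for example the hand-curated list mentioned above, so this only removes the manual step of writing the script itself.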

Related

Can an export map select only the functions you want to link to?

I am writing a test harness with Googletest and need to control the symbol table to avoid conflicts (the code base is mainly C with a bit of C++ on Linux).
I am looking for a way to link against only the functions I want in a file and also to be able to create custom sets of functions to link against for each test.
This is a bit broad I know but any suggestions or ideas will be most welcome!
You can use a linker version script to define which symbols should be exported in the symbol table.
Such a version script can look like this:
{
    global:
        symb1;
        symb2;
        symb3;
    local: *;
};
This example exports only the symbols symb1, symb2 and symb3; all other symbols become local and are omitted from the library's dynamic symbol table.
Now specify this script as version script for the linker, an example for a shared library:
cc -shared obj1.o obj2.o obj3.o -o library.so -Wl,--version-script=<scriptname>
Even more control can be gained through symbol versions, more details can be found in the ld-documentation: http://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_25.html

Limiting the scope of global symbols from linked objects

I have a C library in an archive file, clib.a. I've written a C++ wrapper for it, cpp.o, and would like to use this as a static library:
ar cTrvs cppwrap.a clib.a cpp.o
Code which links to this won't be able to use the stuff from clib.a directly unless the correct header is included. However, if someone coincidentally creates an appropriate prototype -- e.g. void myCoincidentallyNamedGlobalFunction() -- I'm concerned which definition of myCoincidentallyNamedGlobalFunction will apply.
Since the symbols from clib.a only need to be accessed in cpp.o, and not anything linked to cppwrap.a, is there a way to completely hide them so that there is no possible collision (so even including the clib header would fail)?
You can manually remove unneeded symbols on the final combined library:
$ objcopy -N foo cppwrap.a    # remove symbol
Or, if you need the symbols but want to make sure that external users can't get to them:
$ objcopy -L bar cppwrap.a    # localize symbol
Or, if a symbol in clib.a must be visible by something in cpp.o but you don't want it to be used by anyone else:
$ objcopy -W baz cppwrap.a    # weaken symbol
In this case the symbol is still visible, but if another object file or library defines the same name, the linker uses that definition instead of the weakened one. To obscure things further, or to reduce the chance of even such a deferential collision, you can also use:
$ objcopy --redefine-sym old=new cppwrap.a
An anonymous namespace may help in some cases, but not if there's functionality that your wrapper needs but is trying to hide from external users.
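To illustrate the anonymous-namespace remark: names declared in an unnamed namespace have internal linkage, so they cannot collide with symbols in other object files. The names scale and Wrapper below are hypothetical.

```cpp
// cpp_wrapper_sketch.cpp (hypothetical file name)

namespace {
// Internal linkage: invisible to anything that links against the wrapper,
// so another library may define its own `scale` without a clash.
int scale(int value) { return value * 3; }
}  // namespace

// External linkage: this is the wrapper's public entry point.
class Wrapper {
public:
    int run(int value) const;
};

int Wrapper::run(int value) const { return scale(value) + 1; }
```

This only works for code compiled as part of the wrapper's own translation unit. Functions that live in the separately compiled clib.a keep their external linkage, which is why the objcopy options above are needed for them.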

Workaround for when --whole-archive is not available

I'm trying to compile in an environment (emscripten) where the -Wl,--whole-archive flag is not supported. How can I trick the toolchain into including the exported symbols? The solution should meet as many of these properties as possible:
Could be applied on a single library (I don't want to include unused symbols from other libraries)
Could be automatically generated (for example by fetching the exported symbol table with nm?)
Would work with functions & member functions
I thought about generating a file with something like:
int x = (int)(&func_a)+(int)(&func_b)+...;
But it doesn't work with member functions, which cannot be cast to int (and can be private).
Do you have any ideas?
Ideas:
Pass the --whole-archive flag just before the library you want and --no-whole-archive right after it, before listing the other libraries, so that only the one you need is wholly linked. Also try adding the --export-dynamic flag, using a linker that supports it.
Failing that, go down the nm/objdump/export-map road: see http://accu.org/index.php/journals/1372 for exporting and building the link information, and http://runtimecompiledcplusplus.blogspot.fr/ for using exported maps in code, so that you can mimic the effect of the -Wl flag yourself.
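The address-taking idea from the question can be extended to member functions by storing them in pointer-to-member variables instead of casting to int. This is only a sketch: func_a, Widget and the keep_* variables are hypothetical, private members remain out of reach, and whether a given toolchain (emscripten included) then keeps the referenced code is an assumption to verify.

```cpp
// Stand-ins for declarations that would normally come from the library's headers.
int func_a() { return 1; }

struct Widget {
    int draw() { return 2; }
};

// Generated part: one external-linkage pointer per symbol to keep.
// Initialising a pointer with a function's address odr-uses the function,
// so linking this object file requires the definition to be present.
int (*volatile keep_func_a)() = &func_a;
int (Widget::*volatile keep_widget_draw)() = &Widget::draw;
```

Such a file could be generated from the nm output of the single library concerned, which keeps unused symbols from other libraries out of the link.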

Can a Visual Studio produced static library, be stripped of symbols?

I'll divide this question into 3 parts:
1. I would like to produce a static library and strip off its symbols. (Debug info is already not included.)
Similar to the strip command in Linux. Can it be done?
2. Is there an equivalent in the Windows environment to the nm tool in Linux?
3. When creating a static library using VS2008, is it possible to define a script that will exclude some of the produced .obj files from the build and from the static lib?
Can it be dynamic? I mean, I'd define a compilation mode in the script and this would result in specific object files being excluded from the build.
If anything is visible that you feel should not be, try declaring it with the "static" keyword. This tells the compiler that it is accessible only within the current translation unit (source file).
There are cases where it would be convenient to be able to strip out all but a small number of "exported" public symbols, but it's not really feasible.
A static library is little more than a collection of .obj files. The internal dependencies haven't been resolved yet, and they won't be resolved until link time.
For example, if your .lib consists of foo.obj and bar.obj, and there's a call in foo.obj to a function defined in bar.obj, then that symbol must be available at link time, even if nothing outside of the library should be able to see it.
For that reason, you cannot strip the symbols (with the possible exception of file-scope static symbols). Even class methods that are protected or private (in the C++-sense) will exist in the symbol table, since the enforcement of the visibility is a compile-time issue, not a link-time one.
In contrast, a dynamic library is a standalone binary that has already been linked. References from foo.obj to bar.obj have already been resolved. Thus a DLL can be stripped of symbols except for the ones that must be exported (and even those can be renamed or replaced by ordinals).
If your DLL exposes a simple C API, then you're all set. But if you want to expose a C++ class, you're probably going to end up exporting all of its methods, even the protected and private ones (since inlining in the external application might result in direct calls to private methods).
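The last point can be made concrete with a small sketch (Counter and its members are hypothetical). The public method is defined in the header, so a client may compile a direct call to the private method into its own code, and that call must then be resolved against the library.

```cpp
// counter.h (hypothetical public header)
class Counter {
public:
    // Defined in the header: the body may be inlined into the client application,
    // leaving a direct call to the private member step() in the client's object code.
    int next() { return value_ += step(); }

private:
    int step() const;  // defined inside the library, so it has to be exported
    int value_ = 0;
};

// counter.cpp (compiled into the library)
int Counter::step() const { return 2; }
```

If step() were not exported, a client that inlined next() would fail to link, even though the client source never names step() itself.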
1. No. How would users of the static library link against it without knowing where the symbols they use are defined?
2. Yes, try the DUMPBIN utility.
3. Well, yes. You can run the LIB utility with /REMOVE:foo.
That said, I think you are doing something that either is not worth doing or could be done a lot simpler than with removing library members.
I kept finding the names of certain (but not all) static functions in .obj files produced by VS2010. Interestingly, they were visible in my Release .obj files but not the Debug .obj files. I just used cygwin strings to perform the search:
$ strings myObjectFile.obj | grep myStaticFunctionName
I tracked it down to the "Whole Program Optimization = Yes" setting ("/GL"). When I switched this to "No" the function names no longer appear.
Update: As a followup test I opened the "cleansed" myObjectFile.obj in vim and I can still find them (with either :set encoding=utf-8 or :set encoding=latin1). I'm not sure why strings was missing the matches. Oh well.

C++ shared library shows internal symbols

I have built a shared library (.dll, .so) with VC++2008 and GCC.
The problem is that both libraries contain the names of private symbols (classes, functions) even though they weren't exported.
I don't want my app to display the names of classes/functions that weren't exported.
Is there any way I can do that?
In GCC I did:
Compiled with -fvisibility=hidden and then made the public symbols visible with __attribute__((visibility("default")))
In VC++:
__declspec(dllexport)
Thanks!
For GNU toolchains you can use the strip command to remove symbols from object files. It takes various command-line options to control its behavior. It may do what you want.
You can create a header file to obfuscate the internal function and method names you want to hide, i.e. something like the following (it needs an include guard too):
#define someFunctionName1 sJkahe28273jwknd
#define someFunctionName2 lSKlajdwe98
#define someMethodName1 ksdKLJLKJl22fss
#define someMethodName2 lsk89hHHuhu7g
...and include this in the header files where the real definitions live.
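A sketch of such a header with the include guard added. The guard name and file names are hypothetical; the replacement names are the ones from the list above. Because the preprocessor rewrites the identifier everywhere the header is included, declarations, definitions and callers all agree on the obfuscated name.

```cpp
// obfuscate_names.h (hypothetical file name)
#ifndef OBFUSCATE_NAMES_H
#define OBFUSCATE_NAMES_H
#define someFunctionName1 sJkahe28273jwknd
#define someFunctionName2 lSKlajdwe98
#endif  // OBFUSCATE_NAMES_H

// internal.cpp (hypothetical) would include the header above before anything else.
// The source still reads someFunctionName1, but the symbol that reaches
// the object file is based on sJkahe28273jwknd.
int someFunctionName1(int x) { return x + 1; }
int someFunctionName2(int x) { return someFunctionName1(x) * 2; }
```

The readable names survive only in the source code and in this header, so the header itself should not be shipped with the binary.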
The private keyword, when used for access specification, only works at compile time and is intended as an aid to programmers, not a security feature. As you have found out, the "privacy" is implemented by lexical means.
It's easy to see that this must be so - if you implement two private functions with dependencies between each other in two separate .cpp files, the linker has to find the private names in the resulting object (or library) files.
Bottom line - C++ has no code security features - if you give someone the object code of your program, they will always be able to examine it.