C++ shared library shows internal symbols

C++ shared library shows internal symbols - c++

I have built a shared library (.dll, .so) with VC++2008 and GCC.
The problem is that inside both libs it shows the names of private symbols (classes, functions) and they weren't exported.
I don't want my app to display the name of classes/functions that weren't exported.
Is any way i can do that?
In GCC i did:
Compiled with -fvisibility=hidden and then made public with attribute ((visibility("default")))
In VC++:
__declspec(dllexport)
Thanks!

For GNU tool chains you can use th strip command to remove symbols from object files. It takes various command options to control its behavior. It may do what you want.

You can create a header file to obfuscate the internal function and method names you want to be hidden. Ie something like below (need some include guard too)
#define someFunctionName1 sJkahe28273jwknd
#define someFunctionName2 lSKlajdwe98
#define someMethodName1 ksdKLJLKJl22fss
#define someMethodName2 lsk89hHHuhu7g
...and include this in the header files where the real definitions live.

The private keyword when used for access specification only
effectively works at compile time and is intended as an aid to programmers, not a security feature - as you have found out the "privacy" is implemented
using lexical means .
It's easy to see that this must be so - if you implement two private functions with dependencies between each other in two separate .cpp files, the linker has to find the private names in the resulting object (or library) files.
Bottom line - C++ has no code security features - if you give someone the object code of your program, they will always be able to examine it.

Related

Library design: allow user to decide between "header-only" and dynamically linked?

I have created several C++ libraries that currently are header-only. Both the interface and the implementation of my classes are written in the same .hpp file.
I've recently started thinking that this kind of design is not very good:
If the user wants to compile the library and link it dynamically, he/she can't.
Changing a single line of code requires full recompilation of existing projects that depend on the library.
I really enjoy the aspects of header-only libraries though: all functions get potentially inlined and they're very very easy to include in your projects - no need to compile/link anything, just a simple #include directive.
Is it possible to get the best of both worlds? I mean - allowing the user to choose how he/she wants to use the library. It would also speed up development, as I'd work on the library in "dynamically-linking mode" to avoid absurd compilation times, and release my finished products in "header-only mode" to maximize performance.
The first logical step is dividing interface and implementation in .hpp and .inl files.
I'm not sure how to go forward, though. I've seen many libraries prepend LIBRARY_API macros to their function/class declarations - maybe something similar would be needed to allow the user to choose?
All of my library functions are prefixed with the inline keyword, to avoid "multiple definition of..." errors. I assume the keyword would be replaced by a LIBRARY_INLINE macro in the .inl files? The macro would resolve to inline for "header-only mode", and to nothing for the "dynamically-linking mode".

Preliminary note: I am assuming a Windows environment, but this should be easily transferable to other environments.
Your library has to be prepared for four situations:
Used as header-only library
Used as static library
Used as dynamic library (functions are imported)
Built as dynamic library (functions are exported)
So let's make up four preprocessor defines for those cases: INLINE_LIBRARY, STATIC_LIBRARY, IMPORT_LIBRARY, and EXPORT_LIBRARY (it is just an example; you may want to use some sophisticated naming scheme).
The user has to define one of them, depending on what he/she wants.
Then you can write your headers like this:
// foo.hpp
#if defined(INLINE_LIBRARY)
#define LIBRARY_API inline
#elif defined(STATIC_LIBRARY)
#define LIBRARY_API
#elif defined(EXPORT_LIBRARY)
#define LIBRARY_API __declspec(dllexport)
#elif defined(IMPORT_LIBRARY)
#define LIBRARY_API __declspec(dllimport)
#endif
LIBRARY_API void foo();
#ifdef INLINE_LIBRARY
#include "foo.cpp"
#endif
Your implementation file looks just like usual:
// foo.cpp
#include "foo.hpp"
#include <iostream>
void foo()
{
std::cout << "foo";
}
If INLINE_LIBRARY is defined, the functions are declared inline and the implementation gets included like a .inl file.
If STATIC_LIBRARY is defined, the functions are declared without any specifier, and the user has to include the .cpp file into his/her build process.
If IMPORT_LIBRARY is defined, the functions are imported, and there isn't a need for any implementation.
If EXPORT_LIBRARY is defined, the functions are exported and the user has to compile those .cpp files.
Switching between static / import / export is a really common thing, but I'm not sure if adding header-only to the equation is a good thing. Normally, there are good reasons for defining something inline or not to do so.
Personally, I like to put everything into .cpp files unless it really has to be inlined (like templates) or it makes sense performance-wise (very small functions, usually one-liners). This reduces both compile time and - way more important - dependencies.
But if I choose to define something inline, I always put it in separate .inl files, just to keep the header files clean and easy to understand.

It is operating system and compiler specific. On Linux with a very recent GCC compiler (version 4.9) you might produce a static library using interprocedural linktime optimization.
This means that you build your library with g++ -O2 -flto both at compile and at library link time, and that you use your library with g++ -O2 -flto both at compile and link time of the invoking program.

This is to complement #Horstling's answer.
You can either create a static or a dynamic library. When you create statically-linked libraries, compiled code for all functions/objects will be saved to a file (with .lib extension in Windows). At main project (the project using the library) 's link time, these codes will be linked into your final executable together with the main project codes. So the final executable wouldn't have any runtime dependency.
Dynamically linked libraries will be merged into the main project at run time (and not link time). When you compile the library you get a .dll file (which contains actual compiled code) and a .lib file (which contains enough data for the compiler/runtime to find functions/objects in the .dll file). At link time, the executable will be configured to load the .dll and use compiled code from that .dll as needed. You will need to distribute the .dll file with your executable to be able to run it.
There is no need to choose between static or dynamic linking (or header-only) when designing your library, you create multiple project/makefiles, one to create a static .lib, another to create a .lib/.dll pair, and distribute both versions, for the user to choose between. (You'll need to use preprocessor macros like the ones #Horstling suggested).
You cannot put any templates in a pre-compiled library, unless you use a technique called Explicit Instantiation, which limits template parameters.
Also note that modern compiler/linkers usually do not respect the inline modifier. They may inline a function even if it's not designated as inline, or may dynamically call another that has inline modifier, as they see fit. (Regardless, I'll advise explicitly putting inline where applicable for maximum compatibility). So, there won't be any runtime performance penalty if you use a statically linked library instead of a header-only library (and enable compiler/linker optimizations, of course). As others have suggested, for really small functions that are sure to benefit from being called inline, it is best practice to put them in header files, so dynamically linked libraries will also not suffer any significant performance loss. (In any case, inlining functions will only affect performance for functions that are being called very often, inside loops that are going to be called thousands/millions of times).
Instead of putting inline functions in header files (with an #include "foo.cpp" in your header), you can change makefile/project settings and add foo.cpp to the list of source files to be compiled. This way, if you change any function implementation there will be no need to re-compile the whole project and only foo.cpp will be re-compiled. As I mentioned earlier, your small functions will still be inlined by the optimizing compiler, and you don't need to worry about that.
If you use/design a pre-compiled library, you should consider the case where the library is compiled with a different version of compiler to the main project. Each different compiler version (even different configurations, like Debug or Release) uses a different C runtime (things like memcpy, printf, fopen, ...) and C++ standard library runtime (things like std::vector<>, std::string, ...). These different library implementations may complicate linking, or even create runtime errors.
As a general rule, always avoid sharing compiler runtime objects (data structures that are not defined by standards, like FILE*) across libraries, because incompatible data structures will lead to runtime errors.
When linking your project, C/C++ runtime functions must be linked into your library .lib or .lib/.dll, or your executable .exe. C/C++ runtime itself can be linked as static or dynamic library (you can set this in makefile/project settings).
You will find that dynamically linking to C/C++ runtime in both the library and the main project (even when you compile the library itself as a static library) avoids most linking problems (with duplicate function implementations in multiple runtime versions). Of course you would need to distribute runtime DLLs for all used versions with your executable and library.
There are scenarios that statically linking to C/C++ runtime is needed, and the best approach in these cases would be to compile the library with the same compiler setting as the main project to avoid linking problems.

Rationale
Put as little as necessary in header files and as much as possible in library modules, because of the very reasons that you mentioned: compile-time dependency and long compilation time. The only good reasons for header-only modules are:
generic templates for user-defined template parameter;
very short convenience functions when inlining gives significant
performance.
In case 1, it is often possible to hide some functionality that does not depend on the user-defined type in a .cpp file.
Conclusion
If you stick to this rationale, then there is no choice: templated functionality that must allow user-defined types cannot be pre-compiled, but requires a header-only implementation. Other functionality should be hidden from the user in a library to avoid exposing them to the implementation details.

Rather than a dynamic library, you could have a precompiled static library and thin header file. In an interactive quick build, you get the benefit of not having to recompile the world if implementation details changes. But a fully optimized release build can do global optimization and still figure out it can inline functions. Basically, with "link-time code generation" the toolset does the trick you were thinking about.
I'm familiar with Microsoft's compiler, which I know for sure does this as of Visual Studio 2010 (if not earlier).

Templated code will necessarily be header-only: for instantiating this code, the type parameters must be known at compilation time. There is no way to embed template code in shared libraries. Only .NET and Java support JIT instantiation from byte-code.
Re: non-template code, for short one-liners I suggest keeping it header-only. Inline functions give the compiler a lot more opportunities to optimize the final code.
To avoid "insane compilation time", Microsoft Visual C++ has a "precompiled headers" feature. I do not think GCC has a similar feature.
Long functions should not be inlined in any case.
I had one project which had header-only bits, compiled library bits and some bits I could not decide where belonged. I ended up having .inc files, conditionally included in either .hpp or .cxx depending on #ifdef. Truth to be told, the project was always compiled in "max inline" mode, so after a while I got rid of the .inc files and simply moved the contents to .hpp files.

Is it possible to get the best of both worlds?
In terms; limitations arise because tools aren't smart enough. This answer gives the current best effort that is still portable enough to be used effectively.
I've recently started thinking that this kind of design is not very good.
It ought to be. Header-only libraries are ideal because they simplify deployment: makes the reusing mechanism of the language similar to almost all others', which is just the sane thing to do. But this is C++. Current C++ tools still rely on half-a-century-old linking models that remove important degrees of flexibility, such as choosing which entry points to import or export on an individual level without being forced to change the library's original source code. Also, C++ lacks a proper module system and still relies on glorified copy-paste operations to work (although this is just a side factor to the problem in question).
In fact, MSVC is a little better in this regard. It is the only major implementation trying to achieve some degree of modularity in C++ (by attempting e.g. C++ modules). And it is the only compiler that actually allows e.g. the following:
//// Module.c++
#pragma once
inline void Func() { /* ... */ }
//// Program1.c++
#include <Module.c++>
// Inlines or "vague" links Func(), whatever is better.
int main() { Func(); }
//// Program2.c++
// This forces Func() to be imported.
// The declaration must come *BEFORE* the definition.
__declspec(dllimport) __declspec(noinline) void Func();
#include <Module.c++>
int main() { Func(); }
//// Program3.c++
// This forces Func() to be exported.
__declspec(dllexport) __declspec(noinline) void Func();
#include <Module.c++>
Note that this can be used to selectively import and export individual symbols from the library, although still cumbersomely.
GCC also accepts this (but the order of the declarations must be changed) and Clang does not have any way to achieve the same effect without changing the library's source.

why do I need to link a lib file to my project?

I am creating a project that uses a DLL. To build my project, I need to include a header file, and a lib file. Why do I need to include the respective lib file? shouldn't the header file declare all the needed information and then at runtime load any needed library/dll?
Thanks

In many other languages, the equivalent of the header file is all you need. But the common C linkers on Windows have always used import libraries, C++ linkers followed suit, and it's probably too late to change.
As a thought experiment, one could imagine syntax like this:
__declspec(dllimport, "kernel32") void __stdcall Sleep(DWORD dwMilliseconds);
Armed with that information the compiler/linker tool chain could do the rest.
As a further example, in Delphi one would import this function, using implicit linking, like so:
procedure Sleep(dwMilliseconds: DWORD); stdcall; external 'kernel32';
which just goes to show that import libraries are not, a priori, essential for linking to DLLs.

That is a so-called "import library" that contains minimal wiring that will later (at load time) ask the operating system to load the DLL.

DLLs are a Windows (MS/Intel) thing. The (generated) lib contains the code needed to call into the DLL and it exposes 'normal' functions to the rest of your App.

No, the header file isn't necassarily enough. The header file can contain just the declarations of the functions and classes and other things you need, not their implementations.
There is a world of difference between this code:
void Multiply(int x, int y);
and this code:
void Multiply(int x, int y)
{
return x * y;
}
The first is a declaration, and the second is a definition or implementation. Usually the first example is put in header files, and the second one is put in .CPP files (If you are creating libraries). If you included a header with the first and didn't link in anything, how is your application supposed to know how to implement Multiply?
Now if you are using header files that contain code that is ALL inlined, then you do not need to link anything. But if even one method is NOT inlined, but has its implementation in a .CPP file that is compiled to a .lib file, than you need to link in the .lib file.
[EDIT]
With your use of Import Libraries, you are telling the linker to NOT include the implementation details of the imported code into your binary. Instead the OS will then load the import DLL at run-time into your process. This will make your application smaller, but you have to ship another DLL with it. If the implementation of the library changes, you can just reship another DLL to your customers, and not have to reship the entire application.
There is another option where you can just link in a library and you don't need to ship another DLL. That option is where the Linker will include the implementation into your application, making it bigger in size. If you have to change the implementation details in the imported library, then you have to recompile and relink your entire application, and reship the entire thing to your customers.

There are two relevant phases in the building process here:
compilation: from the source code to an object file. During the compilation, the compiler needs to know what external things are available, for that one needs a declaration. Declarations designed to be used in several compilation units are grouped in header. So you need the headers for the library.
linking: For static libraries, you need the compiled version of the library. For dynamic libraries, in Unix you need the library, in windows, you need the "import library".
You could think that a library could also embed the declarations or the header could include the library which needs to be linked. The first is often done in other languages. The second is sometimes available through pragmas in C and C++, but there is no standard way to do this and would be in conflict with common usage (such as choosing a library among several which provide code variant for the same declarations, for instance debug/release single thread/multithreads). And neither choice correspond well with the simple compilation model of C and C++ which has its roots in the 60's.

The header file is consumed by the compiler. It contains all the forward declarations of functions, classes and global variables that will be used. It may also contain some inline function definitions as well.
These are used by the compiler to give it the bare minimum information that it needs to compile your code. It will not contain the implementation details.
However you still need to link in all the function, and variable definitions that you have told the compiler about. Failure to do so will result in a linker error. Often this is contains in other object files which may be joined into a single static library.
In the case of DLLs (or .so files), we still need to tell the linker where in the DLL or shared object the missing symbols are. On windows, this information is contained in a .lib file. This will generate the code to load and link the code at runtime.
On unix the the dll and lib files are combined into a single .so file which you must link against to about linker errors.
You can still use a dll without a .lib file but you will then have to load and link in all the symbols manually using operating system APIs.

from 1000 ft, the lib contains the list of the functions that dll exports and addresses that are needed for the call.

Can a Visual Studio produced static library, be stripped of symbols?

I'll divide this questions in 3 parts:
I would like to produce a static library and strip off its symbols. (Debug info is already not included)
Similar to the strip command in linux. Can it be done?
Is there an equivalent tool in windows env, to the nm tool in linux?
When creating a static library using VS2008. Is it possible to define a script that will exclude some of the produced .obj files out of the build and out of the static lib?
Can it be dynamic? I mean I'd define a compilation mode in the script and this would result in specific object files being excluded from the build

If anything is visible that you feel should not be, try declaring it with the "static" keyword. This tells the compiler that it is accessible only to the current module.

There are cases where it would be convenient to be able to strip out all but a small number of "exported" public symbols, but it's not really feasible.
A static library is little more than a collection of .obj files. The internal dependencies haven't been resolved yet, and they won't be resolved until link time.
For example, if your .lib consists of foo.obj and bar.obj, and there's a call in foo.obj to a function defined in bar.obj, then that symbol must be available at link time, even if nothing outside of the library should be able to see it.
For that reason, you cannot strip the symbols (with the possible exception of file-scope static symbols). Even class methods that are protected or private (in the C++-sense) will exist in the symbol table, since the enforcement of the visibility is a compile-time issue, not a link-time one.
In contrast, a dynamic library is a standalone binary that has already been linked. References from foo.obj to bar.obj have already been resolved. Thus a DLL can be stripped of symbols except for the ones that must be exported (and even those can be renamed or replaced by ordinals).
If your DLL exposes a simple C API, then you're all set. But if you want to expose a C++ class, you're probably going to end up exporting all of its methods, even the protected and private ones (since inlining in the external application might result in direct calls to private methods).

No, how do you think the users of the static library would link to it without knowing where are the symbols they use defined?
Yes, try the DUMPBIN utility.
Well, yes. You can run the LIB utility with /REMOVE:foo.
That said, I think you are doing something that either is not worth doing or could be done a lot simpler than with removing library members.

I kept finding the names of certain (but not all) static functions in .obj files produced by VS2010. Interestingly, they were visible in my Release .obj files but not the Debug .obj files. I just used cygwin strings to perform the search:
$ strings myObjectFile.obj | grep myStaticFunctionName
I tracked it down to the "Whole Program Optimization = Yes" setting ("/GL"). When I switched this to "No" the function names no longer appear.
Update: As a followup test I opened the "cleansed" myObjectFile.obj in vim and I can still find them (with either :set encoding=utf-8 or :set encoding=latin1). I'm not sure why strings was missing the matches. Oh well.

LNK4221 and LNK4006 Warnings

I am making a static library of my own. I have taken my code which works and now put it into a static library for another program to use. In my library I am using another static library which I don't want the people who will be using my API to know. Since, I want to hide that information from them I can't tell them to install the other static library.
Anyway, I used the command line Lib.exe to extract and create a smaller lib file of just the obj's I used. However, I get a bunch of LNK4006 :second definition ignored linker warnings for each obj I use followed by LNK4221 no public symbols found;archive member will be inaccessible.
I am doing this work in vs2008 and I am not sure what I am doing wrong.
I am using the #pragma comment line in my .cpp file
I have also modified the librarian to add my smaller .lib along with its location.
my code simply makes calls to a couple functions which it should be able to get from those Obj file in the smaller lib.
All my functions are implemented in .cpp file and my header just have the includes of the third party header files and come standard c++ header files. nothing fancy. I have actually no function definitions in there atm. I was going to put the API definition in there and implement that in the .cpp for this static lib that i was going to make. However, I just wanted to build my code before I added more to it.
I did read http://support.microsoft.com/default.aspx?scid=kb;EN-US;815773 but it did not provide a solution.

Even if you extract all objects from the other library and put them in your own library, your users will still be able to see what's in your library and thus see all the object names. In many cases the names of the objects will reveal what's actually the other library you are using.
Instead of distributing your library as a static library, consider distributing it as a DLL. In the DLL you can easily hide all the underlying things and only make public what you want to make public.

Why to use __declspec(dllexport)? Seems to be working without it

Been a while since I have programmed in C++, so the whole export/import idea slipped off my mind.
Can you explain me why to use __declspec(dllexport) & import thingy if it looks like I can use classes from other libraries without those.
I have created a solution in VC++ 2005, added the console applicaiton project and two dll libraries projects. Then create ClassA in a LibA, ClassB in LibB project.
Once I have included ClassA.h & ClassB.h into my console app source code, and has linked it with a LibA.lib and LibB.lib I was able to create and use instances of ClassA and ClassB in a console applicaiton. So basically I was able to use classes without exporting/importing them using __declspec.
Can you explain me - what I am missing here.

Once I have included ClassA.h & ClassB.h into my console app source code, and has linked it with a LibA.lib and LibB.lib I was able to create and use instances of ClassA and ClassB in a console applicaiton.
This sounds like you have used static linking. This works without the __declspec(dllexport) in the same way as linking with the object files of your classes directly.
If you want to use dynamic (run-time) linking with a DLL, you have to use either the mentioned declaration or a DEF-file specifying the exported functions. DLLs contain an exports table listing the functions exposed to other executables. All other functions remain internal to your DLL.
Perhaps you are confused coming from the Linux world, where the situation is the other way round: All symbols are visible externally by default.

You would use __declspec(dllexport) if you wanted to provide symbols in your dll for other dlls/exes to access.
You would use __declspec(dllimport) if you wanted to access symbols in your dll/exe provided by another dll.
Not necessary if you are linking against a static .lib.

If you are including .h files and linking to .lib files then you can drop the DLL declarations. Why do you need a dynamic link library if you only need static linking?
The export declaration marks the function as available for exporting. The declaration you are using may be a macro for "extern" and "pascal" It's been years since I've done this but I think DLLs function calls have a different ordering of pushing params on the stack and the allocation of the return result is done differently (the pascal flag). The extern declaration helps the linker make the function available when you link the library.
You may have missed the step of linking the DLL - the linker will take classA.lib and turn it into classA.dll ( you may need to setup a setupA.def file to define the DLL library). Same applies to ClassB

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js