Strip out symbols from dll - c++

I have built a dll using MinGW gcc on windows. I used .def to specified exported functions and generated .lib import lib. Then I used strip.exe to strip out the symbols table. I tried objdump and it prints out empty symbol table. But when I use strings.exe, it can still print a bunch of function and class names. Is this a problem? Would others be able to query functions according to the names from the dll?

Read about PE, it has address and name tables. If you clean up name table, no process will be able to locate exported function by name, only by address, that's done by linker, but not dynamically:
http://msdn.microsoft.com/en-us/magazine/cc301808.aspx
There are many ways to secure exported functions from dll,but all of them have published exploits. If you really want to secure your function - create custom calling convention or signature verification at all exported functions.

All that stripping does to the dll is to remove the debugging symbols. It does NOT remove the functions from the dll. In other words, if someone imports your dll or uses it, they can access only whatever you exported. If you do not want them to be able to access it, simply do not export it.
Also, when you do a release build, it should strip it if you have -s option enabled.

Related

When building a dll, can the lib be used for linking?

I wanted to create a C++ dll (to be used in a dot net application). Some functionality I needed was already implemented in another C++ dll.
The dll I was referencing was set up like this (including the comment):
extern "C"
{
__declspec(dllexport) BOOL SomeFunctionToBeUsedExternally();
}
// internal functions
BOOL OtherFunctions();
I need to use one of the OtherFunctions in my code.
So, I added the proper include in my own code, added dependencies on the lib created by the dll above, and used the method I needed. As a result, of course, I got another __declspec(dllexport)... function.
It refused to link though, I got an error about the OtherFunction.
I verified everything, looked online - nothing seems to solve my problem.
Then, I added a __declspec(dllexport) in front of the function I needed, and it works.
I don't understand though. I thought, the dllexport marked functions will be exported to the dll, but aren't all functions sent to the lib ?
Why do I have to export functions to the dll, if I am not linking against the dll but against the lib ?
No, the linker does not automatically export all identifiers. The dllexport attribute tells the linker which identifiers are exported. Without this you would be forced to either export every identifier in the DLL or specify which identifiers should not be exported. When the linker creates the DLL it also creates an import library and includes information about which identifiers are exported based on that attribute.
When you want to use a DLL you need link with the appropriate .lib file for good reason. The .lib file tells the linker which identifiers are exported, the name of the DLL they are in and other information. It is also possible to export identifiers from the DLL by ordinal instead of by name. In this case the linker still needs to match the identifier with the appropriate ordinal. This is only possible by having an accompanying library file that contains that information since it is not present in DLL's export table.
No, only exported functions end up in the .lib. As you can tell.
It is not a static link library, it the import library for the DLL. It is a very simple and very small file since it contains no code at all. Just a list of the exported functions. The linker needs it to resolve the external in the client code, it needs to know the name of the DLL and the actual exported function name or ordinal (could be different) so it can add the entry to client's import table. Import libraries having the same filename extension as static libraries was perhaps a bit unfortunate.

Hide export symbols without using a DEF file

I would like to hide exported symbols from a DLL for obfuscation purposes.
That is pretty neatly doable when using a module definition file (.def) looking something like this;
EXPORTS
??0Foo#QAE#XZ #1 NONAME
??1Foo#QAE#XZ #2 NONAME
?Bar#Foo#UAEHXZ #3 NONAME
Trouble is, such solution is highly inflexible and demands manual work. As you can see within my example, I am exporting C++ symbols, hence they are heavily decorated by my compiler.
So my current workflow looks like this;
I have to first create a version of my DLL that exports all symbols in the standard way using __declspec(dllexport), then I need to extract all exported symbol names using dumpbin or alike. After that is done, I need to copy&paste the symbols into my module definition file and add that NONAME directive. Then I have to make sure that my original sources do not use that __declspec(dllexport) anymore. Once all of that is done, I need to activate that .def file within the project settings and then I can finally build the export symbol free version of that DLL. Plenty of work for that rather simple task, I guess.
Before covering all of this using a bunch of scripts and stuff, I thought that maybe, just maybe there is a solution that is much simpler?
Please note that I am using VisualStudio (2012) and hence that nifty GCC pragma hidden wont do, as far as I know.

Can a Visual Studio produced static library, be stripped of symbols?

I'll divide this questions in 3 parts:
I would like to produce a static library and strip off its symbols. (Debug info is already not included)
Similar to the strip command in linux. Can it be done?
Is there an equivalent tool in windows env, to the nm tool in linux?
When creating a static library using VS2008. Is it possible to define a script that will exclude some of the produced .obj files out of the build and out of the static lib?
Can it be dynamic? I mean I'd define a compilation mode in the script and this would result in specific object files being excluded from the build
If anything is visible that you feel should not be, try declaring it with the "static" keyword. This tells the compiler that it is accessible only to the current module.
There are cases where it would be convenient to be able to strip out all but a small number of "exported" public symbols, but it's not really feasible.
A static library is little more than a collection of .obj files. The internal dependencies haven't been resolved yet, and they won't be resolved until link time.
For example, if your .lib consists of foo.obj and bar.obj, and there's a call in foo.obj to a function defined in bar.obj, then that symbol must be available at link time, even if nothing outside of the library should be able to see it.
For that reason, you cannot strip the symbols (with the possible exception of file-scope static symbols). Even class methods that are protected or private (in the C++-sense) will exist in the symbol table, since the enforcement of the visibility is a compile-time issue, not a link-time one.
In contrast, a dynamic library is a standalone binary that has already been linked. References from foo.obj to bar.obj have already been resolved. Thus a DLL can be stripped of symbols except for the ones that must be exported (and even those can be renamed or replaced by ordinals).
If your DLL exposes a simple C API, then you're all set. But if you want to expose a C++ class, you're probably going to end up exporting all of its methods, even the protected and private ones (since inlining in the external application might result in direct calls to private methods).
No, how do you think the users of the static library would link to it without knowing where are the symbols they use defined?
Yes, try the DUMPBIN utility.
Well, yes. You can run the LIB utility with /REMOVE:foo.
That said, I think you are doing something that either is not worth doing or could be done a lot simpler than with removing library members.
I kept finding the names of certain (but not all) static functions in .obj files produced by VS2010. Interestingly, they were visible in my Release .obj files but not the Debug .obj files. I just used cygwin strings to perform the search:
$ strings myObjectFile.obj | grep myStaticFunctionName
I tracked it down to the "Whole Program Optimization = Yes" setting ("/GL"). When I switched this to "No" the function names no longer appear.
Update: As a followup test I opened the "cleansed" myObjectFile.obj in vim and I can still find them (with either :set encoding=utf-8 or :set encoding=latin1). I'm not sure why strings was missing the matches. Oh well.

Is there a .def file equivalent on Linux for controlling exported function names in a shared library?

I am building a shared library on Ubuntu 9.10. I want to export only a subset of my functions from the library. On the Windows platform, this would be done using a module definition (.def) file which would contain a list of the external and internal names of the functions exported from the library.
I have the following questions:
How can I restrict the exported functions of a shared library to those I want (i.e. a .def file equivalent)
Using .def files as an example, you can give a function an external name that is different from its internal name (useful for prevent name collisions and also redecorating mangled names etc)
On windows I can use the EXPORT command (IIRC) to check the list of exported functions and addresses, what is the equivalent way to do this on Linux?
The most common way to only make certain symbols visible in a shared object on linux is to pass the -fvisibility=hidden to gcc and then decorate the symbols that you want to be visible with __attribute__((visibility("default"))).
If your looking for an export file like solution you might want to look at the linker option --retain-symbols-file=FILENAME which may do what you are looking for.
I don't know an easy way of exporting a function with a different name from its function name, but it is probably possible with an elf editor. Edit: I think you can use a linker script (have a look at the man page for ld) to assign values to symbols in the link step, hence giving an alternative name to a given function. Note, I haven't ever actually tried this.
To view the visible symbols in a shared object you can use the readelf command. readelf -Ds if I remember correctly.
How can I restrict the exported functions of a shared library to those I want (i.e. a .def file equivalent)
Perhaps you're looking for GNU Export Maps or Symbol Versioning
g++ -shared spaceship.cpp -o libspaceship.so.1
-Wl,-soname=libspaceship.so.1 -Wl,
--version-script=spaceship.expmap
gcc also supports the VC syntax of __declspec(dllexport). See this.
Another option is to use the strip command with this way:
strip --keep-symbol=symbol_to_export1 --keep-symbol=symbol_to_export2 ... \
libtotrip.so -o libout.so

Trouble compiling dll that accesses another dll

So, I have an interesting issue. I am working with a proprietary set of dlls that I ,obviously, don't have the source for. The goal is to write an intermediate dll that groups together a large series of funnction calls from the proprietary dlls. The problem I am having, when compiling with g++, is that I get errors for the original dlls along the lines of:
cannot export libname_NULL_THUNK_DATA. Symbol not found.
If I add a main and just compile to an executable everything works as expected. I'm using mingw for compilation. Thanks for any help.
In response to the first reply: Either I'm confused about what you're saying or I didn't word my question very well. I'm not explicitly trying to export anything from my wrapper I am just calling functions from their dlls. The problem is that I get errors that it can't export these specific symbols from the dll to my wrapper. The issue is that I'm not even entirely sure what these _NULL_THUNK_DATA symbols are for. I did a search and read somewhere that they shouldn't be exported because they're internal symbols that windows uses. I have tried using the --exclude-symbols directive to the linker but it didn't seem to do anything. I apologize if I'm completely misunderstanding what you're trying to say.
So, I think my issue was related to this. When just compiling a standard executable that uses a dll I was able to include the headers and directly call the functions for example:
#include :3rdparty.h
int main(){
dostuff(); // a function in the 3rdparty.dll
}
this would compile and run fine. I just needed to link the libraries in the g++ command.
When linking with the -shared flag I would get these errors (with main removed of course). I think it has something to do with the fact that by default g++ attempts to import all symbols from the dll. What I didn't understand is why this happens in the dll vs in an executable. I will try doing it using GetProcAddress(). Thank you!
it should be as easy as you think it should be.
eg:
your dll code needs:
void doStuff()
{
3rdparty.login();
3rdparty.dostuff();
3rdparty.logoff();
};
so far - so good, you've included the right headers .... (if you have them, if you don't then you need to import the library using LoadLibrary(), then create a function pointer to each exported dll entrypoint using GetProcAddress() and then call that function pointer)
You then link with the 3rd party lib and that's it. Occasionally you will have to wrap the definitions with 'extern "C"' in order to get the linkage name mangling correct.
As you say you're using g++, you can't be getting confused with __declspec(dllimport) which is a MS VC extension.
"Compiling" tells me that you're approaching this from the wrong end. Your DLL should not export its own wrapper functions, but directly refer to exports from other DLLs.
E.g. in a Windows Kernel32.DEF file, the following forward exists:
EXPORTS
...
HeapAlloc = NTDLL.RtlAllocHeap
There's no code for the HeapAlloc function.