I'm new to using DLL's in C++ and I am wondering how to sucessfully load and use classes contained in a DLL without getting a "corrupted stack", a "null pointer" or any application crash ^^. This is a bit confusing to me for now.
Even if it might be safer I don't want to use interface class because it seems too complicated to use. I know that I should have the same include in both DLL and application to prevent crashes. That means that my include file should mention members name.
Here is the problem :
I want to distribute my software (DLL + include file) and I don't want
my customers to see the architecture of my project. But the only class
I want to export from my DLL has as member, objects from other
classes. So my class is dependent of other classes and so on in
cascade.
Is it possible to provide only one include file of one class with just useful methods without having risks of crashes ? If no, what solutions might fit my needs ?
Thanks
FIrst of all, if you use the same include in DLL and in the application using it, if both were compiled with the same compiler and compatible settings, and both use the same runtime library (DLL one, not statically linked), you should have none of the problem you fear.
If you want to hide details of your DLL and provide an alterered, simplified header to the DLL users, you work on your own risk. For example, if there are data members missing (even private ones) of if there are virtual functions missing (even private ones), then it might probably crash, or worse, corrupt something without being noticed. So I'd strongly advise you not to consider this approach unless you're very very experienced with the assembler generated by your compiler.
If you want to hide implementation details of your class, the best approach would probably to design your library around the PIMPL idiom.
Related
I'm updating a DLL for a customer and, due to corporate politics - among other things - my company has decided to no longer share source code with the customer.
Previously. I assume they had all the source and imported it as a VC++6 project. Now they will have to link to the pre-compiled DLL. I would imagine, at minimum, that I'll need to distribute the *.lib file with the DLL so that DLL entry-points can be defined. However, do I also need to distribute the header file?
If I can get away with not distributing it, how would the customer go about importing the DLL into their code?
Yes, you will need to distribute the header along with your .lib and .dll
Why ?
At least two reasons:
because C++ needs to know the return type and arguments of the functions in the library (roughly said, most compilers use name mangling, to map the C++ function signature to the library entry point).
because if your library uses classes, the C++ compiler needs to know their layout to generate code in you the library client (e.g. how many bytes to put on the stack for parameter passing).
Additional note: If you're asking this question because you want to hide implementation details from the headers, you could consider the pimpl idiom. But this would require some refactoring of your code and could also have some consequences in terms of performance, so consider it carefully
However, do I also need to distribute the header file?
Yes. Otherwise, your customers will have to manually declare the functions themselves before they can use it. As you can imagine, that will be very error prone and a debugging nightmare.
In addition to what others explained about header/LIB file, here is different perspective.
The customer will anyway be able to reverse-engineer the DLL, with basic tools such as Dependency Walker to find out what system DLLs your DLL is using, what functions being used by your DLL (for example some function from AdvApi32.DLL).
If you want your DLL to be obscured, your DLL must:
Load all custom DLLs dynamically (and if not possible, do the next thing anyhow)
Call GetProcAddress on all functions you want to call (GetProcessToken from ADVAPI32.DLL for example
This way, at least dependency walker (without tracing) won't be able to find what functions (or DLLs) are being used. You can load the functions of system DLL by ordinal, and not by name so it becomes more difficult to reverse-engineer by text search in DLL.
Debuggers will still be able to debug your DLL (among other tools) and reverse engineer it. You need to find techniques to prevent debugging the DLL. For example, the very basic API is IsDebuggerPresent. Other advanced approaches are available.
Why did I say all this? Well, if you intend to not to deliver header/DLL, the customer will still be able to find exported functions and would use it. You, as a DLL provider, must also provide programming elements with it. If you must hide, then hide it totally.
One alternative you could use is to pass only the DLL and make the customer to load it dynamically using LoadLibrary() + GetProcAddress(). Although you still need to let your customer know the signature of the functions in the DLL.
More detailed sample here:
Dynamically load a function from a DLL
What is the best practice to add or modify a single class method in a well-established C++ library like OpenCV, while still reusing the remaining of the library code, preferably in the lib format.
At this point the only way I know is to copy all the source and header files that belong to the specific library (let's say OpenCV's core library) to the current source folder, modify that one function and recompile the module with the rest of the code. Ideally, I want to be able to link all the current .lib files the way they are, but simply define a new method (or modify a current method) for a class defined inside those libs in a way that my implementation of the method supersedes the implementation of the default library files.
Inheritance doesn't always seem to be an option, since sometimes the base class has private members that are required for the correct inherited class implementation.
I'm not aware of a clean way in C++ to accomplish what you're asking. What you're really asking to do (given that you need to use or modify private methods) is violate encapsulation, and the C++ language is designed to not let you do that.
A few options exist:
A .lib file is simply a collection of .obj files. Your compiler toolchain should have a command-line program for adding, deleting, and replacing .obj files in a .lib, so you could build one or two .obj files and merge them into the .lib. I suspect that this solution would be ugly and fragile.
If there's something that the library doesn't do and should do, then there's always a chance that you can submit a patch or feature request to the library authors to get that change made. Of course, this can take a while, if it works at all.
As #fatih_k suggests, adding your changes as friend classes would work. If the only change you make to OpenCV is to add a friend line to the header file, then the library's ABI will be unchanged, and you won't have to touch the .lib.
The cleanest option is to simply accept that you need to modify the OpenCV library and track its source code, along with your modifications, along with the source code you develop yourself, and build it along with the source code you build yourself. This is a very common approach, and various patterns and techniques exist to help you do it; for example, Subversion has the concept of vendor branches. This approach is more work to set up but is definitely the cleanest in the long run.
If the library is already compiled, there is not much you can do portably and cleanly.
If you know the specific target architecture the program will be runnning on, you could get the pointer to the member function and monkey patch the instructions with a jmp instruction to your own version of the method. If the method is virtual, you can modify the vtable. Those requires a lot of compiler-specific knowledge and would not be portable.
If the library ships in dynamic link archive, you could extract the archive and replace the method with your own version, and repack the archive.
Another method is you can copy the class' declaration from the header and add a friend declaration. Alternatively, you can do #define private public or #define private protected before including a header file. These will give you access to their private members.
With any of the above, you need to be careful that your changes does not modify the ABI of the library.
Well, OpenCV is licensed under BSD so you can make your changes without worries about republishing them.
You could always follow the Proxy design pattern and add the new method external to the library, and call into the library from there. That means you don't need to worry about maintaining your own version of OpenCV and distributing it as well. There's more information about Proxy patterns on Wiki to get you started.
Update:
Thanks everyone for kindly replying. Seems I was a little bit confused. The change was to add a new member function to the base class. I just realized maybe I do need to recompile everything that depends on the dll that exporting this class, since the addresses for function names in the symbol table changed. Please confirm I'm right or wrong.
Got into a debate for this,
When there is a member function change in some base class, I only recompiled all classes derived from this base, and ran into some run time error.
On the other side of the debate, I was told instead I should recompile all the classes that "depends" on this base class.
I'm not sure whether that is correct or not? Because I'm building DLLs and I always understand this dynamic link idea to be not to recompile.
If it's true, I'm also wondering what kind of "dependency" here is?
This question might be asked too general, please let me know if any other details I should provide. Really need to learn about the compiling and linking stuff.
Thanks!
DLLs and classes don't play well together. (Using classes inside a DLL is fine, it's when you try to export them that you have problems.)
For this reason, component/object systems (such as COM, ActiveX, CORBA) define an interface that separates the user from the implementation. If the public API for the DLL only uses pointers to a type with only pure virtual functions, then it has no layout shared between the DLL and caller.
If you try to share classes with data or inline functions, you have tight coupling and will need to recompile all users for the slightest implementation change.
I have a question about shared libraries/.dlls and class importing/exporting. From what I have understood after some research the two ways to do it is as follows:
Just using __declspec(dllexport/dllimport) before the class and praying that the different compiler versions mangle the names in the same way, and accept the fact that your .dll will not be useable by different compilers.
Use pure virtual classes in the dll as interfaces. Then all the classes that would need exporting would inherit from them and should implement the virtual functions. What the .dll would export in this case is just "factory" Construct/Destruct functions which would create and release the objects.
These are the only two ways I know. The first is a no-no since it offers 0 portability. The second, though convenient and a good programming design for a .dll that accomplishes a purpose starts to be annoying when you realize that you need to have a different constructor function for every different constructor, can only use POD types as parameters and you lose many advantages of C++ classes such as overloaded functions and default function arguments.
For a .dll that is supposed to offer a library to the user, a collection of classes for example even the second way becomes really inconvenient. So I am wondering what is the solution here? Just compile a different .dll with the first way for each major compiler? How do the shared libraries version of big libraries work? For example wxWidgets does offer a .dll version too. How do they achieve normal usage of classes by the end user of the .dll avoiding the interface solution?
Solution with having separate DLL for each compiler version works. At the same time it is still not an official feature. You will never know if next service pack will break the compatibility or not, nobody will give you exact list of compiler/linker keys that will maintain/break the compatibility, etc. I heard reliable rumors that Windows8 finally implemented this in the right way.
By the way Visual Studio 2012 Release Candidate can be still downloaded from: http://msdn.microsoft.com/en-us/vstudio/bb984878.aspx. Maybe you should try it?
Is it possible to implement monkey patching in C++?
Or any other similar approach to that?
Thanks.
Not portably so, and due to the dangers for larger projects you better have good reason.
The Preprocessor is probably the best candidate, due to it's ignorance of the language itself. It can be used to rename attributes, methods and other symbol names - but the replacement is global at least for a single #include or sequence of code.
I've used that before to beat "library diamonds" into submission - Library A and B both importing an OS library S, but in different ways so that some symbols of S would be identically named but different. (namespaces were out of the question, for they'd have much more far-reaching consequences).
Similary, you can replace symbol names with compatible-but-superior classes.
e.g. in VC, #import generates an import library that uses _bstr_t as type adapter. In one project I've successfully replaced these _bstr_t uses with a compatible-enough class that interoperated better with other code, just be #define'ing _bstr_t as my replacement class for the #import.
Patching the Virtual Method Table - either replacing the entire VMT or individual methods - is somethign else I've come across. It requires good understanding of how your compiler implements VMTs. I wouldn't do that in a real life project, because it depends on compiler internals, and you don't get any warning when thigns have changed. It's a fun exercise to learn about the implementation details of C++, though. One application would be switching at runtime from an initializer/loader stub to a full - or even data-dependent - implementation.
Generating code on the fly is common in certain scenarios, such as forwarding/filtering COM Interface calls or mapping OS Window Handles to library objects. I'm not sure if this is still "monkey-patching", as it isn't really toying with the language itself.
To add to other answers, consider that any function exposed through a shared object or DLL (depending on platform) can be overridden at run-time. Linux provides the LD_PRELOAD environment variable, which can specify a shared object to load after all others, which can be used to override arbitrary function definitions. It's actually about the best way to provide a "mock object" for unit-testing purposes, since it is not really invasive. However, unlike other forms of monkey-patching, be aware that a change like this is global. You can't specify one particular call to be different, without impacting other calls.
Considering the "guerilla third-party library use" aspect of monkey-patching, C++ offers a number of facilities:
const_cast lets you work around zealous const declarations.
#define private public prior to header inclusion lets you access private members.
subclassing and use Parent::protected_field lets you access protected members.
you can redefine a number of things at link time.
If the third party content you're working around is provided already compiled, though, most of the things feasible in dynamic languages isn't as easy, and often isn't possible at all.
I suppose it depends what you want to do. If you've already linked your program, you're gonna have a hard time replacing anything (short of actually changing the instructions in memory, which might be a stretch as well). However, before this happens, there are options. If you have a dynamically linked program, you can alter the way the linker operates (e.g. LD_LIBRARY_PATH environment variable) and have it link something else than the intended library.
Have a look at valgrind for example, which replaces (among alot of other magic stuff it's dealing with) the standard memory allocation mechanisms.
As monkey patching refers to dynamically changing code, I can't imagine how this could be implemented in C++...