How do I make my dependencies call my global operator new? - c++

I have a test app that is linked with some DLLs (or .so's). In my main app I defined a global new/delete like this:
void* operator new(size_t n)
{
....
}
void operator delete(void* p)
{
...
}
But I noticed the operators are only called for things I allocate in my main app, but not if one of the DLLs does.
How do I make allocations in the DLLs go through my operator new/delete? (This should also include memory allocated by STL, so if one of the DLLs has an std::string, I'd like my operator new to be called when STL allocates its std::string internal buffer).
I'm more interested in a Windows solution, but a Linux one would also be appreciated.
edit: perhaps I wasn't clear originally, this test app I was doing was meant to track memory usage for a few auto-generated classes defined in a DLL. Creating my own allocator and using that in the generated code STL structures isn't an option, more so there are other non-STL allocations. But seeing the answers, I think the best option is either to use a profiler or simply monitor the memory usage using perfmon.

I'd like my operator new to be called when STL allocates its std::string internal buffer
typedef std::basic_string<char, std::char_traits<char>, ALLOCATOR> mystring;
The code in the DLLs already uses its own new implementation, and there's no good reason why defining your own implementation should magically change the implementation that the DLLs use (what if they use their own custom implementation?).
So if you want the strings to use your allocator, you need to explicitly create them as such.

Anything that is intended to use your global definitions must be compiled with those definitions available. The technique you use will not override anything already compiled in DLLs or even other source files that don't include these definitions. In many cases the allocators and standard functions will also not use these functions even when visible.
If you really need to do this you'll have to intercept calls to malloc (and other allocation routine). This isn't easy. You can't simply do it from the code. You'll have to tell the linker how to do this. On Linux I think this is LD_PRELOAD, though I can't remember, and on Windows I'm not sure at all.
If you can indicate why you'd like to do this perhaps I can offer an alternate solution.

There is no way to completely do what you want to do. There are too many corner cases where memory is leaked.
The closest I think you can get is by doing the following:
Each class in your dll/.so will have to have a static factory/destroy method. Pass a pointer to an allocation function to the factory and the deallocation function to the destroy method in each class in each dll/.so.
For an example of how to get close, google for the HORDE memory allocation library, which does get close.
Another thing to look at is the various C++ class plugin libraries that allow you to load any dll/.so as a plugin. Last time I checked there were at least 10 such libraries with source in the googlesphere. :)

Related

Mixing versions of the MSVCRT

So, I have a C++ library with a statically linked copy of the MSVCRT. I want for anyone to be able to use my library with any version of the MSVC Runtime. What is the best way to accomplish this goal?
I'm already being quite careful with how things are done.
Memory never passes the DLL barrier to be freed
Runtime C++ objects aren't passed across barriers (ie, vectors, maps, etc.. unless they were created on that side of the barrier)
No file handles or resource handles are passed between barriers
Yet, I still have some simple code that causes heap corruption.
I have an object like so in my library:
class Foos
{
public: //There is an Add method, but it's not used, so not relevant here
DLL_API Foos();
DLL_API ~Foos();
private:
std::map<std::wstring, Foo*> map;
};
Foos::~Foos()
{
// start at the begining and go to the end deleting the data object
for(std::map<std::wstring, Foo*>::iterator it = map.begin(); it != map.end(); it++)
{
delete it->second;
}
map.clear();
}
And then I use it from my application like so:
void bar() {
Foos list;
}
After I call this function from anywhere, I get a debug warning about stack corruption. And If I actually let it run out, it actually does corrupt the stack and segfault.
My calling application is compiled with Visual Studio 2012 platform tools. The library is compiled using Visual Studio 2010 platform tools.
Is this just something I should absolutely not be doing, or am I actually violating the rules for using multiple runtimes?
Memory never passes the DLL barrier
But, it does. Many times in fact. Your application created the storage for the class object, in this case on the stack. And then passes a pointer to the methods in the library. Starting with the constructor call. That pointer is this inside the library code.
What goes wrong in a scenario like this one is that it didn't create the correct amount of storage. You got the VS2012 compiler to look at your class declaration. It uses the VS2012 implementation of std::map. Your library however was compiled with VS2010, it uses a completely different implementation of std::map. With an entirely different size. Huge changes thanks to C++11.
This is just complete memory corruption at work, code in your application that writes stack variables will corrupt the std::map. And the other way around.
Exposing C++ classes across module boundaries are filled with traps like that. Only ever consider it when you can guarantee that everything is compiled with the exact same compiler version and the exact same settings. No shortcuts on that, you can't mix Debug and Release build code either. Crafting the library so no implementation details are exposed is certainly possible, you have to abide by these rules:
Only expose pure interfaces with virtual methods, argument types must be simple types or interface pointers.
Use a class factory to create the interface instance
Use reference counting for memory management so it is always the library that releases.
Nail down core details like packing and calling convention with hard rules.
Never allow exceptions to cross the module boundary, only use error codes.
You'd be well on you way to write COM code by then, also the style you see used in for example DirectX.
map member variable is still created by application with some internal data allocated by application and not DLL (and they may use different implementations of map). As a rule of thumb don't use stack objects from DLLs, add something like Foos * CreateFoos() in your DLL.
Runtime C++ objects aren't passed across barriers (ie, vectors, maps,
etc.. unless they were created on that side of the barrier)
You are doing exactly that. Your Foos object is being created by the main program on the stack and then being used in the library. The object contains a map as a part of it...
When you compile the main program it looks at the header files etc to determine how much stack space to allocate for the Foos object. And the calls the constructor which is defined in the library... Which might have been expecting an entirely different layout/size of the object
It may not fit your needs, but don't forgot that implementing the whole thing in header files simplifies the problem (sort of) :-)

Replace Standard C++ Allocator?

I want to replace the standard allocator with a more robust allocator (the C++ standard only requires an overflow check on vector::resize). The various C++ allocators supplied with many libraries fall flat on their face when fed negative self tests.
I have access to a more robust allocator. ESAPI's allocator not only checks for overflow, it also has debug instrumentation to help find mistakes. http://code.google.com/p/owasp-esapi-cplusplus/source/browse/trunk/esapi/util/zAllocator.h.
Is there a standard way to replace the C++ allocator used in a program without too much effort? I also want to ensure its replaced in library code, which I may not have access to source code.
Unlike malloc which is a library function that can be replaced by another function with the same signature, std::allocator is a class template and template code is instantiated as needed and inlined into code that uses it. Some standard library code will have already been compiled into the library's object files and will contain instantiated std::allocator code which can't be replaced. So the only way is if the standard library provides some non-standard way to replace its std::allocator. Luckily, GCC's libstdc++ allows you to do just that, allowing you to select the implementation used for std::allocator when GCC is configured and built, with a few different choices
It wouldn't be too much work to add the ESAPI allocator to the GCC sources as one of the options, then rebuild GCC to use that allocator as the base class of std::allocator providing its implementation. You might need to tweak the ESAPI allocator code a bit, and maybe alter the libstdc++ configure script to allow you to say --enable-libstdcxx-allocator=esapi
If you want to modify allocation on a global basis instead of per-container, you probably want to replace ::operator new and ::operator delete. Conceivably, you'd also want to replace ::operator new[] and ::operator delete[] as well -- but these are only used for allocating arrays, which you should almost never use anyway (aside, in case it wasn't obvious: no, these are not used to allocate memory for a std::vector, despite its being rather similar to an array in some ways).
Although trying to replace most parts of the library is prohibited, the standard specifically allows replacing these.
Of course, if somebody is already specifying a different allocator for a particular container, and that allocator doesn't (eventually) get its memory via ::operator new (or ::operator new[]) this will not affect that container/those containers.
In C++0x, define a new template alias in namespace mystd that is a std::vector but with your custom allocator. Replace all std::vectors with mystd::vector. Get rid of all using namespace std and using std::vector in your code.
Rebuild. Replace the places where you used a raw vector<T> with mystd::vector<T>.
Oh, and use a better name than mystd.

Heap corruption when using DLL code

I have some code that I need to place in a common library dll. This code, a class CalibrationFileData, works perfectly fine when it's built as part of the current project. However, if CalibrationFileData is built in the common library, the program crashes, mentioning heap corruptions.
I have made sure that all allocations and deallocations occur within the class, with appropriate accessors, etc. Still, the problem won't go away. Just in case it makes any difference, I am sometimes passing vectors of pairs, definitely not plain old data, but the vector manipulation only occurs through the accessors, so there shouldn't be any allocation happening across modules.
Anything I'm missing?
Edit: The vectors are these:
std::vector<std::pair<CvPoint2D32f, CvPoint3D32f>>* extrinsicCorrespondences;
std::vector<int>* pointsPerImage;
I shouldn't need to worry about deep copies, since they're not heap allocated, right? Incidentally, I tried using pointers to vectors, as above, to sidestep the problem, but it didn't make a difference anyway.
Check the compile flags match between the library and the executable. For example, on Windows ensure you're using the same C Runtime Library (CRT) (/MD vs /MT). Check for warnings from the linker.
Are you certain that when you take ownership of the contents of the vector objects, within your methods, you are deep-copying them into your instance variables?
you should check deep copies inside vector object,it is pertain with deepcopy i think
Could it be different values defined for _SECURE_SCL in the two projects ?
It's likely you missed some piece of the client's code that tries to deallocate a memory that your DLL allocated or vice versa.
Perhaps the easiest thing to do would be to ensure that both client and the DLL use the same memory allocator. Not just the same kind, but the same actual "instance" of the allocator. On Visual C++, this is most easily achieved by both client and DLL using a "Multi-threaded Debug DLL (/MDd)" or "Multi-threaded DLL (/MD)" runtime library.
In the command line option, in visual studio remove this _SECURE_SCL. Mostly this crash is caused by _SECURE_SCL mismatch among the details. More details can be found here: http://kmdarshan.com/blog/?p=3830
You should distinguish between tightly coupled and loosely coupled DLLs.
Tightly coupled DLLs mean that
the DLL is built with the exact same compiler version, packing and
calling convention settings, library options as the application, and
both dynamically link to the runtime library (/MD compiler option).
This lets you pass objects back and forth including STL containers,
allocate DLL objects from inside the application, derive from base
classes in the other module, do just about everything you could
without using DLLs. The disadvantage is that you can no longer
deploy the DLL independently of the main application. Both must be
built together. The DLL is just to improve your process startup time
and working set, because the application can start running before
loading the DLL (using the /delayload linker option). Build times
are also faster than a single module, especially when whole program
optimization is used. But optimization doesn't take place across the
application-DLL boundary. And any non-trivial change will still
require rebuilding both.
In case of loosely coupled DLLs, then
When exporting DLL functions, it's best if they accept only integral
data types, i.e. int or pointers.
With reference, for example, to strings, then:
When you need to pass a string, pass it as a const char *. When you
need the DLL function to return a string, pass to the DLL a char *
pointer to a pre-allocated buffer, where the DLL would write the
string.
Finally, concerning memory allocation, then:
Never use memory allocated by the DLL outside of the DLL's own
functions, and never pass by value structures that have their own
constructor/destructor.
References
Heap corruption when returning from function inside a dll
Can I pass std::string to a DLL?

How do i make a C++ DLL call my overloaded global new operator?

I have overloaded global new/delete (and new[] / delete[]) to fill and check guarded blocks. Works fine. Now i link to C++ DLLs passing STL-Container instances that are filled or modified by the DLLs. When destructing these containers i run into an error, as they were not allocated using my overloaded new operator and vice versa the dlls generate errors when freeing container elements that were created using my overloaded new.
How can i make DLLs call my new operator?
For some DLLs i have the sources, for others i don't have it.
There must be a overall approach as i.e. the Visual Studio Runtime DLLs MSVCP*.DLL call my overloaded operators. How can i make other DLLs call my operators too?
a) With having the sources for the DLL?
and check
b) Without the sources for the DLL?
for dlls you can compile, you can get them to call your overloaded methods by
making sure the calling code includes the header defining your overloads
exporting those overloads from your dll by specifying them in an export file
here are the exports (using mangled names, never found another way to do it) for new/delete/new[]/delete[], throwing versions.
x86:
EXPORTS
??2#YAPAXI#Z
??3#YAXPAX#Z
??_U#YAPAXI#Z
??_V#YAXPAX#Z
x64:
EXPORTS
??2#YAPEAX_K#Z
??3#YAXPEAX#Z
??_U#YAPEAX_K#Z
??_V#YAXPEAX#Z
I don't think this works for dlls that you're not compiling yourself though (at the time they were built the linker already took care of finding the references to the methods); to do that you might have to use rather dirty tricks like hooking the crt for your process.
edit for the other way around, you could pass allocators from the host application into the dll and make sure the dll only uses those allocators to allocate, instead of new/delete. Havee thos alloactors call your overloaded new/delete in turn. It's a bit messy but should work, also with STL since you can specify the allocator for those containers; but again, this does not solve anything if you want a dll for which you do not have the code to allocate using your bounds checking code.
I think you need to place the new/delete code in a DLL and ensure that both the exe and your extra DLL's call this common code.
Even this approach has problems so for me it is better to ensure architecturally that the module that allocated a chunk of memory is the same module that deletes it however this isn't a simple requirement is many cases.

C++ : When do I need a shared memory allocator for std::vector?

First_Layer
I have a win32 dll written in VC++6 service pack 6. Let's call this dll as FirstLayer. I do not have access to FirstLayer's source code but I need to call it from managed code. The problem is that FirstLayer makes heavy use of std::vector and std::string as function arguments and there is no way of marshaling these types into a C# application directly.
Second_Layer
The solution that I can think of is to first create another win32 dll written in VC++6 service pack 6. Let's call this dll as "SecondLayer". SecondLayer acts as a wrapper for FirstLayer. This layer contains wrapper classes for std::vector so std::vector is not exposed in all function parameters in this layer. Let's call this wrapper class for std::vector as StdVectorWrapper.
This layer does not make use of any new or delete operations to allocate or deallocate memory since this is handled by std::vector internally.
Third_Layer
I also created a VC++2005 class library as a wrapper for SecondLayer. This wrapper does all the dirty work of converting the unmanaged SecondLayer into managed code. Let's call this layer as "ThirdLayer".
Similar to SecondLayer, this layer does not make use of new and delete when dealing with StdVectorWrapper.
Fourth_Layer
To top it all, I created a C#2005 console application to call ThirdLayer. Let's call this C# console application as "FourthLayer".
Call Sequence Summary
FourthLayer(C#2005) -> ThirdLayer(VC++2005) -> SecondLayer(VC++6) -> FirstLayer(VC++6)
The Problem
I noticed that the "System.AccessViolationException: Attempted to read or write protected memory" exception is being thrown which I suspect to be due to SecondLayer's internal std::vector allocating memory which is illegal for ThirdLayer to access.
This is confirmed I think because when I recompile FirstLayer (simulated) and SecondLayer in VC++2005, the problem disappears completely. However, recompiling the production version of FirstLayer is not possible as I do not have the source code.
I have heard that in order to get rid of this problem, I need to write a shared memory allocator in C++ for SecondLayer's std::vector which is found in the StdVectorWrapper class. I do not fully understand why I need a shared memory allocator and how it works? Any idea?
Is there any readily available source code for this on the internet that I can just compile and use together with my code in SecondLayer?
Note that I am unable to use the boost library for this.
Each executable or dll links to a particular version of the c runtime library, which is what contains the implementation of new and delete. If two modules have different compilers (VC2005 vs VC6) or build settings (Debug vs Release) or other settings (Multithreaded runtime vs non-multithreaded runtime), then they will link to different c runtimes. That becomes a problem if memory allocated by one runtime is freed by a different runtime.
Now, if I'm not mistaken, templates (such as std::vector or std::string) can cause this problem to sneak in where it isn't immediately obvious. The issue comes from the fact that templates are compiled into each module separately.
Example: module 1 uses a vector (thus allocating memory), then passes it as a function parameter to module 2, and then module 2 manipulates the vector causing the memory to be deallocated. In this case, the memory was allocated using module 1's runtime and deallocated using module 2's runtime. If those runtimes are different, then you have a problem.
So given all that, you have two places for potential problems. One is between FirstLayer and SecondLayer if those two modules haven't been compiled with the exact same settings. The other is between SecondLayer and ThirdLayer if any memory is allocated in one and deallocated in the other.
You could write a couple more test programs to confirm which place(s) have problems.
To test FirstLayer-SecondLayer, copy the implementation of the SecondLayer functions into a VC6 program, write just enough code to call those functions in a typical manner, and link only against FirstLayer.
If that first test doesn't fail, or else once you've fixed it, then test SecondLayer-ThirdLayer: copy the ThirdLayer implementation into a VC2005 program, write the code to make typical calls, and link against SecondLayer.
I think you should look at a different solution architecture.
The binary code generated by VC6 stl vector and string is, I believe, different from the code generated by more recent version of VC because of the many stl upgrades. Because of this I don't think your architecture will work as the dlls will have two implementations of std::vector and std::string that are not binary compatible.
My suggested solution is to go back to VC6 and write a new wrapper dll for FirstLayer that exposes it via a pure C API -- this layer will need to be compiled using VC6sp6 to ensure it is binary compatible with FirstLayer. Then use PInvoke in Fourth_Layer to access the VC6 wrapper DLL.
I have found a solution for the problem. Basically, the StdVectorWrapper class which I wrote do not implement deep copy. So all I need to do is to add the following to the StdVectorWrapper class to implement deep copy.
Copy Constructor
Assignment Operator
Deconstructor
Edit: Alternative Solution
An even better solution would be to use clone_ptr for all the elements contained in std::vector as well as for std::vector itself. This eliminates the need for the copy constructor, assignment operator and deconstructor.