Dealing with an object corrupting the heap - C++

In my application I am creating an object pretty much like this:
connect() {
    mVHTGlove = new vhtGlove(params);
}
and once I am about to close the application I call this one:
disconnect() {
    if (mVHTGlove)
        delete mVHTGlove;
}
This call always triggers a breakpoint with the following message:
Windows has triggered a breakpoint in DesignerDynD.exe.
This may be due to a corruption of the heap, which indicates a bug in DesignerDynD.exe or any of the DLLs it has loaded.
This may also be due to the user pressing F12 while DesignerDynD.exe has focus.
The output window may have more diagnostic information.
I cannot modify the vhtGlove class to fix the corruption of the heap, as it is an external library provided only in the form of header files, lib files and dlls.
Is there any way to use this class in a clean way?
EDIT: I tried to strip things down to a bare minimum, however I get the same results... here you have the ENTIRE code.
#include "vhandtk/vhtCyberGlove.h"
#include "vhandtk/vhtIOConn.h"
#include "vhandtk/vhtBaseException.h"
using namespace std;
int main(int argc, char* argv[])
{
vhtCyberGlove* testGlove = NULL;
vhtIOConn gloveAddress("cyberglove", "localhost", "12345", "com1", "115200");
try
{
testGlove = new vhtCyberGlove(&gloveAddress,false);
if (testGlove->connect())
cout << "Glove connected successfully" << endl;
else
{
throw vhtBaseException("testGlove()->connect() returned false.");
}
if (testGlove->disconnect())
{
cout << "Glove disconnected successfully" << endl;
}
else
{
throw vhtBaseException("testGlove()->disconnect() returned false.");
}
}
catch (vhtBaseException *e)
{
cout << "Error with gloves: " << e << endl;
system("pause");
exit(0);
}
delete testGlove;
return 0;
}
Still crashes on deletion of the glove.
EDIT #2: If I just allocate and delete an instance of vhtCyberGlove, it also crashes.
int main(int argc, char* argv[])
{
    vhtCyberGlove* testGlove = NULL;
    vhtIOConn gloveAddress("cyberglove", "localhost", "12345", "com1", "115200");

    testGlove = new vhtCyberGlove(&gloveAddress, false);
    delete testGlove; // << crash!

    return 0;
}
Any ideas?
thanks!
JC

One possibility is that mVHTGlove isn't being initialized to 0. If disconnect was then called without connect ever being called, then you'd be attempting to deallocate a garbage pointer. Boom.
Another possibility is that you are actually corrupting the stack a bit before that point, but that is where the corruption actually causes the crash. A good way to check that would be to comment out as much code as you can and still get the program to run, then see if you still get the corruption. If you don't, slowly bring back in bits of code until you see it come back.
Some further thoughts (after your edits).
You might check and see if the API doesn't have its own calls for memory management, rather than expecting you to "new" and "delete" objects manually. The reason I say this is that I've seen some DLLs have issues that looked a lot like this when some memory was managed inside the DLL and some outside.
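A minimal sketch of that first point, assuming a wrapper class shaped like the snippets in the question (the class name GloveHolder is made up, and 'params' stands for whatever the question's connect() really passes): initialize the pointer, and null it out after deleting, so a stray second disconnect() can neither free garbage nor free twice.
class GloveHolder
{
public:
    GloveHolder() : mVHTGlove(NULL) {}      // never leave the pointer uninitialized

    void connect()
    {
        mVHTGlove = new vhtGlove(params);   // 'params' as in the question's snippet
    }

    void disconnect()
    {
        delete mVHTGlove;                   // delete on a NULL pointer is a no-op
        mVHTGlove = NULL;                   // a second disconnect() is now harmless
    }

private:
    vhtGlove* mVHTGlove;
};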

The heap corruption error is reported when the vhtGlove is deleted. However, it may just as well be your own code that causes the corruption. This often happens as a result of overwriting a buffer allocated on the heap, perhaps from a call to malloc. Or you are perhaps deleting the same object twice. You can avoid this by using a smart pointer like std::auto_ptr to store the pointer to the object.
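For example, a sketch of the smart-pointer idea applied to the code from the question (std::auto_ptr as suggested above; on a current compiler std::unique_ptr is its replacement; the runGlove() wrapper is made up):
#include <memory>

void runGlove()
{
    vhtIOConn gloveAddress("cyberglove", "localhost", "12345", "com1", "115200");
    std::auto_ptr<vhtCyberGlove> testGlove(new vhtCyberGlove(&gloveAddress, false));

    testGlove->connect();
    testGlove->disconnect();
}   // testGlove is deleted exactly once, here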

One thing you might try to track down the source of the corruption is to look at the memory location pointed to by mVHTGlove using Visual Studio's "Memory" window when the heap corruption is detected. See if you see anything in that memory that looks obviously like something that overran a buffer. For example, if you see a string used elsewhere in the program, then go review the code that manipulates that string -- it might be overrunning its buffer.

Given that vhtCyberGlove's implementation is in another DLL, I would look for a heap mismatch. In VS, for example, this would happen if the DLL is linked to the Release CRT while your EXE is linked to the Debug CRT. When this is the case, each module uses a different heap, and as soon as you try to free memory using the wrong heap, you'll crash.
In your case, it is possible that vhtCyberGlove holds some stuff that was allocated in the other DLL, but when you delete the vhtCyberGlove instance that stuff is deleted directly, i.e. using your heap rather than the DLL's. When you try to free a pointer that belongs to another heap, you're effectively corrupting yours.
If this is indeed the case, without having more details I can offer two fixes:
Make sure your EXE uses the same heap as the DLL (see the settings sketch below). This will probably lock you into Release mode, so it's not the best way to go.
Get the provider of vhtCyberGlove to manage its memory usage properly...
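For reference, the relevant knob in Visual Studio is Project Properties -> C/C++ -> Code Generation -> Runtime Library (exact wording varies by VS version, so treat this as a sketch):
    /MD   Multi-threaded DLL        (release CRT)
    /MDd  Multi-threaded Debug DLL  (debug CRT)
The EXE and the vendor DLL must use the same DLL CRT so that new and delete hit the same heap; a Debug-CRT EXE against a Release-CRT DLL reproduces exactly the kind of corruption described above.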

You are passing the address of a local vhtIOConn to the constructor. Is it possible that the object is assuming ownership of this pointer and trying to delete it in the destructor?
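One way to test that hypothesis (hedged, since vhtCyberGlove's ownership rules aren't documented here): hand the constructor a heap-allocated vhtIOConn and don't free it yourself.
int main(int argc, char* argv[])
{
    vhtIOConn* gloveAddress =
        new vhtIOConn("cyberglove", "localhost", "12345", "com1", "115200");

    vhtCyberGlove* testGlove = new vhtCyberGlove(gloveAddress, false);

    delete testGlove;   // if this stops crashing, the glove owned (and deleted) gloveAddress
    return 0;
}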

Related

Can I store objects created in a DLL after unloading the DLL?

I need some clarification on the following code.
In the following code, I am retrieving a private implementation of an interface from a DLL.
After the DLL has been unloaded (CASE #2), the string retrieved from the interface and the interface itself both become invalid, and access violations occur (expected) when they are accessed.
But what's confusing me is that when I reload the DLL, I can use the interface and the string again, without re-retrieving them from the DLL.
Is this supposed to work this way?
How can the memory become invalid, and then suddenly become valid again once the DLL is loaded again?
Is this just luck, with the DLL conveniently loading into the exact same place in memory the second time because of the relative simplicity of my test program?
int main(int argc, char *argv[])
{
    Interface *itf;
    const char *name;

    {
        // loads the dll and gets functions
        PersistenInterface pi("PersistentInterface.dll");

        // returns a private implementation of the interface
        itf = pi.CreateInterface();
        name = itf->GetName();

        // CASE #1
        cout << name << endl; // succeeds
    } // dll is unloaded in ~PersistenInterface()

    // CASE #2
    //cout << name << endl;           // crashes
    //cout << itf->GetName() << endl; // crashes

    {
        PersistenInterface pi("PersistentInterface.dll");

        // CASE #3
        cout << name << endl;           // succeeds !?
        cout << itf->GetName() << endl; // succeeds !?
    }

    return 0;
}
I hope you are not returning a local object on the stack from the DLL. That will definitely crash your system, because the local object is destroyed after the function returns from the DLL. It has nothing to do with the concept of DLLs. If you try to reference a local object after its containing function returns, you'll see the same problem.
If the memory is dynamically allocated on the heap inside the DLL, it's still valid after the DLL is unloaded. You can still use that memory in your main application, because the DLL itself doesn't own anything. Remember, the heap belongs to the process, not the DLL.
However, the general rule is that the owner of the memory (whoever creates it, e.g. the DLL in your case) is responsible for releasing that same memory. Allocating memory in the DLL and deallocating it on the other side can cause you unnecessary trouble, e.g. if the two sides use different instances of the CRT.
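A common way to follow that rule is for the DLL that creates an object to also export the function that destroys it. A rough sketch (PrivateImpl and the export style are made up, not taken from the question):
// In the DLL:
extern "C" __declspec(dllexport) Interface* CreateInterface()
{
    return new PrivateImpl();           // allocated by the DLL, on the DLL's CRT heap
}

extern "C" __declspec(dllexport) void DestroyInterface(Interface* p)
{
    delete p;                           // freed by the same CRT heap that allocated it
}

// In the EXE:
//   Interface* itf = CreateInterface();
//   ...
//   DestroyInterface(itf);             // never `delete itf;` on the EXE side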
The answer is a little complex. Actually, if you load a DLL, it is not really loaded into your heap memory. It is loaded into global memory, and this memory is mapped into your address space. With this mechanism the OS saves memory, because it can map the same code into different processes. So if you access an address within the DLL's space, the access is redirected into the global memory where the DLL really exists. After unloading, there is no memory at this address anymore; your access is not redirected any more, and an exception is thrown. This is different from malloc and free (or new and delete), where the memory is only marked as unused but is still there with all its data (unless zeroing out freed memory is activated).
Static DLL data (like constant strings) and the code of a DLL act as described above. After you load the DLL again, the addresses you hold, which evidently point to static data or code, become valid again. At least Windows XP always loads a DLL at the same address space, but do not rely on this! For security reasons, a modern OS can decide to map the DLLs of your process to different addresses every time LoadLibrary is used, or every time you start your process. In the debugger, or under Windows XP, DLLs are always mapped at the same address as far as I know; that is the reason for the behavior of your code. It is not a bug or a feature, it follows from the DLL concept.

How to debug double deletes in C++?

I'm maintaining a legacy application written in C++. It crashes every now and then, and Valgrind tells me it's a double delete of some object.
What are the best ways to find the bug that is causing a double delete in an application you don't fully understand and which is too large to be rewritten?
Please share your best tips and tricks!
Here are some general suggestions that have helped me in that situation:
Turn your logging level up to full debug, if you are using a logger. Look for suspicious stuff in the output. If your app doesn't log pointer allocations and deletes of the object/class under suspicion, it's time to insert some cout << "class Foo constructed, ptr= " << this << endl; statements in your code (and corresponding delete/destructor prints).
Run valgrind with --db-attach=yes. I've found this very handy, if a bit tedious. Valgrind will show you a stack trace every time it detects a significant memory error or event and then ask you if you want to debug it. You may find yourself repeatedly pressing 'n' many many times if your app is large, but keep looking for the line of code where the object in question is first (and secondly) deleted.
Just scour the code. Look for construction/deletion of the object in question. Sadly, sometimes it winds up being in a 3rd party library :-(.
Update: Just found this out recently: Apparently gcc 4.8 and later (if you can use GCC on your system) has some new built-in features for detecting memory errors, the "address sanitizer". Also available in the LLVM compiler system.
Yep. What @OliCharlesworth said. There's no surefire way of testing a pointer to see if it points to allocated memory, since it really is just a memory location.
The biggest problem your question implies is the lack of reproducibility. Continuing with that in mind, you're stuck with changing simple delete constructs to delete foo; foo = NULL;.
Even then the best case scenario is "it seems to occur less" until you've really stamped it down.
I'd also ask by what evidence Valgrind suggests it's a double-delete problem. Might be a better clue lingering around in there.
It's one of the simpler truly nasty problems.
This may or may not work for you.
A long time ago I was working on a 1M+ line program that was 15 years old at the time. I faced the exact same problem - a double delete with a huge data set. With such data any out-of-the-box "memory profiler" would be a no-go.
Things that were on my side:
It was very reproducible - we had a macro language, and running the same script exactly the same way reproduced it every time.
Sometime during the history of the project someone decided that "#define malloc my_malloc" and "#define free my_free" had some use. These didn't do much more than call the built-in malloc() and free(), but the project already compiled and worked this way.
Now the trick/idea:
void* my_malloc(int size)
{
    static int allocation_num = 0;              // it was single threaded
    char* p = (char*)builtin_malloc(size + 16); // the real, built-in malloc
    *(int*)p = ++allocation_num;                // allocation number in the first bytes
    *(p + sizeof(int)) = 0;                     // "freed" flag: 0 = not freed
    return p + 16;                              // check for NULL in order here
}

void my_free(void* p)
{
    char* base = (char*)p - 16;                 // recover the original block
    if (*(base + sizeof(int)))
    {
        // this is a double free; check the allocation number at *(int*)base,
        // then rerun the app with this in my_malloc:
        // if (allocation_num == XXX) debug_break();
    }
    *(base + sizeof(int)) = 1;                  // freed
    // builtin_free(base);                      // do not do this until the problem is figured out
}
With new/delete it might be trickier, but still with LD_PRELOAD you might be able to replace malloc/free without even recompiling your app.
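For illustration, a minimal LD_PRELOAD interposer for free() might look like this (a simplified sketch, assuming glibc/Linux; it only logs each free plus a short backtrace, leaving the actual double-free detection to a script over the log, and it glosses over re-entrancy issues such as dlsym itself allocating):
// double_free_spy.cpp
// build: g++ -shared -fPIC double_free_spy.cpp -o spy.so -ldl
// run:   LD_PRELOAD=./spy.so ./your_app 2> free_log.txt
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            // for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <execinfo.h>
#include <unistd.h>
#include <cstdio>

extern "C" void free(void* p)
{
    static void (*real_free)(void*) =
        (void (*)(void*))dlsym(RTLD_NEXT, "free");

    // Log the pointer and a short backtrace; a script over the log can then
    // flag any address freed twice without an intervening allocation.
    char line[64];
    int n = snprintf(line, sizeof(line), "free(%p)\n", p);
    write(STDERR_FILENO, line, n);

    void* frames[8];
    backtrace_symbols_fd(frames, backtrace(frames, 8), STDERR_FILENO);

    real_free(p);
}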
You are probably upgrading from a version that treated delete differently than the new version does.
Probably what the previous version did when delete was called was a check along the lines of if (X != NULL) { delete X; X = NULL; }, and the new version just performs the delete.
You might need to go through and check pointer assignments, tracking references to objects from construction to deletion.
I've found this useful: backtrace() on Linux. (You have to compile with -rdynamic.) This lets you find out where that double free is coming from by putting a try/catch block around all memory operations (new/delete); then, in the catch block, print out your stack trace.
This way you can narrow down the suspects much faster than running valgrind.
I wrapped backtrace in a handy little class so that I can just say:
try {
    ...
} catch (...) {
    StackTrace trace;
    std::cerr << "Double free!!!\n" << trace << std::endl;
    throw;
}
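For reference, such a wrapper can be quite small. This is a sketch of the idea, not the poster's actual class, using backtrace()/backtrace_symbols() from <execinfo.h> (link with -rdynamic so symbol names resolve):
#include <execinfo.h>
#include <cstdlib>
#include <ostream>

class StackTrace
{
public:
    StackTrace() : depth_(backtrace(frames_, kMaxFrames)) {}

    friend std::ostream& operator<<(std::ostream& os, const StackTrace& t)
    {
        char** symbols = backtrace_symbols(t.frames_, t.depth_);
        if (symbols)
        {
            for (int i = 0; i < t.depth_; ++i)
                os << "  " << symbols[i] << '\n';
            std::free(symbols);     // backtrace_symbols mallocs the array for us
        }
        return os;
    }

private:
    enum { kMaxFrames = 64 };
    void* frames_[kMaxFrames];
    int depth_;
};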
On Windows, assuming the app is built with MSVC++, you can take advantage of the extensive heap debugging tools built into the debug version of the standard library.
Also on Windows, you can use Application Verifier. If I recall correctly, it has a mode that forces each allocation onto a separate page with protected guard pages in between. It's very effective at finding buffer overruns, but I suspect it would also be useful for a double-free situation.
Another thing you could do (on any platform) would be to make a copy of the sources that are transformed (perhaps with macros) so that every instance of:
delete foo;
is replaced with:
{ delete foo; foo = nullptr; }
(The braces help in many cases, though it's not perfect.) That turns a second delete into a harmless no-op and turns many use-after-delete bugs into null pointer dereferences, making them much easier to detect. It doesn't catch everything; you might have a copy of a stale pointer, but it can help squash a lot of the common use-after-delete scenarios.
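One way to apply that transformation mechanically is a macro along these lines (the name SAFE_DELETE is made up; the do/while keeps it safe inside an if/else without braces):
#define SAFE_DELETE(p) do { delete (p); (p) = nullptr; } while (0)

// every `delete foo;` then becomes:
SAFE_DELETE(foo);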

CrtIsValidHeapPointer problem with Oracle OCCI MetaData::getString

I am trying to get the name of a column from an Oracle table using the MetaData class. I get a vector of MetaData objects from the ResultSet and then loop over them, executing the getString() function on each item. The problem is that on the second iteration, when exiting the loop body to start a new iteration, it crashes and gives me a CrtIsValidHeapPointer assertion.
/*
* If this ASSERT fails, a bad pointer has been passed in. It may be
* totally bogus, or it may have been allocated from another heap.
* The pointer MUST come from the 'local' heap.
*/
_ASSERTE(_CrtIsValidHeapPointer(pUserData));
The data being pointed to by pUserData is actually valid, so I suspect my heap from the external API DLL is not the same as the CRT heap, the question is how do I resolve this issue?
my code:
std::vector<oracle::occi::MetaData> data = res->getColumnListMetaData();

for (std::vector<oracle::occi::MetaData>::iterator iter = data.begin(); iter != data.end(); iter++)
{
    // Crash on second iteration after this statement
    std::string s = iter->getString(oracle::occi::MetaData::ATTR_NAME);
    int i = iter->getInt(oracle::occi::MetaData::ATTR_DATA_TYPE);
    std::cout << i << std::endl;
}
Does anybody have any suggestions or has anybody had this problem and solved it?
OS = Windows, VS2008, Oracle 11.2
Nothing in that code does any direct heap deallocation, although, of course, std::string does allocate and deallocate heap memory. However, that shouldn't be a problem unless
the heap is corrupted by some other operation or
you pass std::string across executable boundaries, resulting in one executable (e.g. the DLL) allocating memory and another (e.g. the EXE) deallocating it.
You seem to be expecting the latter:
The data being pointed to by pUserData is actually valid, so I suspect my heap from the external API DLL is not the same as the CRT heap, the question is how do I resolve this issue?
That might indeed be the case. If you have control over both executables, you can make them both use the same dynamic RTL ("Multi-threaded Debug DLL" or something like that in VC).
However, in general it's not a good idea to have one executable free the resources of another one. Usually you should pass resources back to the API you obtained them from, so that it can be freed where it was allocated.

Program crashes only in Release mode outside debugger

I have a quite massive program (>10k lines of C++ code). It works perfectly in debug mode, or in release mode when launched from within Visual Studio, but the release mode binary usually crashes when launched manually from the command line (not always!!!).
The line with delete causes the crash:
bool Save(const short* data, unsigned int width, unsigned int height,
          const wstring* implicit_path, const wstring* name = NULL,
          bool enable_overlay = false)
{
    char* buf = new char[17];
    delete [] buf;
    return true;
}
EDIT: Upon request expanded the example.
The "len" has length 16 in my test case. It doesn't matter, if I do something with the buf or not, it crashes on the delete.
EDIT: The application works fine without the delete [] line, but I suppose it leaks memory then (since the block is never unallocated). The buf in never used after the delete line. It also seems it does not crash with any other type than char. Now I am really confused.
The crash message is very unspecific (typical Windows "xyz.exe has stopped working"). When I click the "Debug the program" option, it enters VS, where the error is specified to be "Access violation writing location xxxxxxxx". It is unable to locate the place of the error though "No symbols were loaded for any stack frame".
I guess it is some pretty serious case of heap corruption, but how to debug this? What should I look for?
Thanks for help.
Have you checked for memory leaks elsewhere?
Usually weird delete behavior is caused by the heap getting corrupted at one point; then, much later on, it becomes apparent because of another heap usage.
The difference between debug and release can be caused by the way Windows allocates the heap in each context. For example, in debug, the heap can be very sparse and the corruption doesn't affect anything right away.
The biggest difference between being launched in the debugger and launched on its own is that when an application is launched from the debugger, Windows provides a "debug heap" that is filled with the 0xBAADF00D pattern; note that this is not the debug heap provided by the CRT, which instead is filled with the 0xCD pattern (IIRC).
Here is one of the few mentions that Microsoft makes about this feature, and here you can find some links about it.
Also mentioned in that link: "starting a program and attaching to it with a debugger does NOT cause the 'special debug heap' to be used."
You probably have a memory overwrite somewhere and the delete[] is simply the first time it causes a problem. But the overwrite itself can be located in a totally different part of your program. The difficulty is finding the overwrite.
Add the following function
#include <malloc.h>
#include <stdio.h>

#define CHKHEAP() (check_heap(__FILE__, __LINE__))

void check_heap(const char *file, int line)
{
    static const char *lastOkFile = "here";
    static int lastOkLine = 0;
    static int heapOK = 1;

    if (!heapOK) return;

    if (_heapchk() == _HEAPOK)
    {
        lastOkFile = file;
        lastOkLine = line;
        return;
    }

    heapOK = 0;
    printf("Heap corruption detected at %s (%d)\n", file, line);
    printf("Last OK at %s (%d)\n", lastOkFile, lastOkLine);
}
Now call CHKHEAP() frequently throughout your program and run again. It should show you the source file and line where the heap becomes corrupted and where it was OK for the last time.
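For example, sprinkled around the Save() from the question (a sketch; where you place the calls is up to you):
bool Save(const short* data, unsigned int width, unsigned int height,
          const wstring* implicit_path, const wstring* name = NULL,
          bool enable_overlay = false)
{
    CHKHEAP();                 // was the heap already broken on entry?
    char* buf = new char[17];
    CHKHEAP();
    delete [] buf;
    CHKHEAP();                 // narrows down which statement corrupts it
    return true;
}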
There are many possible causes of crashes. It's always difficult to locate them, especially when they differ from debug to release mode.
On the other hand, since you are using C++, you could get away with using a std::string instead of a manually allocated buffer; there is a reason RAII exists ;)
It sounds like you have an uninitialised variable somewhere in the code.
In debug mode all the memory is initialised to something standard, so you will get consistent behavior.
In release mode the memory is not initialised unless you explicitly do something.
Run your compiler with the warnings set at the highest level possible.
Then make sure your code compiles with no warnings.
These two are the first two lines in their function.
If you really mean that the way I interpret it, then the first line is declaring a local variable buf in one function, but the delete is deleting some different buf declared outside the second function.
Maybe you should show the two functions.
Have you tried simply isolating this with the same build file but code based just on what you've put above? Something like:
int main(int argc, char* argv[])
{
    const int len( 16 );
    char* buf = new char[len + 1];
    delete [] buf;
}
The code you've given is absolutely fine and, on its own, should run with no problems, either in debug or optimised builds. So if the problem isn't down to specifics of your code, then it must be down to specifics of the project (i.e. compilation/linkage).
Have you tried creating a brand new project and placing the 10K+ lines of C++ into it? Might not take too long to prove the point. Especially if the existing project has either been imported in or heavily altered.
I was having the same issue, and I figured out that my program was only crashing when I went to delete[] char pointers with a string length of 1.
void DeleteCharArray(char* array) {
    if (strlen(array) > 1) { delete [] array; }
    else { delete array; }
}
This fixed the issue, but it is still error-prone; it could be modified to be otherwise.
Anyhow, the reason this happens, I suspect, is that to C++ char* str = new char[1] and char* str = new char; look like the same thing, and that means that when you try to release a pointer with the wrong form of delete (delete[] is made for arrays only), the results are unexpected and often fatal.
One type of problem where I had this symptom was a multi-process program that crashed on me when run from the shell, but ran flawlessly when called from valgrind or gdb. I discovered (much to my embarrassment) that I had a few stray processes of the same program still running on the system, causing an mq_send() call to return with an error. The problem was that those stray processes had also been assigned the message queue handle by the kernel/system, so the mq_send() in my newly spawned process(es) failed, but non-deterministically (depending on the kernel scheduling circumstances).
Like I said, trivial, but until you find it out, you'll tear your hair out!
I learnt from this hard lesson, and my Makefile these days has all the appropriate commands to create a new build and clean up the old environment (including tearing down old message queues, shared memory, semaphores and such). This way, I don't forget to do something and get heartburn over a seemingly difficult (but clearly trivially solvable) problem. Here is a cut-and-paste from my latest project:
[Makefile]
all:
...
...
obj:
...
clean:
...
prep:
#echo "\n!! ATTENTION !!!\n\n"
#echo "First: Create and mount mqueues onto /dev/mqueue (Change for non ubuntu)"
rm -rf /run/shm/*Pool /run/shm/sem.*;
rm -rf /dev/mqueue/Test;
rm -rf /dev/mqueue/*Task;
killall multiProcessProject || true;

Questions about C++ memory allocation and delete

I'm getting a bad error. When I call delete on an object at the top of an object hierarchy (hoping to cause the deletion of its child objects), my program quits and I get this:
*** glibc detected *** /home/mossen/workspace/abbot/Debug/abbot: double free or corruption (out): 0xb7ec2158 ***
followed by what looks like a memory dump of some kind. I've searched for this error and from what I gather it seems to occur when you attempt to delete memory that has already been deleted. Impossible as there's only one place in my code that attempts this delete. Here's the wacky part: it does not occur in debug mode. The code in question:
Terrain::~Terrain()
{
    if (heightmap != NULL) // 'heightmap' is a Heightmap*
    {
        cout << "heightmap& == " << heightmap << endl;
        delete heightmap;
    }
}
I have commented out everything in the heightmap destructor, and still this error. When the error occurs,
heightmap& == 0xb7ec2158
is printed. In debug mode I can step through the code slowly and
heightmap& == 0x00000000
is printed, and there is no error. If I comment out the 'delete heightmap;' line, the error never occurs. The destructor above is called from another destructor (separate classes, no virtual destructors or anything like that). The heightmap pointer is new'd in a method like this:
Heightmap* HeightmapLoader::load() // a static method
{
    // ....
    Heightmap* heightmap = new Heightmap();
    // ....other code
    return heightmap;
}
Could it be something to do with returning a pointer that was initialized in the stack space of a static method? Am I doing the delete correctly? Any other tips on what I could check for or do better?
What happens if load() is never called? Does your class constructor initialise heightmap, or is it uninitialised when it gets to the destructor?
Also, you say:
... delete memory that has already been deleted. Impossible as there's only one place in my code that attempts this delete.
However, you haven't taken into consideration that your destructor might be called more than once during the execution of your program.
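A sketch of how that happens in practice: if Terrain holds a raw Heightmap* and ever gets copied, both copies delete the same pointer. (The members shown here are guesses, since the real Terrain class isn't posted.)
class Terrain
{
public:
    Terrain() : heightmap(HeightmapLoader::load()) {}
    ~Terrain() { delete heightmap; }

    // No copy constructor or assignment operator: the compiler-generated ones
    // copy the raw pointer, so two Terrain objects end up sharing one Heightmap.

private:
    Heightmap* heightmap;
};

void f()
{
    Terrain a;
    Terrain b = a;   // shallow copy of 'heightmap'
}                    // both destructors run -> the same Heightmap is deleted twice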
In debug mode pointers are often set to NULL and memory blocks zeroed out. That is the reason why you are experiencing different behavior in debug/release mode.
I would suggest you use a smart pointer instead of a traditional pointer
auto_ptr<Heightmap> HeightmapLoader::load() // a static method
{
    // ....
    auto_ptr<Heightmap> heightmap( new Heightmap() );
    // ....other code
    return heightmap;
}
that way you don't need to delete it later as it will be done for you automatically
see also boost::shared_ptr
It's quite possible that you're calling that dtor twice; in debug mode the pointer happens to be zeroed on delete, in optimized mode it's left alone. While not a clean resolution, the first workaround that comes to mind is setting heightmap = NULL; right after the delete -- it shouldn't be necessary but surely can't hurt while you're looking for the explanation of why you're destroying some Terrain instance twice!-) [[there's absolutely nothing in the tiny amount of code you're showing that can help us explain the reason for the double-destruction.]]
It looks like the classic case of an uninitialized pointer. As @Greg said, what if load() is never called from Terrain? I think you are not initializing the Heightmap* pointer inside the Terrain constructor. In debug mode, this pointer may happen to be set to NULL, and C++ guarantees that deleting a NULL pointer is a valid operation, hence the code doesn't crash. However, in release mode, due to optimizations, the pointer is uninitialized, you try to free some random block of memory, and the above crash occurs.
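If that is the case, the fix is a one-liner in the constructor (a sketch, assuming Terrain needs no other setup):
Terrain::Terrain()
    : heightmap(NULL)   // now ~Terrain's delete is safe even if load() was never called
{
}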