Of Memory Management, Heap Corruption, and C++ - c++

So, I need some help. I am working on a project in C++. However, I think I have somehow managed to corrupt my heap. This is based on the fact that, after I added a std::string to a class, assigning it a value from another std::string:
std::string hello = "Hello, world.\n";
/* exampleString = "Hello, world.\n" would work fine. */
exampleString = hello;
crashes on my system with a stack dump. So basically I need to stop and go through all my code and memory management stuff and find out where I've screwed up. The codebase is still small (about 1000 lines), so this is easily do-able.
Still, I'm over my head with this kind of stuff, so I thought I'd throw it out there. I'm on a Linux system and have poked around with valgrind, and while not knowing completely what I'm doing, it did report that the std::string's destructor was an invalid free. I have to admit to getting the term 'Heap Corruption' from a Google search; any general purpose articles on this sort of stuff would be appreciated as well.
(In before rm -rf ProjectDir, do again in C# :D)
EDIT:
I haven't made it clear, but what I'm asking for are ways and advice for diagnosing these sorts of memory problems. I know the std::string stuff is right, so it's something I've done (or a bug, but there's Not A Problem With Select). I'm sure I could post the code I've written and you very smart folks would see the problem in no time, but I want to add this kind of code analysis to my 'toolbox', as it were.

These are relatively cheap mechanisms for possibly solving the problem:
Keep an eye on my heap corruption question - I'm updating with the answers as they shake out. The first was balancing new[] and delete[], but you're already doing that.
Give valgrind more of a go; it's an excellent tool, and I only wish it was available under Windows. It only slows your program down by about half, which is pretty good compared to the Windows equivalents.
Think about using the Google Performance Tools as a replacement malloc/new.
Have you cleaned out all your object files and started over? Perhaps your make file is... "suboptimal"
You're not assert()ing enough in your code. How do I know that without having seen it? Like flossing, no-one assert()s enough in their code. Add in a validation function for your objects and call that on method start and method end.
Are you compiling with -Wall? If not, do so.
Find yourself a lint tool like PC-Lint. A small app like yours might fit in the PC-lint demo page, meaning no purchase for you!
Check you're NULLing out pointers after deleting them. Nobody likes a dangling pointer. Same gig with declared but unallocated pointers.
Stop using arrays. Use a vector instead.
Don't use raw pointers. Use a smart pointer. Don't use auto_ptr! That thing is... surprising; its semantics are very odd. Instead, choose one of the Boost smart pointers, or something out of the Loki library.
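To make the assert() suggestion in the list above a bit more concrete, here is a minimal sketch of a validation function that is called on method start and method end. The Grid class and everything in it are hypothetical, made up purely for illustration; none of it comes from the asker's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, used only to illustrate the pattern.
class Grid {
public:
    Grid(std::size_t width, std::size_t height)
        : width_(width), height_(height), cells_(width * height, 0) {
        validate();
    }

    void set(std::size_t x, std::size_t y, int value) {
        validate();                         // invariant holds on entry
        assert(x < width_ && y < height_);  // precondition: index in range
        cells_[y * width_ + x] = value;
        validate();                         // invariant still holds on exit
    }

    int get(std::size_t x, std::size_t y) const {
        validate();
        assert(x < width_ && y < height_);
        return cells_[y * width_ + x];
    }

private:
    // Class invariant: the storage always matches the declared dimensions.
    void validate() const {
        assert(cells_.size() == width_ * height_);
    }

    std::size_t width_;
    std::size_t height_;
    std::vector<int> cells_;
};
```

The checks disappear when the code is compiled with NDEBUG defined, so they cost nothing in a release build. If something else scribbles over the object, the next validate() call has a chance of failing close to the point of damage rather than much later.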

We once had a bug which eluded all of the regular techniques, valgrind, purify etc. The crash only ever happened on machines with lots of memory and only on large input data sets.
Eventually we tracked it down using debugger watch points. I'll try to describe the procedure here:
1) Find the cause of the failure. It looks from your example code, that the memory for "exampleString" is being corrupted, and so cannot be written to. Let's continue with this assumption.
2) Set a breakpoint at the last known location that "exampleString" is used or modified without any problem.
3) Add a watch point to the data member of 'exampleString'. With my version of g++, the string is stored in _M_dataplus._M_p. We want to know when this data member changes. The GDB technique for this is:
(gdb) p &exampleString._M_dataplus._M_p
$3 = (char **) 0xbfccc2d8
(gdb) watch *$3
Hardware watchpoint 1: *$3
I'm obviously using linux with g++ and gdb here, but I believe that memory watch points are available with most debuggers.
4) Continue until the watch point is triggered:
Continuing.
Hardware watchpoint 2: *$3
Old value = 0xb7ec2604 ""
New value = 0x804a014 ""
0xb7e70a1c in std::string::_M_mutate () from /usr/lib/libstdc++.so.6
(gdb) where
The gdb where command will give a back trace showing what resulted in the modification. This is either a perfectly legal modification, in which case just continue - or if you're lucky it will be the modification due to the memory corruption. In the latter case, you should now be able to review the code that is really causing the problem and hopefully fix it.
The cause of our bug was an array access with a negative index. The index was the result of casting a pointer to an 'int' and taking it modulo the size of the array. The bug was missed by valgrind et al. because the memory addresses allocated when running under those tools were never "> INT_MAX" and so never resulted in a negative index.
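A rough reconstruction of that kind of bug, as a sketch only. The function names are hypothetical, and the address is passed in as a 32-bit integer so the example behaves the same on any machine. It assumes the usual two's-complement narrowing, which is implementation-defined before C++20.

```cpp
#include <cstdint>

// Buggy pattern: the address is converted to a signed 32-bit int first, so
// any address with the top bit set becomes negative, and % keeps the sign
// of the dividend.
int signedBucket(std::uint32_t address, int tableSize) {
    std::int32_t asInt = static_cast<std::int32_t>(address);
    return asInt % tableSize;
}

// Safer pattern: keep the value unsigned so the result always lies in
// [0, tableSize).
int unsignedBucket(std::uint32_t address, int tableSize) {
    return static_cast<int>(address % static_cast<std::uint32_t>(tableSize));
}
```

With a low address such as 0x0804A014 both functions return 4 for a table of 16 entries. With a high address such as 0xFFFFFFF5 the signed version returns -11, while the unsigned version returns 5.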

Oh, if you want to know how to debug the problem, that's simple. First, get a dead chicken. Then, start shaking it.
Seriously, I haven't found a consistent way to track these kinds of bugs down. Because there's so many potential problems, there's not a simple checklist to go through. However, I would recommend the following:
Get comfortable in a debugger.
Start tromping around in the debugger to see if you can find anything that looks fishy. Check especially to see what's happening during the exampleString = hello; line.
Check to make sure it's actually crashing on the exampleString = hello; line, and not when exiting some enclosing block (which could cause destructors to fire).
Check any pointer magic you might be doing. Pointer arithmetic, casting, etc.
Check all of your allocations and deallocations to make sure they are matched (no double-deallocations).
Make sure you aren't returning any references or pointers to objects on the stack.
There are lots of other things to try, too. I'm sure some other people will chime in with ideas as well.
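As an illustration of the last item in that list, here is a small sketch of a reference to a stack object escaping, next to the safe alternative. makeGreeting is a hypothetical name. The broken version is left in a comment because using its result is undefined behaviour.

```cpp
#include <string>

// Broken version, shown only as a comment: the local string is destroyed
// when the function returns, so the caller is left holding a dangling
// reference.
//
//   const std::string& makeGreeting() {
//       std::string local = "Hello, world.\n";
//       return local;
//   }

// Safe version: return by value. The caller gets its own string (in
// practice the copy is usually elided or turned into a move).
std::string makeGreeting() {
    std::string local = "Hello, world.\n";
    return local;
}
```

Most compilers warn about the broken form when warnings are enabled, which is one more reason to turn them on.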

Some places to start:
If you're on Windows, and using Visual C++ 6 (I hope to god nobody still uses it these days), its implementation of std::string is not threadsafe, and can lead to this kind of thing.
Here's an article I found which explains a lot of the common causes of memory leaks and corruption.
At my previous workplace we used Compuware Boundschecker to help with this. It's commercial and very expensive, so may not be an option.
Here's a couple of free libraries which may be of some use
http://www.codeguru.com/cpp/misc/misc/memory/article.php/c3745/
http://www.codeproject.com/KB/cpp/MemLeakDetect.aspx
Hope that helps. Memory corruption is a sucky place to be in!

It could be heap corruption, but it's just as likely to be stack corruption. Jim's right. We really need a bit more context. Those two lines of source don't tell us much in isolation. There could be any number of things causing this (which is the real joy of C/C++).
If you're comfortable posting your code, you could even throw all of it up on a server and post a link. I'm sure you'd get lots more advice that way (some of it undoubtedly unrelated to your question).

The code was simply an example of where my program was failing (it was allocated on the stack, Jim). I'm not actually looking for 'what have I done wrong', but rather 'how do I diagnose what I've done wrong'. Teach a man to fish and all that. Though looking at the question, I haven't made that clear enough. Thank goodness for the edit function. :')
Also, I actually fixed the std::string problem. How? By replacing it with a vector, compiling, then replacing the string again. It was consistently crashing there, and that fixed even though it...couldn't. There's something nasty there, and I'm not sure what. I did want to check the one time I manually allocate memory on the heap, though:
this->map = new Area*[largestY + 1];
for (int i = 0; i < largestY + 1; i++) {
    this->map[i] = new Area[largestX + 1];
}
and deleting it:
for (int i = 0; i < largestY + 1; i++) {
    delete [] this->map[i];
}
delete [] this->map;
I haven't allocated a 2d array with C++ before. It seems to work.
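For comparison, a sketch of the same grid built from std::vector, along the lines of the "use a vector instead" advice earlier in the thread. The Area struct here is a hypothetical stand-in, since the real class is not shown, and makeMap is an invented helper name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Area class in the post; its real members
// are not shown there.
struct Area {
    int terrain = 0;
};

// Builds a (largestY + 1) x (largestX + 1) grid. The vectors release their
// own memory, so no matching delete[] loop is needed.
std::vector<std::vector<Area>> makeMap(int largestX, int largestY) {
    return std::vector<std::vector<Area>>(
        static_cast<std::size_t>(largestY) + 1,
        std::vector<Area>(static_cast<std::size_t>(largestX) + 1));
}
```

Accessing cells as map.at(y).at(x) throws std::out_of_range on a bad index instead of silently writing past the end, which can help while hunting a corruption bug.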

Also, I actually fixed the std::string problem. How? By replacing it with a vector, compiling, then replacing the string again. It was consistently crashing there, and that fixed even though it...couldn't. There's something nasty there, and I'm not sure what.
That sounds like you really did shake a chicken at it. If you don't know why it's working now, then it's still broken, and pretty much guaranteed to bite you again later (after you've added even more complexity).

Run Purify.
It is a near-magical tool that will report when you are clobbering memory you shouldn't be touching, leaking memory by not freeing things, double-freeing, etc.
It works at the machine code level, so you don't even have to have the source code.
One of the most enjoyable vendor conference calls I was ever on was when Purify found a memory leak in their code, and we were able to ask, "is it possible you're not freeing memory in your function foo()" and hear the astonishment in their voices.
They thought we were debugging gods but then we let them in on the secret so they could run Purify before we had to use their code. :-)
http://www-306.ibm.com/software/awdtools/purify/unix/
(It's pretty pricey but they have a free eval download)

One of the debugging techniques that I use frequently (except in cases of the most extreme weirdness) is to divide and conquer. If your program currently fails with some specific error, then divide it in half in some way and see if it still has the same error. Obviously the trick is to decide where to divide your program!
Your example as given doesn't show enough context to determine where the error might be. If anybody else were to try your example, it would work fine. So, in your program, try removing as much of the extra stuff you didn't show us and see if it works then. If so, then add the other code back in a bit at a time until it starts failing. Then, the thing you just added is probably the problem.
Note that if your program is multithreaded, then you probably have larger problems. If not, then you should be able to narrow it down in this way. Good luck!

Other than tools like Boundschecker or Purify, your best bet at solving problems like this is to just get really good at reading code and become familiar with the code that you're working on.
Memory corruption is one of the most difficult things to troubleshoot and usually these types of problems are solved by spending hours/days in a debugger and noticing something like "hey, pointer X is being used after it was deleted!".
If it helps any, it's something you get better at as you gain experience.
Your memory allocation for the array looks correct, but make sure you check all the places where you access the array too.

Your code as I can see has no errors. As has been said more context is needed.
If you haven't already tried, install gdb (the GNU debugger) and compile the program with -g. This will compile in debugging symbols which gdb can use. Once you have gdb installed, run it with the program (gdb <your_program>). This is a useful cheat sheet for using gdb.
Set a breakpoint for the function that is producing the bug, and see what the value of exampleString is. Also do the same for whatever parameter you are passing to exampleString. This should at least tell you if the std::strings are valid.
I found the answer from this article to be a good guide about pointers.

As far as I can tell your code is correct. Assuming exampleString is an std::string that has class scope like you describe, you ought to be able to initialize/assign it that way. Perhaps there is some other issue? Maybe a snippet of actual code would help put it in context.
Question: Is exampleString a pointer to a string object created with new?

Related

Segmentation fault on method call

I realize the debugger would help, but I'm a little lacking in knowledge of how to use it at the moment. But I promise I will begin learning it asap! So if anyone also knows some good reading on how I can learn to use gdb from the prompt, I'd greatly appreciate it! Thanks.
If you're using GCC, I heartily recommend using GDB.
I love Eclipse ... but I usually find the command line faster and more useful. IMHO...
ANYWAY:
1) compile with "-g" to allow debugging,
2) run your program inside of gdb,
3) note the line# it crashes on
4) Look backwards to see if there's something about that line you didn't allocate, you already deallocated or, most likely, you overwrite with a bad array access.
Here are a couple of good, short tutorials on GDB:
http://www.yolinux.com/TUTORIALS/GDB-Commands.html
http://web.eecs.umich.edu/~sugih/pointers/summary.html
http://cs.baylor.edu/~donahoo/tools/gdb/tutorial.html
Hope that helps!
PS:
When you start debugging, I'd encourage you to set breakpoints in your "Nodes" constructor and your ManipulateArray constructor.
If you don't hit the breakpoint ... then an object never got created ... and you probably found your bug :)
Wow, that's one big mess of code. I don't have a clue what it's for but there's one problem I can see
In your Node class you have an array of four Node pointers called attachedNode. At no time in your code do you make those pointers point at anything. But you dereference those pointers in your attachNewNode method. That's a seg fault right there.
I have no idea how to advise you to fix that problem (or any other problems you might have, I think there are a few) because I don't have much idea what the code is supposed to be doing.
However one piece of advice. This code is too big and complex. Get a smaller piece of it working first, and gradually build up to the whole program. The slow and steady approach will get you there faster in the end.
In a quick look
void Node::attachNewNode(Node *newNode, int direction) {*newNode = *attachedNode[direction];}
looks to be faulty. The assignment should be attachedNode[direction] = newNode;
You want to attach new node in some direction.
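Putting the two answers above together, a minimal sketch of what the corrected class might look like. Only the attachedNode array and the attachNewNode signature come from the thread. The null initialisation and the range check are assumptions about what the asker wants.

```cpp
// Hypothetical reduction of the Node class discussed above.
class Node {
public:
    void attachNewNode(Node* newNode, int direction) {
        if (direction < 0 || direction >= 4) return;  // ignore bad directions
        attachedNode[direction] = newNode;            // store the link
    }

    Node* attachedNode[4] = {};  // all four links start out null
};
```

Because the pointers start out null, code that follows a link can test for nullptr before dereferencing it instead of reading through an uninitialised pointer.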

Is it possible to define/scramble indeterminate values in C/C++ for debug purposes?

Uninitialized variables may have indeterminate values, as this answer to an earlier question points out. Is there a way to specify this indeterminate data to, say, repeat 0xDEADDEAD? The indeterminate data is apparently compiler-specific, but it would always be nice to force it to be something easily recognizable.
Are there existing memory leak/corruption detection libraries allowing this? Overloading new seems like a solution in some cases, but I'd rather not delve into that trickery myself.
The problem is that indeterminate values usually cause undefined behaviour of code, and rarely occurring run time bugs, so, for example, I'd like to spot if I've forgotten a memset() somewhere in my code. Maybe even randomizing the indeterminate values could serve as a test bench.
In case this is not possible, are there better approaches to solve the problem?
Here are some guidelines for producing good quality C code:
Create/use coding guidelines that help avoid memory bugs (and other types of bugs). There are many examples on the internet. The best approach is to take a look at 5 or 6 and compile it into one, keeping just the things that fit your needs.
Use code reviews/inspections to find bugs and to check adherence to the coding guidelines. Code reviews are one of the best ways to find bugs that are undetectable by tools. Code reviews are especially important for beginner/intermediate programmers, because they help you learn, both when you review code that others have written and when you are being reviewed.
Test your code with test cases that can be run automatically.
Use tools like valgrind to find many types of bugs.
To check the pattern of the value of the variables at runtime is a little bit tricky. As you say, it is compiler/architecture dependent.
Usually static analysis tools can give you warnings about uninitialized variables. Here's a free static code checker that you can play with: cppcheck.
There are indeterminate values, there are memory management errors, and there's the intersection of the two.
I don't know what, if anything, C/C++ compilers do for indeterminate values. (A compiler I built for an arguably C-like parallel language has an explicit debug switch that fills every unassigned variable with a value designed to "cause trouble", e.g., for ints, -2^31, for pointers, specific not-void values guaranteed to cause a memory access fault, etc.) I suspect your mileage will vary here by compiler.
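For heap memory, at least, the recognisable-pattern idea from the question can be approximated by hand. The following is only a sketch: poisonedAlloc, poisonedFree and kPoisonByte are hypothetical names, not part of any library, and the trick does nothing for uninitialised local variables.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical debug helpers, not part of any library.
const unsigned char kPoisonByte = 0xCD;

// Allocates `size` bytes and fills them with a recognisable pattern, so a
// read of never-initialised heap memory shows up as 0xCDCDCDCD... in a
// debugger instead of as whatever happened to be there before.
void* poisonedAlloc(std::size_t size) {
    void* block = std::malloc(size);
    if (block != nullptr) {
        std::memset(block, kPoisonByte, size);
    }
    return block;
}

// Releases a block obtained from poisonedAlloc.
void poisonedFree(void* block) {
    std::free(block);
}
```

The same fill could be done inside a replacement operator new, which is the "overloading new" route the question mentions.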
Memory management is notoriously hard. In C++ you can use constructors and destructors in a regular way to ensure that many of such errors don't occur, see stackoverflow.com/questions/76796/memory-management-in-c
C is harder; carefully inspecting your code and ensuring each routine has clear responsibility for either allocation, deallocation or neither will help.
For both C and C++ you can use static analysis tools (Coverity, Fortify) to detect many such allocation errors. Similarly, you can use dynamic analysis tools such as Valgrind, which watches what your object code does and stops it when some memory management errors occur. For C only, you can use our dynamic analysis CheckPointer tool; it will detect all the errors valgrind detects and more (e.g., valgrind can't detect an access outside of a local array [one allocated in your stack]; CheckPointer can).

Memory Leak Analysis

There is a memory leak in my application. The memory consumption shoots up after a couple of days of running the application. I need to dump call stack information of each orphaned block address. How is it possible with WinDbg?
I tried referring to a document created by my colleague, but I'm confused about how to specify the symbol path and stuff like that. It didn't work out. Where can I get a step-by-step document?
You can use umdh.exe to capture and compare snapshots of the process before and after leak happens. This works best with Debug binaries - it will give you the callstacks of memory allocated between the 1st and the 2nd snapshot.
http://support.microsoft.com/kb/268343
See the "Who called HeapAlloc" entry on this page: http://www.windbg.info/doc/1-common-cmds.html
See this page: http://www.microsoft.com/whdc/DevTools/Debugging/debugstart.mspx for info about the symbol server.
First of all I must say you must be a masochist to use WinDbg! If you code in C++ you are not developing drivers, even in this case there are more decent debuggers. Throw away that crap, really!
To tackle the problem I would first use a static code checker to analyze the code. PC-Lint is a cheap one. Then run the app inside a dynamic code checker (like Boundschecker for example or Purify).
Only if that failed to find the culprit code would I start where you are. Investing in such a tool is really worth the money if you write apps that have to run for days and days. It gives you a faster (though not 100%) validation of the code before you start long-running tests to find out what a code checker would have found within minutes...
With Boundschecker you can use Marks, a feature similar to (or maybe exactly the same as) the one Steve Townsend describes. With it you would see all memory blocks still hanging on in memory since the last Mark. This is rather tedious in big apps as you end up with a big bunch of memory blocks.... But if you came up with that question, then you probably are already so desperate that you would like to try it ;-)
I had never used Memory Validator (http://www.softwareverify.com/cpp/memory/index.html) before yesterday but it did help me track something down today.
For leaks I have been using Visual Leak Detector. While it only works in debug mode, it is free and seems reasonably reliable.

Most common reasons for unstable bugs in C++?

I am currently working on a large project, and I spend most of the time debugging. While debugging is a normal process, there are bugs, that are unstable, and these bugs are the greatest pain for the developer. The program does not work, well, sometimes... Sometimes it does, and there is nothing you can do about it.
What can be done about these bugs? Most common debugging tools (interactive debuggers, watches, log messages) may lead you nowhere, because the bug will disappear ... just to appear once again, later. That is why I am asking for some heuristics: what are the most common reasons for such bugs? What suspicious code should we investigate to locate such bugs?
Let me start the list:
using uninitialized variables.
Common misprints like mMember = mMember;
thread synchronization. Sometimes it can be a matter of luck;
working with non-smart pointers, dereferencing invalid ones;
what else?
IME the underlying problem in many projects is that developers use low-level features of C++ like manual memory management, C-style string handling, etc. even though they are very rarely ever necessary (and then only well encapsulated in classes). This leads to memory corruption, invalid pointers, buffer overflows, resource leaks and whatnot. All the while nice and clean high-level constructs are available.
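As a small illustration of the high-level style meant here, a sketch of string building done with std::string and std::vector rather than strcat into a fixed buffer. joinWords is a hypothetical helper invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins words with a separator. std::string grows as
// needed, so there is no fixed-size buffer to overflow and nothing to free.
std::string joinWords(const std::vector<std::string>& words,
                      const std::string& separator) {
    std::string result;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) result += separator;  // separator only between words
        result += words[i];
    }
    return result;
}
```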
I was part of the team for a large (several MLoC) application for several years and the number of crashing bugs for different parts of the application nicely correlated to the programming style used within these parts. When asked why they wouldn't change their programming style some of the culprits answered that their style in general yields more performance. (Not only is this wrong, it's also a fact that customers would rather have a more stable but slower program than a fast one that keeps crashing on them. Also, most of their code wasn't even required to be fast...)
As for multi-threading: I don't feel expert enough to offer solutions here, but I think Herb Sutter's Effective Concurrency columns are a very worthwhile read on the subject.
Edit to address the discussions in the comments:
I did not write that "C-style string handling is not more performant". (Certainly a lot of negation in this sentence, but since I feel misread, I try to be precise.) What I said is that high level constructs are not in general less performant: std::vector isn't in general slower than manually doing dynamically allocated C arrays, since it is a dynamically allocated C array. Of course, there are cases where something coded according to special requirements will perform better than any general solution -- but that doesn't necessarily mean you'll have to resort to manual memory management. This is why I wrote that, if such things are necessary, then only well-encapsulated in classes.
But what's even more important: in most code the difference doesn't matter. Whether a button depresses 0.01secs after someone clicked it or 0.05secs simply doesn't matter, so even a factor 5 speed gain is irrelevant in the button's code. Whether the code crashes, however, always matters.
To sum up my argument: First make it work correctly. This is best done using well-proven off-the-shelf building blocks. Then measure. Then improve performance where it matters, using well-proven off-the-shelf idioms.
I was actually going to post a question that asked exactly the opposite - do others find, as I do, that you spend almost no time using the debugger when working with C++? I honestly cannot remember the last time I used one - it must have been about six months ago.
Frankly, if you spend most of the time in the debugger, I think there is something very wrong with your basic coding practices.
Race conditions.
These are one of the few things that still send a shiver down my spine when they come up in debugging (or in the issue tracker). Inherently horrible to debug, and extremely easy to create. The three most common causes of bugs in my C++ software have been race conditions, reliance on uninitialised memory, and reliance on static constructor order.
And if you don't know what race conditions are, chances are they're the cause of your instability ;)
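Of the three causes named above, reliance on static constructor order is the easiest to show in a few lines. One common way to sidestep it is the construct-on-first-use idiom, sketched below. pluginNames and registerPlugin are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical registry. The function-local static is constructed the first
// time the function is called, so it is ready no matter which translation
// unit's static initialisers happen to run first.
std::vector<std::string>& pluginNames() {
    static std::vector<std::string> names;
    return names;
}

// Safe to call from another static object's constructor, because the
// registry is created on demand.
bool registerPlugin(const std::string& name) {
    pluginNames().push_back(name);
    return true;
}
```

This removes the ordering problem at start-up only. Concurrent calls to registerPlugin from several threads would still need their own synchronisation.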
If you are really in a position where you already have bad code that breaks, the best plan is probably to throw as many tools at it as you can (OS/lib-level memory checking, automated testing, logging, core dumps, etc) to find the problem areas. Then rewrite the code to do something more deterministic. Most of the bugs come from people doing things that mostly work most of the time, but C++ offers stronger guarantees if you use the right tools and approaches.
Haven't seen this one mentioned yet:
Inheriting from a class that does not have a virtual destructor.
Reading from uncached memory while a cache line is being written back over the memory (This is a right bastard to find).
Buffer overwrites
Stack overflows!
The only ones I can think of at the mo ... may edit later :)
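For the virtual destructor item, a short sketch. Shape, Circle and derivedDestructorRuns are hypothetical names used only for illustration.

```cpp
#include <memory>

// Hypothetical base class.
class Shape {
public:
    virtual ~Shape() {}  // without `virtual`, deleting a Circle through a
                         // Shape pointer is undefined behaviour
};

// Hypothetical derived class that records when its destructor runs.
class Circle : public Shape {
public:
    explicit Circle(bool& destroyedFlag) : destroyed_(destroyedFlag) {}
    ~Circle() override { destroyed_ = true; }

private:
    bool& destroyed_;
};

// Destroys a Circle through a base-class pointer and reports whether the
// derived destructor actually ran.
bool derivedDestructorRuns() {
    bool destroyed = false;
    {
        std::unique_ptr<Shape> shape(new Circle(destroyed));
    }  // shape goes out of scope here and deletes through Shape*
    return destroyed;
}
```

With the virtual destructor in place, derivedDestructorRuns() returns true. Without it the behaviour is undefined, and in practice the derived part is often never cleaned up.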
buffer overflows
using pointers to deleted objects
returning invalid references or references to out of scope objects
unhandled exceptions
resource leaks (not only memory)
infinite recursion
dynamic libraries version mismatch
Not really a C++ issue but seen in a C/C++ project.
The trickiest issue I had to deal with was an initialization issue when starting up the OS on our platform that led to unusual crashes. It took years before we found out what happened. Before that we ran the system overnight and if it didn't crash, then it was normally okay.
Luckily, the OS isn't sold anymore.
Addresses and memory used before allocation or after deallocation, segmentation faults, out-of-bounds array accesses, offset errors, thread locks, unintelligible operator overloading, inline assembly, and void functions (exit and void in general) where return values would be desirable. The math.h functions are worth a look as a model, since they all take working arguments and return values, unlike other libraries that lean heavily on void, emptiness tests, nils, nulls and voids. Four general conventions I recommend are return values, arguments, ternary choices and invertible changes. Fault-prone things I avoid are vectors (use arrays instead) and void functions with empty arguments, and in my subjective opinion I avoid the switch statement in favor of the more intelligible or readable if...else if, or the more abstract "is".
C++ also has rather lousy forward compatibility compared to scripting and interpreted languages; a decade-old Java program, for example, still runs unchanged and safely on a later VM.

How to create good debugging problems for a contest?

I am involved in a contest, and in one event we have debugging questions. I have to design some really good debugging problems in C and C++.
How can I create some good problems on debugging? What aspects should I consider while designing the problems?
My brainstorming session:
Memory leaks of the subtle sort are always nice to have. Mess around with classes, constructors, copy-constructors and destructors, and you should be able to create a difficult-to-spot problem with ease.
Off-by-one errors in array loops are also a classic.
Then you can simply mess with the minds of the readers by playing with names of things. Create variables with subtly different names, variables with randomized (AND subtly different) names, etc. and then let them try and spot the one place where you've mixed up length and lenght. Don't forget about casing differences.
Calling conventions can be abused to create subtle bugs too (like reversing the order of parameters).
Also let's not forget about endless hours of fun from tricky preprocessor defines and templates (did you know that C++ templates are supposedly Turing-complete?) Metaprogramming bugs should be entertaining.
Next idea that comes to mind is to provide a correct program, but flawed input data (subtly, of course). The program will then fail for the lack of error checking, but it will be some time until people realize that they are looking for problems in the wrong place.
Race conditions are often difficult to reproduce and fix, so try playing with multithreading.
Underflows/overflows can be easily missed by casual inspection.
And last, but not least - if you're a programmer, try remembering what was the last big problem that you spent two weeks on solving. If you're not a computer programmer, try to find one and ask them. I'm a .NET programmer, so unfortunately my experiences will relate little to your requirement of C/C++.
For some simple "find the bug in this source code" exercises, check out PC-lint's bug of the month archive.
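As one possible shape for such a contest problem, here is a sketch of the off-by-one idea from the list above: a deliberately buggy function for the contestants, next to a reference solution for the judges. Both function names are hypothetical. The bug is chosen so that it gives a wrong answer rather than undefined behaviour, which keeps the expected output reproducible.

```cpp
// Contest version: intended to return 1 + 2 + ... + n, but the loop stops
// one iteration early.
int sumToBuggy(int n) {
    int total = 0;
    for (int i = 1; i < n; ++i) {  // deliberate bug: should be i <= n
        total += i;
    }
    return total;
}

// Reference solution for the judges.
int sumToFixed(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

For n = 4 the buggy version returns 6 where 10 is expected, which makes a convenient "desired versus actual output" pair for the problem statement.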
In addition to what's above, consider side effects. For example:
// this function adds two ints and returns the sum
int add_em(int &one, int &two)
{
    two += one;
    return two;
}
As you can see, this code modifies the two variable, although the comment doesn't mention that...
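For the answer key, a side-effect-free version might look like the sketch below. The function name is kept from the snippet above; the const references are one way (passing by value is another) to make the compiler reject any attempt to modify the arguments.

```cpp
// Adds two ints and returns the sum. Taking the arguments by const
// reference makes it impossible for the function to modify the caller's
// variables.
int add_em(const int& one, const int& two) {
    return one + two;
}
```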
Debugging is a broad scope, and it may be wise to reflect that in your questions. Without going into details, I can see the following categories :
Source-level debugging - no hints
Questions in this category just have source code, without any further hints on what's wrong.
The actual bug can vary quite a lot here: from straightforward logic bugs like buffer overflows and counting errors, via mathematical errors like rounding errors, to mistaken assumptions like assuming a particular endianness or padding.
Source-level debugging - problem stated
Questions in this category have source code, as well as desired versus actual output/behavior.
E.g. "This program should print 42, but instead prints Out of Memory. Why?"
Crashed code
Questions in this category come not just with source code, but also with a crash dump.
I'll add to the answers above that another form of bug is the incorrect use of some library or API code. Superficially everything looks ok, but there is some caveat (e.g., a precondition or a limitation) that one is not aware of. Interactive debuggers are not as effective by themselves in these situations because they don't expose that information to you (it's often hidden in the documentation).
For example, I did a study of this stuff in the past. I gave people code that used a messaging API in Java, where the error was that the program got stuck as soon as you tried to receive a message. Debugging this interactively was almost impossible. They had to manually figure out what was going on, and realize that one of the queues wasn't set up correctly.
These sorts of bugs are actually quite common.
Real world debugging would include finding synchronization problems and problems at the managed/unmanaged boundary, so please consider C/C++/C# as an option.
Or for real fun, consider using just C# and finding memory leaks.
Also, you will need to mention which tools are allowed to be used. On Windows, there are literally dozens of debugging tools available.