This is utterly mystifying me. I have, in my class declaration, two lines:
std::multimap<int, int> commands;
std::multimap<std::string, std::string> config;
The code compiles without issue, but when I run it, I get the following error:
*** glibc detected *** ./antares: free(): invalid pointer: 0xb5ac1b64 ***
Seems simple enough, except that it has nothing to do with how the two variables are later handled. I removed all references in the rest of the code to the variables - still crashed. I commented out one of the lines - either one, and the program ran without issue. How can the error not be with either particular variable? I'm working under the assumption that there isn't a bug in STL, but I've run out of ideas on how my code could possibly be doing this.
This one has me stumped, so I'd appreciate any help you can provide.
Wyatt
EDIT: I'm not suggesting there's a problem with STL, that was just me being a bit glib. I know the bug is in my code, what I want to know is - what could possibly be wrong that declaring an unreferenced variable would cause it to crash? Why would that affect my code at all?
My code is a few thousand lines long, so it's not really worth anyone's time reading through it, I'm just looking for someone to point me in the right direction.
You're correct to assume the problem isn't in GCC or the STL. However, if the maps are causing free errors, your other code is likely stack smashing (or heap smashing). A truly terrible bug to chase down. The worst part about stack smashing is that the object that breaks is not the object with the bug.
Here are some debugging tips.
Run the app under valgrind.
Define _GLIBCXX_DEBUG to enable STL debugging.
Add MALLOC_CHECK_=1 as an environment variable. This will give you better malloc error messages. More info here.
On rare occasions I have been able to add a memory watch to the location that will be smashed. But it is rare when you can predict where the smashing will occur.
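When you can guess which buffer is being overrun, a guard region around it is a cheap way to confirm the guess. The following is only a minimal sketch: GuardedBuffer, kGuardPattern and the 8-byte guard size are hypothetical names and values chosen for illustration, not part of any library.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Fill pattern for the guard bytes (arbitrary value, an assumption).
constexpr unsigned char kGuardPattern = 0xA5;

// Hypothetical helper: N usable bytes with Guard pattern bytes on each side.
template <std::size_t N, std::size_t Guard = 8>
class GuardedBuffer {
public:
    GuardedBuffer() {
        raw_.fill(kGuardPattern);      // paint everything, guards included
        std::memset(data(), 0, N);     // then clear the usable region
    }

    unsigned char* data() { return raw_.data() + Guard; }
    static constexpr std::size_t size() { return N; }

    // True while both guard regions still hold the fill pattern.
    bool intact() const {
        for (std::size_t i = 0; i < Guard; ++i) {
            if (raw_[i] != kGuardPattern) return false;              // underrun
            if (raw_[Guard + N + i] != kGuardPattern) return false;  // overrun
        }
        return true;
    }

private:
    std::array<unsigned char, N + 2 * Guard> raw_;
};
```

Calling intact() at a few checkpoints narrows down the window in which the overwrite happens. It only catches writes that land in the guard bytes, so a clean result does not prove the buffer is innocent.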
You are right: the crash is not from these two lines - they just make it visible.
Here's how to diagnose this problem:
first, leave your variables defined (make your program crash)
second, remove or disable other parts of your code until the crash stops happening. Then you will know an approximate area that corrupts your memory.
third (once you have an area that when disabled stops the crash) start enabling parts of it until the crash happens again.
Edit: I'd say your problem is with the code that contains your two multimaps (a copy constructor or assignment operator is missing, or something like that). It's just a wild guess, so don't put much stock in it.
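To illustrate the kind of bug that guess refers to: a class that owns a raw pointer but relies on the compiler-generated copy operations ends up with two objects freeing the same memory. The class in the question is not shown, so Buffer below is a purely hypothetical stand-in; the sketch shows the copy constructor and assignment operator such a class needs.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a raw heap array.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Deep copy: without this, both objects would delete[] the same array.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap assignment: the old array is released by tmp's destructor.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            Buffer tmp(other);
            std::swap(size_, tmp.size_);
            std::swap(data_, tmp.data_);
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int* raw() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

If the real class holds only standard containers such as the two multimaps, the compiler-generated copies are already correct and this is not the cause.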
As the title mentions, I have a problem where one executable of a big project gives a segmentation fault at run time when it is compiled normally, but not when it is compiled with debug flags.
We are working on Linux SUSE servers and the code is mostly C++. Through bt in gdb, I have been able to see exactly where the problem occurs, which brings me to the question. The file is an auto-generated one which has not been changed for years. The difference now is that we have updated a third-party component, gSOAP. Before the update it worked normally in both debug and non-debug builds.
With debug flags, the problem disappears magically (for newbies like me).
I am sorry, but it's not possible to include a lot of code, only the line where it fails:
/*------------------------------------------------------------.
| yynewstate -- Push a new state, which is found in yystate. |
`------------------------------------------------------------*/
yynewstate:
/* In all cases, when you get here, the value and location stacks
have just been pushed. So pushing a state here evens the stacks. */
yyssp++;
yysetstate:
*yyssp = yystate; <------------------ THIS LINE
So, any help would be appreciated. I actually don't understand why this problem arises and what steps I should take to solve it.
EDIT: I don't expect you to solve this particular case for me. I'm asking more for help understanding why this can occur in programming; my case in this code is just an example.
First, please realize that you're using C++, not Java or any other language where the running of your program is always predictable, even runtime issues are predictable.
In C++, things are not predictable as in those languages. Just because your original program hasn't changed for years does not mean the program was error-free. That's how C++ works -- you think you have an error-free program, and it is not really error-free.
From your code, the exception is because yyssp is pointing to something it shouldn't be pointing to, and dereferencing this pointer causes the exception. That is the only thing that could be concluded from the code you posted. Why the pointer is pointing to where it is? We don't know, that is what you need to discover by debugging.
As to why things run differently in debug and release -- again, a bug like this allows a program to run in an unpredictable way. Add or remove code, run it on another machine, run it with differing compiler options, maybe even run it next week, and it might behave differently.
One thing you should not do -- if you make a totally irrelevant code change and magically your program works, do not claim the problem is fixed or resolved. No -- the problem is not fixed -- you've either masked it, or the bug is moved to another part of your code, hidden from you. Every fix that entails things like this must be reasoned as to why the fix addresses the problem.
Too many times, a naive programmer moves things around, adds or removes lines, and bingo, things work, so that becomes the fix. Don't fall into that trap.
Someone on my team found a temporary workaround for this:
the cause was the optimization flags that this library is built with.
The default for our build was -O2, while the debug build uses a different setting.
Building the library with -O0 (by changing the makefile) works around the problem for now.
First of all, thank you for taking the time to view my question and help. I noticed that a lot of questioners here show little or no appreciation, but I'm sincerely appreciative for the help and the community here :)
I wrote a C++ plugin (composed of hundreds of source files) for an application I do not have the source code for (it's a video game). In other words, I only have the source code for my plugin, but not the game. Now, somewhere in those thousands of lines in my plugin, something causes the game engine to throw (probably an access violation) and I don't know where. By the time the debugger breaks, the stack is corrupted and all I get are hex addresses for DLLs I do not have the source for (but the exception occurs in my DLL for sure). I tried everything... I just can't seem to find where the exception occurs. Sometimes the debugger points to a "memory relocation" function (which I never used in my plugin), sometimes it points to the engine's GameFrame(), and other times it points to a damage callback (all these are just different member functions of a class).
I tried practically everything... I googled for hours trying to find out how to use other debuggers like WinDbg and Microsoft Application Verifier. I tried to comment out one or the other, or both, where the debugger points, but it still crashes. I even inserted OUTPUT("The name of the last executed function is: %s", __FUNCTION__) into EVERY function in my application hoping to painstakingly catch the last function but it seems any kind of I/O prevents the exception from occurring for some reason... And 10 minutes of debugging and the crash happens at some random last executed function.
I can't find out where this access violation is happening or where some temporary object is removed to cause these bad pointers (I check every pointer before using it), but damn, I'm reaching my limit's end here.
So, how does one debug the impossible... a random crash with a crappy debugger call stack? Thanks in advance for your patient and kind help!
My suggestion: try different debuggers (non MS), they catch different things.
My experience: a program for which I had the source code and full debugging symbols corrupted the stack. Neither VS nor WinDbg could help, but OllyDbg annotated a non-string variable with the value "r for pattern.", which showed that I had overrun some string buffer into that variable. OllyDbg also has an option to walk the stack the hard way (without using dbghelp.dll).
From my experience, the old adage "Prevention is better than cure" is very relevant. It is best to prevent the bugs from creeping in, by following good software development practices (unit tests, regressions, code review, etc.) than to work it out later once the bugs show up.
Of course, the real world is not perfect, and bugs do show up. To debug memory corruption, you have some nice tools like valgrind, which at least narrow down the problem sections for you to take a closer look at. Debugging a complex program is not easy, and if your debugger throws up, it requires a lot of persistence on your part. One technique I find useful is to selectively enable or disable certain modules, to narrow down which module has the problem.
Sometimes you need to use "referential transparency" to unload some modules. To give you a stripped down example, consider:
int foo = factorial(3);
If I suspect there's a problem in this code (and the debugger crashes before I can see the call stack), I have to try by removing this code, and see if the problem persists. However, foo may be used later, so I cannot just remove it. Instead I can replace it with int foo = 6; and continue.
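The substitution above can be made switchable, so the suspect call can be taken out and put back without editing the call sites. This is only a sketch: the STUB_FACTORIAL macro and the compute_foo() wrapper are hypothetical names introduced for illustration.

```cpp
// The suspect function, here a plain recursive factorial.
int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// Hypothetical wrapper around the suspect call. Building with
// -DSTUB_FACTORIAL replaces the call with its known result.
int compute_foo() {
#ifdef STUB_FACTORIAL
    return 6;               // known value of factorial(3), hard-coded
#else
    return factorial(3);    // the real call under suspicion
#endif
}
```

If the crash disappears with the stub enabled, the problem is in, or triggered by, the replaced code. This only works when the call has no side effects that the rest of the program depends on.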
Another important point is to always maintain a trace file, where your code keeps logging what it is doing. When a program crashes, the trace file can often help narrow down the problem. Of course, you disable the tracing by default, so that it doesn't cause a performance bottleneck.
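A trace facility of that kind can be very small. The sketch below assumes a hypothetical Trace class (not from any library) that writes to any std::ostream, is disabled by default, and flushes after every line so the last entry survives a crash.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical minimal tracer; disabled unless explicitly switched on.
class Trace {
public:
    explicit Trace(std::ostream& sink, bool enabled = false)
        : sink_(sink), enabled_(enabled) {}

    void enable(bool on) { enabled_ = on; }

    // Writes "<function>: <message>" and flushes immediately.
    void log(const char* function, const std::string& message) {
        if (!enabled_) return;   // cheap early-out when tracing is off
        sink_ << function << ": " << message << '\n';
        sink_.flush();
    }

private:
    std::ostream& sink_;
    bool enabled_;
};
```

In a real program the sink would typically be a std::ofstream opened on a trace file, with calls such as trace.log(__func__, "writing output"); a std::ostringstream is handy for testing. Keep in mind that adding I/O changes timing, so a timing-sensitive bug may hide while tracing is on.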
I'm facing a problem that is so mysterious, that I don't even know how to formulate this question... I cannot even post any piece of code.
I develop a big project on my own, started from scratch. It's nearly release time, but I can't get rid of some annoying error. My program writes an output file from time to time and during that I get either:
std::string out_of_range error
std::string length_error
just lots of nonsense on output
It is worth noting that those errors appear very rarely and can never be reproduced, even with the same input. Memcheck shows no memory violation, even on runs where errors were previously noted. Cppcheck has no complaints either. I use STL and pthreads intensively, but the errors also happen without the latter.
I tried both newest g++ and icpc. I am running on some version of Ubuntu, but I don't believe that's the reason.
I would appreciate any help from you, guys, on how to tackle such problems.
Thanks in advance.
Enable core dumps (ulimit -c or setrlimit()), get a core and start gdb'ing. Or, if you can, make a setup where you always run under gdb, so that when the error eventually happens you have some information available.
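For the setrlimit() route, the program can raise its own core-file limit at startup. This is a minimal sketch for POSIX systems; enable_core_dumps() is a hypothetical helper name.

```cpp
#include <sys/resource.h>

// Raises the soft core-file size limit to the hard limit for this process.
// Returns false if either system call fails.
bool enable_core_dumps() {
    struct rlimit limit {};
    if (getrlimit(RLIMIT_CORE, &limit) != 0) return false;
    limit.rlim_cur = limit.rlim_max;   // soft limit may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Call it early in main(). If the hard limit is itself 0, core files stay disabled, and whether and where a core file is written also depends on system configuration.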
The symptoms hint at a memory corruption.
If I had to guess, I'd say that something is corrupting the internal state of the std::string object that you're writing out. Does the string object live on the stack? Have you eliminated stack smashing as a possible cause (that wouldn't be detectable by valgrind)?
I would also suggest running your executable under a debugger, set up in such a way that it would trigger a breakpoint whenever the problem happens. This would allow you to examine the state of your process at that point, which might be helpful in figuring out what's going on.
gdb and valgrind are very useful tools for debugging errors like this. valgrind is especially powerful for identifying memory access problems and memory leaks.
I encountered strange optimization bugs in gcc (like a ++i being assembled to i++ in rare circumstances). You could try declaring some critical variables volatile but if valgrind doesn't find anything, chances are low. And of course it's like shooting in the dark...
If you can at least detect that something is wrong in a certain run from inside the program, like detecting nonsensical output, you could then call an empty "gotNonsense()" function that you can break into with gdb.
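A sketch of that idea follows. The gotNonsense() name comes from the suggestion above; looksLikeNonsense() and checkOutput() are hypothetical helpers, and the "non-printable byte" test is only an assumed example of what nonsense might look like. The noinline attribute and the empty asm statement are GCC/Clang extensions, used here so the call is not optimized away.

```cpp
#include <cctype>
#include <string>

// Deliberately empty: exists only as a breakpoint target
// ("break gotNonsense" in gdb).
__attribute__((noinline)) void gotNonsense() {
    asm volatile("");   // keeps the compiler from dropping the call
}

// Assumed heuristic: any byte that is neither printable nor whitespace
// counts as nonsense.
bool looksLikeNonsense(const std::string& text) {
    for (unsigned char c : text) {
        if (!std::isprint(c) && !std::isspace(c)) return true;
    }
    return false;
}

// Returns true if the text passed the check; otherwise hits the hook first.
bool checkOutput(const std::string& text) {
    if (looksLikeNonsense(text)) {
        gotNonsense();
        return false;
    }
    return true;
}
```

Running the program under gdb with a breakpoint on gotNonsense stops it at the moment the bad output is detected, while the corrupted state is still there to inspect.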
If you cannot determine exactly where in the code your program crashes, one way to find that place is to use debug output. Debug output is a good way of tracking down bugs that cannot be reproduced, because you will get more information about the bug the next time it happens, without needing to actively reproduce it. I recommend using a logging library for that; Boost provides one, for example.
You are using STL intensively, so you can try to run your program with libstdc++ in debug mode. It will do extra checks on iterators, containers and algorithms. To use the libstdc++ debug mode, compile your application with the compiler flag -D_GLIBCXX_DEBUG
When I'm using my debugger (in my particular case, it was Qt Creator together with GDB that inspired this) on my C++ code, sometimes even after calling make clean followed by make the debugger seems to freak out.
Sometimes it will seem to be lined up with another piece of code's line numbers, and will jump around. Sometimes it is off by one line, sometimes it is totally off and it'll jump around erratically.
Other times, it'll freak out by stepping into things I didn't ask it to step into, like while stepping over a function call, it might step into the string initialization routine that is part of it.
When I get seg faults, sometimes it's able to tell me where it happened perfectly, and other times it's not even able to display question marks for which functions called the code and from where, and all I see is assembly, even while running the exact same code repeatedly.
I can't seem to figure out a pattern to what causes these failures, and sometimes my debugger is perfectly well behaved.
What are the theoretical reasons behind these debugger freak outs, and what are the concrete steps I can take to prevent them?
There are three very common reasons:
You're debugging optimized code. This rarely works - optimized code can be reordered/inlined/precomputed/etc. to the point there's no chance whatsoever to map it back to the source code.
You're not debugging, for whatever reason, the binary matching the current source code.
You've invoked undefined behavior somewhere, and whatever your code did has messed with the scaffolding the debugger needs to keep its sanity. This is what usually happens when you get a segfault and can't get a sane stack trace: you've overwritten the information (e.g. stack pointers) the debugger needs to do its job.
And there are probably hundreds more. Some that I have personally encountered: debugging multithreaded code, and, depending on gcc/gdb versions and various other things, quite a handful of debugger bugs.
One possible reason is that debuggers are as buggy as any other program!
But the most common reason for a debugger not showing the right source location is that the compiler optimized the code in some way, so there is no simple correspondence between the source code and the executable code. A common optimization that confuses debuggers is inlining, and C++ is very prone to it.
For example, your string initialization routine was probably inlined into the function call, so as far as the debugger was concerned, there was just one function that happened to start with some string initialization code.
If you're tracking down an algorithm bug (as opposed to a coding bug that produces undefined behavior, or a concurrency bug), turning the optimization level down will help you track the bug, because the debugger will have a simpler view of the code.
I have the same problem as you, and I have not solved it yet. One workaround I have come up with is to install a virtual machine with a Unix system in it and debug under Linux there. Perhaps it will work.
I have found the reason: you should rebuild the project every time you change your code, or Qt will just run the old version of the code.
I have been working for the last few weeks trying to track down a really difficult bug that crashes my application. First, the application was crashing on the assign of a std::string, then during the free of a local variable.
After careful inspection of the code, there was no reason for it to crash at these locations; however, it always crashed while trying to free an invalid pointer (i.e. a pointer that pointed to invalid memory). And I have no idea why this pointer was not pointing to the right location.
I suspect that the issue has to do with a memory corruption problem or pointer corruption problem of some sort. The problem is that I can't visually track it down....yet. I have no idea where to start looking in the code, and there are thousands of lines of code to go through so this does not seem like a realistic approach to the problem.
So in comes Valgrind...
A tool that I have depended upon many a time to find issues within the code that may lead to a crash of this type. However, this time it has come up empty-handed! I do not see any errors in valgrind when the problem occurs, hence this question.
Are there any other applications that can complement valgrind and help find issues in the code that may cause a crash mentioned above?
Thanks!
I assume you're using valgrind's memcheck tool, which is what it is famous for. Since you are using valgrind already you might also try running your program through valgrind --tool=exp-sgcheck (formerly exp-ptrcheck), which is an experimental tool that is designed to catch certain types of errors that memcheck will miss, including access checks for stack and global arrays, and use of pointers that happen to point to a valid object but not the object that was intended. It does this by using a completely different mechanism, essentially tracking each pointer into memory rather than tracking the memory itself, and through use of heuristics.
Be aware that the tool is experimental, but you may find that it catches something significant. Currently it does not yet support OS X or non-Intel processors.
In my experience, Coverity and Purify have found errors of this kind that valgrind didn't (in fact, each tool found problems that the others missed).
But sometimes no tool gives a hint and you have to dig deeper: add instrumentation, play with breakpoints on "modify memory at address", try to simplify the failing test case, and so on, to find the root cause. That can be very painful.
My experience is that often this sort of problem is caused by a heap overflow. Electric Fence is a relatively simple allocation debugging tool I like to use. Its main use is as a dynamic analysis tool to check for heap overflows, a complement to "-fstack-protector-all" which checks for stack overflows.
More links to efence stuff.
Is it possible some stack corruption is occurring? If so, try enabling stack canaries with the -fstack-protector-all option, assuming you are using g++.
Other than that, have you cranked up warning flags to help identify suspicious code?
In my opinion, using a debugger with "reverse debugging" capabilities could help.
You would be able to step back in time and hopefully find out what was the real source of the problem.
Here are a couple of links:
http://www.gnu.org/software/gdb/news/reversible.html
http://undo-software.com/ (which apparently is free for non-commercial applications)
You didn't specify the platform, but I can recommend Gimpel PC-lint as an excellent static analysis tool (don't be fooled by the name!). They also offer FlexeLint for other platforms, but I have no personal experience of that product.
Have you tried using lint, FlexeLint or cppcheck? These may help identify a problem.
If you know which area of memory is being corrupted, have you tried marking this memory as protected? This may mask your problem and not help at all, but if the program still crashes, the point at which the memory is modified will help you resolve the problem.
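On Linux, one way to mark memory as protected is mprotect(), which works on whole pages. The sketch below assumes the data of interest can be placed in its own page; ProtectedPage is a hypothetical wrapper written for illustration.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical RAII wrapper around one anonymous page of memory.
class ProtectedPage {
public:
    ProtectedPage()
        : size_(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))),
          addr_(mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)) {}

    ~ProtectedPage() {
        if (valid()) munmap(addr_, size_);
    }

    ProtectedPage(const ProtectedPage&) = delete;
    ProtectedPage& operator=(const ProtectedPage&) = delete;

    bool valid() const { return addr_ != MAP_FAILED; }
    void* data() const { return addr_; }
    std::size_t size() const { return size_; }

    // After this succeeds, a write into the page raises SIGSEGV at the
    // instruction doing the write, which is where the debugger stops.
    bool makeReadOnly() {
        return valid() && mprotect(addr_, size_, PROT_READ) == 0;
    }

    // Re-enables writes for code that is allowed to modify the data.
    bool makeWritable() {
        return valid() && mprotect(addr_, size_, PROT_READ | PROT_WRITE) == 0;
    }

private:
    std::size_t size_;
    void* addr_;
};
```

The legitimate writers call makeWritable() before and makeReadOnly() after their updates; any other write then faults immediately instead of silently corrupting the data.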
If valgrind can identify the bad pointer being passed to free(), you could try running the program under DDD, which can set a hardware watchpoint on the memory location and halt the program when it gets a bad value. If the pointer is getting changed a lot, you may have to write some code around malloc and free to keep track of which values are good and bad.
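The "code around malloc and free" can be as simple as a set of live pointers. This is a rough sketch with a hypothetical AllocationTracker class; it is not thread-safe and only sees allocations that go through it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>

// Hypothetical wrapper that remembers which pointers came from allocate().
class AllocationTracker {
public:
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p != nullptr) live_.insert(p);
        return p;
    }

    // Frees p only if it is a live pointer from allocate(). Returns false,
    // without freeing anything, for unknown or already released pointers.
    bool release(void* p) {
        if (p == nullptr) return true;   // freeing a null pointer is a no-op
        std::set<void*>::iterator it = live_.find(p);
        if (it == live_.end()) return false;
        live_.erase(it);
        std::free(p);
        return true;
    }

    std::size_t liveCount() const { return live_.size(); }

private:
    std::set<void*> live_;
};
```

A false return from release() is the place to log or set a breakpoint: it marks the moment a bad pointer value reaches the deallocation path, which is usually closer to the cause than the crash inside free().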