How to debug code causing computer to crash

I have written a program to control several scientific instruments which ends up going through several thousand loops as it runs. This all told tends to take about half an hour to run.
I have run into a weird bug/issue: about every other time I run the program, it freezes the computer and I have to do a hard restart. When I run just a small number of loops to test the program I never have problems. It's only when I try full data runs that it crashes, off and on.
Is there any way to trace the error if it only occurs intermittently? Is there any way to catch what the error is before the computer freezes? Could it be related to running the code in debug mode and not in release?
I am using Visual C++ 2013 on a Win 7 64-bit machine. All the various includes are the 64-bit versions. I can post the code if that would be helpful, but I must warn that it is very long. Thanks

Since the test procedure is so long, maybe the best "in house" way to deal with it is to write the needed debug info to a file at almost every step the program takes.
Be sure to close the file each time, otherwise you will probably lose your data when the machine freezes.
Sure, it will take a lot more time, but if you are lucky and the error has a certain regularity, after a couple of runs you can switch to conditional breakpoints and debug the usual way.
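A minimal sketch of the "write a line, then close the file" idea, assuming plain standard C++ streams. The helper names log_step and last_logged_step and the log messages are made up for illustration. Closing the stream hands the data to the operating system; on a hard freeze the OS may still not have written it to disk, so treat this as a best-effort trace rather than a guarantee.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: appends one line to the log file and closes the file
// before returning, so the line is not left sitting in the program's buffer.
// Returns false if the file could not be opened or written.
bool log_step(const std::string& path, const std::string& message) {
    assert(!path.empty());
    std::ofstream out(path.c_str(), std::ios::app);
    if (!out) {
        return false;
    }
    out << message << '\n';
    out.close();  // flushes the stream buffer and closes the file
    return !out.fail();
}

// Hypothetical helper: after a restart, returns the last non-empty line of
// the log, i.e. the last step the program reported before it stopped.
std::string last_logged_step(const std::string& path) {
    std::ifstream in(path.c_str());
    std::string line;
    std::string last;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            last = line;
        }
    }
    return last;
}
```

Calling log_step at the top of each loop iteration (with the loop index and the instrument being addressed) and reading last_logged_step after the reboot narrows down where the freeze happened.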
If I had to bet on the cause, I would say this is a memory leak.
Hope this helps

Related

C++ Code is running slower over time

I have a long C++ program consisting of thousands of lines of code, several classes, functions etc. The program basically reads info from an input file, runs an algorithm, and writes the results to output files.
Today I realized that the run time of the program changes drastically from run to run. So I did a small test by restarting my computer, closing everything else possible, and running the code 5 times in a row using the same input file. The run times were 50, 80, 130, 180, 190 seconds, respectively.
My first guess in this situation was dynamic memory that is never deleted. But I use dynamic arrays just twice in the whole code, and I am sure I delete those arrays.
Do you guys have any explanation for this? I am using Visual Studio 2010 on Windows 7 computer.

Beware of running programs from within the Visual Studio debugger, as the LFH (low-fragmentation heap) memory allocator is disabled in this case. Try running the software outside of VS.
I have seen cases where tasks that normally take seconds to complete take hours just because they were run from within Visual Studio.
Above all, if you still don't know what is going on, divide and conquer. Instrument the app to see the run times of subsystems, or just place debug timers in various areas to see where execution time is changing drastically, and drill down from there. If it is a memory allocator issue you will normally see large run times while freeing the arrays.
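The "debug timers" suggestion could be sketched like this, assuming a compiler with C++11 <chrono> (Visual Studio 2010 does not ship it, so there a Windows timer such as QueryPerformanceCounter would have to be swapped in). ScopedTimer, TimingTable and the section names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical accumulator: total seconds and number of calls per section.
struct SectionStats {
    double seconds;
    long calls;
    SectionStats() : seconds(0.0), calls(0) {}
};

typedef std::map<std::string, SectionStats> TimingTable;

// RAII timer: measures from construction to destruction and adds the
// elapsed time to the named entry of the table.
class ScopedTimer {
public:
    ScopedTimer(TimingTable& table, const std::string& name)
        : table_(table), name_(name),
          start_(std::chrono::steady_clock::now()) {
        assert(!name_.empty());
    }

    ~ScopedTimer() {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        SectionStats& stats = table_[name_];
        stats.seconds += elapsed.count();
        stats.calls += 1;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingTable& table_;
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};
```

Wrapping each subsystem in a block that starts with, say, ScopedTimer t(table, "free_arrays"); and printing the table at exit shows which section's total grows from run to run.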

Your code runs in an environment, which includes the state of the operating system, disk, network, time, memory, other processes launched, etc.
Executing the same code in the same environment will give the same result, every time.
Now, you're getting different results (execution times). If you're running the same executable repeatedly, then something is changing in the surrounding environment.
Now, the most obvious question is: is your code causing a change to the outside environment? A simple example would be that it reads in a file, changes the data and writes it back out to the same file.
You know your code. Just use this approach to isolate any effects your code may be having on its environment and you'll find the reason.

Debug code taking several orders of magnitude longer to run than release code

I have a large bit of code that takes about 5 minutes to run in debug mode of Visual Studio, and about 10 seconds to run in release mode.
This becomes an enormous issue when I have to debug code at the end of the program, where I have to wait far too long just for the program to hit the breakpoint.
I gave serialization a shot, and used boost::serialize to serialize all the variables before the debug code, but it turns out that deserializing all those variables still takes a minute or two.
So what gives? I'm aware that many optimizations and inline stuff is disabled when running code in debug mode, but it strikes me as very peculiar that it takes almost 2 orders of magnitude longer to run the code in debug mode. Are there any hacks or something programmers use to bypass this wait time? I know there's lots of programs out there much more computationally intensive than mine, but I highly doubt that they would wait 5 minutes just for their debug code to hit a breakpoint.

"I have a large bit of code that takes about 5 minutes to run in debug mode of Visual Studio, and about 10 seconds to run in release mode."
That's normal.
"So what gives? I'm aware that many optimizations and inline stuff is disabled when running code in debug mode,"
That isn't all. In addition to that, MSVC inserts MANY sanity checks, especially when STL containers are involved. For example, it will warn you about incompatible iterators, a broken ordering comparator in std::map, and many other similar issues. I think it also detects memory corruption to some extent, as well as buffer overruns, out-of-range access for std::vector, etc. This can be useful, but the overhead is massive. Throw a certain profiler on top of that and your 10 seconds can just as well take 30 minutes to finish. And this will also be normal.
"Are there any hacks or something programmers use to bypass this wait time?"
Aside from using it instead of #1 excuse...
You could build a debug version of your code with MinGW. It doesn't insert (this kind of) sanity checks.
You could also investigate the STL sources and see which macros enable all those features. It is quite possible that they can be disabled. It is also quite possible that said macros are documented somewhere on MSDN.
You could try to find an alternative STL implementation for the debug mode.
You could also build release mode with debug info and debug that instead.
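On the macro question: from Visual Studio 2010 onward the relevant macro is _ITERATOR_DEBUG_LEVEL (0 = checks off, 1 = checked iterators, 2 = full iterator debugging, the Debug default). The sketch below only reports the value a translation unit was compiled with. The function name is made up, and the macro should be set in the project's preprocessor definitions rather than in source, because every object file and library linked together has to agree on it or the linker reports a mismatch.

```cpp
#include <cassert>
#include <vector>  // any standard header; on MSVC it defines the macro

// Hypothetical helper: returns the iterator debug level this translation
// unit was compiled with, or -1 on standard libraries that do not define
// _ITERATOR_DEBUG_LEVEL (for example libstdc++ under MinGW).
int iterator_debug_level() {
#ifdef _ITERATOR_DEBUG_LEVEL
    return _ITERATOR_DEBUG_LEVEL;
#else
    return -1;
#endif
}
```

Printing this once at startup confirms whether a Debug configuration really has the checks turned off after changing the project settings.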

OP here, so I was messing around with release builds running without the debugger attached, and found that with VS2010 ultimate (may also be for express too), when a program crashes it gives you a prompt asking if you want to debug or close the program (however before this it asks you if you want to abort, retry, or ignore; choose ignore). Clicking debug and selecting the current open solution in visual studio will load up the code and pretend the entire crash occurred while the program was being debugged.
This essentially means that if you put intentional glitches in your code where you would want a breakpoint, you can run the code in fast release mode without the debugger attached, and start debugging after the program crashes. For the purpose of forcing a crash I created an empty vector and had it try to access an element of the empty vector.
However this method has a major setback, being that it's one-time use. After the program crashes and you start debugging it, you cannot do anything more than view the watchlist and other variables, which means no other breakpoints can be set since you're technically not running a debug-enabled process.
Sure, it's a pretty huge setback, but that doesn't mean the method wouldn't have its uses.
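Reading an element of an empty vector is undefined behavior, so in an optimized build it is not guaranteed to crash at all. A more predictable way to force the "debug or close" prompt, swapped in here as an alternative, is the MSVC intrinsic __debugbreak(). The sketch below assumes MSVC for that branch and falls back to std::abort() elsewhere; break_here_if is a hypothetical name.

```cpp
#include <cassert>
#include <cstdlib>

#ifdef _MSC_VER
#include <intrin.h>  // declares __debugbreak on MSVC
#endif

// Hypothetical helper: stops the process at this point when `condition`
// is true, and does nothing otherwise.
inline void break_here_if(bool condition) {
    if (!condition) {
        return;
    }
#ifdef _MSC_VER
    __debugbreak();  // raises a breakpoint exception at this line
#else
    std::abort();    // portable fallback: terminates the process
#endif
}
```

For example, break_here_if(iteration == 4999); placed just before the code of interest lets the release build run at full speed until that iteration.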

Depends on what you want to debug. If you're ready to tolerate some strange behavior, you can usually customize what you want to debug. Try turning on optimization, using release mode libraries (keeping debug info enabled).

How to easily figure out where and why a program crashed?

I'm currently working on a program (in C++, using Code::Blocks) that uses a lot of random numbers and takes a while to get going. Most of the time it works fine, but every now and then it performs an illegal operation and must shut down. Given the random numbers all over the place, and the fact that it currently takes ~3-5 minutes for the program to reach the stage at which the errors occur (this timeframe is normal/acceptable), reproducing the problems reliably and conveniently is extremely difficult. Reporting on every other line of code to cout to track things manually is time-consuming, clutters the output with things unrelated to the bugs, and is not always helpful, since even if I know when the program stops, I sometimes don't know why.
Is there some way for me to see what the last operation in the program was before it crashed, and to see why this operation led to a crash? Something within Code::Blocks would be best, but something third-party works too. It also needs to be something I can use every time I test the program, because I never know when a crash is going to occur.

That is what debuggers are for. Build the system with full debugging symbols, configure the system so that you get a full crash report (on Linux, a core file), and then launch the debugger with the core file. Alternatively, run the whole program inside the debugger, but that might take a while, since running inside a debugger is usually much slower than running outside of it.
The debugger should be able to give you the state of the program when the illegal instruction happened, and you will get some insight into the state the program was in. From there, either you figure out what is wrong, or maybe you can make a couple of smaller test cases that might trigger the error.
Debugging issues that cannot be reproduced systematically is a pain, good luck there!

Sounds like you want a debugger. See "Debugging C and C++ programs using GDB".

How do I detect where the program is stuck in an infinite loop?

I am working on a (relatively complex) game. The game freezes in release mode, after 1-2 minutes of gameplay. My current release configuration allows me to break into the debugger, which is good. It may give me wrong information because of optimization, but that is fine for this particular case (I can turn off optimization for a single file/function/piece of code).
Problem is, I (we, since we are a team) don't know where it is hanging. It is not as simple as one relatively small infinite loop that is hanging, as other things (Graphics, sound) are being updated, just that the game-play has stalled. The main game loop (an infinite loop) is always running and is very long/complex, so stepping through is going to be a pain (but it is one of the options).
The first thing I tried is Visual Studio's break all but it always breaks in code that is not mine and consequently shows me assembly output. Eventually, with enough persistence, SVN history checking and commenting out code I will be able to figure out where it is hanging, but there has to be a better way... hopefully?
Note: There is a Visual Studio option I am aware of that allows debugging user code only, but that is managed code only.
EDIT: I was able to solve the problem via the stack trace and lots of hours of keeping track of various things to see where the game was hanging. I will select Sjoerd's answer as the correct one. However, if someone has a suggestion for a tool/technique that automates such a task, by all means, add your answer!

If you break and you encounter native code that is not yours, check the call stack. The call stack is the list of functions that got called to reach the current point in the code. Go up some levels in the stack until you reach the function of your own that is currently running.

Hit the pause button in Visual Studio while the program is hung.
This should break the debugger at the current line. You can then step through and see what is happening.

As an alternative to debugging symbols and breaks (which is the tool of choice when possible), add logging: It is not uncommon for games (and other apps) to have a huge logging system they can turn on and off with a compiler flag so they can still do some kind of debugging/tracing in "release builds". If your logging works fine you should see what is and what is not happening and get at least some idea where things go wrong.
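A minimal sketch of a logging macro that a compiler flag can switch on and off. The names GAME_TRACE, GAME_TRACE_ENABLED and trace_sink are invented for this example. Building with -DGAME_TRACE_ENABLED=0 (or /DGAME_TRACE_ENABLED=0 on MSVC) compiles every trace statement away.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical switch: tracing is on unless the build defines it as 0.
#ifndef GAME_TRACE_ENABLED
#define GAME_TRACE_ENABLED 1
#endif

// Hypothetical sink: where trace lines go. Defaults to std::clog and can
// be pointed at a file stream or a string stream instead.
inline std::ostream*& trace_sink() {
    static std::ostream* sink = &std::clog;
    return sink;
}

#if GAME_TRACE_ENABLED
// Writes "file:line message" and flushes, so the line survives a hang.
#define GAME_TRACE(msg)                                              \
    do {                                                             \
        (*trace_sink()) << __FILE__ << ':' << __LINE__ << ' ' << msg \
                        << std::endl;                                \
    } while (0)
#else
// Tracing disabled: the statement expands to nothing.
#define GAME_TRACE(msg) do { } while (0)
#endif
```

A call such as GAME_TRACE("update skipped, frame=" << frame); in the main loop then shows which parts of the loop still run once gameplay has stalled.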

You might well never be able to catch the problem via an interrupt if the code that should be executing isn't executing. There are lots of ways this can happen. Just a few:
You have some parameter that indicates the time at which the next update is to be performed. If this somehow gets set to some big number, the code that does the update will happily see that nothing needs to be done. Next! This can give all the appearances of a hung program even though it isn't really hung at all. The state update and the graphics functions are still being called at their prescribed rate.
You may have some counter that represents time and some rounding mechanism for incrementing time. If the counter is a 32-bit signed int and the granularity of your counter is 0.1 microseconds, you will hit INT32_MAX after just 3.6 minutes. Now time is frozen, so once again you have a situation where updates may not be performed.
You are using a single precision floating point number to represent time and update time via time += delta_t; This will stop working after a couple of minutes if your delta_t is 10 microseconds. This is yet another mechanism by which time can be frozen.
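The last two failure modes can be checked with a few lines of arithmetic. This is a sketch with made-up function names, assuming IEEE 754 single precision for float. A signed 32-bit counter of 0.1 microsecond ticks tops out at about 214.7 seconds (roughly 3.6 minutes). A float clock that has accumulated 256 seconds has neighbouring values about 30.5 microseconds apart, so adding 10 microseconds rounds back to the same value and the clock stops advancing.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Seconds until a signed 32-bit tick counter reaches its maximum value,
// given the duration of one tick in seconds.
double seconds_until_int32_max(double tick_seconds) {
    return static_cast<double>(std::numeric_limits<std::int32_t>::max()) *
           tick_seconds;
}

// True if adding dt to a single-precision clock value t leaves it unchanged,
// i.e. `time += delta_t` has silently stopped advancing the clock.
bool float_clock_is_stuck(float t, float dt) {
    // volatile forces the sum to be rounded to float before the comparison
    volatile float next = t + dt;
    return next == t;
}
```

With these, float_clock_is_stuck(128.0f, 1e-5f) is false while float_clock_is_stuck(256.0f, 1e-5f) is true. Switching the clock to double, or counting integer ticks in a 64-bit type, avoids both problems for any realistic play time.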
Edit
Have you looked at the CPU usage in your various threads? The above problems might cause the physics or game-playing thread to exhibit a drastic drop in CPU usage after a couple of minutes. You might also get this behavior if the game playing thread is perpetually locked, but here you might (with the right tool) get an indication that that thread is always asleep.

How could running code in the debugger make it faster?

This has never happened to me before. In Visual Studio, I have a piece of code that is executed 300 times. I time every iteration with the performance counter, and then average the results.
If I run the code in the debugger I get an average of 1.01 ms; if I run it without the debugger I get 1.8 ms.
I closed all other apps, I rebooted, I tried it many times: always the same timing.
I'm trying to optimize my code, but before throwing myself into changing the code, I want to be sure of my timings, so that I have something to compare with.
What can cause that strange behaviour?
Edit:
Some clarification:
I'm running the same compiled piece of code: the release build. The only difference is (F5 vs CTRL-F5)
So, compiler optimization should not be involved.
Since each calculated time was very small, I changed the way I benchmark: I'm now timing the 300 iterations together and then dividing by 300. I get the same result.
About caching: the code is doing some image cross-correlation, with different images at each iteration. The steps of the processing are not modified by the data in the images. So I think caching is not the problem.

I think I figured it out.
If I add a Sleep(3000) before running the tests, they give the same result.
I think it has something to do with the loading of misc. dlls. In the debugger, the dlls were loaded before any code was executed. Outside the debugger, the dlls were loaded on demand, and one or more were loaded after the timer was started.
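If one-time costs such as on-demand DLL loading are the cause, an alternative to the Sleep(3000) workaround is to run the code a few times untimed before starting the clock. The sketch below assumes C++11 <chrono> rather than the Windows performance counter used in the question, and average_ms is a hypothetical helper.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `work` warmup_runs times without timing, so
// one-time costs (lazy DLL loads, first-touch page faults) are paid up
// front, then times timed_runs calls as a single interval and returns the
// average time per call in milliseconds.
template <typename Work>
double average_ms(Work work, int warmup_runs, int timed_runs) {
    assert(timed_runs > 0);
    for (int i = 0; i < warmup_runs; ++i) {
        work();
    }
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int i = 0; i < timed_runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> total =
        std::chrono::steady_clock::now() - start;
    return total.count() / timed_runs;
}
```

If the averages with and without the debugger converge once a few warm-up runs are added, the difference was start-up cost rather than the code under test.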
Thanks all.

I don't think anyone has mentioned this yet, but the debug build may not only affect the way your code executes, but also the way the timer itself executes. This can lead to the timer being inaccurate / slower / definitely not reliable. I would recommend using a profiler as others have mentioned, and compare only similar configurations.

You are likely to get very erroneous results by doing it this way ... you should be using a profiler. You should read this article entitled The Perils of MicroBenchmarking:
http://blogs.msdn.com/shawnhar/archive/2009/07/14/the-perils-of-microbenchmarking.aspx

It's probably a compiler optimization that's actually making your code worse. This is extremely rare these days but if you're doing odd, odd stuff, this can happen.
Some debuggers / IDEs like Visual Studio will automatically zero out memory for you in Debug mode; this may be a contributing factor.

Are you running the exact same code in the debugger and outside the debugger or running debug in the debugger and release outside? If so the code isn't the same. If you're running debug and release and seeing the difference you could turn off optimization in release and see what that does or run your code in a profiler in debug and release and see what changes.

The debug version usually initializes variables to a fixed value (MSVC's debug run-time checks fill uninitialized stack memory with a pattern such as 0xCC rather than 0), while a release binary does not initialize variables unless the code explicitly does. This may affect what the code is doing: the size of a loop, or a whole host of other possibilities.
Set the warning level to the highest level (level 4, default 3).
Set the flag that says treat warnings as errors.
Recompile and re-test.

Before you dive into an optimization session get some facts:
Does it make a difference? Does this application run twice as slow when measured over a reasonable length of time?
How are the debug and release builds configured?
What is the state of this project? Is it complete software, or are you profiling a single function?
How are you running the debug and release builds? Are you sure you are testing under the same conditions (e.g. process priority settings)?
Suppose you do optimize the code: what do you have in mind?

Having read your additional data a distant bell started to ring ...
When running a program in the debugger, the debugger will catch both C++ exceptions and structured exceptions (Windows exceptions).
One event that will trigger a structured exception is a divide by zero. It is possible that the debugger quickly catches and dismisses this event (as first-chance exception handling) while the release code takes a bit longer before doing something about it.
So if your code might be generating such or similar exceptions, it is worth your while to look into it.
