In what situations must you work on a debug build? - c++

Given that you can debug a release build, as described in http://msdn.microsoft.com/en-gb/library/fsk896zz.aspx, in what situations do you actually need a debug build during development?

While you can debug the release configuration, the settings in the release configuration are for the release build (and probably should be seen/maintained as such, through the development lifecycle).
Changing them as that article describes is a step that you will probably have to revert at some point, unless shipping debugging information to your clients is what you want to do.
In some projects there are three maintained build configurations:
debug: no optimizations and full diagnostic information (optimized for day-to-day work by the developers)
release: build what the clients will see/buy
release with debug symbols (similar to the link you ask about): this is for testing; the QA team will test something as similar as possible to what the clients will see, but in case it doesn't work, developers should have enough context information to investigate the issue.
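For a rough picture of how those three configurations might map onto compiler switches, here is a sketch of one common MSVC setup (the exact flags are an assumption about a typical project, not the only reasonable choice):

rem debug: no optimization, debug CRT, asserts and checked iterators active
cl /Od /Zi /MDd app.cpp
rem release: what the clients get
cl /O2 /MD /DNDEBUG app.cpp
rem release with debug symbols: same code generation as release, plus a PDB
cl /O2 /Zi /MD /DNDEBUG app.cpp /link /DEBUG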

A lot depends on the type of application, but normally, you won't want two different builds; you want to work and debug on the same build you deliver. What you call it is up to you; you generally don't need a name for it, since it is the only configuration you use.
This will typically be fairly close to what Microsoft calls the Debug build; it will have assert active, for example, and not do much optimizing.
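For example, whether assert is active is controlled entirely by the NDEBUG macro, so the single configuration you work in can simply leave it undefined; a minimal illustration:

#include <cassert>
#include <cstdlib>

int parse_positive(const char* s) {
    int value = std::atoi(s);
    // Compiled in while NDEBUG is not defined (the single working configuration);
    // defining NDEBUG removes the check entirely, leaving no trace in the binary.
    assert(value > 0 && "expected a positive number");
    return value;
}

int main() {
    return parse_positive("42") == 42 ? 0 : 1;
}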
The exception is when performance is an issue. If you find yourself in a case where you cannot afford to leave asserts active, bounds checking in arrays, etc., and you need optimization in the code you deliver, you will probably want to have two builds: one for testing and debugging, and one that you deliver (which will also require testing). The reason is, of course, that it is very difficult to debug optimized code, since the generated code doesn't always correspond too closely to what you have written. Also, a lot of debuggers (including both VS and gdb, I think) are incapable of showing the values of variables that the compiler has optimized into a register.
In many such cases, you may also want to create three builds; iterator validation can be very, very expensive, and you may want a build which turns that off but still does no optimizing. (It's very painful to debug if you need to wait 20 minutes to reach the spot where the program fails.)

The optimizations performed for the Release build make debugging harder.
For example,
a = c;
b = d;
In a Debug build, the compiled code will consist of four instructions:
read c
write a
read d
write b
That is fairly straightforward: when stepping through the program line by line, executing the first line runs instructions 1 and 2, and you can then examine the program state at that point.
In contrast, a Release build might notice that a and b are right next to each other in memory, as are c and d, so the compiler can use a single access to read both c and d in one go, and a single access to write both a and b.
Now we have only two instructions, and there is no clear mapping between source code lines and machine instructions. If you ask the debugger to step over the first line, it will execute both instructions (so you can see the result), but you never get the exact state between the two lines.
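To make the example above compilable, here is one way it could look; whether the optimizer actually fuses the two assignments into a single wider load and store depends on the compiler, the target, and the optimization level:

#include <cstdint>
#include <cstdio>

struct Dst { std::uint32_t a, b; };   // a and b adjacent in memory
struct Src { std::uint32_t c, d; };   // c and d adjacent in memory

void copy(Dst& dst, const Src& src) {
    dst.a = src.c;   // debug build: its own read and write
    dst.b = src.d;   // an optimizer may merge both lines into one 64-bit move
}

int main() {
    Src s{1, 2};
    Dst d{};
    copy(d, s);
    std::printf("%u %u\n", (unsigned)d.a, (unsigned)d.b);
    return 0;
}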
This is a simple example. Typically, the optimizer will reorder and interleave instructions so that the CPU is kept optimally busy.
This especially means pulling memory reads to the front so they have a chance of being executed before the data is used in a calculation (otherwise, the CPU has to stop there and wait for the memory access to complete), mixing floating-point and integer operations (because these are run on different circuitry and can be parallelized), and calculating conditions for conditional jumps as early as possible (so the instruction prefetch mechanism knows whether to follow the jump or not).
In short, while debugging using a Release build is possible and sometimes necessary to reproduce customer bug reports, you really want the predictable behavior of a Debug build most of the time.

Related

Debug code taking several orders of magnitude longer to run than release code

I have a large bit of code that takes about 5 minutes to run in debug mode of Visual Studio, and about 10 seconds to run in release mode.
This becomes an enormous issue when I have to debug code at the end of the program, where I have to wait far too long just for the program to hit the breakpoint.
I gave serialization a shot, and used boost::serialize to serialize all the variables before the debug code, but it turns out that deserializing all those variables still takes a minute or two.
So what gives? I'm aware that many optimizations and inlining are disabled when running code in debug mode, but it strikes me as very peculiar that the code takes almost two orders of magnitude longer in debug mode. Are there any hacks or tricks programmers use to bypass this wait time? I know there are lots of programs out there much more computationally intensive than mine, but I highly doubt their developers wait 5 minutes just for their debug code to hit a breakpoint.
I have a large bit of code that takes about 5 minutes to run in debug mode of Visual Studio, and about 10 seconds to run in release mode.
That's normal.
So what gives? I'm aware that many optimizations and inline stuff is disabled when running code in debug mode,
That isn't all. In addition, MSVC inserts MANY sanity checks, especially where STL containers are involved. For example, it will warn you about incompatible iterators, a broken ordering comparator in std::map, and many other similar issues. I think it also detects memory corruption to some extent, as well as buffer overruns, out-of-range access to std::vector, etc. This can be useful, but the overhead is massive. Throw a profiler on top of that and your 10 seconds may well take 30 minutes to finish. And that will also be normal.
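As a tiny illustration of the "incompatible iterators" check mentioned above, the following is undefined behaviour in standard C++; MSVC's debug-mode checked iterators catch it with a runtime assertion, and paying for that bookkeeping on every iterator operation is part of where the slowdown comes from:

#include <vector>

int main() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{1, 2, 3};
    // Comparing iterators from two different containers is undefined behaviour.
    // A debug build with checked iterators asserts here ("iterators incompatible");
    // a release build does no such bookkeeping and just compares the pointers.
    bool same = (a.begin() == b.begin());
    return same ? 1 : 0;
}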
Are there any hacks or something programmers use to bypass this wait time?
Aside from using the wait in place of the #1 programmer excuse...
You could build a debug version of your code with MinGW; it doesn't insert this kind of sanity check.
You could also dig into the STL sources and see which macros enable all those checks. It is quite possible that they can be disabled, and quite possible that the macros are documented somewhere on MSDN (see the sketch after this list).
You could try to find an alternative STL implementation for debug mode.
You could also build the release configuration with debug info and debug that instead.
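Regarding the macros: in recent MSVC versions the checks are largely governed by _ITERATOR_DEBUG_LEVEL (which replaced the older _SECURE_SCL / _HAS_ITERATOR_DEBUGGING pair), so a "fast debug" translation unit could look roughly like the sketch below. This is an assumption about your setup, and every translation unit and library you link must agree on the value, or MSVC will reject the mix at link time:

// Must be defined before any standard header is included; defining it
// project-wide (e.g. /D_ITERATOR_DEBUG_LEVEL=0) is safer than per-file.
#define _ITERATOR_DEBUG_LEVEL 0

#include <vector>

int main() {
    std::vector<int> v(1000, 1);
    long long sum = 0;
    for (int x : v)
        sum += x;           // no per-access iterator/bounds bookkeeping now
    return sum == 1000 ? 0 : 1;
}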
OP here. I was messing around with release builds running without the debugger attached and found that with VS2010 Ultimate (possibly Express as well), when a program crashes it shows a prompt asking whether you want to debug or close the program (before that it asks whether you want to abort, retry, or ignore; choose ignore). Clicking Debug and selecting the currently open solution in Visual Studio loads the code and behaves as if the crash had occurred while the program was being debugged.
This essentially means that if you put intentional glitches in your code where you would want a breakpoint, you can run the code in fast release mode without the debugger attached and start debugging after the program crashes. To force a crash, I created an empty vector and had the code access an element of it.
However, this method has a major drawback: it is one-time use. After the program crashes and you start debugging it, you cannot do much more than view the watch list and other variables, which means no further breakpoints can be set, since you're technically not running a debug-enabled process.
Sure, it's a pretty big limitation, but that doesn't mean the method doesn't have its uses.
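A slightly less destructive variant of the intentional-crash trick, assuming MSVC: the __debugbreak() intrinsic raises a breakpoint exception, which brings up the same just-in-time-debugging prompt when no debugger is attached, without relying on an out-of-range vector access:

#include <cstdio>

#if defined(_MSC_VER)
#include <intrin.h>          // __debugbreak
#endif

void checkpoint(bool condition) {
#if defined(_MSC_VER)
    if (!condition)
        __debugbreak();      // stops here under a debugger; otherwise offers to attach one
#else
    (void)condition;         // no-op on other compilers
#endif
}

int main() {
    std::printf("running the fast release build, no debugger attached\n");
    checkpoint(false);       // stand-in for "the place I would have set a breakpoint"
    return 0;
}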
It depends on what you want to debug. If you're prepared to tolerate some strange behavior, you can usually customize the configuration to suit: try turning on optimization and using release-mode libraries while keeping debug info enabled.

How could running code in the debugger make it faster?

This has never happened to me before. In Visual Studio, I have a piece of code that is executed 300 times; I time every iteration with the performance counter, and then average the results.
If I run the code in the debugger I get an average of 1.01 ms; if I run it without the debugger I get 1.8 ms.
I closed all other apps, I rebooted, I tried it many times: Always the same timing.
I'm trying to optimize my code, but before throwing me into changing the code, I want to be sure of my timings. To have something to compare with.
What can cause that strange behaviour?
Edit:
Some clarification:
I'm running the same compiled piece of code, the release build. The only difference is how I start it (F5 vs. Ctrl-F5).
So compiler optimization should not be involved.
Since each measured time was very small, I changed the way I benchmark: I now time the 300 iterations as a whole and divide by 300. I get the same result.
About caching: the code does some image cross-correlation, with different images at each iteration. The processing steps are not affected by the data in the images, so I don't think caching is the problem.
I think I figured it out.
If I add a Sleep(3000) before running the tests, they give the same result.
I think it has something to do with the loading of misc. dlls. In the debugger, the dlls were loaded before any code was executed. Outside the debugger, the dlls were loaded on demand, and one or more were loaded after the timer was started.
Thanks all.
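Given that explanation, one way to keep the comparison fair is to force the lazy work (DLL loading, page faults, first allocations) to happen before the timer starts, for example with an untimed warm-up pass; a sketch using QueryPerformanceCounter, with run_iteration() standing in for the real image cross-correlation:

#include <windows.h>
#include <cstdio>

void run_iteration() { /* the real workload goes here */ }

int main() {
    run_iteration();                    // warm-up: pulls in lazily loaded DLLs,
                                        // faults in pages, primes allocators/caches
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);

    const int iterations = 300;
    for (int i = 0; i < iterations; ++i)
        run_iteration();

    QueryPerformanceCounter(&stop);
    double totalMs = (stop.QuadPart - start.QuadPart) * 1000.0 / freq.QuadPart;
    std::printf("average: %.3f ms per iteration\n", totalMs / iterations);
    return 0;
}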
I don't think anyone has mentioned this yet, but the debug build may affect not only the way your code executes but also the way the timer itself executes. This can make the timer inaccurate, slower, or simply unreliable. I would recommend using a profiler, as others have mentioned, and comparing only like-for-like configurations.
You are likely to get very erroneous results by doing it this way ... you should be using a profiler. You should read this article entitled The Perils of MicroBenchmarking:
http://blogs.msdn.com/shawnhar/archive/2009/07/14/the-perils-of-microbenchmarking.aspx
It's probably a compiler optimization that's actually making your code worse. This is extremely rare these days but if you're doing odd, odd stuff, this can happen.
Some debugger / IDEs like Visual Studio will automatically zero out memory for you in Debug mode; this may be a contributing factor.
Are you running the exact same binary in the debugger and outside it, or a debug build inside the debugger and a release build outside? In the latter case the code isn't the same. If you are comparing debug and release builds and seeing the difference, you could turn off optimization in the release build and see what that does, or run your code under a profiler in both debug and release and see what changes.
The debug version usually initializes variables (typically to 0 or to a recognizable fill pattern), while a release binary does not initialize variables unless the code explicitly does. This may affect what the code is doing: the size of a loop, or a whole host of other possibilities.
Set the warning level to the highest level (level 4, default 3).
Set the flag that says treat warnings as errors.
Recompile and re-test.
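A small example of the kind of bug those warnings catch, and of why "works in debug" can just be luck: the value read below is indeterminate, and debug builds tend to fill fresh stack memory with a recognizable pattern (0xCC with MSVC's /RTC runtime checks) while an optimized build reuses whatever register or stack slot was handy:

#include <cstdio>

int sum_first_n(const int* data, int n) {
    int total;                  // BUG: never initialized (warning C4700 / -Wuninitialized)
    for (int i = 0; i < n; ++i)
        total += data[i];       // the first addition reads an indeterminate value
    return total;
}

int main() {
    int data[3] = {1, 2, 3};
    std::printf("%d\n", sum_first_n(data, 3));
    return 0;
}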
Before you dive into an optimization session get some facts:
Does it make a difference? Does the application really run twice as slowly, measured over a reasonable length of time?
How are the debug and release builds configured?
What is the state of the project? Is it a complete piece of software, or are you profiling a single function?
How are you running the debug and release builds? Are you sure you are testing under the same conditions (e.g. process priority settings)?
Suppose you do optimize the code: what do you have in mind?
Having read your additional data, a distant bell started to ring...
When running a program in the debugger, it will catch both C++ exceptions and structured exceptions (Windows SEH).
One event that triggers a structured exception is a divide by zero. It is possible that the debugger quickly catches and dismisses this event (as first-chance exception handling), while the code running outside the debugger takes a bit longer before anything deals with it.
So if your code might be generating such or similar exceptions, it is worth looking into.

Common reasons for bugs in release version not present in debug mode

What are the typical reasons for bugs and abnormal program behavior that manifest themselves only in release compilation mode but which do not occur when in debug mode?
Many times, in a C++ debug build, variables and memory are zero- or pattern-initialized, whereas the same does not happen in release mode unless the code does it explicitly.
Check for any debug macros and uninitialized variables
If your program uses threading, optimization can also cause some issues in release mode.
Also check all your exception handling. This is not directly related to release mode, but it is easy to ignore critical exceptions such as memory access violations in VC++, and the same thing can be an issue on other operating systems such as Linux or Solaris. Ideally, your program should not catch critical errors like dereferencing a NULL pointer.
A common pitfall is using an expression with side effects inside an ASSERT.
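For example (assuming the standard assert from <cassert>, which NDEBUG removes completely):

#include <cassert>
#include <cstdio>

bool advance_cursor(int& pos) {
    ++pos;
    return pos < 10;
}

int main() {
    int pos = 0;
    // BUG: the side effect lives inside the assert. With NDEBUG defined
    // (the usual release setting) the whole expression disappears, so pos
    // is never advanced and release behaves differently from debug.
    assert(advance_cursor(pos));
    std::printf("pos = %d\n", pos);   // 1 in debug, 0 in release
    return 0;
}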
I've been bitten by a number of bugs in the past that have been fine in Debug builds but crash in Release builds. There are many underlying causes (including of course those that have already been summarised in this thread) and I've been caught out by all of the following:
Member variables or member functions in an #ifdef _DEBUG, so that a class has a different size in the debug build (see the sketch after this answer). Sometimes #ifndef NDEBUG is used instead, which ties the difference to the release build
Similarly, there's a different #ifdef which happens to be only present in one of the two builds
The debug version uses debug versions of the system libraries, especially the heap and memory allocation functions
Inlined functions in a release build
Order of inclusion of header files. This shouldn't cause problems, but if you have something like a #pragma pack that hasn't been reset then this can lead to nasty problems. Similar problems can also occur using precompiled headers and forced includes
Caches: you may have code such as caches that only gets used in release builds, or cache size limits that are different
Project configurations: the debug and release configurations may have different build settings (this is likely to happen when using an IDE)
Race conditions, timing issues and miscellaneous side effects occurring as a result of debug-only code
Some tips that I've accumulated over the years for getting to the bottom of debug/release bugs:
Try to reproduce anomalous behaviour in a debug build if you can, and even better, write a unit test to capture it
Think about what differs between the two: compiler settings, caches, debug-only code. Try to minimise those differences temporarily
Create a release build with optimisations switched off (so you're more likely to get useful data in the debugger), or an optimised debug build. By minimising the changes between debug and release, you're more likely to be able to isolate which difference is causing the bug.
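The first cause in the list above (class members guarded by #ifdef _DEBUG) deserves a concrete sketch, because it breaks silently whenever two binaries built with different settings exchange such objects; the struct below is hypothetical:

#include <cstddef>
#include <cstdio>

// Imagine this header is shared between a library and the application using it.
struct Widget {
    int id;
#ifdef _DEBUG
    int debugSerial;    // only present in debug builds
#endif
    int value;
};

int main() {
    // A debug library and a release application (or vice versa) disagree on
    // both numbers below, so passing Widget across that boundary corrupts data.
    std::printf("sizeof(Widget) = %zu, offsetof(value) = %zu\n",
                sizeof(Widget), offsetof(Widget, value));
    return 0;
}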
Other differences might be:
In a garbage-collected language, the collector is usually more aggressive in release mode;
Layout of memory may often be different;
Memory may be initialized differently (e.g. it could be zeroed in debug mode, or re-used more aggressively in release);
Locals may be promoted to register values in release, which can cause issues with floating point values.
Yes! If you have conditional compilation, there may be timing bugs (optimised release code versus non-optimised debug code), or memory re-use versus the debug heap.
It can, especially if you are in the C realm.
One cause could be that the DEBUG version adds code to check for stray pointers and somehow protects your code from crashing (or from behaving incorrectly). If this is the case, you should carefully check the warnings and other messages you get from your compiler.
Another cause could be optimization (which is normally on for release versions and off for debug). The code and data layout may have been optimized, and while your debug build was, for example, just accessing unused memory, the release version may now be accessing reserved memory or even jumping into code!
EDIT: I see other mentioned it: of course you might have entire code sections that are conditionally excluded if not compiling in DEBUG mode. If that's the case, I hope that is really debugging code and not something vital for the correctness of the program itself!
The CRT library functions behave differently in debug vs release (/MD vs /MDd).
For example, the debug versions often pre-fill the buffers you pass, out to the length you indicate, in order to verify your claim. Examples include strcpy_s, StringCchCopy, etc. Even if the string terminates earlier, your szDest had better be n bytes long!
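A sketch of what that looks like in practice, assuming MSVC's CRT: the claimed destination size below is a lie, and while a release build copies the three bytes of "hi" and moves on, the debug CRT fills the destination out to the claimed length and stomps the neighbouring field:

#include <cstdio>
#include <cstring>

struct Record {
    char name[8];
    char status[8];     // sits directly after name in memory
};

int main() {
    Record r{};
    strcpy_s(r.status, "OK");

    // BUG: we claim name is 16 bytes long, but it is only 8. "hi" fits either
    // way, so the release CRT is happy; the debug CRT fills all 16 claimed
    // bytes as a sanity check and overwrites r.status in the process.
    strcpy_s(r.name, 16, "hi");

    std::printf("name=%s status=%s\n", r.name, r.status);
    return 0;
}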
Sure, for example, if you use constructions like
#if DEBUG
//some code
#endif
You'd need to give a lot more information, but yes, it's possible. It depends on what your debug version does. You may well have logging or extra checks in it that don't get compiled into the release version. These debug-only code paths may have unintended side effects which change state or affect variables in strange ways. Debug builds usually run slower, so this may affect threading and hide race conditions. Likewise, with straightforward optimisations in a release compile, it's possible (although unlikely these days) that a release compile may short-circuit something as an optimisation.
Without more details, I will assume that "not OK" means that it either does not compile or throws some sort of error at runtime. Check if you have code that relies on the compilation version, either via #if DEBUG statements or via methods marked with the Conditional attribute.
In .NET, even if you don't use conditional compilation like #if DEBUG, the compiler is still a lot more liberal with optimisations in release mode than it is in debug mode, which can lead to release-only bugs as well.
That is possible, if you have conditional compilation such that the debug code and release code are different, and there is a bug in code that is only used in release mode.
Other than that, it's not possible. There are differences in how debug code and release code are compiled, and differences in how code is executed when run under a debugger or not, but if any of those differences cause anything other than a performance difference, the problem was there all along.
In the debug version the error might not be occurring (because the timing or memory allocation is different), but that doesn't mean the error is not there. There may also be other factors, not related to debug mode, that change the timing of the code and cause the error to appear or not, but it all boils down to the fact that if the code were correct, the error would not occur in any of these situations.
So, no, the debug version is not OK just because you can run it without getting an error. If an error occurs when you run it in release mode, it's not because of the release mode, it's because the error was there from the start.
There are compiler optimizations that can break valid code because they are too aggressive.
Try compiling your code with less optimization turned on.
In a non-void function, all execution paths should end with a return statement.
If you forget to end such a path with a return statement, the behaviour is undefined; in a debug build the function often happens to return 0.
In a release build, however, the function may return garbage, which can change how your program runs.
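A tiny reproduction; note that flowing off the end of a value-returning function is undefined behaviour, and most compilers flag it (MSVC C4715, GCC/Clang -Wreturn-type), which is one more argument for treating warnings as errors:

#include <cstdio>

int classify(int x) {
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    // BUG: no return for x == 0. A debug build may happen to leave 0 in the
    // return register; an optimized build can return garbage or misbehave
    // in far stranger ways.
}

int main() {
    std::printf("%d\n", classify(0));
    return 0;
}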
It's possible. If it happens and no conditional compilation is involved, then you can be pretty sure that your program is wrong and is working in debug mode only because of fortuitous memory initialization or even memory layout!
I just experienced this when calling an assembly function that didn't restore the registers' previous values.
In the "Release" configuration, VS was compiling with /O2, which optimizes the code for speed. Some local variables were therefore kept purely in CPU registers, which were clobbered by the aforementioned function, leading to serious memory corruption.
Anyhow, check whether you are indirectly messing with CPU registers anywhere in your code.
Another reason could be DB calls.
Are you saving and then updating the same record multiple times in the same thread? It's possible the update failed or didn't work as expected because the earlier create command was still being processed, so the update couldn't find the record yet.
This often won't happen while debugging, because the slower, debugger-paced execution gives the pending operation time to complete before the update runs.
I remember a while ago when we were building DLLs and PDBs in C/C++.
I remember this:
Adding logging would sometimes make the bug move or disappear, or make a completely different error appear (so it was not really an option).
Many of these errors were related to char buffer handling in strcpy and strcat, arrays of char[], etc.
We weeded some out by running a bounds checker and simply fixing the memory allocation/deallocation issues.
Many times, we systematically went through the code and fixed a char allocation.
My two cents is that it is related to memory allocation and management, and to the constraints on and differences between Debug mode and release mode.
And then we kept going through that cycle.
We sometimes temporarily swapped debug versions of the DLLs into the release, so as not to hold up production while working on these bugs.

How does "Edit and continue" work in Visual Studio?

I have always found this to be a very useful feature in Visual Studio. For those who don't know about it, it allows you to edit code while you are debugging a running process, re-compile the code while the binary is still running and continue using the application seamlessly with the new code, without the need to restart it.
How is this feature implemented? If the code I am modifying is in a DLL loaded by the application, does the application simply unload the DLL and reload it again? This would seem to me like it would be prone to instability issues, so I assume it would be smarter than this. Any ideas?
My understanding is that when the app is compiled with support for Edit and Continue enabled, the compiler leaves extra room around the functions in the binary image to allow for adding additional code. Then the debugger can compile a new version of the function, replace the existing version (using the padding space as necessary), fix up the stack, set the instruction pointer, and keep going. That way you don't have to fix up any jump pointers, as long as you have enough padding.
Note that Edit and Continue doesn't usually work on code in libs/dlls, only with the main executable code.
My guess is that it recompiles the app (and for small changes this wouldn't mean very much would have to be recompiled). Then since Microsoft makes both the compiler and debugger they can make guarantees about how memory and the like are laid out. So, they can use the debugging API to re-write the code segments with the new ones as long as the changes are small enough.
If the changes redirect to entirely new code, this can obviously be loaded into memory in a similar style as DLLs.
Microsoft also has a mechanism for "hot-patching". Hot-patchable functions start with a 2-byte no-op, usually "mov edi, edi", before any real code. This allows execution of the function to be redirected cleanly, and may be an option here as well.
The key thing to remember is that the application isn't "running": all its threads are in the stopped state. So as far as the process is concerned, any modifications the debugger makes are entirely atomic.
Of course, this is all speculation ;)
My guess is all objects are aligned to a 4096 byte memory boundary. So if you make small changes to some code then the objects will still be within those boundaries and therefore run as before.
I've had instances where changing a couple of lines will cause a full recompile and link and others where a fairly substantial refactoring of a function will e&c just fine.

Why does a C/C++ program often have optimization turned off in debug mode?

In most C or C++ environments, there is a "debug" mode and a "release" mode compilation.
Looking at the difference between the two, you find that the debug mode adds the debug symbols (often the -g option on lots of compilers) but it also disables most optimizations.
In "release" mode, you usually have all sorts of optimizations turned on.
Why the difference?
Without any optimization on, the flow through your code is linear. If you are on line 5 and single step, you step to line 6. With optimization on, you can get instruction re-ordering, loop unrolling and all sorts of optimizations.
For example:
void foo() {
1:    int i;
2:    for (i = 0; i < 2; )
3:        i++;
4:    return;
}
In this example, without optimization, you could single step through the code and hit lines 1, 2, 3, 2, 3, 2, 4
With optimization on, you might get an execution path that looks like: 2, 3, 3, 4 or even just 4! (The function does nothing after all...)
Bottom line, debugging code with optimization enabled can be a royal pain! Especially if you have large functions.
Note that turning on optimization changes the code! In certain environment (safety critical systems), this is unacceptable and the code being debugged has to be the code shipped. Gotta debug with optimization on in that case.
While the optimized and non-optimized code should be "functionally" equivalent, under certain circumstances, the behavior will change.
Here is a simplistic example:
int* ptr = (int*)0xdeadbeef;   // some address of a memory-mapped I/O device
*ptr = 0;                      // set up the hardware device
while (*ptr == 1) {            // loop until the hardware device is done
    // do something
}
With optimization off, this is straightforward, and you kinda know what to expect.
However, if you turn optimization on, a couple of things might happen:
The compiler might optimize the while block away (we initialize *ptr to 0, so as far as it can tell it will never be 1)
Instead of accessing memory every time, the value might be kept in a register, so the I/O device is never actually read or written again
The memory access might be cached (not necessarily compiler-optimization related)
In all these cases, the behavior would be drastically different and most likely wrong.
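For this particular memory-mapped I/O case, the usual fix is to make the pointer volatile, which forces the compiler to keep every read and write in the generated code even at high optimization levels (it does not address hardware-level caching or ordering, which need platform-specific mechanisms):

#include <cstdint>

int main() {
    // Hypothetical memory-mapped device register, as in the example above.
    volatile std::uint32_t* ptr =
        reinterpret_cast<volatile std::uint32_t*>(0xdeadbeef);

    *ptr = 0;               // the store can no longer be optimized away
    while (*ptr == 1) {     // the load is re-issued on every iteration
        // wait for the hardware device
    }
    return 0;
}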
Another crucial difference between debug and release is how local variables are stored. Conceptually, local variables are allocated storage in a function's stack frame. The symbol file generated by the compiler tells the debugger the offset of the variable in the stack frame, so the debugger can show it to you; the debugger peeks at that memory location to do so.
However, this means every time a local variable is changed the generated code for that source line has to write the value back to the correct location on the stack. This is very inefficient due to the memory overhead.
In a release build the compiler may assign a local variable to a register for a portion of a function. In some cases it may not assign stack storage for it at all (the more registers a machine has the easier this is to do).
However, the debugger doesn't know how registers map to local variables for a particular point in the code (I'm not aware of any symbol format that includes this information), so it can't show it to you accurately as it doesn't know where to go looking for it.
Another optimization would be function inlining. In optimized builds the compiler may replace a call to foo() with the actual code for foo everywhere it is used because the function is small enough. However, when you try to set a breakpoint on foo() the debugger wants to know the address of the instructions for foo(), and there is no longer a simple answer to this -- there may be thousands of copies of the foo() code bytes spread over your program. A debug build will guarantee that there is somewhere for you to put the breakpoint.
Optimizing code is an automated process that improves the runtime performance of the code while preserving semantics. This process can remove intermediate results which are unnecessary to complete an expression or function evaluation, but which may be of interest to you when debugging. Similarly, optimizations can alter the apparent control flow so that things happen in a slightly different order than what appears in the source code. This is done to skip unnecessary or redundant calculations, but the rejiggering can mess with the mapping between source code line numbers and object code addresses, making it hard for a debugger to follow the flow of control as you wrote it.
Debugging in unoptimized mode allows you to see everything you've written as you've written it without the optimizer removing or reordering things.
Once you are happy that your program is working correctly you can turn on optimizations to get improved performance. Even though optimizers are pretty trustworthy these days, it's still a good idea to build a good quality test suite to ensure that your program runs identically (from a functional point of view, not considering performance) in both optimized and unoptimized mode.
The expectation is for the debug version to be - debugged! Setting breakpoints, single-stepping while watching variables, stack traces, and everything else you do in a debugger (IDE or otherwise) make sense if every line of non-empty, non-comment source code matches some machine code instruction.
Most optimizations mess with the order of machine codes. Loop unrolling is a good example. Common subexpressions can be lifted out of loops. With optimization turned on, even the simplest level, you may be trying to set a breakpoint on a line that, at the machine code level, doesn't exist. Sometime you can't monitor a local variable due to it being kept in a CPU register, or perhaps even optimized out of existence!
If you're debugging at the instruction level rather than the source level, it's an awful lot easier to map unoptimized instructions back to the source. Also, compilers are occasionally buggy in their optimizers.
In the Windows division at Microsoft, all release binaries are built with debugging symbols and full optimizations. The symbols are stored in separate PDB files and do not affect the performance of the code. They don't ship with the product, but most of them are available at the Microsoft Symbol Server.
Another of the issues with optimization is inlined functions: when single-stepping, you will constantly find yourself walking through them.
With GCC, with debugging and optimizations enabled together, if you don't know what to expect you will think the code is misbehaving and re-executing the same statement multiple times; it happened to a couple of my colleagues.
Also, the debugging info GCC produces with optimizations on tends to be of poorer quality than it could be.
However, in languages hosted by a Virtual Machine like Java, optimizations and debugging can coexist - even during debugging, JIT compilation to native code continues, and only the code of debugged methods is transparently converted to an unoptimized version.
I would like to emphasize that optimization should not change the behaviour of the code, unless the optimizer in use is buggy, or the code itself is buggy and relies on partially undefined semantics; the latter is more common in multithreaded programming or when inline assembly is also used.
Code with debugging symbols is larger, which may mean more cache misses, i.e. slower execution, which may be an issue for server software.
At least on Linux (and there's no reason why Windows should be different), debug info is packaged in separate sections of the binary and is not loaded during normal execution. It can be split out into a different file to be used for debugging.
Also, on some compilers (including GCC, and I guess also Microsoft's C compiler), debug info and optimizations can both be enabled together. If not, obviously the code is going to be slower.