I have C++ code that segfaults when built with optimization flags, but not when built with debug flags. This precludes me from using a debugger. Are there any other approaches or guidelines, apart from a barrage of cout statements?
I am on a *nix platform and using intel-12.1 compilers and I am quite certain that it is a memory issue that I need to catch with valgrind. The only thing that puzzles me is why it does not show in the debug mode.
Valgrind is a useful tool on Unix-based systems for troubleshooting release-mode executables (gflags and WinDbg are useful on Windows).
I also recommend not giving up on your debugger: you can run non-debug executables inside a debugger and still get useful information about the segfault. Often you can also add some level of debug information, even with optimizations turned on, to give you more context. You might also check for any debug-mode heap-checking facility that the Intel compiler provides, as memory errors can go undetected in debug builds (due to different memory management).
Also note that there are usually multiple levels of optimization you can use for "release mode". You might try backing down to a less aggressive optimization level, and see if the error still occurs.
You might also check the Intel compiler web site to see if there have been any bug fixes or bug reports regarding optimization for the compiler version you're using.
If none of these help, you can try using an alternate compiler (unless you're using something Intel-specific) to see if the problem is compiler-related or not.
Finally, as klm123 noted, commenting out blocks is a good way to localize the problem.
Currently using VSCode, g++, C++20, Ubuntu 20.04 LTS.
What compiler flags can I use for release builds and debug builds separately? Do I turn off every optimization flag for debug builds? Or does it not really matter? I would appreciate any advice, recommendations, or feedback as I couldn't find much on my own.
Do I turn off every optimization flag for debug builds?
Yes, I would say that is the best way to go, and it does really matter! Depending on your code, your understanding of the compiler/debugger and level of optimisation chosen, the experience of debugging it will vary from mildly annoying to frustrating and almost useless. This answer gives a synopsis of the different levels for gcc and this question has several answers going into more detail about optimisations.
As a summary, the compiler is in general allowed to modify your code in any way it sees fit, as long as it still behaves as if all your statements were executed exactly as written. In practice, -O1 already enables dozens of techniques and -O2 and -O3 will probably leave almost nothing untouched, which makes it harder to pinpoint issues because:
Stepping through code may visit statements in a different order or skip them entirely, also hindering the use of breakpoints;
Function calls may disappear because they were inlined, and no longer be callable from the debugging prompt;
Local variables tend to have shorter lifetimes than in your source code, so you can't always query their values.
I personally build with CMake and primarily use two of its build types:
Debug (-g): No optimisations, compiles runtime assert statements;
RelWithDebInfo (-O2 -g -DNDEBUG): Fast code without these assertions that is harder to debug, but suitable for performance analysis once your program is working correctly.
I understand that valgrind can call memcheck to perform memory leak check, and in this case the compiled C++ executable program must contain debug information. Then, if I want to use valgrind/callgrind to perform profiling, must the executable contain debug information? I have run a small test, and it seems that valgrind/callgrind can work on release executable programs without debug information. Can anyone confirm it?
The official Valgrind documentation says the following:
2.2. Getting started
First off, consider whether it might be beneficial to recompile your application and supporting libraries with debugging info enabled (the -g option).
Without debugging info, the best Valgrind tools will be able to do is guess which function a particular piece of code belongs to, which makes both error messages and profiling output nearly useless. With -g, you'll get messages which point directly to the relevant source code lines.
Another option you might like to consider, if you are working with C++, is -fno-inline. That makes it easier to see the function-call chain, which can help reduce confusion when navigating around large C++ apps. For example, debugging OpenOffice.org with Memcheck is a bit easier when using this option. You don't have to do this, but doing so helps Valgrind produce more accurate and less confusing error reports. Chances are you're set up like this already, if you intended to debug your program with GNU GDB, or some other debugger.
Hence the recommended step is to recompile your program with the -g option to get the maximum information from Valgrind.
According to the valgrind manual:
http://valgrind.org/docs/manual/manual-core.html
If you are planning to use Memcheck: On rare occasions, compiler optimisations (at -O2 and above, and sometimes -O1) have been observed to generate code which fools Memcheck into wrongly reporting uninitialised value errors, or missing uninitialised value errors. We have looked in detail into fixing this, and unfortunately the result is that doing so would give a further significant slowdown in what is already a slow tool. So the best solution is to turn off optimisation altogether. Since this often makes things unmanageably slow, a reasonable compromise is to use -O. This gets you the majority of the benefits of higher optimisation levels whilst keeping relatively small the chances of false positives or false negatives from Memcheck. Also, you should compile your code with -Wall because it can identify some or all of the problems that Valgrind can miss at the higher optimisation levels. (Using -Wall is also a good idea in general.) All other tools (as far as we know) are unaffected by optimisation level, and for profiling tools like Cachegrind it is better to compile your program at its normal optimisation level.
I've seen posts talk about what might cause differences between Debug and Release builds, but I don't think anybody has addressed from a development standpoint what is the most efficient way to solve the problem.
The first thing I do when a bug appears in the Release build but not in Debug is run my program through valgrind in hopes of a better analysis. If that reveals nothing (and this has happened to me before), then I try various inputs in hopes of getting the bug to surface in the Debug build as well. If that fails, I try to track changes to find the most recent version for which the two builds diverge in behavior. And finally, I guess, I resort to print statements.
Are there any best software engineering practices for efficiently debugging when the Debug and Release builds differ? Also, what tools are there that operate at a more fundamental level than valgrind to help debug these cases?
EDIT: I notice a lot of responses suggesting some general good practices such as unit testing and regression testing, which I agree are great for finding any bug. However, is there something specifically tailored to this Release vs. Debug problem? For example, is there such a thing as a static analysis tool that says "Hey, this macro or this code or this programming practice is dangerous because it has the potential to cause differences between your Debug/Release builds?"
One other "Best Practice", or rather a combination of two: Have Automated Unit Tests, and Divide and Conquer.
If you have a modular application, and each module has good unit tests, then you may be able to quickly isolate the errant piece.
The very existence of two configurations is a problem from a debugging point of view. Proper engineering would be such that the system on the ground and in the air behave the same way, and it achieves this by reducing the number of ways in which the system can tell the difference.
Debug and Release builds differ in 3 aspects:
_DEBUG define
optimizations
different version of the standard library
The best way around, the way I often work, is this:
Disable optimizations where performance is not critical. Debugging is more important. Most importantly, disable automatic function inlining, keep standard stack frames, and turn off variable-reuse optimizations. These interfere with debugging the most.
Monitor code for dependence on the DEBUG define. Never use debug-only asserts, or any other tools sensitive to the DEBUG define.
By default, compile and work in release.
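One way to follow the "no debug-only asserts" advice is a check macro that does not look at NDEBUG or _DEBUG at all. The following is only a minimal sketch; `ALWAYS_CHECK`, `report_check_failure`, `g_check_failures` and `safe_divide` are hypothetical names chosen for illustration.

```cpp
#include <cassert>  // for comparison only: assert() is compiled out under NDEBUG
#include <cstdio>

// Hypothetical counter so callers can observe how many checks failed.
static int g_check_failures = 0;

// Hypothetical helper: reports a failed check in every build configuration.
static void report_check_failure(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
    ++g_check_failures;
    // A stricter variant could call std::abort() here.
}

// Unlike assert(), this macro ignores NDEBUG/_DEBUG, so the condition is
// evaluated exactly once in debug and release builds alike.
#define ALWAYS_CHECK(cond) \
    do { if (!(cond)) report_check_failure(#cond, __FILE__, __LINE__); } while (0)

// Example use: the check stays active even when compiled with -DNDEBUG.
static int safe_divide(int a, int b) {
    ALWAYS_CHECK(b != 0);
    return b == 0 ? 0 : a / b;
}
```

Calling `safe_divide(1, 0)` prints a diagnostic to stderr and increments the counter in both configurations, so the two builds cannot diverge because of this check.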
When I come across a bug that only happens in release, the first thing I always look for is use of an uninitialized stack variable in the code that I am working on. On Windows, a debug build will automatically fill uninitialized stack variables with a known bit pattern (0xCCCCCCCC with MSVC's run-time checks; 0xCDCDCDCD is the debug heap's fill for fresh allocations). In release, stack variables will contain whatever value was last stored at that memory location, which is going to be an unexpected value.
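As a small illustration of the pattern described above, the sketch below keeps the buggy form in a comment (reading an uninitialized variable is undefined behaviour, so it is not executed) and shows the fixed form. `count_negative` is a hypothetical function used only for this example.

```cpp
#include <cassert>

// Buggy shape, shown as a comment only:
//
//     int count;                          // never initialised
//     for (...) if (values[i] < 0) ++count;
//
// A debug build may fill `count` with a fixed pattern, while a release build
// leaves whatever happened to be on the stack, so the two builds disagree.

// Fixed version: value-initialise the local so every build starts from 0.
int count_negative(const int* values, int n) {
    assert(values != nullptr || n == 0);
    int count{};  // guaranteed to be 0
    for (int i = 0; i < n; ++i) {
        if (values[i] < 0) ++count;
    }
    return count;
}
```

For example, an array holding 3, -1, -7, 0 yields a count of 2 regardless of the build configuration.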
Secondly, I will try to identify what is different between debug and release builds. I look at the optimization settings that the compiler is passed in the Debug and Release configurations. You can see this in the last property page of the compiler settings in Visual Studio. I will start with the release config, and change the command line arguments passed to the compiler one item at a time until they match the command line that is used for compiling in debug. After each change I run the program and try to reproduce the bug. This will often lead me to the particular setting that causes the bug to happen.
A third technique can be to take a function that is misbehaving and disable optimizations around it using a pragma. This will allow you to run the program in release with that particular function compiled as in debug. The behaviour of the program built in this way will help you learn more about the bug.
#pragma optimize( "", off )
int foo() {
return 1;
}
#pragma optimize( "", on )
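The pragma above is MSVC syntax. With GCC or Clang, a roughly equivalent effect can be sketched with function attributes, assuming one of those compilers is in use. `NO_OPT` and `sum_to` are hypothetical names, and other compilers may ignore or reject these attributes.

```cpp
#include <cassert>

// NO_OPT is a hypothetical helper macro, not a standard facility.
#if defined(__clang__)
#  define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#  define NO_OPT __attribute__((optimize("O0")))
#else
#  define NO_OPT  // unknown compiler: fall back to normal optimisation
#endif

// Only this function is compiled without optimisation; the rest of the
// translation unit keeps the release settings.
NO_OPT int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

If the bug disappears once the suspect function carries `NO_OPT`, that narrows the search to that function or to the optimisations applied to it.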
From experience, the problems are usually stack initialization, memory scrubbing in the memory allocator, or strange #define directives causing the code to be compiled incorrectly.
The most obvious cause is simply the use of #ifdef and #ifndef directives tied to a DEBUG or similar symbol that changes between the two builds.
Before going down the debugging road (which is not my personal idea of fun), I would inspect both command lines and check which flags are passed in one mode and not the other, then grep my code for these flags and check their uses.
One particular issue that comes to mind are macros:
#ifdef _DEBUG_
#define CHECK(CheckSymbol) { if (!(CheckSymbol)) throw CheckException(); }
#else
#define CHECK(CheckSymbol)
#endif
also known as a soft-assert.
Well, if you call it with an expression that has a side effect, or rely on it to guard a function (contract enforcement) while some caller catches the exception it throws in debug and ignores it... you will see differences in release :)
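To make the side-effect problem concrete, here is a minimal sketch that gives the two branches of the macro above separate names so both behaviours can be compared in one program. `CHECK_DEBUG`, `CHECK_RELEASE`, `reserve_slot` and the `slots_after_*` functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

struct CheckException {};

// The two branches of the CHECK macro, named separately for comparison.
#define CHECK_DEBUG(CheckSymbol) { if (!(CheckSymbol)) throw CheckException(); }
#define CHECK_RELEASE(CheckSymbol)

// Hypothetical function with a side effect: it records that it was called.
bool reserve_slot(int& slots_used) {
    ++slots_used;
    return true;
}

int slots_after_debug_check() {
    int slots_used = 0;
    CHECK_DEBUG(reserve_slot(slots_used));    // argument is evaluated
    return slots_used;
}

int slots_after_release_check() {
    int slots_used = 0;
    CHECK_RELEASE(reserve_slot(slots_used));  // argument vanishes with the macro
    return slots_used;
}

// Safer pattern: perform the call unconditionally, check only the result.
int slots_after_safe_check() {
    int slots_used = 0;
    const bool ok = reserve_slot(slots_used);
    CHECK_RELEASE(ok);
    (void)ok;  // avoids an unused-variable warning when the check is empty
    return slots_used;
}
```

The debug-style check leaves one slot reserved and the release-style check leaves none, which is exactly the kind of divergence described above; the last function behaves the same under either macro.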
When debug and release differ it means:
1) your code depends on _DEBUG or similar macros (defined when compiling a debug version with no optimizations)
2) your compiler has an optimization bug (I have seen this a few times)
You can easily deal with (1) by modifying the code, but with (2) you will have to isolate the compiler bug. After isolating it, you do a little "code rewriting" to get the compiler to generate correct binary code (I did this a few times; the most difficult part is isolating the bug).
I can say that debugging works once you enable debug information for the release version (though because of optimizations you might see some "strange" jumps when stepping).
You will need to have some "black-box" tests for your application - valgrind is a solution in this case. These solutions help you find differences between release and debug (which is very important).
The best solution is to set up something like automated unit testing to thoroughly test all aspects of the application (not just individual components, but real world tests which use the application the same way a regular user would with all of the dependencies). This allows you to know immediately when a release-only bug has been introduced which should give you a good idea of where the problem is.
Good practice to actively monitor and seek out problems beats any tool to help you fix them long after they happen.
However, when you have one of those cases where it's too late: too many builds have gone by, can't reproduce consistently, etc. then I don't know of any one tool for the job. Sometimes fiddling with your release settings can give a bit of insight as to why the bug is occurring: if you can eliminate optimizations which suddenly make the bug go away, that could give you some useful information about it.
Release-only bugs can fall into various categories, but the most common ones (aside from something like a misuse of assertions) are:
1) Uninitialized memory. I use this term over uninitialized variables as a variable may be initialized but still be pointing to memory which hasn't been initialized properly. For this, memory diagnostic tools like Valgrind can help.
2) Timing (ex: race conditions). These can be a nightmare to debug, but there are some multithreading profilers and diagnostic tools which can help. I can't suggest any off the bat, but there's Coverity Integrity Manager as one example.
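The distinction in item 1 between an initialized variable and initialized memory can be sketched as follows. The pointer itself is always valid here; what matters is whether the elements it points to were given a value. `make_zeroed_buffer` and `sum_buffer` are hypothetical helpers for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// With
//     std::unique_ptr<int[]> buffer(new int[n]);
// `buffer` is initialised (it holds a valid pointer), but the ints it points
// to are indeterminate. Adding `()` (or using std::make_unique<int[]>(n))
// value-initialises every element to 0.
std::unique_ptr<int[]> make_zeroed_buffer(std::size_t n) {
    return std::unique_ptr<int[]>(new int[n]());
}

// Reads every element, which is only well-defined if all were initialised.
int sum_buffer(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Running the unfixed form under Valgrind's Memcheck is the kind of case where an "uninitialised value" report would be expected once the sum is used in a branch or printed.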
So I have just followed the advice to enable debug symbols for Release mode, and after enabling debug symbols, disabling optimization and finding that breakpoints do work if symbols are compiled into a release build, I find myself wondering...
Isn't the purpose of Debug mode to help you to find bugs?
Why bother with Debug mode if it lets bugs slip past you?
Any advice?
In truth there is no such thing as a release mode or a debug mode. There are just different configurations with different options enabled. Release 'mode' and Debug 'mode' are just common configurations.
What you've done is to modify the release configuration to enable some options which are usually enabled in the debug configuration.
Enabling these options makes the binaries larger and slower according to which ones you enable.
The more of these options you enable, the easier it is to find bugs. I think your question should really be "Why bother with release mode?" The answer to that is that it's smaller and faster.
Debug mode doesn't "let bugs slip past you". It inserts checks to catch a large number of bugs, but the presence of these checks may also hide certain other bugs. All the error-checking code catches a lot of errors, but it also acts as padding, and may hide subtle bounds errors.
So that in itself should be plenty of reason to run both. MSVC performs a lot of additional error checking in Debug mode.
In addition, many debug tools, such as assert, rely on NDEBUG not being defined, which is the case in debug builds but not, by default, in release builds.
Optimisations will be turned off, which makes debugging easier (otherwise code can be reordered in strange ways). Also conditional code such as assert() etc. can be included.
Even if your application is very debuggable in release mode, the MSVC runtime libraries aren't so nice, and neither are some other libraries.
The debug heap, for instance, adds no-man's land markers around allocated memory to trap buffer overruns. The standard library used by MSVC asserts for iterator validity. And much more.
Bugs due to optimisers aren't unheard of, but normally they hint at a deeper problem; for instance, not using volatile when you need to will cause what look like optimiser bugs (comparisons optimised away and cached results used instead).
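A classic case where volatile is needed is a flag written by a signal handler and polled by normal code. The sketch below assumes a hosted environment where std::signal and std::raise are available; `g_stop_requested`, `on_signal` and `run_until_stopped` are hypothetical names. Note that for data shared between threads, volatile is not sufficient and std::atomic is the appropriate tool.

```cpp
#include <cassert>
#include <csignal>

// Written by the signal handler, read by normal code. Without `volatile`,
// the optimiser may assume the flag cannot change inside the loop and keep
// using a cached value, so the loop never ends in an optimised build.
volatile std::sig_atomic_t g_stop_requested = 0;

extern "C" void on_signal(int) { g_stop_requested = 1; }

// Polls until the flag is set and returns how many iterations ran.
// For the demo, the signal is raised from inside the loop itself.
int run_until_stopped(int raise_at) {
    assert(raise_at > 0);  // otherwise the signal would never be raised
    std::signal(SIGTERM, on_signal);
    g_stop_requested = 0;
    int iterations = 0;
    while (!g_stop_requested) {
        if (++iterations == raise_at) {
            std::raise(SIGTERM);  // handler runs before raise() returns
        }
    }
    return iterations;
}
```

With the volatile qualifier in place, the loop re-reads the flag on every iteration at any optimisation level.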
At the end of the day, including debug symbols in early releases can help you trace down bugs after deployment.
Now, to answer your questions directly:
Debug mode has other things, like assert()s which are stripped in release mode.
Even if you do all your testing in release mode, bugs will still slip by. In fact, some bugs (like the above volatile bug) are only hidden in debug mode: they still exist, they are just harder to trigger.
Including full symbols with your application includes significant information about the build machine (paths etc. are embedded).
The advice is to include "PDB Only" symbols with release builds (don't include file, line and local variable symbols) with optimisation on, while the debug build has no optimisation and full symbols.
And (as noted in another answer) common sub-expression elimination and instruction reordering can make debugging interesting (stepping to the next line goes to lines n, n+2, n+1...).
Optimization is a nightmare for debugging. Once I had an application with this for loop:
for (int i = 0; i < 10; i++)
{
//use i with something
}
In the debugger, i was always 0, but printing it to the console showed that it did increase.
Many times I work with optimized code (sometimes even involving vectorized loops), which contain bugs and such. How would one debug such code? I'm looking for any kind of tools or techniques. I use the following (possibly outdated) tools, so I'm looking to upgrade.
I use the following:
Since you cannot see the code with ddd, I use gdb with the disassemble command to look at the generated code; I can't really step through the program this way.
ndisasm
Thanks
It is always harder to debug optimised programs, but there are always ways. Some additional tips:
Make a debug build, and see if you get the same bug in a debug build. No point debugging an optimised version if you don't have to.
Use valgrind if on a platform that supports it. The errors you see may be harder to understand, but catching the problem early often simplifies debugging.
printf debugging is primitive, but sometimes it is the simplest way if you have a complex issue that only shows up in optimised builds.
If you suspect a timing issue (especially in a multithreaded program), roll your own version of assert which aborts or prints if the condition is violated, and use it in a few select places, to rule out possible problems.
See if you can reproduce the problem with -O2 or -O3 enabled but without -fomit-frame-pointer, since omitting the frame pointer makes code very hard to debug. That might give you enough information to find the cause of your problem.
Isolate parts of your code, build a test-suite, and see if you can identify any testcases which fail. It is much easier to debug one function than the whole program.
Try turning off optimisations one by one with the -fno-X options. This might help you find common problems like strict aliasing problems.
Turn on more compiler warnings. Some things, like strict aliasing problems, can generate compiler warnings if they create a difference in behaviour between different optimisation levels.
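Since strict aliasing comes up twice in the tips above, here is a minimal sketch of a typical violation and a well-defined replacement. It assumes a 32-bit IEEE 754 float, and `float_bits` is a hypothetical helper name. With GCC, -fno-strict-aliasing is the -fno-X option that would mask the buggy form.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Aliasing violation, shown as a comment only (undefined behaviour that may
// misbehave only at higher optimisation levels):
//
//     std::uint32_t bits = *reinterpret_cast<const std::uint32_t*>(&value);
//
// Well-defined alternative: copy the bytes. Compilers typically reduce this
// memcpy to a single register move.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` yields 0x3F800000 at every optimisation level, whereas the commented-out cast gives no such guarantee.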
When debugging release builds you can put in __asm nop; as a placeholder for breakpoints (int 3). This is nice as you can guarantee breakpoint locations without messing up compiler optimizations or writing printf/cout statements.
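The `__asm` form above is MSVC inline assembly. A somewhat more portable sketch of the same idea is shown below, assuming an architecture whose assembler accepts the `nop` mnemonic (x86 and ARM do). `BREAK_ANCHOR` and `clamp_to_byte` are hypothetical names.

```cpp
#include <cassert>

// BREAK_ANCHOR is a hypothetical macro: it emits one no-op instruction that
// the optimiser will not remove, giving the debugger a stable address.
#if defined(_MSC_VER)
#  include <intrin.h>
#  define BREAK_ANCHOR() __nop()
#elif defined(__GNUC__) || defined(__clang__)
#  define BREAK_ANCHOR() __asm__ volatile("nop")
#else
#  define BREAK_ANCHOR() ((void)0)  // unknown compiler: no anchor emitted
#endif

int clamp_to_byte(int value) {
    if (value > 255) {
        BREAK_ANCHOR();  // set a breakpoint on this instruction
        return 255;
    }
    if (value < 0) {
        BREAK_ANCHOR();
        return 0;
    }
    return value;
}
```

The anchors do not change the function's results; they only mark places where a breakpoint is guaranteed to correspond to a specific branch.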
It's always easier to debug a non-optimized version, of course. Failing that, disassembly of the code can be helpful. Other techniques I've used include partially de-optimizing the code by forcing intermediate results to be printed or logged, or changing a critical variable to "volatile" so I can at least look at that value in the debugger.
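The volatile trick mentioned above can be sketched like this: copying the value of interest into a volatile local forces the compiler to store it to memory on each iteration, so a debugger can usually read it even in an optimised build. `sum_squares` and `debug_i` are hypothetical names for this illustration.

```cpp
#include <cassert>

int sum_squares(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        // The volatile copy must be written to memory every iteration, so it
        // stays inspectable even if `i` itself lives only in a register or
        // has been transformed away by the optimiser.
        volatile int debug_i = i;
        total += debug_i * debug_i;
    }
    return total;
}
```

This costs some performance inside the loop, so it is meant as a temporary debugging aid rather than something to leave in place.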
Chances are what you call optimized code is scrambled to shave cycles (which makes debugging hard) but is not really very optimized. Here is an example of what I mean.
I would turn off the compiler optimization, debug and tune it yourself, and then turn compiler optimization back on if the code has hotspots that are actually in code the compiler sees (not in outside libraries). (I define a hotspot as a part of code where the PC is often found. That automatically exempts loops containing function calls because they steal away the PC.)