I have a future.hh header file, and I set a breakpoint at line 800, like the following:
795 ~future() {
796 if (_promise) {
797 _promise->_future = nullptr;
798 }
799 if (failed()) {
800 report_failed_future(state()->get_exception());
801 }
I thought that if an exception occurred during future destruction, I could get the stack trace. However, I got this:
This is not what I want. Why are there so many breakpoints? When I continue, it stops every time, which is not what I expected.
When you make a request like break future.hh:800, gdb attempts to set a breakpoint at every possible address corresponding to that source location.
In your case, what is most likely happening is that the destructor has been inlined many times, so you wind up with very many breakpoint locations. (Another less likely option here is a compiler bug causing it to emit incorrect line tables somehow.)
Compiling without optimization won't really help -- it may result in fewer breakpoint locations, but you will still see just as many stops, because all that is happening is a stop on each invocation of the destructor.
Instead, if you know that you only want to stop in certain destructors, then the best approach is to try to narrow the points at which the stops happen. A few ideas:
Put a breakpoint in the surrounding code you care about, not in this destructor
Disable some or most of the locations on the breakpoint. (In gdb, breakpoints can be individually disabled.)
Make the breakpoint conditional to try to reduce the number of undesired stops
I dislike pointers, and generally try to write as much code as I can using refs instead.
I've written a very rudimentary "vertical layout" system for a small Win32 app. Most of the Layout methods look like this:
void Control::DoLayout(int availableWidth, int &consumedYAmt)
{
textYPosition = consumedYAmt;
consumedYAmt += measureText(font, availableWidth);
}
They are looped through like so:
int innerYValue = 0;
for (const auto& control : controls) {
control->DoLayout(availableWidth, innerYValue);
}
int heightOfControl = innerYValue;
It's not drawing its content here, just calculating exactly how much space this control will require (usually it's adding padding too, etc). This has worked great for me.......in debug mode.
I found that in Release mode, I could suddenly see tangible, loggable issues where, when I'm looping through controls and calling DoLayout(), the consumedYAmt variable actually stays at 0 in the outside loop. The most annoying part is that if I put in breakpoints and walk through the code line by line, this stops happening and parts of it are properly updated by the inside "add" methods.
I'm kind of thinking about whether this would be some compiler optimization where they think I'm simply adding the ref flag to ints as a way to optimize memory; or if there's any possibility this actually works in a way different from how it seems.
I would give a minimum reproducible example, but I wasn't able to do so with a tiny commandline app. I get the sense that if this is an optimization, it only kicks in for larger code blocks and indirections.
EDIT: Again sorry for generally low information, but I'm now getting hints that this might be some kind of linker issue. I skipped one part of the inheritance model in my pseudocode: The calling class actually calls "Layout()", which is a non-virtual function on the root definition of the class. This function performs some implementation-neutral logic, and then calls DoLayout() with the same arguments. However, I'm now noticing that if I try adding a breakpoint to Layout(), Visual Studio claims that "The breakpoint will not be hit. No executable code of the debugger's target code type is associated with this line." I am able to add breakpoints to certain other lines, but I'm beginning to notice weird stepping logic where it refuses to go inside certain functions, like Layout. Already tried completely clearing the build folders and rebuilding. I'm going to have to keep looking, since I have to admit this isn't a lot to go on.
Also, random addition: The "controls" list is a vector containing shared_ptr objects. I hadn't suspected the looping mechanism previously but now I'm looking more closely.
"the consumedYAmt variable actually stays at 0"
The behavior you describe is typical for a specific optimization that's more due to the CPU than the compiler. I suspect you're logging consumedYAmt from another thread. The updates to consumedYAmt simply don't make it to that other thread.
This is legal for the CPU, because the C++ compiler didn't put in memory fences. And the compiler didn't put in fences because the variable isn't atomic.
In a small program without threads, this simply doesn't show up, nor does it show in debug mode.
Written by OP
Okay, eventually figured this one out. As simple as the issue was, pinning it down became difficult because of Release mode's debugger seemingly acting in inconsistent ways. When I changed tactic to adding Logging statements in lots of places, I found that my Control class had an "mShowing" variable that was uninitialized in its constructor. In debug mode, it apparently retained uninitialized memory which I guess made it "true" - but in release mode, my best analysis is that memory protections made it default to "false", which as it turns out skipped the main body of the DoLayout method most of the time.
Since through the process, responders were low on information to work with (certainly could've been easier if I posted a longer example), I instead simply upvoted each comment that mentioned uninitialized variables.
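For illustration, here is a minimal sketch of how a default member initializer removes this class of bug. The Control class below is a hypothetical stand-in, not the OP's real code: measureText is a placeholder that returns a fixed height, and layoutHeight is a helper invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the Control class described above.
class Control {
public:
    // Default member initializers give every constructor a defined starting
    // state, so Debug and Release builds see the same value.
    bool mShowing = true;
    int textYPosition = 0;

    void DoLayout(int availableWidth, int& consumedYAmt) {
        if (!mShowing) {
            return;  // hidden controls consume no vertical space
        }
        textYPosition = consumedYAmt;
        consumedYAmt += measureText(availableWidth);
    }

private:
    // Placeholder measurement: a fixed line height, purely for illustration.
    int measureText(int /*availableWidth*/) const { return 20; }
};

// Lays out `count` default-constructed controls and returns the total height.
int layoutHeight(int count) {
    assert(count >= 0);
    int innerYValue = 0;
    for (int i = 0; i < count; ++i) {
        Control control;
        control.DoLayout(100, innerYValue);
    }
    return innerYValue;
}
```

With the initializer in place, layoutHeight(3) consumes three line heights regardless of build mode. Without it, reading mShowing is undefined behavior, so the result may differ between builds.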
I've had this problem for a few days now, and haven't found the cause of it. Whenever I build and run my program in debug mode, everything runs fine, however, release mode (with optimization, it works fine without) is a whole other story.
How the program works is that I have one thread waiting for a member variable called (bool) pipeReady to be set to true before continuing, and I have another thread which handles pipe connections and after opening a connection it'll follow a callback to a function which sets pipeReady to true.
When following this by stepping through the program, it acts extremely weird (skips lines, jumps over lines, goes back a few lines), but in the end everything seems to work like it should, except for one thing: this. It stays in that loop even though its conditions aren't met. I know that this might not be the best way to handle this, but it should work, shouldn't it? How can this happen? What could lead to something like this? And why does it only happen when optimization is on?
Thanks, André
If the link is broken in the future, it shows the debugger being stuck on this line:
while(!pipeReady){};
While pipeReady's value is true according to the debugger.
Weird Debugger Behavior
The weird behavior of the debugger skipping lines is due to the optimizations by the compiler.
When the compiler generates new code or eliminates code, the source line numbers will not align. This confuses the IDE and shows the Debugger jumping around. For example, if there are multiple returns from a function, the compiler may issue a branch to the first one for the other returns. This causes a step to the one return statement, which is not according to the source code.
Elimination of empty loops.
The compiler may eliminate empty loops like this one:
while (!pipe_ready)
{
}
Because the variable is not changed inside the loop.
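To fix this, declare the variable pipe_ready as volatile, which stops the compiler from assuming the value cannot change inside the loop. Note that volatile does not make access from two threads well-defined in standard C++; for a flag shared between threads, std::atomic<bool> is the portable choice.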
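As a rough sketch of the same wait loop written with std::atomic<bool>, which is the standard C++ way to share a flag between threads: PipeState and waitForPipe are hypothetical names, and the short sleep merely stands in for the pipe connection callback.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the class that owns the pipe flag.
struct PipeState {
    std::atomic<bool> pipeReady{false};
};

// Starts a thread that sets the flag, waits for it in the caller,
// and returns the flag's final value.
bool waitForPipe() {
    PipeState state;
    std::thread connector([&state] {
        // Stands in for "connection opened, callback fired".
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        state.pipeReady.store(true, std::memory_order_release);
    });

    // The atomic load cannot be hoisted out of the loop by the optimizer.
    while (!state.pipeReady.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }

    connector.join();
    assert(state.pipeReady.load());
    return state.pipeReady.load();
}
```

A condition variable would avoid spinning altogether. The atomic version is shown because it stays closest to the original loop.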
I have a C++ application cross-compiled for Linux running on an ARM CortexA9 processor which is crashing with a SIGFPE/Arithmetic exception. Initially I thought that it's because of some optimizations introduced by the -O3 flag of gcc but then I built it in debug mode and it still crashes.
I debugged the application with gdb which catches the exception but unfortunately the operation triggering exception seems to also trash the stack so I cannot get any detailed information about the place in my code which causes that to happen. The only detail I could finally get was the operation triggering the exception(from the following piece of stack trace):
3 raise() 0x402720ac
2 __aeabi_uldivmod() 0x400bb0b8
1 __divsi3() 0x400b9880
__aeabi_uldivmod() performs an unsigned long long division and remainder, so I tried the brute-force approach and searched my code for places that might use that operation, but without much success, as it proved to be a daunting task. I also tried to check for potential divisions by zero, but again the code base is pretty large, and checking every division operation is a cumbersome and somewhat dumb approach. So there must be a smarter way to figure out what's happening.
Are there any techniques to track down the causes of such exceptions when the debugger cannot do much to help?
UPDATE: After crunching on hex numbers, dumping memory and doing stack forensics(thanks Crashworks) I came across this gem in the ARM Compiler documentation(even though I'm not using the ARM Ltd. compiler):
Integer division-by-zero errors can be trapped and identified by
re-implementing the appropriate C library helper functions. The
default behavior when division by zero occurs is that when the signal
function is used, or
__rt_raise() or __aeabi_idiv0() are re-implemented, __aeabi_idiv0() is
called. Otherwise, the division function returns zero.
__aeabi_idiv0() raises SIGFPE with an additional argument, DIVBYZERO.
So I put a breakpoint at __aeabi_idiv0 (__aeabi_ldiv0) et voilà! I had my complete stack trace before it was completely trashed. Thanks everybody for their very informative answers!
Disclaimer: the "winning" answer was chosen solely and subjectively taking into account the weight of its suggestions into my debugging efforts, because more than one was informative and really helpful.
My first suggestion would be to open a memory window looking at the region around your stack pointer, and go digging through it to see if you can find uncorrupted stack frames nearby that might give you a clue as to where the crash was. Usually stack-trashes only burn a couple of the stack frames, so if you look upwards a few hundred bytes, you can get past the damaged area and get a general sense of where the code was. You can even look down the stack, on the assumption that the dead function might have called some other function before it died, and thus there might be an old frame still in memory pointing back at the current IP.
In the comments, I linked some presentation slides that illustrate the technique on a PowerPC — look at around #73-86 for a case study in a similar botched-stack crash. Obviously your ARM's stack frames will be laid out differently, but the general principle holds.
(Using the basic idea from Fedor Skrynnikov, but with compiler help instead)
Compile your code with -pg. This will insert calls to mcount and mcountleave() in every function. Do not link against the GCC profiling lib, but provide your own. The only thing you want to do in your mcount and mcountleave() is to keep a copy of the current stack, so just copy the top 128 bytes or so of the stack to a fixed buffer. Both the stack and the buffer will be in cache all the time so it's fairly cheap.
You can implement special guards in functions that can cause the exception. A guard is a simple class: in its constructor you put the name of the file and the line (__FILE__, __LINE__) into a file/array/whatever. The main condition is that this storage is shared by all instances of the class (a kind of stack). In the destructor you remove this entry. To make it work you need to create the guard on the first line of each function, and only on the stack. When you leave the current block, the destructor is called. So at the moment of your exception you will know from this improvised call stack which function is causing the problem.
Of course, you may put the creation of this class under a debug condition.
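A minimal single-threaded sketch of such a guard might look like the following. All names here (CallSite, TraceGuard, TRACE_GUARD, g_sites, g_depth and the two demo functions) are made up for the example, and the fixed-size array is just one possible choice of shared storage.

```cpp
#include <cassert>
#include <cstddef>

// One recorded call site: the file and line where a guard was created.
struct CallSite {
    const char* file;
    int line;
};

// Improvised call stack shared by all guards (not thread-safe in this sketch).
constexpr std::size_t kMaxDepth = 64;
static CallSite g_sites[kMaxDepth];
static std::size_t g_depth = 0;

class TraceGuard {
public:
    TraceGuard(const char* file, int line) {
        if (g_depth < kMaxDepth) {
            g_sites[g_depth] = {file, line};  // push this call site
        }
        ++g_depth;
    }
    ~TraceGuard() { --g_depth; }  // pop when the enclosing block exits

    TraceGuard(const TraceGuard&) = delete;
    TraceGuard& operator=(const TraceGuard&) = delete;
};

// Could be defined as empty in release builds if the overhead matters.
#define TRACE_GUARD() TraceGuard traceGuard_(__FILE__, __LINE__)

// Two demo functions, each creating the guard on its first line.
std::size_t depthInsideInner() {
    TRACE_GUARD();
    return g_depth;
}

std::size_t depthInsideOuter() {
    TRACE_GUARD();
    return depthInsideInner();
}
```

When the crash occurs, g_sites[0] through g_sites[g_depth - 1] can be inspected from the debugger or dumped from a signal handler, even if the real stack is unreadable.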
Enable generation of core files, and open the core file with the debugger.
Since it uses raise() to raise the exception, I would expect that signal() should be able to catch it. Is this not the case?
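Alternatively, you can set a conditional breakpoint at __aeabi_uldivmod to break when the divisor is 0. Under the ARM EABI the 64-bit divisor is passed in r2:r3 (r0:r1 hold the dividend), so the condition needs to test both r2 and r3.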
Possible Duplicate:
Common reasons for bugs in release version not present in debug mode
Sometimes I encounter strange situations where the program runs incorrectly when run normally, popping up the termination dialog, but runs correctly while debugging. This really frustrates me when I want to use the debugger to find the bug in my code.
Have you ever met this kind of situation, and why does it happen?
Update:
To show there are logical reasons that can lead to such a frustrating situation:
I think one big possibility is a heap access violation. I once wrote a function that allocated a small buffer, and later I stepped past its boundary. It ran correctly within gdb, cdb, etc. (I do not know why, but it did run correctly), but terminated abnormally when run normally.
I am using C++.
I do not think my problem duplicates the one above.
That one is a comparison between release mode and debug mode, but mine is between debugging and not debugging, for which there is a word, heisenbug, as many others noted.
Thanks.
You have a heisenbug.
Debugger might be initializing values
Some environments initialize variables and/or memory to known values like zero in debug builds but not release builds.
Release might be built with optimizations
Modern compilers are good, but it could hypothetically happen that optimized code functions differently than non-optimized code. Edit: These days, compiler bugs are rare. If you find yourself thinking you have one, exhaust all other ideas first.
There can be other reasons for heisenbugs.
Here's a common gotcha that can lead to a Heisenbug (love that name!):
// Sanity check - this should never fail
ASSERT( ReleaseResources() == SUCCESS);
In a debug build, this will work as expected, but the ASSERT macro's argument is ignored in a release build. By ignored, I mean that not only won't the result be reported, but the expression won't be evaluated at all (i.e. ReleaseResources() won't be called).
This is a common mistake, and it's why the Windows SDK defines a VERIFY() macro in addition to the ASSERT() macro. They both generate an assertion dialog at runtime in a debug build if the argument evaluates to false. Their behavior is different for a release build, however. Here's the difference:
ASSERT( foo() == true ); // Confirm that call to foo() was successful
VERIFY( bar() == true ); // Confirm that call to bar() was successful
In a debug build, the above two macros behave identically. In a release build, however, they are essentially equivalent to:
; // Confirm that call to foo() was successful
bar(); // Confirm that call to bar() was successful
By the way, if your environment defines an ASSERT() macro, but not a VERIFY() macro, you can easily define your own:
#ifdef _DEBUG
// DEBUG build: Define VERIFY simply as ASSERT
# define VERIFY(expr) ASSERT(expr)
#else
// RELEASE build: Define VERIFY as the expression, without any checking
# define VERIFY(expr) ((void)(expr))
#endif
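To see the difference concretely, here is a small sketch that hard-codes the release-mode behavior of both macros and counts how often the wrapped function actually runs. RELEASE_ASSERT, RELEASE_VERIFY and the counter are hypothetical names chosen for the example; they are not the SDK's definitions.

```cpp
#include <cassert>

// Hypothetical release-mode definitions mirroring the behavior described above.
#define RELEASE_ASSERT(expr) ((void)0)       // argument discarded, never evaluated
#define RELEASE_VERIFY(expr) ((void)(expr))  // argument still evaluated

static int g_releaseCalls = 0;

// Stand-in for a function with a side effect that must always happen.
bool ReleaseResources() {
    ++g_releaseCalls;
    return true;
}

// Returns how many times ReleaseResources ran under the ASSERT-style macro.
int callsAfterAssert() {
    g_releaseCalls = 0;
    RELEASE_ASSERT(ReleaseResources() == true);
    return g_releaseCalls;
}

// Returns how many times ReleaseResources ran under the VERIFY-style macro.
int callsAfterVerify() {
    g_releaseCalls = 0;
    RELEASE_VERIFY(ReleaseResources() == true);
    return g_releaseCalls;
}

// Self-check using the standard assert, which is active unless NDEBUG is set.
void checkMacroBehavior() {
    assert(callsAfterAssert() == 0);  // the side effect silently disappeared
    assert(callsAfterVerify() == 1);  // the side effect survived
}
```

The standard assert from <cassert> behaves like the first macro when NDEBUG is defined, which is why side effects should never live inside it.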
Hope that helps.
Apparently stackoverflow won't let me post a response which contains only a single word :)
VALGRIND
When using a debugger, memory sometimes gets initialized (e.g. zeroed), whereas without a debugging session memory can be random. This could explain the behavior you are seeing.
You have dialogs, so there may be threads in your application. If there are threads, there is a possibility of race conditions.
Let's say your main thread initializes a structure that another thread uses. When you run your program inside the debugger, the initializing thread may be scheduled before the other thread, while in your real-life situation the thread that uses the structure is scheduled before the other thread has actually initialized it.
In addition to what JeffH said, you have to consider if the deploying computer (or server) has the same environment/libraries/whatever_related_to_the_program.
Sometimes it's very difficult to debug correctly if you debug with other conditions.
Giovanni
Also, debuggers might add some padding around allocated memory changing the behaviour. This has caught me out a number of times, so you need to be aware of it. Getting the same memory behaviour in debug is important.
For MSVC, this can be disabled with the env-var _NO_DEBUG_HEAP=1. (The debug heap is slow, so this helps if your debug runs are hideously slow too..).
Another way to get the same result is to start the process outside the debugger, so you get a normal startup, then wait on the first line in main and attach the debugger to the process. That should work for "any" system, provided that you don't crash before main. (You could wait in the constructor of a statically constructed pre-main object then...)
But I've no experience with gcc/gdb in this matter, but things might be similar there... (Comments welcome.)
One real-world example of a heisenbug from Raymond Zhang.
/*--------------------------------------------------------------
GdPage.cpp : a real example to illustrate Heisenberg Effect
related with guard page by Raymond Zhang, Oct. 2008
--------------------------------------------------------------*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
LPVOID lpvAddr; // address of the test memory
lpvAddr = VirtualAlloc(NULL, 0x4096,
MEM_RESERVE | MEM_COMMIT,
PAGE_READONLY | PAGE_GUARD);
if(lpvAddr == NULL)
{
printf("VirtualAlloc failed with %ld\n", GetLastError());
return -1;
}
return *(long *)lpvAddr;
}
The program terminates abnormally whether it is compiled as Debug or Release, because specifying the PAGE_GUARD flag has this effect:
Pages in the region become guard
pages. Any attempt to read from or
write to a guard page causes the
system to raise a STATUS_GUARD_PAGE
exception and turn off the guard page
status. Guard pages thus act as a
one-shot access alarm.
So you get STATUS_GUARD_PAGE when trying to access *lpvAddr. But if you load the program in a debugger and watch *lpvAddr, or step through the last statement return *(long *)lpvAddr instruction by instruction, the debugger reads the guard page in order to determine the value of *lpvAddr. The debugger has therefore cleared the guard alarm for us before we access *lpvAddr.
Which programming language are you using? Certain languages, such as C++, behave slightly differently between release and debug builds. In the case of C++, this means that when you declare a variable without an initializer, such as int i;, some debug builds will fill it with a fixed value (such as 0), while in release builds it may take any value (whatever was stored in its memory location before).
One big reason is that debug code may define the _DEBUG macro that one may use in the code to add extra stuff in debug builds.
For multithreaded code, optimization may affect ordering which may influence race conditions.
I do not know if debug code adds code on the stack to mark stack frames. Any extra stuff on the stack may hide the effects of buffer overruns.
Try using the same command options as your release build and just add the -g (or equivalent debug flag). gcc allows the debug option together with the optimization options.
If your logic depends on data from the system clock, you could see serious probe effects. If you break into the debugger, you will obviously affect the values returned from clock functions such as timeGetTime(). The same is true if your program takes longer to execute. As other people have said, debug builds insert NOOPs. Also, simply running under the debugger (without hitting breakpoints) might slow things down.
An example of where this might happen is a real-time physics simulation with a variable time step, based off elapsed system time. This is why there are articles like this:
http://gafferongames.com/game-physics/fix-your-timestep/
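As a rough sketch of the fixed-timestep idea, assuming an accumulator-style loop with a clamp on the measured frame time: the advance function, its parameters and the StepResult struct are all invented for this example, and the actual physics update is left as a comment.

```cpp
#include <algorithm>
#include <cassert>

struct StepResult {
    int steps;         // number of fixed-size updates performed
    double remainder;  // leftover time carried over to the next frame
};

// Consumes `frameTime` seconds in fixed `dt` slices. `maxFrameTime` clamps
// abnormally long frames, such as the pause caused by a breakpoint.
StepResult advance(double accumulator, double frameTime, double dt,
                   double maxFrameTime) {
    assert(dt > 0.0);
    accumulator += std::min(frameTime, maxFrameTime);

    int steps = 0;
    while (accumulator >= dt) {
        // A real simulation would run one physics update of length dt here.
        accumulator -= dt;
        ++steps;
    }
    return {steps, accumulator};
}
```

Because every update uses the same dt, the simulation takes the same steps whether or not a debugger slowed the process down. The clamp keeps a long pause from turning into a burst of catch-up steps.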
I am trying to debug a small operating system I have written in a university course in C++. At runtime, one of my objects is getting corrupted somewhere. It seems this happens due to accidentally writing to the wrong memory address. As I am unable to find where this happens purely by looking at the code, I need another way.
As this is an operating system, I cannot attach tools like valgrind to it, but I can run it in emulators (bochs/qemu) with gdb attached.
Is there a way in gdb to trace write access to a class instance or, more generally, a specific memory range? I would like to break as soon as a write access happens, so I can verify whether it is valid or not.
You can put a watchpoint:
watch x
This will break when x is modified. x can be any type of variable. If you have:
class A;
A x;
Then gdb will break whenever x is modified.
You can actually put a watchpoint on any expression, and gdb will break when the expression changes. Be careful with this, though, because if the expression isn't something that the underlying hardware supports, gdb will have to evaluate this after every instruction, which leads to awful performance. For example, if A above is a class with many members then gdb can watch the entire instance x, but the way it'll work is:
execute an instruction
jump to the debug breakpoint
check if x has changed
return to the program
Naturally, this is very slow. If x is an int then gdb can use a hardware watchpoint.
If you have a specific memory address you can watch it too:
watch *0x1234
This will break when the contents of [0x1234] changes.
You can also set a read breakpoint using rwatch, or awatch to set a read/write breakpoint.
If you know at least approximately where it happens you can also just use "display" instead of watch and manually step line by line until you see when the change happens. Watching an address using "watch" is just too painfully slow.