GDB back trace from address - gdb

I am experiencing an issue with GDB's bt command. During debugging I am in interrupt context, so I can only see the current stack, and the backtrace shows just a few calls that I am not interested in. However, in the embedded software we are writing, every time a panic happens we preserve information in a global structure, which points to the stack as it was before the crash.
My question is: can I ask GDB to produce a backtrace starting from an address I know (assuming no remapping is happening in the hardware)?
I am using GDB 7.0 with an Olimex debugger, debugging a custom ARM-based chip.
Best Regards
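One commonly suggested approach is to overwrite the unwinder's starting registers with the values saved by the panic handler and then ask for a backtrace. A sketch (the register names are ARM-specific and the addresses are placeholders; the real values come from your preserved global structure):

```
(gdb) set $sp = 0x2000fe80    # stack pointer saved by the panic handler
(gdb) set $pc = 0x000123f4    # program counter at the time of the crash
(gdb) set $lr = 0x000120a8    # link register, helps unwind the first frame
(gdb) bt
```

Since this clobbers the live register state, it is safest to do once the current session state is no longer needed, or after noting the original register values with info registers.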

Related

How is gdb stack trace readability of release code influenced on x64?

I am working on a project where the request "we want more information in release build stack traces" came up.
By "stack trace" I basically mean the output of t a a bt (thread apply all backtrace) in gdb, which I suppose is equivalent to the output of gstack for a running process. Whether this is true would be one of my questions.
My main problem is that the availability of stack traces is rather erratic (sometimes you have them, sometimes you don't) and the documentation could be more detailed (e.g. the gdb documentation states that "-fomit-frame-pointer makes debugging impossible on some machines.", without any clear statement about x86_64).
Also, when examining a running program with gstack, I get quite good stack traces. I am unsure, though, whether this is exactly what I would get from a core dump with gdb (which would mean that in every case where I get less information, the stack has really been corrupted).
Currently the code is compiled with -O2. I have lately seen one stack trace where our own program code's stack frames did not have any function parameter values, but the first (innermost) frames, where our code had already called into a third-party library, did provide these values. I am not sure whether this is a sign that the third-party library was built with better gcc debugging options, or whether this information is simply lost at some point while walking down the stack trace.
I guess my questions are:
Which compiler options influence the stack trace quality on x86_64
are stack traces from these origins identical:
output of gstack of a running program
attached gdb to a running program, executed t a a bt
called gcore on a running program, opening core with gdb, then t a a bt
program aborted and core file written by system, opened with gdb
Is there some in-depth documentation which parameters affect stack trace quality on x86_64?
All considerations made under the assumption that the program binary exists for the core dump, and source code is not available.
With "quality of a stack trace" i mean 3 criteria:
called function names are available, not just "??"
the source code file name and line number are available
function call parameters are available.
Which compiler options influence the stack trace quality on x86_64
The -fomit-frame-pointer is the default on x86_64, and does not cause stack traces to be unusable.
GDB relies on unwind descriptors, and you could strip these with either strip or -fno-unwind-tables (this is ill-advised).
are stack traces from these origins identical:
- output of gstack of a running program
Last I looked, gstack was a trivial shell script that invoked gdb, so yes.
attached gdb to a running program, executed "t a a bt"
Yes.
called gcore on a running program, opening core with gdb, then "t a a bt"
Yes, provided the core is opened with GDB on the same system where gcore was run.
program aborted and core file written by system, opened with gdb
Same as above.
If you are trying to open core on a different system from the one where it was produced, and the binary uses dynamic libraries, you need to set sysroot appropriately. See this question and answer.
Note that there are a few reasons stack may look corrupt or unavailable in GDB:
-fno-unwind-tables or the stripping mentioned above
code compiled from assembly, and lacking proper .cfi directives
third party libraries that were built with very old compiler, and have incorrect unwind descriptors (anything before gcc-4.4 was pretty bad).
and finally, stack corruption.

Memory corrupted crash analysis

This question is to collect ideas. We have run into a particular crash where the backtrace is meaningless above a certain frame. This is from an ARM-based binary. We have not been able to reproduce the issue yet, which makes the analysis harder. Some observations first:
Unstripped versions of nearly all the required shared libraries are available, so the backtrace should normally be fine
However, at frame 3 the backtrace points to a memory address which is clearly not in the code segment
That problematic memory address is also visible among some of the argument values
The register info shows an unaligned address in the R3 register. Disassembly shows an stmia instruction operating on it, which was the immediate cause: it raised SIGBUS.
All in all, this points to corrupted memory, stack, etc. I have only limited information right now. I want to trace back at least where this call came from, but because of the corruption I am currently unable to find it.
Do you have any ideas how to proceed with such problems? Any tools which could help me analyze the assembly flow, data, and registers in a better environment, perhaps showing the stack in a more intuitive way? Many thanks in advance!
EDIT: To be a bit more precise: how would you try to find out what was above frame 2? Is it possible in some scenarios to decode it back even though the stack was corrupted? Or to find out what corrupted it?
EDIT2: I think I will debug that function; I saw a particular log line before the crash, so I will place a breakpoint there and check where the calls are coming from, albeit it is a very generic method, and the order of the initialization functions is really fuzzy. I have no other clue. Do you have any tool suggestion for intuitively reading ASM-level data from a core file?
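When the unwinder gives up, a raw scan of the stack memory can still turn up candidate return addresses. A sketch in GDB (the second address is a placeholder for a word found in the dump):

```
(gdb) x/64wx $sp              # dump the raw words around the stack pointer
(gdb) info symbol 0x4a2f10    # ask whether a suspicious word lands in a function
```

Any word that info symbol resolves to a text-segment function is a candidate saved return address, which often lets you reconstruct the missing frames by hand even when bt cannot.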

Recover from crash with a core dump

A C++ program crashed on FreeBSD 6.2 and OS was kind enough to create a core dump. Is it possible to amputate some stack frames, reset the instruction pointer and restart the process in gdb, and how?
Is it possible to amputate some stack frames, reset the instruction pointer and restart the process in gdb?
I assume you mean: change the process state, and set it to start executing again (as if it never crashed in the first place).
No. For one thing, how do you propose GDB (if it magically had this capability) would handle your file descriptors (which the kernel automatically closed when your process died)?
Yes, gdb can debug core dumps just as well as running programs. Assuming that a.out is the name of your program's executable and that a.core is the name of your core file, invoke gdb like so:
gdb a.out a.core
And then you can debug like normal, except you cannot continue execution in any way (even if you could, the program would just crash again). You can examine the stack trace, registers, memory, etc.
Possible duplicate of this: Best practices for recovering from a segmentation fault
Summary: it is possible but not recommended. The way to do it is to use setjmp() and longjmp() from a signal handler. (Please see the complete source code example in the duplicate post.)

Application crash at customer machine

Our DCOM server crashes on a customer's machine. The application does not crash if I enable Page Heap, add PDB files, or attach ADPlus. It does not crash on any of our machines.
I generated a crash dump with NTSD using the Just-In-Time debugging feature of Windows on the customer's machine, but the crash location is different each time.
What technique should I use to identify the cause of the crash?
This sounds like memory corruption. The stack trace is generally not reliable at that point. The first thing to do is to look at the stack segment. The best way is to dump the raw stack, rather than a stack trace, and see whether the stack can be manually reconstructed. In addition, when memory gets overwritten, check whether you see a recognizable data pattern in the overwritten bytes.
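In the WinDbg/NTSD world, the raw-stack dump described above is typically done with the dds command (dump dwords, resolving any that match symbols); a sketch with a placeholder range:

```
0:000> dds esp esp+0x200
```

Every value that resolves to module!function+offset is a candidate return address, from which the call chain can often be pieced together by hand even when the k command's trace is garbage.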

What causes a Sigtrap in a Debug Session

In my C++ program I'm using a library which "sends" a SIGTRAP on certain operations when I'm debugging it (using gdb as the debugger). I can then choose whether to continue or stop the program. If I choose to continue, the program works as expected, but setting custom breakpoints after a SIGTRAP has been caught causes the debugger/program to crash.
So here are my questions:
What causes such a SIGTRAP? Is it a leftover line of code that can be removed, or is it raised by the debugger when it "finds something it doesn't like"?
Is a SIGTRAP, generally speaking, a bad thing, and if so, why does the program run flawlessly when I compile a release version and not a debug version?
What does a Sigtrap indicate?
This is a more general approach to a question I posted yesterday Boost Filesystem: recursive_directory_iterator constructor causes SIGTRAPS and debug problems.
I think that question was far too specific, and I don't want you to solve my problem but to help me (and hopefully others) understand the background.
Thanks a lot.
With processors that support instruction breakpoints or data watchpoints, the debugger will ask the CPU to watch for instruction accesses to a specific address, or data reads/writes to a specific address, and then run full-speed.
When the processor detects the event, it will trap into the kernel, and the kernel will send SIGTRAP to the process being debugged. Normally, SIGTRAP would kill the process, but because it is being debugged, the debugger will be notified of the signal and handle it, mostly by letting you inspect the state of the process before continuing execution.
With processors that don't support breakpoints or watchpoints, the entire debugging environment is probably done through code interpretation and memory emulation, which is immensely slower. (I imagine clever tricks could be done by setting pagetable flags to forbid reading or writing, whichever needs to be trapped, and letting the kernel fix up the pagetables, signaling the debugger, and then restricting the page flags again. This could probably support near-arbitrary number of watchpoints and breakpoints, and run only marginally slower for cases when the watchpoint or breakpoint aren't frequently accessed.)
The question I placed in the comment field seems apropos here, because Windows isn't actually sending a SIGTRAP, but rather signaling a breakpoint in its own native way. I assume that when you're debugging programs, debug versions of the system libraries are used, which check that memory accesses make sense. You might have a bug in your program that is papered over at runtime but is in fact causing further problems elsewhere.
I haven't done development on Windows, but perhaps you could get further details by looking through your Windows Event Log?
While working in Eclipse with the MinGW/gcc compiler, I realized it reacted very badly to the vectors in my code, resulting in an unclear SIGTRAP signal and sometimes even abnormal debugger behavior (e.g. jumping somewhere up in the code and continuing execution in reverse order!).
I copied the files from my project into Visual Studio and resolved the issues there, then copied the changes back to Eclipse and voilà, it worked like a charm. The causes were things like vector initialization differences involving the reserve() and resize() functions, or trying to access elements out of the bounds of the vector.
Hope this will help someone else.
I received a SIGTRAP from my debugger and found out that the cause was a missing return value:
string getName() { printf("Name!");};