I'm trying to get the line numbers of addresses I collected in a stack walk by using SymGetLineFromAddr64, but I can't seem to get the addresses of simple statements or their lines.
For example, if I'm looking at the method:
void Test(int g)
{
g++;
DoSomething(g);
g--;
}
I only get the line number of "DoSomething", but I also want the line numbers of "g++" etc.
I suppose it is doable, because debuggers do it.
How can I do it myself in C++ on Windows?
A stack walk will only retrieve addresses that are stored on the stack, which pretty much means the return addresses of function calls. If you want the address of your g++ or g--, you'll need to use something other than a stack walk to get them (e.g., SymGetFileLineOffsets64). If you're starting from a stack walk and have info from SymGetLineFromAddr64, you can use SymGetLineNext64 and SymGetLinePrev64 to get information about the surrounding lines.
The only way to do it is to use compiler-generated symbol files, like the *.pdb files for Microsoft Visual Studio compilers (pdb stands for program database). These files contain all symbols used during the compilation step. Even for a release build you'll get information about the symbols in use (some may have been optimized away).
The main disadvantage is that this is highly compiler-specific. gcc, for example, may include symbol information in the shared object (.so) file or executable. Other compilers have other formats...
What compiler do you use (name/version)?
Linker question:
If I had a file.c that has no includes at all, would I still need a linker?
Although the linker is so-named because it links together multiple object files, it performs other functions as well. It may resolve addresses that were left incomplete by the compiler. It produces a program in an executable file format that the system’s program loader can read and load, and that format may differ from that of object modules. Specifics depend on the operating system and build tools.
Further, to have a complete program in one source file, you must provide not just the main routine you are familiar with from C and C++ but also the true start of the program, the entry point that the program loader starts execution at, and you must provide implementations for all functions you use in the program, such as invocations of system services via special trap or system-call instructions to read and write data.
You can create a project that has no typical C startup code, in which case you may not even have a main(). However, you still need a linker, because the linker creates the required executable file format for the given architecture.
It also sets the entry point, where the actual execution starts.
So you can omit the standard libraries and create a binary that is completely void of any C library functions, but you still need the linker to actually make a runnable binary.
The object file format generated by the compiler is very different from the executable file format, because it only provides the information that the linker requires.
Yes. The linker does more than merely link the files. Check out this resource for more info: https://en.wikibooks.org/wiki/C%2B%2B_Programming/Programming_Languages/C%2B%2B/Code/Compiler/Linker#:~:text=The%20linker%20is%20a%20program,translation%20unit%20have%20external%20linkage.
Believe it or not, multiple libraries can be referenced by default. So, even if you don't #include a resource, the compiler may have to internally link or reference something outside of the translation unit. There are also redundancies and other considerations that are "eliminated" by the compiler.
Despite its name, the linker is properly a "linker/locator". It performs two functions: 1) linking object code, and 2) determining where in memory the data and code elements exist.
The object code out of the compiler is not "located" even if it has no unresolved links.
Also even if you have the simplest possible valid code:
int main(){ return 0; }
with no includes, the linker will normally implicitly link the C runtime start-up code, which is required to do everything necessary before running main(). That may be very little. On some targets, such as ARM Cortex-M, you can in fact run C code directly from the reset vector, so long as you don't assume static initialisation or complete library support. So it is possible to write the reset code entirely in C, but you probably still need code to initialise the vector table with the reset handler (your C start-up function) and the initial stack pointer. On Cortex-M that can perhaps be done using in-line assembler, but it is all rather cumbersome and unnecessary, and it does not forgo the linker.
I have disassembled ARM code. I want to know the corresponding line number of these instructions in the original source file.
Also, I would like to understand a few things.
A function, for example android::CameraHardware::createInstance, is shown in the assembly as _ZN7android18CameraHardware14createInstanceEib. I am not even completely sure whether this is the right function to compare it with.
Why are the names so strange, with things added at the front and back? I generally do the same for C code, and there the function names look straightforward in the disassembled code.
So, to summarize, I have two questions.
1. Inside GDB, is there a way I could get the line number of a particular assembly instruction? Say, for example, at 0x40d9078c, I want to know which line it corresponds to in its source file. I tried info line. No use. Any other suggestions?
2. When we are reading the disassembly of C++ code, how do we understand the naming conventions? Also, what other things do we need to understand as prerequisites?
Thanks.
The translation from android::CameraHardware::createInstance to _ZN7android18CameraHardware14createInstanceEib is called "name mangling", and is normal for C++. It is how you can have multiple functions with the same name, taking different parameters, and get the linker to tell you that "I couldn't find a foo(int x, double y)" when you only declared it, but didn't define it.
In Linux, you can use c++filt to translate a mangled name to its unmangled form (assuming it's compiled with Linux style mangling convention - which android does - but if you were to take a Microsoft compiled piece of code, it clearly wouldn't work).
If you compile with debug symbols, gdb should be able to show you the source for a given piece of code. Add -g to the g++ line in the compile.
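If you want to do the same translation from inside a program instead of through c++filt, the sketch below shows one way. It assumes GCC or Clang (the Itanium C++ ABI that Linux and Android use) and relies on abi::__cxa_demangle from <cxxabi.h>. The helper name demangle is made up for this illustration, and the sample symbol _Z3fooid is simply the mangled form of the foo(int, double) example mentioned above.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol
// name, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // Passing null for the buffer and length makes the call malloc() a result.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the caller owns the malloc()-ed buffer
    return result;
}
```

Calling demangle("_Z3fooid") gives foo(int, double). This will not work for names produced by a Microsoft compiler, for the same reason c++filt will not.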
My executable contains a symbol table. But it seems that the stack trace has been overwritten.
How can I get more information out of that core, please? For instance, is there a way to inspect the heap and see the object instances populating it, to get some clues? Any idea is appreciated.
I am a C++ programmer for a living and I have encountered this issue more times than I like to admit. Your application is smashing a huge part of the stack. Chances are the function that is corrupting the stack is also crashing on return. The reason is that the return address has been overwritten, and this is why GDB's stack trace is messed up.
This is how I debug this issue:
1) Step through the application until it crashes. (Look for a function that is crashing on return.)
2) Once you have identified the function, declare a variable at the VERY FIRST LINE of the function:
int canary=0;
(The reason why it must be the first line is that this value must be at the very top of the stack. This "canary" will be overwritten before the function's return address.)
3) Put a variable watch on canary and step through the function; when canary!=0, you have found your buffer overflow! Another possibility is to set a conditional breakpoint for when canary!=0 and just run the program normally. This is a little easier, but not all IDEs support such breakpoints.
EDIT: I have talked to a senior programmer at my office and in order to understand the core dump you need to resolve the memory addresses it has. One way to figure out these addresses is to look at the MAP file for the binary, which is human readable. Here is an example of generating a MAP file using gcc:
gcc -o foo -Wl,-Map,foo.map foo.c
This is a piece of the puzzle, but it will still be very difficult to obtain the address of the function that is crashing. If you are running this application on a modern platform, then ASLR will probably make the addresses in the core dump hard to use. Some implementations of ASLR randomize the function addresses of your binary, so the addresses in the core dump no longer match the ones in the MAP file.
1. Use a memory debugger to detect the problem; valgrind is a good choice.
2. When compiling your code, add the -Wall option so that the compiler warns you about likely mistakes (and make sure your code compiles without any warnings).
ex: gcc -Wall -g -c -o oke.o oke.c
3. Make sure you also pass the -g option to produce debugging information. In addition, you can print diagnostic information using some predefined macros. The following are very useful to me:
__LINE__ : tells you the line
__FILE__ : tells you the source file
__func__ : tells you the function
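As a rough illustration of how these three can be combined, here is a minimal sketch. The macro TRACE_HERE and the function where_am_i are hypothetical names invented for this example; only __FILE__, __LINE__ and __func__ come from the language.

```cpp
#include <string>

// Hypothetical helper macro: it expands at the call site, so __FILE__,
// __LINE__ and __func__ describe the place where TRACE_HERE() is written.
#define TRACE_HERE() \
    (std::string(__FILE__) + ":" + std::to_string(__LINE__) + " in " + __func__)

// Hypothetical function used only to show the macro in action.
std::string where_am_i() {
    return TRACE_HERE();
}
```

The returned string has the form file:line in where_am_i, which you can print or log at the points you suspect. It has to be a macro rather than a function, otherwise the values would describe the helper instead of the caller.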
Using the debugger is not enough, I think; you should also get used to making the most of the compiler's abilities.
Hope this helps.
TL;DR: extremely large local variable declarations in functions are allocated on the stack, which, on certain platform and compiler combinations, can overrun and corrupt the stack.
Just to add another potential cause to this issue. I was recently debugging a very similar issue. Running gdb with the application and core file would produce results such as:
Core was generated by `myExecutable myArguments'.
Program terminated with signal 6, Aborted.
#0 0x00002b075174ba45 in ?? ()
(gdb)
That was extremely unhelpful and disappointing. After hours of scouring the internet, I found a forum that talked about how the particular compiler we were using (Intel compiler) had a smaller default stack size than other compilers, and that large local variables could overrun and corrupt the stack. Looking at our code, I found the culprit:
void MyClass::MyMethod() {
...
char charBuffer[MAX_BUFFER_SIZE];
...
}
Bingo! I found that MAX_BUFFER_SIZE was set to 10000000, so a 10 MB local variable was being allocated on the stack! After changing the implementation to use a shared_ptr and create the buffer dynamically, the program suddenly started working perfectly.
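For illustration, a minimal sketch of that kind of fix follows. It is not the code from the project described above: it uses std::vector instead of shared_ptr (either one moves the storage to the heap), and process_with_heap_buffer is a hypothetical stand-in for MyClass::MyMethod.

```cpp
#include <cstddef>
#include <vector>

// Same size as in the answer above: about 10 MB.
constexpr std::size_t MAX_BUFFER_SIZE = 10000000;

// Hypothetical stand-in for MyClass::MyMethod.
std::size_t process_with_heap_buffer() {
    // The vector object itself is small and lives on the stack;
    // its 10 MB of storage is allocated on the heap and freed on return.
    std::vector<char> charBuffer(MAX_BUFFER_SIZE);
    charBuffer.front() = 'a';
    charBuffer.back() = 'z';
    return charBuffer.size();
}
```

Code that needs a raw char* can use charBuffer.data(), so the rest of the function usually does not have to change much.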
Try running with Valgrind memory debugger.
To confirm: was your executable compiled in release mode, i.e. with no debug symbols? That could explain why there is a ??. Try recompiling with the -g switch, which generates debugging information and embeds it in the executable. Other than that, I am out of ideas as to why you have '??'.
Not really. Sure you can dig around in memory and look at things. But without a stack trace you don't know how you got to where you are or what the parameter values were.
However, the very fact that your stack is corrupt tells you that you need to look for code that writes into the stack.
Overwriting a stack array. This can be done the obvious way or by calling a function or system call with bad size arguments or pointers of the wrong type.
Using a pointer or reference to a function's local stack variables after that function has returned.
Casting a pointer to a stack value to a pointer of the wrong size and using it.
If you have a Unix system, "valgrind" is a good tool for finding some of these problems.
I assume that since you say "My executable contains symbol table" that you compiled and linked with -g, and that your binary wasn't stripped.
We can just confirm this:
strings -a <your_binary> | grep function_name_you_know_should_exist
Also try using pstack on the core and see if it does a better job of picking up the call stack. If it does, it sounds like your gdb is out of date compared to your gcc/g++ version.
Sounds like the glibc version on your machine is not identical to the one that was in use on production when the core file was written. Get the files listed by "ldd ./appname" on the production machine and copy them onto your machine, then tell gdb where to look:
set solib-absolute-prefix /path/to/libs
I have a 3rd party source code that I have to investigate. I want to see in what order the functions are called but I don't want to waste my time typing:
printf("Entered into %s", __FUNCTION__)
and
printf("Exited from %s", __FUNCTION__)
for each function, nor do I want to touch any source file.
Do you have any suggestions? Is there a compiler flag that automagically does this for me?
Clarifications to the comments:
I will cross-compile the source to run it on ARM.
I will compile it with gcc.
I don't want to analyze the static code. I want to trace the runtime. So doxygen will not make my life easier.
I have the source and I can compile it.
I don't want to use Aspect Oriented Programming.
EDIT:
I found that the 'frame' command at the gdb prompt prints the current frame (or function name, you could say) at that point in time. Perhaps it is possible (using gdb scripts) to call the 'frame' command every time a function is called. What do you think?
Besides the usual debugger and aspect-oriented programming techniques, you can also inject your own instrumentation functions using gcc's -finstrument-functions command line option. You'll have to implement your own __cyg_profile_func_enter() and __cyg_profile_func_exit() functions (declare these as extern "C" in C++).
They provide a means to track what function was called from where. However, the interface is a bit difficult to use since the address of the function being called and its call site are passed instead of a function name, for example. You could log the addresses, and then pull the corresponding names from the symbol table using something like objdump --syms or nm, assuming of course the symbols haven't been stripped from the binaries in question.
It may just be easier to use gdb. YMMV. :)
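If you do want to try the -finstrument-functions route, the two hooks could be sketched roughly as below. The hook names and signatures are the ones gcc expects; the depth counter g_call_depth and the output format are assumptions made for this illustration, and the counter is not thread-safe.

```cpp
#include <cstdio>

// Hypothetical nesting counter, used only to indent the trace output.
static int g_call_depth = 0;

extern "C" {

// Called by instrumented code on entry to every function.
// The attribute keeps gcc from instrumenting the hook itself.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    std::fprintf(stderr, "%*senter %p (called from %p)\n",
                 2 * g_call_depth, "", this_fn, call_site);
    ++g_call_depth;
}

// Called by instrumented code just before every function returns.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    --g_call_depth;
    std::fprintf(stderr, "%*sexit  %p (called from %p)\n",
                 2 * g_call_depth, "", this_fn, call_site);
}

}  // extern "C"
```

Compile the third-party sources with -g -finstrument-functions and link this file in; the sources themselves stay untouched. The output contains raw addresses, which you then map back to names with nm or objdump --syms as described above.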
You said "nor do I want to touch any source file"... fair game if you let a script do it for you?
Run this on all your .cpp files
sed 's/^{/{ENTRY/'
So that it transforms them into this:
void foo()
{ENTRY
// code here
}
Put this in a header that can be #included by every unit:
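#include <cstdio>
// Two-level paste so that __LINE__ is expanded before ## is applied.
#define ENTRY_CAT2(a, b) a ## b
#define ENTRY_CAT(a, b) ENTRY_CAT2(a, b)
#define ENTRY EntryRaiiObject ENTRY_CAT(obj, __LINE__)(__FUNCTION__);
struct EntryRaiiObject {
EntryRaiiObject(const char *f) : f_(f) { printf("Entered into %s\n", f_); }
~EntryRaiiObject() { printf("Exited from %s\n", f_); }
const char *f_;
};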
You may have to get fancier with the sed script. You can also put the ENTRY macro anywhere else you want to probe, like some deeply nested inner scope of a function.
Use the /Gh (Enable _penter Hook Function) and /GH (Enable _pexit Hook Function) compiler switches (if you can compile the sources, of course).
NOTE: you won't be able to use those macros. See here: "you will need to get the function address (in EIP register) and compare it against addresses in the map file that can be generated by the linker (assuming no rebasing has occurred). It'll be very slow though."
If you're using gcc, the magic compiler flag is -g. Compile with debugging symbols, run the program under gdb, and generate stack traces. You could also use ptrace, but it's probably a lot easier to just use gdb.
Agree with William, use gdb to see the run time flow.
There are some static code analyzers that can tell you which functions call which, and can give you a call flow graph. One such tool is "Understand C++" (supports C/C++), but that's not free, I guess. You can find similar tools, though.
How to find unused functions in a C++ project (VC2008)
I always use "/OPT:REF" when creating release versions. This flag removes all unreferenced functions and will reduce the final binary substantially if there are many functions not being used (in our case we have a kernel with loads of methods used differently from different customized applications).
The "/VERBOSE" will send information about the linking session to the output window, or to stdout if you are linking on the command line. In the latter you can always redirect this to a file.
Using both flags together will make the output contain all eliminated functions and/or data that is never referenced.
Cheers!
Select "Run Code Analysis on 'your project name'" from the Analyze/Build menu (depending on your VS edition); VS will show a warning if there are unused functions.
You should be able to use link.exe with /map and /mapinfo to generate a map file that tells you which functions aren't called.