ARM assembly code - understanding the disassembly of C++ source

I have disassembled ARM code and want to know which line of the original source file each instruction corresponds to.
I would also like to understand a few things.
A function such as android::CameraHardware::createInstance appears in the assembly as _ZN7android18CameraHardware14createInstanceEib. I am not even sure this is the right function to compare against.
Why are the names so strange, with things added at the front and back? I usually do the same exercise with C code, where function names appear unchanged in the disassembly.
To summarize, I have two questions:
1. Inside GDB, is there a way to get the source line number for a particular assembly instruction? For example, I want to know which source line the instruction at 0x40d9078c corresponds to. I tried info line, but it did not help. Any other suggestions?
2. When reading the disassembly of C++ code, how should I interpret the naming conventions? What else do I need to understand as a prerequisite?
Thanks.

The translation from android::CameraHardware::createInstance to _ZN7android18CameraHardware14createInstanceEib is called "name mangling", and it is normal for C++. The mangled name encodes the namespace, class, and parameter types, which is what lets several functions share a name while taking different parameters. It is also why the linker can tell you "I couldn't find a foo(int x, double y)" when you declared that overload but never defined it.
On Linux, you can use c++filt to translate a mangled name back to its unmangled form. This assumes the code was compiled with the Itanium C++ ABI mangling convention that GCC and Clang use, which is the case on Android. A symbol from a Microsoft-compiled binary uses a different scheme, so c++filt won't decode it.
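What c++filt does can also be done from inside a program. The following is a minimal sketch, assuming GCC or Clang, whose runtime exposes abi::__cxa_demangle in <cxxabi.h>. The helper name demangle is made up for illustration. Note that each name in a mangled symbol is prefixed with its length, and "CameraHardware" has 14 characters, so the symbol quoted in the question (with an 18 prefix) looks as if the class name was shortened when it was pasted. The sketch is therefore meant to be fed a hypothetical well-formed variant, _ZN7android14CameraHardware14createInstanceEib.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang runtimes)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the readable form of an Itanium-ABI
// mangled symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments, the result is malloc()-allocated
    // and must be released with free().
    char* text = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

Calling demangle("_ZN7android14CameraHardware14createInstanceEib") yields android::CameraHardware::createInstance(int, bool): the trailing "i" and "b" are the int and bool parameters, and "E" closes the nested name.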
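If you compile with debug symbols, gdb should be able to show you the source for a given piece of code. Add -g to the g++ command line when compiling. With symbols loaded, info line *0x40d9078c (note the asterisk before the address) reports the source line for that address.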

Related

Find Main in Assembly

I have a simple C++ program:
#include <iostream>
using namespace std;
int main()
{
cout << "Hello, world, from Visual C++!" << endl;
}
It is compiled with the following command: cl /EHsc hello.cpp
I want to start debugging the executable. How can I find the assembly code corresponding to this main function in the debugger? (I'm using x64dbg.)
The entry point is not the same as the main function.
I did find the main function, nowhere near the entry point. It was easy because I could search for the string.
Is there any rule or best practice for working out where main's assembly code is?
EDIT:
I have the source code; I am just learning RE.

Although the entry point is usually not the main defined in your executable, this is merely because compilers commonly wrap main with some initialization code.
In most cases the initialization code is quite similar and comes in one of a few versions per compiler. Most of those functions have an IDA FLIRT signature, and opening the binary with IDA will automatically define a WinMain, main, etc. function for you. You can also use the free (trial) versions of IDA for that.
If that's not the case, it's pretty straightforward to get from the entry point to main by following the few calls inside the entry point function one level deep. The call to main is usually near the end of the entry point function.
Here's an example in which the main function is selected near the bottom (note that this is a Unix executable compiled for Windows using MinGW, so it is somewhat different from most native Win32 executables).

If you are debugging your own code, the simplest way to stop at a chosen point under the debugger is to use the following code:
if (IsDebuggerPresent()) __debugbreak();
You can insert it at the beginning of main or anywhere else.
If you are debugging a binary that is not your own, it may not contain any C/C++ CRT code at all, in which case the question is meaningless. If CRT code is present, however, the many implementations all follow common patterns, and with some practice you can find where the CRT code calls main.
For standard Windows binaries that have PDB files available, this is not a problem at all.

Generally, you can't.
When you compile a program, you get a binary and (optionally) debugging symbols.
If you have the debugging symbols, let IDA or your debugger load them, and then you should be able to resolve main symbolically to the address of the function (e.g. in IDA, just press g and type main and you'll be there; in gdb you can type b main, and in WinDbg bp main).
However, the more common case is looking for the main function in a binary for which you do not possess the debugging symbols. In this case, you don't know where the main function is, or even whether there is one. The binary may not follow the common libc practice of an entry point that does initialization and then calls main(int argc, char *argv[], char *envp[]).
But because you're an intelligent human, I'd recommend reading the libc implementation for the compiler/platform you think you're working with, and following the logic from the platform-defined entry point until you see the call main instruction.
(Please note that .NET binaries and other types of binaries may behave completely differently.)

The gdb in NDK r7 maps code addresses to source code completely wrong

When I debug an app written mostly in native code (C++ and some C, multiple shared objects) that uses NativeActivity, ndk-gdb manages to set breakpoints in C++ functions just fine, but it maps code addresses to completely wrong source code locations. If I set a breakpoint at a C++ function that is in no way special except that its prototype is extern "C", "i b" shows the breakpoint being at /Users/tml/android-ndk-r7/sources/cxx-stl/gnu-libstdc++/include/exception:61 ... This makes single-stepping through the function a bit silly, as gdb always thinks I am at line 61 in the exception header. What could be the problem?

You could try the solution suggested here (switch to stabs):
http://groups.google.com/group/android-ndk/browse_thread/thread/ebd969a055af3196

How do debuggers get line numbers of commands?

I'm trying to get the line numbers of addresses I collected in a stack walk by using SymGetLineFromAddr64, but I can't seem to get the addresses of simple statements or their lines.
For example, if I'm looking at the function:
void Test(int g)
{
g++;
DoSomething(g);
g--;
}
I only get the line number of the DoSomething call, but I also want the line numbers of "g++" and so on.
I suppose it is doable, because debuggers do it.
How can I do it myself in C++ on Windows?

A stack walk will only retrieve addresses that are stored on the stack, which pretty much means function calls. If you want the address of your g++ or g--, you'll need to use something other than a stack walk to get them (e.g., SymGetFileLineOffsets64). If you're starting from a stackwalk and have info from SymGetLineFromAddr64, you can use SymGetLineNext64 and SymGetLinePrev64 to get information about the surrounding lines.

The only way to do it is to use compiler-generated symbol files, such as the *.pdb files produced by the Microsoft Visual Studio compilers (PDB stands for program database). These files contain all symbols used during compilation. Even for a release build you'll get information about the symbols in use (though some may have been optimized away).
The main disadvantage is that this is highly compiler-specific. gcc, for example, can embed the symbol information in the shared object (.so) file or the executable itself. Other compilers use other formats.
Which compiler do you use (name and version)?

How To Extract Function Name From Main() Function Of C Source

I would like your ideas on the following. For an important reason, I must extract the names of all functions that are called inside the "main()" function of a C source file (e.g. main.c).
Example source code:
int main()
{
int a = functionA(); // functionA must be extracted
int b = functionB(); // functionB must be extracted
}
As you know, the only thing I can use as a marker to identify these function calls is their parentheses "()". I've already considered several factors in implementing this function name extraction:
1. Functions may have parameters. Ex: functionA(100)
2. Loop keywords. Ex: while()
3. Other keywords. Ex: if(), else if()
4. Operators between function calls with no spaces. Ex: functionA()+functionB()
I know what you're thinking right now: this is a pain in the $$$... So please share your thoughts and ideas, and bear with me on this one.
Note: this is to be done in C++.

You can write a small C++ parser by combining Flex (or Lex) and Bison (or Yacc):
1. Take C++'s grammar.
2. Generate a parser for C++ programs with the tools mentioned.
3. Make that program count the function calls you are interested in.
This may be a little too complicated for what you need, but it should certainly work. And Lex/Yacc are amazing tools!

One option is to write your own C tokenizer (simple: just take care to skip over strings, character constants and comments) and a simple parser that tracks the number of open {s and finds instances of identifier + ( inside them. However, this won't be 100% correct: it is cumbersome to handle preprocessor directives (e.g. #include and #define), and a function can be called from a macro (e.g. getchar) defined in an #include file.
A more reliable option is to compile your .c file to an assembly file, e.g. gcc -S file.c, and find the call instructions in the resulting file.s. A similar option is to compile your .c file to an object file, e.g. gcc -c file.c, generate a disassembly dump with objdump -d file.o, and search it for call instructions.
Another option is to use a parser built on Clang/LLVM.

GNU cflow might be helpful.

Automatically adding Enter/Exit Function Logs to a Project

I have some third-party source code that I have to investigate. I want to see in what order the functions are called, but I don't want to waste my time typing:
printf("Entered into %s", __FUNCTION__)
and
printf("Exited from %s", __FUNCTION__)
for each function, nor do I want to touch any source file.
Do you have any suggestions? Is there a compiler flag that automagically does this for me?
Clarifications to the comments:
I will cross-compile the source to run it on ARM.
I will compile it with gcc.
I don't want to analyze the static code. I want to trace the runtime. So doxygen will not make my life easier.
I have the source and I can compile it.
I don't want to use Aspect Oriented Programming.
EDIT:
I found that the 'frame' command at the gdb prompt prints the current frame (or function name, you could say) at that point in time. Perhaps it is possible (using gdb scripts) to call the 'frame' command every time a function is called. What do you think?

Besides the usual debugger and aspect-oriented programming techniques, you can also inject your own instrumentation functions using gcc's -finstrument-functions command-line option. You'll have to implement your own __cyg_profile_func_enter() and __cyg_profile_func_exit() functions (declare these as extern "C" in C++).
They provide a means to track what function was called from where. However, the interface is a bit difficult to use since the address of the function being called and its call site are passed instead of a function name, for example. You could log the addresses, and then pull the corresponding names from the symbol table using something like objdump --syms or nm, assuming of course the symbols haven't been stripped from the binaries in question.
It may just be easier to use gdb. YMMV. :)
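For reference, a pair of such hooks might look like the following minimal sketch. It assumes GCC or Clang and a build that passes -finstrument-functions. The g_depth counter and the traceDepth helper are illustrative additions, not part of any gcc interface, and the counter is not thread-safe. The no_instrument_function attribute keeps the hooks themselves from being instrumented, which would otherwise recurse forever.

```cpp
#include <cstdio>

// Keep the hooks (and helpers they use) out of the instrumentation.
#define NO_INSTRUMENT __attribute__((no_instrument_function))

// Illustrative state: current call nesting, used only for indentation.
static int g_depth = 0;

extern "C" {

// Called on entry to every instrumented function.
NO_INSTRUMENT void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    std::fprintf(stderr, "%*senter %p (called from %p)\n",
                 g_depth * 2, "", this_fn, call_site);
    ++g_depth;
}

// Called just before every instrumented function returns.
NO_INSTRUMENT void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    --g_depth;
    std::fprintf(stderr, "%*sexit  %p (called from %p)\n",
                 g_depth * 2, "", this_fn, call_site);
}

}  // extern "C"

// Hypothetical accessor so other code can inspect the nesting level.
NO_INSTRUMENT int traceDepth() { return g_depth; }
```

Put this in its own source file and link it into the program, compiling the code you want to trace with -finstrument-functions. The log contains raw addresses, which can then be matched against the output of nm or objdump --syms as described above, or passed to addr2line -f -e with the unstripped binary. If the executable is position-independent, the logged addresses may need the load offset subtracted first.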

You said "nor do I want to touch any source file"... fair game if you let a script do it for you?
Run this on all your .cpp files
sed 's/^{/{ENTRY/'
So that it transforms them into this:
void foo()
{ENTRY
// code here
}
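Put this in a header that can be #included by every unit:
#include <cstdio>
// Two-level paste so that __LINE__ is expanded before ## is applied.
#define ENTRY_CONCAT2(a, b) a##b
#define ENTRY_CONCAT(a, b) ENTRY_CONCAT2(a, b)
#define ENTRY EntryRaiiObject ENTRY_CONCAT(obj, __LINE__)(__FUNCTION__);
struct EntryRaiiObject {
EntryRaiiObject(const char *f) : f_(f) { printf("Entered into %s\n", f_); }
~EntryRaiiObject() { printf("Exited from %s\n", f_); }
const char *f_;
};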
You may have to get fancier with the sed script. You can also put the ENTRY macro anywhere else you want to probe, like some deeply nested inner scope of a function.

Use the /Gh (Enable _penter Hook Function) and /GH (Enable _pexit Hook Function) compiler switches (if you can compile the sources, of course).
NOTE: you won't be able to use macros such as __FUNCTION__ inside the hooks. See here ("you will need to get the function address (in EIP register) and compare it against addresses in the map file that can be generated by the linker (assuming no rebasing has occurred). It'll be very slow though.")

If you're using gcc, the magic compiler flag is -g. Compile with debugging symbols, run the program under gdb, and generate stack traces. You could also use ptrace, but it's probably a lot easier to just use gdb.

I agree with William: use gdb to see the run-time flow.
There are also static code analyzers that can tell you which functions call which and produce a call graph. One such tool is "Understand C++" (which supports C and C++), but I don't think it is free. You can find similar tools, though.
