debugging on ARM - inspecting the stack - gdb

I have this particular instruction:
0x133541a2 <_ZN13AIFFAudioFile14+794>: 9259 str r2, [sp, #356]
and my r2 seems to have a value of 0x41414141; how could I evaluate [sp, #356] so that I could look at the value at the resulting address?
I understand the #356 ends up accessing a value from the literal pool, so how can I go about inspecting the resultant address into which r2 is stored?

Problem solved.
It's enough to use $sp+356 (for example, x/wx $sp+356 in gdb), but something to remember is that 356 is in decimal, not hex. That's what was confusing me earlier.
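As a cross-check on the decimal offset, here is a minimal C sketch that recovers it from the 16-bit encoding shown in the question (9259). It assumes the instruction is the Thumb "str Rt, [sp, #imm]" form, where the low 8 bits hold the offset in words. The function names are invented for illustration.

```c
#include <stdint.h>

/* Byte offset encoded in a 16-bit Thumb "str Rt, [sp, #imm]" instruction.
   The low 8 bits hold the offset in words, so multiply by 4. */
uint32_t thumb_str_sp_offset(uint16_t insn)
{
    return (uint32_t)(insn & 0xFFu) * 4u;
}

/* Address the store writes to: sp plus the byte offset. */
uint32_t effective_address(uint32_t sp, uint32_t offset)
{
    return sp + offset;
}
```

For the encoding 0x9259 the offset comes out as 356, which is 0x164 in hex. With a hypothetical sp of 0xbefff000, the store would land at 0xbefff164.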

Instruction disassembly for ARM

I just set up a Raspberry Pi machine and tried reverse engineering the following piece of code.
#include <stdio.h>
int main() {
    printf("this is a test\n");
}
For the most part the following disassembly in gdb seemed to make sense.
0x000083c8 <+0>: push {r11, lr}
0x000083cc <+4>: add r11, sp, #4
0x000083d0 <+8>: ldr r0, [pc, #8] ; 0x83e0 <main+24>
0x000083d4 <+12>: bl 0x82ec <puts>
0x000083d8 <+16>: mov r0, r3
0x000083dc <+20>: pop {r11, pc}
0x000083e0 <+24>: andeq r8, r0, r4, asr r4
However, I fail to understand why the instruction at 0x000083e0 exists. Is that instruction even a part of the main function? Wouldn't the value that was pushed in at 0x000083c8 be popped out into pc, immediately transferring control over to some other location?
Also I tried setting a breakpoint at 0x000083e0 -- I seem to be getting a very strange SEGFAULT. Why would that be?
When this function is called (i.e. when execution begins at instruction 0x000083c8), the link register (LR) should already contain the return address. Fast-forward to 0x000083d8: a value is placed in R0, the return-value register under the ARM C calling convention (here it is whatever R3 holds, since main has no return statement). Then the return address is popped from the stack into the PC, effectively ending execution of this function. This implies that the word at 0x000083e0 is never executed as part of your program, and your inspection of instructions should be limited to 0x000083c8 through 0x000083dc.
So to answer your questions:
1. Correct.
2. The "instruction" at 0x000083e0 is essentially junk. You may not even have execution and/or access privileges to this memory depending on the specifics of your ARM core (does it have an MMU, etc.?). Thus, a seg fault is a reasonable outcome when attempting to inspect that location.
EDIT: in agreement with comments below, the contents of 0x000083e0 should be interpreted as data, not instructions.
The four bytes at 0x000083e0 aren't junk. They are the target of the PC-relative load at
0x000083d0 <+8>: ldr r0, [pc, #8] ; 0x83e0 <main+24>
This is also visible in the comment: ; 0x83e0 <main+24>.
The issue is that you need to pass the address of a string to puts, and that address might change during the linking step, so the compiler has to emit code that can be fixed up later. The address of the string therefore ends up in the instruction stream, but outside of any execution path.
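To see why gdb prints an instruction at 0x000083e0 even though the word there is data, it can help to pull the instruction fields out of the raw word by hand. The sketch below assumes the word is 0x00008454, which is the encoding of andeq r8, r0, r4, asr r4. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* A few fields of an ARM data-processing instruction word. */
typedef struct {
    unsigned cond;    /* bits 31..28: condition code (0 = EQ)  */
    unsigned opcode;  /* bits 24..21: operation      (0 = AND) */
    unsigned rn;      /* bits 19..16: first operand register   */
    unsigned rd;      /* bits 15..12: destination register     */
    unsigned rm;      /* bits  3..0 : second operand register  */
} arm_dp_fields;

/* Split a 32-bit word the way a disassembler would, whether or not
   the word was ever meant to be executed. */
arm_dp_fields decode_dp_fields(uint32_t word)
{
    arm_dp_fields f;
    f.cond   = (word >> 28) & 0xFu;
    f.opcode = (word >> 21) & 0xFu;
    f.rn     = (word >> 16) & 0xFu;
    f.rd     = (word >> 12) & 0xFu;
    f.rm     = word & 0xFu;
    return f;
}
```

Under that assumption the fields come out as cond = 0 (EQ), opcode = 0 (AND), rn = 0, rd = 8 and rm = 4, matching the andeq that gdb printed. Read as data, 0x00008454 looks like a plausible address for the string constant.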

Hypothetical - about making a header for an *existing* static/dynamic library

I want to learn more about unix/linux and this question popped into my head - let's say I made a static/dynamic library (.a or .so) and lost the c/c++ source code and header file. Default nm output gives me the names of the symbols but I need to know return types and parameter count/types to make a header. Is it possible to get this extra information somehow to reverse engineer a header for a given library?
You tagged C and C++ and the answer varies slightly between the two.
For C++, the method names of classes have type information embedded in the symbol name. You just have to figure out what kind of name mangling was used by the compiler that built the library.
For C, there's no real clean way to do it. You could take apart the assembly and analyze which registers and stack areas are read without having been written to figure out how many parameters a function takes. This would require knowledge of the calling conventions used by whatever compiler compiled the library.
Similarly, you can look at how each parameter is used in the assembly. If you see it being used as the address in a load instruction, it is most likely a pointer of some sort, while if you see it being used in arithmetic, it's possibly an integer of some sort.
For the return type, you can check if anything seemingly meaningful is placed in the return register before a return instruction. Again, this requires knowledge of calling conventions for your platform.
Here's an example of how I would do things in ARM assembly.
I know that parameters in ARM are passed in registers r0 to r3 and the return value is stored in register r0. With that in mind, we can begin reverse engineering. Let's take a look at the assembly for two functions and try to work out what the function prototype was.
00000000 <func1>:
0: e3510000 cmp r1, #0
4: 0a000007 beq 28 <func1+0x28>
8: e0801001 add r1, r0, r1
c: e1a03000 mov r3, r0
10: e3a00000 mov r0, #0
14: e4d32001 ldrb r2, [r3], #1
18: e1530001 cmp r3, r1
1c: e0800002 add r0, r0, r2
20: 1afffffb bne 14 <func1+0x14>
24: e12fff1e bx lr
28: e1a00001 mov r0, r1
2c: e12fff1e bx lr
If we take a look here, r0 and r1 are both read before anything is written to them. We can also see that r2 and r3 are written to before they are read. We can therefore infer that func1 has a maximum of two parameters.
We also realise that r0 is moved to r3 and then used as an address to ldrb, which is an instruction to load a byte from memory. Hence, we infer that the first parameter is a pointer. Because the instruction only loads a single byte, we can also tell it might be a pointer to some sort of one byte data type.
The second parameter in r1 never seems to be used except in compare and add instructions so it is possibly an integer.
Before each bx lr (a return-to-caller instruction), something is placed in r0 so we infer that the function returns some sort of value.
If this function were presented to me, I'd guess that the function prototype would look something like this:
int func1(unsigned char *, int);
Original:
unsigned int func1(void *, unsigned int);
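For comparison, here is a guess at C source that would plausibly compile to the func1 assembly above. It is a reconstruction from the disassembly, not the actual original source; the parameter names are invented.

```c
/* Sums len bytes starting at buf; returns 0 when len is 0,
   matching the early "beq" exit in the assembly. */
unsigned int func1(void *buf, unsigned int len)
{
    const unsigned char *p = buf;   /* ldrb loads one unsigned byte at a time */
    unsigned int sum = 0;

    for (unsigned int i = 0; i < len; i++)
        sum += p[i];
    return sum;
}
```

The void * parameter explains why the guessed prototype could not be more specific than "pointer to some byte-sized type": the cast to unsigned char * happens inside the function and leaves no trace in the signature.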
Here's another function:
00000030 <func2>:
30: e0822001 add r2, r2, r1
34: e5c02000 strb r2, [r0]
38: e12fff1e bx lr
This one is very easy.
We see that r0, r1 and r2 are all read from before being written to so we can guess that the function takes three parameters. r0 is used as an address to a strb instruction (store byte) so it is probably a pointer. Again, it only stores a byte so it is probably a pointer to a byte sized data type.
The other two are only used in an add instruction so are probably integers.
Nothing seems to be placed into r0 at the end so the function either returns the first parameter or doesn't return a value.
I would guess the prototype would be one of the following
void func2(unsigned char *, int, int);
unsigned char *func2(unsigned char *, int, int);
Original:
void func2(char *, char, char);
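Again for comparison, a guess at C source consistent with the func2 assembly and the original prototype. The body and parameter names are my reconstruction, not the real source.

```c
/* Stores the low byte of a + b through dst (add followed by strb). */
void func2(char *dst, char a, char b)
{
    *dst = (char)(a + b);
}
```

Because char arguments are passed in full 32-bit registers, the assembly alone cannot distinguish them from int, which is why the guessed prototype used int for both.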
Keep in mind that calling conventions vary between processor instruction sets. Since you are already aware of name mangling when using C and C++ libraries together, you can try the following:
gdb <executable>
....
disas <function name>
....
From the assembly code you can then make a rough guess about the types of the return value and parameters, based on the bit sizes of the values written to the stack.

Is memory barrier meaningful only in SMP?

I understand why memory barriers are needed, but I don't get it in the case of Uniprocessor.
Do I have to deal with barriers even when I use UP? Every document explains them with SMP but not UP.
In the following code, is there any possibility that r2 == 0 at point a?
// the location 0xdeadbeef has a zero initial value
ldr r0, =0xdeadbeef
ldr r1, =0xdeadbeef
ldr r2, =1
str r2, [r0]
ldr r2, [r1]
// point a
There are memory barriers and compiler barriers.
Memory barriers are not required on a single processor (I'm not sure if hyperthreading counts as multiple processors) but compiler barriers are - the compiler could re-order the code in different threads such that you fail.
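A minimal sketch of a compiler-only barrier in C11, assuming a toolchain that provides <stdatomic.h>. atomic_signal_fence constrains reordering by the compiler and, in practice on GCC and Clang, emits no hardware barrier instruction. The variable and function names are hypothetical.

```c
#include <stdatomic.h>

static int payload;          /* data being published             */
static volatile int ready;   /* flag checked by the other thread */

/* Write the payload, then set the flag. The fence asks the compiler
   not to move the payload store below the flag store. */
void publish(int value)
{
    payload = value;
    atomic_signal_fence(memory_order_seq_cst);  /* compiler barrier only */
    ready = 1;
}
```

On a uniprocessor this kind of fence is the part that still matters: it keeps the two stores in program order in the generated code, so a thread that preempts publish never sees ready set before payload is written.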
Memory barriers only need to be used for "global variables", because locals (on the stack) and registers are automatically saved when threads are switched.
It may be better to write code that works universally than to assume you will always be running on UP.

stack traces stop at the link register (lr)

Often I see ARM stack traces (read: Android NDK stack traces) that terminate with an lr pointer, like so:
#00 pc 001c6c20 /data/data/com.audia.dev.qt/lib/libQtGui.so
#01 lr 80a356cc /data/data/com.audia.dev.rta/lib/librta.so
I know that lr stands for link register on ARM and other architectures, and that it's a quick way to store a return address, but I don't understand why it always seems to store a useless address. In this example, 80a356cc cannot be mapped to any code using addr2line or gdb.
Is there any way to get more information? Why must the trace stop at the lr address anyway?
Stumbled on the answer finally. I just had to be more observant. Look at the following short stack trace and the information that comes after it:
#00 pc 000099d6 /system/lib/libandroid.so
#01 lr 80b6c17c /data/data/com.audia.dev.rta/lib/librta.so
code around pc:
a9d899b4 bf00bd0e 2102b507 aa016d43 28004798
a9d899c4 9801bfa8 bf00bd0e 460eb573 93004615
a9d899d4 6d842105 462b4632 200047a0 bf00bd7c
a9d899e4 b100b510 f7fe3808 2800edf4 f04fbf14
a9d899f4 200030ff bf00bd10 b097b5f0 4614af01
code around lr:
80b6c15c e51b3078 e5933038 e5932024 e51b302c
80b6c16c e1a00002 e3a01000 e3a02000 ebfeee5c
80b6c17c e1a03000 e50b303c e51b303c e1a03fa3
80b6c18c e6ef3073 e3530000 0a000005 e59f34fc
80b6c19c e08f3003 e1a00003 e51b103c ebfeebe6
Now the lr address is still a 80xxxxxx address that isn't useful to us.
The address it prints from the pc is 000099d6, but look at the next section, code around pc. The first column is a list of addresses (you can tell from the fact that it increments by 16 each time.) None of those addresses looks like the pc address, unless you chop off the first 16 bits. Then you'll notice that the a9d899d4 must correspond to 000099d4, and the code where the program stopped is two bytes in from that.
Android's stack trace seems to have "chopped off" the first 2 bytes of the pc address for me, but for whatever reason it does not do so for the address in the link register. Which brings us to the solution:
In short, I was able to chop off the first 16 bits from the 80b6c17c address to make it 0000c17c, and so far that has given me a valid code address every time that I can look up with gdb or addr2line. (edit: I've found it's actually usually the first 12 bits or first 3 hexadecimal digits. You can decide this for yourself by looking at the stack trace output like I described above.) I can confirm that it is the right code address as well. This has definitely made debugging a great bit easier!
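The masking described above can be sketched in C as follows. This only mirrors the "chop off the high bits" trick; the function name is made up, and how many bits to drop is something you have to read off the trace yourself.

```c
#include <stdint.h>

/* Clear the top `bits` bits of a 32-bit address, keeping the rest. */
uint32_t strip_high_bits(uint32_t addr, unsigned bits)
{
    if (bits == 0)
        return addr;
    if (bits >= 32)
        return 0;
    return addr & (UINT32_MAX >> bits);
}
```

Dropping 16 bits turns 0x80b6c17c into 0x0000c17c and 0xa9d899d4 into 0x000099d4, as in the trace above. Dropping only 12 bits gives 0x0006c17c. The more general approach would be to subtract the library's load address, if you know it.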
Do you have all debugging info (-g3) on?
GCC likes to use lr as a normal register. Remember that a non-leaf function looks like
push {lr}
; .. setup args here etc.
bl foo ; call a function foo
; .. work with function results
pop {pc}
Once lr has been pushed to the stack, the compiler can use it almost freely; lr will be overwritten only by function calls. So it's quite likely that lr holds some intermediate value.
This should be stated in the debugging information that the compiler generates, in order to let the debugger know it has to look at the stack value instead of lr.

Strange behaviour of ldr [pc, #value]

I was debugging some C++ code (WinCE 6 on an ARM platform)
and I found some behavior strange:
4277220C mov r3, #0x93, 30
42772210 str r3, [sp]
42772214 ldr r3, [pc, #0x69C]
42772218 ldr r2, [pc, #0x694]
4277221C mov r1, #0
42772220 ldr r0, [pc, #0x688]
The line 42772214 ldr r3, [pc, #0x69C] is used to get some constant from the .data section, at least I think so.
What is strange is that, according to the code, r3 should be loaded from address pc + 0x69C = 0x42772214 + 0x69C = 0x427728B0, but according to the memory contents it is loaded from 0x427728B8 (8 bytes further on). The same happens for the other ldr instructions.
Is it the fault of the debugger or of my understanding of ldr/pc?
Another thing I don't get: why is access to the .data section relative to the executing code? I find it a little strange.
And one more thing: I cannot find the syntax of the first mov instruction (could anyone point me to an opcode specification for the Thumb (1C2))?
Sorry for the layman's description, I'm just getting familiar with assembly.
This is correct. When pc is used for reading there is an 8-byte offset in ARM mode and 4-byte offset in Thumb mode.
From the ARM-ARM:
When an instruction reads the PC, the value read depends on which instruction set it comes from:
For an ARM instruction, the value read is the address of the instruction plus 8 bytes. Bits [1:0] of this value are always zero, because ARM instructions are always word-aligned.
For a Thumb instruction, the value read is the address of the instruction plus 4 bytes. Bit [0] of this value is always zero, because Thumb instructions are always halfword-aligned.
This way of reading the PC is primarily used for quick, position-independent addressing of nearby instructions and data, including position-independent branching within a program.
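The rule quoted above can be sketched as a small C helper that computes where a pc-relative ldr actually loads from. The function names are invented for illustration. The Thumb branch also word-aligns the pc value, which is how Thumb literal loads form their base address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value an instruction sees when it reads pc:
   its own address plus 8 (ARM) or plus 4 (Thumb). */
uint32_t pc_read_value(uint32_t insn_addr, bool thumb)
{
    return insn_addr + (thumb ? 4u : 8u);
}

/* Address accessed by "ldr Rt, [pc, #imm]" located at insn_addr. */
uint32_t ldr_literal_address(uint32_t insn_addr, uint32_t imm, bool thumb)
{
    uint32_t base = pc_read_value(insn_addr, thumb);
    if (thumb)
        base &= ~3u;   /* Thumb literal loads use the word-aligned pc */
    return base + imm;
}
```

For the ARM-mode instructions in the question, ldr_literal_address(0x42772214, 0x69C, false) gives 0x427728B8, which matches what the debugger showed. The other two loads resolve to 0x427728B4 and 0x427728B0, i.e. three consecutive words in the same constant pool.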
There are 2 reasons for pc-relative addressing.
1. Position-independent code, which is your case.
2. Loading complicated constants that cannot be encoded in a single instruction. For example, mov r3, #0x12345678 cannot be encoded as one instruction, so the compiler may put the constant at the end of the function and use e.g. ldr r3, [pc, #0x50] to load it instead.
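I'm not sure what mov r3, #0x93, 30 means. It is probably the 8-bit immediate 0x93 rotated right by 30 bits, since ARM data-processing immediates are encoded as an 8-bit value with an even right rotation; that would give 0x24C. (A rotate left by 30 would give 0xC0000024 instead.)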
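To check the arithmetic for both readings of mov r3, #0x93, 30, here is a small C sketch. It assumes the usual ARM rule that a data-processing immediate is an 8-bit value rotated right by an even amount. The function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n taken modulo 32). */
uint32_t ror32(uint32_t value, unsigned n)
{
    n &= 31u;
    return (value >> n) | (value << ((32u - n) & 31u));
}

/* Value of an ARM data-processing immediate: imm8 rotated right by rot. */
uint32_t arm_rotated_immediate(uint8_t imm8, unsigned rot)
{
    return ror32(imm8, rot);
}
```

arm_rotated_immediate(0x93, 30) evaluates to 0x24C, while a rotate left by 30 (the same as ror32(0x93, 2)) evaluates to 0xC0000024.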