Does function stack save the return address? - c++

I'm learning about the function stack at the assembly level on x86 Linux.
I've read an article which said that a function's stack frame (the callee's) saves the return address, i.e. the point in the caller where execution continues when the function returns.
This is why there is a kind of attack called stack smashing.
Stack smashing means that if we can overflow a function's stack frame, and in particular overwrite the return address with a chosen address, the program will execute the attacker's code.
However, today I tried to use gdb to inspect the simple C++ program below, and I can't find a saved return address in any function's stack frame.
Here is the code:
void func(int x)
{
    int a = x;
    int b = 0; // set a breakpoint
}

int main()
{
    func(10); // set a breakpoint
    return 0;
}
Then I use gdb to get its assembly:
main:
func:
Now we can see that there is no return address being saved in the two function stacks (at least this is my view).
If a hacker wants to hack this program with stack smashing, which address in the function stack will be edited illegally by him?

Yes there is. Examine the stack immediately before the callq and immediately after it. You will find that the address of the instruction following the callq now appears on the top of the stack and RSP has decremented by 8.
The callq instruction causes the address of the following instruction to be pushed onto the stack. Inversely, the retq instruction at the end of the function causes the address on the stack to be popped into the RIP.

Now we can see that there is no return address being saved in the two function stacks (at least this is my view).
What you are actually showing is the disassembled code, not the stack.
The return address is pushed on the stack by the caller by means of the callq instruction. At the moment of entering the callee function it is at the top of the stack, i.e.: at that moment, rsp contains the address where the return address is stored.
Inspecting the stack with GDB
p/x $rsp displays the value of the rsp register, i.e.: the address of the top of the stack, since rsp points to the top of the stack.
x/x $rsp displays the memory contents located at the top of stack (i.e.: the contents located at the address pointed by rsp).
With this information in mind you can run the command x/x $rsp at the moment of entering the callee function (before anything else is pushed onto the stack) to obtain the return address.
You can also use the command info frame to inspect the current stack frame. The displayed field with the name saved rip corresponds to the current function's return address. However, you need to run this command after the stack frame for the current function has been created and before it is destroyed (i.e.: after mov %rsp,%rbp but before pop %rbp inside the callee).
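As a cross-check on what GDB reports, GCC and Clang expose the saved return address through the builtin `__builtin_return_address(0)`. The sketch below is only an illustration under that assumption (it is a compiler extension, not standard C++), and the helper names `where_do_i_return` and `call_site` are made up here; they are not part of the program in the question.

```cpp
#include <cstdio>

// Hypothetical helper: returns the address this function will return to.
// __builtin_return_address is a GCC/Clang extension, not standard C++.
__attribute__((noinline)) void* where_do_i_return()
{
    return __builtin_return_address(0);
}

// Hypothetical caller: the pointer printed here should correspond to the
// "saved rip" that `info frame` shows while stopped inside
// where_do_i_return().
__attribute__((noinline)) void* call_site()
{
    void* ret = where_do_i_return();
    std::printf("return address seen by callee: %p\n", ret);
    return ret;
}
```

Calling `call_site()` from `main` and comparing the printed pointer with the output of `x/x $rsp` at the callee's first instruction is one way to convince yourself that the value really lives on the stack.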

Related

segmentation fault after linking c++ file with asm file [duplicate]

I am currently learning x86 assembly. Something is not clear to me still however when using the stack for function calls. I understand that the call instruction will involve pushing the return address on the stack and then load the program counter with the address of the function to call. The ret instruction will load this address back to the program counter.
My confusion is, does it matter when the ret instruction is called within the procedure/function? Will it always find the correct return address stored on the stack, or must the stack pointer be currently pointing to where the return address was stored? If that's the case, can't we just use push and pop instead of call and ret?
For example, the code below could be the first thing executed on entering the function. If we push different registers onto the stack, must the ret instruction only be executed after the registers are popped in reverse order, so that after the pop %ebp instruction the stack pointer points to the place on the stack where the return address is? Or will ret still find it regardless of where it is executed? Thanks in advance.
push %ebp
mov %esp, %ebp
//push other registers
...
//pop other registers
mov %ebp, %esp
(could ret instruction go here for example and still pop the correct return address?)
pop %ebp
ret
You must leave the stack and non-volatile registers as you found them. The calling function has no clue what you might have done with them otherwise - the calling function will simply continue to its next instruction after ret. Only ret after you're done cleaning up.
ret will always look to the top of the stack for its return address and will pop it into EIP. If the ret is a "far" return then it will also pop the code segment into the CS register (which would also have been pushed by call for a "far" call). Since these are the first things pushed by call, they must be the last things popped by ret. Otherwise you'll end up reting somewhere undefined.
The CPU has no idea what a function is. The ret instruction fetches the value from the memory pointed to by esp and jumps there. For example, you can do things like this (to illustrate that the CPU is not interested in how you structurally organize your source code):
; slow alternative to "jmp continue_there_address"
push continue_there_address
ret
continue_there_address:
...
Also, you don't need to restore the registers from the stack (or restore them into the original registers). As long as esp points to the return address when ret is executed, that address will be used:
call SomeFunction
...
SomeFunction:
push eax
push ebx
push ecx
add esp,8 ; forget about last 2 push
pop ecx ; ecx = original eax
ret ; returns back after call
If your function should be callable from other parts of the code, you will still want to save and restore registers as required by the calling convention of the platform you are programming for, so that from the caller's point of view you do not modify a register whose value should be preserved. But none of that matters to the CPU when it executes ret: it just loads the value from the stack ([esp]) and jumps there.
Also, when the return address is stored on the stack, it does not differ in any way from other values pushed there. All of them are just values written to memory, so ret has no way to find the "return address" on the stack and skip other values. To the CPU every 32-bit value in memory looks the same. Whether it was stored by call, push, mov, or something else doesn't matter; the origin of a value is not recorded, only the value.
If that's the case, can't we just use push and pop instead of call and ret?
You can certainly push a preferred return address onto the stack (my first example). But you can't do pop eip; there's no such instruction. That is effectively what ret does, but no x86 assembly programmer uses that mnemonic, and the opcode differs from the other pop instructions. You can of course pop the return address into a different register, like eax, and then do jmp eax, to get a slow ret alternative (which also modifies eax).
That said, complex modern x86 CPUs do keep track of call/ret pairings (to predict where the next ret will return, so they can prefetch the code ahead of time). If you use one of those non-standard alternatives, at some point the CPU will realize its return-address prediction is out of sync with the real state, and it will have to drop the speculatively fetched work and re-fetch from the real eip value, so you may pay a performance penalty for confusing it.
In the example code, if the return was done before pop %ebp, it would attempt to return to the "address" that was in ebp at the start of the function, which would be the wrong address to return to.
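The point that ret simply pops whatever is on top of the stack can be sketched with a toy model. This is not real x86 emulation: `ToyCpu`, `eip_after_return`, and all addresses are made up for illustration, and the stack is a plain vector whose last element stands in for `[esp]`.

```cpp
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack: back() plays the role of the value at [esp].
struct ToyCpu {
    std::vector<std::uint32_t> stack;
    std::uint32_t eip = 0;
    std::uint32_t ebp = 0;

    void push(std::uint32_t v) { stack.push_back(v); }
    std::uint32_t pop()
    {
        std::uint32_t v = stack.back();
        stack.pop_back();
        return v;
    }
    // call: push the address of the next instruction, then jump.
    void call(std::uint32_t target, std::uint32_t next)
    {
        push(next);
        eip = target;
    }
    // ret: pop whatever is on top into eip; nothing checks that it is
    // actually a return address.
    void ret() { eip = pop(); }
};

// Returns the eip reached when ret runs before (early == true) or after
// (early == false) the saved ebp has been popped.
std::uint32_t eip_after_return(bool early)
{
    ToyCpu cpu;
    cpu.ebp = 0xBFFF0010;              // caller's frame pointer (made up)
    cpu.call(0x08048400, 0x08048123);  // made-up target and return address
    cpu.push(cpu.ebp);                 // push %ebp
    if (!early) {
        cpu.ebp = cpu.pop();           // pop %ebp
    }
    cpu.ret();
    return cpu.eip;
}
```

With `early == false` the model lands on the made-up return address `0x08048123`; with `early == true` it "returns" to `0xBFFF0010`, the saved frame pointer, which is the failure described above.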

gdb cannot access memory at value-type variable

When trying to debug a core thrown by a seg-fault, the line it crashes on doesn't really make sense to me: two integers are compared and the result is stored in a bool. This is the unsimplified code:
bool doLog = level >= debugLevel;
This is the assembly-code where it crashes:
cmp %ebx,0x14(%rbp)
// ebx = 3
// rbp = 0x6e696c7265
However when trying to print the value of the address stored in rbp I get a gdb error: "cannot access memory at address 0x6e696c7279"
What bugs me is that when printing the address of debugLevel I get a different address than what is stored in the rbp register used for cmp:
p &debugLevel => 0x6e696c7279
i r rbp => 0x6e696c7265
0x6e696c7265 looks like ASCII codes for letters. You probably overwrote a pointer with string bytes.
(e.g. maybe a buffer overflow stepped on a saved RBP value, and then the function returned to its caller after restoring RBP, breaking access to locals when the caller tries to use RBP as a frame pointer. Actually, RBP+0x14 wouldn't be a local relative to a frame pointer, unless maybe this is on Windows and the compiler allocated that local in the shadow space above the return address.)
printing the address of debugLevel I'll get a different address then what is stored in the rbp register used for cmp
GDB knows from debug info that &debugLevel = RBP+0x14.
That's why the cmp instruction uses an addressing mode with a displacement of 0x14, specifically 0x14(%rbp). So of course calculating &debugLevel from a corrupted base address will give you another bad address.
0x6e696c7279 - 0x6e696c7265 = 0x14 = 20. This part is not interesting or related to your memory-corruption bug.
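To check the "looks like ASCII" observation, the register value can be read byte by byte in little-endian order, which is how x86 lays it out in memory. A minimal sketch; the helper name `ascii_bytes` is made up for this illustration:

```cpp
#include <cstdint>
#include <string>

// Reads the bytes of v in little-endian (x86 memory) order and returns the
// leading run of printable ASCII characters; stops at the first byte that
// is not printable.
std::string ascii_bytes(std::uint64_t v)
{
    std::string out;
    for (int i = 0; i < 8; ++i) {
        unsigned char c = static_cast<unsigned char>(v >> (8 * i));
        if (c < 0x20 || c > 0x7e) {
            break;
        }
        out.push_back(static_cast<char>(c));
    }
    return out;
}
```

For the value in rbp, `ascii_bytes(0x6e696c7265)` yields "erlin", which supports the guess that string data was written over a saved pointer.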

What is the significance of the LEA instruction "=> 0xb48daed9 <+3479>: lea -0xc(%ebp),%esp"?

Could anyone tell me what is the significance of this assembly instruction:
0xb48daed9 <+3479>: lea -0xc(%ebp),%esp
I am not very comfortable with Assembly instructions. Actually I am getting a SIGABRT in my application and the culprit, it seems, is this particular assembly instruction.
On the mechanical level, the instruction
lea -0xc(%ebp),%esp
adds -0xc (that is: -12) to %ebp and writes the result to %esp.
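On the logical level, it places the stack pointer 12 bytes below the frame base. At function entry that allocates the called function's stack frame, in a context similar to this (compilers also emit the same instruction in an epilogue, to discard locals before popping three saved registers):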
push %ebp ; save previous base pointer
mov %esp,%ebp ; set %ebp = %esp: old stack pointer is new base pointer
lea -0xc(%ebp),%esp ; allocate 12 bytes for local variables
%ebp and %esp are the stack pointer registers. %ebp points to the base of the stack frame and %esp to its "top" (actually the bottom because the stack grows downward), so the lea instruction moves the stack pointer 12 bytes below the base, staking a claim of 12 bytes for local variables. Doing this after saving the old base pointer and setting the new base pointer to the old stack pointer pushes a new frame of 12 bytes onto the call stack.
It seems unlikely that this instruction itself causes a trap, but in the event of a stack overflow, the allocated stack frame will be invalid and explosions are expected when trying to use it. My suspicion is that you have a runaway recursive function.
Another possibility, as @abligh mentions, is that the stack pointer became corrupted somewhere along the line. This can happen, among other things, if a buffer overflow happens in a stack-allocated buffer so that a previously saved base pointer is overwritten with garbage. Upon return from the function, the garbage is restored in lieu of the overwritten base pointer, and a subsequent function call will not have anything sensible with which to work.
lea -0xc(%ebp),%esp will:
compute the effective address [1] of %ebp - 12, and
store it in %esp
It is also commonly used to perform fast arithmetic through the addressing-mode syntax, since it computes the address without accessing memory. According to the Intel manual, it may throw an exception if the source operand is not a memory location.
[1] "Effective address", in Intel's parlance, is an offset which is supplied either as a static value or an address computation of the form: Offset = Base + (Index * Scale) + Displacement
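The formula in the footnote can be written out directly. The sketch below assumes 32-bit wraparound arithmetic, as on IA-32; the function names are made up for illustration.

```cpp
#include <cstdint>

// Offset = Base + (Index * Scale) + Displacement, computed modulo 2^32.
constexpr std::uint32_t effective_address(std::uint32_t base,
                                          std::uint32_t index,
                                          std::uint32_t scale,
                                          std::int32_t displacement)
{
    return base + index * scale + static_cast<std::uint32_t>(displacement);
}

// lea -0xc(%ebp),%esp has no index register, so the result is ebp - 12.
constexpr std::uint32_t lea_minus_0xc(std::uint32_t ebp)
{
    return effective_address(ebp, 0, 1, -0xc);
}
```

For example, with a made-up `ebp` of `0xbffff0a8`, `lea_minus_0xc` gives `0xbffff09c`, which is the value the instruction writes to `%esp`.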

Function Prologue and Epilogue in C

I know data in nested function calls goes on the stack. The stack itself follows a step-by-step method for storing and retrieving data as functions get called or return. These methods are commonly known as the prologue and epilogue.
I tried with no success to find material on this topic. Do you know any resource (site, video, article) about how function prologues and epilogues generally work in C? Or if you can explain, that would be even better.
P.S : I just want some general view, not too detailed.
There are lots of resources out there that explain this:
Function prologue (Wikipedia)
x86 Disassembly/Calling Conventions (WikiBooks)
Considerations for Writing Prolog/Epilog Code (MSDN)
to name a few.
Basically, as you somewhat described, "the stack" serves several purposes in the execution of a program:
Keeping track of where to return to, when calling a function
Storage of local variables in the context of a function call
Passing arguments from calling function to callee.
The prologue is what happens at the beginning of a function. Its responsibility is to set up the stack frame of the called function. The epilogue is the exact opposite: it is what happens last in a function, and its purpose is to restore the stack frame of the calling (parent) function.
In IA-32 (x86) cdecl, the ebp register is used by the language to keep track of the function's stack frame. The esp register is used by the processor to point to the most recent addition (the top value) on the stack. (In optimized code, using ebp as a frame pointer is optional; other ways of unwinding the stack for exceptions are possible, so there's no actual requirement to spend instructions setting it up.)
The call instruction does two things: First it pushes the return address onto the stack, then it jumps to the function being called. Immediately after the call, esp points to the return address on the stack. (So on function entry, things are set up so a ret could execute to pop that return address back into EIP. The prologue points ESP somewhere else, which is part of why we need an epilogue.)
Then the prologue is executed:
push ebp ; Save the stack-frame base pointer (of the calling function).
mov ebp, esp ; Set the stack-frame base pointer to be the current
; location on the stack.
sub esp, N ; Grow the stack by N bytes to reserve space for local variables
At this point, we have:
...
ebp + 4: Return address
ebp + 0: Calling function's old ebp value
ebp - 4: (local variables)
...
The epilog:
mov esp, ebp ; Put the stack pointer back where it was when this function
; was called.
pop ebp ; Restore the calling function's stack frame.
ret ; Return to the calling function.
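The layout and the prologue/epilogue steps above can be traced with a toy model of a downward-growing stack. This is a sketch, not an emulator: `ToyStack` and every address in it are made up for illustration, and memory is a sparse map from address to 32-bit value.

```cpp
#include <cstdint>
#include <map>

// Toy 32-bit machine with a downward-growing stack.
struct ToyStack {
    std::map<std::uint32_t, std::uint32_t> mem;  // address -> 32-bit value
    std::uint32_t esp = 0x1000;  // made-up initial stack pointer
    std::uint32_t ebp = 0x2000;  // made-up frame pointer of the caller
    std::uint32_t eip = 0;

    void push(std::uint32_t v) { esp -= 4; mem[esp] = v; }
    std::uint32_t pop()
    {
        std::uint32_t v = mem[esp];
        esp += 4;
        return v;
    }

    // call: only the push half is modeled; the jump itself is omitted.
    void call(std::uint32_t return_address) { push(return_address); }

    // push ebp / mov ebp, esp / sub esp, n
    void prologue(std::uint32_t n)
    {
        push(ebp);
        ebp = esp;
        esp -= n;
    }

    // mov esp, ebp / pop ebp / ret
    void epilogue()
    {
        esp = ebp;
        ebp = pop();
        eip = pop();
    }
};
```

After `call` and `prologue(16)`, `mem[ebp + 4]` holds the return address and `mem[ebp]` holds the caller's old `ebp`, matching the diagram; after `epilogue()` all three registers are back where the caller expects them.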
C Function Call Conventions and the Stack explains well the concept of a call stack
Function prologue briefly explains the assembly code and the hows and whys.
The gen on function perilogues
I am quite late to the party & I am sure that in the last 7 years since the question was asked, you'd have gotten a way clearer understanding of things, that is of course if you chose to pursue the question any further. However, I thought I would still give a shot at especially the why part of the prolog & the epilog.
Also, the accepted answer elegantly & quite simply explains the how of the epilog & the prolog, with good references. I only intend to supplement that answer with the why (at least the logical why) part.
I will quote the below from the accepted answer & try to extend its explanation.
In IA-32 (x86) cdecl, the ebp register is used by the language to keep
track of the function's stack frame. The esp register is used by the
processor to point to the most recent addition (the top value) on the
stack.
The call instruction does two things: First it pushes the return
address onto the stack, then it jumps to the function being called.
Immediately after the call, esp points to the return address on the
stack.
The last line in the quote above says immediately after the call, esp points to the return address on the stack.
Why's that?
So let's say that our code that's getting currently executed has the following situation, as shown in the (really badly drawn) diagram below
So our next instruction to be executed is, say at the address 2. This is where the EIP is pointing. The current instruction has a function call (that would internally translate to the assembly call instruction).
Now ideally, because the EIP is pointing to the very next instruction, that would indeed be the next instruction to get executed. But since there's sort of a diversion from the current execution flow path, (that is now expected because of the call) the EIP's value would change. Why? Because now another instruction, that may be somewhere else, say at the address 1234 (or whatever), may need to get executed. But in order to complete the execution flow of the program as was intended by the programmer, after the diversion activities are done, the control must return back to the address 2 as that is what should have been executed next should the diversion have not happened. Let us call this address 2 as the return address in the context of the call that is being made.
Problem 1
So, before the diversion actually happens, the return address, 2, would need to be stored somewhere temporarily.
There could have been many choices of storing it in any of the available registers, or some memory location etc. But for (I believe good reason) it was decided that the return address would be stored onto the stack.
So what needs to be done now is increment the ESP (the stack pointer) such that the top of the stack now points at the next address on the stack. So TOS' (TOS before the increment) which was pointing to the address, say 292, now gets incremented & starts pointing to the address 293. That is where we put our return address 2. So something like this:
So it looks like now we have achieved our goal of temporarily storing the return address somewhere. We should now just go about making the diversion call. And we could. But there's a small problem. During the execution of the called function, the stack pointer, along with the other register values, could be manipulated multiple times.
Problem 2
So, although the return address of ours, is still stored on the stack, at location 293, after the called function finishes off executing, how would the execution flow know that it should now goto 293 & that's where it would find the return address?
So (I believe for good reason again) one of the ways of solving the above problem could be to store the stack address 293 (where the return address is) in a (designated) register called EBP. But then what about the contents of EBP? Would that not be overwritten? Sure, that's a valid point. So let's store the current contents of EBP on to the stack & then store this stack address into EBP. Something like this:
The stack pointer is incremented. The current value of EBP (denoted as EBP'), which is say xxx, is stored onto the top of the stack, i.e. at the address 294. Now that we have taken a backup of the current contents of EBP, we can safely put any other value onto the EBP. So we put the current address of the top of the stack, that is the address 294, in EBP.
With the above strategy in place, we solve for the Problem 2 discussed above. How? So now when the execution flow wants to know where from should it fetch the return address, it would :
first get the value from EBP out and point the ESP to that value. In our case, this would make TOS (top of stack) point to the address 294 (since that is what is stored in EBP).
Then it would restore the previous value of EBP. To do this it would simply take the value at 294 (the TOS), which is xxx (which was actually the older value of EBP), & put it back to EBP.
Then it would decrement the stack pointer to go to the next lower address in the stack which is 293 in our case. Thus finally reaching 293 (see that's what our problem 2 was). That's where it would find the return address, which is 2.
It will finally pop this 2 out into the EIP, that's the instruction that should have ideally been executed should the diversion have not happened, remember.
And the steps that we just saw being performed, with all the jugglery, to store the return address temporarily & then retrieve it are exactly what gets done by the call instruction together with the function prolog (at function entry) & the epilog (before the function's ret). The how was already answered; we just answered the why as well.
Just an end note: For the sake of brevity, I have not taken care of the fact that the stack addresses may grow the other way round.
Functions that use a frame pointer typically share the same prologue (the start of the function's code) and epilogue (the end of the function).
Prologue: The structure of the prologue looks like:
push ebp
mov ebp,esp
Epilogue: The structure of the epilogue looks like:
leave
ret
More in detail : what is Prologue and Epilogue

MIPS core dump with ra and pc equal 0000000

I'm getting intermittent core dumps in one of our processes.
All of the threads' stacks, aside from the one which crashed, seem OK, and parsed correctly.
The thread that crashes has an apparently corrupted call stack.
The stack has two frames, both of them 0x00000000.
Looking on the registers, both PC and RA are 0 (which explains the call stack...)
The cause register is 00800008.
(1) Is there a way I can get more information on the crashed thread?
(2) How come the registers themselves are corrupted? (Or is it the other way around, in core dump the debugger fills these registers based on the stack?)
Thanks!
To answer (2) first -- because understanding what actually happened is important for finding out more information about the root cause of the crash:
It really is the registers themselves, in the machine at runtime, that are 0; but it's not that the registers themselves got corrupted; rather, memory got corrupted, and that corrupted memory then got copied back into the registers, which finally caused the crash.
What's happening is something like this: the stack becomes corrupted, including (a) specifically the RA, while it is stored on the stack memory, gets zeroed out. Then, when the function is ready to return, it (b) restores the RA register from the stack -- so the RA register is now 0 -- and then (c) jump-returns to the RA, thus setting the PC to also point to 0; the next instruction will then cause a crash, while both the RA and PC are 0.
That business about the RA being stored on the stack and then restored from it is explained, for example, at http://logos.cs.uic.edu/366/notes/mips%20quick%20tutorial.htm (emphasis mine):
return address stored in register $ra; if subroutine will call other subroutines, or is
recursive, return address should be copied from $ra onto stack to preserve it,
since jal always places return address in this register and hence will overwrite
previous value.
Here's an example program which crashes with PC and RA both 0, and which illustrates the above sequence nicely (the exact numbers may have to be tweaked, depending on the system):
#include <string.h>

int bar(void)
{
    char buf[10] = "ABCDEFGHI";
    memset(buf, 0, 50);
    return 0;
}

int foo(void)
{
    return bar();
}

int main(int argc, char *argv[])
{
    return foo();
}
And if we look at the disassembly of foo():
(gdb) disas foo
Dump of assembler code for function foo:
0x00400408 <+0>: addiu sp,sp,-32
0x0040040c <+4>: sw ra,28(sp)
0x00400410 <+8>: sw s8,24(sp)
0x00400414 <+12>: move s8,sp
0x00400418 <+16>: jal 0x4003a0 <bar>
0x0040041c <+20>: nop
0x00400420 <+24>: move sp,s8
0x00400424 <+28>: lw ra,28(sp)
0x00400428 <+32>: lw s8,24(sp)
0x0040042c <+36>: addiu sp,sp,32
0x00400430 <+40>: jr ra
0x00400434 <+44>: nop
End of assembler dump.
we see very nicely that RA gets stored on the stack at the beginning of the function (<+4> sw ra,28(sp)) and then is restored at the end (<+28> lw ra,28(sp)) and then jump-returned to (<+40> jr ra). I showed foo() because it's shorter, but the exact same structure is true for bar() -- except that in bar() there is also the memset() in the middle, which overwrites RA while it is on the stack (it's writing 50 bytes into an array of size 10); and then what gets restored into the register is 0, ultimately causing the crash.
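The same sequence can be modeled without undefined behavior by treating the frame as an array of bytes. The layout here is an assumption made for illustration: a 10-byte buffer at offset 0 and the saved RA at offset 28, loosely following the `sw ra,28(sp)` seen in foo(); the real layout of bar() may differ. `ToyFrame` and `ra_after_memset` are made-up names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Made-up frame layout: buffer at offset 0, saved RA at offset 28.
struct ToyFrame {
    // Frame plus slack, so the model itself never writes out of bounds.
    std::array<unsigned char, 64> bytes{};
    static constexpr std::size_t buf_offset = 0;
    static constexpr std::size_t ra_offset = 28;

    // Models "sw ra,28(sp)".
    void save_ra(std::uint32_t ra)
    {
        std::memcpy(&bytes[ra_offset], &ra, sizeof ra);
    }
    // Models "lw ra,28(sp)".
    std::uint32_t restore_ra() const
    {
        std::uint32_t ra;
        std::memcpy(&ra, &bytes[ra_offset], sizeof ra);
        return ra;
    }
    // Models memset(buf, 0, n) with no check of n against the buffer size.
    // n must not exceed 64 in this model.
    void zero_buf(std::size_t n)
    {
        std::memset(&bytes[buf_offset], 0, n);
    }
};

// Returns the RA that would be restored after zeroing n bytes from buf.
std::uint32_t ra_after_memset(std::size_t n)
{
    ToyFrame frame;
    frame.save_ra(0x0040041c);  // return address taken from the dump above
    frame.zero_buf(n);
    return frame.restore_ra();
}
```

In this model `ra_after_memset(10)` leaves the saved RA intact, while `ra_after_memset(50)` returns 0, the value that then ends up in both RA and PC.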
So, now we understand that the root cause of the crash is some kind of stack corruption, which gets us back to question (1): is there any way to get more information about the crashed thread?
Well, this is a bit more difficult, and is where debugging becomes more of an art than a science, but here are the principles to keep in mind:
The basic idea is to figure out what is causing the stack corruption -- most likely, it is a write to some local buffer, as in the example above.
Try to zero in as much as possible on where in the flow the corruption is occurring. Logging can help a lot here: the last log you see obviously happened before the crash (though not necessarily before the corruption!) -- add more logging in the suspect area to zero in on the crash location. Of course, if you have access to a debugger, you can also step through the code to figure out where it's crashing.
Once you find the crash location, it's much easier to work backwards from there: first of all, before the crash, the PC is not yet set to 0, and therefore you should be able to see a backtrace (though, note that the backtrace itself is "calculated" using the values stored on the stack -- once they are corrupted, the backtrace can't be calculated beyond the corruption. But this is actually helpful in this case: this can tell you quite precisely where in memory the corruption is: the point at which the backtrace is truncated is the RA (on the stack) which got corrupted.)
Once you have found what is being corrupted, but you still don't know what is causing the corruption, use watchpoints: as soon as you enter the function which places the RA that is ultimately overwritten on the stack, set a watchpoint on it. That should cause a break as soon as the corruption occurs...
Hope this helps!