Stack and base pointers - c++

Hi, I want to ask how the EBP and ESP registers are initialized and updated during program execution.
I have an image below and the author explains that when entering a new function, first the arguments are pushed onto the stack shown in the yellow region, then the base pointer EBP is updated and the base pointer of the previous function stored, followed by stack allocating more resources for local variables and return addresses.
I want to know wouldn't it be easier to simply update the base pointer when entering the new function, then allocate memory for the previous function's base pointer register address, followed by the arguments, local variables and return value? Instead of having the EBP in the middle of a stack frame for a function.
My other question is where exactly the return address points in the previous function: at the end of the previous function, or at its start?
My last question is what the significance is of storing the previous function's base pointer if you already store the return address into the previous function.
The idea I get from pointers is that you essentially just want to use them as a reference for accessing resources local to a function, and it seems you could do that with the return address.
Thanks

To return from a function you need to restore (at least) two things.
One is where the instruction pointer is: where the next instruction you are going to run is located.
The other is where the stack pointer is: where the data you are operating on is.
(Depending on the calling convention, registers may also have to be reset, but I don't know the details here.)
The return address -- the instruction pointer -- points to where the code should continue executing when returned from the call. This isn't usually either the start or end of the calling function, but rather the "next instruction" after the call. (But not really; it points to the code that handles the return from the function, which in some calling conventions does real work).
The code calling a function has to update the stack pointer so that the called function knows where its on-stack arguments are, and provide enough information for the called function to return.
The called function has no need to do that internally. It (well, the compiler generating it) knows what variables it has put on the stack, and how big they are. The set of variables on the stack may vary over time during the body of the function. So updating the stack pointer is pointless, and would even make offset calculations a bit more complex (as they keep on changing, instead of being constant for a given "variable") and confusing for an assembly writer.
Code that accesses the stack does so with an offset regardless. Not changing the stack pointer is doing less work than changing the stack pointer, and doing less work usually is faster than doing more work.
So you leave the stack pointer alone, unless you need to change it.
The return address is useless for accessing mutable data. It usually points at read-only machine code, not at mutable state on the stack.
Now, there are plenty of calling conventions in the wild. Many of them do slightly different things. Which arguments go on the stack, which are stored in registers, what kind of arguments go where, and who resets the stack pointer (the caller or the callee) can vary, among other things.
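To make the "next instruction after the call" point concrete, here is a minimal sketch of a toy machine. ToyCpu, its fields and the instruction numbering are all made up for illustration; real hardware and real calling conventions do more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not real hardware): instructions are numbered,
// and the stack holds plain integers.
struct ToyCpu {
    std::size_t ip = 0;               // index of the instruction being executed
    std::vector<std::size_t> stack;   // back() plays the role of the stack top

    // "call": remember where to resume, then jump to the callee.
    void call(std::size_t target) {
        stack.push_back(ip + 1);      // return address = instruction after the call
        ip = target;
    }

    // "ret": pop the saved address back into the instruction pointer.
    void ret() {
        assert(!stack.empty());
        ip = stack.back();
        stack.pop_back();
    }
};
```

If the call sits at instruction 10, the saved value is 11, so after ret() execution continues right after the call and the stack is back to the size it had before.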

Related

When exactly does a called function end?

From a textbook I was reading, it is said that for functions with a return value, the return value is used to initialize a temporary at the call site.
What I want to know is whether that happens before or after the called function exits. In other words, if I have a variable (say, an int) defined inside the called function, will it get destroyed before or after the temporary is initialized?
This is the technical view. This might not be interesting to language-lawyers, but for other people, it might be instructive.
A function ends precisely with the branch instruction that transfers control back to the caller. Before this point, the callee has full control; after this point, the caller has full control over what the CPU does. Generally, there is precisely one CPU instruction that is used for this control transfer. On x86-64, that's the ret instruction; PowerPC uses blr, and other CPUs have other names for the same thing. The name does not matter, though.
Anything that's within the responsibility of the function itself must happen before this instruction; anything that's none of the callee's business happens somewhere else.
As the caller does not know which variables a function creates, it cannot be the caller's responsibility to destruct them. More generally, the callee has to release any stack space that it allocated for its own purposes. As such, the callee must perform such cleanup itself before exiting by issuing the ret instruction. This means that any local variables must vanish before the function exits.
Things are a bit more complicated when it comes to returning a result from a function: This requires both caller and callee to collaborate. The details differ between the different calling conventions, but there are generally two cases:
The return value is passed in a register.
In this case, the callee will load the return value into a well-known register, and the caller will use that same register to access the return value.
The return value is passed on the stack.
In this case, the callee will place the data that is to be returned at a defined position within the caller's stack frame, and the caller will examine that same memory region for the function's result after the call has returned.
TL;DR: If something is a function's responsibility, it must happen before the function returns (= executes the ret instruction). Freeing stack space and returning data to the caller are such responsibilities.
Before
For functions returning by value, the temporary at the call site will be initialized before the scope of the function exits. Otherwise, any return value would be destroyed before it could be passed to the call site.
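The ordering can be observed with a small experiment. This is only a sketch: Local, Result, callee, run and the events log are names invented for the illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
std::vector<std::string> events;

struct Local {
    ~Local() { events.push_back("local destroyed"); }
};

struct Result {
    int value;
    explicit Result(int v) : value(v) { events.push_back("result initialized"); }
};

Result callee() {
    Local local;          // automatic variable inside the callee
    return Result(42);    // the result object is initialized first...
}                         // ...then `local` is destroyed, then control returns

std::vector<std::string> run() {
    events.clear();
    Result r = callee();
    assert(r.value == 42);
    events.push_back("back in caller");
    return events;
}
```

Calling run() should record "result initialized", then "local destroyed", then "back in caller", which matches the claim that the return value is set up before the callee's locals go away.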
The function ends after a return statement.
Variables declared in that scope go out of scope and can no longer be used; only a variable declared static keeps its storage and value across calls.

How does Stack memory work Or How are function variables allocated and accessed on the stack

When I read about Stack and Heap for example on this page,
I had one question: if, like in the example given on the page, a function puts all its local variables on the stack, how does it actually access the different variables?
Because a stack normally only can access the top, it would only be able to access ONE variable of the function.
Does this imply that variables of a function are stored in a struct on the stack?
The stack pointer is, like its name implies, a pointer like any other, and it points to normal standard memory. To access any area of the stack you simply add an offset to the pointer.
If you think of it in terms of C pointers, you have the stack pointer
char *stack_pointer = some_memory;
This pointer can then be used as a normal pointer, including adding offsets to access specific locations on the stack, like e.g.
*(int *)(stack_pointer + 4) = 5;
I recommend you try to learn assembler code, then you can make a very simple program, with a few local variables, and compile it to assembler code and read it to see exactly how it works.
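The two C lines above can be turned into something you can compile and poke at. This is a rough sketch: an ordinary byte buffer stands in for the stack, store_int and load_int are hypothetical helpers, and memcpy is used instead of the pointer cast to stay clear of alignment and aliasing problems.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Store an int at stack_pointer + offset (the caller supplies the buffer).
void store_int(unsigned char *stack_pointer, std::size_t offset, int value) {
    assert(stack_pointer != nullptr);
    std::memcpy(stack_pointer + offset, &value, sizeof value);
}

// Load an int back from stack_pointer + offset.
int load_int(const unsigned char *stack_pointer, std::size_t offset) {
    assert(stack_pointer != nullptr);
    int value;
    std::memcpy(&value, stack_pointer + offset, sizeof value);
    return value;
}
```

With a 64-byte array as some_memory, storing 5 at offset sizeof(int) and 7 at offset 2 * sizeof(int) and loading them back in any order gives the same values, so nothing restricts access to the "top" only.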
There is often confusion between stack semantics and the stack region (or storage area). The same goes for the heap. As well, the proliferation of "stack-based virtual machines" like the JVM and CLR misleads non-C and C++ programmers into thinking the native runtime stack works the same way.
It is important to differentiate:
Semantics vs Region - One doesn't mean the other. The C and C++ stack is not the Java / CLR stack.
"stack-based call frames" from just "call frames" - Call frames don't have to be stacked
Stacks on most architectures provide random access semantics of O(1). The common example is the immediate and base+offset addressing modes and the stack and base pointers in x86.
The actual stack area is allocated in a LIFO fashion, but the individual variables are random accessible, O(1). If you wanted the stack to be gigantic, it could be.
Space is allocated like a LIFO stack. Variables are accessed within the stack like an array/vector or by an absolute address (pointer).
So, no, in C and C++, you aren't limited to a single variable at a time.
You have a little confusion about data organization and access. Stack memory is organized in such a way that new data can be added or removed only at the "top". This however has nothing to do with restrictions on accessing other elements. Such restrictions can be present in some logical stack implementations (like std::stack from the C++ standard library), but they are not mandatory.
The hardware stack is really more like a fixed-size array with a variable start location (the stack pointer), so other elements can be accessed by indexing from the stack pointer. The difference from a "standard" array is that it can contain elements of different sizes.
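The difference between a logical stack and array-like storage used in LIFO order can be seen side by side. A minimal sketch (the function names are made up, and a std::vector is only standing in for the stack region):

```cpp
#include <cassert>
#include <stack>
#include <vector>

// std::stack deliberately exposes only the top element.
int top_of_adapter() {
    std::stack<int> s;
    s.push(10); s.push(20); s.push(30);
    assert(!s.empty());
    return s.top();          // there is no s[0] or s[1]
}

// A vector that grows and shrinks at one end, yet any element stays reachable.
int middle_of_vector() {
    std::vector<int> v;
    v.push_back(10); v.push_back(20); v.push_back(30);
    return v[1];             // indexing below the "top" is fine
}
```

top_of_adapter() gives 30 and middle_of_vector() gives 20. The hardware stack behaves like the second case: allocation is LIFO, access is by index.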
A stack frame consists of several elements, including:
- Return address
The address in the program where the function is to return upon completion
- Storage for local data
Memory allocated for local variables
- Storage for parameters
Memory allocated for the function’s parameters
- Stack and base pointers
Pointers used by the runtime system to manage the stack
A stack pointer usually points to the top of the stack. A stack base pointer (frame pointer) is often present and points to an address within the stack frame, such as the return address. This pointer assists in accessing the stack frame's elements. Neither of these pointers is a C pointer. They are addresses used by the runtime system to manage the program stack. If the runtime system is implemented in C, then these pointers may be real C pointers.
What you want to know is how stack frames work.
To use a stack frame you have to have several registers pointing to several "points of interest" on said stack frame and modify them or use an offset of where they are pointing. An example would be:
main() is about to call foo(). The base of main()'s frame is pointed to by the "base pointer" register EBP. Until now, main() has been using the registers for its own purposes, so it needs to save the contents of any registers it wants to use again after the call. It then pushes the arguments for foo(), and the call instruction itself pushes the contents of the "next instruction" register, EIP, so that foo() knows where to return upon completion. Once inside, foo() sets up its own stack frame (among other things): it saves main()'s base pointer on the stack, points EBP at that saved value, and moves the "stack pointer" register ESP to make room for its local variables. The stack frame of foo() is now on top of the stack frame of main() and the stack looks something like this:
[Registers that foo() saved.]
[Local variables of foo().]
[main() base pointer address. EBP points here and will point to the address stored here after foo() is done.]
[main() return address (foo() will return to where this address is pointing.)]
[Arguments for foo().]
[Registers that main() saved.]
[...]
As you can see, we can access both the arguments of foo() as well as its local variables as simple offsets from where the EBP register points. If the first local variable is 4 bytes long we are going to find it at EBP - 4 for example.
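The layout above can be replayed on a toy machine to see where the offsets come from. This is a sketch under assumptions: ToyStack and its member functions are hypothetical, memory is an array of 4-byte words, the stack grows toward lower addresses as on 32-bit x86, and the caller removes its own arguments after the call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ToyStack {
    std::vector<std::uint32_t> words;
    std::uint32_t esp;   // byte address of the current top of stack
    std::uint32_t ebp;   // byte address of the current frame base

    ToyStack() : words(64, 0), esp(64 * 4), ebp(64 * 4) {}

    std::uint32_t &at(std::uint32_t address) { return words[address / 4]; }
    void push(std::uint32_t v) { assert(esp >= 4); esp -= 4; at(esp) = v; }
    std::uint32_t pop() { std::uint32_t v = at(esp); esp += 4; return v; }

    // Caller side: push the argument, then the return address.
    void call_with_arg(std::uint32_t arg, std::uint32_t return_address) {
        push(arg);
        push(return_address);
    }
    // Callee prologue: save the caller's EBP, start a new frame, reserve locals.
    void prologue(std::uint32_t local_bytes) {
        push(ebp);
        ebp = esp;
        esp -= local_bytes;
    }
    // Callee epilogue: drop locals, restore the caller's EBP, pop the return address.
    std::uint32_t epilogue() {
        esp = ebp;
        ebp = pop();
        return pop();
    }
};
```

After call_with_arg(7, 0x1234) and prologue(4), the argument sits at EBP + 8, the return address at EBP + 4, the saved base pointer at EBP, and the first 4-byte local at EBP - 4. epilogue() hands back 0x1234 and leaves EBP as main() had it.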

Where memory for 'this' pointer allocated

In C++, the this pointer gets passed to a method as a hidden argument that points to the current object, but where is the this pointer itself stored in memory: on the stack, on the heap, in the data segment, or somewhere else?
The standard doesn't specify where the this pointer is stored.
When it's passed to a member function in a call of that function, some compilers pass it in a register, and others pass it on the stack. It can also depend on the compiler options.
About the only thing you can be sure of is that this is an rvalue of basic type, so you can't take its address.
It wasn't always that way.
In pre-standard C++ you could assign to this, e.g. in order to indicate constructor failure. This was before exceptions were introduced. The modern standard way of indicating construction failure is to throw an exception, which guarantees an orderly cleanup (if not foiled by the user's code, such as the infamous MFC placement new bug).
The this pointer is allocated on the stack of your class functions (or sometimes a register).
This is however not likely the question you are actually asking.
In C++, this is "simply" a pointer to the current object. It allows you to access object-specific data.
For example, when code in a class has the following snippet:
this->temperature = 40.0f;
it sets the temperature for whatever object is being acted upon (assuming temperature is not a class-level static, shared amongst all objects of the class).
The this pointer itself doesn't have to be allocated (in terms of dynamic memory), it depends entirely on how it's all handled under the covers, something the standard doesn't actually mandate (the standard tends to focus more on behaviour than internals).
There are any number of places it could be: on the stack, at a specific memory location, in a register, and so on. All you have to concern yourself with is its behaviour which is basically how you use it to get access to the object.
What this points to is the object itself and it's usually allocated with new for dynamic allocation, or on the stack.
The this pointer is in the object itself*. Sort of. It is the memory location of the object.
If the object is on the stack, the this pointer is on the stack.
If the object is on the heap, the this pointer is on the heap.
Bottom line, it's nothing you need to worry about.
*[UPDATE] Let me backpedal/clarify/correct my answer. The physical this pointer is not in the object.
I would conclude the this pointer is derived by the compiler and is simply the address of the object which is stored in the symbol table. Semantically, it is inside the object but that was not what the OP was asking.
Here is the memory layout of 3 variables on the stack. The middle one is an object. You can see it holds its one variable and nothing else:
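The same claim can be checked in code. A small sketch: Thermometer, self and demo are names invented here, and the size comparison reflects what mainstream compilers do for a class with a single float member rather than something the standard spells out.

```cpp
#include <cassert>

// Hypothetical class with a single data member.
struct Thermometer {
    float temperature = 0.0f;

    // Returns the value of `this` so the caller can compare it.
    const Thermometer *self() const { return this; }
};

void demo() {
    Thermometer t;
    // `this` inside the member function is just the object's address...
    assert(t.self() == &t);
    // ...and the object has no extra room in which a pointer could be stored.
    assert(sizeof(Thermometer) == sizeof(float));
}
```

So the object holds only its data; the value of this is supplied by the caller at each call, typically in a register or as a stack argument.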

clean stack in c++

How can we clean the stack?
The return statement is used to leave a function. Now consider:
if (m1.high_[0] < m2.low_[0]) return FALSE;
Here m1 and m2 are two points with high[0], low[0], low[1] and high[1] values.
If we use return inside a statement like this, does it clean the stack? I mean, can a return statement under an if condition be used to clean the stack?
You don't really "clean" the stack. All that happens is that the stack pointer is reset to the top of the calling program's stack storage.
Any subsequent function called from this program will be given the same stack pointer as your program received (including any values set by your program, which is why it's important to initialise automatic storage!).
Conversely, when your program invokes a function, the called function will be given a stack pointer just after the last piece of your stack, and if you call more than one function they will all end up with the same stack pointer.
To clarify, C and C++ programs support three types of storage allocation:
"static", which is effectively global to the compile unit. A suitable lump of storage is allocated when the main program starts and each "static" is allocated an address in this lump of storage, which is used until the main program terminates.
"heap", which is a collection of storage areas managed by "malloc" with a little help from the underlying operating system. Most (but not all!) "new" objects allocate memory this way.
Then "automatic" storage (which is the default) uses the stack. Again, this is a fairly large contiguous area of storage allocated when your main program starts. Any automatic variables used by "main" will be allocated at the beginning of the stack and the stack pointer advanced to point to the word after the end of main's last variable.
When the first function is called, it allocates its automatic variables starting from the current stack pointer and the stack pointer is set to the word after the end of its last variable; if it calls other functions then the process is repeated. When a function ends, the stack pointer is reset to whatever value it had when the function was called.
In this way storage is constantly reused without the need for any mallocs or frees, and it makes it easy to implement recursive functions as each call will get its own piece of the stack (until the stack runs out!).
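The "each call gets its own piece of the stack" part can be observed by recording the address of one automatic variable per active call. A minimal sketch, with descend as a made-up name; addresses are stored as integers so they can still be compared after the frames are gone.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Records the address of one automatic variable per active call.
int descend(int depth, std::set<std::uintptr_t> &seen) {
    assert(depth >= 0);
    int local = depth;                                   // lives in this call's frame
    seen.insert(reinterpret_cast<std::uintptr_t>(&local));
    if (depth == 0) return local;
    return descend(depth - 1, seen) + local;             // `local` is still alive here
}
```

descend(4, seen) runs five nested calls and leaves five distinct addresses in seen, one per call. Which addresses you get, and in which direction they move, depends on the platform.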
Yes, whenever a function returns by executing 'return XXXX', the stack frame for the concerned function is removed. Local objects with automatic storage duration are destroyed in this process. It may also involve manipulation of certain CPU registers (e.g. ESP, EBP on Intel), which is implementation-specific behavior. It does not matter whether the return statement is executed under a condition, or what value is being returned.
EDIT 2:
In the code below, the local object 's' (which has automatic storage duration) is destroyed. The local objects 'p' and 'x' are also destroyed, but the memory pointed to by 'p', which was allocated with new, is not released until it is explicitly deleted (using delete). All this happens irrespective of whether the function 'f' returns via 'return true' or 'return false'.
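struct S {};
bool f(int x) {
    S s;                        // automatic storage: destroyed when f returns
    S *p = new S;               // p itself is automatic; the S it points to is not
    if (x == 2) return true;    // s, p and x go away here; *p is never deleted
    else return false;          // the same happens on this path
}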
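To watch this happen, the example can be instrumented with counters. This is a sketch, not the original code: Counted stands in for S, and f takes an extra, hypothetical out-parameter so that the heap object can still be found and deleted by the caller afterwards.

```cpp
#include <cassert>

// Hypothetical instrumented stand-in for S.
struct Counted {
    static int constructed;
    static int destroyed;
    Counted() { ++constructed; }
    ~Counted() { ++destroyed; assert(destroyed <= constructed); }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

bool f(int x, Counted *&still_alive) {
    Counted s;                   // automatic: destructor runs on every return path
    Counted *p = new Counted;    // dynamic: survives the return
    still_alive = p;             // handed out so the caller can delete it later
    if (x == 2) return true;
    else return false;
}
```

After one call to f, two objects have been constructed but only one destroyed, whichever return was taken. The second destructor runs only once the caller executes delete on the pointer it got back.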

how does an optimizing c++ compiler reuse stack slots of a function?

How does an optimizing C++ compiler determine when a stack slot of a function (part of the function's stack frame) is no longer needed by that function, so that it can reuse its memory?
By stack slot I mean a part of a function's stack frame, not necessarily the whole frame. As an example to clarify the matter, suppose we have a function with six integer variables defined in its scope. If the fifth variable is no longer needed by the time the sixth is first used, the compiler can use the same memory block for the fifth and sixth variables.
Any information on this subject is appreciated.
EDIT: I interpreted the question to mean, "how does the compiler reuse a particular memory word in the stack?" Most of the following answers that question, and a note at the end answers the question, "how does the compiler reuse all the stack space needed by a function?".
Most compilers don't assign stack slots first. Rather, what they do, for each function body, is treat each update to a variable, and all accesses to that variable that can see that particular assignment, as a so-called variable lifetime. A variable which is assigned multiple times will thus cause the compiler to create multiple lifetimes.
(There are complications with this idea that occur when multiple assignments can reach an access through different control paths; this is solved by using a clever enhancement to this idea called static single assignment, which I'm not going to discuss here).
At any point in the code, there is a set of variable lifetimes that are valid; as you choose different code points, you have different valid variable lifetimes. The compiler's actual problem is to assign different registers or stack slots to each of the lifetimes. One can think of this as a graph-coloring problem: each lifetime is a node, and if two lifetimes can overlap at a point in the code, there is an "interference" arc from that node to the other node representing the other lifetime. You can color the graph (or equivalently use numbers instead of colors), such that no two nodes connected by an interference arc have the same color (number); you may have to use arbitrarily large numbers to do this, but for most functions the numbers don't have to be very large. If you do this, the colors (numbers) will tell you a safe stack slot to use for the assigned value of the particular variable lifetime. (This idea is normally used in roughly two phases: once to allocate registers, and once to allocate stack slots for those lifetimes that don't fit into the registers).
By determining the largest number used as a color on the graph, the compiler knows how many slots are needed in the worst case, and can reserve that much storage at function entry time.
There are lots of complications: different values take different amounts of space, etc., but the basic idea is here. Not all compilers use the graph coloring technique, but almost all of them figure out how to assign stack slots and registers in a way that avoids the implied interference. And thus they know the stack slot numbers and the size of the stack frame.
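As a rough illustration of the coloring idea, here is a greedy version over very simplified lifetimes. Everything in it is an assumption made for the sketch: Lifetime, interferes, assign_slots and slots_needed are invented names, a lifetime is a single half-open range of instruction indices, and all values are taken to be the same size. Production compilers are considerably more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A lifetime modelled as a half-open range [start, end) of instruction indices.
struct Lifetime {
    int start;
    int end;
};

// Two lifetimes interfere if they overlap at some point in the code.
bool interferes(const Lifetime &a, const Lifetime &b) {
    return a.start < b.end && b.start < a.end;
}

// Greedy coloring: give each lifetime the lowest slot number not already
// taken by an interfering lifetime that was colored earlier.
std::vector<int> assign_slots(const std::vector<Lifetime> &lifetimes) {
    std::vector<int> slot(lifetimes.size(), -1);
    for (std::size_t i = 0; i < lifetimes.size(); ++i) {
        assert(lifetimes[i].start < lifetimes[i].end);
        std::vector<bool> taken(lifetimes.size(), false);
        for (std::size_t j = 0; j < i; ++j) {
            if (interferes(lifetimes[i], lifetimes[j])) taken[slot[j]] = true;
        }
        int candidate = 0;
        while (taken[candidate]) ++candidate;
        slot[i] = candidate;
    }
    return slot;
}

// The frame needs one more slot than the largest color used.
int slots_needed(const std::vector<int> &slot) {
    int highest = -1;
    for (std::size_t i = 0; i < slot.size(); ++i) {
        if (slot[i] > highest) highest = slot[i];
    }
    return highest + 1;
}
```

For six lifetimes [0,10), [1,4), [2,3), [4,6), [5,8), [8,9) this assigns slots 0, 1, 2, 1, 2, 1, so three slots are enough for six variables because lifetimes that never overlap share a slot.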
EDIT... while typing, it appears that the question has been interpreted as "when does the stack frame for a function vanish"? The answer is, at function exit. The compiler already knows how big it is. It has no need to push or pop onto the stack during function execution; it knows where to put everything based on the stack slot numbering determined by the graph coloring.
The easy part is: When a function exits, all local variables of that function are released. Thus, function exit indicates that the whole stack frame can be freed. That's a no-brainer, though, and you wouldn't have mentioned "optimizing compiler" if that's what you were after.
In theory, a compiler can do flow analysis on a function, find out which chunks of memory are used at what time, and perhaps even re-order stack allocation based on the order in which variables become available. Then, if new automatic variables are introduced somewhere in the middle of the function or other scope nested within the function (rather than at its beginning), those recently freed slots could be re-used.
In practice, this sounds like a lot of spinning gears, and I suspect that stack is simply allocated whenever variables come in scope and popped off en bloc by adjusting the stack pointer when the scope finishes. But I admit I'm no expert on this topic. Someone with more authoritative knowledge may come along and correct me.
If I understand the question correctly, this is about call chaining, i.e. invoking function from function without allocating new stack frame.
This is possible when the call can be transformed into a tail call - the last operation before return. At that point all local variables (on the stack) are no longer needed, so the space can be reused. The compiler then generates a jump instead of a call instruction. The original return address is still at the proper place on the stack.
I probably glossed over lots of details here, but that's the idea.
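For reference, this is the shape of code where that transformation can apply. A sketch only: sum_to and sum_to_plain are made-up names, and whether a jump is actually emitted depends on the compiler and its optimization settings, which C++ does not guarantee.

```cpp
#include <cassert>

// Accumulator style: the recursive call is the last operation, its result is
// returned unchanged, and nothing in this frame is needed afterwards.
long sum_to(long n, long acc) {
    assert(n >= 0);
    if (n == 0) return acc;
    return sum_to(n - 1, acc + n);    // tail call: candidate for a plain jump
}

// Not a tail call: the addition still has to happen after the callee returns,
// so this frame must stay alive across the call.
long sum_to_plain(long n) {
    assert(n >= 0);
    if (n == 0) return 0;
    return n + sum_to_plain(n - 1);
}
```

Both compute the same sum, e.g. sum_to(100, 0) and sum_to_plain(100) both give 5050. Only the first has a form that lets the callee reuse the caller's frame.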