During a C++ program execution does stack frame of a particular function always have a constant size or compiler is allowed to do dynamic stack management in some cases, something similar to what alloca() function does? to better describe it I mean offset of a particular local variable or object in the stack frame may change at different executions of function
At least in most typical implementations, the stack frame for a variadic function varies depending on how many variables are passed. For example:
printf("%d", 1); // stack frame contains 1 pointer, one int
printf("%d %d", 1, 2); // stack frame contains one pointer, 2 ints.
Whether the implementation is particularly similar to alloca depends on the implementation though (especially since alloca isn't standard, so how or even if it's implemented may vary).
The compiler is allowed to do as it wishes, after all it generates the code and as long as it does what the program in C++ says it is fine. In general, whenever possible, compilers calculate the total needed stack space for the function and reserve that upfront (reducing the number of times the stack register is written to), even if objects are created and destructed on demand.
In common implementations, local variables are placed on the stack frame. Some functions may have variables that are accommodated by registers, others will have variables placed on the stack.
Stack frames may also be expanded by non-static variables declared in statement blocks.
There is no standard minimum size for a stack frame. The maximum size of a stack frame depends on the platform and implementation. A common implementation is to have the stack to expand towards the heap and the heap expands towards the stack.
The standard does not say anything about it (it does not even require you to have a stack), and with C++14 likely something like alloca will be needed since it will likely get a "light" version of C99 VLAs.
Related
Recently, I came across this question in an interview: How can we determine how much storage on the stack a particular function is consuming?
The "stack" is famously an implementation detail of the platform that is not inspectable or in any way queryable from within the language itself. It is essentially impossible to guarantee within any part of a C or C++ program whether it will be possible to make another function call. The "stack size", or maybe better called "function call and local variable storage depth", is one of the implementation limits whose existence is acknowledged by the language standard but considered out of scope. (E.g. for C++ see [implimits], Annex B.)
Individual platforms may offer APIs to allow programs to introspect the platform limitations, but neither C nor C++ specify that or how this should be possible.
Exceeding the implementation-defined resource limits leads to undefined behaviour, and you cannot know whether you will exceed the limits.
It's completely implementation defined - the standard does not in any way impose requirements on the possible underlying mechanisms used by a program.
On a x86 machine, one stack frame consists of a return address (4/8 byte), parameters and local variables.
The parameters, if e.g. scalars, may be passed through registers, so we can't say for sure whether they contribute to the storage taken up. The locals may be padded (and often are); We can only deduce a minimum amount of storage for these.
The only way to know for sure is to actually analyze the assembler code a compiler generates, or look at the absolute difference of the stack pointer values at runtime - before and after a particular function was called.
E.g.
#include <iostream>
void f()
{
register void* foo asm ("esp");
std::cout << foo << '\n';
}
int main()
{
register void* foo asm ("esp");
std::cout << foo << '\n';
f();
}
Now compare the outputs. GCC on Coliru gives
0x7fffbcefb410
0x7fffbcefb400
A difference of 16 bytes. (The stack grows downwards on x86.)
As stated by other answers, the program stack is a concept which is not specified within the language itself. However with a knowledge how typical implementation works, you can assume that the address of the first argument of a function is the beginning of its stack frame. The address of the first argument of a next called function is the beginning of the next stack frame. So, they probably wanted to see a code like:
void bar(void *b) {
printf("Foo stack frame is around %lld bytes\n", llabs((long long)b - (long long)&b));
}
void foo(int x) {
bar(&x);
}
The size increase of the stack, for those implementations that use a stack, is:
size of variables that don't fit in the available registers
size of variables declared in the function declared upfront that live for the life of the function
size of other local variables declared along the way or in statement blocks
the maximum stack size used by functions called by this function
everything above * the number of recursive calls
size of the return address
Return Address
Most implementations push the return address on the stack before any other data. So this address takes up space.
Available Registers
Some processors have many registers; however, only a few may be available for passing variables. For example, if the convention allows for 2 variables but there are 5 parameters, 3 parameters will be placed on the stack.
When large objects are passed by value, they will take up space on the stack.
Function Local Variables
This is tricky to calculate, because variables may be pushed onto the stack and then popped off when not used.
Some variables may not be pushed onto the stack until they are declared. So if a function returns midway through, it may not use the remaining variables, so the stack size won't increase for those variables.
The compiler may elect to use registers to hold values or place constants directly into the executable code. In this case, they don't add any length to the stack.
Calling Other Functions
The function may call other functions. Each called function may increase the amount of data on the stack. Those functions that are called may call other functions, and so on.
This again, depends on the snapshot in time of the execution. However, one can produce an approximate maximum increase of the stack by the other called functions.
Recursion
As with calling other functions, a recursive call may increase the size of the stack. A recursive call at the end of the function may increase the stack more than a recursive call near the beginning.
Register Value Saving
Sometimes, the compiler may need more space for data than the allocated registers allow. Thus the compiler may push variables on the stack.
The compiler may push registers on the stack for convenience, such as swapping registers or changing the value's order.
Summary
The exact size of stack space required for a function is very difficult to calculate and may depend on where the execution is. There are many items to consider in stack size calculation, such as parameter quantity and size as well as any other functions called. Due to the variability, most stack size measurements are based on a maximum size, or worst case size. Stack allocation is usually based on the worst case scenario.
For an interview question, I would mention all of the above, which usually makes the interviewer want to move on to the next question quickly.
Allocating stuff on the stack is awesome because than we have RAII and don't have to worry about memory leaks and such. However sometimes we must allocate on the heap:
If the data is really big (recommended) - because the stack is small.
If the size of the data to be allocated is only known at runtime (dynamic allocation).
Two questions:
Why can't we allocate dynamic memory (i.e. memory of size that is
only known at runtime) on the stack?
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable? I.e. Thing t;.
Edit: I know some compilers support Variable Length Arrays - which is dynamically allocated stack memory. But that's really an exception to the general rule. I'm interested in understanding the fundamental reasons for why generally, we can't allocate dynamic memory on the stack - the technical reasons for it and the rational behind it.
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
It's more complicated to achieve this. The size of each stack frame is burned-in to your compiled program as a consequence of the sort of instructions the finished executable needs to contain in order to work. The layout and whatnot of your function-local variables, for example, is literally hard-coded into your program through the register and memory addresses it describes in its low-level assembly code: "variables" don't actually exist in the executable. To let the quantity and size of these "variables" change between compilation runs greatly complicates this process, though it's not completely impossible (as you've discovered, with non-standard variable-length arrays).
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable
This is just a consequence of the syntax. C++'s "normal" variables happen to be those with automatic or static storage duration. The designers of the language could technically have made it so that you can write something like Thing t = new Thing and just use a t all day, but they did not; again, this would have been more difficult to implement. How do you distinguish between the different types of objects, then? Remember, your compiled executable has to remember to auto-destruct one kind and not the other.
I'd love to go into the details of precisely why and why not these things are difficult, as I believe that's what you're after here. Unfortunately, my knowledge of assembly is too limited.
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
Technically, this is possible. But not approved by the C++ standard. Variable length arrays(VLA) allows you to create dynamic size constructs on stack memory. Most compilers allow this as compiler extension.
example:
int array[n];
//where n is only known at run-time
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable? I.e. Thing t;.
We can. Whether you do it or not depends on implementation details of a particular task at hand.
example:
int i;
int *ptr = &i;
We can allocate variable length space dynamically on stack memory by using function _alloca. This function allocates memory from the program stack. It simply takes number of bytes to be allocated and return void* to the allocated space just as malloc call. This allocated memory will be freed automatically on function exit.
So it need not to be freed explicitly. One has to keep in mind about allocation size here, as stack overflow exception may occur. Stack overflow exception handling can be used for such calls. In case of stack overflow exception one can use _resetstkoflw() to restore it back.
So our new code with _alloca would be :
int NewFunctionA()
{
char* pszLineBuffer = (char*) _alloca(1024*sizeof(char));
…..
// Program logic
….
//no need to free szLineBuffer
return 1;
}
Every variable that has a name, after compilation, becomes a dereferenced pointer whose address value is computed by adding (depending on the platform, may be "subtracting"...) an "offset value" to a stack-pointer (a register that contains the address the stack actually is reaching: usually "current function return address" is stored there).
int i,j,k;
becomes
(SP-12) ;i
(SP-8) ;j
(SP-4) ;k
To let this "sum" to be efficient, the offsets have to be constant, so that they can be encode directly in the instruction op-code:
k=i+j;
become
MOV (SP-12),A; i-->>A
ADD A,(SP-8) ; A+=j
MOV A,(SP-4) ; A-->>k
You see here how 4,8 and 12 are now "code", not "data".
That implies that a variable that comes after another requires that "other" to retain a fixed compile-time defined size.
Dynamically declared arrays can be an exception, but they can only be that last variable of a function. Otherwise, all the variables that follows will have an offset that have to be adjusted run-time after that array allocation.
This creates the complication that dereferencing the addresses requires arithmetic (not just a plain offset) or the capability to modify the opcode as variables are declared (self modifying code).
Both the solution becomes sub-optimal in term of performance, since all can break the locality of the addressing, or add more calculation for each variable access.
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
You can with Microsoft compilers using _alloca() or _malloca(). For gcc, it's alloca()
I'm not sure it's part of the C / C++ standards, but variations of alloca() are included with many compilers. If you need aligned allocation, such a "n" bytes of memory starting on a "m" byte boundary (where m is a power of 2), you can allocate n+m bytes of memory, add m to the pointer and mask off the lower bits. Example to allocate hex 1000 bytes of memory on a hex 100 boundary. You don't need to preserve the value returned by _alloca() since it's stack memory and automatically freed when the function exits.
char *p;
p = _alloca(0x1000+0x100);
(size_t)p = ((size_t)0x100 + (size_t)p) & ~(size_t)0xff;
Most important reason is that Memory used can be deallocated in any order but stack requires deallocation of memory in a fixed order i.e LIFO order.Hence practically it would be difficult to implement this.
Virtual memory is a virtualization of memory, meaning that it behaves as the resource it is virtualizing (memory). In a system, each process has a different virtual memory space:
32-bits programs: 2^32 bytes (4 Gigabytes)
64-bits programs: 2^64 bytes (16 Exabytes)
Because virtual space is so big, only some regions of that virtual space are usable (meaning that only some regions can be read/written just as if it were real memory). Virtual memory regions are initialized and made usable through mapping. Virtual memory does not consume resources and can be considered unlimited (for 64-bits programs) BUT usable (mapped) virtual memory is limited and use up resources.
For every process, some mapping is done by the kernel and other by the user code. For example, before even the code start executing, the kernel maps specific regions of the virtual memory space of a process for the code instructions, global variables, shared libraries, the stack space... etc. The user code uses dynamic allocation (allocation wrappers such as malloc and free), or garbage collectors (automatic allocation) to manage the virtual memory mapping at application-level (for example, if there is no enough free usable virtual memory available when calling malloc, new virtual memory is automatically mapped).
You should differentiate between mapped virtual memory (the total size of the stack, the total current size of the heap...) and allocated virtual memory (the part of the heap that malloc explicitly told the program that can be used)
Regarding this, I reinterpret your first question as:
Why can't we save dynamic data (i.e. data whose size is only known at runtime) on the stack?
First, as other have said, it is possible: Variable Length Arrays is just that (at least in C, I figure also in C++). However, it has some technical drawbacks and maybe that's the reason why it is an exception:
The size of the stack used by a function became unknown at compile time, this adds complexity to stack management, additional register (variables) must be used and it may impede some compiler optimizations.
The stack is mapped at the beginning of the process and it has a fixed size. That size should be increased greatly if variable-size-data is going to be placed there by default. Programs that do not make extensive use of the stack would waste usable virtual memory.
Additionally, data saved on the stack must be saved and deleted in Last-In-First-Out order, which is perfect for local variables within functions but unsuitable if we need a more flexible approach.
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable?
As this answer explains, we can.
Read a bit about Turing Machines to understand why things are the way they are. Everything was built around them as the starting point.
https://en.wikipedia.org/wiki/Turing_machine
Anything outside of this is technically an abomination and a hack.
I was reading [1) about stack pointers and the need of knowing both ebp (start of the stack for the function) and esp (end). The article said that you need to know both because the stack can grow, but I don't see how this can be possible in c/c++. (Im not talking about another function call because to my mind this would make the stack grow, do some stuff, then recursively be popped and back to state before call)
I have done a little bit of research and only saw people saying that new allocates on the heap. But the pointer will be a local variable, right ? And this is known at compile time and reserved in the stack at the time the function is called.
I started to think that maybe with loops you have an uncontrolled number of local variables
int a;
for (int i = 0; i < n; ++i)
int b = i + 3;
but no, this doesn't allocate n times b, and only 1 int is reserved just as it is for a.
So... any example ?
[1): http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
You can allocate memory on the stack with alloca function from stdlib. I don't recommend to use this function in production code. It's to easy to corrupt your stack or get stack overflow.
The use of EBP is more for convenience. It is possible to just use ESP. The problem with that is, as parameters to functions are pushed onto the stack, the address relative to ESP of all the variables and parameters changes.
By setting EBP to a fixed, known position on the stack, usually between the function parameters and the local variables, the address of all these elements remains constant relative to EBP throughout the lifetime of the function. It can also help with debugging, as the value of ESP at the end of the function should equal the value of EBP.
The only way I know of to grow the stack in an indeterminate way at compile time, is to use alloca repeatedly.
A pointer is indeed allocated on the stack. And the size is usually 4 or 8 bytes on 32 and 64bit architectures respectively. So you know the size of a pointer statically, at compile time, and you can keep them on the stack if you choose to do so.
This pointer can point to free store, and you can allocate memory to it dynamically - without having to know the size beforehand. Moreover, it is usually a good idea to keep stack frames empty, and compilers will even "enforce" this with (adjustable) limits. MSVC has 1MB if I recall correctly.
So no, there is no way you can create a stack frame of size that is unknown at compile time. Your stack frame of the code you posted will have room for 3 integers (a, b, i). (And possibly some padding, shadow space, etc, not relevant.) (It is TECHNICALLY possible to extend the size of stackframes at run time but you just don't want to do that almost never.)
I can only imagine
1) parameters;
2) local variables;
what else?
1) function return address?
2) function name?
It really depends on platform and architecture, but typically:
Function return address
Saved values of caller's CPU registers - most importantly, caller's stack frame pointer value
Variables allocated with alloca().
Sometimes - extra stuff for exception handling, this is VERY platform-dependent.
Sometimes - guard values to detect stack clobbering
Function name is never in the stack, to the best of my knowledge, unless your code places it there.
I think that a picture really is a thousand words.
It depends on the calling convention; for Unix, you typically look up this information in the SYSV ABI (Application Binary Interface).
You may find:
Return address (if the machine is a popular Intel architecture). On more modern architectures, the return address is passed in a register.
Callee-saves registers—these are registers that "belong" to the caller which the callee has chosen to borrow and must therefore save and restore.
Any incoming parameters that could not be passed in registers. In IA-32, no parameters are passed in registers; they all go on the stack. In x86-64, up to six integer and six floating-point parameters can be passed in registers, so it is seldom necessary to use the stack for that purpose.
You may or may not find a saved stack pointer or frame pointer. Most modern calling conventions go without a frame pointer in order to save an extra registers. In this design, the size of each frame is known at compile time, so restoring the old stack pointer is just a matter of adding a constant. But it makes it harder to implement alloca().
The older Intel calling conventions use both stack pointer and frame pointer, which burns an extra register, but it simplifies alloca() and also stack unwinding.
Local variables with storage class auto are allocated on the stack.
The stack may contain compiler temporaries that hold values which are "spilled" if the hardware does not provide enough registers to hold all the intermediate results of computations. (This happens if at any point the number of live intermediate results—the ones that will be needed later in the program—exceeds the number of registers available to the compiler for storing intermediate results.)
You may find variables allocated with alloca().
You may find metadata that says which PC ranges are in scope for which exception handlers, or other very platform-dependent exception stuff.
C and C++ do not support garbage collection, but in a language that does, you will often find metadata that identifies where in the stack frame you will find pointers.
Finally, the stack may contain "padding" used to ensure that the stack pointer is aligned on an 8-byte or 16-byte boundary.
Calling conventions are complex beasts, and stack-frame layout is not for the faint of heart!
I know C standards preceding C99 (as well as C++) says that the size of an array on stack must be known at compile time. But why is that? The array on stack is allocated at run-time. So why does the size matter in compile time? Hope someone explain to me what a compiler will do with size at compile time. Thanks.
The example of such an array is:
void func()
{
/*Here "array" is a local variable on stack, its space is allocated
*at run-time. Why does the compiler need know its size at compile-time?
*/
int array[10];
}
To understand why variably-sized arrays are more complicated to implement, you need to know a little about how automatic storage duration ("local") variables are usually implemented.
Local variables tend to be stored on the runtime stack. The stack is basically a large array of memory, which is sequentially allocated to local variables and with a single index pointing to the current "high water mark". This index is the stack pointer.
When a function is entered, the stack pointer is moved in one direction to allocate memory on the stack for local variables; when the function exits, the stack pointer is moved back in the other direction, to deallocate them.
This means that the actual location of local variables in memory is defined only with reference to the value of the stack pointer at function entry1. The code in a function must access local variables via an offset from the stack pointer. The exact offsets to be used depend upon the size of the local variables.
Now, when all the local variables have a size that is fixed at compile-time, these offsets from the stack pointer are also fixed - so they can be coded directly into the instructions that the compiler emits. For example, in this function:
void foo(void)
{
int a;
char b[10];
int c;
a might be accessed as STACK_POINTER + 0, b might be accessed as STACK_POINTER + 4, and c might be accessed as STACK_POINTER + 14.
However, when you introduce a variably-sized array, these offsets can no longer be computed at compile-time; some of them will vary depending upon the size that the array has on this invocation of the function. This makes things significantly more complicated for compiler writers, because they must now write code that accesses STACK_POINTER + N - and since N itself varies, it must also be stored somewhere. Often this means doing two accesses - one to STACK_POINTER + <constant> to load N, then another to load or store the actual local variable of interest.
1. In fact, "the value of the stack pointer at function entry" is such a useful value to have around, that it has a name of its own - the frame pointer - and many CPUs provide a separate register dedicated to storing the frame pointer. In practice, it is usually the frame pointer from which the location of local variables is calculated, rather than the stack pointer itself.
It is not an extremely complicated thing to support, so the reason C89 doesn't allow this is not because it was impossible back then.
There are however two important reasons why it is not in C89:
The runtime code will get less efficient if the array size is not known at compile-time.
Supporting this makes life harder for compiler writers.
Historically, it has been very important that C compilers should be (relatively) easy to write. Also, it should be possible to make compilers simple and small enough to run on modest computer systems (by 80s standards). Another important feature of C is that the generated code should consistently be extremely efficient, without any surprises,
I think it is unfortunate that these values no longer hold for C99.
The compiler has to generate the code to create the space for the frame on the stack to hold the array and other local local variables. For this it needs the size of the array.
Depends on how you allocate the array.
If you create it as a local variable, and specify a length, then it matters because the compiler needs to know how much space to allocate on the stack for the elements of the array. If you don't specify a size of the array, then it doesn't know how much space to set aside for the array elements.
If you create just a pointer to an array, then all you need to do is allocate the space for the pointer itself, and then you can dynamically create array elements during run time. But in this form of array creation, you're allocating space for the array elements in the heap, not on the stack.
In C++ this becomes even more difficult to implement, because variables stored on the stack must have their destructors called in the event of an exception, or upon returning from a given function or scope. Keeping track of the exact number/size of variables to be destroyed adds additional overhead and complexity. While in C it's possible to use something like a frame pointer to make freeing of the VLA implicit, in C++ that doesn't help you, because those destructors need to be called.
Also, VLAs can cause Denial of Service security vulnerabilities. If the user is able to supply any value which is eventually used as the size for a VLA, then they could use a sufficiently large value to cause a stack overflow (and therefore failure) in your process.
Finally, C++ already has a safe and effective variable length array (std::vector<t>), so there's little reason to implement this feature for C++ code.
Let's say you create a variable sized array on the stack. The size of the stack frame needed by the function won't be known at compile time. So, C assumed that some run time environments require this to be known up front. Hence the limitation. C goes back to the early 1970's. Many languages at the time had a "static" look and feel (like Fortran)