When is heap memory preferred over stack memory - C++

I know that local arrays are created on the stack, and have automatic storage duration, since they are destroyed when the function they're in ends. They necessarily have a fixed size:
{
int foo[16];
}
Arrays created with operator new[] have dynamic storage duration and are stored on the heap. They can have varying sizes.
{
const int size = 16;
int* foo = new int[size];
// do something with foo
delete[] foo;
}
The size of the stack is fixed and limited for every process.
My question is:
Is there a rule of thumb when to switch from stack memory to heap memory, in order to reduce the stack memory consumption?
Example:
double a[2] is perfectly reasonable;
double a[1000000000] will most likely result in a stack overflow if the stack size is 1 MB
Where is a reasonable limit to switch to dynamic allocation?

See this answer for a discussion about heap allocation.
Where is a reasonable limit to switch to dynamic allocation?
In several cases, including:
too large automatic variables. As a rule of thumb, I recommend avoiding call frames of more than a few kilobytes (and a call stack of more than a megabyte). That limit might be increased if you are sure your function is never used recursively. On many small embedded systems the stack is much more limited (e.g. to a few kilobytes), so you need to restrict each call frame even further (e.g. to only a hundred bytes). BTW, on some systems you can raise the call-stack limit considerably (perhaps to several gigabytes), but this is also a sysadmin issue.
non-LIFO allocation discipline, i.e. when the data must outlive the function that created it; this happens quite often (see the sketch after this answer).
Notice that most C++ standard containers allocate their data in the heap, even if the container is on the stack. For example, an automatic variable of vector type, e.g. a local std::vector<double> autovec; has its data heap allocated (and released when the vector is destroyed). Read more about RAII.
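For instance, here is a minimal sketch of the non-LIFO case (the Widget type and make_widget function are invented for illustration): the object must outlive the function that creates it, so its storage cannot be automatic.

#include <memory>
#include <vector>

struct Widget { std::vector<double> samples; };   // hypothetical type

// The Widget escapes the creating call frame, so the allocation
// does not follow the call stack's LIFO order.
std::unique_ptr<Widget> make_widget()
{
    auto w = std::make_unique<Widget>();
    w->samples.resize(1024);   // the vector's data is heap-allocated
    return w;                  // ownership leaves this frame; RAII still cleans up
}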

Related

Is result of allocation in heap interdependent to result of allocation in stack?

Let's consider this code:
#include <cstddef>   // size_t
#include <new>       // std::nothrow

static const size_t DATA_SIZE = 100000;

void log_msg(const char* msg)
{
    char msg_buffer[DATA_SIZE];
    // Do something...
}

int main()
{
    // Do something heap-memory consuming...
    // Note: buffer must be a pointer, and plain new would throw rather than
    // return null, so std::nothrow is needed for the check below.
    unsigned char* buffer = new(std::nothrow) unsigned char[DATA_SIZE];
    if(!buffer)
    {
        log_msg("Insufficient memory!");
        return 1;
    }
    // Go ahead...
    delete[] buffer;
    return 0;
}
Now, let's imagine that at the moment of allocating heap memory for buffer there is no free space AND, at the same time, there is enough free space on the stack.
My question is pretty simple: will the stack allocation for msg_buffer ALWAYS fail if the heap allocation for buffer fails?
As far as I know, the stack is allocated per thread and the heap per process. So, is there any guarantee that the result of a memory allocation on the stack does not correlate with the result of a memory allocation on the heap? Of course, I don't consider stack overflow itself. In other words, is the memory reserved for the stack actually reserved for it fully? Or could there be situations where, for some reason, this reservation is shrunk during program execution?
If there are no platform-independent guarantees concerning this, I would like to know whether there are any for Linux on the x86 architecture.
This is explicitly implementation-dependent. By the way, the very notions of stack and heap do not exist in the standard, even though they are common in real-world implementations.
I can remember the good old MS/DOS systems where the allocation types could depend on the memory model. Some compilers used one single segment (SS) in small and medium models for both the stack and the heap, the stack growing from one end and the heap from the other, but used allocation from the memory above the program (so independent of stack) for compact and large models.
In the former case, if stack allocation was not possible, heap allocation was not either; but in the latter, heap and stack allocation could succeed or fail independently.
In a modern OS using virtual memory, like Linux, it is common to have a fixed-size stack and to ask the OS for new free blocks for the heap. In that case, stack and heap allocation can succeed or fail independently.
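On Linux, for example, you can inspect the fixed per-process stack limit that the kernel enforces; a minimal sketch using the POSIX getrlimit call (the printed value is commonly 8 MiB, but it is configuration-dependent):

#include <sys/resource.h>
#include <cstdio>

int main()
{
    rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) == 0)
        std::printf("stack soft limit: %llu bytes\n",
                    (unsigned long long)rl.rlim_cur); // may be RLIM_INFINITY
    return 0;
}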

Why can't we allocate dynamic memory on the stack?

Allocating stuff on the stack is awesome because then we have RAII and don't have to worry about memory leaks and such. However, sometimes we must allocate on the heap:
If the data is really big (heap allocation is recommended then), because the stack is small.
If the size of the data to be allocated is only known at runtime (dynamic allocation).
Two questions:
Why can't we allocate dynamic memory (i.e. memory of a size that is only known at runtime) on the stack?
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable? I.e. Thing t;.
Edit: I know some compilers support variable-length arrays, which are dynamically sized stack memory. But that's really an exception to the general rule. I'm interested in understanding the fundamental reasons why, generally, we can't allocate dynamic memory on the stack: the technical reasons for it and the rationale behind it.
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
It's more complicated to achieve this. The size of each stack frame is baked into your compiled program as a consequence of the sort of instructions the finished executable needs to contain in order to work. The layout of your function-local variables, for example, is literally hard-coded into your program through the register and memory addresses described in its low-level assembly code: "variables" don't actually exist in the executable. Letting the quantity and size of these "variables" change at run time greatly complicates this process, though it's not completely impossible (as you've discovered, with non-standard variable-length arrays).
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable
This is just a consequence of the syntax. C++'s "normal" variables happen to be those with automatic or static storage duration. The designers of the language could technically have made it so that you can write something like Thing t = new Thing and just use t all day, but they did not; again, this would have been more difficult to implement. How do you distinguish between the different kinds of objects then? Remember, your compiled executable has to remember to auto-destruct one kind and not the other.
I'd love to go into the details of precisely why and why not these things are difficult, as I believe that's what you're after here. Unfortunately, my knowledge of assembly is too limited.
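To make that last point concrete, here is a small sketch (Thing is a hypothetical type) contrasting the two lifetimes, and how RAII restores auto-destruction for heap objects:

#include <memory>

struct Thing { /* ... */ };

void f()
{
    Thing t;                            // automatic: destroyed at end of scope
    Thing* p = new Thing;               // dynamic: must be deleted manually
    delete p;
    auto q = std::make_unique<Thing>(); // dynamic, but RAII auto-destructs it
}   // t and q are cleaned up here; a forgotten delete p would have leaked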
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
Technically, this is possible, but it is not sanctioned by the C++ standard. Variable-length arrays (VLA) allow you to create dynamically sized constructs in stack memory. Most compilers allow this as a compiler extension.
example:
int array[n];
//where n is only known at run-time
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable? I.e. Thing t;.
We can. Whether you do it or not depends on the implementation details of the particular task at hand.
example:
int i;
int *ptr = &i;
We can allocate variable-length space dynamically in stack memory with the function _alloca. This function allocates memory from the program stack. It simply takes the number of bytes to be allocated and returns a void* to the allocated space, just like a malloc call. The allocated memory is freed automatically on function exit.
So it need not be freed explicitly. One has to keep the allocation size in mind here, since a stack overflow exception may occur. Stack overflow exception handling can be used for such calls. In case of a stack overflow exception, one can use _resetstkoflw() to restore it.
So our new code with _alloca would be:
int NewFunctionA()
{
    char* pszLineBuffer = (char*) _alloca(1024 * sizeof(char));
    // ... program logic ...
    // no need to free pszLineBuffer
    return 1;
}
After compilation, every variable that has a name becomes a dereferenced pointer whose address is computed by adding (or, depending on the platform, subtracting) an "offset value" to a stack pointer (a register that contains the address the stack has currently reached; usually the current function's return address is stored there).
int i,j,k;
becomes
(SP-12) ;i
(SP-8) ;j
(SP-4) ;k
To let this "sum" to be efficient, the offsets have to be constant, so that they can be encode directly in the instruction op-code:
k=i+j;
becomes
MOV (SP-12),A; i-->>A
ADD A,(SP-8) ; A+=j
MOV A,(SP-4) ; A-->>k
You can see here how 4, 8 and 12 are now "code", not "data".
That implies that a variable that comes after another one requires that other variable to have a fixed, compile-time-defined size.
Dynamically declared arrays can be an exception, but they can only be the last variable of a function. Otherwise, all the variables that follow would have offsets that must be adjusted at run time after the array allocation.
This creates the complication that dereferencing the addresses requires arithmetic (not just a plain offset), or the capability to modify the opcodes as variables are declared (self-modifying code).
Both solutions become sub-optimal in terms of performance, since both can break the locality of addressing or add more calculation for each variable access.
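A sketch of that constraint using a C-style VLA (a GCC/Clang extension in C++; the comments describe typical compiler behavior, not a guarantee):

void f(int n)
{
    int a;       // fixed offset from the frame base: encoded in the opcode
    int vla[n];  // runtime-sized: the stack pointer moves by n * sizeof(int)
    int b;       // a fixed offset can no longer follow the VLA; compilers
                 // typically address fixed locals via a frame pointer instead
}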
Why can't we allocate dynamic memory (i.e. memory of size that is only known at runtime) on the stack?
You can with Microsoft compilers, using _alloca() or _malloca(). For gcc, it's alloca().
I'm not sure it's part of the C/C++ standards, but variations of alloca() are included with many compilers. If you need aligned allocation, such as "n" bytes of memory starting on an "m"-byte boundary (where m is a power of 2), you can allocate n+m bytes of memory, add m to the pointer, and mask off the lower bits. Example to allocate hex 1000 bytes of memory on a hex 100 boundary. You don't need to preserve the value returned by _alloca(), since it's stack memory and automatically freed when the function exits.
char *p = (char *) _alloca(0x1000 + 0x100);
p = (char *)(((size_t)p + (size_t)0x100) & ~(size_t)0xff); // round up to a 0x100 boundary
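As a portable alternative (not from the original answer), the standard library's std::align performs the same adjustment given a buffer and its remaining size; a minimal sketch:

#include <cstddef>
#include <memory>   // std::align

void g()
{
    char buf[0x1000 + 0x100];   // over-allocate to leave room for alignment
    void* p = buf;
    std::size_t space = sizeof buf;
    // Bumps p up to the next 0x100 boundary, or yields nullptr if 0x1000
    // aligned bytes no longer fit in the remaining space.
    void* aligned = std::align(0x100, 0x1000, p, space);
    // ... use 'aligned' while this function is live ...
}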
The most important reason is that heap memory can be deallocated in any order, but the stack requires deallocation of memory in a fixed order, i.e. LIFO order. Hence, in practice, this would be difficult to implement.
Virtual memory is a virtualization of memory, meaning that it behaves as the resource it is virtualizing (memory). In a system, each process has a different virtual memory space:
32-bit programs: 2^32 bytes (4 gigabytes)
64-bit programs: 2^64 bytes (16 exabytes)
Because the virtual space is so big, only some regions of it are usable (meaning that only some regions can be read/written just as if they were real memory). Virtual memory regions are initialized and made usable through mapping. Virtual memory does not consume resources and can be considered unlimited (for 64-bit programs), BUT usable (mapped) virtual memory is limited and uses up resources.
For every process, some mapping is done by the kernel and some by the user code. For example, before the code even starts executing, the kernel maps specific regions of the virtual memory space of a process for the code instructions, global variables, shared libraries, the stack space, etc. The user code uses dynamic allocation (allocation wrappers such as malloc and free) or garbage collectors (automatic allocation) to manage the virtual memory mapping at application level (for example, if there is not enough free usable virtual memory available when calling malloc, new virtual memory is automatically mapped).
You should differentiate between mapped virtual memory (the total size of the stack, the total current size of the heap...) and allocated virtual memory (the part of the heap that malloc explicitly told the program that can be used)
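To illustrate the mapped-versus-usable distinction, a Linux-specific sketch (not part of the original answer): mmap with PROT_NONE reserves address space without making it usable, and mprotect later commits part of it:

#include <cstddef>
#include <sys/mman.h>

int main()
{
    const std::size_t size = 1 << 20;   // 1 MiB of address space
    // Mapped but not usable: any read or write to this region faults.
    void* region = mmap(nullptr, size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return 1;
    // Commit the first page: now it behaves like real memory.
    mprotect(region, 4096, PROT_READ | PROT_WRITE);
    munmap(region, size);
    return 0;
}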
Regarding this, I reinterpret your first question as:
Why can't we save dynamic data (i.e. data whose size is only known at runtime) on the stack?
First, as others have said, it is possible: variable-length arrays are just that (at least in C; I figure also in C++ as an extension). However, they have some technical drawbacks, and maybe that's the reason why they are an exception:
The size of the stack used by a function becomes unknown at compile time. This adds complexity to stack management, additional registers (variables) must be used, and it may impede some compiler optimizations.
The stack is mapped at the beginning of the process and it has a fixed size. That size should be increased greatly if variable-size-data is going to be placed there by default. Programs that do not make extensive use of the stack would waste usable virtual memory.
Additionally, data saved on the stack must be saved and deleted in Last-In-First-Out order, which is perfect for local variables within functions but unsuitable if we need a more flexible approach.
Why can we only refer to memory on the heap through pointers, while memory on the stack can be referred to via a normal variable?
As this answer explains, we can.
Read a bit about Turing Machines to understand why things are the way they are. Everything was built around them as the starting point.
https://en.wikipedia.org/wiki/Turing_machine
Anything outside of this is technically an abomination and a hack.

unable to initialize double array

I have 4 GB of RAM, and the following lines throw a stack overflow exception:
int main()
{
    double X[4096*512], Y[4096*512], Z[4096*512];
    return 0;
}
Each double takes 8 bytes of space, so my three arrays should be 3*4096*512*8/1024/1024 = 48 MB big. Can somebody explain the error, or is 48 MB too much to handle?
You are declaring the arrays on the stack, and the stack is normally limited by the OS (e.g. to 1 MB). You can expand it when compiling (e.g. in GCC, -Wl,stack_size,134217728 for 128 MB), but I don't recommend it.
Better use std::vector<double>.
#include <vector>

int main() {
    std::vector<double> X(4096*512), Y(4096*512), Z(4096*512);
    return 0;
}
If you want to avoid the overhead of std::vector, you can allocate the arrays on the heap
double *myArray = new double[SIZE];
and remember to free them.
delete [] myArray;
Expanding my comment (not a thorough description; you should do more research), there are two types of memory: the "stack" and the "heap". The stack is the local working memory of a piece of software, while the heap is the big pool. The stack holds all the local variables of a function call: anything declared locally within a function is stored on the stack. The stack has a nifty property that it can be "pushed": when we call the next function, we go further down the stack and start over. But this means the amount of stack memory generally needs to be reserved for the lifetime of a program, so we limit it to a small amount: generally 1 to a few megabytes.
When you run out of stack memory, you get a "Stack Overflow" ("Now that's what that means!").
The heap is, kind of, the rest of the memory. The heap is where the program stores any dynamic memory and (possibly; it can be more complicated) global variables. The heap is where things like malloc and new put their memory.
Whenever you declare a local variable, it is stored on the stack. This isn't a problem for small variables, but arrays like the ones you have get huge easily.
If you don't want to worry about new or malloc you can use things like std::vector, which will put the large amount of data on the heap while giving you local variable semantics.
Again, this is "basic programming" so you should get really familiar with this subject.

Increase stack size to use alloca()?

These are two overlapping questions. I wish to try out alloca() for large arrays instead of allocating dynamically sized arrays on the heap, so that I can increase performance without having to make heap allocations. However, I get the impression that stack sizes are usually quite small. Are there any disadvantages to increasing the size of my stack so that I can take full advantage of alloca()? Is it the case that the more RAM I have, the larger I can proportionally make my stack?
EDIT1: Preferably Linux
EDIT2: I don't have a specified size in mind- I would rather know how to judge what determines the limit/boundaries.
Stack sizes are (by default) 8 MB on most unix-y platforms and 1 MB on Windows (largely because Windows has a deterministic way of recovering from out-of-stack problems, while unix-y platforms usually raise a generic SIGSEGV signal).
If your allocations are large, you won't see much of a performance difference between allocating on the heap versus allocating on the stack. Sure, the stack is slightly more efficient per allocation, but if your allocations are large the number of allocations is likely to be small.
If you want a larger stack-like structure you can always write your own allocator which obtains a large block from malloc and then handles allocation/deallocation in a stack-like fashion.
#include <stdexcept>
#include <cstddef>

class StackLikeAllocator
{
    std::size_t usedSize;
    std::size_t maximumSize;
    char *memory;   // char*, not void*, so that delete[] is well-defined
public:
    StackLikeAllocator(std::size_t backingSize)
    {
        memory = new char[backingSize];
        usedSize = 0;
        maximumSize = backingSize;
    }
    ~StackLikeAllocator()
    {
        delete[] memory;
    }
    void * Allocate(std::size_t desiredSize)
    {
        // You would have to make sure alignment was correct for your
        // platform (exercise for the reader)
        std::size_t newUsedSize = usedSize + desiredSize;
        if (newUsedSize > maximumSize)
        {
            // std::bad_alloc has no string constructor, so use
            // std::length_error to carry the message
            throw std::length_error("Exceeded maximum size for this allocator.");
        }
        void* result = static_cast<void*>(memory + usedSize);
        usedSize = newUsedSize;
        return result;
    }
    // If you need to support deallocation then modifying this shouldn't be
    // too difficult
};
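Usage might look like this (a sketch; as noted in the code, alignment handling is left out):

int main()
{
    StackLikeAllocator arena(1 << 20);   // 1 MiB backing block
    double* xs = static_cast<double*>(arena.Allocate(1000 * sizeof(double)));
    // ... use xs; all memory is released when arena goes out of scope ...
    return 0;
}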
The default stack size that the main thread of a program gets is a compiler-specific (and/or OS-specific) thing and you should see the appropriate documentation to find out how to enlarge the stack.
You may be unable to enlarge the program's default stack to an arbitrarily large size.
You may, however, as it's been pointed out, be able to create a thread at run time with a stack of the size you want.
In any event, there's not much benefit to alloca() over a large buffer allocated once; you don't need to free and reallocate it many times.
The most important difference between alloca() and new / malloc() is that all memory allocated with alloca() will be gone when you return from the current function.
alloca() is only useful for small temporary data structures.
It is only useful for small data structures because big data structures will destroy the cache locality of your stack, which will give you a rather big performance hit. The same goes for big arrays as local variables.
Use alloca() only in very specific circumstances. If unsure, do not use it at all.
The general rule is: Do not put big data structures (>= 1k) on the stack. The stack does not scale. It is a very limited resource.
To answer the first question: The stack size is typically small relative to the heap size (this would hold true in most Linux applications).
If the allocations you are planning are large relative to the actual default stack size, then I think it would be better to use dynamic allocation from the heap (rather than trying to increase stack sizes). The cost of using the memory (filling it, reading it, manipulating it) is probably going to far exceed the cost of the allocation. It is unlikely that you would see a measurable benefit by allocating from the stack in this scenario.

Stack overflow - static memory vs. dynamic memory

If you write int m[1000000]; inside the main function in C/C++, you will get a runtime error for stack overflow. But if you instead write vector<int> m; and then push_back 1000000 elements into it, it will run fine.
I am very curious about why this is happening. They both are local memory, aren't they? Thanks in advance.
Yes, the vector itself is an automatic (stack) object. But the vector holds a pointer to its contents (an internal dynamic array), and that will be allocated on the heap (by default). To simplify a little, you can think of vector as doing malloc/realloc or new[] calls internally (actually it uses an allocator).
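A stripped-down sketch of that ownership pattern (nothing like a full vector; the class name and members are invented for illustration):

#include <cstddef>

// The handle lives on the stack; the elements live on the heap.
class IntArray
{
    int* data;                                   // heap pointer held by a stack object
    std::size_t n;
public:
    explicit IntArray(std::size_t count) : data(new int[count]), n(count) {}
    ~IntArray() { delete[] data; }               // released automatically (RAII)
    IntArray(const IntArray&) = delete;          // rule of three: no copying here
    IntArray& operator=(const IntArray&) = delete;
    int& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return n; }
};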
EDIT: As I noted, automatic variables are allocated on the stack, while malloc generally allocates on the heap. The available memory for each is platform- and even configuration-specific, but the available stack memory is typically much more limited.
The amount of stack memory is limited because it has to be reserved in advance. The amount of heap memory, however, can typically expand up to much higher limits imposed by your OS, "nearly" reaching the limits of your virtual address space (2 GB for a 32-bit machine, a whole lot more for a 64-bit machine).
You can increase the amount of reserved stack space, typically as a setting to your linker.
int m[1000000] allocates the 1000000 ints on the stack; as the stack is limited, it throws the stack overflow runtime error.
vector<int> m; followed by push_back of 1000000 elements works because the vector internally allocates its memory on the heap, not on the stack. Only the vector object itself is on your application's stack, so it does not throw the stack overflow runtime error.
The vector object itself is on the stack, but internally it will allocate memory from the heap as needed to store an arbitrary number of elements. So the stack cost for it is small and fixed.
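For example (the exact value printed is implementation-dependent; commonly 24 bytes on 64-bit platforms):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> m;
    m.reserve(1000000);              // the million ints live on the heap
    std::cout << sizeof m << '\n';   // the stack footprint stays small and fixed
    return 0;
}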