What is more efficient stack memory or heap? [duplicate] - c++

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
C++ Which is faster: Stack allocation or Heap allocation
What is more efficient from memory allocation perspective - stack memory or heap memory? What it depends on?
Obviously there is an overhead of dynamic allocation versus allocation on the stack. Using heap involves finding a location where the memory can be allocated and maintaining structures. On the stack it is simple as you already know where to put the element. I would like to understand what is the overhead in worst case in milliseconds on supporting structures that allow for dynamic allocation?

Stack is usually more efficient speed-wise, and simple to implement!
I tend to agree with Michael from Joel on Software site, who says,
It is more efficient to use the stack
when it is possible.
When you allocate from the heap, the
heap manager has to go through what is
sometimes a relatively complex
procedure, to find a free chunk of
memory. Sometimes it has to look
around a little bit to find something
of the right size.
This is not normally a terrible amount
of overhead, but it is definitely more
complex work compared to how the stack
functions. When you use memory from
the stack, the compiler is able to
immediately claim a chunk of memory
from the stack to use. It's
fundamentally a more simple procedure.
However, the size of the stack is
limited, so you shouldn't use it for
very large things, if you need
something that is greater than
something like 4k or so, then you
should always grab that from the heap
instead.
Another benefit of using the stack is
that it is automatically cleaned up
when the current function exits, you
don't have to worry about cleaning it
yourself. You have to be much more
careful with heap allocations to make
sure that they are cleaned up. Using
smart pointers that handle
automatically deleting heap
allocations can help a lot with this.
I sort of hate it when I see code that
does stuff like allocates 2 integers
from the heap because the programmer
needed a pointer to 2 integers and
when they see a pointer they just
automatically assume that they need to
use the heap. I tend to see this with
less experienced coders somewhat -
this is the type of thing that you
should use the stack for and just have
an array of 2 integers declared on the
stack.
Quoted from a really good discussion at Joel on Software site:
stack versus heap: more efficiency?

Allocating/freeing on the stack is more "efficient" because it just involves incrementing/decrementing a stack pointer, typically, while heap allocation is generally much more complicated. That said, it's generally not a good idea to have huge things on your stack as stack space is far more limited than heap space on most systems (especially when multiple threads are involved as each thread has a separate stack).

These two regions of memory are optimized for different use cases.
The stack is optimized for the case where objects are deallocated in a FIFO order - that is, newer objects are always allocated before older objects. Because of this, memory can be allocated and deallocated quickly by simply maintaining a giant array of bytes, then handing off or retracting the bytes at the end. Because the the memory needed to store local variables for function calls is always reclaimed in this way (because functions always finish executing in the reverse order from which they were called), the stack is a great place to allocate this sort of memory.
However, the stack is not good at doing other sorts of allocation. You cannot easily deallocate memory allocated off the stack that isn't the most-recently allocated block, since this leads to "gaps" in the stack and complicates the logic for determining where bytes are available. For these sorts of allocations, where object lifetime can't be determined from the time at which the object is allocated, the heap is a better place to store things. There are many ways to implement the heap, but most of them rely somehow on the idea of storing a giant table or linked list of the blocks that are allocated in a way that easily allows for locating suitable chunks of memory to hand back to clients. When memory is freed, it is then added back in to the table or linked list, and possibly some other logic is applied to condense the blocks with other blocks. Because of this overhead from the search time, the heap is usually much, much slower than the stack. However, the heap can do allocations in patterns that the stack normally is not at all good at, hence the two are usually both present in a program.
Interestingly, there are some other ways of allocating memory that fall somewhere in-between the two. One common allocation technique uses something called an "arena," where a single large chunk of memory is allocated from the heap that is then partitioned into smaller blocks, like in the stack. This gives the benefit that allocations from the arena are very fast if allocations are sequential (for example, if you're going to allocate a lot of small objects that all live around the same length), but the objects can outlive any particular function call. Many other approaches exist, and this is just a small sampling of what's possible, but it should make clear that memory allocation is all about tradeoffs. You just need to find an allocator that fits your particular needs.

Stack is much more efficient, but limited in size. I think it's something like 1MByte.
When allocating memory on the heap, I keep in mind the figure 1000. Allocating on the heap is something like 1000 times slower than on the stack.

Related

C++ Should I allocate large structs on the heap or the stack?

At what struct size should I consider allocating on the heap / free store using new keyword (or any other method of dynamic allocation) instead of on the stack?
10 bytes? 20 bytes? 200 bytes? 2KB? 2MB? Never?
Even if I wanted to pass it around by pointer, I could still take a reference from the stack variable. I understand that a stack variable will disappear at the end of the scope and dynamically allocated variables will not. I can deal with that either way, but so far I've not found any guidance for when to allocate dynamically. Sure, avoid stack overflow by not putting too much on the stack... but how much is too much?
Any guidance would be appreciated.
To actually answer the question, you'll need to know:
How big the stack is. This is often configurable at a compile-time, but may be capped by the target platform.
What is on the stack already. This knowledge is obtainable either by using deterministic call graph or by making decision actively, based on the current value of the stack pointer.
Without all of the above, any passive decision would be a gamble. Which also means that it's a gamble by default — indeed, in most cases we have to trust the compiler developers to understand how much of a stack space a "typical" program would need, and that our views on "typical" programs do align well and often.
In the long term, just like with any optimization problem, put your bets on measuring the overall performance and testing edge cases that may cause the stack overflow.
(Note. I probably should have searched before answering, but this question is essentially a duplicate of How much stack usage is too much?, nevertheless, here is my opinion on it.)
If you intend to keep a large buffer around for an extended period of time, then you should allocate it on the heap.
If you are in a recursive function, then allocating large buffers on the stack can quickly lead to problems.
Personally, I would keep buffers below ~4KiB on the stack and allocate larger buffers on the heap, unless you have a good overview of your program, and more specifically, how and where your functions are called.
That being said, if you constantly create and destroy buffers, consider putting them on the stack.
(If you are working on an embedded system, then that's a very different story.)

Is it a good idea to extend the size of the stack for large local data processing?

My environment is gcc, C++, Linux.
When my application does some data calculation, it may need a "large" (may be a few MBs) number of memory to store data, calculation results and other things. I got some code using kind of new, delete to finish this. Since there is no ownership outside some function scope, i think all these memory can be allocated in stack.
The problem is, the default stack size(8192Kb in my system) may not be enough. I may need to change stack size for these stack allocation. Morever, if the calculation needs more data in future, i may need to extend stack size again.
So is it an option to extend stack size? Since it cannot be allocated for specific functions, how will it impact on the whole app? Is it REALLY an improvement to allocate data on stack instead of on heap?
You bring up a controversial question that does not have a direct answer. There are pros and cons on each side. In particular:
Memory on the heap is more easy to control: you can check the return value or allow throwing exceptions. When the stack overflows, your thread is simply unloaded with a good change that debugger will not show anything meaningful.
On the contrary, stack allocations happen automatically, and you do not need to do anything specific. I always favor this simplicity.
There is nothing fundamentally wrong in allocating large amounts of data on the stack. At the end of the day any type of memory is finally a memory. This is means that the total amount of required memory is what really matters. Where this memory is allocated is less important. When there is enough memory for your application to work, there is no difference where the memory is allocated. It can be static for example.
Different systems have different rules of allocation. This means that final decision may depend on the actual system.
While it's true that stack allocations are more efficient (more apparent in multi-threaded programs), but if the usage pattern in your case is "allocate a big chunk of memory, process the data, deallocate it", then there won't be much of improvement.
Instead rewrite the code to use RAII, e.g. std::vector or std::unique_ptr so there won't be explicit buggy deletes.
If you use Linux, you can change the stack size with the ulimit command. However, I think the memory which allocated from the heap also is good for you.

Dealing with fragmentation in a memory pool?

Suppose I have a memory pool object with a constructor that takes a pointer to a large chunk of memory ptr and size N. If I do many random allocations and deallocations of various sizes I can get the memory in such a state that I cannot allocate an M byte object contiguously in memory even though there may be a lot free! At the same time, I can't compact the memory because that would cause a dangling pointer on the consumers. How does one resolve fragmentation in this case?
I wanted to add my 2 cents only because no one else pointed out that from your description it sounds like you are implementing a standard heap allocator (i.e what all of us already use every time when we call malloc() or operator new).
A heap is exactly such an object, that goes to virtual memory manager and asks for large chunk of memory (what you call "a pool"). Then it has all kinds of different algorithms for dealing with most efficient way of allocating various size chunks and freeing them. Furthermore, many people have modified and optimized these algorithms over the years. For long time Windows came with an option called low-fragmentation heap (LFH) which you used to have to enable manually. Starting with Vista LFH is used for all heaps by default.
Heaps are not perfect and they can definitely bog down performance when not used properly. Since OS vendors can't possibly anticipate every scenario in which you will use a heap, their heap managers have to be optimized for the "average" use. But if you have a requirement which is similar to the requirements for a regular heap (i.e. many objects, different size....) you should consider just using a heap and not reinventing it because chances are your implementation will be inferior to what OS already provides for you.
With memory allocation, the only time you can gain performance by not simply using the heap is by giving up some other aspect (allocation overhead, allocation lifetime....) which is not important to your specific application.
For example, in our application we had a requirement for many allocations of less than 1KB but these allocations were used only for very short periods of time (milliseconds). To optimize the app, I used Boost Pool library but extended it so that my "allocator" actually contained a collection of boost pool objects, each responsible for allocating one specific size from 16 bytes up to 1024 (in steps of 4). This provided almost free (O(1) complexity) allocation/free of these objects but the catch is that a) memory usage is always large and never goes down even if we don't have a single object allocated, b) Boost Pool never frees the memory it uses (at least in the mode we are using it in) so we only use this for objects which don't stick around very long.
So which aspect(s) of normal memory allocation are you willing to give up in your app?
Depending on the system there are a couple of ways to do it.
Try to avoid fragmentation in the first place, if you allocate blocks in powers of 2 you have less a chance of causing this kind of fragmentation. There are a couple of other ways around it but if you ever reach this state then you just OOM at that point because there are no delicate ways of handling it other than killing the process that asked for memory, blocking until you can allocate memory, or returning NULL as your allocation area.
Another way is to pass pointers to pointers of your data(ex: int **). Then you can rearrange memory beneath the program (thread safe I hope) and compact the allocations so that you can allocate new blocks and still keep the data from old blocks (once the system gets to this state though that becomes a heavy overhead but should seldom be done).
There are also ways of "binning" memory so that you have contiguous pages for instance dedicate 1 page only to allocations of 512 and less, another for 1024 and less, etc... This makes it easier to make decisions about which bin to use and in the worst case you split from the next highest bin or merge from a lower bin which reduces the chance of fragmenting across multiple pages.
Implementing object pools for the objects that you frequently allocate will drive fragmentation down considerably without the need to change your memory allocator.
It would be helpful to know more exactly what you are actually trying to do, because there are many ways to deal with this.
But, the first question is: is this actually happening, or is it a theoretical concern?
One thing to keep in mind is you normally have a lot more virtual memory address space available than physical memory, so even when physical memory is fragmented, there is still plenty of contiguous virtual memory. (Of course, the physical memory is discontiguous underneath but your code doesn't see that.)
I think there is sometimes unwarranted fear of memory fragmentation, and as a result people write a custom memory allocator (or worse, they concoct a scheme with handles and moveable memory and compaction). I think these are rarely needed in practice, and it can sometimes improve performance to throw this out and go back to using malloc.
write the pool to operate as a list of allocations, you can then extended and destroyed as needed. this can reduce fragmentation.
and/or implement allocation transfer (or move) support so you can compact active allocations. the object/holder may need to assist you, since the pool may not necessarily know how to transfer types itself. if the pool is used with a collection type, then it is far easier to accomplish compacting/transfers.

Stack Memory vs Heap Memory [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
What and where are the stack and heap
I am programming in C++ and I am always wondering what exactly is stack memory vs heap memory. All I know is when I call new, I would get memory from heap. If if create local variables, I would get memory from stack. After some research on internet, the most common answer is stack memory is temporary and heap memory is permanent.
Is stack and heap memory model a concept of operating system or computer architecture? So some of it might not follow stack and heap memory model or all of them follow it?
Stack and heap memory is the abstraction over the memory model of the virtual memory ( which might swap memory between disk and RAM). So both stack and heap memory physically might be RAM or the disk? Then what is the reason where heap allocation seems to be slower than the stack counterpart?
Also, the main program would be run in the stack or a heap?
Also, what would happen if a process run out of the stack memory or heap memory allocated?
Thanks
In C++ the stack memory is where local variables get stored/constructed. The stack is also used to hold parameters passed to functions.
The stack is very much like the std::stack class: you push parameters onto it and then call a function. The function then knows that the parameters it expects can be found on the end of the stack. Likewise, the function can push locals onto the stack and pop them off it before returning from the function. (caveat - compiler optimizations and calling conventions all mean things aren't this simple)
The stack is really best understood from a low level and I'd recommend Art of Assembly - Passing Parameters on the Stack. Rarely, if ever, would you consider any sort of manual stack manipulation from C++.
Generally speaking, the stack is preferred as it is usually in the CPU cache, so operations involving objects stored on it tend to be faster. However the stack is a limited resource, and shouldn't be used for anything large. Running out of stack memory is called a Stack buffer overflow. It's a serious thing to encounter, but you really shouldn't come across one unless you have a crazy recursive function or something similar.
Heap memory is much as rskar says. In general, C++ objects allocated with new, or blocks of memory allocated with the likes of malloc end up on the heap. Heap memory almost always must be manually freed, though you should really use a smart pointer class or similar to avoid needing to remember to do so. Running out of heap memory can (will?) result in a std::bad_alloc.
Stack memory is specifically the range of memory that is accessible via the Stack register of the CPU. The Stack was used as a way to implement the "Jump-Subroutine"-"Return" code pattern in assembly language, and also as a means to implement hardware-level interrupt handling. For instance, during an interrupt, the Stack was used to store various CPU registers, including Status (which indicates the results of an operation) and Program Counter (where was the CPU in the program when the interrupt occurred).
Stack memory is very much the consequence of usual CPU design. The speed of its allocation/deallocation is fast because it is strictly a last-in/first-out design. It is a simple matter of a move operation and a decrement/increment operation on the Stack register.
Heap memory was simply the memory that was left over after the program was loaded and the Stack memory was allocated. It may (or may not) include global variable space (it's a matter of convention).
Modern pre-emptive multitasking OS's with virtual memory and memory-mapped devices make the actual situation more complicated, but that's Stack vs Heap in a nutshell.
It's a language abstraction - some languages have both, some one, some neither.
In the case of C++, the code is not run in either the stack or the heap. You can test what happens if you run out of heap memory by repeatingly calling new to allocate memory in a loop without calling delete to free it it. But make a system backup before doing this.

When do you overload operator new? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Any reason to overload global new and delete?
In what cases does it make perfect sense to overload operator new?
I heard you do it in a class that is very often allocated using new. Can you give an example?
And are there other cases where you would want to overload operator new?
Update: Thanks for all the answers so far. Could someone give a short code example? That's what I meant when I asked about an example above. Something like: Here is a small toy class, and here is a working operator new for that class.
Some reasons to overload per class
1. Instrumentation i.e tracking allocs, audit trails on caller such as file,line,stack.
2. Optimised allocation routines i.e memory pools ,fixed block allocators.
3. Specialised allocators i.e contiguous allocator for 'allocate once on startup' objects - frequently used in games programming for memory that needs to persist for the whole game etc.
4. Alignment. This is a big one on most non-desktop systems especially in games consoles where you frequently need to allocate on e.g. 16 byte boundaries
5. Specialised memory stores ( again mostly non-desktop systems ) such as non-cacheable memory or non-local memory
The best case I found was to prevent heap fragmentation by providing several heaps of fixed-sized blocks. ie, you create a heap that entirely consists of 4-byte blocks, another with 8 byte blocks etc.
This works better than the default 'all in one' heap because you can reuse a block, knowing that your allocation will fit in the first free block, without having to check or walk the heap looking for a free space that's the right size.
The disadvantage is that you use up more memory, if you have a 4-byte heap and a 8-byte heap, and want to allocate 6 bytes.. you're going to have to put it in the 8-byte heap, wasting 2 bytes. Nowadays, this is hardly a problem (especially when you consider the overhead of alternative schemes)
You can optimise this, if you have a lot of allocations to make, you can create a heap of that exact size. Personally, I think wasting a few bytes isn't a problem (eg you are allocating a lot of 7 bytes, using an 8-byte heap isn't much of a problem).
We did this for a very high performance system, and it worked wonderfully, it reduced our performance issues due to allocations and heap fragmentation dramatically and was entirely transparent to the rest of the code.
Some reasons to overload operator new
Tracking and profiling of memory, and to detect memory leaks
To create object pools (say for a particle system) to optimize memory usage
You do it in cases you want to control allocation of memory for some object. Like when you have small objects that would only be allocated on heap, you could allocate space for 1k objects and use the new operator to allocate 1k objects first, and to use the space until used up.