C++: When to allocate on heap vs stack?

Whilst asking another question (and also before) I was wondering how do I judge whether to create an object on the heap or keep it as an object on the stack? What should I ask myself about the object to make the correct allocation?

Put it on the heap if you have to, the stack if you can.
What kinds of things do you need to put on the heap? Anything of varying length. Any object that might need to be null. Anything that's very large, lest you cause a stack overflow.

Simple answer.
When it goes out of scope, do you still want it to exist so that you can keep using it?

Depends on intended lifetime of the object.
If you want the object to stay alive even after the function returns, then HEAP, else STACK.
If an object is placed on the HEAP, the programmer must release it explicitly once it is no longer needed (delete for memory obtained with new, free() for memory obtained with malloc()); otherwise the program will leak memory.
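A minimal sketch of the lifetime rule, assuming C++14 or later. The names `Config`, `make_config`, and `sum_locally` are hypothetical and exist only for illustration. It also shows one common way to avoid the manual `delete`: a `std::unique_ptr` releases the heap object when its owner goes away.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical type used only for illustration.
struct Config {
    std::string name;
    int retries;
};

// The object must outlive this function, so it is created on the heap.
// std::unique_ptr deletes it automatically when its owner is destroyed.
std::unique_ptr<Config> make_config(const std::string& name) {
    auto cfg = std::make_unique<Config>();
    cfg->name = name;
    cfg->retries = 3;
    return cfg;  // ownership moves to the caller
}

// The object is only needed inside this function, so it lives on the stack
// and is destroyed automatically at the closing brace.
int sum_locally(int a, int b) {
    Config scratch{"scratch", a + b};
    return scratch.retries;
}
```

A caller would write something like `auto cfg = make_config("daemon");` and use `cfg->name` for as long as `cfg` stays in scope.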

Stack memory is fast. It is fast because (a) there is no system overhead to allocate the memory: the allocation is done by simply moving the stack pointer, typically in one instruction, and (b) the memory at the top of the stack is "hot", so it is probably already in cache. Heap memory is slower because (a) the allocator has to do bookkeeping work to find a free chunk of memory and (b) that chunk is probably not in cache, so touching it may evict some data you wanted to keep.
Stack memory doesn't get fragmented. It is possible that a heap eventually gets so fragmented, you can't allocate anything (even though ironically there is still enough unused memory!)
For long-lived data and for large data (multi KB or more), you have to use the heap.
The danger of allocating a bigger stack is that it might hurt you if you are running multiple threads. You have to size the stack for the "worst case" usage, and each thread requires its own stack. On a high core count machine (where you might have 200+ threads running), you may not want to arbitrarily increase the stack size. The heap, on the other hand, does not need to be sized for "worst case" usage, so it is much more efficient in that respect.

Two reasons to use the heap:
1- You want the data after the current scope.
2- You want to reserve large memory.
Other than that stay on stack.
Note: don't reserve a lot of memory on the stack, or you'll get a "Stack-overflow" ;)
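As a rough illustration of the size rule, here is a sketch in which a small scratch array stays on the stack and a large buffer goes to the heap through `std::vector`. The function names and the 16 MiB figure are assumptions chosen for the example; actual stack limits depend on the platform and its settings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A small, fixed-size scratch buffer is fine on the stack.
int sum_small() {
    int scratch[16] = {};
    for (int i = 0; i < 16; ++i) scratch[i] = i;
    int total = 0;
    for (int value : scratch) total += value;
    return total;
}

// A large buffer goes on the heap; std::vector allocates and frees it.
// A local array of this size would likely overflow a typical default stack.
std::size_t make_large_buffer() {
    const std::size_t bytes = 16u * 1024u * 1024u;  // 16 MiB
    std::vector<unsigned char> buffer(bytes);       // zero-initialized, heap-backed
    buffer.back() = 1;                              // touch the last byte
    return buffer.size();
}
```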

Related

Performance and security in C++ when avoiding use of pointer

I'm trying to create a class in C++ with an idea of absolute encapsulation and efficiency for the sake of practice. In my case this means every data member is supposed to be inside the class with no pointers pointing outside (e.g. to dynamically allocated storage).
For example, I'm using
char name [10];
instead of
std::string name;
char* name;
My idea is that objects of the class are created as completely enclosed blocks on the stack. I also expect performance to improve since, if I remember correctly, access to the stack is considerably faster than access to the heap.
Am I correct in those assumptions?
And is this idea of absolute encapsulation sensible outside practice? (For example to ensure safety, since there seems to be no risk of memory mismanagement or buffer overflow)
access to the stack is considerably faster than to the heap
This is false: an access to memory is an access to memory. Two things might have confused you here.
First, it is true that different types of memory can be accessed at different speeds. For example, the disk is usually the slowest (without talking about networking, which complicates things even further), while registers are usually the fastest. In between is the main memory, or RAM, where both the stack and the heap live. And then you can have caches, different types of disks, and so on.
Second, stack allocation is indeed faster than heap allocation, just because the allocation scheme is simpler. With the stack, as the name implies, you can only allocate and deallocate at the end, meaning you need to follow a specific order. With the heap, you can allocate pretty much anywhere, meaning that you can deallocate at any point and in any order. This implies some kind of management of the memory that comes with its own problems, for example fragmentation.
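The "specific order" can be observed directly: local objects are destroyed in the reverse order of their construction. A small sketch, where `Tracer` and `destruction_order` are hypothetical names introduced for the demonstration:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends its name to a log when it is destroyed.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    Tracer(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {}
    ~Tracer() { log->push_back(name); }
};

// Returns the order in which three stack objects were destroyed.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Tracer a("a", &log);
        Tracer b("b", &log);
        Tracer c("c", &log);
    }  // c, b, a are destroyed here, in that order
    return log;
}
```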
is this idea of absolute encapsulation sensible outside practice?
First of all, only using the stack is impossible in practice simply because of its limited size. While this size can vary in practice, it's unlikely to be more than 8MB currently. As soon as you need to load a file larger than that, you cannot do it on the stack.
However, even if stack size were practically unlimited, you would still need to deallocate things in the reverse order that you allocated them, otherwise it is no longer a stack. Many things are infeasible that way. For example, as soon as you want interactivity, you need some sort of event processing (to respond to user input), and this is usually done with a queue, which is like the opposite of a stack. Sure, you could allocate an insanely large queue, but that's infeasible in practice. Another example that comes to mind is networking. If you want to deal with multiple connections at once (like a web browser, for example), you need to deal with the memory associated with each one independently. Again, you could allocate an insane amount of memory to each connection, but again, that's infeasible in practice.
Also, note that encapsulation does not mean "no pointers to dynamically allocated memory". Instead, "hidden memory management" would be closer to the meaning of this concept.
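A sketch of what "hidden memory management" can look like, using a hypothetical `Person` class. The `std::string` member may allocate on the heap, but callers never see a pointer, and a name longer than a fixed `char name[10]` buffer cannot overflow anything.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class: the name may live on the heap, but callers never
// handle a pointer, an allocation, or a deallocation.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
    void rename(std::string new_name) { name_ = std::move(new_name); }
private:
    std::string name_;  // owns and frees its own storage
};
```

For example, `Person p("A name much longer than ten characters");` is safe, whereas copying that text into a ten-character array would not be.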

Find which heap an address belongs to?

I'm creating a memory management system and I need a way to find which heap an allocation I make belongs to.
For example, if I call HeapAlloc with the heap returned by GetProcessHeap(), I would expect the allocation to land in that heap, but it appears that it doesn't.
When I use GetProcessHeaps to run through the heaps, I find that the process heap is at something like 0x00670000 and my allocated address is at something like 0x0243a385 (in other words, nowhere near it).
Sometimes it can actually be before it (something like 0x004335ab).
So I'd like to know whether there is a way to reliably get the starting address (and the end address, if at all possible) of the heap that I made the allocation in.
Your understanding of heaps is wrong. In general, modern heaps do not rely on allocating a large chunk of data and then parcelling it up with each allocation as you assume (although they may use this as one of their strategies). This means there is no well defined 'start' or 'end' of a heap. As an example, by default, with Windows heaps large allocations always go direct to the operating system via VirtualAlloc(...) which means that allocations from one heap may interleave with allocations from another.
If you really need to work out which heap an allocation comes from, there is a way, although it's really slow, so you shouldn't rely on it except for debugging, logging, or similar. In normal code you should know where allocations came from, either by deducing it from context or by actually storing it.
Warnings aside, you can use HeapWalk to enumerate all allocations from each heap looking for the one you want.

Is it ok to allocate lots of memory on stack in single threaded applications?

I understand that if you have a multithreaded application, and you need to allocate a lot of memory, then you should allocate on heap. Stack space is divided up amongst threads of your application, thus the size of stack for each thread gets smaller as you create new threads. Thus, if you tried to allocate lots of memory on stack, it could overflow. But, assuming that you have a single-threaded application, is the stack size essentially the same as that for heap?
I read elsewhere that stack and heap don't have a clearly defined boundary in the address space, rather that they grow into each other.
P.S. Lifetime of the objects being allocated is not an issue. The objects get created first thing in the program, and get cleaned up at exit. I don't have to worry about them going out of scope and thus getting cleaned from stack space.
No, stack size is not the same as heap. Stack objects get pushed/popped in a LIFO manner, and the stack is used for things such as program flow. For example, arguments are "pushed" onto the stack before a function call, then "popped" into function arguments to be accessed. Recursion, therefore, uses a lot of stack space if you go too deep. The heap is really for pointers and allocated memory. In the real world, the stack is like the gears in your clock, and the heap is like your desk. Your clock sits on your desk, because it takes up room, but you use it for something completely different than your desk.
Check out this question on Stack Overflow:
Why is memory split up into stack and heap?

Find huge blocks of allocated memory

I have a program (a daemon) written in C/C++. It runs flawlessly, but after some period of time (it can be 5 days, a week, 2 weeks) it begins to allocate many megabytes of memory. I can't work out which parts of the code fail to free allocated memory. At startup, memory usage is about 20-30 megabytes. Then after some period, or maybe some event, it grows slowly, at about 1 MB per hour, and if the daemon is not terminated it can crash because no memory is available.
I've tried Valgrind, shutting the daemon down in the usual way once it had already allocated about 500 MB of memory. The shutdown took a really long time, but when it finished Valgrind reported no memory leaks except in the mysql_init/mysql_close procedures (about 504 bytes are definitely lost). Google says not to worry about this MySQL leak, and gives some reasons why memory diagnostic tools like Valgrind think that it is a leak.
I don't really know which parts of the code allocate memory but free it only at program shutdown. Please help me find them.
Valgrind only detects pointers that aren't deleted, more or less. Keeping them around when you don't need them is a different problem.
Firstly, all objects and memory are freed at shutdown. If there's a leak, valgrind will detect it as memory not referenced by an object, etc. Any leaks however are freed by the operating system in the end.
If you're catching all exceptions (...) and not doing anything with them, well, don't do that. It's a common cause.
Secondly, a logfile of destructors that are called during shutdown might be helpful. Perhaps at the end of main(), set a global flag; any destructors called while that flag is set can output that they exist. See if there are lots of objects that shouldn't be there.
A bit easier, you can use a global variable, each ctor can increment it by 1, and dtor decrement by 1. If you find that the number of objects isn't staying relatively the same, you can investigate which ones are making the problem using similar techniques.
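One way to sketch the counter idea is a small base class that counts live instances per type. `Counted` and `Session` are hypothetical names, and using an atomic counter is an assumption made in case the daemon is multithreaded.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical mixin: counts live instances of each type that derives from it.
template <typename T>
class Counted {
public:
    static long live() { return count_.load(); }
protected:
    Counted() { ++count_; }
    Counted(const Counted&) { ++count_; }  // copies are new objects too
    ~Counted() { --count_; }
private:
    static std::atomic<long> count_;
};

template <typename T>
std::atomic<long> Counted<T>::count_{0};

// Example of a class being tracked.
struct Session : Counted<Session> {
    int id = 0;
};
```

Logging `Session::live()` periodically shows whether the number of live objects stays roughly flat or keeps climbing.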
Thirdly, use Boost and its scoped smart pointers to help, but do not rely on smart pointers as the holy grail.
There is a possible underlying issue that I have come across. For long-running programs, memory fragmentation can lead to large memory usage. You may delete a 1 MB object, then try to create a 2 MB object; the creation will be in new space because that 1 MB 'free chunk' is not big enough. Then when you make a 512 KB object it may go into that 1 MB object's space, only using half of the available space, but making it so that your next 1 MB object needs to be allocated in new space.
Unfortunately this problem can become bad, due to small objects being allocated in persistent places. There may be, say, 100 or so 50-byte classes sitting 300 KB apart in memory, so no 512 KB object can be allocated in that space, and the program allocates an additional 512 KB for each new object, effectively wasting 90% of the actual 'free' space even though your program owns more than enough already.
This problem is hard to track down as the definite cause, but if you examine your program's flow, look for small allocations. Remember std::list/vector/etc. can all cause this; if you're looking to make a daemon that does lots of memory ops run for weeks, it's a good idea to pre-allocate memory using reserve(). Memory pools are even better.
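To make the effect of `reserve()` concrete, the following sketch counts how often a `std::vector` changes capacity while it is being filled, with and without reserving up front. The function name `count_reallocations` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n integers and returns how many times the vector's capacity changed.
// If reserve_first is true, capacity is requested once, up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With `reserve_first` set, the count is zero, so the container makes one allocation instead of a series of growing ones that each leave a hole behind.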
Depending on the time you want to put in, you can also either make (or find) a custom memory allocator that will report on objects when it shuts down, too.
Try Valgrind's Massif tool. From the Massif manual:
Also, there are certain space leaks that aren't detected by
traditional leak-checkers, such as Memcheck's. That's because the
memory isn't ever actually lost -- a pointer remains to it -- but it's
not in use. Programs that have leaks like this can unnecessarily
increase the amount of memory they are using over time. Massif can
help identify these leaks.
Massif should show you what's happening with memory, including where memory is allocated and not freed until shutdown.
Since you are sure there's no memory leak, your program might be allocating memory and storing data without leaking.
For example, let's say your program uses a linked list...
struct list {
    DATA_ARRAY arr;      // some data (DATA_ARRAY is a placeholder type)
    struct list *next;   // pointer to the next node
};
while (true) // infinite loop
{
    // Add new nodes to list
    // Store some data in the node
}
There's no leak here. But the loop adds new nodes forever and stores data and everything is perfectly valid. But memory usage increases all the time. Since you are running for 2-5 days, something like this is certainly possible.
You may have to inspect the code and free memory if no longer needed.
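One possible shape for "free memory if no longer needed" is a container with a fixed upper bound that drops its oldest entry when full. This is only a sketch: `BoundedHistory` is a hypothetical class, and whether old entries can safely be discarded depends on the program.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bounded history: keeps at most max_size entries and drops
// the oldest one when a new entry would exceed that limit.
class BoundedHistory {
public:
    explicit BoundedHistory(std::size_t max_size) : max_size_(max_size) {
        assert(max_size > 0);
    }
    void add(int value) {
        if (items_.size() == max_size_) {
            items_.pop_front();  // release the oldest entry
        }
        items_.push_back(value);
    }
    std::size_t size() const { return items_.size(); }
    int oldest() const { return items_.front(); }
private:
    std::size_t max_size_;
    std::deque<int> items_;
};
```

However long the loop runs, such a container never holds more than `max_size` entries.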

Is it bad to have large objects on the heap?

What are the consequences (if any) of having big objects stored on the heap rather than on the stack? I remember reading that it was preferable to have the bigger objects on the stack to limit heap fragmentation... is that true?
thanks
Edit: the question comes from a game I'm making, where my basic object, which will hold all the information about textures, entities, etc., will most likely be created on the heap. I don't really have any idea of its size; we could assume something like 300 MB.
Generally no.
It depends on the implementation, but on many systems the stack is much more limited in size than the heap. Heap fragmentation is typically going to be an issue if you have a large number of (small) objects allocated on the heap. It also tends to be caused by certain patterns of allocation and deallocation.
You have to keep in mind that the stack is limited. Its size can be configured in some environments, but that has drawbacks too. If your objects are short-lived, they can reside on the stack; but to keep them alive for a long time you would have to create them and then keep calling functions and passing the objects as parameters, because when the scope ends, your object goes out the window.
Following your edit, there's no way you're going to store an object of 300 MB on the stack.
You should decide where to put objects based on what their storage duration should be more so than what their size will be; however, as the stack is fairly limited, creating a large object on it is sometimes not a good idea and it may be necessary/more future-proof to new it and put the pointer to it in a scoped_ptr.
If you have enough big objects to cause significant heap fragmentation, or if you have an object that is so big as to be a significant factor by itself (to be honest, I'm not sure this is even possible), are you sure your design is right? Note also that your objects are likely to be smaller than the storage your containers use, and that storage (except that of std::arrays) is all dynamically allocated, i.e. on the heap.
In general, large objects should be created on the heap. The stack should generally be used only for small objects relevant to a particular stack context.