I don't quite get the point of dynamically allocated memory and I am hoping you guys can make things clearer for me.
First of all, every time we allocate memory we simply get a pointer to that memory.
int * dynInt = new int;
So what is the difference between doing what I did above and:
int someInt;
int* dynInt = &someInt;
As I understand, in both cases memory is allocated for an int, and we get a pointer to that memory.
So what's the difference between the two? When is one method preferred over the other?
Furthermore, why do I need to free up memory with
delete dynInt;
in the first case, but not in the second?
My guesses are:
When dynamically allocating memory for an object, the object doesn't get initialized, while if you do something like in the second case, the object gets initialized. If this is the only difference, is there any motivation behind this apart from the fact that dynamically allocating memory is faster?
The reason we don't need to use delete for the second case is because the fact that the object was initialized creates some kind of an automatic destruction routine.
Those are just guesses would love it if someone corrected me and clarified things for me.
The difference is in storage duration.
Objects with automatic storage duration are your "normal" objects that automatically go out of scope at the end of the block in which they're defined.
Create them like int someInt;
You may have heard of them as "stack objects", though I object to this terminology.
Objects with dynamic storage duration have something of a "manual" lifetime; you have to destroy them yourself with delete, and create them with the keyword new.
You may have heard of them as "heap objects", though I object to this, too.
The use of pointers is actually not strictly relevant to either of them. You can have a pointer to an object of automatic storage duration (your second example), and you can have a pointer to an object of dynamic storage duration (your first example).
But it's rare that you'll want a pointer to an automatic object, because:
you don't have one "by default";
the object isn't going to last very long, so there's not a lot you can do with such a pointer.
By contrast, dynamic objects are often accessed through pointers, simply because the syntax comes close to enforcing it. new returns a pointer for you to use, you have to pass a pointer to delete, and (aside from using references) there's actually no other way to access the object. It lives "out there" in a cloud of dynamicness that's not sitting in the local scope.
Because of this, the usage of pointers is sometimes confused with the usage of dynamic storage, but in fact the former is not causally related to the latter.
An object created like this:
int foo;
has automatic storage duration - the object lives until the variable foo goes out of scope. This means that in your second example, dynInt will be an invalid pointer once someInt goes out of scope (for example, at the end of a function).
An object created like this:
int* foo = new int;
has dynamic storage duration - the object lives until you explicitly call delete on it.
Initialization of the objects is an orthogonal concept; it is not directly related to which type of storage-duration you use. See here for more information on initialization.
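To illustrate the point about dangling pointers above, here is a minimal sketch (both function names are invented for the example):
int* make_dangling()
{
    int someInt = 42; // automatic storage duration
    return &someInt;  // the object dies at the end of this function; the pointer dangles
}

int* make_valid()
{
    return new int(42); // dynamic storage duration: lives until someone calls delete
}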
Your program gets an initial chunk of memory at startup. This memory is called the stack. The amount is usually around 2MB these days.
Your program can ask the OS for additional memory. This is called dynamic memory allocation. This allocates memory on the free store (C++ terminology) or the heap (C terminology). You can ask for as much memory as the system is willing to give (multiple gigabytes).
The syntax for allocating a variable on the stack looks like this:
{
int a; // allocate on the stack
} // automatic cleanup on scope exit
The syntax for allocating a variable using memory from the free store looks like this:
int * a = new int; // ask the OS for memory to store an int
delete a; // user is responsible for deleting the object
To answer your questions:
When is one method preferred to the other?
Generally stack allocation is preferred.
Dynamic allocation is required when you need to store a polymorphic object using its base type.
Always use a smart pointer to automate deletion:
C++03: boost::scoped_ptr, boost::shared_ptr or std::auto_ptr.
C++11: std::unique_ptr or std::shared_ptr.
For example:
// stack allocation (safe)
Circle c;
// heap allocation (unsafe)
Shape * shape = new Circle;
delete shape;
// heap allocation with smart pointers (safe)
std::unique_ptr<Shape> shape(new Circle);
Furthermore, why do I need to free up memory in the first case, but not in the second case?
As I mentioned above stack allocated variables are automatically deallocated on scope exit.
Note that you are not allowed to delete stack memory. Doing so is undefined behavior and will most likely crash your application.
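As a quick sketch of that last point (the commented-out line is the mistake):
void wrong_delete()
{
    int x = 0;
    int* p = &x;
    // delete p; // undefined behavior: p does not point to memory obtained from new
    (void)p;     // silence the unused-variable warning
}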
For a single integer it only makes sense if you need to keep the value after, for example, returning from a function. Had you declared someInt as you did, it would have been invalidated as soon as it went out of scope.
However, in general there is a greater use for dynamic allocation. There are many things that your program cannot know before it runs, because they depend on input. For example, your program needs to read an image file. How big is that image file? We could say we store it in an array like this:
unsigned char data[1000000];
But that would only work if the image size was less than or equal to 1000000 bytes, and would also be wasteful for smaller images. Instead, we can dynamically allocate the memory:
unsigned char* data = new unsigned char[file_size];
Here, file_size is determined at runtime. You couldn't possibly tell this value at the time of compilation.
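Putting it together, a minimal sketch (load_image and the file-reading step are hypothetical placeholders):
#include <cstddef>

unsigned char* load_image(std::size_t file_size)
{
    unsigned char* data = new unsigned char[file_size]; // size known only at runtime
    // ... read the file contents into data ...
    return data; // the caller must delete[] data when done with it
}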
Read more about dynamic memory allocation and also garbage collection
You really need to read a good C or C++ programming book.
Explaining in detail would take a lot of time.
The heap is the memory inside which dynamic allocation (with new in C++ or malloc in C) happens. There are system calls involved with growing and shrinking the heap. On Linux, they are mmap & munmap (used to implement malloc and new etc...).
You can call the allocation primitive many times. So you could put int *p = new int; inside a loop and get a fresh location every time through the loop!
Don't forget to release memory (with delete in C++ or free in C). Otherwise, you'll get a memory leak - a naughty kind of bug. On Linux, valgrind helps to catch them.
Whenever you use new in C++, memory is typically allocated through malloc, which itself calls the sbrk system call (or similar). Therefore no one except the allocator (and ultimately the OS) has knowledge of the requested size. So you'll have to use delete (which calls free, which goes back to the allocator) to give memory back to the system. Otherwise you'll get a memory leak.
Now, when it comes to your second case, the compiler has knowledge of the size of the allocated memory - in your case, the size of one int. Setting a pointer to the address of this int does not change anything about that knowledge. In other words: the compiler is able to take care of freeing the memory. In the first case, with new, this is not possible.
In addition: new (and malloc, respectively) does not need to allocate exactly the requested size, which makes things a bit more complicated.
Edit
Two more common phrases: the second case is also known as automatic (or static) memory allocation (handled by the compiler), while the first case is dynamic memory allocation (handled by the runtime system).
What happens if your program is supposed to let the user store any number of integers? Then you'll need to decide during run-time, based on the user's input, how many ints to allocate, so this must be done dynamically.
In a nutshell, a dynamically allocated object's lifetime is controlled by you and not by the language. This allows you to let it live as long as it is required (as opposed to until the end of the scope), possibly determined by a condition that can only be evaluated at run-time.
Also, dynamic memory is typically much more "scalable" - i.e. you can allocate more and/or larger objects compared to stack-based allocation.
The allocation essentially "marks" a piece of memory so no other object can be allocated in the same space. De-allocation "unmarks" that piece of memory so it can be reused for later allocations. If you fail to deallocate memory after it is no longer needed, you get a condition known as a "memory leak" - your program is occupying memory it no longer needs, leading to possible failure to allocate new memory (due to the lack of free memory), and generally putting an unnecessary strain on the system.
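A short sketch of how such a leak happens in practice:
void leak_example()
{
    int* p = new int(7); // "marks" sizeof(int) bytes (plus bookkeeping) as in use
    p = new int(8);      // the first allocation is now unreachable: a memory leak
    delete p;            // only the second allocation is "unmarked"
}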
Related
An experienced C++ user told me that I should strive for using heap variables, i.e.:
A* obj = new A("A");
as opposed to:
A obj("A");
Aside from all that stuff about using pointers being nice and flexible, he said it's better to put things on the heap rather than the stack (something about the stack being smaller than the heap?). Is it true? If so why?
NB: I know about issues with lifetime. Let's assume I have managed the lifetime of these variables appropriately. (i.e. the only criteria of concern is heap vs. stack storage with no lifetime concern)
Depending on the context we can consider heap or stack. Every thread gets a stack, and the thread executes instructions by invoking functions. When a function is called, the function's variables are pushed onto the stack. When the function returns, the stack rolls back and memory is reclaimed. Now there is a size limitation for the thread-local stack; it varies and can be tweaked to some extent. Considering this, if every object is created on the stack and an object requires a lot of memory, then the stack space will be exhausted, resulting in a stack overflow error. Besides this, if the object is to be accessed by multiple threads, then storing such an object on the stack makes no sense.
Thus small variables, small objects whose size can be determined at compile time, and pointers should be stored on the stack. The concern with storing objects on the heap or free store is that memory management becomes difficult. There are chances of memory leaks, which is bad. Also, if the application tries to access an object which has already been deleted, an access violation can happen, which can crash the application.
C++11 introduces smart pointers (shared, unique) to make memory management with the heap easier. The actual referenced object is on the heap, but it is encapsulated by the smart pointer, which always lives on the stack. Hence when the stack rolls back, during a function return or an exception, the destructor of the smart pointer deletes the actual object on the heap. In the case of a shared pointer, a reference count is maintained and the actual object is deleted when the reference count reaches zero.
http://en.wikipedia.org/wiki/Smart_pointer
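A minimal sketch of the mechanism described above (Widget is a placeholder type):
#include <memory>

struct Widget { /* ... */ };

void demo()
{
    std::unique_ptr<Widget> owner(new Widget);                   // heap object, one owner
    std::shared_ptr<Widget> shared = std::make_shared<Widget>(); // heap object, ref-counted
    // Both smart pointers themselves live on the stack. When they go out of
    // scope (normal return or exception), their destructors delete the heap
    // objects; the shared_ptr does so only when its reference count hits zero.
}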
There are no general rules regarding use of stack allocated vs heap allocated variables. There are only guidelines, depending on what you are trying to do.
Here are some pros and cons:
Heap Allocation:
Pros:
more flexible - in case you have a lot of information that is not available at compile-time
bigger in size - you can allocate more - however, it's not infinite, so at some point your program might run out of memory if allocations/deallocations are not handled correctly
Cons:
slower - dynamic allocation is usually slower than stack allocation
may cause memory fragmentation - allocating and deallocating objects of different sizes will make the memory look like Swiss cheese :) causing some allocations to fail if there is no memory block of the required size available
harder to maintain - each dynamic allocation must be matched by a deallocation, which is the user's responsibility - this is error prone, as people often forget to match every malloc() call with a free() call or every new with a delete
Stack allocation:
Pros:
faster - which is important mostly on embedded systems (I believe that for embedded there is a MISRA rule which forbids dynamic allocation)
does not cause memory fragmentation
makes the behavior of applications more deterministic - e.g. it removes the possibility of running out of memory at some arbitrary point during execution
less error prone - as the user does not need to handle deallocation
Cons:
less flexible - you have to have all information available at compile-time (data size, data structure, etc.)
smaller in size - however there are ways to calculate total stack size of an application, so running out of stack can be avoided
I think this captures a few of the pros and cons. I'm sure there are more.
In the end it depends on what your application needs.
The stack should be preferred to the heap, as stack-allocated variables are automatic variables: they are destroyed automatically when the program goes out of their context.
In fact, the lifespans of objects created on the stack and on the heap are different:
The local variables of a function or a code block {} (not allocated by new) are on the stack. They are automatically destroyed when you return from the function (their destructors are called and their memory is freed).
But if you need an object to be used outside of the function, you will have to allocate it on the heap (using new) or return a copy.
Example:
void myFun()
{
A onStack; // On the stack
A* onHeap = new A(); // On the heap
// Do things...
} // End of the function: onStack is destroyed, but the object pointed to by onHeap is still alive
In this example, onHeap will still have its memory allocated when the function ends. So if you don't keep a pointer to it somewhere, you won't be able to delete it and free the memory. It's a memory leak, as the memory will be lost until the program ends.
However, if you were to return a pointer to onStack, since onStack was destroyed when exiting the function, using the pointer would cause undefined behaviour. Using onHeap, on the other hand, is still perfectly valid.
To better understand how stack variables work, you should look up information about the call stack, such as this article on Wikipedia. It explains how variables are stacked for use in a function.
It is better to avoid using new as much as possible in C++.
However, there are times when you cannot avoid it.
For example:
Wanting variables to exist beyond their scopes.
So it should be horses for courses really, but if you have a choice always avoid heap allocated variables.
The answer is not as clear cut as some would make you believe.
In general, you should prefer automatic variables (on the stack) because it's just plain easier. However some situations call for dynamic allocations (on the heap):
unknown size at compile time
extensible (containers use heap allocation internally)
large objects
The latter is a bit tricky. In theory, automatic variables could be arbitrarily large, but computers are finite and, worse still, most of the time the size of the stack is finite too (which is an implementation issue).
Personally, I use the following guideline:
local objects are allocated automatically
local arrays are deferred to std::vector<T> which internally allocates them dynamically
It has served me well (which is just anecdotal evidence, obviously).
Note: you can (and probably should) tie the life of the dynamically allocated object to that of a stack variable using RAII: smart pointers or containers.
C++ has no mention of the Heap or the Stack. As far as the language is concerned they do not exist/are not separate things.
As for a practical answer - use what works best - do you need fast - do you need guarantees. Application A might be much better with everything on the Heap, App B might fragment OS memory so badly it kills the machine - there is no right answer :-(
Simply put, don't manage your own memory unless you need to. ;)
Stack = Static Data allocated during compile time. (not dynamic)
Heap = Dynamic Data allocated during run time. (Very dynamic)
Although pointers are on the Stack...Those pointers are beautiful because they open the doors for dynamic, spontaneous creation of data (depending on how you code your program).
(But I'm just a savage, so why does it matter what I say)
In "The C++ Programming Language" book Stroustrup says:
"To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. Typically, one word is used to hold the object’s size.
That means every object allocated by new has its size located somewhere in the heap. Is the location known, and if it is, how can I access it?
In actual fact, typical implementations of memory allocators store some other information too.
There is no standard way to access this information; in fact, there is nothing in the standard saying WHAT information is stored either (the size in bytes, the number of elements and their size, a pointer to the last element, etc.).
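That said, some platforms expose their allocator's bookkeeping through non-standard calls; for instance, glibc provides malloc_usable_size (MSVC has a similar _msize). A non-portable sketch:
#include <cstdio>
#include <cstdlib>
#include <malloc.h> // glibc-specific header

int main()
{
    void* p = std::malloc(65);
    // Often prints more than 65: the allocator may have rounded up internally.
    std::printf("usable size: %zu\n", malloc_usable_size(p));
    std::free(p);
}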
Edit:
If you have the base address of the object and the correct type, I suspect the size of the allocation could be found relatively easily (though not necessarily "at no cost at all"). However, there are several problems:
It assumes you have the original pointer.
It assumes the memory is allocated exactly with that runtime library's allocation code.
It assumes the allocator doesn't "round" the allocation address in some way.
To illustrate how this could go wrong, let's say we do this:
size_t get_len_array(int *mem)
{
    return allocated_length(mem); // hypothetical function that asks the allocator
}
...
void func()
{
int *p = new int[100];
cout << get_len_array(p);
delete [] p;
}
void func2()
{
int buf[100];
cout << get_len_array(buf); // Ouch!
}
That means every object allocated by new has its size located somewhere in the heap. Is the location known and if it is how can I access it?
Not really, that is not needed for all cases. To simplify the reasoning, there are two levels at which the sizes could be needed. At the language level, the compiler needs to know what to destroy. At the allocator level, the allocator needs to know how to release the memory given only a pointer.
At the language level, only the array versions new[] and delete[] need to handle any size. When you allocate with new, you get a pointer with the type of the object, and that type has a given size.
To destroy the object the size is not needed. When you delete, either the pointer is to the correct type, or the static type of the pointer is a base and the destructor is virtual. All other cases are undefined behavior, and thus can be ignored (anything can happen). If it is the correct type, then the size is known. If it is a base with a virtual destructor, the dynamic dispatch will find the final overrider, and at that point the type is known.
There could be different strategies to manage this. The one used in the Itanium C++ ABI (used by multiple compilers on multiple platforms, although not Visual Studio), for example, generates up to 3 different destructors per type, one of them being a version that takes care of releasing the memory. So although delete ptr is defined in terms of calling the appropriate destructor and then releasing the memory, in this particular ABI delete ptr calls a special destructor that both destroys the object and releases the memory.
When you use new[], the type of the pointer is the same regardless of the number of elements in the dynamic array, so the type cannot be used to retrieve that information. A common implementation is allocating an extra integral value, storing the size there, followed by the real objects, then returning a pointer to the first object. delete[] would then move the received pointer one integer back, read the number of elements, call the destructor for all of them, and then release the memory (using the pointer obtained from the allocator, not the pointer given to the program). This is really only needed if the type has a non-trivial destructor; if the type has a trivial destructor, the implementation does not need to store the size and can avoid storing that number.
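A hand-rolled illustration of that "store the count in front" strategy (this is roughly what an implementation might do internally, not code you would write yourself):
#include <cstddef>
#include <new>

void* allocate_with_count(std::size_t count, std::size_t elem_size)
{
    // Layout: [count][element 0][element 1]... ; return a pointer past the count.
    // (A real implementation would also worry about alignment.)
    std::size_t* raw = static_cast<std::size_t*>(
        ::operator new(sizeof(std::size_t) + count * elem_size));
    *raw = count;
    return raw + 1;
}

void deallocate_with_count(void* p)
{
    std::size_t* raw = static_cast<std::size_t*>(p) - 1;
    // the *raw destructor calls would happen here, in reverse order, and then:
    ::operator delete(raw);
}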
Outside the language level, the real memory allocator (think of malloc) needs to know how much memory was allocated so that the same amount can be released. In some cases that can be done by attaching the metadata to the memory buffer in the same way that new[] stores the size of the array: by acquiring a larger block, storing the metadata there and returning a pointer beyond it. The deallocator then undoes the transformation to get to the metadata.
This is, on the other hand, not always needed. A common implementation for small allocations is to allocate pages of memory to form pools from which the small allocations are then obtained. To make this efficient, the allocator considers only a few different sizes, and allocations that don't fit one of the sizes exactly are bumped up to the next size. If you request, for example, 65 bytes, the allocator might actually give you 128 bytes (assuming pools of 64 and 128 bytes). Thus, given one of the larger blocks managed by the allocator, all pointers that were allocated from it have the same size. The allocator can then find the block from which a pointer was allocated and infer the size from it.
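The size-class idea can be sketched in a few lines (the class sizes here are invented for the example):
#include <cstddef>

std::size_t round_to_size_class(std::size_t n)
{
    static const std::size_t classes[] = { 16, 32, 64, 128, 256 };
    for (std::size_t c : classes)
        if (n <= c)
            return c; // bump the request up to the next supported size
    return n;         // larger requests are handled outside the pools
}
// round_to_size_class(65) == 128, matching the 65-byte example above.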
Of course, these are all implementation details that are not accessible to the C++ program in a standard, portable way, and the exact implementation can differ not just between programs, but also between execution environments. If you are interested in knowing how the information is really kept in your environment, you may be able to find out, but I would think twice before trying to use it for anything other than learning purposes.
You are not deleting an object directly; instead you send a pointer to the delete operator.
Reference C++
You use delete by following it with a pointer to a block of memory originally allocated with new:
int * ps = new int; // allocate memory with new
. . . // use the memory
delete ps; // free memory with delete when done
This frees the memory to which ps points; it doesn't remove the pointer ps itself. You can reuse ps, for example, to point to another new allocation.
I have an array, called x, whose size is 6*sizeof(float). I'm aware that declaring:
float x[6];
would allocate 6*sizeof(float) for x in the stack memory. However, if I do the following:
float *x; // in class definition
x = new float[6]; // in class constructor
delete [] x; // in class destructor
I would be allocating dynamic memory of 6*sizeof(float) to x. If the size of x does not change for the lifetime of the class, in terms of best practices for cleanliness and speed (I do vaguely recall, perhaps incorrectly, that stack memory operations are faster than dynamic memory operations), should I make sure that x is statically rather than dynamically allocated? Thanks in advance.
Declaring the array of fixed size will surely be faster. Each separate dynamic allocation requires finding an unoccupied block and that's not very fast.
So if you really care about speed (and have profiled), the rule is: if you don't need dynamic allocation, don't use it. If you do need it, think twice about how much to allocate, since reallocating is not very fast either.
Using an array member will be cleaner (more succinct, less error prone) and faster as there is no need to call allocation and deallocation functions. You will also tend to improve 'locality of reference' for the structure being allocated.
The two main reasons for using dynamically allocated memory for such a member are where the required size is only known at run time, or where the required size is large and it is known that this will have a significant impact on the available stack space on the target platform.
TBH, data on the stack generally sits in the cache and hence is faster. However, if you dynamically allocate something once and then use it regularly, it will also be cached and hence pretty much as fast.
The important thing is to avoid allocating and deallocating regularly (i.e. each time a function is called). If you just avoid doing regular allocations and deallocations (i.e. allocate and deallocate once only), then stack- and heap-allocated arrays will perform pretty much as quickly as each other.
Yes, declaring the array statically will perform faster.
This is very easy to test: just write a simple wrapping loop to instantiate X number of these objects. You can also step through the machine code and see the larger number of opcodes required to dynamically allocate the memory.
Static allocation is faster (no need to ask for memory), and there's no way you will forget to delete it or delete it with the incorrect delete operator (delete instead of delete[]).
Construction and usage of dynamic/heap data consists of the following steps:
ask for memory to allocate the object (calling the new operator). If no memory is available, new will throw a bad_alloc exception.
create the object (also done by new, which invokes the constructor)
release the memory (with the delete/delete[] operator) - delete calls the object's destructor. Here a user can make a lot of mistakes:
forget to call delete - this will lead to a memory leak
call the wrong delete operator (e.g. delete instead of delete[]) - bad things will happen
call delete twice - bad things can happen
When using static objects/arrays of objects, there's no need for the user to allocate and release memory. This makes the code simpler and less error-prone.
So, in conclusion: if you know the size of the array at compile time and memory usage is not a concern (perhaps not all entries will be used at runtime), a static array is obviously the preferred one.
For dynamically allocated data, it's worth looking at smart pointers (here).
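For instance, a sketch of how a smart pointer removes the whole class of mistakes listed above:
#include <memory>

void safe()
{
    std::unique_ptr<int[]> a(new int[10]); // array form: delete[] is called automatically
    a[0] = 42;
} // no forgotten delete, no wrong delete operator, no double delete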
Don't confuse the following cases:
int global_x[6]; // an array with static storage duration
struct Foo {
int *pointer_x; // a pointer member in instance data
int member_x[6]; // an array in instance data
Foo() {
pointer_x = new int[6]; // a heap-allocated array
}
~Foo() { delete[] pointer_x; }
};
int main() {
int auto_x[6]; // an array on the stack (automatic variable)
Foo auto_f; // a Foo on the stack
Foo *dyn_f = new Foo(); // a heap-allocated Foo.
}
Now:
auto_f.member_x is on the stack, because auto_f is on the stack.
(*dyn_f).member_x is on the heap, because *dyn_f is on the heap.
For both Foo objects, pointer_x points to a heap-allocated array.
global_x is in some data section which the OS or runtime creates each time the program is run. This may or may not come from the same heap as dynamic allocations; it doesn't usually matter.
So regardless of whether it's on the heap or not, member_x is a better bet than pointer_x in the case where the length is always 6, because:
It's less code and less error-prone.
Your object only needs a single allocation if the object is heap-allocated, instead of 2.
Your object requires no heap allocations if the object is on the stack.
It uses less memory in total, because of fewer allocations, and also because there's no need for storage for the pointer value.
Reasons to prefer pointer_x:
If you need to reallocate during the lifetime of the object.
If different objects will need a different size array (perhaps based on constructor parameters).
If Foo objects will be placed on the stack, but the array is so large that it won't fit on the stack. For instance if you've got 1MB of stack, then you can't use automatic variables which contain an int[262144].
Composition is more efficient: it is faster, has lower memory overhead, and causes less memory fragmentation.
You could do something like this:
template <int SZ = 6>
class Whatever {
...
float floats[SZ];
};
Use the stack allocated memory whenever possible. It will save you from the headaches of deallocating the memory, fragmentation of your virtual address space etc. Also, it is faster compared to the dynamic memory allocation.
There are more variables at play here:
The size of the array vs. the size of the stack: stack sizes are quite small compared to the free store (e.g. 1MB up to 30MB). Large chunks on the stack will cause a stack overflow
The number of arrays you need: large number of small arrays
The lifetime of the array: if it's only needed locally inside a function, the stack is very convenient. If you need it after the function has exited, you must allocate it on the heap.
Garbage collection: if you allocate it on the heap, you need to clean it up manually, or have some flavour of smart pointers do the work for you.
As mentioned in another reply, large objects cannot be allocated on the stack because you are not sure of the stack size. In the interest of portability, large objects or objects with variable sizes should always be allocated on the heap.
There has been a lot of development in the malloc/new routines provided by operating systems (for example, Solaris's libumem). Dynamic memory allocation is often not a bottleneck.
If you allocate the array statically, there will be only one instance of it. The point of using a class is that you want multiple instances. There is no need to allocate the array dynamically at all:
class A {
...
private:
float x[6];
};
is what you want.
I've been using C++ for a short while, and I've been wondering about the new keyword. Simply, should I be using it, or not?
With the new keyword...
MyClass* myClass = new MyClass();
myClass->MyField = "Hello world!";
Without the new keyword...
MyClass myClass;
myClass.MyField = "Hello world!";
From an implementation perspective, they don't seem that different (but I'm sure they are)... However, my primary language is C#, and of course the 1st method is what I'm used to.
The difficulty seems to be that method 1 is harder to use with the std C++ classes.
Which method should I use?
Update 1:
I recently used the new keyword to allocate heap memory (the free store) for a large array that was being returned from a function (i.e. going out of scope). Where before I was using the stack, which caused half of the elements to be corrupt outside of that scope, switching to heap usage ensured that the elements were intact. Yay!
Update 2:
A friend of mine recently told me there's a simple rule for using the new keyword: every time you type new, type delete.
Foobar *foobar = new Foobar();
delete foobar; // TODO: Move this to the right place.
This helps to prevent memory leaks, as you always have to put the delete somewhere (i.e. when you cut and paste it to either a destructor or otherwise).
Method 1 (using new)
Allocates memory for the object on the free store (This is frequently the same thing as the heap)
Requires you to explicitly delete your object later. (If you don't delete it, you could create a memory leak)
Memory stays allocated until you delete it. (i.e. you could return an object that you created using new)
The example in the question will leak memory unless the pointer is deleted; and it should always be deleted, regardless of which control path is taken, or if exceptions are thrown.
Method 2 (not using new)
Allocates memory for the object on the stack (where all local variables go). There is generally less memory available for the stack; if you allocate too many objects, you risk stack overflow.
You won't need to delete it later.
Memory is no longer allocated when it goes out of scope. (i.e. you shouldn't return a pointer to an object on the stack)
As far as which one to use; you choose the method that works best for you, given the above constraints.
Some easy cases:
If you don't want to worry about calling delete, (and the potential to cause memory leaks) you shouldn't use new.
If you'd like to return a pointer to your object from a function, you must use new
There is an important difference between the two.
Everything not allocated with new behaves much like value types in C# (and people often say that those objects are allocated on the stack, which is probably the most common/obvious case, but not always true). More precisely, objects allocated without using new have automatic storage duration.
Everything allocated with new is allocated on the heap, and a pointer to it is returned, exactly like reference types in C#.
Anything allocated on the stack has to have a constant size, determined at compile-time (the compiler has to set the stack pointer correctly, or if the object is a member of another class, it has to adjust the size of that other class). That's why arrays in C# are reference types. They have to be, because with reference types, we can decide at runtime how much memory to ask for. And the same applies here. Only arrays with constant size (a size that can be determined at compile-time) can be allocated with automatic storage duration (on the stack). Dynamically sized arrays have to be allocated on the heap, by calling new.
(And that's where any similarity to C# stops)
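To make that concrete, a small sketch:
#include <cstddef>

void arrays(std::size_t n)
{
    int fixed[4];              // size fixed at compile time: automatic storage
    int* dynamic = new int[n]; // size known only at runtime: must come from the heap
    fixed[0] = 0;
    dynamic[0] = 0;
    delete[] dynamic;          // manual cleanup; fixed is reclaimed automatically
}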
Now, anything allocated on the stack has "automatic" storage duration (before C++11 you could actually declare a variable auto, but since that was the default storage class anyway, the keyword was essentially never used in practice; it is, however, where the term comes from).
Automatic storage duration means exactly what it sounds like, the duration of the variable is handled automatically. By contrast, anything allocated on the heap has to be manually deleted by you.
Here's an example:
void foo() {
bar b;
bar* b2 = new bar();
}
This function creates three values worth considering:
On line 1, it declares a variable b of type bar on the stack (automatic duration).
On line 2, it declares a bar pointer b2 on the stack (automatic duration), and calls new, allocating a bar object on the heap. (dynamic duration)
When the function returns, the following will happen:
First, b2 goes out of scope (order of destruction is always the opposite of order of construction). But b2 is just a pointer, so nothing special happens; the stack memory it occupies is simply freed. And importantly, the memory it points to (the bar instance on the heap) is NOT touched. Only the pointer is freed, because only the pointer had automatic duration.
Second, b goes out of scope, so since it has automatic duration, its destructor is called, and the memory is freed.
And the bar instance on the heap? It's probably still there. No one bothered to delete it, so we've leaked memory.
From this example, we can see that anything with automatic duration is guaranteed to have its destructor called when it goes out of scope. That's useful. But anything allocated on the heap lasts as long as we need it to, and can be dynamically sized, as in the case of arrays. That is also useful. We can use that to manage our memory allocations. What if a class allocated some memory on the heap in its constructor, and deleted that memory in its destructor? Then we could get the best of both worlds: safe memory allocations that are guaranteed to be freed again, but without the limitations of forcing everything onto the stack.
And that is pretty much exactly how most C++ code works.
Look at the standard library's std::vector, for example. The vector object itself is typically allocated on the stack, but it can be dynamically sized and resized. It does this by internally allocating memory on the heap as necessary. The user of the class never sees this, so there's no chance of leaking memory or forgetting to clean up what you allocated.
This principle is called RAII (Resource Acquisition Is Initialization), and it can be extended to any resource that must be acquired and released (network sockets, files, database connections, synchronization locks). All of them can be acquired in the constructor and released in the destructor, so you're guaranteed that all resources you acquire will be freed again.
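Here is a bare-bones sketch of the idea (a toy version of what std::vector does internally; copying is disabled to keep the example honest):
#include <cstddef>

class IntArray {
public:
    explicit IntArray(std::size_t n) : data_(new int[n]), size_(n) {}
    ~IntArray() { delete[] data_; }       // the resource is released automatically
    IntArray(const IntArray&) = delete;   // forbid copying to keep ownership unique
    IntArray& operator=(const IntArray&) = delete;
    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    int* data_;
    std::size_t size_;
};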
As a general rule, never use new/delete directly from your high level code. Always wrap it in a class that can manage the memory for you, and which will ensure it gets freed again. (Yes, there may be exceptions to this rule. In particular, smart pointers require you to call new directly, and pass the pointer to its constructor, which then takes over and ensures delete is called correctly. But this is still a very important rule of thumb)
The short answer is: if you're a beginner in C++, you should never be using new or delete yourself.
Instead, you should use smart pointers such as std::unique_ptr and std::make_unique (or less often, std::shared_ptr and std::make_shared). That way, you don't have to worry nearly as much about memory leaks. And even if you're more advanced, best practice would usually be to encapsulate the custom way you're using new and delete into a small class (such as a custom smart pointer) that is dedicated just to object lifecycle issues.
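Using the question's own MyClass as an example (a sketch, assuming C++14 for std::make_unique):
#include <memory>
#include <string>

struct MyClass {
    std::string MyField;
};

int main()
{
    auto myClass = std::make_unique<MyClass>(); // heap allocation, owned by myClass
    myClass->MyField = "Hello world!";
} // myClass goes out of scope here and the MyClass object is deleted automatically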
Of course, behind the scenes, these smart pointers are still performing dynamic allocation and deallocation, so code using them would still have the associated runtime overhead. Other answers here have covered these issues, and how to make design decisions on when to use smart pointers versus just creating objects on the stack or incorporating them as direct members of an object, well enough that I won't repeat them. But my executive summary would be: don't use smart pointers or dynamic allocation until something forces you to.
Which method should I use?
This is almost never determined by your typing preferences but by the context. If you need to keep the object across a few stacks or if it's too heavy for the stack you allocate it on the free store. Also, since you are allocating an object, you are also responsible for releasing the memory. Lookup the delete operator.
To ease the burden of using free-store management people have invented stuff like auto_ptr and unique_ptr. I strongly recommend you take a look at these. They might even be of help to your typing issues ;-)
If you are writing in C++ you are probably writing for performance. Using new and the free store is much slower than using the stack (especially when using threads), so only use it when you need it.
As others have said, you need new when your object needs to live outside the function or object scope, the object is really large or when you don't know the size of an array at compile time.
Also, try to avoid ever using delete. Wrap your new into a smart pointer instead. Let the smart pointer call delete for you.
There are some cases where a smart pointer isn't smart. Never store std::auto_ptr<> inside a STL container. It will delete the pointer too soon because of copy operations inside the container. Another case is when you have a really large STL container of pointers to objects. boost::shared_ptr<> will have a ton of speed overhead as it bumps the reference counts up and down. The better way to go in that case is to put the STL container into another object and give that object a destructor that will call delete on every pointer in the container.
Without the new keyword you're storing the object on the call stack. Storing excessively large variables on the stack will lead to a stack overflow.
If your variable is used only within the context of a single function, you're better off using a stack variable, i.e., Option 2. As others have said, you do not have to manage the lifetime of stack variables - they are constructed and destructed automatically. Also, allocating/deallocating a variable on the heap is slow by comparison. If your function is called often enough, you'll see a tremendous performance improvement if you use stack variables instead of heap variables.
That said, there are a couple of obvious instances where stack variables are insufficient.
If the stack variable has a large memory footprint, then you run the risk of overflowing the stack. By default, the stack size of each thread is 1 MB on Windows. It is unlikely that you'll create a stack variable that is 1 MB in size, but you have to keep in mind that stack utilization is cumulative. If your function calls a function which calls another function which calls another function which..., the stack variables in all of these functions take up space on the same stack. Recursive functions can run into this problem quickly, depending on how deep the recursion is. If this is a problem, you can increase the size of the stack (not recommended) or allocate the variable on the heap using the new operator (recommended).
The other, more likely condition is that your variable needs to "live" beyond the scope of your function. In this case, you'd allocate the variable on the heap so that it can be reached outside the scope of any given function.
The simple answer is yes - new creates an object on the heap (with the unfortunate side effect that you have to manage its lifetime by explicitly calling delete on it), whereas the second form creates an object on the stack, in the current scope, and that object will be destroyed when it goes out of scope.
Are you passing myClass out of a function, or expecting it to exist outside that function? As some others said, it is all about scope when you aren't allocating on the heap. When you leave the function, it goes away (eventually). One of the classic mistakes made by beginners is to create a local object of some class in a function and return a pointer to it without allocating it on the heap. I can remember debugging this kind of thing back in my earlier days doing C++.
C++ Core Guidelines R.11: Avoid using new and delete explicitly.
Things have changed significantly since most answers to this question were written. Specifically, C++ has evolved as a language, and the standard library is now richer. Why does this matter? Because of a combination of two factors:
Using new and delete is potentially dangerous: memory might leak if you don't keep a very strong discipline of deleting everything you've allocated when it's no longer used, and of never deleting what's not currently allocated.
The standard library now offers smart pointers which encapsulate the new and delete calls, so that you don't have to take care of managing allocations on the free store/heap yourself. So do other containers, in the standard library and elsewhere.
This has evolved into one of the C++ community's "core guidelines" for writing better C++ code, as the linked document shows. Of course, there are exceptions to this rule: somebody needs to write those encapsulating classes which do use new and delete; but that somebody is rarely yourself.
Adding to #DanielSchepler's valid answer:
The second method creates the instance on the stack, along with such things as something declared int and the list of parameters that are passed into the function.
The first method makes room for a pointer on the stack, which you've set to the location in memory where a new MyClass has been allocated on the heap - or free store.
The first method also requires that you delete what you create with new, whereas in the second method, the class is automatically destructed and freed when it falls out of scope (the next closing brace, usually).
The short answer is yes - the "new" keyword is incredibly important, as when you use it the object data is stored on the heap as opposed to the stack, which is most important!