I read about C++ dynamic memory allocation. Here is my code:
#include <iostream>
using namespace std;
int main()
{
int t;
cin>>t;
int a[t];
return 0;
}
What is the difference between the above and the following:
int* a=new(nothrow) int[t];
Use dynamic allocation:
when you need control over when an object is created and destroyed; or
when you need to create a local object that's too big to risk putting on the stack; or
when the size of a local array isn't a constant
To answer your specific question: int a[t]; isn't valid C++, since an array size must be constant. Some compilers allow such variable-length arrays as an extension, borrowed from C; but you shouldn't use them, unless you don't mind being tied to that compiler.
So you'd want dynamic allocation there, either the easy way, managed by RAII:
std::vector<int> a(t);
// use it, let it clean itself up when it goes out of scope
or the hard way, managed by juggling pointers and hoping you don't drop them:
int* a=new int[t];
// use it, hope nothing throws an exception or otherwise leaves the scope
delete [] a; // don't forget to delete it
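As a rough illustration of the RAII route, the sketch below sizes a std::vector at run time, which is what int a[t] was trying to do. The function name make_squares is made up for this example and is not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an array whose size is only known at run time.
std::vector<int> make_squares(std::size_t t)
{
    std::vector<int> a(t);            // t elements, each value-initialized to 0
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = static_cast<int>(i * i);
    }
    assert(a.size() == t);            // the vector remembers its own size
    return a;                         // the caller's vector frees the storage when it goes out of scope
}
```

No delete appears anywhere: whichever scope ends up holding the vector releases the memory, even if an exception is thrown along the way.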
Your first example is a C99-style variable-length array, which is allocated on the stack and whose lifetime is the same as that of any other local variable.
The second example is a typical C++ dynamic memory allocation, which comes from the heap and whose lifetime extends until delete[] a is reached; without that statement the memory is "leaked". The lifetime ends when the array is destroyed by delete[], which can happen after the current local scope has ended.
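The new(nothrow) form from the question differs from plain new only in how it reports failure: it yields a null pointer instead of throwing std::bad_alloc. A minimal sketch, assuming a made-up helper named sum_with_nothrow:

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // std::nothrow

// Hypothetical helper: returns 0 + 1 + ... + (t - 1), or -1 if the allocation failed.
long sum_with_nothrow(std::size_t t)
{
    int* a = new (std::nothrow) int[t];   // null pointer on failure, no exception
    if (a == nullptr) {
        return -1;
    }
    long sum = 0;
    for (std::size_t i = 0; i < t; ++i) {
        a[i] = static_cast<int>(i);
        sum += a[i];
    }
    delete[] a;                           // array form of delete matches new[]
    return sum;
}
```

The array still has to be released with delete[] on every path that gets past the null check.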
I've always declared my arrays using this method:
bool array[256];
However, I've recently been told to declare my arrays using:
bool* array = new bool[256];
What is the difference and which is better? Honestly, I don't fully understand the second way, so an explanation on that would be helpful too.
bool array[256];
This allocates a bool array with automatic storage duration.
It will be automatically cleaned up when it goes out of scope.
In most implementations this would be allocated on the stack if it's not declared static or global.
Allocations/deallocations on the stack are computationally really cheap compared to the alternative. It also might have some advantages for data-locality but that's not something you usually have to worry about. But you might need to be careful of allocating many large arrays to avoid a stack overflow.
bool* array = new bool[256];
This allocates an array with dynamic storage duration.
You need to clean it up yourself with a call to delete[] later on. If you do not then you will leak memory.
Alternatively (as mentioned by @Fibbles) you can use smart pointers to express the desired ownership/lifetime requirements. This leaves the responsibility for cleaning up to the smart-pointer class, which helps a lot with guaranteeing deletion, even when exceptions are thrown.
Dynamic allocation has the advantage that the array can be passed to outer scopes and other objects without copying (RVO will avoid copying for the first case too in certain cases, but storing it as a data member and other uses can't be optimized in the first case).
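One way to combine both points is to let a std::unique_ptr own the array. The sketch below is only an illustration; make_flags and count_set are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical factory: the array outlives this function because ownership is returned.
std::unique_ptr<bool[]> make_flags(std::size_t n)
{
    std::unique_ptr<bool[]> flags(new bool[n]());  // () value-initializes every element to false
    if (n > 0) {
        flags[0] = true;
    }
    return flags;   // ownership moves to the caller, no element is copied
}

// Hypothetical user: delete[] runs automatically when flags goes out of scope.
std::size_t count_set(std::size_t n)
{
    std::unique_ptr<bool[]> flags = make_flags(n);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (flags[i]) {
            ++count;
        }
    }
    return count;
}
```

The array form std::unique_ptr<bool[]> is what makes the destructor call delete[] rather than delete.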
The first is allocation of memory on the stack:
// inside main (or function, or non-static member of class) -> stack
int main() {
bool array[256];
}
or, possibly, in static memory:
// outside main (and any function, or static member of class) -> static
bool array[256];
int main() {
}
The last is allocation of dynamic memory (on the heap):
int main() {
bool* array = new bool[256];
delete[] array; // you should not forget to release memory allocated in heap
}
The advantage of dynamic memory is that it can be created with a variable number of elements (not 256, but a number taken from user input, for example). But you have to release it yourself every time.
More about stack, static and heap memory and when you should use each is here: Stack, Static, and Heap in C++
The difference is static vs dynamic allocation, as previous answers have indicated. There are reasons for using one over the other. This video by Herb Sutter explains when you should use what. https://www.youtube.com/watch?v=JfmTagWcqoE It is just over 1 1/2 hours.
My preference is to use
bool array[256];
unless there's a reason to do otherwise.
Mike
I am trying to understand the difference between the stack and heap memory, and this question on SO as well as this explanation did a pretty good job explaining the basics.
In the second explanation however, I came across an example to which I have a specific question, the example is this:
It is explained that the object m is allocated on the heap, I am just wondering if this is the full story. According to my understanding, the object itself indeed is allocated on the heap as the new keyword has been used for its instantiation.
However, isn't the pointer to object m allocated on the stack at the same time? Otherwise, how would the object itself, which of course sits on the heap, be accessed? I feel that, for the sake of completeness, this should have been mentioned in the tutorial; leaving it out confuses me a bit. I hope someone can clear this up and confirm that my understanding is right, namely that this example really calls for two statements:
1. a pointer to object m has been allocated on the stack
2. the object m itself (so the data that it carries, as well as access to its methods) has been allocated on the heap
Your understanding may be correct, but the statements are wrong:
A pointer to object m has been allocated on the stack.
m is the pointer. It is on the stack. Perhaps you meant pointer to a Member object.
The object m itself (the data that it carries, as well as access to its methods) has been allocated on the heap.
It would be correct to say that the object pointed to by m is created on the heap.
In general, any function/method local object and function parameters are created on the stack. Since m is a function local object, it is on the stack, but the object pointed to by m is on the heap.
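The distinction can be sketched as follows. The Member struct here is a hypothetical stand-in (the tutorial's own class is not shown in the question), and read_member is a made-up function.

```cpp
#include <cassert>

struct Member {          // hypothetical stand-in for the tutorial's class
    int value;
};

int read_member()
{
    Member* m = new Member{42};  // m is a local pointer; the Member it points to is on the heap
    Member local{7};             // here the whole object is a local variable
    Member* alias = &local;      // a local pointer to a local object
    int result = m->value + alias->value;
    delete m;                    // releases the heap object; m itself goes away at the closing brace
    return result;
}
```

Both m and alias are ordinary local variables; only what they point to differs.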
"Stack" and "heap" are general programming jargon. In particular, no storage is required to be managed internally via a stack or a heap data structure.
C++ has the following storage durations:
static
automatic
dynamic
thread
Roughly, dynamic corresponds to "heap", and automatic corresponds to "stack".
Moving on to your question: a pointer can be created with any of these four storage durations, and the object being pointed to can also have any of them. Some examples:
void func()
{
int *p = new int; // automatic pointer to dynamic object
int q; // automatic object
int *r = &q; // automatic pointer to automatic object
static int *s = p; // static pointer to dynamic object
static int *u = r; // static pointer to automatic object (bad idea)
thread_local int **t = &s; // thread pointer to static object
}
Named variables declared with no specifier are automatic if within a function, or static otherwise.
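A small sketch of how static and automatic storage behave differently across calls; next_id and global_counter are illustrative names only.

```cpp
#include <cassert>

int global_counter = 0;          // static storage duration (declared outside any function)

int next_id()
{
    static int last = 0;         // static storage duration: initialized once, survives between calls
    int step = 1;                // automatic storage duration: created afresh on every call
    last += step;
    ++global_counter;
    return last;
}
```

Calling next_id() twice returns 1 and then 2, because last keeps its value between calls while step is recreated each time.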
When you declare a variable in a function (without static or thread_local), it goes on the stack. So your variable Member* m is created on the stack. Note that by itself, m is just a pointer; until it is assigned, it doesn't point to anything. You can use it to point to an object on either the stack or heap, or to nothing at all.
Declaring a variable in a class or struct is different -- those go wherever the class or struct is instantiated.
To create something on the heap, you use new or std::malloc (or their variants). In your example, you create an object on the heap using new and assign its address to m. Objects on the heap need to be released to avoid memory leaks. If allocated using new, you need to use delete; if allocated using std::malloc, you need to use std::free. The better approach is usually to use a "smart pointer", which is an object that holds a pointer and has a destructor that releases it.
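To make the "smart pointer" idea concrete, here is a deliberately minimal owner class. IntOwner and its released counter are invented for this illustration; in real code std::unique_ptr does this job.

```cpp
#include <cassert>

// Minimal illustrative owner for a single int; not a replacement for std::unique_ptr.
class IntOwner {
public:
    explicit IntOwner(int value) : ptr_(new int(value)) {}
    ~IntOwner() { delete ptr_; ++released; }       // the destructor releases the heap object
    IntOwner(const IntOwner&) = delete;            // forbid copies so the int is never deleted twice
    IntOwner& operator=(const IntOwner&) = delete;
    int& operator*() const { return *ptr_; }
    static int released;                           // counts destructor runs, for demonstration only
private:
    int* ptr_;
};
int IntOwner::released = 0;

int use_owner()
{
    IntOwner owner(5);
    *owner += 1;
    return *owner;   // the value is copied out before owner's destructor runs
}
```

The delete happens in exactly one place, the destructor, so leaving use_owner by return or by exception releases the int either way.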
Yes, the pointer is allocated on the stack but the object that pointer points to is allocated on the heap. You're correct.
However, isn't it that the pointer to object m is at the same time allocated on the stack?
I suppose you meant the Member object. The pointer is allocated on the stack and will last there for the entire duration of the function (or its scope). After that, the code might still work:
#include <iostream>
using namespace std;
struct Object {
int somedata;
};
Object** globalPtrToPtr; // A global lives in static storage (the
// "data segment"), which is neither heap nor stack
void function() {
Object* pointerOnTheStack = new Object;
globalPtrToPtr = &pointerOnTheStack;
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
} // pointerOnTheStack is NO LONGER valid after the function exits
int main() {
function(); // leaves globalPtrToPtr pointing at a local that no longer exists
// This can give an access violation,
// a different value after the pointer destruction
// or even the same value as before, randomly - Undefined Behavior
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
return 0;
}
http://ideone.com/BwUVgm
The above code stores the address of a pointer residing on the stack (and leaks memory too because it doesn't free Object's allocated memory with delete).
Since after exiting the function the pointer is "destroyed" (i.e. its memory can be used for whatever pleases the program), you can no longer safely access it.
The above program can run properly, crash, or give you a different result. Accessing memory that has been freed or has gone out of scope is undefined behavior.
I'm a bit confused about the difference between when I just declare a variable such as:
int n;
and dynamically assigning memory to a variable using "new" such as:
int m = new int;
I noticed just from working on a simple linked list project that when I'm inserting a new value in the form of a node object, I have to dynamically create a new node object and append the desired value to it and then link it to the rest of my list. However.. in the same function, I could just define another node object, ex. NodeType *N. and traverse my list using this pointer.
My question is.. when we just declare a variable, does memory not get assigned right away.. or what's the difference?
Thank you!
Prefer variables with automatic storage when possible:
int n;
over
int* m = new int; // note pointer
The reason dynamic allocation is preferred in your case is the way the linked list is defined, i.e. each node (probably) contains a pointer to the next node. Because the nodes must exist beyond the point where they are created, they are dynamically allocated.
NodeType *N. and traverse my list using this pointer
Yes, you could do that. But note that this is just a pointer declaration. You have to assign it to something meaningful to actually use it.
My question is.. when we just declare a variable, does memory not get assigned right away.. or what's the difference?
Actually, both cases are definitions, not just declarations.
int n;
creates an un-initialized int with automatic storage;
int* n;
creates a pointer to an int. It is uninitialized, so it doesn't point to a valid memory location.
int* n = new int;
creates a pointer and initializes it to a valid memory location containing an uninitialized int.
int* n = new int();
creates a pointer and initializes it to a valid memory location containing a value-initialized int (i.e. 0).
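The difference between the last two forms can be sketched like this (the function names are made up for the example):

```cpp
#include <cassert>

int value_initialized()
{
    int* n = new int();   // value-initialized: the int starts at 0
    int copy = *n;
    delete n;
    return copy;
}

int assigned_after_default()
{
    int* n = new int;     // default-initialized: the value is indeterminate until written
    *n = 9;               // write before any read
    int copy = *n;
    delete n;
    return copy;
}
```

Reading *n in the second function before the assignment would be undefined behavior, which is why the write comes first.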
The difference is that automatic storage can only be used when the compiler can determine at compile time how much memory is needed and how long it will be needed for. Typically automatic variables will be allocated on the stack.
Whereas for memory that is allocated dynamically the programmer is responsible for keeping track of this information. This is typically allocated on the heap. Using heap memory will typically have greater overhead for a variety of reasons, and there is a risk of memory leak where you allocate heap memory but never free it.
In the example you described of a linked list, it's unlikely that you know the length of the list at compile time (if you did then you could just use a static array), so that is why you will need to manage the memory explicitly rather than letting the compiler take care of memory management automatically. But the pointer that you use to traverse the list is not needed after the function returns, so that is why it can be managed automatically by the compiler.
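A minimal sketch of that split, assuming a hypothetical Node type in place of the question's NodeType: the nodes are dynamic, while the traversal pointer is an ordinary automatic variable.

```cpp
#include <cassert>

struct Node {            // hypothetical node type, standing in for NodeType
    int value;
    Node* next;
};

// Each node must outlive the call that creates it, so it is allocated with new.
Node* push_front(Node* head, int value)
{
    return new Node{value, head};
}

// The traversal pointer n is an automatic variable; it owns nothing.
int sum(const Node* head)
{
    int total = 0;
    for (const Node* n = head; n != nullptr; n = n->next) {
        total += n->value;
    }
    return total;
}

// Every node created with new is released with delete.
void destroy(Node* head)
{
    while (head != nullptr) {
        Node* next = head->next;   // remember the next node before deleting this one
        delete head;
        head = next;
    }
}
```

A list built with three push_front calls needs one destroy call at the end; the pointer n inside sum needs no cleanup at all.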
int m = new int;
is actually incorrect. new returns a pointer to the memory it has created. It should be
int *m = new int;
And even better:
int *m = new int();
which initializes the int pointed to by m to 0.
Also, regarding your question: big objects are typically created dynamically and handled through pointers to avoid large copy operations when they are passed from function to function by value. Pointers are also used when the variable needs to live longer than the scope of the function that creates it.
However, for variables that are used only within the scope of a function and are not useful anywhere else, automatic memory should be used:
int m;
int* f()
{
int *p = new int[10];
return p;
}
int main()
{
int *p = f();
//using p;
return 0;
}
Is it true that during stack destruction, when a function returns its value, some compilers (common ones like VS or gcc were implied when I was told that) could try to automatically free memory pointed to by local pointers such as p in this example? Even if it's not, would I be able to normally delete[] allocated memory in main? The problem seems to be that information about exact array size is lost at that point. Also, would the answer change in case of malloc and free?
Thank you.
Only local variables are destroyed (released).
In your case p is "destroyed" (released), but what p points to is not "destroyed" (released using delete[]).
Yes, you can, and should/must, use delete[] in main. That said, you don't have to use raw pointers in C++. You might find this e-book interesting : Link-Alf-Book
If you want to delete what a local variable points to when the function is "over" (out of scope), use a smart pointer. std::auto_ptr was the standard choice when this was written, but it only works for non-array objects (not the ones that require delete[]), and it was deprecated in C++11 and removed in C++17; std::unique_ptr replaces it and also has an array form.
Also, would the answer change in case of malloc and free?
Nope, but you should make sure that you do not mix the two families: memory from malloc() goes to free(), and memory from new goes to delete. The same applies within new: do not pair new with delete[] or new[] with delete.
No, they won't free or delete what your pointer points to. They will only release the few bytes that the pointer itself occupies. A compiler that called free or delete would, I believe, violate the language standard.
You will only be able to delete[] memory in main if you have a pointer to the memory, i.e., the result from f(). You don't need to keep track of the size of the allocation; new and malloc do that for you, behind the scenes.
If you want memory cleaned up at function return, use a smart pointer such as boost::scoped_ptr or boost::scoped_array (both from the Boost collection of libraries), or std::unique_ptr (standard since C++11). std::auto_ptr, its predecessor, was deprecated in C++11 and removed in C++17.
In C, it's impossible to create a smart pointer.
Is it true that during stack destruction, when a function returns its value, some compilers (common ones like VS or gcc were implied when I was told that) could try to automatically free memory pointed to by local pointers such as p in this example?
Short Answer: No
Long Answer:
If you are using smart pointers or containers (like you should be) then yes.
When the smart pointer goes out of scope the memory is released.
#include <memory>
#include <vector>
std::unique_ptr<int> f()
{
    // The smart pointer owns the int from the moment it is allocated.
    // (std::auto_ptr, the older alternative, was removed in C++17.)
    return std::unique_ptr<int>(new int(0));
}
std::vector<int> f1()
{
    // If you want to allocate an array use a std::vector (or std::array, since C++11, for a fixed size)
    return std::vector<int>(10);
}
int main()
{
    std::unique_ptr<int> p = f();
    std::vector<int> p1 = f1();
    //using p;
    return 0; // p and p1 destroyed, their memory released
}
Even if it's not, would I be able to normally delete[] allocated memory in main?
It is normal to make sure all memory is correctly freed as soon as you don't need it.
Note delete [] and delete are different so be careful about using them.
Memory allocated with new must be released with delete.
Memory allocated with new [] must be released with delete [].
Memory allocated with malloc/calloc/realloc must be released with free.
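The three pairings can be put side by side in one short sketch (paired_allocations is a made-up name):

```cpp
#include <cassert>
#include <cstdlib>   // std::malloc, std::free

int paired_allocations()
{
    int* single = new int(1);                                // new    -> delete
    int* many = new int[3]{2, 3, 4};                         // new[]  -> delete[]
    int* raw = static_cast<int*>(std::malloc(sizeof(int)));  // malloc -> free
    if (raw == nullptr) {        // malloc reports failure by returning a null pointer
        delete single;
        delete[] many;
        return -1;
    }
    *raw = 5;

    int total = *single + many[0] + many[1] + many[2] + *raw;

    delete single;
    delete[] many;
    std::free(raw);
    return total;
}
```

Swapping any of the three release calls for another one is undefined behavior, even if it appears to work on a particular compiler.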
The problem seems to be that information about exact array size is lost at that point.
It is the runtime system's problem to remember this information. How it is stored is not specified by the standard, but usually it is kept close to the object that was allocated.
Also, would the answer change in case of malloc and free?
In C++ you should probably not use malloc/free. But they can be used. When they are used you should use them together to make sure that memory is not leaked.
You were misinformed - local variables are cleaned up, but the memory allocated to local pointers is not. If you weren't returning the pointer, you would have an immediate memory leak.
Don't worry about how the compiler keeps track of how many elements were allocated, it's an implementation detail that isn't addressed by the C++ standard. Just know that it works. (As long as you use the delete[] notation, which you did)
When you use new[] the compiler adds extra bookkeeping information so that it knows how many elements to delete[]. (In a similar way, when you use malloc it knows how many bytes to free. Some compiler libraries provide extensions to find out what that size is.)
I haven't heard of a compiler doing that, but it's certainly possible for a compiler to detect (in many cases) whether the allocated memory from the function isn't referenced by a pointer anymore, and then free that memory.
In your case however, the memory is not lost because you keep a pointer to it which is the return value of the function.
A very common case for memory leaks and a perfect candidate for such a feature would be this code:
int *f()
{
int *p = new int[10];
// do something that doesn't pass p to external
// functions or assign p to global data
return p;
}
int main()
{
while (1) {
f();
}
return 0;
}
As you can notice, the pointer to the allocated memory is lost and that can be detected by the compiler with absolute certainty.
What are some of the technical differences between memory that is allocated with the new operator and memory that is allocated via a simple variable declaration, such as int var? Does C++ have any form of automatic memory management?
In particular, I have a couple questions. First, since with dynamic memory you have to declare a pointer to store the address of the actual memory you work with, doesn't dynamic memory use more memory? I don't see why the pointer is necessary at all unless you're declaring an array.
Secondly, if I were to make a simple function such as this:
int myfunc() { int x = 2; int y = 3; return x+y; }
...And call it, would the memory allocated by the function be freed as soon as its scope of existence has ended? What about with dynamic memory?
Note: This answer is way too long. I'll pare it down sometime. Meanwhile, comment if you can think of useful edits.
To answer your questions, we first need to define two areas of memory called the stack and the heap.
The stack
Imagine the stack as a stack of boxes. Each box represents the execution of a function. At the beginning, when main is called, there is one box sitting on the floor. Any local variables you define are in that box.
A simple example
int main(int argc, char * argv[])
{
int a = 3;
int b = 4;
return a + b;
}
In this case, you have one box on the floor with the variables argc (an integer), argv (a pointer to an array of char pointers), a (an integer), and b (an integer).
More than one box
int do_stuff(int a, int b); // declared before main so that main can call it
int main(int argc, char * argv[])
{
int a = 3;
int b = 4;
return do_stuff(a, b);
}
int do_stuff(int a, int b)
{
int c = a + b;
c++;
return c;
}
Now, you have a box on the floor (for main) with argc, argv, a, and b. On top of that box, you have another box (for do_stuff) with a, b, and c.
This example illustrates two interesting effects.
As you probably know, a and b were passed-by-value. That's why there is a copy of those variables in the box for do_stuff.
Notice that you don't have to free or delete or anything for these variables. When your function returns, the box for that function is destroyed.
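The pass-by-value effect can be checked with a small sketch. bump and caller are hypothetical names; bump has the same shape as do_stuff but modifies its own copy.

```cpp
#include <cassert>

// Modifies the copy of a that lives in bump's own box.
int bump(int a)
{
    a++;          // changes the copy in bump's frame only
    return a;
}

int caller()
{
    int a = 3;
    int bumped = bump(a);
    assert(a == 3);       // the caller's a is untouched
    return bumped;
}
```

Neither function frees anything: each box, with its copy of a, disappears when its function returns.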
Box overflow
int do_stuff(int a, int b); // declared before main so that main can call it
int main(int argc, char * argv[])
{
int a = 3;
int b = 4;
return do_stuff(a, b);
}
int do_stuff(int a, int b)
{
return do_stuff(a, b);
}
Here, you have a box on the floor (for main, as before). Then, you have a box (for do_stuff) with a and b. Then, you have another box (for do_stuff calling itself), again with a and b. And then another. And soon, you have a stack overflow.
Summary of the stack
Think of the stack as a stack of boxes. Each box represents a function executing, and that box contains the local variables defined in that function. When the function returns, that box is destroyed.
More technical stuff
Each "box" is officially called a stack frame.
Ever notice how your variables have "random" default values? When an old stack frame is "destroyed", it just stops being relevant. It doesn't get zeroed out or anything like that. The next time a stack frame uses that section of memory, you see bits of old stack frame in your local variables.
The heap
This is where dynamic memory allocation comes into play.
Imagine the heap as an endless green meadow of memory. When you call malloc or new, a block of memory is allocated in the heap. You are given a pointer to access this block of memory.
int main(int argc, char * argv[])
{
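int * a = new int(0); // initialized, so that reading it is well defined
int value = *a;
delete a; // release the heap memory before leaving main
return value;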
}
Here, a new integer's worth of memory is allocated on the heap. You get a pointer named a that points to that memory.
a is a local variable, and so it is in main's "box".
Rationale for dynamic memory allocation
Sure, using dynamically allocated memory seems to waste a few bytes here and there for pointers. However, there are things that you just can't (easily) do without dynamic memory allocation.
Returning an array
int * create_array(); // declared before main so that main can call it
int main(int argc, char * argv[])
{
int * intarray = create_array();
return intarray[0];
}
int * create_array()
{
int intarray[5];
intarray[0] = 0;
return intarray;
}
What happens here? You "return an array" in create_array. In actuality, you return a pointer, which just points to the part of the create_array "box" that contains the array. What happens when create_array returns? Its box is destroyed, and you can expect your array to become corrupt at any moment.
Instead, use dynamically allocated memory.
int * create_array(); // declared before main so that main can call it
int main(int argc, char * argv[])
{
int * intarray = create_array();
int return_value = intarray[0];
delete[] intarray;
return return_value;
}
int * create_array()
{
int * intarray = new int[5];
intarray[0] = 0;
return intarray;
}
Because function returning does not modify the heap, your precious intarray escapes unscathed. Remember to delete[] it after you're done though.
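When the size is fixed, as it is here, another option that the answer does not cover is to return the array by value with std::array, which avoids delete[] altogether. A sketch, with first_element as a made-up caller:

```cpp
#include <array>
#include <cassert>

// The array is returned by value, so the caller receives its own copy.
std::array<int, 5> create_array()
{
    std::array<int, 5> intarray{};   // all five elements start at 0
    intarray[0] = 10;
    return intarray;
}

int first_element()
{
    std::array<int, 5> intarray = create_array();
    return intarray[0];   // nothing to delete[]
}
```

For sizes only known at run time, std::vector plays the same role.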
Dynamic memory lives on the heap as opposed to the stack. The lifetime of dynamic memory is from the time of allocation, to the time of deallocation. With local variables, their lifetime is limited to the function / block they are defined in.
Regarding your question about the memory usage in the function, in your example the memory for your local variables would be freed at the end of the function. However, if the memory was dynamically allocated with new, it would not be automatically disposed, and you would be responsible for explicitly using delete to free the memory.
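Regarding automatic memory management, the C++ Standard Library provides smart pointers for this: std::unique_ptr and std::shared_ptr since C++11. The older auto_ptr was deprecated in C++11 and removed in C++17.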
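As one example of library-level automatic management, since C++11 std::shared_ptr frees an object when its last owner goes away. The sketch below, with a made-up function name, observes the owner count:

```cpp
#include <cassert>
#include <memory>

long owners_while_shared()
{
    std::shared_ptr<int> first = std::make_shared<int>(5);
    long count = 0;
    {
        std::shared_ptr<int> second = first;   // both now own the same int
        count = first.use_count();             // 2 while second is alive
    }                                          // second is destroyed, the int stays alive
    assert(first.use_count() == 1);
    return count;
}   // first is destroyed here and the int is freed
```

This is reference counting, not garbage collection: the int is freed at a predictable point, when the last shared_ptr is destroyed.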
Memory allocated by "new" ends up on the heap.
Local variables declared in a function live in that function's stack frame, which is placed on the stack when the function is called.
Read about stack vs heap allocation here: http://www-ee.eng.hawaii.edu/~tep/EE160/Book/chap14/subsection2.1.1.8.html
Memory allocated with the new operator is fetched from a memory section called the "heap", while ordinary local variables use a memory section shared with procedure/function calls (the "stack").
You only need to worry about the dynamic memory allocations you made yourself with new. Variables that are known at compile time (defined in the source) are automatically freed at the end of their scope (end of function/procedure, block, ...).
The big difference between "dynamic" and "ordinary" memory was reflected rather well in the question itself.
C++ gives you very little built-in support for dynamic memory.
When you use dynamic memory, you are totally responsible for it yourself. You have to allocate it. When you forget to do so and try to access it through your pointer, you will have plenty of negative surprises. You also have to free the memory, and when you forget that in any way, you will have even more surprises. Such errors are among the most difficult to find in C/C++ programs.
You need an extra pointer because you need some way to reach your new memory. Memory by itself, dynamic or not, is not something a programming language can work with directly; you need to have access to it, and that access is provided by variables. But variables in languages like C++ are stored in "ordinary" memory. So you need "pointers": a pointer is a form of indirection that says "No, I am not the value you are searching for, but I point to it". Pointers are the only way to access dynamic memory in C++.
By contrast, "ordinary" memory can be accessed directly, allocation and freeing is done automatically by the language itself.
Dynamic memory and pointers are the biggest source of problems in C++, but they are also a very powerful concept: when you do it right, you can do much more than with ordinary memory.
That is also the reason plenty of libraries have functions or whole modules for dealing with dynamic memory. The auto_ptr example mentioned in a parallel answer is one of them: it addresses the problem that dynamic memory should be reliably released at the end of a function.
Normally you will use dynamic memory only in cases where you really need it. You will not use it for a single integer variable, but for arrays or for building larger data structures in memory.