c++ dynamic memory allocation using "new"

c++ dynamic memory allocation using "new" - c++

I'm new to C++, trying to learn by myself (I've got Java background).
There's this concept of dynamic memory allocation that I can assign to an array (for example) using new.
In C (and also in C++) I've got malloc and realloc that are doing that. In C++ they've added the new for some reason I can't understand.
I've read a lot about the difference between a normal array that goes to the stack while the dynamic allocated array goes to the heap.
So what I understand is that by using new I'm allocating space in the heap which will not be deleted automatically when finished a function let's say, but will remain where it is until I finally, manually free it.
I couldn't find practical examples of using the dynamic memory allocation over the normal memory.
It's said that I can't allocate memory through runtime when using normal array. Well, probably I didn't understand it right because when I tried to create a normal array (without new) with a capacity given as an input by the user (like arr[input]), it worked fine.
here is what I mean:
int whatever;
cin>>whatever;
int arr2[whatever];
for (int i = 0; i < whatever; i++) {
arr2[i]=whatever;
cout<<arr2[i];
}
I didn't really understand why it's called dynamic when the only way of extending the capacity of an array is to copy it to a new, larger array.
I understood that the Vector class (which I haven't yet learned) is much better to use. But still, I can't just leave that gap of knowledge begin and I must understand why exactly it's called dynamic and why should I use it instead of a normal array.
Why should I bother freeing memory manually when I can't really extend it but only copy it to a new array?

When you know the size of an array at compile time you can declare it like this and it will live on the stack:
int arr[42];
But if you don't know the size at compile time, only at runtime, then you cannot say:
int len = get_len();
int arr[len];
In this case you must allocate the array at runtime. In this case the array will live on the heap.
int len = get_len();
int* arr = new int[len];
When you no longer need that memory you need to do a delete [] arr.
std::vector is a variable size container that allows you to allocate and reallocate memory at runtime without having to worry about explicitly allocating and freeing it.
int len = get_len();
std::vector<int> v(len); // v has len elements
v.resize(len + 10); // add 10 more elements to the vector

For static allocation, you must specify the size as a constant:
MyObj arrObject[5];
For dynamic allocation, that can be varied at run-time:
MyObj *arrObject = new MyObj[n];
The different between new and malloc is that new will call the ctor for all those objects in the array, while malloc just gives you raw memory.

if you wanna use an array and you dont know the exact size at the compile time, thats when dynamic memory allocation steps in. See the example below,
int a[3] = {1,2,3}; //<= valid in terms of syntax;
however,
int size = 3;
int a[size] = {1,2,3} //<= compile error
in order to fix this,
int* ArrayPtr = new int[size];
also, when freeing it, call delete[] ArrayPtr; instead of delete alone, coz we are talking abt freeing a BLOCK of memory at this moment.

In C (and also in C++) I've got malloc and realloc that are doing that. In C++ they've added the "new" for some reason I can't understand.
malloc and realloc take the number of bytes to allocate instead of the type you want to allocate, and also don't call any constructors (again, they only know about the size to allocate). This works fine in C (as it really has more of a size system than a type system), but with C++'s much more involved type system, it falls short. In contrast, new is type safe (it doesn't return a void* as malloc does) and constructs the object allocated for you before returning.
It's said that I can't allocate memory through runtime when using normal array. Well, probably I didn't understand it right because when I tried to create a normal array (without "new") with a capacity given as an input by the user (like arr[input]). it worked fine.
This is a compiler extension (and part of C99), it is NOT standard C++. The standard requires that a 'normal' array have a bound which is known at compile time. However, it seems your compiler decided to support variable length 'normal' arrays anyways.
I didn't really understand why it's called dynamic when the only way of extending the capacity of an array is to copy it to a new, larger array.
Its dynamic in that you don't know the size until run time (and thus can be different across different invocations). Compile time vs. run time is a distinction you don't often run across in other languages (in my experience at least), but it is crucial to understanding C++.

Related

C++ doesn't tell you the size of a dynamic array. But why?

I know that there is no way in C++ to obtain the size of a dynamically created array, such as:
int* a;
a = new int[n];
What I would like to know is: Why? Did people just forget this in the specification of C++, or is there a technical reason for this?
Isn't the information stored somewhere? After all, the command
delete[] a;
seems to know how much memory it has to release, so it seems to me that delete[] has some way of knowing the size of a.

It's a follow on from the fundamental rule of "don't pay for what you don't need". In your example delete[] a; doesn't need to know the size of the array, because int doesn't have a destructor. If you had written:
std::string* a;
a = new std::string[n];
...
delete [] a;
Then the delete has to call destructors (and needs to know how many to call) - in which case the new has to save that count. However, given it doesn't need to be saved on all occasions, Bjarne decided not to give access to it.
(In hindsight, I think this was a mistake ...)
Even with int of course, something has to know about the size of the allocated memory, but:
Many allocators round up the size to some convenient multiple (say 64 bytes) for alignment and convenience reasons. The allocator knows that a block is 64 bytes long - but it doesn't know whether that is because n was 1 ... or 16.
The C++ run-time library may not have access to the size of the allocated block. If for example, new and delete are using malloc and free under the hood, then the C++ library has no way to know the size of a block returned by malloc. (Usually of course, new and malloc are both part of the same library - but not always.)

One fundamental reason is that there is no difference between a pointer to the first element of a dynamically allocated array of T and a pointer to any other T.
Consider a fictitious function that returns the number of elements a pointer points to.
Let's call it "size".
Sounds really nice, right?
If it weren't for the fact that all pointers are created equal:
char* p = new char[10];
size_t ps = size(p+1); // What?
char a[10] = {0};
size_t as = size(a); // Hmm...
size_t bs = size(a + 1); // Wut?
char i = 0;
size_t is = size(&i); // OK?
You could argue that the first should be 9, the second 10, the third 9, and the last 1, but to accomplish this you need to add a "size tag" on every single object.
A char will require 128 bits of storage (because of alignment) on a 64-bit machine. This is sixteen times more than what is necessary.
(Above, the ten-character array a would require at least 168 bytes.)
This may be convenient, but it's also unacceptably expensive.
You could of course envision a version that is only well-defined if the argument really is a pointer to the first element of a dynamic allocation by the default operator new, but this isn't nearly as useful as one might think.

You are right that some part of the system will have to know something about the size. But getting that information is probably not covered by the API of memory management system (think malloc/free), and the exact size that you requested may not be known, because it may have been rounded up.

You will often find that memory managers will only allocate space in a certain multiple, 64 bytes for example.
So, you may ask for new int[4], i.e. 16 bytes, but the memory manager will allocate 64 bytes for your request. To free this memory it doesn't need to know how much memory you asked for, only that it has allocated you one block of 64 bytes.
The next question may be, can it not store the requested size? This is an added overhead which not everybody is prepared to pay for. An Arduino Uno for example only has 2k of RAM, and in that context 4 bytes for each allocation suddenly becomes significant.
If you need that functionality then you have std::vector (or equivalent), or you have higher-level languages. C/C++ was designed to enable you to work with as little overhead as you choose to make use of, this being one example.

There is a curious case of overloading the operator delete that I found in the form of:
void operator delete[](void *p, size_t size);
The parameter size seems to default to the size (in bytes) of the block of memory to which void *p points. If this is true, it is reasonable to at least hope that it has a value passed by the invocation of operator new and, therefore, would merely need to be divided by sizeof(type) to deliver the number of elements stored in the array.
As for the "why" part of your question, Martin's rule of "don't pay for what you don't need" seems the most logical.

There's no way to know how you are going to use that array.
The allocation size does not necessarily match the element number so you cannot just use the allocation size (even if it was available).
This is a deep flaw in other languages not in C++.
You achieve the functionality you desire with std::vector yet still retain raw access to arrays. Retaining that raw access is critical for any code that actually has to do some work.
Many times you will perform operations on subsets of the array and when you have extra book-keeping built into the language you have to reallocate the sub-arrays and copy the data out to manipulate them with an API that expects a managed array.
Just consider the trite case of sorting the data elements.
If you have managed arrays then you can't use recursion without copying data to create new sub-arrays to pass recursively.
Another example is an FFT which recursively manipulates the data starting with 2x2 "butterflies" and works its way back to the whole array.
To fix the managed array you now need "something else" to patch over this defect and that "something else" is called 'iterators'. (You now have managed arrays but almost never pass them to any functions because you need iterators +90% of the time.)

The size of an array allocated with new[] is not visibly stored anywhere, so you can't access it. And new[] operator doesn't return an array, just a pointer to the array's first element. If you want to know the size of a dynamic array, you must store it manually or use classes from libraries such as std::vector

What's the advantage of malloc?

What is the advantage of allocating a memory for some data. Instead we could use an array of them.
Like
int *lis;
lis = (int*) malloc ( sizeof( int ) * n );
/* Initialize LIS values for all indexes */
for ( i = 0; i < n; i++ )
lis[i] = 1;
we could have used an ordinary array.
Well I don't understand exactly how malloc works, what is actually does. So explaining them would be more beneficial for me.
And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values, what problems might i be facing? And is there a way to print the values stored in the variable directly from the memory allocated space, for example here it is lis?

Your question seems to rather compare dynamically allocated C-style arrays with variable-length arrays, which means that this might be what you are looking for: Why aren't variable-length arrays part of the C++ standard?
However the c++ tag yields the ultimate answer: use std::vector object instead.
As long as it is possible, avoid dynamic allocation and responsibility for ugly memory management ~> try to take advantage of objects with automatic storage duration instead. Another interesting reading might be: Understanding the meaning of the term and the concept - RAII (Resource Acquisition is Initialization)
"And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values, what problems might i be facing?"
- If you still consider n to be the amount of integers that it is possible to store in this array, you will most likely experience undefined behavior.

More fundamentally, I think, apart from the stack vs heap and variable vs constant issues (and apart from the fact that you shouldn't be using malloc() in C++ to begin with), is that a local array ceases to exist when the function exits. If you return a pointer to it, that pointer is going to be useless as soon as the caller receives it, whereas memory dynamically allocated with malloc() or new will still be valid. You couldn't implement a function like strdup() using a local array, for instance, or sensibly implement a linked representation list or tree.

The answer is simple. Local1 arrays are allocated on your stack, which is a small pre-allocated memory for your program. Beyond a couple thousand data, you can't really do much on a stack. For higher amounts of data, you need to allocate memory out of your stack.
This is what malloc does.
malloc allocates a piece of memory as big as you ask it. It returns a pointer to the start of that memory, which could be treated similar to an array. If you write beyond the size of that memory, the result is undefined behavior. This means everything could work alright, or your computer may explode. Most likely though you'd get a segmentation fault error.
Reading values from the memory (for example for printing) is the same as reading from an array. For example printf("%d", list[5]);.
Before C99 (I know the question is tagged C++, but probably you're learning C-compiled-in-C++), there was another reason too. There was no way you could have an array of variable length on the stack. (Even now, variable length arrays on the stack are not so useful, since the stack is small). That's why for variable amount of memory, you needed the malloc function to allocate memory as large as you need, the size of which is determined at runtime.
Another important difference between local arrays, or any local variable for that matter, is the life duration of the object. Local variables are inaccessible as soon as their scope finishes. malloced objects live until they are freed. This is essential in practically all data structures that are not arrays, such as linked-lists, binary search trees (and variants), (most) heaps etc.
An example of malloced objects are FILEs. Once you call fopen, the structure that holds the data related to the opened file is dynamically allocated using malloc and returned as a pointer (FILE *).
1 Note: Non-local arrays (global or static) are allocated before execution, so they can't really have a length determined at runtime.

I assume you are asking what is the purpose of c maloc():
Say you want to take an input from user and now allocate an array of that size:
int n;
scanf("%d",&n);
int arr[n];
This will fail because n is not available at compile time. Here comes malloc()
you may write:
int n;
scanf("%d",&n);
int* arr = malloc(sizeof(int)*n);
Actually malloc() allocate memory dynamically in the heap area

Some older programming environments did not provide malloc or any equivalent functionality at all. If you needed dynamic memory allocation you had to code it yourself on top of gigantic static arrays. This had several drawbacks:
The static array size put a hard upper limit on how much data the program could process at any one time, without being recompiled. If you've ever tried to do something complicated in TeX and got a "capacity exceeded, sorry" message, this is why.
The operating system (such as it was) had to reserve space for the static array all at once, whether or not it would all be used. This phenomenon led to "overcommit", in which the OS pretends to have allocated all the memory you could possibly want, but then kills your process if you actually try to use more than is available. Why would anyone want that? And yet it was hyped as a feature in mid-90s commercial Unix, because it meant that giant FORTRAN simulations that potentially needed far more memory than your dinky little Sun workstation had, could be tested on small instance sizes with no trouble. (Presumably you would run the big instance on a Cray somewhere that actually had enough memory to cope.)
Dynamic memory allocators are hard to implement well. Have a look at the jemalloc paper to get a taste of just how hairy it can be. (If you want automatic garbage collection it gets even more complicated.) This is exactly the sort of thing you want a guru to code once for everyone's benefit.
So nowadays even quite barebones embedded environments give you some sort of dynamic allocator.
However, it is good mental discipline to try to do without. Over-use of dynamic memory leads to inefficiency, of the kind that is often very hard to eliminate after the fact, since it's baked into the architecture. If it seems like the task at hand doesn't need dynamic allocation, perhaps it doesn't.
However however, not using dynamic memory allocation when you really should have can cause its own problems, such as imposing hard upper limits on how long strings can be, or baking nonreentrancy into your API (compare gethostbyname to getaddrinfo).
So you have to think about it carefully.

we could have used an ordinary array
In C++ (this year, at least), arrays have a static size; so creating one from a run-time value:
int lis[n];
is not allowed. Some compilers allow this as a non-standard extension, and it's due to become standard next year; but, for now, if we want a dynamically sized array we have to allocate it dynamically.
In C, that would mean messing around with malloc; but you're asking about C++, so you want
std::vector<int> lis(n, 1);
to allocate an array of size n containing int values initialised to 1.
(If you like, you could allocate the array with new int[n], and remember to free it with delete [] lis when you're finished, and take extra care not to leak if an exception is thrown; but life's too short for that nonsense.)
Well I don't understand exactly how malloc works, what is actually does. So explaining them would be more beneficial for me.
malloc in C and new in C++ allocate persistent memory from the "free store". Unlike memory for local variables, which is released automatically when the variable goes out of scope, this persists until you explicitly release it (free in C, delete in C++). This is necessary if you need the array to outlive the current function call. It's also a good idea if the array is very large: local variables are (typically) stored on a stack, with a limited size. If that overflows, the program will crash or otherwise go wrong. (And, in current standard C++, it's necessary if the size isn't a compile-time constant).
And suppose we replace sizeof(int) * n with just n in the above code and then try to store integer values, what problems might i be facing?
You haven't allocated enough space for n integers; so code that assumes you have will try to access memory beyond the end of the allocated space. This will cause undefined behaviour; a crash if you're lucky, and data corruption if you're unlucky.
And is there a way to print the values stored in the variable directly from the memory allocated space, for example here it is lis?
You mean something like this?
for (i = 0; i < len; ++i) std::cout << lis[i] << '\n';

C++ creating arrays

Why can't I do something like this:
int size = menu.size;
int list[size];
Is there anyway around this instead of using a vector? (arrays are faster, so I wanted to use arrays)
thanks

The size must be known at compile-time, since the compiler needs to know how much stack space will be needed to allocate enough memory for it. (Edit: I stand corrected. In C, variable length arrays can be allocated on the stack. C++ does not allow variable length arrays, however.)
But, you can create arrays on the heap at run-time:
int* list = new int[size];
Just make sure you free the memory when you're done, or you'll get a memory leak:
delete [] list;
Note that it's very easy to accidentally create memory leaks, and a vector is almost for sure easier to use and maintain. Vectors are quite fast (especially if you reserve() them to the right size first), and I strongly recommend using a vector instead of manual memory-management.
In general, it's a good idea to profile your code to find out where the real bottlenecks are than to micro-optimize up front (because the optimizations are not always optimizations).

As other have said, the C++ language designers have chosen not to allow variable length arrays, VLAs, in spite of them being available in C99. However, if you are prepared to do a bit more work yourself, and you are simply desperate to allocate memory on the stack, you can use alloca().
That said, I personally would use std::vector. It is simpler, safer, more maintainable and likely fast enough.

C++ does not allow variable length arrays. The size must be known at compile-time. So you can't do that.
You can either use vectors or new.
vector<int> list;
or
int *list = new int[size];
If you go with the latter, you need to free it later on:
delete[] list;
arrays are faster, so I wanted to use arrays
This is not true. Where did you hear this from?

In C++, the length of an array needs to be known at compile time, in order to allocate it on the stack.
If you need to allocate an array whose size you do not know at compile time, you'll need to allocate it on the heap, using operator new[]
int size = menu.size;
int *list = new int[size];
But since you've new'd memory on the heap, you need to ensure you properly delete it when you are done with it.
delete[] list;

AFAIK In C++03, there are no variable-length-arrays (VLA):
you probably want to do this:
const int size = menu.size;
int list[size]; // size is compile-time constant
or
int *list = new int[size]; // creates an array at runtime;
delete[] list; // size can be non-const

first of all, vectors aren't significantly faster.
as for the reason why you cant do something like this:
the code will allocate the array on the stack. compilers have to know this size upfront so they can account for it. hence you can only use constants with that syntax.
an easy way around is creating the array on the heap: int list = new int[size]; Don't forget to delete[] later on.
However, if you use a vector and reserve the correct size upfront + compile with optimization, there should be little, to absolutely no overhead.

Since list is allocated on the stack, its size has to be known at compile time. But here, size is not known until runtime, so the size of list is not known at compile time.

If pointers can dynamically change the size of arrays at run time, why is it necessary to initialize the array with a size?

For instance:
int* pArray;
pArray = new array[];
instead of:
int* pArray;
pArray = new array[someNumber];
Since pointers are able to dynamically change the size of an array at run time, and the name of the pointer points to the first element of an array, shouldn't the default size be [1]? Does anyone know what's happening behind the scene?

Since pointers are able to dynamically change the size of an array at run time
This is not true. They can't change the size unless you allocate a new array with the new size.
If you want to have an array-like object that dynamically changes the size you should use the std::vector.
#include<vector>
#include<iostream>
...
std::vector<int> array;
array.push_back(1);
array.push_back(2);
array.push_back(3);
array.push_back(4);
std::cout << array.size() << std::endl; // should be 4

When you create an array with new, you are allocating a specific amount of memory for that array. You need to tell it how many items are to be stored so it can allocate enough memory.
When you "resize" the array, you are creating a new array (one with even more memory) and copying the items over before deleting the old array (or else you have a memory leak).

Quite simply, C++ arrays have no facility to change their size automatically. Therefore, when allocating an array you must specify it size.

Pointers cannot change an array. They can be made to point to different arrays at runtime, though.
However, I suggest you stay away from anything involving new until you have learned more about the language. For arrays changing their size dynamically use std::vector.

Pointers point to dynamically allocated memory. The memory is on the heap rather than the stack. It is dynamic because you can call new and delete on it, adding to it and removing from it at run time (in simple terms). The pointer has nothing to do with that - a pointer can point to anything and in this case, it just happens to point to the beginning of your dynamic memory. The resizing and management of that memory is completely your responsibility (or the responsibility of the container you may use, e.g. std::vector manages dynamic memory and acts as a dynamic array).

They cannot change the size dynamically. You can get the pointer to point to a new allocation of memory from the heap.

Behind the scenes there is memory allocated, a little chunk of silicium somewhere in your machine is now dedicated to the array you just newed.
When you want to "resize" your array, it is only possible to do so in place if the chunk of silicium has some free space around it. Most of the times, it is instead necessary to reserve another, bigger, chunk and copy the data that were in the first... and obviously relinquish the first (otherwise you have a memory leak).
This is done automatically by STL containers (like std::vector or std::deque), but manually when you yourself call new. Therefore, the best solution to avoid leaks is to use the Standard Library instead of trying to emulate it yourself.

int *pArray = new int; can be considered an array of size 1 and it kinda does what you want "by default".
But what if I need an array of 10 elements?
Pointers do not have any magical abilites, they just point to memory, therefore:
pArray[5] = 10; will just yield a run-time error (if you are lucky).
Therefore there is a possibility to allocate an array of needed size by calling new type[size].

adding element to an array

How can I add an element to an array where the size of the array is unknown and vectors are prohibited?

If the size of the array is unknown how do you know where to put the element and whether it will fit?
Anyway, if it won't fit you have to allocate a new array that is big enough.
If you allocated originally with malloc rather than new[] you can use realloc. You might be surprised to know that malloc / realloc are not "replaced" by new but are used for a different purpose, and in this case it is a useful thing to use. You can then insert objects into the allocated memory using placement new, and vector works this way. (allocator is used to actually allocate the memory but has an interface like malloc and although the implementation is up to the library author, they will almost certainly use malloc).
If you reallocated using realloc, you need to know that:
Any memory will be copied over. Beware though that if they are non-POD objects stored it is not safe to just do byte-by-byte copy
If realloc fails it returns NULL and your previous array is not freed. Be certain to keep a pointer to the old location until you know realloc worked.
If your array is not POD you cannot realloc, so malloc the new memory and use placement-new with a copy-constructor, then call the destructor on each object of the old memory before freeing it.
Placement new is used so you can allocate more memory than you need for now, i.e. more than you have objects for. This prevents you having to go through this process every single time you append.
I have explained how you might implement vector, but this is probably far too complex for you and probably for your tutor too. So just create an array with new[] and copy over the elements although it is horribly inefficient if you have to do this every time you add one.

You use lists for this purpose. An array is not supposed to be extended.

Your question does not make sense. If you can't use vector, means this is an assignment. I assume you have to use native arrays. If you don't know the size, you just cannot add an element to it. If you knew the size, you would have to allocate a bigger array, copy the contents of the old array plus the new element. Which the vector actually does for you :)

Without knowing the current size of the array and the capacity of the array (the distinction there is important), adding elements to the array is dangerous. You will never know when you have passed the bounds of the array (at least not before your program crashes from a buffer overrun error).
If your professor has banned vector for an assignment, chances are he wants to show you how to manage memory. He's looking for you to create an array using new and properly recognize when you need to allocate a new array of a larger size, copy the original elements, and deallocate the original array using delete.
One hint most professors fail to mention: when you allocate an array using new [], you must also deallocate it using delete [].

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js