C++ proper new usage?

int* array = new int[ 10 ]( );
Is this the proper usage of the new operator? To my knowledge the previous code will initialize each element in the array to 0.
int* array = new int[ 10 ];
Does the second line of code just initialize the array, but not set the values to zero?

The proper way to use the new operator depends on what you are going to do with the memory after allocating it.
int* array = new int[10](); will zero out the memory that you are allocating because it is running the int initializer for each int in the array.
int* array = new int[10]; will not initialize the memory, so the value of each int in the array will be whatever happened to be at the memory address you get from new. It might be zeros if you're lucky, but more likely it's garbage left there from some other memory request/release.
Generally speaking, you should treat uninitialized variables as garbage values and not use them before assigning them a value. The rare exceptions: you might use uninitialized memory as entropy for a random number generator (though even then it might not be random enough if the memory happens to be too clean), or to snoop on what another program left in memory after closing. Both of these are exceptions to the rule.
Usually the best reason NOT to initialize is speed. Setting each item in the array to 0 has a cost, and although it might be small, it can be noticeable if your array is huge or you execute this code frequently. When you KNOW you will be setting those values before you use them, you can save yourself the cost of initializing them needlessly.
Now having said all that, I also agree with the comments that std::vector<int> is usually the better way to go, if for nothing more than the advantage that you don't have to worry about memory leaks (which can cost a lot of debug/development time and should not be underestimated), and you get a lot of other benefits. Not to mention you can do all the same things with vectors that you could with a regular array - this is because vectors allocate contiguous memory.
std::vector<int> safeArray(10);
int* array = &safeArray[0]; // array now points to the 0th element in safeArray
One thing you lose with std::vector is that you no longer have a choice on whether or not you initialize.
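A minimal sketch of the difference described above, assuming a C++11 compiler (the commented-out line is the dangerous case):
#include <iostream>

int main() {
    int* zeroed = new int[10]();  // value-initialized: every element is 0
    int* raw = new int[10];       // default-initialized: elements hold indeterminate values

    std::cout << zeroed[5] << '\n';  // guaranteed to print 0
    // std::cout << raw[5] << '\n';  // undefined behavior: reads an indeterminate value

    delete[] zeroed;
    delete[] raw;
}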

Related

Dynamic and static array

I am studying C++ from Stroustrup's book, which in my opinion is not very clear on this topic (arrays). From what I have understood, C++ has (like Delphi) two kinds of arrays:
Static arrays that are declared like
int test[3] = {10,487,-22};
Dynamic arrays that are called vectors
std::vector<int> a;
a.push_back(10);
a.push_back(487);
a.push_back(-22);
I have already seen answers about this (and there were tons of lines and concepts inside), but they didn't clarify the concept for me.
From what I have understood vectors consume more memory but they can change their size (dynamically, in fact). Arrays instead have a fixed size that is given at compile time.
In the chapter Stroustrup says that vectors are safe while arrays aren't, without explaining the reason. I trust him, of course, but why? Is the safety related to the location of the memory (heap/stack)?
I would like to know why vectors are safe, so I understand why I should be using them.
One reason arrays are unsafe is memory leaks.
If you declare a dynamic array
int* arr = new int[size];
and you don't do delete [] arr, then the memory is never released; this is known as a memory leak. It should be noted that ANY time you use new in C++, there must be a matching delete somewhere to free that memory. If you use malloc(), then free() should be used.
http://ptolemy.eecs.berkeley.edu/ptolemyclassic/almagest/docs/prog/html/ptlang.doc7.html
It is also very easy to go out of bounds in an array, for example writing to an index larger than its size - 1. With a vector, you can push_back() as many elements as you want and the vector will resize automatically. If you have an array of size 15 and you try to say arr[18] = x, then you will get a segmentation fault. The program will compile, but will crash when it reaches a statement that puts it out of the array bounds.
In general when you have large code, arrays are used infrequently. Vectors are objectively superior in almost every way, and so using arrays becomes sort of pointless.
EDIT: As Paul McKenzie pointed out in the comments, going out of array bounds does not guarantee a segmentation fault; rather it is undefined behavior, and it is up to the compiler to determine what happens.
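As a minimal sketch of the leak issue and the vector alternative (the function names are just for illustration):
#include <cstddef>
#include <vector>

void leaky_unless_deleted(std::size_t size) {
    int* arr = new int[size];
    // ... use arr ...
    delete[] arr;  // omit this and the block is never freed: a memory leak
}

void self_cleaning(std::size_t size) {
    std::vector<int> v(size);  // freed automatically when v goes out of scope
    v.push_back(42);           // grows past its initial size with no manual work
}

int main() {
    leaky_unless_deleted(100);
    self_cleaning(100);
}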
Let us take the case of reading numbers from a file.
We don't know how many numbers are in the file.
To declare an array to hold the numbers, we need to know the capacity or quantity, which is unknown. We could pick a number like 64. If the file has more than 64 numbers, we start overwriting the array. If the file has fewer than 64 (like 16), we are wasting memory (by not using 48 slots). What we need is to dynamically adjust the size of the container (array).
To dynamically adjust the capacity of an array, a new larger array must be created, then elements copied and the old array deleted.
The std::vector will adjust its capacity as necessary. It handles the dynamic allocation of memory for you.
Another aspect is the passing of the container to a function. With an array, you need to pass the array and the capacity. With std::vector, you only need to pass the vector. The vector object can be queried about its capacity.
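A sketch of that file-reading scenario (the file name numbers.txt is just an example):
#include <fstream>
#include <iostream>
#include <vector>

int main() {
    std::ifstream in("numbers.txt");  // example input file
    std::vector<int> numbers;
    int value = 0;
    while (in >> value)
        numbers.push_back(value);     // the vector grows as needed: no capacity guess

    std::cout << "read " << numbers.size() << " numbers\n";
}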
One safety feature I can see is that you can't access something in a vector which is not there.
What I mean by that is, if you push_back only 4 elements and you try to access index 7, then it will throw an error. But with an array that doesn't happen.
In short, it stops you from accessing corrupt data.
Edit:
The programmer has to compare the index with vector.size() to throw an error; it doesn't happen automatically with operator[]. One has to do it oneself (vector::at(), on the other hand, performs this bounds check for you and throws).
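A short sketch of that distinction, assuming at least C++11 for the brace initializer:
#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3, 4};  // only 4 elements
    try {
        std::cout << v.at(7) << '\n';   // at() checks the index and throws
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
    // v[7] would compile and run, but it is undefined behavior: no check is made
}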

C++ doesn't tell you the size of a dynamic array. But why?

I know that there is no way in C++ to obtain the size of a dynamically created array, such as:
int* a;
a = new int[n];
What I would like to know is: Why? Did people just forget this in the specification of C++, or is there a technical reason for this?
Isn't the information stored somewhere? After all, the command
delete[] a;
seems to know how much memory it has to release, so it seems to me that delete[] has some way of knowing the size of a.
It's a follow-on from the fundamental rule of "don't pay for what you don't need". In your example delete[] a; doesn't need to know the size of the array, because int doesn't have a destructor. If you had written:
std::string* a;
a = new std::string[n];
...
delete [] a;
Then the delete has to call destructors (and needs to know how many to call) - in which case the new has to save that count. However, given it doesn't need to be saved on all occasions, Bjarne decided not to give access to it.
(In hindsight, I think this was a mistake ...)
Even with int of course, something has to know about the size of the allocated memory, but:
Many allocators round up the size to some convenient multiple (say 64 bytes) for alignment and convenience reasons. The allocator knows that a block is 64 bytes long - but it doesn't know whether that is because n was 1 ... or 16.
The C++ run-time library may not have access to the size of the allocated block. If for example, new and delete are using malloc and free under the hood, then the C++ library has no way to know the size of a block returned by malloc. (Usually of course, new and malloc are both part of the same library - but not always.)
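As a hedged illustration of that count bookkeeping: on common ABIs (e.g. the Itanium C++ ABI used by GCC and Clang), an array whose element type has a non-trivial destructor gets a few extra "cookie" bytes storing the element count. Replacing the global allocation functions makes the overhead visible:
#include <cstdio>
#include <cstdlib>
#include <new>

struct S {
    ~S() {}  // non-trivial destructor: delete[] must know how many destructors to call
};

void* operator new[](std::size_t bytes) {
    // For the array below this typically reports more than 10 * sizeof(S),
    // because the element count is stored in a cookie before the elements.
    std::printf("new[] requested %zu bytes\n", bytes);
    if (void* p = std::malloc(bytes)) return p;
    throw std::bad_alloc();
}

void operator delete[](void* p) noexcept {
    std::free(p);
}

int main() {
    S* a = new S[10];
    delete[] a;
}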
One fundamental reason is that there is no difference between a pointer to the first element of a dynamically allocated array of T and a pointer to any other T.
Consider a fictitious function that returns the number of elements a pointer points to.
Let's call it "size".
Sounds really nice, right?
If it weren't for the fact that all pointers are created equal:
char* p = new char[10];
size_t ps = size(p+1); // What?
char a[10] = {0};
size_t as = size(a); // Hmm...
size_t bs = size(a + 1); // Wut?
char i = 0;
size_t is = size(&i); // OK?
You could argue that the first should be 9, the second 10, the third 9, and the last 1, but to accomplish this you need to add a "size tag" on every single object.
A char will require 128 bits of storage (because of alignment) on a 64-bit machine. This is sixteen times more than what is necessary.
(Above, the ten-character array a would require at least 168 bytes.)
This may be convenient, but it's also unacceptably expensive.
You could of course envision a version that is only well-defined if the argument really is a pointer to the first element of a dynamic allocation by the default operator new, but this isn't nearly as useful as one might think.
You are right that some part of the system will have to know something about the size. But getting that information is probably not covered by the API of the memory management system (think malloc/free), and the exact size that you requested may not be known, because it may have been rounded up.
You will often find that memory managers will only allocate space in a certain multiple, 64 bytes for example.
So, you may ask for new int[4], i.e. 16 bytes, but the memory manager will allocate 64 bytes for your request. To free this memory it doesn't need to know how much memory you asked for, only that it has allocated you one block of 64 bytes.
The next question may be, can it not store the requested size? This is an added overhead which not everybody is prepared to pay for. An Arduino Uno for example only has 2k of RAM, and in that context 4 bytes for each allocation suddenly becomes significant.
If you need that functionality then you have std::vector (or equivalent), or you have higher-level languages. C and C++ were designed to let you work with as little overhead as you choose to take on; this is one example.
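A glibc-specific sketch of that rounding; malloc_usable_size() is a nonstandard extension, so treat this as illustration only:
#include <cstdio>
#include <cstdlib>
#include <malloc.h>  // glibc extension, not portable

int main() {
    void* p = std::malloc(16);  // ask for 16 bytes...
    if (!p) return 1;
    // ...but the allocator may hand back a larger usable block, e.g. 16, 24 or 32
    std::printf("usable size: %zu\n", malloc_usable_size(p));
    std::free(p);
}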
There is a curious case of overloading the operator delete that I found in the form of:
void operator delete[](void *p, size_t size);
The parameter size seems to default to the size (in bytes) of the block of memory to which void *p points. If this is true, it is reasonable to at least hope that it has a value passed by the invocation of operator new and, therefore, would merely need to be divided by sizeof(type) to deliver the number of elements stored in the array.
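A sketch of that overload in action, using a class-scope sized deallocation function; on a conforming implementation the size passed to delete[] matches what new[] requested (including any bookkeeping cookie):
#include <cstdio>
#include <cstdlib>
#include <new>

struct Tracked {
    int value;
    static void* operator new[](std::size_t size) {
        std::printf("new[] asked for %zu bytes\n", size);
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc();
    }
    // The usual sized deallocation function: the runtime passes the
    // allocation size in bytes, so the two printed numbers should match.
    static void operator delete[](void* p, std::size_t size) {
        std::printf("delete[] was told %zu bytes\n", size);
        std::free(p);
    }
};

int main() {
    Tracked* arr = new Tracked[5];
    delete[] arr;
}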
As for the "why" part of your question, Martin's rule of "don't pay for what you don't need" seems the most logical.
There's no way to know how you are going to use that array.
The allocation size does not necessarily match the element number so you cannot just use the allocation size (even if it was available).
This is a deep flaw in other languages, not in C++.
You achieve the functionality you desire with std::vector yet still retain raw access to arrays. Retaining that raw access is critical for any code that actually has to do some work.
Many times you will perform operations on subsets of the array, and when you have extra book-keeping built into the language you have to reallocate the sub-arrays and copy the data out to manipulate them with an API that expects a managed array.
Just consider the trite case of sorting the data elements.
If you have managed arrays then you can't use recursion without copying data to create new sub-arrays to pass recursively.
Another example is an FFT which recursively manipulates the data starting with 2x2 "butterflies" and works its way back to the whole array.
To fix the managed array you now need "something else" to patch over this defect, and that "something else" is called 'iterators'. (You now have managed arrays, but almost never pass them to any functions because you need iterators 90+% of the time.)
The size of an array allocated with new[] is not visibly stored anywhere, so you can't access it. And the new[] operator doesn't return an array, just a pointer to the array's first element. If you want to know the size of a dynamic array, you must store it manually or use classes from libraries such as std::vector.
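A small sketch of those two options (track the size yourself, or let std::vector do it):
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    // Option 1: manual bookkeeping next to the raw pointer.
    std::size_t n = 25;
    int* raw = new int[n];  // n must be carried alongside the pointer
    std::cout << "tracked by hand: " << n << '\n';
    delete[] raw;

    // Option 2: std::vector remembers its own size.
    std::vector<int> v(25);
    std::cout << "vector knows: " << v.size() << '\n';
}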

Declare large array on Stack

I am using Dev C++ to write a simulation program. For it, I need to declare a single dimensional array with the data type double. It contains 4200000 elements - like double n[4200000].
The compiler shows no error, but the program exits on execution. I have checked, and the program executes just fine for an array having 5000 elements.
Now, I know that declaring such a large array on the stack is not recommended. However, the thing is that the simulation requires me to call specific elements from the array multiple times - for example, I might need the value of n[234] or n[46664] for a given calculation. Therefore, I need an array in which it is easier to sift through elements.
Is there a way I can declare this array on the stack?
No, there is no (we'll say "reasonable") way to declare this array on the stack. You can, however, declare the pointer on the stack, and set aside a bit of memory on the heap.
double *n = new double[4200000];
Accessing n[234] of this should be no slower than accessing n[234] of an array that you declared like this:
double n[500];
Or even better, you could use vectors
std::vector<int> someElements(4200000);
someElements[234]; // equally fast as n[234] from the other examples if you optimize (-O3); on small programs the difference is negligible even if you don't (~5%)
Which, if you optimize with -O3, is just as fast as an array, and much safer. With the
double *n = new double[4200000];
solution you will leak memory unless you do this:
delete[] n;
And with exceptions and various things, this is a very unsafe way of doing things.
You can increase your stack size. Try adding these options to your link flags:
-Wl,--stack,36000000
It might be too large, though (I'm not sure whether Windows places an upper limit on stack size). In reality, you shouldn't do that even if it works. Use dynamic memory allocation, as pointed out in the other answers.
(Weird, writing an answer and hoping it won't get accepted... :-P)
Yes, you can declare this array on the stack (with a little extra work), but it is not wise.
There is no justifiable reason why the array has to live on the stack.
The overhead of dynamically allocating a single array once is negligible (you could say "zero"), and a smart pointer will safely take care of not leaking memory, if that is your concern.
Stack allocated memory is not in any way different from heap allocated memory (apart from some caching effects for small objects, but these do not apply here).
In short, just don't do it.
If you insist that you must allocate the array on the stack, you will need to reserve 32 megabytes of stack space first (preferably a bit more). For that, using Dev-C++ (which presumes Windows+MinGW) you will either need to set the reserved stack size for your executable using linker flags such as -Wl,--stack,34000000 (this reserves somewhat more than 32 MiB), or create a thread (which lets you specify a reserved stack size for that thread).
But really, again, just don't do that. There's nothing wrong with allocating a huge array dynamically.
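For completeness, a minimal sketch of the smart-pointer route mentioned above, assuming C++14 for std::make_unique:
#include <iostream>
#include <memory>

int main() {
    // One heap allocation; the array is value-initialized to 0.0
    // and released automatically when n goes out of scope.
    auto n = std::make_unique<double[]>(4200000);

    n[234] = 1.5;      // element access looks exactly like a stack array
    n[46664] = 2.5;
    std::cout << n[234] + n[46664] << '\n';
}  // no delete[] needed: the unique_ptr destructor frees the array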
Are there any reasons you want this on the stack specifically?
I'm asking because the following will give you a construct that can be used in a similar way (especially accessing values using array[index]), but is far less limited in size (the total maximum depends on the 32-bit/64-bit memory model and the available RAM and swap), because it is allocated from the heap.
int arraysize = 4200000;
int *heaparray = new int[arraysize];
...
k = heaparray[456];
...
delete [] heaparray;
return;

C++ Deleting part of dynamic array

Say I have a dynamic array like:
int* integers = new int[100];
Is there a way to delete only part of the array such as:
int* integers2 = integers + 50;
delete[] integers2;
I want to not only delete everything from index 50 and beyond, but also have a later delete[] on the original integers array free only the correct remaining amount of memory, rather than trying to free the originally allocated amount and segfaulting.
Why I want to do this: I have a data structure that is structured in levels of arrays, and I want to be able to create this data structure from a full array. So I want to be able to say
int* level1 = integers;
int* level2 = integers + 50;
int* level3 = integers + 100;
But when level 3 is no longer needed, the data structure will automatically delete[] level3. I need to know whether this will behave correctly and not just destroy everything in the array. If it won't, then I need to just create new arrays and copy the contents over, but it would be nice to avoid doing that for performance reasons.
Edit: Everyone seems to be jumping to the conclusion that I just should use a dynamic resizing container (ie vector, deque) in the first place for my data structure. I am using levels of arrays for a good reason (and they aren't equally sized like I make it look like in my example). I was merely looking for a good way to have a constructor to my data structure that takes in an array or vector and not need to copy the original contents over into the new data structure.
No, this will not behave correctly. You can only delete[] pointers that you got from new[], else the results are undefined and bad things might happen.
If you really need the array to get smaller you have to allocate a new one and copy the contents manually.
Typically when memory gets allocated, there is some housekeeping data stored just before the address you are handed, i.e.:
housekeeping | (pointer) | data
Calling delete[] on a pointer that doesn't sit right after that housekeeping will mess it up.
int* integers2 = integers + 50;
delete[] integers2;
This will not work, because the new expression assigned space for 100 ints to integers; integers2 is only a pointer to the 50th element of integers and has no allocation of its own, so using delete[] on it will not free the rest of the array, it will only give erratic results.
What you can do is copy the first 50 into another array, and delete the previous array completely.
delete[] will only free a pointer that was itself returned by new[]; using delete on another pointer that merely points into the space assigned to the first pointer will not free any of that space.
delete[] integers2 will not free any space assigned to integers or any other pointer.
Dynamic allocators (like new) generally don't like you releasing part of the memory they gave you. If you use the <cstdlib> library functions malloc() and free() instead of new and delete, then you can use realloc(), though in most cases where you would care about the size difference it's just going to copy for you anyway.
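A hedged sketch of that realloc() route; note that memory obtained from new[] must never be passed to realloc(), so the whole allocation has to use malloc() from the start:
#include <cstdlib>

int main() {
    int* integers = static_cast<int*>(std::malloc(100 * sizeof(int)));
    if (!integers) return 1;

    // Shrink the block to its first 50 elements. realloc() may move
    // the data, so always keep the returned pointer, not the old one.
    if (int* shrunk = static_cast<int*>(std::realloc(integers, 50 * sizeof(int))))
        integers = shrunk;

    std::free(integers);  // one free() for whatever realloc() returned
}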
Dynamically sizing containers generally use an exponential rule for resizing: if you run out of space, they (for example) double the allocation (and copy the old data over), if you remove data until you are using (for example) less than half the allocation they copy into a smaller allocation. This means you never waste more than half the memory, and the cost of copying per element added or removed is effectively constant. Implementing all of this is a pain in the ass, though, so just use std::vector and let it do it for you :).
No you can't do this with a fixed sized array allocated with new[]. If you want to have a dynamic array use one of the STL containers, such as std::vector.

If pointers can dynamically change the size of arrays at run time, why is it necessary to initialize the array with a size?

For instance:
int* pArray;
pArray = new array[];
instead of:
int* pArray;
pArray = new array[someNumber];
Since pointers are able to dynamically change the size of an array at run time, and the name of the pointer points to the first element of an array, shouldn't the default size be [1]? Does anyone know what's happening behind the scene?
Since pointers are able to dynamically change the size of an array at run time
This is not true. They can't change the size unless you allocate a new array with the new size.
If you want to have an array-like object that dynamically changes the size you should use the std::vector.
#include <vector>
#include <iostream>
...
std::vector<int> array;
array.push_back(1);
array.push_back(2);
array.push_back(3);
array.push_back(4);
std::cout << array.size() << std::endl; // should be 4
When you create an array with new, you are allocating a specific amount of memory for that array. You need to tell it how many items are to be stored so it can allocate enough memory.
When you "resize" the array, you are creating a new array (one with even more memory) and copying the items over before deleting the old array (or else you have a memory leak).
Quite simply, C++ arrays have no facility to change their size automatically. Therefore, when allocating an array you must specify its size.
Pointers cannot change an array. They can be made to point to different arrays at runtime, though.
However, I suggest you stay away from anything involving new until you have learned more about the language. For arrays changing their size dynamically use std::vector.
Pointers point to dynamically allocated memory. The memory is on the heap rather than the stack. It is dynamic because you can call new and delete on it, adding to it and removing from it at run time (in simple terms). The pointer has nothing to do with that - a pointer can point to anything and in this case, it just happens to point to the beginning of your dynamic memory. The resizing and management of that memory is completely your responsibility (or the responsibility of the container you may use, e.g. std::vector manages dynamic memory and acts as a dynamic array).
They cannot change the size dynamically. You can get the pointer to point to a new allocation of memory from the heap.
Behind the scenes there is memory allocated; a little chunk of silicon somewhere in your machine is now dedicated to the array you just newed.
When you want to "resize" your array, it is only possible to do so in place if the chunk has some free space around it. Most of the time, it is instead necessary to reserve another, bigger chunk and copy the data that were in the first... and obviously relinquish the first (otherwise you have a memory leak).
This is done automatically by STL containers (like std::vector or std::deque), but manually when you yourself call new. Therefore, the best solution to avoid leaks is to use the Standard Library instead of trying to emulate it yourself.
int *pArray = new int; can be considered an array of size 1 and it kinda does what you want "by default".
But what if I need an array of 10 elements?
Pointers do not have any magical abilites, they just point to memory, therefore:
pArray[5] = 10; will just yield a run-time error (if you are lucky).
Therefore the language gives you the ability to allocate an array of the needed size by calling new type[size].