I've searched far and wide and I cannot work out what this produces. I've seen no other examples where "new int []" is used twice within one array. Can anyone help?
int *t [2] = { new int [2], new int [2] };
t is an array of two int*, which are, generically, pointers to int.
The new operator allocates an array of 2 consecutive int on the heap and returns the address of that allocated memory (as an int*). This is done twice, thus allocating two arrays and storing their addresses in the outer array.
Since new int [2] gives you a heap-allocated array of two integers (each time you call it), you'll end up with an array t of integer pointers pointing to distinct arrays of integers, something like this:
(array) (points to) (arrays-on-heap)
t: [0] -> [int0, int1]
[1] -> [int2, int3]
Were you to print out &t, &(t[0]) and &(t[1]), you would find that the first two are the same address and the third is slightly higher (by the size of an int *). This is because array elements are placed consecutively, and the array starts at its first element.
Printing out t[0] and t[1] may have wildly disparate values since they can come from anywhere on the heap. They probably won't be that different simply because consecutive heap allocations tend to be done from consecutive memory(a), but they're likely to be separated by some memory - this is because many common allocation strategies involve allocating blocks of a minimum size/resolution, and with inline control information between blocks.
Printing out &(t[0][0]) and &(t[0][1]) will again give you close values since they form part of an array.
Note that those paragraphs above are not all mandated to be true by the standard, they're just the most common scenarios. It's possible that allocation strategies may involve exact sizes and out-of-line control information, but it would be unlikely.
(a) There may be exceptions to this in optimised allocators if, for example, different-sized requests come from different pools, or there's a separate preferred pool per thread. But, in the general case here, that's unlikely.
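As a rough illustration of the layout described above, here is a small sketch that allocates the same t and records only the relationships that can be relied on. The names LayoutReport, inspect_layout and demo_layout are made up for this example. The distance between t[0] and t[1] is deliberately not checked, since that depends on the allocator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: what we observe about the layout of t.
struct LayoutReport {
    bool same_start;          // &t and &t[0] are the same address
    std::ptrdiff_t slot_gap;  // bytes between &t[0] and &t[1]
    bool distinct_blocks;     // t[0] and t[1] point to different heap arrays
    std::ptrdiff_t int_gap;   // bytes between &t[0][0] and &t[0][1]
};

inline LayoutReport inspect_layout() {
    int *t[2] = { new int[2], new int[2] };

    LayoutReport r;
    r.same_start = static_cast<void *>(&t) == static_cast<void *>(&t[0]);
    r.slot_gap = reinterpret_cast<char *>(&t[1]) - reinterpret_cast<char *>(&t[0]);
    r.distinct_blocks = t[0] != t[1];
    r.int_gap = reinterpret_cast<char *>(&t[0][1]) - reinterpret_cast<char *>(&t[0][0]);

    // Each new[] needs a matching delete[].
    delete[] t[0];
    delete[] t[1];
    return r;
}

inline void demo_layout() {
    const LayoutReport r = inspect_layout();
    assert(r.same_start);
    assert(r.slot_gap == static_cast<std::ptrdiff_t>(sizeof(int *)));
    assert(r.distinct_blocks);
    assert(r.int_gap == static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

Calling demo_layout() from main is enough to run the checks.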
You can't have:
int array[1000000];
but you can make a vector and store those 1000000 elements.
Is this because the array is stored on the stack and it will not have enough space to grow?
What happens when you use the vector instead?
How does it prevent the issue of storing too many elements?
Since variables defined globally or in other places might not go on the stack, I assume we are defining int array[1000000] or std::vector<int> array(1000000) inside a function definition, i.e. as local variables.
For the former, yes, you're right. It is stored on the stack, and because stack space is limited, this is dangerous in most environments.
On the other hand, in most standard library implementations, the latter only contains a size, a capacity and a pointer to where the data is actually stored. So it takes up just a couple of dozen bytes on the stack, no matter how many elements are in the vector. And the pointer comes from a heap memory allocation (new or malloc), not from the stack.
Here is an example of how many bytes it takes up in the stack for each.
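A minimal sketch of that comparison, not the original poster's example: it measures the size of each object itself, which is what would live on the stack for a local variable. The names kCount, array_object_bytes, vector_object_bytes and demo_sizes are invented here, and the exact size of std::vector<int> is implementation-specific (often three pointers' worth).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCount = 1000000;

// Size of the array object itself; as a local variable, all of it is on the stack.
inline std::size_t array_object_bytes() { return sizeof(int[kCount]); }

// Size of the vector object itself; the elements are stored elsewhere.
inline std::size_t vector_object_bytes() { return sizeof(std::vector<int>); }

inline void demo_sizes() {
    std::vector<int> v(kCount);  // one million ints, allocated dynamically
    assert(v.size() == kCount);
    assert(array_object_bytes() == kCount * sizeof(int));
    assert(vector_object_bytes() < array_object_bytes());
}
```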
And here is a rough visualization.
I am studying C++ by reading Stroustrup's book, which in my opinion is not very clear on this topic (arrays). From what I have understood, C++ has (like Delphi) two kinds of arrays:
Static arrays that are declared like
int test[3] = {10,487,-22};
Dynamic arrays that are called vectors
std::vector<int> a;
a.push_back(10);
a.push_back(487);
a.push_back(-22);
I have already seen answers about this (and there were tons of lines and concepts inside) but they didn't clarify the concept for me.
From what I have understood vectors consume more memory but they can change their size (dynamically, in fact). Arrays instead have a fixed size that is given at compile time.
In the chapter Stroustrup said that vectors are safe while arrays aren't, without explaining the reason. I trust him indeed, but why? Is the safety reason related to the location of the memory? (heap/stack)
I would like to know why I am using vectors if they are safe.
One reason raw arrays are unsafe is memory leaks.
If you declare a dynamic array
int * arr = new int[size]
and you don't do delete [] arr, then the memory remains uncleared and this is known as a memory leak. It should be noted, ANY time you use the word new in C++, there must be a delete somewhere in there to free that memory. If you use malloc(), then free() should be used.
http://ptolemy.eecs.berkeley.edu/ptolemyclassic/almagest/docs/prog/html/ptlang.doc7.html
It is also very easy to go out of bounds in an array, for example by writing to an index larger than its size - 1. With a vector, you can push_back() as many elements as you want and the vector will resize automatically. If you have an array of size 15 and you try to say arr[18] = x, then you may get a segmentation fault. The program will compile, but it may crash when it reaches a statement that goes outside the array bounds.
In general when you have large code, arrays are used infrequently. Vectors are objectively superior in almost every way, and so using arrays becomes sort of pointless.
EDIT: As Paul McKenzie pointed out in the comments, going out of array bounds does not guarantee a segmentation fault; it is undefined behavior, so the language makes no promise about what happens.
Let us take the case of reading numbers from a file.
We don't know how many numbers are in the file.
To declare an array to hold the numbers, we need to know the capacity or quantity, which is unknown. We could pick a number like 64. If the file has more than 64 numbers, we start overwriting the array. If the file has fewer than 64 (like 16), we are wasting memory (by not using 48 slots). What we need is to dynamically adjust the size of the container (array).
To dynamically adjust the capacity of an array, a new larger array must be created, then elements copied and the old array deleted.
The std::vector will adjust its capacity as necessary. It handles the dynamic allocation of memory for you.
Another aspect is the passing of the container to a function. With an array, you need to pass the array and the capacity. With std::vector, you only need to pass the vector. The vector object can be queried about its capacity.
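To make the file-reading case concrete, here is a small sketch. The function name read_numbers is hypothetical, and it reads from any std::istream so a string stream can stand in for the file. The vector grows as needed, and the caller can ask it for its size instead of passing a separate count.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Reads whitespace-separated integers until the stream ends or a read fails.
inline std::vector<int> read_numbers(std::istream &in) {
    std::vector<int> numbers;
    int value;
    while (in >> value) {
        numbers.push_back(value);  // capacity is adjusted automatically
    }
    return numbers;
}

inline void demo_read_numbers() {
    std::istringstream in("10 487 -22");  // stands in for a file of unknown length
    const std::vector<int> numbers = read_numbers(in);
    assert(numbers.size() == 3);
    assert(numbers[1] == 487);
}
```

With a real file, a std::ifstream could be passed to read_numbers in the same way.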
One safety feature I can see is that you can't access something in a vector which is not there.
What I mean by that is: if you push_back only 4 elements and then try to access index 7 with at(), it will throw an error (std::out_of_range). With an array that doesn't happen.
In short, it stops you from accessing corrupt data.
Edit:
With operator[] this doesn't happen automatically. The programmer has to compare the index with vector.size() and raise the error themselves.
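A short sketch of the checked access mentioned in this answer. The helper name checked_access_throws is made up for illustration; std::vector::at and std::out_of_range are the standard library pieces doing the work.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(index) throws std::out_of_range, false if the access succeeds.
inline bool checked_access_throws(const std::vector<int> &v, std::size_t index) {
    try {
        (void)v.at(index);  // at() performs a bounds check
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}

inline void demo_checked_access() {
    std::vector<int> v;
    for (int i = 0; i < 4; ++i) v.push_back(i);
    assert(checked_access_throws(v, 7));   // index 7 does not exist
    assert(!checked_access_throws(v, 3));  // index 3 is the last valid one
}
```

Note that v[7] would compile just as well but performs no check, so it is not used here.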
I know that there is no way in C++ to obtain the size of a dynamically created array, such as:
int* a;
a = new int[n];
What I would like to know is: Why? Did people just forget this in the specification of C++, or is there a technical reason for this?
Isn't the information stored somewhere? After all, the command
delete[] a;
seems to know how much memory it has to release, so it seems to me that delete[] has some way of knowing the size of a.
It's a follow on from the fundamental rule of "don't pay for what you don't need". In your example delete[] a; doesn't need to know the size of the array, because int doesn't have a destructor. If you had written:
std::string* a;
a = new std::string[n];
...
delete [] a;
Then the delete has to call destructors (and needs to know how many to call) - in which case the new has to save that count. However, given it doesn't need to be saved on all occasions, Bjarne decided not to give access to it.
(In hindsight, I think this was a mistake ...)
Even with int of course, something has to know about the size of the allocated memory, but:
Many allocators round up the size to some convenient multiple (say 64 bytes) for alignment and convenience reasons. The allocator knows that a block is 64 bytes long - but it doesn't know whether that is because n was 1 ... or 16.
The C++ run-time library may not have access to the size of the allocated block. If for example, new and delete are using malloc and free under the hood, then the C++ library has no way to know the size of a block returned by malloc. (Usually of course, new and malloc are both part of the same library - but not always.)
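To illustrate the destructor point, here is a small sketch with a made-up type Counted that counts destructor calls. It only shows that delete[] runs one destructor per element, so the element count must be recoverable by the implementation. It says nothing about where or how a given implementation stores it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type whose destructor records that it ran.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Allocates n objects with new[], frees them with delete[],
// and returns how many destructors were run.
inline int destructors_run_by_delete(std::size_t n) {
    Counted::destroyed = 0;
    Counted *a = new Counted[n];
    delete[] a;  // no count is passed here, yet n destructors run
    return Counted::destroyed;
}

inline void demo_destructor_count() {
    assert(destructors_run_by_delete(5) == 5);
    assert(destructors_run_by_delete(1) == 1);
}
```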
One fundamental reason is that there is no difference between a pointer to the first element of a dynamically allocated array of T and a pointer to any other T.
Consider a fictitious function that returns the number of elements a pointer points to.
Let's call it "size".
Sounds really nice, right?
If it weren't for the fact that all pointers are created equal:
char* p = new char[10];
size_t ps = size(p+1); // What?
char a[10] = {0};
size_t as = size(a); // Hmm...
size_t bs = size(a + 1); // Wut?
char i = 0;
size_t is = size(&i); // OK?
You could argue that the first should be 9, the second 10, the third 9, and the last 1, but to accomplish this you need to add a "size tag" on every single object.
A char will require 128 bits of storage (because of alignment) on a 64-bit machine. This is sixteen times more than what is necessary.
(Above, the ten-character array a would require at least 168 bytes.)
This may be convenient, but it's also unacceptably expensive.
You could of course envision a version that is only well-defined if the argument really is a pointer to the first element of a dynamic allocation by the default operator new, but this isn't nearly as useful as one might think.
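The cost of such a "size tag" can be sketched with an invented struct, TaggedChar, that stores a size next to a single char. This is only a model of the idea, not how any real implementation works. On a typical 64-bit platform sizeof(TaggedChar) comes out at 16 bytes, but only the weaker, portable facts are checked below.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a char that carries its own size tag.
struct TaggedChar {
    std::size_t size;  // the tag
    char value;        // the one byte of actual data
};

inline void demo_tag_overhead() {
    assert(sizeof(char) == 1);
    // The tagged version is strictly larger than the tag alone...
    assert(sizeof(TaggedChar) > sizeof(std::size_t));
    // ...and is padded up to a multiple of its alignment.
    assert(sizeof(TaggedChar) % alignof(TaggedChar) == 0);
}
```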
You are right that some part of the system will have to know something about the size. But getting that information is probably not covered by the API of memory management system (think malloc/free), and the exact size that you requested may not be known, because it may have been rounded up.
You will often find that memory managers will only allocate space in a certain multiple, 64 bytes for example.
So, you may ask for new int[4], i.e. 16 bytes, but the memory manager will allocate 64 bytes for your request. To free this memory it doesn't need to know how much memory you asked for, only that it has allocated you one block of 64 bytes.
The next question may be, can it not store the requested size? This is an added overhead which not everybody is prepared to pay for. An Arduino Uno for example only has 2k of RAM, and in that context 4 bytes for each allocation suddenly becomes significant.
If you need that functionality then you have std::vector (or equivalent), or you have higher-level languages. C/C++ was designed to enable you to work with as little overhead as you choose to make use of, this being one example.
There is a curious case of overloading the operator delete that I found in the form of:
void operator delete[](void *p, size_t size);
The parameter size seems to default to the size (in bytes) of the block of memory to which void *p points. If this is true, it is reasonable to at least hope that it has a value passed by the invocation of operator new and, therefore, would merely need to be divided by sizeof(type) to deliver the number of elements stored in the array.
As for the "why" part of your question, Martin's rule of "don't pay for what you don't need" seems the most logical.
There's no way to know how you are going to use that array.
The allocation size does not necessarily match the element number so you cannot just use the allocation size (even if it was available).
This is a deep flaw in other languages not in C++.
You achieve the functionality you desire with std::vector yet still retain raw access to arrays. Retaining that raw access is critical for any code that actually has to do some work.
Many times you will perform operations on subsets of the array and when you have extra book-keeping built into the language you have to reallocate the sub-arrays and copy the data out to manipulate them with an API that expects a managed array.
Just consider the trite case of sorting the data elements.
If you have managed arrays then you can't use recursion without copying data to create new sub-arrays to pass recursively.
Another example is an FFT which recursively manipulates the data starting with 2x2 "butterflies" and works its way back to the whole array.
To fix the managed array you now need "something else" to patch over this defect and that "something else" is called 'iterators'. (You now have managed arrays but almost never pass them to any functions, because you need iterators more than 90% of the time.)
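The sorting case can be sketched as a recursive quicksort over a raw pointer range. This is one possible formulation built on std::partition, and the name quick_sort is chosen just for this example. Each recursive call works on a sub-range of the same storage, so nothing is copied into new sub-arrays.

```cpp
#include <algorithm>
#include <cassert>

// Sorts the half-open range [first, last) in place.
inline void quick_sort(int *first, int *last) {
    if (last - first < 2) return;  // 0 or 1 elements: already sorted
    const int pivot = *(first + (last - first) / 2);
    // Elements less than the pivot go to the front...
    int *middle1 = std::partition(first, last,
                                  [pivot](int x) { return x < pivot; });
    // ...followed by elements equal to the pivot.
    int *middle2 = std::partition(middle1, last,
                                  [pivot](int x) { return !(pivot < x); });
    quick_sort(first, middle1);  // recurse on sub-ranges of the same array
    quick_sort(middle2, last);
}

inline void demo_quick_sort() {
    int data[] = {5, 3, 8, 1, 9, 2};
    quick_sort(data + 1, data + 4);  // sort only the middle part: 3, 8, 1
    assert(data[0] == 5 && data[1] == 1 && data[2] == 3 && data[3] == 8);
    quick_sort(data, data + 6);      // now the whole array
    assert(std::is_sorted(data, data + 6));
}
```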
The size of an array allocated with new[] is not visibly stored anywhere, so you can't access it. And new[] operator doesn't return an array, just a pointer to the array's first element. If you want to know the size of a dynamic array, you must store it manually or use classes from libraries such as std::vector
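Both options can be sketched briefly. IntBuffer and make_buffer are hypothetical names for the "store it manually" approach, and std::vector is shown alongside for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical pairing of a dynamic array with its element count.
struct IntBuffer {
    std::unique_ptr<int[]> data;  // releases the array with delete[] automatically
    std::size_t count;            // the size we have to remember ourselves
};

inline IntBuffer make_buffer(std::size_t n) {
    // new int[n]() value-initializes the elements to zero.
    return IntBuffer{ std::unique_ptr<int[]>(new int[n]()), n };
}

inline void demo_sizes_known() {
    IntBuffer b = make_buffer(4);
    assert(b.count == 4);
    assert(b.data[3] == 0);

    std::vector<int> v(4);  // the vector keeps track of its own size
    assert(v.size() == 4);
}
```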
In my project, there are one million inputs and I am supposed to take different numbers of inputs in order to compare sort/search algorithms. Everything was all right until I tried to take five hundred thousand inputs. I then realized that I can't create an array of five hundred thousand pointers to my class, or even to an integer type. However, I can create five arrays of one hundred thousand pointers each.
If I didn't explain that very well, just look at these two pieces of code:
int *ptr[500000]; // it crashes
int *ptr1[100000]; // it runs well
int *ptr2[100000];
int *ptr3[100000];
int *ptr4[100000];
int *ptr5[100000];
What is the reason of crashing? Is there a limiting or is it about memory? And of course, how can I fix it?
You are trying to allocate a 500,000-entry array on the stack. The stack is not really designed for holding large amounts of data like this. In your case, the stack just happens to be big enough to hold 100,000 entries (or even several different lots of 100,000 entries) but not 500,000 in a single block. If you overflow the stack, behaviour is undefined but a crash is likely.
You will get much better results by allocating your array on the heap instead.
int **ptr = malloc(500000*sizeof(int*));
Remember to check for a NULL return value from malloc, and free the memory when you're finished with it.
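In C++ the malloc call above would also need a cast on its result. A std::vector is one way to get the same heap allocation without manual cleanup. The following is only a sketch, and make_pointer_table is a name invented for it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a table of n null pointers; the table itself is allocated dynamically.
inline std::vector<int *> make_pointer_table(std::size_t n) {
    return std::vector<int *>(n, nullptr);
}

inline void demo_pointer_table() {
    // Only the small vector object is a local; the 500,000 slots are not on the stack.
    std::vector<int *> ptr = make_pointer_table(500000);
    assert(ptr.size() == 500000);
    assert(ptr[0] == nullptr);
    assert(ptr[499999] == nullptr);
}   // the table is released automatically here
```

If allocation fails, std::vector reports it by throwing std::bad_alloc rather than returning NULL.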
My background is C++ and I'm currently about to start developing in C# so am doing some research. However, in the process I came across something that raised a question about C++.
This C# for C++ developers guide says that
In C++ an array is merely a pointer.
But this StackOverflow question has a highly-upvoted comment that says
Arrays are not pointers. Stop telling people that.
The cplusplus.com page on pointers says that arrays and pointers are related (and mentions implicit conversion, so they're obviously not the same).
The concept of arrays is related to that of pointers. In fact, arrays work very much like pointers to their first elements, and, actually, an array can always be implicitly converted to the pointer of the proper type.
I'm getting the impression that the Microsoft page wanted to simplify things in order to summarise the differences between C++ and C#, and in the process wrote something that was simpler but not 100% accurate.
But what have arrays got to do with pointers in the first place? Why is the relationship close enough for them to be summarised as the "same" even if they're not?
The cplusplus.com page says that arrays "work like" pointers to their first element. What does that mean, if they're not actually pointers to their first element?
There is a lot of bad writing out there. For example the statement:
In C++ an array is merely a pointer.
is simply false. How can such bad writing come about? We can only speculate, but one possible theory is that the author learned C++ by trial and error using a compiler, and formed a faulty mental model of C++ based on the results of his experiments. This is possibly because the syntax used by C++ for arrays is unconventional.
The next question is, how can a learner know if he/she is reading good material or bad material? Other than by reading my posts of course ;-) , participating in communities like Stack Overflow helps to bring you into contact with a lot of different presentations and descriptions, and then after a while you have enough information and experience to make your own decisions about which writing is good and which is bad.
Moving back to the array/pointer topic: my advice would be to first build up a correct mental model of how object storage works when we are working in C++. It's probably too much to write about just for this post, but here is how I would build up to it from scratch:
C and C++ are designed in terms of an abstract memory model, however in most cases this translates directly to the memory model provided by your system's OS or an even lower layer
The memory is divided up into basic units called bytes (usually 8 bits)
Memory can be allocated as storage for an object; e.g. when you write int x; it is decided that a particular block of adjacent bytes is set aside to store an integer value. An object is any region of allocated storage. (Yes this is a slightly circular definition!)
Each byte of allocated storage has an address which is a token (usually representable as a simple number) that can be used to find that byte in memory. The addresses of any bytes within an object must be sequential.
The name x only exists during the compilation stage of a program. At runtime there can be int objects allocated that never had a name; and there can be other int objects with one or more names during compilation.
All of this applies to objects of any other type, not just int
An array is an object which consists of many adjacent sub-objects of the same type
A pointer is an object which serves as a token identifying where another object can be found.
From here on, C++ syntax comes into it. C++'s type system uses strong typing, which means that each object has a type. The type system extends to pointers. In almost all situations, the storage used to store a pointer only saves the address of the first byte of the object being pointed to; and the type system is used at compilation time to keep track of what is being pointed to. This is why we have different types of pointer (e.g. int *, float *) despite the fact that the storage may consist of the same sort of address in both cases.
Finally: the so-called "array-pointer equivalence" is not an equivalence of storage, if you understood my last two bullet points. It's an equivalence of syntax for looking up members of an array.
Since we know that a pointer can be used to find another object; and an array is a series of many adjacent objects; then we can work with the array by working with a pointer to that array's first element. The equivalence is that the same processing can be used for both of the following:
Find Nth element of an array
Find Nth object in memory after the one we're looking at
and furthermore, those concepts can be both expressed using the same syntax.
They are most definitely not the same thing at all, but in this case, confusion can be forgiven because the language semantics are ... flexible and intended for the maximum confusion.
Let's start by simply defining a pointer and an array.
A pointer (to a type T) points to a memory space which holds at least one T (assuming non-null).
An array is a memory space that holds multiple Ts.
A pointer points to memory, and an array is memory, so you can point inside or to an array. Since you can do this, pointers offer many array-like operations. Essentially, you can index any pointer on the presumption that it actually points to memory for more than one T.
Therefore, there's some semantic overlap between (pointer to) "Memory space for some Ts" and "Points to a memory space for some Ts". This is true in any language- including C#. The main difference is that they don't allow you to simply assume that your T reference actually refers to a space where more than one T lives, whereas C++ will allow you to do that.
Since all pointers to a T can be pointers to an array of T of arbitrary size, you can treat pointers to an array and pointers to a T interchangeably. The special case of a pointer to the first element is that the "some Ts" for the pointer and "some Ts" for the array are equal. That is, a pointer to the first element yields a pointer to N Ts (for an array of size N) and a pointer to the array yields ... a pointer to N Ts, where N is equal.
Normally, this is just interesting memory crapping-around that nobody sane would try to do. But the language actively encourages it by converting the array to the pointer to the first element at every opportunity, and in some cases where you ask for an array, it actually gives you a pointer instead. This is most confusing when you want to actually use the array like a value, for example, to assign to it or pass it around by value, when the language insists that you treat it as a pointer value.
Ultimately, all you really need to know about C++ (and C) native arrays is, don't use them, pointers to arrays have some symmetries with pointers to values at the most fundamental "memory as an array of bytes" kind of level, and the language exposes this in the most confusing, unintuitive and inconsistent way imaginable. So unless you're hot on learning implementation details nobody should have to know, then use std::array, which behaves in a totally consistent, very sane way and just like every other type in C++. C# gets this right by simply not exposing this symmetry to you (because nobody needs to use it, give or take).
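As a small sketch of the "behaves like every other type" point: a std::array can be copied, assigned, passed by value and returned, and it knows its own size. The function name doubled is invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Takes the array by value (a real copy) and returns a modified copy.
inline std::array<int, 3> doubled(std::array<int, 3> values) {
    for (int &v : values) v *= 2;
    return values;
}

inline void demo_std_array() {
    std::array<int, 3> a = {{10, 487, -22}};
    std::array<int, 3> b = a;  // copy construction works
    b = doubled(a);            // so do assignment and pass-by-value
    assert(a[0] == 10);        // the original is untouched
    assert(b[0] == 20);
    assert(b[2] == -44);
    assert(b.size() == 3);     // the size travels with the object
}
```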
Arrays and pointers in C and C++ can be used with the exact same semantics and syntax in the vast majority of cases.
That is achieved by one feature:
Arrays decay to pointers to their first element in nearly all contexts.
Exceptions in C: sizeof, _Alignof, _Alignas, address-of &
In C++, the difference can also be important for overload-resolution.
In addition, array notation for function arguments is deceptive, these function-declarations are equivalent:
int f(int* a);
int f(int a[]);
int f(int a[3]);
But not to this one:
int f(int (&a)[3]);
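The equivalence of those declarations can be checked at compile time. In the sketch below the functions are given distinct, made-up names (takes_pointer and so on) so their types can be compared, and length_of is a hypothetical helper showing that a reference to an array keeps the size.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Declarations only; they are never called, just inspected with decltype.
int takes_pointer(int *a);
int takes_unsized(int a[]);
int takes_sized(int a[3]);
int takes_reference(int (&a)[3]);

// The first three have exactly the same function type: int(int*).
static_assert(std::is_same<decltype(takes_pointer), decltype(takes_unsized)>::value,
              "int a[] is adjusted to int* a");
static_assert(std::is_same<decltype(takes_pointer), decltype(takes_sized)>::value,
              "int a[3] is adjusted to int* a; the 3 is ignored");
// The reference-to-array version is a different type.
static_assert(!std::is_same<decltype(takes_pointer), decltype(takes_reference)>::value,
              "a reference to an array is not a pointer");

// A reference to an array preserves its length.
template <std::size_t N>
constexpr std::size_t length_of(const int (&)[N]) { return N; }

inline void demo_decay() {
    int a[3] = {1, 2, 3};
    assert(sizeof(a) == 3 * sizeof(int));  // sizeof sees the whole array: no decay
    int *p = a;                            // decay to a pointer to the first element
    assert(p == &a[0]);
    assert(length_of(a) == 3);
}
```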
Besides what has already been told, there is one big difference:
pointers are variables to store memory addresses, and they can be incremented or decremented and the values they store can change (they can point to any other memory location). That's not the same for arrays; once they are allocated, you can't change the memory region they reference, e.g. you cannot assign other values to them:
int my_array[10];
int x = 2;
my_array = &x; // error: an array cannot be assigned to
my_array++; // error: an array cannot be incremented
Whereas you can do the same with a pointer:
int *p = my_array;
p++;
p = &x;
The meaning in this guide was simply that in C# an array is an object (perhaps like in STL that we can use in C++), while in C++ an array is basically a sequence of variables located & allocated one after the other, and that's why we can refer to them using a pointer (pointer++ will give us the next one etc.).
it's as simple as:
int arr[10];
int* arr_pointer1 = arr;
int* arr_pointer2 = &arr[0];
so, since arrays are contiguous in memory, writing
arr[1];
is the same as writing:
*(arr_pointer1+1)
pushing things a bit further, writing:
arr[2];
//resolves to
*(arr+2);
//note also that this is perfectly valid
2[arr];
//resolves to
*(2+arr);
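The equivalences above can be checked directly. This is just a sketch, with demo_indexing and the sample values made up for the purpose.

```cpp
#include <cassert>

inline void demo_indexing() {
    int arr[10] = {0, 10, 20, 30};  // remaining elements are zero
    int *arr_pointer1 = arr;        // decays to a pointer to the first element
    int *arr_pointer2 = &arr[0];    // the same address, spelled out

    assert(arr_pointer1 == arr_pointer2);
    assert(arr[1] == *(arr_pointer1 + 1));
    assert(arr[2] == *(arr + 2));
    assert(2[arr] == *(2 + arr));   // valid, though not something to write in real code
    assert(arr[2] == 20);
}
```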