According to my class notes, you can allocate an array in C++ like
int *A = new int[5]
where A is a pointer to the array.
But then you can access the array as A[3]. Why can you do that? Isn't A a pointer and not the actual array?
The indexing operator[] actually defined to work on pointers, not on arrays. A[3] is actually a synonym for *(A+3). It works on arrays as a consequence of the fact that arrays can be implicitly converted to pointers to their first element.
A[3] is an alias for *(A+3) which dereferences the pointer.
You are right! Basically, this is syntactic sugar, meaning that the language designers put something in place to make your life a bit easier, but behind the scenes it is doing something quite different. (This point is arguable)
Effectively, what this is doing is taking the pointer location at A, moving the pointer 5 int-sizes up, then giving you the value by dereferencing.
This is the starting point of pointer arithmetic: you can add or subtract values with pointers, as long you don't move off the array (well, you can move up to right after the array). Basically, the following hold (for A being a pointer to an array with at leat i + 1 elements:
A[i]
*(A + i) // this what A[i] means
*(i + A) // addition is commutative
i[A] // this sould work, too - and it does!
Related
I wanted some further understanding, and possibly clarification, on a few things that confuse me about arrays and pointers in c++. One of the main things that confuse me is how when you refer to the name of an array, it could be referring to the array, or as a pointer to the first element, as well as a few other things. in order to better display where my trouble understanding is coming from, I'll display a few lines of code, as well as the conclusion that I've made from each one.
let's say that I have
int vals[] = {1,50,3,28,32,500};
int* valptr = vals;
Because vals is a pointer to the first value of the array, then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
cout<<valptr<<endl;
the code above all prints out the same value, leading me to conclude two things, one that valptr and vals are equal, and can be treated in the same way, as well as the fact that, for some reason, adding & and * to vals doesn't seem to refer to anything different, meaning that using them in this way, is useless, since they all refer to the same value.
cout<<valptr[1]<<endl;//outputs 50
cout<<vals[1]<<endl;//outputs 50
cout<<*vals<<endl;//outputs 1
cout<<*valptr<<endl;//outputs 1
the code above furthers the mindset to me that valptr and vals are the same, seeing as whenever I do something to vals, and do the same thing to valptr, they both yield the same result
cout<<*(&valptr +1)-valptr<<endl;
cout<<endl;
cout<< *(&vals + 1) -vals<<endl;
Now that we have established what I know, or the misconceptions that I may have, now we move on to the two main problems that I have, which we'll go over now.
The first confusion I have is with cout<< *(&vals + 1) -vals<<endl; I know that this outputs the size of the array, and the general concept on how it works, but I'm confused on several parts.
As shown earlier, if
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
all print out the same value, the why do I need the * and & in cout<< *(&vals + 1) -vals<<endl; I know that if I do vals+1 it just refers to the address of the next element on the array, meaning that *(vals+1)returns 50.
This brings me to my first question: Why does the & in *(&vals+1) refer to the next address out the array? why is it not the same output as (vals+1)in the way that *(&vals)and (vals)have the same output?
Now to my second question. We know thatcout<< *(&vals + 1) -vals<<endl; is a valid statement, successfully printing the size of the array. However, as I stated earlier, in every other instance, valptr could be substituted for vals interchangeably. However, in the instance of writingcout<<*(&valptr +1)-valptr<<endl; I get 0 returned, instead of the expected value of 6. How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
I appreciate any help that anyone reading this can give me
Arrays and pointers are two different types in C++ that have enough similarities between them to create confusion if they are not understood properly. The fact that pointers are one of the most difficult concepts for beginners to grasp doesn't help either.
So I fell a quick crash course is needed.
Crash course
Arrays, easy
int a[3] = {2, 3, 4};
This creates an array named a that contains 3 elements.
Arrays have defined the array subscript operator:
a[i]
evaluates to the i'th element of the array.
Pointers, easy
int val = 24;
int* p = &val;
p is a pointer pointing to the address of the object val.
Pointers have the indirection (dereference) operator defined:
*p
evaluates to the value of the object pointed by p.
Pointers, acting like arrays
Pointers give you the address of an object in memory. It can be a "standalone object" like in the example above, or it can be an object that is part of an array. Neither the pointer type nor the pointer value can tell you which one it is. Just the programmer. That's why
Pointers also have the array subscript operator defined:
p[i]
evaluates to the ith element to the "right" of the object pointed by p. This assumes that the object pointer by p is part of an array (except in p[0] where it doesn't need to be part of an array).
Note that p[0] is equivalent to (the exact same as) *p.
Arrays, acting like pointers
In most contexts array names decay to a pointer to the first element in the array. That is why many beginners think that arrays and pointers are the same thing. In fact they are not. They are different types.
Back to your questions
Because vals is a pointer to the first value of the array
No, it's not.
... then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
Because the premise is false the rest of the sentence is also false.
valptr and vals are equal
No they are not.
can be treated in the same way
Most of the times yes, because of the array to pointer decay. However that is not always the case. E.g. as an expression for sizeof and operand of & (address of) operator.
Why does the & in *(&vals+1) refer to the next address out the array?
&vals this is one of the few situation when vals doesn't decay to a pointer. &vals is the address of the array and is of type int (*)[6] (pointer to array of 6 ints). That's why &vals + 1 is the address of an hypothetical another array just to the right of the array vals.
How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
Simply that's the language. In most cases the name of the array decays to a pointer to the first element of the array. Except in a few situations that I've mentioned.
More crash course
A pointer to the array is not the same thing as a pointer to the first element of the array. Taking the address of the array is one of the few instances where the array name doesn't decay to a pointer to its first element.
So &a is a pointer to the array, not a pointer to the first element of the array. Both the array and the first element of the array start at the same address so the values of the two (&a and &a[0]) are the same, but their types are different and that matters when you apply the dereference or array subscript operator to them:
Expression
Expression type
Dereference / array subscript expresison
Dereference / array subscript type
a
int[3]
*a / a[i]
int
&a
int (*) [3] (pointer to array)
*&a / &a[i]
int[3]
&a[0]
int *
*&a[0] / (&a[0])[i]
int
I always thought that x[i] is equivalent to *(x+i).
So that would x[i][j] is equivalent to *(*(x+i)+j), which is in this case implies that a 2D array must be implemented as an array of pointers, each of these pointers has to be dereferenced.
But I learned that you can create a 2D array on the heap this way :
char (*arr)[256]=malloc(512*256);
If the former hypothesis is right, then arr[i][j] would access an unauthorized location (since we are dereferencing two times).
Is my former hypothesis wrong in case of 2d arrays ?
x[i] is indeed equivalent to *(x+i). Pointer arithmetic is used in the latter form.
In case of a 2D array type array[x][y];, then you have to apply the above rule in several steps.
array when used in an expression, decays to a pointer to the first element, in this case of type type(*)[y] - an array pointer pointing at the first array in the array of arrays. Do not confuse an array pointer with "an array of pointers", which means something different entirely.
Therefore array + i performs pointer arithmetic on such an array pointer. *(array + i) gives the pointed-at item, a 1D array.
If you for some reason unknown want to access an individual item in the 2D array by using pointer arithmetic only, you would therefore have to write something obscure like this:
*( *(array + i) + j)
If the former hypothesis is right, then arr[i][j] would access an
unauthorized location (since we are dereferencing two times).
Well, it wasn't right. When you have an array pointer such as char (*arr)[256], then you can access the items as arr[i][j]. Note that arr[i] performs the same kind of pointer arithmetic, so you get "the char[256] array number i". And in that array you then access item j.
This is actually the reason why this syntax is used. Had you written the malloc more type correct like this:
char (*arr)[512][256]=malloc( sizeof(char[512][256]) );
then you would have to de-reference the array pointer before using it, (*arr)[i][j] or otherwise you would get pointer arithmetic on whole 2D arrays.
The question Correctly allocating multi-dimensional arrays explains and illustrates the differences between look-up tables based on arrays of pointers and true multi-dimensional arrays.
Once I encountered following code:
int s =10;
int *p=&s;
cout << p[3] << endl;
And I can't understand why am I able to access p[3] that doesn't exist (only p exists that is single pointer but I still get access to p[3] that is array that I have never created).
Is it some compiler bug or it is a feature or I don't know some basics of C++ that covers this?
Thank you
Why does C++ consider pointer and array of pointers as same thing?
It doesn't. You're asking why it treats pointers and arrays as the same.
The [] operator is just an abbreviated form of pointer arithmetic. a[b] is equivalent to *(a + b). Array names can decay into pointers, and then pointer arithmetic is applied. It's the programmers job to make sure they don't go out of bounds. The compiler can't possibly stop you from shooting your foot off.
Also, claiming to be able to "access" it is a strong assertion. That is UB, and is most likely going to either read the wrong memory or get a segfault.
No, it's not a compiler bug, its a very useful feature... but lets not get ahead of ourselves here, the consequence of your code is called Undefined Behaviour
So, what's the feature? All naked arrays are actually pointer to the first element. Except un-decayed arrays (See What is array decaying?).
Consider this code:
int s =10;
int* array = new int[12];
int *p;
p = array; // p refers to the first element
int* x = p + 7; //advances to the 7th element, compiler never checks bounds
int* y = p + 700; //ditto ...this is obviously undefined
p = &s; //p is now pointing to where s
int* xx = p + 3; //But s is a single element, so Undefined Behaviour
Once an array is decayed, it's simply a pointer... And a pointer can be incremented, decremented, dereferenced, advanced, assigned or reassigned.
So,
cout << p[7] << endl;
is a valid C++ program. but not necessarily correct.
It's the responsibility of the programmer to know whether a pointer points to a single element or an array. but thanks to static analyzers and https://github.com/isocpp/CppCoreGuidelines, things are changing for good.
Also see What are all the common undefined behaviours that a C++ programmer should know about?
From here, section array-to-pointer decay:
There is anĀ implicit conversionĀ from lvalues and rvalues of array type to rvalues of pointer type: it constructs a pointer to the first element of an array. This conversion is used whenever arrays appear in context where arrays are not expected, but pointers are
Inherited from C, C++ allows you to treat any pointer like the first element of an array starting at that address.
That's in part because it passes arrays by reference as pointers and so for that to make sense you need to be able to treat a pointer as an array.
It also enables some quite neat and very efficient code in various circumstances.
The upshot is that p[3] is a valid construct in this context.
Obviously however it has undefined behaviour because p isn't pointing to an array! Unfortunately the language rules (and compiler) aren't smart enough to work that out.
C is a very low level language and doesn't enforce nice things like range checking either during compilation or execution.
for example:
#include <iostream>
using namespace std;
int main(){
int *a;
a = new int[2];
a[1] = 1;
}
From what I understand, a 2 sized array of int is allocated in "the heap memory" and pointer a takes the memory address of that newly created array. However, when trying to access (for example) the second index of the array (a[1]), it simply does so without the asterisk operator and I don't understand why, I'm used to seeing that the value stored in a memory address pointed to by a pointer is accessed as *pointername and not like pointername[value].
So my question is, why do we use the subscript operator to access a pointer which points to an array without the asterisk operator?
In C++, applying operator[] to a pointer p with index i is the semantic equivalent of
*(p + i)
and
*(i + p)
You can think of it as syntactic sugar. Also note that this implies that p[N] is equivalent to N[p].
a[N] is equal to *(a+N) if a is a pointer. Thus, a[1] dereferences the pointer a+1.
An array is a block of memory containing multiple instances of same size elements,
the subscript operator just puts an offset for the right element from the original pointer. so a[0] would be equal to *a. and a[1] = *a + 1 * sizeof(element) so in a sense you are right we do use the asterisk operator in the end it's just hidden behind syntactic sugar.
Ihe index operator has a peculiar meaning for an array of primitives: array[index] is equivalent to *(array+index). One side effect is that if you want to obfuscate your code, index[array] means exactly the same thing (for primitive types, of course). For primitive arrays, the index operator is semantic sugar for a dereference.
For non-primitive types, classes can override operator[]. This is the opposite of semantic sugar.
This is my first question :)
double MyArray[][6];
double* Myptr = MyArray[0];
So i've been wondering why, in pointer Arithmetic, I can notate a pointer to move in a single dimension like this,
*(Myptr + i);
but if i try to move through the dimensions of the array using a for loop it won't let me
*(*(Myptr + i) + j);
However it does let me use this notation with the array itself.
*(*(MyArray + i) + j);
I wanted to know why is this a restriction? or maybe i am writing it down incorrectly.
Myptr is a pointer to double, so *(Myptr + i) is a double, and while you can add j to that, you cannot dereference the result.
Had you declared Myptr like this:
double (*Myptr)[6] = MyArray;
then Myptr would be a pointer to an array of 6 doubles. Consequently *(Myptr + i) would be a double[6], and the expression would work.
That you can index arrays using this syntax is unfortunately surprising to people who see it the first time and have never been taught about pointer decay. A funny thing about arrays in C and C++ is that they decay into pointers in almost all circumstances. What this means is that very nearly always when you use an array (exceptions exist), the array is implicitly converted to a pointer to its first element and the rest is done with pointer arithmetic.
For example, this is the case when you write MyArray[i]. The standard defines MyArray[i] to mean *(MyArray + i) (the standard uses more parentheses, but this is what it boils down to), which makes sense when you understand that MyArray decays to a pointer, i is added to this pointer, and the resulting pointer is dereferenced. It also explains why i[MyArray] is equally valid (if in bad style).
In the context of multidimensional arrays, it is important to understand that a multidimensional array is merely an array of arrays. MyArray is, in your case, an array of arrays of doubles. MyArray[0] is, then, an array of doubles. In pointer decay, MyArray decays to a pointer to array of doubles, and when you dereference that pointer and work with the array it points to, then that array also decays to a pointer (to double) when you work with it. It's decay all the way down.
MyArray has type double[][6] and MyPointer has type double*.
Now arrays and pointers are different, but this is not important in this context. Both variables have array/pointer thingies in their types, but My Array has two (count them: [] and [6]) and MyPointer has only one (count it: *). Which is why you can apply dereferencing (that's the * operator) to MyArray twice, and to MyPointer only once.
Now if you want to have a pointer that can be used in much the same way you use MyArray, this is possible, but its type will not be double** (because arrays and pointers are different). You write it this way:
double (*MyPointer2)[6] = MyArray;
This one has two array/pointer thingies in its type so you can apply dereferencing twice to it.
In this case, MyPtr is only one dimensional, because it was assigned to a specific "array" in MyArray.
Because Myptr is a double*. A pointer to a double. It has no knowledge of the array dimensions.