Pointer arithmetic with multidimensional Array Notation - c++

This is my first question :)
double MyArray[][6];
double* Myptr = MyArray[0];
So I've been wondering why, in pointer arithmetic, I can use a pointer to move in a single dimension like this,
*(Myptr + i);
but if I try to move through both dimensions of the array using a for loop, it won't let me
*(*(Myptr + i) + j);
However it does let me use this notation with the array itself.
*(*(MyArray + i) + j);
I wanted to know why this is a restriction, or whether I am writing it down incorrectly.

Myptr is a pointer to double, so *(Myptr + i) is a double, and while you can add j to that, you cannot dereference the result.
Had you declared Myptr like this:
double (*Myptr)[6] = MyArray;
then Myptr would be a pointer to an array of 6 doubles. Consequently *(Myptr + i) would be a double[6], and the expression would work.
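To make the difference concrete, here is a minimal sketch of the two pointer types side by side. The function names and the constant kCols are made up for illustration; only the declarations double* and double (*)[6] come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCols = 6;  // inner dimension, as in double MyArray[][6]

// Row pointer: rows + i skips i whole double[6] rows, so two levels of
// dereference are available.
double at_row_pointer(double (*rows)[kCols], std::size_t i, std::size_t j) {
    assert(j < kCols);
    return *(*(rows + i) + j);  // same element as rows[i][j]
}

// Element pointer: row + j skips j doubles, and only one dereference exists.
double at_element_pointer(const double* row, std::size_t j) {
    assert(j < kCols);
    return *(row + j);  // same element as row[j]
}
```

Passing a double[2][6] to at_row_pointer works because the array decays to a double (*)[6]; passing one of its rows to at_element_pointer works because that row decays to a double*.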
That you can index arrays using this syntax is unfortunately surprising to people who see it the first time and have never been taught about pointer decay. A funny thing about arrays in C and C++ is that they decay into pointers in almost all circumstances. What this means is that very nearly always when you use an array (exceptions exist), the array is implicitly converted to a pointer to its first element and the rest is done with pointer arithmetic.
For example, this is the case when you write MyArray[i]. The standard defines MyArray[i] to mean *(MyArray + i) (the standard uses more parentheses, but this is what it boils down to), which makes sense when you understand that MyArray decays to a pointer, i is added to this pointer, and the resulting pointer is dereferenced. It also explains why i[MyArray] is equally valid (if in bad style).
In the context of multidimensional arrays, it is important to understand that a multidimensional array is merely an array of arrays. MyArray is, in your case, an array of arrays of doubles. MyArray[0] is, then, an array of doubles. In pointer decay, MyArray decays to a pointer to array of doubles, and when you dereference that pointer and work with the array it points to, then that array also decays to a pointer (to double) when you work with it. It's decay all the way down.

MyArray has type double[][6] and Myptr has type double*.
Now arrays and pointers are different, but this is not important in this context. Both variables have array/pointer thingies in their types, but MyArray has two (count them: [] and [6]) and Myptr has only one (count it: *). Which is why you can apply dereferencing (that's the * operator) to MyArray twice, and to Myptr only once.
Now if you want to have a pointer that can be used in much the same way you use MyArray, this is possible, but its type will not be double** (because arrays and pointers are different). You write it this way:
double (*MyPointer2)[6] = MyArray;
This one has two array/pointer thingies in its type so you can apply dereferencing twice to it.

In this case, Myptr is only one-dimensional, because it was assigned to a specific "array" (a single row) in MyArray.

Because Myptr is a double*. A pointer to a double. It has no knowledge of the array dimensions.

Related

Clarification on the relation of arrays and pointers

I wanted some further understanding, and possibly clarification, on a few things that confuse me about arrays and pointers in C++. One of the main things that confuses me is how, when you refer to the name of an array, it could be referring to the array, or to a pointer to the first element, as well as a few other things. In order to better show where my trouble understanding is coming from, I'll show a few lines of code, as well as the conclusion that I've drawn from each one.
let's say that I have
int vals[] = {1,50,3,28,32,500};
int* valptr = vals;
Because vals is a pointer to the first value of the array, then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
cout<<valptr<<endl;
the code above all prints out the same value, leading me to conclude two things, one that valptr and vals are equal, and can be treated in the same way, as well as the fact that, for some reason, adding & and * to vals doesn't seem to refer to anything different, meaning that using them in this way, is useless, since they all refer to the same value.
cout<<valptr[1]<<endl;//outputs 50
cout<<vals[1]<<endl;//outputs 50
cout<<*vals<<endl;//outputs 1
cout<<*valptr<<endl;//outputs 1
the code above furthers the mindset to me that valptr and vals are the same, seeing as whenever I do something to vals, and do the same thing to valptr, they both yield the same result
cout<<*(&valptr +1)-valptr<<endl;
cout<<endl;
cout<< *(&vals + 1) -vals<<endl;
Now that we have established what I know, or the misconceptions that I may have, now we move on to the two main problems that I have, which we'll go over now.
The first confusion I have is with cout<< *(&vals + 1) -vals<<endl; I know that this outputs the size of the array, and the general concept on how it works, but I'm confused on several parts.
As shown earlier, if
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
all print out the same value, then why do I need the * and & in cout<< *(&vals + 1) -vals<<endl;? I know that if I do vals+1 it just refers to the address of the next element in the array, meaning that *(vals+1) returns 50.
This brings me to my first question: Why does the & in *(&vals+1) refer to the next address out the array? Why is it not the same output as (vals+1), in the way that *(&vals) and (vals) have the same output?
Now to my second question. We know that cout<< *(&vals + 1) -vals<<endl; is a valid statement, successfully printing the size of the array. However, as I stated earlier, in every other instance, valptr could be substituted for vals interchangeably. Yet when writing cout<<*(&valptr +1)-valptr<<endl; I get 0 returned, instead of the expected value of 6. How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
I appreciate any help that anyone reading this can give me
Arrays and pointers are two different types in C++ that have enough similarities between them to create confusion if they are not understood properly. The fact that pointers are one of the most difficult concepts for beginners to grasp doesn't help either.
So I feel a quick crash course is needed.
Crash course
Arrays, easy
int a[3] = {2, 3, 4};
This creates an array named a that contains 3 elements.
Arrays have defined the array subscript operator:
a[i]
evaluates to the i'th element of the array.
Pointers, easy
int val = 24;
int* p = &val;
p is a pointer that holds the address of the object val.
Pointers have the indirection (dereference) operator defined:
*p
evaluates to the value of the object pointed by p.
Pointers, acting like arrays
Pointers give you the address of an object in memory. It can be a "standalone object" like in the example above, or it can be an object that is part of an array. Neither the pointer type nor the pointer value can tell you which one it is; only the programmer knows. That's why pointers also have the array subscript operator defined:
p[i]
evaluates to the i-th element to the "right" of the object pointed to by p. This assumes that the object pointed to by p is part of an array (except for p[0], where it doesn't need to be part of an array).
Note that p[0] is equivalent to (the exact same as) *p.
Arrays, acting like pointers
In most contexts array names decay to a pointer to the first element in the array. That is why many beginners think that arrays and pointers are the same thing. In fact they are not. They are different types.
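One place where the difference shows up is sizeof. The sketch below reuses the array a from the crash course; the pointer p and the constant element_count are names introduced here for illustration only.

```cpp
#include <cstddef>
#include <type_traits>

int a[3] = {2, 3, 4};
int* p = a;  // array-to-pointer decay: p points at a[0]

// The array type carries its length; the pointer type does not.
static_assert(sizeof(a) == 3 * sizeof(int), "sizeof sees the whole array");
static_assert(sizeof(p) == sizeof(int*), "sizeof sees only the pointer");
static_assert(!std::is_same<decltype(a), decltype(p)>::value,
              "int[3] and int* are different types");
static_assert(std::is_same<std::decay<decltype(a)>::type, int*>::value,
              "int[3] decays to int*");

// The element count can be recovered from the array, but not from p alone.
constexpr std::size_t element_count = sizeof(a) / sizeof(a[0]);
```

All of the static_assert lines are checked at compile time, so the file compiling at all is the demonstration.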
Back to your questions
Because vals is a pointer to the first value of the array
No, it's not.
... then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
Because the premise is false the rest of the sentence is also false.
valptr and vals are equal
No they are not.
can be treated in the same way
Most of the time, yes, because of array-to-pointer decay. However, that is not always the case, e.g. when the array is the operand of sizeof or of the & (address-of) operator.
Why does the & in *(&vals+1) refer to the next address out the array?
&vals: this is one of the few situations where vals doesn't decay to a pointer. &vals is the address of the array and is of type int (*)[6] (pointer to array of 6 ints). That's why &vals + 1 is the address of a hypothetical second array just to the right of the array vals.
How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
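That is simply how the language works. In most cases the name of the array decays to a pointer to the first element of the array, except in the few situations that I've mentioned. valptr, on the other hand, is a genuine pointer object, so &valptr has type int** and &valptr + 1 points just past that one pointer variable, not past the array.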
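The step size of each "+ 1" can be measured without dereferencing anything. The following is only a sketch: the helper byte_distance and the three step_of_* functions are hypothetical names, and the code assumes a platform where std::uintptr_t exists and pointers convert to plain byte addresses (true on mainstream desktop platforms, but not guaranteed by the standard).

```cpp
#include <cstddef>
#include <cstdint>

int vals[] = {1, 50, 3, 28, 32, 500};
int* valptr = vals;

// Distance in bytes between two addresses, computed on integer values
// (assumption: flat address space) so no pointer is ever dereferenced.
std::size_t byte_distance(const void* from, const void* to) {
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(to) -
                                    reinterpret_cast<std::uintptr_t>(from));
}

// vals decays to int*, so + 1 steps over one int.
std::size_t step_of_vals() { return byte_distance(vals, vals + 1); }

// &vals is int (*)[6], so + 1 steps over the whole array.
std::size_t step_of_address_of_vals() { return byte_distance(&vals, &vals + 1); }

// &valptr is int**, so + 1 steps over one pointer object, not over the array.
std::size_t step_of_address_of_valptr() { return byte_distance(&valptr, &valptr + 1); }
```

Since &valptr + 1 points just past the pointer variable itself, dereferencing it reads memory that is not part of any array. That is undefined behavior, so the 0 printed in the question is not a value to rely on.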
More crash course
A pointer to the array is not the same thing as a pointer to the first element of the array. Taking the address of the array is one of the few instances where the array name doesn't decay to a pointer to its first element.
So &a is a pointer to the array, not a pointer to the first element of the array. Both the array and the first element of the array start at the same address so the values of the two (&a and &a[0]) are the same, but their types are different and that matters when you apply the dereference or array subscript operator to them:
Expression | Expression type | Dereference / array subscript expression | Dereference / array subscript type
a | int[3] | *a / a[i] | int
&a | int (*)[3] (pointer to array) | *&a / (&a)[i] | int[3]
&a[0] | int * | *&a[0] / (&a[0])[i] | int
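The types in the table can be checked by the compiler. This is a small sketch using decltype; note that decltype reports an lvalue expression as a reference type, which is why int& and int (&)[3] appear where the table says int and int[3].

```cpp
#include <type_traits>

int a[3] = {2, 3, 4};

// The three expressions from the table and their types.
static_assert(std::is_same<decltype(a), int[3]>::value, "a is an array");
static_assert(std::is_same<decltype(&a), int (*)[3]>::value, "&a points to the array");
static_assert(std::is_same<decltype(&a[0]), int*>::value, "&a[0] points to an element");

// Dereferencing or subscripting each of them.
static_assert(std::is_same<decltype(*a), int&>::value, "*a is an int");
static_assert(std::is_same<decltype(*&a), int (&)[3]>::value, "*&a is the array again");
static_assert(std::is_same<decltype((&a)[0]), int (&)[3]>::value, "(&a)[0] is the array");
static_assert(std::is_same<decltype((&a[0])[1]), int&>::value, "(&a[0])[i] is an int");
```

The parentheses in (&a)[0] matter: without them, &a[0] parses as &(a[0]), which is the int* from the last row of the table.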

Is the meaning of [][] context-specific?

I always thought that x[i] is equivalent to *(x+i).
So x[i][j] would be equivalent to *(*(x+i)+j), which in this case implies that a 2D array must be implemented as an array of pointers, each of which has to be dereferenced.
But I learned that you can create a 2D array on the heap this way :
char (*arr)[256]=malloc(512*256);
If the former hypothesis is right, then arr[i][j] would access an unauthorized location (since we are dereferencing two times).
Is my former hypothesis wrong in case of 2d arrays ?
x[i] is indeed equivalent to *(x+i). Pointer arithmetic is used in the latter form.
In case of a 2D array type array[x][y];, then you have to apply the above rule in several steps.
array when used in an expression, decays to a pointer to the first element, in this case of type type(*)[y] - an array pointer pointing at the first array in the array of arrays. Do not confuse an array pointer with "an array of pointers", which means something different entirely.
Therefore array + i performs pointer arithmetic on such an array pointer. *(array + i) gives the pointed-at item, a 1D array.
If you for some reason unknown want to access an individual item in the 2D array by using pointer arithmetic only, you would therefore have to write something obscure like this:
*( *(array + i) + j)
If the former hypothesis is right, then arr[i][j] would access an
unauthorized location (since we are dereferencing two times).
Well, it wasn't right. When you have an array pointer such as char (*arr)[256], then you can access the items as arr[i][j]. Note that arr[i] performs the same kind of pointer arithmetic, so you get "the char[256] array number i". And in that array you then access item j.
This is actually the reason why this syntax is used. Had you written the malloc in a more type-correct way, like this:
char (*arr)[512][256]=malloc( sizeof(char[512][256]) );
then you would have to de-reference the array pointer before using it, (*arr)[i][j] or otherwise you would get pointer arithmetic on whole 2D arrays.
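Both pointer forms can be tried out in C++ as well. The sketch below swaps malloc for new[] (which returns the right pointer type without a cast) and uses small, made-up dimensions kRows and kCols; the function names are illustrative, and the offset check assumes pointers convert to plain byte addresses via std::uintptr_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRows = 4;  // small sizes chosen for the sketch
constexpr std::size_t kCols = 8;

// Row-pointer form: arr points at the first char[kCols] row.
bool row_pointer_demo(std::size_t i, std::size_t j) {
    assert(i < kRows && j < kCols);
    char (*arr)[kCols] = new char[kRows][kCols]();
    arr[i][j] = 'x';
    // arr + i skips i rows; the row decays to char*, and + j skips j chars.
    const bool same = *(*(arr + i) + j) == 'x';
    // One contiguous block: element (i, j) sits i * kCols + j bytes from the start.
    const std::uintptr_t offset = reinterpret_cast<std::uintptr_t>(&arr[i][j]) -
                                  reinterpret_cast<std::uintptr_t>(arr);
    delete[] arr;
    return same && offset == i * kCols + j;
}

// Whole-array form: the pointer must be dereferenced once before indexing.
bool whole_array_pointer_demo(std::size_t i, std::size_t j) {
    assert(i < kRows && j < kCols);
    char (*arr)[kRows][kCols] = new char[1][kRows][kCols]();
    (*arr)[i][j] = 'y';
    const bool same = arr[0][i][j] == 'y';  // arr[0] is the same object as *arr
    delete[] arr;
    return same;
}
```

In the second function, arr[1] would already be a second complete 2D array, which is the "pointer arithmetic on whole 2D arrays" mentioned above.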
The question Correctly allocating multi-dimensional arrays explains and illustrates the differences between look-up tables based on arrays of pointers and true multi-dimensional arrays.

What is a pointer to an array of x amount of ints?

I've been working through C++ Primer, and I feel as if something hasn't been explained clearly...
I understand what it means to have an array of pointers to ints, for example:
int *foo [4];
or
int *foo1 [2][4];
However, I don't understand what it means to have a pointer to an array of some number of ints. Here's an example from the book:
int ia[3][4];
int (*p) [4] = ia;
What is p here, and what does this mean, both in the 1D and the 2D case?
When you declare a variable like
int array[137];
the type of the variable array is int[137]; that is, an array of 137 ints. In C++, you can create pointers or references to variables of any type that you'd like, so you can create a pointer to the variable array. Since array has type "array of 137 ints," a pointer to array would have type "pointer to an array of 137 ints." The syntax for writing out a pointer like this would be
int (*arrayPtr)[137] = &array;
This is read as "arrayPtr is a pointer, and what it points at is an array of 137 ints." The syntax here is unusual, but that's what you would do if you wanted a pointer to an array of 137 integers.
Similarly, you could make a reference to array like this:
int (&arrayRef)[137] = array;
It's extremely uncommon to see pointers to arrays actually used anywhere - I've never seen it done except in very specialized template circumstances. References to arrays are sometimes used in template programming, but otherwise I've never seen them used anywhere before.
In other words, it's good to know these exist and know how to read them, but realistically you're unlikely to need them unless you start doing some pretty advanced library development or like to play around with template wizardry.
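For a feel of what the template use looks like, here is a minimal sketch. Both function names are hypothetical; the first is a hand-rolled version of the idea behind utilities such as std::size, the second shows a pointer to an array used as a parameter.

```cpp
#include <cstddef>

// N is deduced from the argument's array type. Passing a plain int* here
// would not compile, because a pointer carries no length.
template <std::size_t N>
constexpr std::size_t length_of(const int (&)[N]) {
    return N;
}

// A pointer to a whole int[4]: *arr is the array itself, so range-for works.
int sum_through_pointer(int (*arr)[4]) {
    int total = 0;
    for (int value : *arr) {
        total += value;
    }
    return total;
}
```

A call site would look like length_of(data) and sum_through_pointer(&data) for an int data[4]; note the explicit & in the second call, which is what produces a pointer to the array rather than to its first element.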
This gets weirder when you factor array-to-pointer decay into the mix. The example code you had was essentially
int myArray[137][42];
int (*arrayPtr)[42] = myArray;
The type of arrayPtr is "pointer to an array of 42 ints", which is of type int (*) [42]. So why can we assign myArray, which has type int[137][42], to it? This is where array-to-pointer decay kicks in. Array-to-pointer decay is an implicit conversion that converts an array to a pointer to its first element. Let's see how that applies here.
myArray is an array of 137 arrays of 42 integers. This can be thought of as "an array of 137 things, each of which is an int[42]." This means that when array-to-pointer decay applies to myArray, it converts to a pointer to its first element. That element is itself an array of 42 integers, so the effect of applying the array-to-pointer decay is that the expression myArray implicitly converts to an int (*) [42] (a pointer to an array of 42 integers), specifically, one pointing at the first row in the array.
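The decay described here can be confirmed with a short sketch. The array and pointer names follow the example above; points_at_row is a hypothetical helper added for the check.

```cpp
#include <cstddef>
#include <type_traits>

int myArray[137][42];

// Decay turns int[137][42] into a pointer to its first element, an int[42].
static_assert(std::is_same<std::decay<decltype(myArray)>::type, int (*)[42]>::value,
              "int[137][42] decays to int (*)[42]");

// Each step of the row pointer covers one whole row of 42 ints.
static_assert(sizeof(*static_cast<int (*)[42]>(nullptr)) == 42 * sizeof(int),
              "the pointee is a full row");

// arrayPtr + i addresses row i of myArray.
bool points_at_row(std::size_t i) {
    int (*arrayPtr)[42] = myArray;  // implicit array-to-pointer decay
    return arrayPtr + i == &myArray[i];
}
```

The sizeof operand is never evaluated, so applying * to a null pointer inside it only inspects the type.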
The net effect of this assignment is that arrayPtr now points to the first of the 137 arrays in myArray. This is just plain weird, if you ask me, and I would not advise writing code like this. I've been writing C++ for years and never had the misfortune of seeing this used anywhere, so I think it's safe to chalk this one up to "bizarre edge cases that only library implementers need to worry about." :-)
int (*p) [4];
means
p is a pointer to an array of 4 int elements.
The parentheses override the usual precedence: without them, [] would bind more tightly than *, and int *p[4] would declare an array of 4 pointers. With them, p is first of all a pointer.
int ia[3][4];
int (*p) [4] = ia;
Basically, here p holds the starting address of the first row of the ia matrix, and p + i is the starting address of row i. The initialization is equivalent to
p = &ia[0];
...

Why does int*[] decay into int** but not int[][]?

I'm trying to understand the nature of type-decay. For example, we all know arrays decay into pointers in a certain context. My attempt is to understand how int[] equates to int* but how two-dimensional arrays don't correspond to the expected pointer type. Here is a test case:
std::is_same<int*, std::decay<int[]>::type>::value; // true
This returns true as expected, but this doesn't:
std::is_same<int**, std::decay<int[][1]>::type>::value; // false
Why is this not true? I finally found a way to make it return true, and that was by making the first dimension a pointer:
std::is_same<int**, std::decay<int*[]>::type>::value; // true
And the assertion holds true for any type with pointers but with the last being the array. For example (int***[] == int****; // true).
Can I have an explanation as to why this is happening? Why don't the array types correspond to the pointer types as would be expected?
Why does int*[] decay into int** but not int[][]?
Because it would be impossible to do pointer arithmetic with it.
For example, int p[5][4] means an array of (length-4 array of int). There are no pointers involved, it's simply a contiguous block of memory of size 5*4*sizeof(int). When you ask for a particular element, e.g. int a = p[i][j], the compiler is really doing this:
char *tmp = (char *)p // Work in units of bytes (char)
+ i * sizeof(int[4]) // Offset for outer dimension (int[4] is a type)
+ j * sizeof(int); // Offset for inner dimension
int a = *(int *)tmp; // Back to the contained type, and dereference
Obviously, it can only do this because it knows the size of the "inner" dimension(s). Casting to an int (*)[4] retains this information; it's a pointer to (length-4 array of int). However, an int ** doesn't; it's merely a pointer to (pointer to int).
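The same offset calculation can be wrapped in a function and compared against ordinary indexing. This is a sketch only: element_by_byte_offset is a made-up name, the inner dimension 4 matches the int p[5][4] example, and std::memcpy is used for the final read instead of a cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reads the element p[i][j] by computing its byte offset by hand.
int element_by_byte_offset(int (*p)[4], std::size_t i, std::size_t j) {
    assert(j < 4);
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(p)
                                 + i * sizeof(int[4])  // skip i whole rows
                                 + j * sizeof(int);    // skip j ints in that row
    int value;
    std::memcpy(&value, bytes, sizeof value);  // copy the int's bytes out
    return value;
}
```

The function only works because sizeof(int[4]) is part of the parameter type; with an int** parameter there would be no row size to multiply by.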
For another take on this, see the following sections of the C FAQ:
6.18: My compiler complained when I passed a two-dimensional array to a function expecting a pointer to a pointer.
6.19: How do I write functions which accept two-dimensional arrays when the width is not known at compile time?
6.20: How can I use statically- and dynamically-allocated multidimensional arrays interchangeably when passing them to functions?
(This is all for C, but this behaviour is essentially unchanged in C++.)
C was not really "designed" as a language; instead, features were added as needs arose, with an effort not to break earlier code. Such an evolutionary approach was a good thing in the days when C was being developed, since it meant that for the most part developers could reap the benefits of the earlier improvements in the language before everything the language might need to do was worked out. Unfortunately, the way in which array- and pointer handling have evolved has led to a variety of rules which are, in retrospect, unfortunate.
In the C language of today, there is a fairly substantial type system, and variables have clearly defined types, but things were not always thus. A declaration char arr[8]; would allocate 8 bytes in the present scope, and make arr point to the first of them. The compiler wouldn't know that arr represented an array--it would represent a char pointer just like any other char*. From what I understand, if one had declared char arr1[8], arr2[8];, the statement arr1 = arr2; would have been perfectly legal, being somewhat equivalent conceptually to char *st1 = "foo", *st2 = "bar"; st1 = st2;, but would have almost always represented a bug.
The rule that arrays decompose into pointers stemmed from a time when arrays and pointers really were the same thing. Since then, arrays have come to be recognized as a distinct type, but the language needed to remain essentially compatible with the days when they weren't. When the rules were being formulated, the question of how two-dimensional arrays should be handled wasn't an issue because there was no such thing. One could do something like char foo[20]; char *bar[4]; int i; for (i=0; i<4; i++) bar[i] = foo + (i*5); and then use bar[x][y] in the same way as one would now use a two-dimensional array, but a compiler wouldn't view things that way--it just saw bar as a pointer to a pointer. If one wanted to make bar[1] point somewhere completely different from bar[2], one could perfectly legally do so.
When two-dimensional arrays were added to C, it was not necessary to maintain compatibility with earlier code that declared two-dimensional arrays, because there wasn't any. While it would have been possible to specify that char bar[4][5]; would generate code equivalent to what was shown using the foo[20], in which case a char[][] would have been usable as a char**, it was thought that just as assigning array variables would have been a mistake 99% of the time, so too would have been re-assignment of array rows, had that been legal. Thus, arrays in C are recognized as distinct types, with their own rules which are a bit odd, but which are what they are.
Because int[M][N] and int** are incompatible types.
However, int[M][N] can decay into int (*)[N] type. So the following :
std::is_same<int(*)[1], std::decay<int[1][1]>::type>::value;
should give you true.
Two dimensional arrays are not stored as pointer to pointers, but as a contiguous block of memory.
An object declared as type int[y][x] is a block of size sizeof(int) * x * y, whereas an object of type int ** is a pointer to an int*.
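The two layouts need different function signatures, as the following sketch illustrates. The function names and the constant kCols are invented for the example.

```cpp
#include <cstddef>

constexpr std::size_t kCols = 3;  // row length used in this sketch

// Accepts a true 2D array: the row length is part of the parameter type.
int sum_contiguous(int (*rows)[kCols], std::size_t row_count) {
    int total = 0;
    for (std::size_t i = 0; i < row_count; ++i)
        for (std::size_t j = 0; j < kCols; ++j)
            total += rows[i][j];  // offset computed from sizeof(int[kCols])
    return total;
}

// Accepts a table of row pointers: every row needs an int* stored somewhere.
int sum_pointer_table(int** rows, std::size_t row_count, std::size_t col_count) {
    int total = 0;
    for (std::size_t i = 0; i < row_count; ++i)
        for (std::size_t j = 0; j < col_count; ++j)
            total += rows[i][j];  // rows[i] is loaded from memory, then indexed
    return total;
}
```

An int grid[2][3] can be passed directly to sum_contiguous. To call sum_pointer_table, one first has to build the missing pointers, for example int* table[2] = {grid[0], grid[1]};, which shows that the int** form needs storage the 2D array never had.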

Pointers and dynamically allocated arrays

According to my class notes, you can allocate an array in C++ like
int *A = new int[5];
where A is a pointer to the array.
But then you can access the array as A[3]. Why can you do that? Isn't A a pointer and not the actual array?
The indexing operator[] is actually defined to work on pointers, not on arrays. A[3] is actually a synonym for *(A+3). It works on arrays as a consequence of the fact that arrays can be implicitly converted to pointers to their first element.
A[3] is an alias for *(A+3) which dereferences the pointer.
You are right! Basically, this is syntactic sugar, meaning that the language designers put something in place to make your life a bit easier, but behind the scenes it is doing something quite different. (This point is arguable)
Effectively, what this is doing is taking the pointer location at A, moving the pointer 3 int-sizes up, then giving you the value by dereferencing.
This is the starting point of pointer arithmetic: you can add or subtract values with pointers, as long as you don't move off the array (well, you can move up to right after the array). Basically, the following hold (for A being a pointer to an array with at least i + 1 elements):
A[i]
*(A + i) // this what A[i] means
*(i + A) // addition is commutative
i[A] // this should work, too - and it does!
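The four spellings can be compared directly on a dynamically allocated array. This is a small sketch; the function name spellings_agree and the stored values are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when A[i], *(A + i), *(i + A) and i[A] all name the same element.
bool spellings_agree(std::size_t i) {
    assert(i < 5);
    int* A = new int[5]{10, 20, 30, 40, 50};
    const bool same_value = A[i] == *(A + i) &&
                            *(A + i) == *(i + A) &&
                            *(i + A) == i[A];
    const bool same_address = &A[i] == A + i && &i[A] == A + i;
    delete[] A;
    return same_value && same_address;
}
```

The i[A] form compiles because the built-in subscript only requires one operand to be a pointer and the other an integer, in either order; it is still best avoided in real code.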