C++: why is a dynamic array accessed without the asterisk operator? - c++

for example:
#include <iostream>
using namespace std;
int main(){
int *a;
a = new int[2];
a[1] = 1;
}
From what I understand, a two-element array of int is allocated in "the heap memory" and the pointer a receives the memory address of that newly created array. However, accessing (for example) the second element of the array (a[1]) works without the asterisk operator, and I don't understand why. I'm used to seeing the value stored at the address held by a pointer accessed as *pointername, not as pointername[value].
So my question is: why can we use the subscript operator on a pointer that points to an array, without the asterisk operator?

In C++, applying operator[] to a pointer p with index i is the semantic equivalent of
*(p + i)
and
*(i + p)
You can think of it as syntactic sugar. Also note that this implies that p[N] is equivalent to N[p].
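A minimal sketch of that equivalence, using the allocation from the question. The function name is made up for this example; the asserts only compare addresses, so they hold for any pointer into an array.

```cpp
#include <cassert>

// Hypothetical helper: writes through one spelling, reads through another.
int second_element_all_spellings() {
    int* a = new int[2]{10, 20};   // a points at the first of two ints

    // All four expressions name the same object: the int one past *a.
    assert(&a[1] == a + 1);
    assert(&a[1] == &*(a + 1));
    assert(&a[1] == &*(1 + a));
    assert(&a[1] == &1[a]);        // legal, but deliberately obscure

    a[1] = 1;                      // same as *(a + 1) = 1;
    int result = *(a + 1);

    delete[] a;                    // memory from new[] needs delete[]
    return result;
}
```

Calling second_element_all_spellings() from main should return the 1 that was stored through a[1].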

a[N] is equal to *(a+N) if a is a pointer. Thus, a[1] dereferences the pointer a+1.

An array is a block of memory containing multiple elements of the same size;
the subscript operator just applies an offset from the original pointer to reach the right element. So a[0] is equal to *a, and a[1] is equal to *(a + 1), where the compiler scales the 1 by sizeof(element) when it computes the address. So in a sense you are right: we do use the asterisk operator in the end, it's just hidden behind syntactic sugar.
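To make the scaling visible, here is a small sketch (the function name is invented for illustration). It also shows that *a + 1, without parentheses, is something else entirely: it adds 1 to the value, not to the address.

```cpp
#include <cassert>
#include <cstddef>

// Returns the distance in bytes between a[0] and a[1].
std::ptrdiff_t byte_distance_between_neighbours() {
    int a[2] = {7, 9};

    assert(a[0] == *a);            // a[0] is *a
    assert(a[1] == *(a + 1));      // a[1] is *(a + 1) ...
    assert(*a + 1 == 8);           // ... while *a + 1 is the value 7 plus 1

    // Viewed as bytes, the two elements are sizeof(int) apart.
    const char* first  = reinterpret_cast<const char*>(&a[0]);
    const char* second = reinterpret_cast<const char*>(&a[1]);
    return second - first;
}
```

So the "+ 1" in *(a + 1) counts elements; the multiplication by the element size is done by the compiler.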

The index operator has a peculiar meaning for built-in arrays and pointers: array[index] is equivalent to *(array+index). One side effect is that if you want to obfuscate your code, index[array] means exactly the same thing (for built-in arrays and pointers, of course). There, the index operator is syntactic sugar for a dereference.
For class types, a class can overload operator[] and give it whatever meaning it likes. This is the opposite of syntactic sugar.
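As an illustration of the class case, here is a sketch with a made-up type, ClampedBuffer, whose operator[] does something a plain *(p + i) never would. Nothing here comes from a real library; it only shows that the meaning is up to the class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper, invented for this example only.
class ClampedBuffer {
public:
    // Out-of-range indices are clamped to the last element instead of
    // being turned into *(data_ + i).
    int& operator[](std::size_t i) {
        return data_[i < kSize ? i : kSize - 1];
    }

private:
    static constexpr std::size_t kSize = 4;
    int data_[kSize] = {0, 0, 0, 0};
};

int clamped_write_then_read() {
    ClampedBuffer buf;
    buf[100] = 42;      // calls ClampedBuffer::operator[], lands on slot 3
    return buf[3];
}
```

Note that 100[buf] would not compile here: the index[array] trick only works with the built-in operator.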

Related

Clarification on the relation of arrays and pointers

I wanted some further understanding, and possibly clarification, on a few things that confuse me about arrays and pointers in C++. One of the main things that confuses me is how, when you refer to the name of an array, it could be referring to the array, or to a pointer to the first element, as well as a few other things. In order to better show where my trouble understanding comes from, I'll show a few lines of code, as well as the conclusion that I've drawn from each one.
let's say that I have
int vals[] = {1,50,3,28,32,500};
int* valptr = vals;
Because vals is a pointer to the first value of the array, then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
cout<<valptr<<endl;
The code above all prints out the same value, leading me to conclude two things: first, that valptr and vals are equal, and can be treated in the same way; second, that for some reason adding & and * to vals doesn't seem to refer to anything different, so using them in this way is useless, since they all refer to the same value.
cout<<valptr[1]<<endl;//outputs 50
cout<<vals[1]<<endl;//outputs 50
cout<<*vals<<endl;//outputs 1
cout<<*valptr<<endl;//outputs 1
The code above reinforces my impression that valptr and vals are the same, since whenever I do something to vals and do the same thing to valptr, both yield the same result.
cout<<*(&valptr +1)-valptr<<endl;
cout<<endl;
cout<< *(&vals + 1) -vals<<endl;
Now that we have established what I know, or the misconceptions that I may have, we can move on to the two main problems that I have.
The first confusion I have is with cout<< *(&vals + 1) -vals<<endl; I know that this outputs the size of the array, and the general concept on how it works, but I'm confused on several parts.
As shown earlier, if
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
all print out the same value, then why do I need the * and & in cout<< *(&vals + 1) -vals<<endl;? I know that if I do vals+1 it just refers to the address of the next element in the array, meaning that *(vals+1) returns 50.
This brings me to my first question: Why does the & in *(&vals+1) refer to the next address out the array? Why is it not the same output as (vals+1), in the way that *(&vals) and (vals) have the same output?
Now to my second question. We know that cout<< *(&vals + 1) -vals<<endl; is a valid statement, successfully printing the size of the array. However, as I stated earlier, in every other instance valptr could be substituted for vals interchangeably. Yet when writing cout<<*(&valptr +1)-valptr<<endl; I get 0 returned, instead of the expected value of 6. How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
I appreciate any help that anyone reading this can give me
Arrays and pointers are two different types in C++ that have enough similarities between them to create confusion if they are not understood properly. The fact that pointers are one of the most difficult concepts for beginners to grasp doesn't help either.
So I feel a quick crash course is needed.
Crash course
Arrays, easy
int a[3] = {2, 3, 4};
This creates an array named a that contains 3 elements.
Arrays have defined the array subscript operator:
a[i]
evaluates to the i'th element of the array.
Pointers, easy
int val = 24;
int* p = &val;
p is a pointer that holds the address of the object val; in other words, it points to val.
Pointers have the indirection (dereference) operator defined:
*p
evaluates to the value of the object pointed to by p.
Pointers, acting like arrays
Pointers give you the address of an object in memory. It can be a "standalone object" like in the example above, or it can be an object that is part of an array. Neither the pointer type nor the pointer value can tell you which one it is; only the programmer knows.
That is why pointers also have the array subscript operator defined:
p[i]
evaluates to the ith element to the "right" of the object pointed to by p. This assumes that the object pointed to by p is part of an array (except for p[0], where it doesn't need to be part of an array).
Note that p[0] is equivalent to (the exact same as) *p.
Arrays, acting like pointers
In most contexts array names decay to a pointer to the first element in the array. That is why many beginners think that arrays and pointers are the same thing. In fact they are not. They are different types.
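A small sketch of the decay, under the assumption of a C++11 compiler. The helper sum is hypothetical; the static_asserts just confirm that the array's own type is int[3] and that the decayed type is int*.

```cpp
#include <cassert>
#include <type_traits>

int a[3] = {2, 3, 4};

// The array's own type keeps its length ...
static_assert(std::is_same<decltype(a), int[3]>::value, "a is an int[3]");
// ... while the decayed type is a plain pointer to the first element.
static_assert(std::is_same<std::decay<decltype(a)>::type, int*>::value,
              "a decays to int*");
static_assert(sizeof(a) == 3 * sizeof(int), "sizeof sees the whole array");

// Hypothetical helper: the argument arrives as a pointer, so the length
// has to be passed separately.
int sum(const int* p, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += p[i];
    return total;
}

int sum_of_a() {
    int* p = a;             // decay: same as &a[0]
    assert(p == &a[0]);
    assert(*p == p[0]);
    return sum(a, 3);       // decay again at the call
}
```

sizeof is one of the contexts without decay, which is why sizeof(a) reports the size of all three ints.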
Back to your questions
Because vals is a pointer to the first value of the array
No, it's not.
... then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
Because the premise is false the rest of the sentence is also false.
valptr and vals are equal
No they are not.
can be treated in the same way
Most of the time, yes, because of the array-to-pointer decay. However, that is not always the case, e.g. when the array is the operand of sizeof or of the & (address-of) operator.
Why does the & in *(&vals+1) refer to the next address out the array?
&vals is one of the few situations where vals doesn't decay to a pointer. &vals is the address of the array and is of type int (*)[6] (pointer to array of 6 ints). That's why &vals + 1 is the address of a hypothetical second array just to the right of the array vals.
How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
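That's simply how the language works. In most cases the name of the array decays to a pointer to the first element of the array, except in the few situations that I've mentioned. valptr, on the other hand, is a real pointer variable, so &valptr is an int** and &valptr + 1 points just past that variable, not past the array. Dereferencing it reads unrelated memory, which is undefined behaviour, so the 0 you saw is not guaranteed.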
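The type difference behind this can be checked directly. The sketch below reuses vals and valptr from the question; the two function names are invented. It measures how far &vals + 1 moves, and gets the element count with sizeof instead of the pointer trick.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int vals[] = {1, 50, 3, 28, 32, 500};
int* valptr = vals;

static_assert(std::is_same<decltype(&vals), int (*)[6]>::value,
              "&vals points to the whole array");
static_assert(std::is_same<decltype(&valptr), int**>::value,
              "&valptr points to the pointer variable");

// Number of bytes that adding 1 to &vals skips.
std::size_t step_of_array_pointer() {
    const char* here = reinterpret_cast<const char*>(&vals);
    const char* next = reinterpret_cast<const char*>(&vals + 1);
    return static_cast<std::size_t>(next - here);
}

// Element count without the *(&vals + 1) trick.
std::size_t element_count() {
    return sizeof(vals) / sizeof(vals[0]);
}
```

The sizeof form is usually the safer way to get the count, since it never forms or dereferences a pointer past the array.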
More crash course
A pointer to the array is not the same thing as a pointer to the first element of the array. Taking the address of the array is one of the few instances where the array name doesn't decay to a pointer to its first element.
So &a is a pointer to the array, not a pointer to the first element of the array. Both the array and the first element of the array start at the same address so the values of the two (&a and &a[0]) are the same, but their types are different and that matters when you apply the dereference or array subscript operator to them:
Expression | Expression type                | Dereference / array subscript expression | Dereference / array subscript type
a          | int[3]                         | *a / a[i]                                | int
&a         | int (*)[3] (pointer to array)  | *&a / (&a)[i]                            | int[3]
&a[0]      | int *                          | *&a[0] / (&a[0])[i]                      | int
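The types in the table can be verified at compile time. This is only a sketch: the alias bare and the function name are made up for the example. Note that subscripting &a needs parentheses, (&a)[0], because &a[0] parses as &(a[0]).

```cpp
#include <cassert>
#include <type_traits>

int a[3] = {2, 3, 4};

// decltype of an lvalue expression yields a reference, so strip it first.
template <class T>
using bare = typename std::remove_reference<T>::type;

static_assert(std::is_same<decltype(a), int[3]>::value, "");
static_assert(std::is_same<bare<decltype(*a)>, int>::value, "");
static_assert(std::is_same<bare<decltype(a[0])>, int>::value, "");

static_assert(std::is_same<decltype(&a), int (*)[3]>::value, "");
static_assert(std::is_same<bare<decltype(*&a)>, int[3]>::value, "");
static_assert(std::is_same<bare<decltype((&a)[0])>, int[3]>::value, "");

static_assert(std::is_same<decltype(&a[0]), int*>::value, "");
static_assert(std::is_same<bare<decltype(*&a[0])>, int>::value, "");
static_assert(std::is_same<bare<decltype((&a[0])[1])>, int>::value, "");

// Same starting address, different types: both routes reach element 0.
int first_via_each_route() {
    assert(static_cast<void*>(&a) == static_cast<void*>(&a[0]));
    return (*&a)[0] + (&a[0])[0];
}
```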

Understanding difference/similarities with array and pointers in c++ [duplicate]

I am trying to understand arrays and pointers in c++.
In a past project, I created a pointer and then assigned that pointer to an array of the same type. I've read previous posts, though, that say you can't assign a pointer to an array. Also, from the pointer, I was able to use an index to get a certain element (square brackets []).
So I was wondering if anyone could explain to me arrays and pointers in C++ and why something like this is possible, I've tried looking for it online, but haven't found anything.
Example,
#include <iostream>
using namespace std;
int main()
{
int *pointer = new int[10];
for (int i = 0; i < 10; i++)
{
pointer[i] = i;
}
for (int i = 0; i < 10; i++)
{
cout << pointer[i];
}
delete[] pointer; // release the array allocated with new[]
return 0; // 0 signals success
}
The main difference between arrays and pointers is that they are completely different things.
An array is a collection of objects, which is laid out contiguously in memory. For example, int x[5] defines an array named x, which is a collection of 5 integers, laid out side by side in memory. Individual elements in the array may be accessed using "array syntax" of the form x[i] where i is an integral value with values between 0 and 4. (Other values of i will result in undefined behaviour).
A pointer is a variable which holds a value that is an address in memory. For example, int *p defines p as a pointer to an int, and it can be initialised with the address of a variable of type int. For example, p = &some_int causes p to contain the address of some_int. When that is done, the notation *p (called dereferencing) provides access to the pointed-to variable. For example, *p = 42 will set some_int to have the value 42.
You'll notice, in the description above, I have not used the word "pointer" in describing an array, nor have I used the word "array" to describe a pointer. They are completely different things.
However, they can be used in ways that makes them seem the same, because of a few rules in the language. Firstly, there is a conversion called the "array-to-pointer" conversion. Because of this, it is possible to do
int x[5];
int *p = x;
The initialisation of p actually works by using the array-to-pointer conversion. Because it is being used to initialise a pointer, the compiler implicitly converts x to a pointer, equal to the address of x[0]. To do this explicitly (without the compiler silently and sneakily doing a conversion) you could have written
int *p = &x[0];
and got exactly the same effect. Either way, the assignment *p = 42 will subsequently have the effect of assigning 42 to x[0].
That suggests there is a relationship between expressions involving pointers and expressions involving (the name of) arrays. If p is equal to &x[0], then
p + i is equivalent to &x[i]; AND
*(p + i) is equivalent to x[i].
The language rules of C and C++ make these relationships symmetric, so (look carefully here)
x + i is equivalent to &x[i]; AND
*(x + i) is equivalent to x[i]
and, with pointers
p + i is equivalent to &p[i]; AND
*(p + i) is equivalent to p[i]
Which basically means that pointer syntax can be used to work with arrays (thanks to the array-to-pointer conversion) AND array syntax can be used to work with pointers.
Really bad textbooks then go on from this and conclude that pointers are arrays and that arrays are pointers. But they are not. If you find textbooks which say such things, burn them. Arrays and pointers are different things entirely. What we have here is a syntactic equivalence - even though arrays and pointers are different things entirely, they can be worked on using the same syntax.
One of the differences - where the syntactic equivalence does not apply - is that arrays cannot be reassigned. For example;
int x[5];
int y[5];
int *p = y; // OK - array-to-pointer conversion
x = y; // error since x is an array
x = p; // error since x is an array
The last two statements will be diagnosed by a C or C++ compiler as an error, because x is an array.
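If the goal was to copy the contents, one common option is std::copy from the standard library. A minimal sketch (the function name is invented for the example):

```cpp
#include <algorithm>
#include <cassert>

int copy_then_read_last() {
    int x[5] = {0, 0, 0, 0, 0};
    int y[5] = {1, 2, 3, 4, 5};

    // x = y;  // would not compile: arrays are not assignable
    std::copy(y, y + 5, x);   // copies the five elements instead

    int* p = y;               // a pointer, by contrast, can be re-aimed
    p = x;
    assert(p == &x[0]);
    return p[4];
}
```

Here y and y + 5 both decay to pointers, so even the copy relies on the array-to-pointer conversion.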
Your example
int *pointer = new int[10];
is a little different again. pointer is still not an array. It is a pointer, initialised with a "new expression", which dynamically allocates an array of 10 integers. But because of the syntactic equivalence of pointers and arrays, pointer can be treated syntactically AS IF it is an array of 10 elements.
Note: the above is concerned with raw arrays. The C++ standard library also has a type named std::array which is a data structure which contains an array, but behaves somewhat differently than described here.
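For comparison, a short sketch of two of those differences with std::array (C++11): it can be assigned as a whole, and it does not convert to a pointer implicitly. The function name is made up.

```cpp
#include <array>
#include <cassert>

int std_array_demo() {
    std::array<int, 5> x = {{0, 0, 0, 0, 0}};
    std::array<int, 5> y = {{1, 2, 3, 4, 5}};

    x = y;                     // whole-array assignment is allowed
    assert(x == y);            // and so is element-wise comparison

    // int* p = x;             // would not compile: no implicit decay
    int* p = x.data();         // the pointer has to be asked for
    assert(p == &x[0]);

    return static_cast<int>(x.size()) + p[4];
}
```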
A pointer is in reality a variable which keeps the address of the memory it points to. The memory itself can be allocated by one of the heap management facilities such as malloc or new. So, in general, you ask for a certain amount of memory to be allocated and get its address back, which you keep in a pointer variable.
Arrays in C and C++ are contiguous chunks of memory. So, in terms of memory, there is no difference between allocating 40 bytes and allocating an array of 10 integers (assuming that an int is 4 bytes long).
C and C++ also track the type of the data which the allocated memory contains. The languages provide a feature named 'pointer arithmetic'. This means that you can add to a pointer or subtract from it, and the language scales the value by the size of the pointed-to type. I.e.
int *a = new ...
int b = *(a+4);
In the above case the value of 'b' will be the value stored in memory at address 'a' plus an offset of 16 bytes (again assuming a 4-byte int).
The above is very similar to array indexing arithmetic, and the language allows you to use array indexing instead of the above:
b = a[4];
So, in this sense both worlds intersect and you can interchangeably use either pointer arithmetic or array arithmetic in the language.
Arrays do not have to be allocated on the heap, and pointers do not have to address the heap only. You can have an array allocated on the stack or in the global scope and have a pointer to it:
int myarray[10];
int *pointer = myarray;
Now you can apply either pointer arithmetic or array indexing to the pointer. The following are equivalent (if you did not advance the pointer):
myarray[3]
pointer[3]
*(pointer + 3)
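The "if you did not advance the pointer" caveat is easy to see in a sketch. The values and the function name are arbitrary, chosen only for illustration:

```cpp
#include <cassert>

int advanced_pointer_demo() {
    int myarray[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
    int* pointer = myarray;

    // While pointer still addresses myarray[0], all three agree.
    assert(myarray[3] == pointer[3]);
    assert(myarray[3] == *(pointer + 3));

    ++pointer;                           // now addresses myarray[1]
    assert(pointer[2] == myarray[3]);    // indices shift by one
    assert(pointer[-1] == myarray[0]);   // negative offsets are allowed
                                         // while staying inside the array
    return pointer[3];                   // this is myarray[4]
}
```

Indexing a pointer is always relative to where the pointer currently points, not to the start of the array.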
Hope it clarifies the issue for you.
From wikipedia:
A pointer references a location in memory, and obtaining the value
stored at that location is known as dereferencing the pointer
Arrays are the contiguous memory locations and their location in memory is referenced by a pointer.
int *pointer = new int[10];
for (int i = 0; i < 10; i++)
{
pointer[i] = i;
}
The code above actually accesses the address pointer + i * sizeof(int), thanks to operator[] (treating pointer in that formula as a plain integer address, so no pointer arithmetic is implied there).

Is the meaning of [][] context-specific?

I always thought that x[i] is equivalent to *(x+i).
So x[i][j] would be equivalent to *(*(x+i)+j), which in this case implies that a 2D array must be implemented as an array of pointers, each of which has to be dereferenced.
But I learned that you can create a 2D array on the heap this way :
char (*arr)[256]=malloc(512*256);
If the former hypothesis is right, then arr[i][j] would access an unauthorized location (since we are dereferencing two times).
Is my former hypothesis wrong in case of 2d arrays ?
x[i] is indeed equivalent to *(x+i). Pointer arithmetic is used in the latter form.
In case of a 2D array type array[x][y];, then you have to apply the above rule in several steps.
array when used in an expression, decays to a pointer to the first element, in this case of type type(*)[y] - an array pointer pointing at the first array in the array of arrays. Do not confuse an array pointer with "an array of pointers", which means something different entirely.
Therefore array + i performs pointer arithmetic on such an array pointer. *(array + i) gives the pointed-at item, a 1D array.
If you for some reason unknown want to access an individual item in the 2D array by using pointer arithmetic only, you would therefore have to write something obscure like this:
*( *(array + i) + j)
If the former hypothesis is right, then arr[i][j] would access an
unauthorized location (since we are dereferencing two times).
Well, it wasn't right. When you have an array pointer such as char (*arr)[256], then you can access the items as arr[i][j]. Note that arr[i] performs the same kind of pointer arithmetic, so you get "the char[256] array number i". And in that array you then access item j.
This is actually the reason why this syntax is used. Had you written the malloc in a more type-correct way, like this:
char (*arr)[512][256]=malloc( sizeof(char[512][256]) );
then you would have to de-reference the array pointer before using it, (*arr)[i][j] or otherwise you would get pointer arithmetic on whole 2D arrays.
The question Correctly allocating multi-dimensional arrays explains and illustrates the differences between look-up tables based on arrays of pointers and true multi-dimensional arrays.
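A C++ flavoured sketch of the same idea, using new instead of malloc. The row count 4 and the function name are assumptions for the example. It checks that arr[i][j] and *(*(arr + i) + j) reach the same char, and that on typical flat-address platforms this char sits i * 256 + j bytes from the start, with no table of row pointers involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

std::size_t byte_offset_of(std::size_t i, std::size_t j) {
    // new char[4][256] yields a char (*)[256]: a pointer to the first row.
    char (*arr)[256] = new char[4][256]();

    arr[i][j] = 'x';
    assert(*(*(arr + i) + j) == 'x');   // same element, spelled out

    // arr + i skips i whole rows; the row then decays to char* and
    // + j skips j chars within it.
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&arr[0][0]);
    const std::uintptr_t elem = reinterpret_cast<std::uintptr_t>(&arr[i][j]);

    delete[] arr;
    return static_cast<std::size_t>(elem - base);
}
```

The first dereference does not load a stored pointer from memory; it only changes the type from "pointer to row" to "row", which then decays.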

Pointer arithmetic with multidimensional Array Notation

This is my first question :)
double MyArray[][6];
double* Myptr = MyArray[0];
So I've been wondering why, in pointer arithmetic, I can write a pointer that moves in a single dimension like this,
*(Myptr + i);
but if I try to move through the dimensions of the array using a for loop, the compiler won't let me
*(*(Myptr + i) + j);
However it does let me use this notation with the array itself.
*(*(MyArray + i) + j);
I wanted to know why there is this restriction. Or maybe I am writing it down incorrectly?
Myptr is a pointer to double, so *(Myptr + i) is a double, and while you can add j to that, you cannot dereference the result.
Had you declared Myptr like this:
double (*Myptr)[6] = MyArray;
then Myptr would be a pointer to an array of 6 doubles. Consequently *(Myptr + i) would be a double[6], and the expression would work.
That you can index arrays using this syntax is unfortunately surprising to people who see it the first time and have never been taught about pointer decay. A funny thing about arrays in C and C++ is that they decay into pointers in almost all circumstances. What this means is that very nearly always when you use an array (exceptions exist), the array is implicitly converted to a pointer to its first element and the rest is done with pointer arithmetic.
For example, this is the case when you write MyArray[i]. The standard defines MyArray[i] to mean *(MyArray + i) (the standard uses more parentheses, but this is what it boils down to), which makes sense when you understand that MyArray decays to a pointer, i is added to this pointer, and the resulting pointer is dereferenced. It also explains why i[MyArray] is equally valid (if in bad style).
In the context of multidimensional arrays, it is important to understand that a multidimensional array is merely an array of arrays. MyArray is, in your case, an array of arrays of doubles. MyArray[0] is, then, an array of doubles. In pointer decay, MyArray decays to a pointer to array of doubles, and when you dereference that pointer and work with the array it points to, then that array also decays to a pointer (to double) when you work with it. It's decay all the way down.
MyArray has type double[][6] and Myptr has type double*.
Now arrays and pointers are different, but this is not important in this context. Both variables have array/pointer thingies in their types, but MyArray has two (count them: [] and [6]) and Myptr has only one (count it: *). Which is why you can apply dereferencing (that's the * operator) to MyArray twice, and to Myptr only once.
Now if you want to have a pointer that can be used in much the same way you use MyArray, this is possible, but its type will not be double** (because arrays and pointers are different). You write it this way:
double (*MyPointer2)[6] = MyArray;
This one has two array/pointer thingies in its type so you can apply dereferencing twice to it.
In this case, Myptr is only one-dimensional, because it was assigned to a specific "array" in MyArray.
Because Myptr is a double*. A pointer to a double. It has no knowledge of the array dimensions.
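The two pointer types can be put side by side in a sketch. The array contents, the name MyRowPtr, the alias bare and the function element are all invented here; the question's declaration had no size or initialiser.

```cpp
#include <cassert>
#include <type_traits>

double MyArray[2][6] = {{0, 1, 2, 3, 4, 5}, {10, 11, 12, 13, 14, 15}};

double* Myptr = MyArray[0];            // pointer to one double
double (*MyRowPtr)[6] = MyArray;       // pointer to a row of 6 doubles

template <class T>
using bare = typename std::remove_reference<T>::type;

// One level of indirection is available here ...
static_assert(std::is_same<bare<decltype(*(Myptr + 1))>, double>::value, "");
// ... and two here, because the first dereference yields an array.
static_assert(
    std::is_same<bare<decltype(*(MyRowPtr + 1))>, double[6]>::value, "");

double element(int i, int j) {
    assert(*(*(MyRowPtr + i) + j) == MyArray[i][j]);
    return *(*(MyRowPtr + i) + j);
}
```

With Myptr, the inner *(Myptr + i) is already a double, so there is nothing left for the outer * to dereference.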

Pointers and dynamically allocated arrays

According to my class notes, you can allocate an array in C++ like
int *A = new int[5]
where A is a pointer to the array.
But then you can access the array as A[3]. Why can you do that? Isn't A a pointer and not the actual array?
The indexing operator[] is actually defined to work on pointers, not on arrays. A[3] is actually a synonym for *(A+3). It works on arrays as a consequence of the fact that arrays can be implicitly converted to pointers to their first element.
A[3] is an alias for *(A+3) which dereferences the pointer.
You are right! Basically, this is syntactic sugar, meaning that the language designers put something in place to make your life a bit easier, but behind the scenes it is doing something quite different. (This point is arguable)
Effectively, what this is doing is taking the pointer location at A, moving the pointer 3 int-sizes up, then giving you the value by dereferencing.
This is the starting point of pointer arithmetic: you can add or subtract values with pointers, as long as you don't move off the array (well, you can move up to right after the array). Basically, the following hold (for A being a pointer to an array with at least i + 1 elements):
A[i]
*(A + i) // this is what A[i] means
*(i + A) // addition is commutative
i[A] // this should work, too - and it does!
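The "right after the array" rule is what makes the usual begin/end loop work. A minimal sketch, with an invented function name and arbitrary values:

```cpp
#include <cassert>

int sum_with_end_pointer() {
    int* A = new int[5]{1, 2, 3, 4, 5};

    int* end = A + 5;        // one past the last element: fine to form
                             // and compare, not fine to dereference
    int total = 0;
    for (int* p = A; p != end; ++p) {
        total += *p;
    }

    assert(A[3] == *(A + 3));
    assert(end - A == 5);    // pointer difference counts elements

    delete[] A;
    return total;
}
```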