Multi-dimensional arrays and pointers in C++ - c++

Let's say I declare and initialize the following multi-dimensional array in C++:
unsigned a[3][4] = {
{12, 6, 3, 2},
{9, 13, 7, 0},
{7, 4, 8, 5}
};
After which I execute this code:
cout << a << endl; // output: 0x7fff5afc5bc0
cout << a + 1 << endl; // output: 0x7fff5f0afbd0
cout << *a << endl; // output: 0x7fff5afc5bc0
cout << *a + 1 << endl; // output: 0x7fff5f0afbc4
I just don't understand what is happening here.
a is the address of the first element, right? In single-dimensional arrays, *a should be the value of the first element, but instead it's the same as a?! What does *a even mean in this context?
Why is a + 1 different from *a + 1?

You should try and find some good documentation about pointers, arrays, and array-to-pointer decay. In your case, unsigned a[3][4] is a bi-dimensional array of type unsigned [3][4]. Whenever you refer to it as a, it decays to a pointer to unsigned[4], so the decayed type of a is unsigned (*)[4]. Therefore dereferencing it gives you an array, so *a is the array [12, 6, 3, 2] (technically it's the pointer to the first element into the array).
Now, a+1 means "increment the pointer to unsigned[4] by 1", in your case it "jumps" 4 unsigneds in the memory, so now a+1 points to the "row" indexed by 1 in your example. Dereferencing it *(a+1) yields the array itself ([9,13,7,0]) (i.e. the pointer to the first element of it), and dereferencing again gives you the first element, i.e. **(a+1) equals 9.
On the other hand, *a+1 first dereferences a, so you get the first row, i.e. [12,6,3,2] (again, technically the pointer to the first element of this row). You then increment it by one, so you end up pointing at the element 6. Dereferencing it again, *(*a+1), yields 6.
It may be helpful to define a equivalently as
typedef unsigned T[4]; // or (C++11) using T = unsigned[4];
T a[3]; // Now it's a bit more clear how dereferencing works

Two dimensional array is array of array.
You can visualize a as
a = { a[0], a[1], a[2]}
and a[0], a[1], a[2] as
a[0] = { a[0][0], a[0][1], a[0][2], a[0][3]};
a[1] = { a[1][0], a[1][1], a[1][2], a[1][3]};
a[1] = { a[2][0], a[2][1], a[2][2], a[2][3]};
Analysis of your first question
a is the address of the first element, right?
Yes a is the address of the first element, and first element of a is a[0] which is the address of the first element of a[0][0].
*a should be the value of the first element, but instead it's the same as a?
Yes *a should be the value of the first element that refer a[0]. And we see a[0] is the address of a[0][0] so as a . Thus *a have same value as a which is the address of a[0][0].
What does *a even mean in this context?
Previously answered, *a is the address of first arrays first element a[0][0], and *(a+1) is the address of second arrays first element a[1][0].
And analysis of your second question
Why is a + 1 different from *a + 1?
At this time perhaps you can answer your owns question.
a is the address of a[0] then
a+1 is the address of a[1] which hold address of a[1][0].
You can print the value by
cout<<**(a+1)<<endl; // 9
Other way *a is the value of a[0] which is the address of a[0][0].
So *a+1 is the address of a[0][1]. You can print the value as
cout<<*(*a+1)<<endl; // 6

Related

Question about the array's address in C++

I have a question about the array's address.
Is *(a + *a) the same as a[3]? If it is, please give me some explanation.
int main() {
int a[] = { 1, 2, 3, 4, 5, 6};
*(a + *a);
}
Is (a + *a) same as a[3]??
No, it is not.
No. *a is equal to 1, so cout << a + 1 will output address of a incremented by 4, which is the size of an int type. This is because a is an array of int, so every increment will add it's address by sizeof(int), which is 4.
a is the address of the first element of the array. *a is the value located at the first element of the array. So a+*a is the sum of those two values, which is not the same as a[3].
If the first element of the array happened to be 3, then *(a+*a) would be the same as a[3]. But that is not the situation you have. i.e. *a=3, (a+3) is the location of the fourth element of a, *(a+3) is the element at the fourth location of the array, which is also accessible using a[3].
To generalize for a given X* x or X x[] that can decay into a pointer:
*(x + y) is the same as x[y].
*x is the same as x[0]
In effect *(a + *a) is just a really confusing way of writing a[a[0]], so the way it evaluates depends entirely on what a[0] is.
In this case it's 1, so you get a[a[0]] or a[1] as the result.

Why do you have to specify an index number when assigning a value to the first element in an array?

Why do you have to specify an index number when assigning a value to the first element in an array?
Let's say we had this array:
int childer[] = {10, 30, 50};
then both childer and childer[0] would be accessing the first element of the array, so why can't you dochilder = 15;?
Arrays are similar to pointers. For example an array decays to a pointer when passed to a function, or even when the operators [] and the unary * are applied to them (Thanks StoryTeller). So
childer[0]
is equivalent to
*childer
and you can do
*childer = 15;
to set the first element to 15.
childer
on the other hand is the same as
&childer[0]
childer[0] and childer have two different meanings. childer[0] is the first element of the array while childer is a pointer to the first element. In order to get the first element using the pointer, you have to use *childer.
so,
*childer = 15;
can be used for assignment.
you could dereference the name of the array, but just trying to do
childer = -999;
is completely wrong...
see the example below:
int main() {
int childer[] = { 10, 30, 50 };
cout << childer[0] << endl;
cout << *childer << endl;
//now write
childer[0] = 200;
cout << childer[0] << endl;
cout << *childer << endl;
//now write again
*childer = -999;
cout << childer[0] << endl;
cout << *childer << endl;
return 0;
}
First of all
int childer[] = { 10, 20, 30 };
...
childer = ....; <--- THIS WILL NOT COMPILE, as childer is not an lvalue.
you cannot say childer = 20;, you can say childer[0] = 20; or *childer = 20;, or even more *(childer + 1) = 20; (to access other elements than the first), but you cannot use childer as the whole array in an assignment. Simply, neither C nor C++ allow this.
Arrays and pointers are very different things. What confuses you, is that neither C, nor C++ have a way to reference the array as a whole, and the array identifier, by itself, means the address of the first element (and not a pointer but a reference). What I mean by a reference, and I don't use the word pointer for just now, is that a pointer normally refers to a variable of pointer type, and as a variable, can be modified (mainly because the array position cannot be changed at runtime). You cannot write something like array = something_else;, when array is an array. Array names cannot be used by themselves as lvalues, only as rvalues. This can be done with pointers, like
int childer[] = { 10, 20, 30, };
int *pointer = childer;
and then
pointer = some_other_place_to_point_to;
because pointer is a pointer and childer isn't.
The meaning that both languages attribute to the array name is just a pointer rvalue pointing to the address of the first element (and typed as the cell type, continue reading for an explanation on this). And this is not the same as the address of the array (both are different things, demonstrated below) as the first (the array name is a pointer to cell type, while the array address &childer is a pointer to the whole array type). This can be illustrated with the next program, which tries to show you the difference in sizes for the array element, the whole array, the array pointer and the cell pointer:
#include <stdio.h>
#include <sys/types.h>
#define D(x) __FILE__":%d:%s: " x, __LINE__, __func__
#define P(fmt, exp) do { \
printf(D(#exp " == " fmt "\n"), (exp)); \
} while(0)
#define TR(sent) \
sent; \
printf(D(#sent ";\n"))
int main()
{
int i = 1;
TR(double a[10]);
P("%p", a);
P("%p", &a);
P("%d", sizeof a);
P("%d", sizeof &a);
P("%d", sizeof *a);
P("%d", sizeof *&a);
return 0;
}
which executes into (I selected double as the type of cell to make evident the differences while trying to be more architecture independent):
$ pru
pru.c:16:main: double a[10]; <-- array of ten double's (a double is 8 bytes in many architectures)
pru.c:18:main: a == 0xbfbfe798 <-- address values are
pru.c:19:main: &a == 0xbfbfe798 <-- equal for both.
pru.c:20:main: sizeof a == 80 <-- size of whole array.
pru.c:21:main: sizeof &a == 4 <-- size of array address (that's a pointer to array)
pru.c:22:main: sizeof *a == 8 <-- size of pointed cell (first array element)
pru.c:23:main: sizeof *&a == 80 <-- size of pointed array.
But you are wrong when saying that you cannot use an expression to access the other elements, as you can, use array + 1 to mean the address of the second element, and array + n to access the address of the n + 1 element. Evidently array + 0 is the same as array (the 0 number is also neutral in pointer addition), so you can get on it easily. Some people like to write it for fun also as 0+array (to abuse of the commutativity of the + operator) and go further, to write something like 0[array], to refer to the first element of the array (this time the access being to the element, and not to its address) I don't know if this continues to be valid with the new standards, but I'm afraid it continues, perhaps, to allow for legacy code to be compilable (adding to the controversy of what should be conserved and what can be eliminated on standard revisions)

How are 2-Dimensional Arrays stored in memory?

#include<bits/stdc++.h>
using namespace std;
int main()
{
int a[101][101];
a[2][0]=10;
cout<<a+2<<endl;
cout<<*(a+2)<<endl;
cout<<*(*(a+2));
return 0;
}
Why are the values of a+2 and *(a+2) same? Thanks in advance!
a is a 2D array, that means an array of arrays. But it decays to a pointer to an array when used in appropriate context. So:
in a+2, a decays to a pointer to int arrays of size 101. When you pass is to an ostream, you get the address of the first element of this array, that is &(a[2][0])
in *(a+2) is by definition a[2]: it is an array of size 101 that starts at a[2][0]. It decays to a pointer to int, and when you pass it to an ostream you get the address of its first element, that is still &(a[2][0])
**(a+2) is by definition a[2][0]. When you pass it to an ostream you get its int value, here 10.
But beware: a + 2 and a[2] are both pointers to the same address (static_cast<void *>(a+2) is the same as static_cast<void *>(a[2])), but they are pointers to different types: first points to int array of size 101, latter to int.
I'll try to explain you how the memory is mapped by the compiler:
Let's consider a more pratical example multi-dimentional array:
int a[3][3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
You can execute the command
x/10w a
In GDB and look at the memory:
0x7fffffffe750: 1 2 3 4
0x7fffffffe760: 5 6 7 8
0x7fffffffe770: 9 0
Each element is stored in a int type (32 bit / 4 bytes).
So the first element of the matrix has been stored in:
1) a[0][0] -> 0x7fffffffe750
2) a[0][1] -> 0x7fffffffe754
3) a[0][2] -> 0x7fffffffe758
4) a[1][0] -> 0x7fffffffe75c
5) a[1][1] -> 0x7fffffffe760
6) a[1][2] -> 0x7fffffffe764
7) a[2][0] -> 0x7fffffffe768
...
The command:
std::cout << a + 2 << '\n'
It will print the address 0x7fffffffe768 because of the
pointer aritmetic:
Type of a is int** so it's a pointer to pointers.
a+2 is the a[0] (the first row) + 2. The result is a pointer
to the third row.
*(a+2) deferences the third row, that's {7,8,9}
The third row is an array of int, that's a pointer to int.
Then the operator<< will print the value of that pointer.
A 2-dimensional array is an array of arrays, so it's stored like this in memory:
char v[2][3] = {{1,3,5},{5,10,2}};
Content: | 1 | 3 | 5 | 5 | 10 | 2
Address: v v+1 v+2 v+3 v+4 v+5
To access v[x][y], the compiler rewrites it as: *(v + y * M + x) (where M is the second dimension specified)
For example, to access v[1][1], the compiler rewrites it as *(v + 1*3 + 1) => *(v + 4)
Be aware that this is not the same as a pointer to a pointer (char**).
A pointer to a pointer is not an array: it contains and address to a memory cell, which contains another address.
To access a member of a 2-dimensional array using a pointer to a pointer, this is what is done:
char **p;
/* Initialize it */
char c = p[3][5];
Go to the address specified by the content of p;
Add the offset to that address (3 in our case);
Go to that address and get its content (our new address).
Add the second offset to that new address (5 in our case).
Get the content of that address.
While to access the member via a traditional 2-dimensional array, these are the steps:
char p[10][10];
char c = p[3][5];
Get the address of pand sum the first offset (3), multiplied by the dimension of a row (10).
Add the the second offset (5) to the result.
Get the content of that address.
If you have an array like this
T a[N];
then the name of the array is implicitly converted to pointer to its first element with rare exceptions (as for example using an array name in the sizeof operator).
So for example in expression ( a + 2 ) a is converted type T * with value &a[0].
Relative to your example wuth array
int a[101][101];
in expression
a + 2
a is converted to rvalue of type int ( * )[101] and points to the first "row" of the array. a + 2 points to the third "row" of the array. The type of the row is int[101]
Expression *(a+2) gives this third row that has type int[101] that is an array. And this array as it is used in an expression in turn is converted to pointer to its first element of type int *.
It is the same starting address of the memory area occupied by the third row.
Only expression ( a + 2 ) has type int ( * )[101] while expression *( a + 2 ) has type int *. But the both yield the same value - starting address of the memory area occupied by the third row of the array a.
The first element of an array is at the same location as the array itself - there is no "empty space" in an array.
In cout << a + 2, a is implicitly converted into a pointer to its first element, &a[0], and a + 2 is the location of a's third element, &a[2].
In cout << *(a + 2), the array *(a + 2) - that is, a[2] - is converted into a pointer to its first element, &a[2][0].
Since the location of the third element of a and the location of the first element of the third element of a are the same, the output is the same.

Pointer syntax usage in Array

I have some problem understanding the pointers syntax usage in context with two dimensional arrays, though I am comfortable with 1-D array notation and pointers, below is one of the syntax and I am not able to understand how the following expression is evaluated.
To access the element stored at third row second column of array a we will use the subscripted notation as a[2][1] other way to access the same element is
*(a[2]+1)
and if we want to use it as pointers we will do like this
*(*(a+2)+1)
Though I am able to understand the replacement of *(a[2]+1)
as *(*(a+2)+1) but I don't know how this is getting evaluated.
Please explain with example if possible.
Assume array is stored in row wise order and contain the following elements
int a[5][2]={
21,22,
31,32
41,42,
51,52,
61,62
};
and base address of array is 100(just assume) so the address of a[2] is 108 (size of int =2(another assumption)) So the expression *(*(a+2)+1). How does it gets evaluated does it start from the inside bracket and if it does then after the first bracket we have the value to which 1 is being added rather than the address... :/
To start of with
a[i] = *(a+i);
So
a[i][j] = *(a[i] +j)
and
a[i][j] = *(*(a+i) + j)
How a[i] = *(a+i):
If a is a array then the starting address of the array is given by &a[0] or just a
So when you specify
a[i] this will decay to a pointer operation *(a+i) which is, start at the location a and dereference the pointer to get the value stored in the location.
If the memory location is a then the value stored in it is given by *a.
Similarly the address of the next element in the array is given by
&a[1] = (a+1); /* Array decays to a pointer */
Now the location where the element is stored in given by &a[1] or (a+1) so the value stored in that location is given by *(&a[1]) or *(a+1)
For Eg:
int a[3];
They are stored in the memory as shown below:
a a+1 a+2
------------------
| 100 | 102 | 104|
------------------
&a[0] &a[1] &a[2]
Now a is pointing to the first element of the array. If you know the pointer operations a+1 will give you the next location and so on.
In 2D arrays the below is what the access is like :
int arr[m][n];
arr:
will be pointer to first sub array, not the first element of first sub
array, according to relationship of array & pointer, it also represent
the array itself,
arr+1:
will be pointer to second sub array, not the second element of first sub
array,
*(arr+1):
will be pointer to first element of second sub array,
according to relationship of array & pointer, it also represent second
sub array, same as arr[1],
*(arr+1)+2:
will be pointer to third element of second sub array,
*(*(arr+1)+2):
will get value of third element of second sub array,
same as arr[1][2],
A 2D array is actually a consecutive piece of memory. Let me take a example : int a[3][4] is representend in memory by a unique sequence of 12 integer :
a00 a01 a02 a03 a04 a10 a11 a12 a13 a14 a20 a21 a22 a23 a24
| | |
first row second row third row
(of course it can be extended to any muti-dimensional array)
a is an array of int[4] : it decays to a pointer to int[4] (in fact, it decays to &(a[0]))
a[1] is second row. It decays to a int * pointing to beginning of first row.
Even if arrays are not pointer, the fact that they decay to pointers allows to use them in pointer arithmetic : a + 1 is a pointer to second element of array a.
All that explains why a[1] == *(a + 1)
The same reasoning can be applied to a[i] == *(a+i), and from there to all the expressions in your question.
Let's look specifically to *(*(a+2)+1). As said above, a + 2 is a pointer to the third element of an array of int 4. So *(a + 2) is third row, is an array of int[4] and decays to a int * : &(a[2][0]).
As *(a + 2) decays to an int *, we can still use it as a base for pointer arithmetic and *(a + 2) + 1 is a pointer to the second element of third row : *(a + 2) + 1 == &(a[2][1]). Just dereference all that and we get
*(*(a + 2) + 1) == a[2][1]

Help me understand this Strange C++ code

This was a question in our old C++ exam. This code is driving me crazy, could anyone explain what it does and - especially - why?
int arr[3]={10,20,30};
int *arrp = new int;
(*(arr+1)+=3)+=5;
(arrp=&arr[0])++;
std::cout<<*arrp;
This statement writes to the object *(arr+1) twice without an intervening sequence point so has undefined behavior.
(*(arr+1)+=3)+=5;
This statement writes to the object arrp twice without an intervening sequence point so has undefined behavior.
(arrp=&arr[0])++;
The code could result in anything happening.
Reference: ISO/IEC 14882:2003 5 [expr]/4: "Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression."
(*(arr+1)+=3)+=5;
arr + 1 - element with index 1
*(arr + 1) - value of this element
(arr + 1) += 3 - increase by 3
((arr+1)+=3)+=5 - increase by 5;
so arr[1] == 28
(arrp=&arr[0])++;
arr[0] - value of element 0
&arr[0] - address of element 0
arrp=&arr[0] - setting arrp to point to elem 0
(arrp=&arr[0])++ - set arr to point to elem 1
result: 28
This line:
(*(arr+1)+=3)+=5;
produces the same result as this (see footnote):
arr[1] += 3;
arr[1] += 5;
This line:
(arrp=&arr[0])++;
produces the same result as this (see footnote):
int* arrp = arr+1;
So this line:
std::cout<<*arrp
prints out 28.
But this code leaks memory because int *arrp = new int; allocates a new int on the heap which will be lost on assignment by (arrp=&arr[0])++;
Footnote: Of course I'm assuming an absence of weirdness.
Edit: Apparently some of the lines in fact lead to undefined behavior, due to C++ Standard 5/4. So this really is a crappy exam question.
int arr[3]={10,20,30}; // obvious?
int *arrp = new int; // allocated memory for an int
(*(arr+1)+=3)+=5; // (1)
(arrp=&arr[0])++; // (2)
std::cout<<*arrp; // (3)
(1)
*(arr+1) is the same as arr[1], which means that *(arr+1)+=3 will increase arr[1] by 3, so arr[1] == 23 now.
(*(arr+1)+=3)+=5 means arr[1] is increased by another 5, so it will be 28 now.
(2)
arrp will pont to the address of the first element of arr (arr[0]). The pointer arrp will then be incremented, thus it will point to the second element after the entire statement is executed.
(3)
Prints what arrp points to: the second element of arr, meaning 28.
Well, remember that arrays can be interpreted as pointers
int arr[3]={10,20,30};
int *arrp = new int;
creates an array arr of three integers and an int pointer that gets assigned with a freshly allocated value.
Since assignment operators return a reference to the value that has been assigned in order to allow multi-assignment,
(*(arr+1)+=3)+=5;
is equivalent to
*(arr+1)+=3;
*(arr+1)+=5;
*(arr + 1) refers to the first element of the array arr, therefore arr[1] is effectively increased by eight.
(arrp=&arr[0])++; assigns the address of the first array element to arrp and afterward increments this pointer which now points to the second element (arr[1] again).
By dereferencing it in std::cout<<*arrp, you output arr[1] which now holds the value 20 + 3 + 5 = 28.
So the code prints 28 (and furthermore creates a memory-leak since the new int initially assigned to arrp never gets deleted)
I'll try to answer you by rewriting the code in a simpler way.
int arr[3]={10,20,30};
int *arrp = new int;
(*(arr+1)+=3)+=5;
(arrp=&arr[0])++;
std::cout<<*arrp;
=== equals ===
int arr[3]={10,20,30};//create array of 3 elements and assign them
int *arrp = new int;//create an integer pointer and allocate an int to it(useless)
//(*(arr+1)+=3)+=5;
arr[1] = arr[1] + 3;//arr[1] == arr+1 because it is incrementing the arr[0] pointer
arr[1] = arr[1] + 5;
//(arrp=&arr[0])++;
arrp = &arr[0];//point the integer pointer to the first element in arr[]
arrp++;//increment the array pointer, so this really is now pointing to arr[1]
std::cout<<*arrp;//just print out the value, which is arr[1]
I am assuming you understand pointers and basic c.