What happens in scanf when we use the & operator? [duplicate] - c++

When we use the '&' operator in scanf, do we scan the address or the value stored at that address? For instance, I don't understand how these two programs give the same result.
CODE 1
#include <stdio.h>
int main(){
    int arr[6], i, sum=0;
    for(i=0;i<6;i++){
        scanf("%d", &arr[i]);
        sum+=arr[i];
    }
    printf("%d", sum);
}
CODE 2
#include <stdio.h>
int main(){
    int arr[6], i, sum=0;
    for(i=0;i<6;i++){
        scanf("%d", (arr+i));
        sum+=*(arr+i);
    }
    printf("%d", sum);
}

scanf() requires you to pass it the address of each variable that you want it to write parsed values to. The & operator returns those addresses.
In your examples, &arr[i] and (arr+i) both represent the same address of the i'th array element, and also arr[i] and *(arr+i) both represent accesses to that same element's stored value.

& is the address-of operator, i.e. given an object, it gives you a pointer to that object.
Along with ordinary arithmetic, operator+ applied to a (pointer, integer) pair gives you a pointer to another element of the same array (so long as there is such an element), or a pointer one past the end of the array (if the value is exactly the distance to the end), or undefined behaviour (if the result would be out of range).
a[i] is defined as *(a + i). It dereferences the calculated pointer.
Putting that together, arr + i is an int* expression, whereas arr[i] is an int expression. As we need an int* for scanf, & is used to get one.

From the C Standard (6.5.2.1 Array subscripting)
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is identical to
(*((E1)+(E2))). Because of the conversion rules that apply to the
binary + operator, if E1 is an array object (equivalently, a pointer
to the initial element of an array object) and E2 is an integer,
E1[E2] designates the E2-th element of E1 (counting from zero).
So the expression &arr[i] is equivalent to &( *( arr + i ) ). The operators * and & applied in sequence to the expression arr + i cancel out and may be omitted, so &arr[i] is equivalent to arr + i. Both expressions are pointers to the i-th element of the array arr.

Related

How does a C/C++ compiler distinguish a regular two-dimensional array from an array of pointers to arrays?

A regular statically allocated array looks like this, and may be accessed using the following formulas:
const int N = 3;
const int M = 3;
int a1[N][M] = { {0,1,2}, {3,4,5}, {6,7,8} };
int x = a1[1][2]; // x = 5
int y = *(a1+2+N*1); // y = 5, this is what [] operator is doing in the background
The array is a contiguous region of memory. It looks different with dynamic allocation, where there is an array of pointers to arrays instead:
int** a2 = new int*[N];
for (int i = 0; i < N; i++)
a2[i] = new int[M];
//Assignment of values as in previous example
int x = a2[1][2];
int y = *(*(a2+1)+2); // This is what [] operator is doing in the background, it needs to dereference pointers twice
As we can see, the operations done by the [] operator are completely different for a typical contiguous array and for a dynamically allocated array.
My questions are now the following:
Is my understanding of [] operations correct?
How C/C++ compiler can distinguish which [] operation it should perform, and where it's implemented? I can imagine implementing it myself in C++ by overloading [] operator, but how C/C++ treat this?
Will it work correctly in C language using malloc instead of new? I don't see any reasons why not actually.
For this declaration of an array
int a1[N][M] = { {0,1,2}, {3,4,5}, {6,7,8} };
these records
int x = a1[1][2];
int y = *(a1+2+N*1);
are not equivalent.
The second one is incorrect. The expression *(a1+2+N*1) has the type int[3], which is implicitly converted to a pointer of the type int * when used as an initializer. So the integer variable y is initialized by a pointer, not by the element value.
The expression a1[1] is evaluated like *( a1 + 1 ). The result is a one-dimensional array of the type int[3].
So applying the second subscript operator you will get *( *( a1 + 1 ) + 2 ).
The difference between the expressions for the two-dimensional array and for the dynamically allocated array is that the designator of the two-dimensional array in the expression ( a1 + 1 ) is implicitly converted to a pointer to its first element, of the type int ( * )[3], while the pointer to the dynamically allocated array of pointers keeps its type int **.
In the first case, dereferencing the expression a1 + 1 gives an lvalue of the type int[3], which, used in the expression *( a1 + 1 ) + 2, is again implicitly converted to a pointer of the type int *.
In the second case the expression *( a2 + 1 ) yields an object of the type int *.
In both cases pointer arithmetic is used. The difference is that when you use arrays with the subscript operator, they are implicitly converted to pointers to their first elements.
When you allocate arrays dynamically, you are already dealing with pointers to their first elements.
For example instead of these allocations
int** a2 = new int*[N];
for (int i = 0; i < N; i++)
a2[i] = new int[M];
you could just write
int ( *a2 )[M] = new int[N][M];
Is my understanding of [] operations correct?
int y = *(a1+2+N*1); // y = 5, this is what [] operator is doing in the background
By definition, the way to translate the subscript operators to the corresponding indirection and pointer arithmetic is:
int y = *(*(a1+1)+2)
Which is exactly the same as in the case of int**.
How C/C++ compiler can distinguish which [] operation it should perform
The compiler uses the type system. It knows the types of the expressions and it knows what subscript operation means for each type.
Will it work correctly in C language using malloc instead of new? I don't see any reasons why not actually.
It doesn't matter how an array is created. Subscript operator works the same way with all pointers.
a1 and a2 are different types, and as such, the behavior of operator [] will depend on how that type defines the operator. In this case you're dealing with intrinsic compiler behaviors that conform to the C++ spec, but it could just as well be a std::unique_ptr<>, or MyClass with overloaded operator[]
Each operation yields a result of some specific type. Each type defines which operations are available for it.
Note that an array can decay to a pointer to its first element. So some_array + int_value yields a pointer to an element.
Here is code which exposes types of each step: https://godbolt.org/z/jeKWh5WWW
#include <type_traits>
const int N = 3;
const int M = 4;
int a1[N][M] = { {0,1,2,0}, {3,4,5,0}, {6,7,8,0} };
int** a2 = new int*[N];
// a1: a true two-dimensional array
static_assert(
    std::is_same_v<decltype(a1[0][0]), int&>,
    "value type is reference to int");
static_assert(
    std::is_same_v<decltype(a1[0]), int(&)[M]>,
    "row type is reference to int array");
static_assert(
    std::is_same_v<decltype(a1 + 1), int(*)[M]>,
    "advanced pointer is pointer to array of ints");
static_assert(
    !std::is_same_v<decltype(a1[0]), int*&>,
    "row type is not reference to int pointer");
// a2: a pointer to pointers
static_assert(
    std::is_same_v<decltype(a2[0][0]), int&>,
    "value type is reference to int");
static_assert(
    !std::is_same_v<decltype(a2[0]), int(&)[M]>,
    "row type is not reference to int array");
static_assert(
    std::is_same_v<decltype(a2 + 1), int**>,
    "advanced pointer is pointer to pointer to int");
static_assert(
    std::is_same_v<decltype(a2[0]), int*&>,
    "row type is reference to int pointer");
I think this is a good appendix to the other answers.
How C/C++ compiler can distinguish which [] operation it should perform, and where it's implemented?
The built-in [] operator (that is, not a user-defined overload) always does one thing: It adds its two operands and dereferences the result. E1[E2] is defined to be (*((E1)+(E2))). Here is how this works:
If E1 or E2 is an array, it is automatically converted to a pointer to its first element. This is not a part of the [] operator per se; it is a built-in part of the C and C++ languages. In C, the specific rule is that, whenever an array is used in an expression other than as the operand of sizeof, the operand of unary &, or as a string literal used to initialize an array, it is converted to a pointer to its first element.
Thus, whether the code is written with a pointer or an array, [] always has a pointer operand. You may write an array, but [] always receives a pointer.
The + operator adds an integer to a pointer by adjusting the pointer by the given number of elements: Given a pointer to element j of an array and an integer k to add to it, it produces a pointer to element j+k of the array.
From the pointer to an element, the * operator produces an lvalue for the referenced element.
The combination of automatic array conversion, +, and *, means that A[i] produces an lvalue for element i of the array A.
Here is how this works for the expression A[i][j] where A is an array declared as SomeType A[m][n]:
In A[i][j], A is an array of m arrays of n elements. It is automatically converted to a pointer to its first element (the one with index 0).
Then A[i] produces an lvalue for element i of this array. In other words, the result of A[i] is an array; it is an array of n SomeType objects.
Since the result of A[i] is an array, it is automatically converted to a pointer to its first element.
Then A[i][j] produces an lvalue for element j of that array.
Since the pointer arithmetic operates in units of the pointed-to-type, it includes the scaling for the sizes of the elements. This is what makes the calculation of A[i] scaled by the size of the subarray of n elements.
Will it work correctly in C language using malloc instead of new? I don't see any reasons why not actually.
Sure, if done correctly.

How does *(&arr + 1) - arr work to give the array size? [duplicate]

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };
int arrSize = *(&arr + 1) - arr;
std::cout << arrSize;
I am not able to understand how this works. Can anyone help me with this?
If we "draw" the array together with the pointers, it will look something like this:
+--------+--------+-----+--------+-----+
| arr[0] | arr[1] | ... | arr[6] | ... |
+--------+--------+-----+--------+-----+
^        ^                       ^
|        |                       |
&arr[0]  &arr[1]                 |
|                                |
&arr                             &arr + 1
The type of the expressions &arr and &arr + 1 is int (*)[7]. If we dereference either of those pointers, we get a value of type int[7], and as with all arrays, it will decay to a pointer to its first element.
So what's happening is that we take the difference between a pointer to the first element of *(&arr + 1) (the dereference really makes this UB, but it will still work with any sane compiler) and a pointer to the first element of arr.
All pointer arithmetic is done in the base-unit of the pointed-to type, which in this case is int, so the result is the number of int elements between the two addresses being pointed at.
It might be useful to know that an array will naturally decay to a pointer to its first element, ie the expression arr will decay to &arr[0], which will have the type int *.
Also, for any pointer (or array) p and index i, the expression *(p + i) is exactly equal to p[i]. So *(&arr + 1) is really the same as (&arr)[1] (which makes the UB much more visible).
That program has undefined behaviour. (&arr + 1) is a valid pointer that points "one beyond" arr, and has type int(*)[7], however it doesn't point to an int [7], so dereferencing it is invalid.
It so happens that your implementation assumes there is a second int [7] after the one you declare, and subtracts the location of the first element of that array that exists from the location of the first element of the fictitious array that the pointer arithmetic invented.
You need to explore what the type of the &arr expression is, and how that affects the + 1 operation on it.
Pointer arithmetic works in 'raw units' of the pointed-to type; &arr is the address of your array, so it points to an object of type, "array of 7 int". Adding 1 to that pointer actually adds the size of the type to the address – so 7 * sizeof(int) is added to the address.
However, in the outer expression (subtraction of arr), the operands are pointers to int objects [1] (not arrays), so the 'units' are just sizeof(int), which is 7 times smaller than in the inner expression. Thus, the subtraction results in the size of the array.
[1] This is because, in such expressions, an array variable (such as the second operand, arr) decays to a pointer to its first element; further, your first operand is also an array, as the * operator dereferences the modified value of the array pointer.
Note on Possible UB: Other answers (and comments thereto) have suggested that the dereferencing operation, *(&arr + 1), invokes undefined behaviour. However, looking through this Draft C++17 Standard, there is the vaguest of suggestions that it may not:
6.7.2 Compound Types
...
3    … For purposes of pointer arithmetic (8.5.6) and comparison
(8.5.9, 8.5.10), a pointer past the end of the last element of an
array x of n elements is considered to be equivalent to a pointer to a
hypothetical element x[n].
But I won't claim "Language-Lawyer" status here, as there is no explicit mention in that section about dereferencing such a pointer.
If you have a declaration like this
int arr[] = { 3, 5, 9, 2, 8, 10, 11 };
the expression &arr + 1 will point to the memory after the last element of the array. The value of the expression is equal to the value of the expression arr + 7, where 7 is the number of elements in the array declared above. The only difference is that the expression &arr + 1 has the type int ( * )[7] while the expression arr + 7 has the type int *.
So due to the pointer arithmetic the difference ( arr + 7 ) - arr will yield 7: the number of elements in the array.
On the other hand, dereferencing the expression &arr + 1, which has the type int ( * )[7], gives an lvalue of the type int[7], which in turn, used in the expression *(&arr + 1) - arr, is converted to a pointer of the type int * with the same value as arr + 7, as pointed out above. So the expression will yield the number of elements in the array.
The only difference between these two expressions
( arr + 7 ) - arr
and
*( &arr + 1 ) - arr
is that in the first case we need to specify the number of elements explicitly to get the address of the memory after the last element of the array, while in the second case the compiler itself calculates that address from the array declaration.
As others have mentioned, *(&arr + 1) triggers undefined behavior because &arr + 1 is a pointer to one-past-the end of an array of type int [7] and that pointer is subsequently dereferenced.
An alternate way of doing this would be to convert the relevant pointers to uintptr_t, subtract, and divide by the element size.
int arrSize = static_cast<int>((reinterpret_cast<uintptr_t>(&arr + 1) -
reinterpret_cast<uintptr_t>(arr)) / sizeof *arr);
Or using C-style casts:
int arrSize = (int)(((uintptr_t)(&arr + 1) - (uintptr_t)arr) / sizeof *arr);
This one is simple:
arr is an array that, in most expressions, converts to a pointer to its 0'th element (&arr[0]);
&arr gives a pointer to the whole array (type int (*)[7]) holding the same address;
&arr+1 gives a pointer to the address just past the array, i.e. the address of arr[0] plus sizeof(arr)*1 bytes;
*(&arr + 1) turns that into an int * with the same address, i.e. &arr[0] + 7;
*(&arr + 1) - arr then subtracts the pointer to arr[0], leaving just the element count, sizeof(arr)/sizeof(arr[0]).
So the only tricks here are that arrays in C keep their static type information, including their total size, and that when you add an integer to a pointer, the compiler does not simply add the value to the address: the standards require the address to advance by that value times sizeof() of the pointed-to type, so *(p+idx) gives the same result as p[idx].
The C language is designed to allow for very simplistic compilers, so inside it is full of little tricks like this. I would not recommend using them in production code though. Remember the other developers who may need to read and maintain your code later, and use the simplest and most obvious construct available instead (for this example, that is obviously just using sizeof() directly).

C++: Why do I not need to dereference when initialising dynamic arrays? [duplicate]

Given the code below:
double** vr(new double*[nx]);
for(int i=0;i<nx;++i) { //Loop 200 times
    vr[i]=new double[ny]; // allocate row i before filling it
    for(int j=0;j<ny;++j) { //Loop 200 times
        vr[i][j]=double(i*i*i*j*c1);
    }
}
For vr[i][j] = double(i * i * i * j * c1);
why is it not required to dereference the value using *, for example *(vr[i][j])?
It is just an address, isn't it?
vr[i][j] is a double not an address.
Basically, if p is of type T* then p[i] means *(p+i) and is of type T, where T can be any type.
In your case vr is double**, so vr[i] is double* and vr[i][j] is double.
When you do vr[i] you're already dereferencing because vr[i] is equivalent to *(vr + i). This means that vr[i][j] is double dereferencing, and is equivalent to *(*(vr + i) + j).
See pointer arithmetic for further information.
From dcl.array/6
the subscript operator [] is interpreted in such a way that E1[E2]
is identical to *((E1)+(E2)) ([expr.sub]). Because of the conversion
rules that apply to +, if E1 is an array and E2 an integer, then
E1[E2] refers to the E2-th member of E1. Therefore, despite its
asymmetric appearance, subscripting is a commutative operation.
 — end note ]
Thus, operator[], the subscript operator, works the same as using operator* (unary) for indirection.
int a[5] = {1, 2, 3, 4, 5}; // array
int* pa = a; // pointer to a
std::cout << pa[2]; // (ans. 3) using subscript operator
std::cout << *(pa + 2); // (ans. 3) using indirection, accessing a's 3rd element
The subscript operator ([i]) does the dereferencing. It says to move forward by i "cells" and get the object located there.

passing pointers (the name of array) into function in C/C++

Let's say we create an array like:
int a[4]={1,2,3,4};
Now a is the name of this array, and it can also be used as a pointer to the first element a[0]. So when I want to access the elements in the array, I can use a[ i ] or *(a+i).
Now I have a function:
void print_array(int* array, int arraySize){
for(int i=0; i<arraySize; i++){
cout<<*(array+i)<<endl;
cout<<array[i]<<endl;
}
}
When I pass a[4]={1,2,3,4} into this function using print_array(a,4), for the first line of cout, I fully understand because I use *(a+i) method to access data and a is the pointer I passed.
What I can't understand is: since I pass a pointer a into function, why can I use a in the format of a[i] like the second line of cout? Isn't a a pointer? If a is a pointer why does a[i] work?
This has confused me for a whole day. Any help will be much appreciated!
a is an array, not a pointer. They are not the same things. However, the name a can be implicitly converted to a pointer (with the value &a[0]).
For example;
int main()
{
int a[] = {1,2,3,4};
int *p = a; // p now has the value &a[0]
Now, after this partial code snippet, assuming i is an integral value, rules of the language amount to;
a[i] is equivalent to *(a + i) which is equivalent to *(&a[0] + i)
p[i] is equivalent to *(p + i)
Now, since p is equal to &a[0] this means that a[i], *(a + i), p[i], and *(p + i) are all equivalent.
When calling print_array(a, 4) where a is the name of an array, then a is ALWAYS converted to a pointer. This means print_array() is always passed a pointer. And this means *(array + i) inside print_array() is the same as a[i] in the caller.
This quote from the C++ Standard will make the point clear (5.2.1 Subscripting)
1 A postfix expression followed by an expression in square brackets is
a postfix expression. One of the expressions shall have the type
“array of T” or “pointer to T” and the other shall have unscoped
enumeration or integral type. The result is of type “T.” The type “T”
shall be a completely-defined object type. The expression E1[E2] is
identical (by definition) to *((E1)+(E2)) [Note: see 5.3 and 5.7 for
details of * and + and 8.3.4 for details of arrays. —end note], except
that in the case of an array operand, the result is an lvalue if that
operand is an lvalue and an xvalue otherwise.
Because in effect, while the subscript operator is defined on arrays, what happens is that they decay into pointers for the arithmetic to occur.
Meaning if a is an array, semantically what happens is:
int b = a[i]; => int *__tmp = a; int b = *(__tmp + i);
However, once operator overloading comes into play, then it is no longer true that a[i] == *(a + i). The right hand side may not even be defined.
What I can't understand is: since I pass a pointer "a" into function, why can I use "a" in the format of a[i] like the second line of "cout"?
Because subscript operator a[i] is defined for arrays and it is equivalent to *(a+i) by definition.
In the line with cout, you use array[i] however, where array is a pointer. This is also allowed, because the subscript operator is also defined for pointers.
Isn't "a" a pointer?
No. a is an array. array is a pointer.

Pointer in 2D Array [duplicate]

As a beginner programmer I am dealing with some simple problems related to Pointers. In the following code I found the value of *a and a are same in hexadecimal. But I can't understand the reason.
#include <stdio.h>
#include <stdlib.h>
int main(void){
    int a[5][5];
    a[0][0] = 1;
    /* %p expects a void *, so cast both pointers */
    printf("*a=%p a=%p \n", (void *)*a, (void *)a);
    return 0;
}
Here is the output:
*a=0x7ffddb8919f0 a=0x7ffddb8919f0
An array and its first element have the same address.:)
For this declaration
int a[5][5];
the expression a used in the printf call is implicitly converted to a pointer to its first element. The expression *a yields the first element of the array, which is in turn a one-dimensional array that is also converted to a pointer to its first element.
Thus the expressions a and *a have the same value as the expression &a[0][0].
In C and C++ languages values of array type T [N] are implicitly converted to values of pointer type T * in most contexts (with few exceptions). The resultant pointer points to the first element of the original array (index 0). This phenomenon is informally known as array type decay.
printf argument is one of those contexts when array type decay happens.
A 2D array of type int [5][5] is nothing else than a "1D array of 1D arrays", i.e. it is an array of 5 elements, with each element itself being an array of 5 ints.
The above array type decay rule naturally applies to this situation.
The expression a, which originally has array type int [5][5], decays to a pointer of type int (*)[5]. The pointer points to element a[0], which is the beginning of sub-array a[0] in memory. This is the first pointer you print.
The expression *a is a dereference operator applied to sub-expression a. Sub-expression a in this context behaves in exactly the same way as before: it decays to pointer of type int (*)[5] that points to a[0]. Thus the result of *a is a[0] itself. But a[0] is also an array. It is an array of int[5] type. It is also subject to array type decay. It decays to pointer of type int *, which points to the first element of a[0], i.e. to a[0][0]. This is the second pointer you print.
The reason both pointer values are the same numerically is that the beginning of sub-array a[0] corresponds to the same memory location as element a[0][0].
a can loosely be pictured as a pointer to a pointer to an int, but in reality it is an array of arrays of int: it decays to an int (*)[5], not to an int **.
So a and *a both point to the same address (that of a[0][0]).
*a is still a pointer after decay, and a[0] has the same address as a[0][0].