Difference between `a` and `&a` in C++ where `a` is an array - c++

I am confused about the output of the following code.
#include<iostream>
#include<cstdlib>
using namespace std;
int main()
{
int a[] = {1,2,3};
cout << a << " " << &a << endl;
cout << sizeof(a) << " " << sizeof(&a) << endl;
return 0;
}
The output is
0xbfcd3ae4 0xbfcd3ae4
12 4
How can a and &a print the same value but have different sizes?
I always thought that the name of an array always has the value of the address of its first byte.
Also, &a should not make sense, since one cannot take the address (with the & operator) of an address (the value of a). Yet the code gives an output, and in fact a == &a
according to the output.
Similarly, why is the output of sizeof(a) 12 (the total memory occupied
by the array)? Since a is itself a "pointer", sizeof(a) should be 4 bytes (on my 32-bit Ubuntu 11.04).
Obviously I have some misconception. Could someone sort this out for me?

An array is not a pointer, but an array decays to a pointer when you try to use it like one. In your case printing the address of the array automatically converts it into a pointer.
There's little difference between the automatically converted pointer and the one created explicitly with &, except that one is a pointer to a single element while the other is a pointer to the entire array. If you had used &a[0] then they would be identical.

First of all, realize that there is a difference between an object [1] and the expression that we use to refer to that object. In your code, a is an expression that refers to an object large enough to store 3 int values.
Except when it is the operand of the sizeof or unary & operators, or is a string literal being used to initialize another array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
Given a statement like
cout << a;
the expression a has type "3-element array of int"; since a is not an operand of the sizeof or unary & operators, it will be converted to an expression of type "pointer to int", and the value of the expression will be the address of the first element in the array.
OTOH, given a statement like
cout << &a;
the expression a is the operand of the unary & operator, so the rule doesn't apply; instead, the type of the expression is "pointer to 3-element array of int", and the value of the expression is the address of the array.
In both C and C++, the address of the array and the address of the first element of the array are the same, so both expressions yield the same value, but the types of the two expressions are different (int * vs. int (*)[3]).
In the statement
cout << sizeof a; // or sizeof (a)
the expression a is the operand of the sizeof operator, so again, the conversion rule doesn't apply; instead, the expression sizeof a evaluates to the number of bytes used by the array (in this case, 12).
In the statement
cout << sizeof &a; // or sizeof (&a)
the expression &a evaluates to the address of the array and has type int (*)[3], so sizeof &a evaluates to the number of bytes used by the pointer type (in your case, 4 bytes).
In both C and C++, when you declare an array like
int a[3];
the only storage set aside is for the 3 array elements; there's no separate storage for an a variable that points to the first element of the array (which is why a and &a yield the same value).
[1] In the sense of something that occupies memory, not in the sense of an instance of a class.
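The types and sizes described in this answer can be checked at compile time. The following is only a sketch: the helper names same_address and check_addresses are made up for illustration, and std::decay is used here as a stand-in for the conversion that cout << a triggers.

```cpp
#include <cassert>
#include <type_traits>

int a[3] = {1, 2, 3};

// Inside decltype and sizeof, no array-to-pointer conversion takes place.
static_assert(std::is_same<decltype(a), int[3]>::value, "a is an array of 3 int");
static_assert(sizeof(a) == 3 * sizeof(int), "sizeof a covers all three elements");

// std::decay models the array-to-pointer conversion (assumption: used as a stand-in).
static_assert(std::is_same<std::decay<decltype(a)>::type, int*>::value, "a decays to int*");

// &a is a pointer to the whole array, so sizeof &a is the size of a pointer.
static_assert(std::is_same<decltype(&a), int (*)[3]>::value, "&a is int (*)[3]");
static_assert(sizeof(&a) == sizeof(int (*)[3]), "sizeof &a is a pointer size");

// Hypothetical helper: the two pointers have different types, so compare them as void*.
bool same_address() {
    return static_cast<void*>(a) == static_cast<void*>(&a);
}

// Hypothetical self-check tying the pieces together.
void check_addresses() {
    assert(same_address());
}
```

If any static_assert fails, the file does not compile, so the claims about types are verified before the program ever runs.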

In C, the expression a is a primary expression which designates an array object. In a context where this expression is evaluated, it produces a value which is of pointer type, and points to the first element of the array.
This is a special evaluation rule for arrays. It does not mean that a is a pointer. It isn't; a is an array object.
When the sizeof or & operators are applied to a, it is not evaluated. sizeof produces the size of the array, and & takes its address: it produces a pointer to the array as a whole.
The values of the expressions a and &a point to the same location, but have different types: a pointer to the type of the array element (such as pointer to int), versus a pointer to the type of the array (such as pointer to array of 10 int), respectively.
C++ works very similarly in this area for compatibility with C.
There are other situations in C where the value of an expression which denotes a value or object of one type produces a value of another type. If c has type char, then the value of c is a value of type int. Yet &c has type char *, and sizeof c produces 1. Just like an array is not a pointer, c is not an int; it just produces one.
C++ isn't compatible in this area; character expressions like names of char variables or character constants like 'x' have type char. This allows void foo(int) and void foo(char) to be different overloads of a function foo. foo(3) will call the first one, and foo('x') the second.
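A small sketch of the overload behavior described above. The return values are an assumption added for illustration (the answer declares foo as returning void); they only record which overload was selected.

```cpp
#include <cassert>

// Hypothetical overloads; the return value records which one was chosen.
int foo(int)  { return 1; }
int foo(char) { return 2; }

// In C++ a character literal has type char, so its size is 1.
static_assert(sizeof('x') == 1, "'x' is a char in C++");

void check_overloads() {
    char c = 'y';
    assert(foo(3) == 1);    // an int argument selects foo(int)
    assert(foo('x') == 2);  // a character literal selects foo(char)
    assert(foo(c) == 2);    // a char variable also selects foo(char)
}
```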

Look at this following code for better understanding:
#include<iostream>
#include<cstdlib>
using namespace std;
int main()
{
int a[] = {1,2,3};
cout << a << " " << &a << endl;
cout << a+1 << " " << &a+1 << endl;
cout << sizeof(a) << " " << sizeof(&a) << endl;
return 0;
}
Output:
0x7fffb231ced0 0x7fffb231ced0
0x7fffb231ced4 0x7fffb231cedc
12 8
We got the same address for a and &a, so you may think they are the same. That is not true: the two expressions have different types even though they hold the same address.
Both print an address, but they mean different things: a (after conversion) is a pointer to the first element of the array, while &a is a pointer to the whole array.
That's why, when we printed a+1, we got 0x7fffb231ced4, which is the address of the second element of the array (since it is an array of int, the address increased by 4), while &a+1 gave us 0x7fffb231cedc, which is an increment of 4*3 = 12 (c means 12 in hex) because the array has 3 elements. &a+1 points to the memory location just past the end of the array.
This fundamental difference is also what makes the sizes differ: sizeof(a) is the size of the whole array, while sizeof(&a) is the size of a pointer.
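The two step sizes can also be checked without reading hex addresses. This is a minimal sketch; byte_distance and check_steps are hypothetical helper names, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance in bytes between two addresses.
std::ptrdiff_t byte_distance(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

void check_steps() {
    int a[3] = {1, 2, 3};
    // a decays to int*, so a + 1 advances by one element.
    assert(byte_distance(a, a + 1) == static_cast<std::ptrdiff_t>(sizeof(int)));
    // &a is int (*)[3], so &a + 1 advances past the whole array.
    assert(byte_distance(&a, &a + 1) == static_cast<std::ptrdiff_t>(sizeof(a)));
    // Both one-past-the-end pointers refer to the same location.
    assert(static_cast<void*>(a + 3) == static_cast<void*>(&a + 1));
}
```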

Related

Why does the sizeof operator produce different results for an array

Why does the sizeof operator produce 12 bytes when it should only be 4? When I reference the variable array, that only refers to the memory address of the first index of the array. In fact, I printed the memory address of the first index, &array[0], and compared it to array; they produced the same memory address, which confirms that they both refer to the first index of the array. But sizeof(array) produces 12 bytes while sizeof(array[0]) produces 4 bytes.
int main() {
int array[] = {1,2,3};
int a = 1;
int b = sizeof(array); //this is referring to the first index of the array
int c = sizeof(array[0]); //this is also referring to the first index of the array
std::cout << b << std::endl;
std::cout << array << std::endl; //they have the same memory address
std::cout << &array[0] << std::endl; /* they have the same memory address, which confirms that array
and &array[0] is the same */
return 0;
}
Arrays and pointers are not the same, and this is a prime example of this.
In most contexts, an array decays to a pointer to its first member. One of the few times this decay does not happen is when the array is the subject of the sizeof operator. In that case it refers to the entire array and the expression evaluates to the size of the entire array in bytes.
This is described in section 6.3.2.1p3 of the C standard:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
As well as the C++17 standard in section 7.2:
An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to a prvalue of type “pointer to T”. The temporary materialization conversion (7.4) is applied. The result is a pointer to the first element of the array.
And 8.3.3p4:
The lvalue-to-rvalue (7.1), array-to-pointer (7.2), and function-to-pointer (7.3) standard conversions are not applied to the operand of sizeof. If the operand is a prvalue, the temporary materialization conversion (7.4) is applied.
So what we actually have is:
int b = sizeof(array); // size of the entire array
int c = sizeof(array[0]); // size of the first element of the array
int d = sizeof(&array[0]); // size of a pointer to an array element
The size of the array is 12 bytes. The output is correct. The size of the element is 4 bytes. There are 3 elements. 4 * 3 = 12.
In fact, i printed the memory address of the first index (&array[0]) and compared it to (array), they produced the same memory address result which confirms that they are both referring to the first index of the array
Just because the array has the same memory address as the first element of the array, doesn't mean that the entire array is contained within the first element. It isn't.
then why does array+1 refers to the second index of the array?
Because in such a sub expression, the array is implicitly converted to a pointer to first element, and adding 1 to a pointer to first element results in a pointer to second element. Such implicit conversion is called decaying.
If I understand your question correctly: on a typical 64-bit machine an int is 4 bytes, so the sizeof operator returns 3 x 4 = 12 bytes. I do not understand why you assume it should return 4 in the first place, given that the array holds three items.
If you want to know the number of items in the array you may do something as follows:
// Finds size of array[] and stores in 'size'
int size = sizeof(array)/sizeof(array[0]);
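The sizeof division only works while the array type is still visible. As a sketch of that caveat, the hypothetical template count_of below deduces the element count from the array type, and size_seen_through_pointer (also a made-up name) shows what sizeof reports once only a pointer is left.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: deduces N from the array type; a plain pointer would not compile.
template <typename T, std::size_t N>
constexpr std::size_t count_of(T (&)[N]) { return N; }

int array[] = {1, 2, 3};

static_assert(sizeof(array) / sizeof(array[0]) == 3, "sizeof division yields 3");
static_assert(count_of(array) == 3, "template deduction yields 3");

// Once the array has decayed to a pointer, sizeof only sees the pointer.
std::size_t size_seen_through_pointer(int* p) { return sizeof(p); }

void check_counts() {
    assert(count_of(array) == 3);
    assert(size_seen_through_pointer(array) == sizeof(int*));
}
```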

Why can't pointer fit variable of different type, even though sizeof is same?

Why the sizeof any pointer is 4 or 8 bytes, but it cannot fit any different variable? I get error while trying to assign a double pointer an int pointer value.
int *int_ptr{nullptr};
float *float_ptr{nullptr};
double *double_ptr{nullptr};
long double *long_double_ptr{nullptr};
string *string_ptr{nullptr};
vector<int> *vector_ptr{nullptr};
cout << "sizeof int pointer is " << sizeof int_ptr; //8 or 4
cout << "sizeof float pointer is " << sizeof float_ptr; //8 or 4
cout << "sizeof double pointer is " << sizeof double_ptr; //8 or 4
cout << "sizeof long double pointer is " << sizeof long_double_ptr; //8 or 4
cout << "sizeof string pointer is " << sizeof string_ptr; //8 or 4
cout << "sizeof vector int pointer is " << sizeof vector_ptr; //8 or 4
double double_num{1020.7};
double_ptr = &int_ptr; //cannot convert ‘int**’ to ‘double*’ in assignment
C++ is a statically typed language. The language enforces type safety and protects you from mistakes by rejecting arbitrary conversions between unrelated types. The meaning of a type is not wholly described by the size alone.
If an address contains an object of type int*, then a int** can point to that object. Given that the address contains an object of type int*, it cannot possibly also contain an object of type double, so there is no meaningful way to convert one of those pointers to another.
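A short sketch of what the type system accepts here, checked at compile time. The function round_trip is a hypothetical helper showing the one conversion that is always available for object pointers: to void* and back to the original type.

```cpp
#include <cassert>
#include <type_traits>

// No implicit conversion exists between these unrelated pointer types.
static_assert(!std::is_convertible<int**, double*>::value, "int** does not convert to double*");

// Any object pointer converts implicitly to void*, which carries no pointee type.
static_assert(std::is_convertible<int**, void*>::value, "int** converts to void*");
static_assert(std::is_convertible<double*, void*>::value, "double* converts to void*");

// Hypothetical helper: converting back to the ORIGINAL type is well defined.
double round_trip(double* p) {
    void* erased = p;                      // implicit conversion to void*
    return *static_cast<double*>(erased);  // explicit conversion back
}

void check_round_trip() {
    double value = 1020.7;
    assert(round_trip(&value) == 1020.7);
}
```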
Pointers are addresses.
Let's say, you have two addresses:
53 Main avenue
85 WrongTurn Road
The first one is the address to a small 2 bedrooms house.
The second is the address is to a mansion, 23 bedrooms, 10 bathrooms, etc...
You cannot expect all the people living in the mansion to move into the small house, right?
But guess what? At the post office, their boxes are exactly the same size!
That's how pointers work. They just tell you where to find your variable. They are not containers.
This code is illegal because there is no "implicit conversion" to map from &int_ptr to double_ptr, where an "implicit conversion" is defined as something:
Performed whenever an expression of some type T1 is used in context that does not accept that type, but accepts some other type T2; in particular:
when the expression is used as the argument when calling a function that is declared with T2 as parameter;
when the expression is used as an operand with an operator that expects T2;
when initializing a new object of type T2, including return statement in a function returning T2;
when the expression is used in a switch statement (T2 is integral type);
when the expression is used in an if statement or a loop (T2 is bool).
The program is well-formed (compiles) only if there exists one unambiguous implicit conversion sequence from T1 to T2.
I initially suggested you use a reinterpret_cast, but this won't work either, as using the result of a reinterpret_cast between types is only legal if the types differ only by whether they are signed, the type being cast to is a std::byte*, char*, or unsigned char*, or the types are "similar", which is defined as:
they are the same type; or
they are both pointers, and the pointed-to types are similar; or
they are both pointers to member of the same class, and the types of the pointed-to members are similar; or
they are both arrays of the same size or both arrays of unknown bound, and the array element types are similar.
As you can see none of these situations apply to wanting to cast from the address of int* int_ptr to double* double_ptr. It's difficult for me to predict your use case for this type of cast within a strongly typed language, but perhaps a void* is what you're looking for? It could point to either a valid int**, in which case you could initialize it like this: const void* ptr = reinterpret_cast<void*>(&int_ptr) or a valid double* in which case you'd initialize it like this: const void* ptr = reinterpret_cast<void*>(double_ptr). Of course in order to use ptr you'd need a variable telling you which type it contained, for example:
if(is_int) {
// recover int** (reinterpret_cast<const int* const*>(ptr)) and operate on it
} else {
// recover double* (reinterpret_cast<const double*>(ptr)) and operate on it
}
I should admit at this point that this answer is somewhat contrived. A better solution anywhere this code is used would very likely be a template.
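For completeness, here is one way the flag-plus-void* idea could be written out. This is a sketch under assumptions: the struct TaggedPtr and the function read_as_double are hypothetical names, and static_cast is used instead of reinterpret_cast because it is sufficient for conversions to and from void*.

```cpp
#include <cassert>

// Hypothetical tagged wrapper: the flag records what ptr really points to.
struct TaggedPtr {
    const void* ptr;
    bool is_int;  // true: ptr came from an int**; false: ptr came from a double*
};

// Recover the original pointer type according to the tag, then read the value.
double read_as_double(const TaggedPtr& t) {
    if (t.is_int) {
        int* const* pp = static_cast<int* const*>(t.ptr);
        return static_cast<double>(**pp);
    }
    return *static_cast<const double*>(t.ptr);
}

void check_tagged() {
    int n = 7;
    int* int_ptr = &n;
    double d = 2.5;
    TaggedPtr from_int{&int_ptr, true};
    TaggedPtr from_double{&d, false};
    assert(read_as_double(from_int) == 7.0);
    assert(read_as_double(from_double) == 2.5);
}
```

Setting the tag incorrectly makes the cast read the wrong type, which is exactly the mistake the compiler prevents when the pointer types are left intact.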

Does the address of the array equal to that of its first element in C++?

This can be guaranteed in C because of the following sentence in WG14/N1570:
6.2.5/20 ... An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.
But in WG21/N4527, i.e. in C++, the corresponding sentence becomes
8.3.4/1 ...An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
while the word "describes" is changed to "contains", which cannot guarantee that the address of the array equals to that of its first element. Is this change intentional or unintentional? If it is intentional, does the address of the array equal to that of its first element in C++? If it does, which paragraph in the C++ standard can guarantee this?
I don't think it's stated explicitly anywhere, but I believe it follows from 5.3.3 Sizeof:
the size of an array of n elements is n times the size of an element
that the only thing that can be stored at the array's starting address is the array's first element.
In C++ it is guaranteed by 4.2/1 Array-to-pointer conversion [conv.array]:
An lvalue or rvalue of type “array of N T” or “array of unknown bound
of T” can be converted to a prvalue of type “pointer to T”. The result
is a pointer to the first element of the array.
That means that if you convert an array to a pointer in C++, you get a pointer which points to the first element of the array, i.e.
int a[10];
int* p1 = a; // array-to-pointer conversion: pointer to the first element
int* p2 = &a[0]; // take the address of the first element of the array
The standard guarantees that p1 and p2 will point to same address.
It depends what you mean by "address of array".
If you are asking if an array, when converted to a pointer, gives a result equal to the address of its first element, then the answer is yes. For example;
#include <iostream>
void func(int *p, int *q)
{
std::cout << "p and q are ";
if (p != q) std::cout << "NOT ";
std::cout << "equal\n";
}
int main()
{
int x[2];
func(x, &x[0]);
}
This will always report that the two pointers are equal. songyuanyao has explained why.
However, x is not actually a pointer to an array (and nor is it converted to one in this code). If you change the call of func() in main() to
func(&x, &x[0]);
then that statement will not even compile. The reason is that &x (the address of the array x) is not a pointer to an int - it is a pointer to an array of two int, and that cannot be implicitly converted into a pointer to int.
The value, however, will be the same, as may be demonstrated by running this code.
#include <iostream>
void func2(void *p, void *q)
{
std::cout << "p and q are ";
if (p != q) std::cout << "NOT ";
std::cout << "equal\n";
}
int main()
{
int x[2];
func2(&x, &x[0]); // both pointers implicitly converted to void * when calling func2()
}
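The compile-time side of this answer (which calls are accepted and which are rejected) can be expressed with type traits. A sketch, with equal_as_void and check_array_address as hypothetical names:

```cpp
#include <cassert>
#include <type_traits>

// A pointer to an array of two int does not implicitly convert to int*,
// which is why passing &x where int* is expected is rejected.
static_assert(!std::is_convertible<int (*)[2], int*>::value, "int (*)[2] is not int*");

// Like any object pointer, it does convert to void*.
static_assert(std::is_convertible<int (*)[2], void*>::value, "int (*)[2] converts to void*");

// Hypothetical helper mirroring the comparison done with void* parameters.
bool equal_as_void(void* p, void* q) { return p == q; }

void check_array_address() {
    int x[2] = {0, 0};
    assert(equal_as_void(&x, &x[0]));  // address of the array vs. address of element 0
    assert(equal_as_void(&x, x));      // address of the array vs. the decayed array
}
```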

C++ static_cast<void *>

Could someone explain this little code snippet for me?
Given:
int a[3] = {2,3,4};
Why does the following evaluate to true?
static_cast<void *>(a) == static_cast<void *>(&a); // Why is this true?
Is this saying that the address of a is the same as a? If so, why is this true?
It is because the address of the variable a coincides with the address of the first element of the array a. You can also think of a as &a[0] (which is clearer when we say "the address of the first element of the array").
Another example,
struct X
{
int i;
};
X x;
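Here also the address of variable x coincides with the address of x.i (which is the first element of the aggregate). The two pointers have different types (X* and int*), so they must be converted to void* before they can be compared; this would print 1:
std::cout << (static_cast<void*>(&x) == static_cast<void*>(&x.i)) << std::endl; //1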
So in your case, &a is like &x, and a (or &a[0]) is like &(x.i).
Note that in C++ a and x are both called aggregate (see my answer here: What is an aggregate?)
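The struct analogy can be checked as follows. This is only a sketch; check_first_member is a hypothetical name, and the offsetof check relies on X being a standard-layout class.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct X {
    int i;
};

static_assert(std::is_standard_layout<X>::value, "X is a standard-layout class");
static_assert(offsetof(X, i) == 0, "the first member sits at offset 0");

void check_first_member() {
    X x{42};
    // &x is X* and &x.i is int*, so both are converted to void* for the comparison.
    assert(static_cast<void*>(&x) == static_cast<void*>(&x.i));
    assert(x.i == 42);
}
```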
In almost all contexts, the name of an array decays into a pointer to the first element of the array. So in static_cast<void*>(a), the a decays into &a[0]; its type is "pointer to int". The expression evaluates to the address of the first element of the array. In static_cast<void*>(&a), however, &a is the address of the array itself; its type is "pointer to array of 3 int". That's why the casts are needed here: the two expressions without the casts would have different types, and could not be compared. Both can be converted to void* and compared. So what this code is illustrating is that the address of the first element of an array is the same as the address of the array, i.e., there's no padding at the front.
The name of an array usually evaluates to the address of the first element of the array, so array and &array have exactly the same value.
However, they are different types. For the following array:
int a[8];
a+1 is the address of the array a plus sizeof(int) bytes,
but &a+1 is the address of the array a plus 8 * sizeof(int) bytes.

Why is arr and &arr the same?

I have been programming C/C++ for many years, but today's accidental discovery made me somewhat curious... Why do both outputs produce the same result in the code below? (arr is of course the address of arr[0], i.e. a pointer to arr[0]. I would have expected &arr to be the address of that pointer, but it has the same value as arr)
int arr[3];
cout << arr << endl;
cout << &arr << endl;
Remark: This question was closed, but now it is opened again. (Thanks ?)
I know that &arr[0] and arr evaluate to the same number, but that is not my question! The question is why &arr and arr evaluate to the same number. If arr is a literal (not stored anywhere), then the compiler should complain and say that arr is not an lvalue. If the address of arr is stored somewhere, then &arr should give me the address of that location. (But this is not the case.)
if I write
const int* arr2 = arr;
then arr2[i]==arr[i] for any integer i, but &arr2 != arr.
#include <cassert>
struct foo {
int x;
int y;
};
int main() {
foo f;
void* a = &f.x;
void* b = &f;
assert(a == b);
}
For the same reason the two addresses a and b above are the same. The address of an object is the same as the address of its first member (Their types however, are different).
                           arr
                     _______^_______
                    /               \
                    | [0]   [1]   [2] |
--------------------+-----+-----+-----+--------------------------
      some memory   |     |     |     |   more memory
--------------------+-----+-----+-----+--------------------------
                    ^
                    |
          the pointers point here
As you can see in this diagram, the first element of the array is at the same address as the array itself.
They're not the same. They just are at the same memory location. For example, you can write arr+2 to get the address of arr[2], but not (&arr)+2 to do the same.
Also, sizeof arr and sizeof &arr are different.
The two have the same value but different types.
When it's used by itself (not the operand of & or sizeof), arr evaluates to a pointer to int holding the address of the first int in the array.
&arr evaluates to a pointer to array of three ints, holding the address of the array. Since the first int in the array has to be at the very beginning of the array, those addresses must be equal.
The difference between the two becomes apparent if you do some math on the results:
arr+1 will point sizeof(int) bytes past the start of arr.
((&arr) + 1) will point sizeof(arr) == sizeof(int) * 3 bytes past the start of arr.
Edit: As to how/why this happens, the answer is fairly simple: because the standard says so. In particular, it says (§6.3.2.1/3):
Except when it is the operand of the sizeof operator or the unary & operator, or is a
string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue.
[note: this particular quote is from the C99 standard, but I believe there's equivalent language in all versions of both the C and C++ standards].
In the first case (arr by itself), arr is not being used as the operand of sizeof, unary &, etc., so it is converted (not promoted) to the type "pointer to type" (in this case, "pointer to int").
In the second case (&arr), the name obviously is being used as the operand of the unary & operator -- so that conversion does not take place.
The address is the same but both expressions are different. They just start at the same memory location. The types of both expressions are different.
The value of arr is of type int * and the value of &arr is of type int (*)[3].
& is the address operator and the address of an object is a pointer to that object. The pointer to an object of type int [3] is of type int (*)[3]
They are not the same.
A bit more strict explanation:
arr is an lvalue of type int [3]. An attempt to use
arr in some expressions like cout << arr will result in the array-to-pointer conversion, which yields an rvalue of type int * with the value equal to &arr[0]. This is what you can display.
&arr is an rvalue of type int (*)[3], pointing to the array object itself. No magic here :-) This pointer points to the same address as &arr[0] because the array object and its first member start in the exact same place in the memory. That's why you have the same result when printing them.
An easy way to confirm that they are different is comparing *(arr) and *(&arr): the first is an lvalue of type int and the second is an lvalue of type int[3].
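That comparison can be written down as compile-time checks. A minimal sketch, assuming a global arr with illustrative values; check_dereference is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>

int arr[3] = {1, 2, 3};

// *arr dereferences the decayed int*, giving an lvalue of type int.
static_assert(std::is_same<decltype(*arr), int&>::value, "*arr is an lvalue of type int");
static_assert(sizeof(*arr) == sizeof(int), "*arr has the size of one element");

// *&arr dereferences int (*)[3], giving back the array itself.
static_assert(std::is_same<decltype(*&arr), int (&)[3]>::value, "*&arr is an lvalue of type int[3]");
static_assert(sizeof(*&arr) == 3 * sizeof(int), "*&arr has the size of the whole array");

void check_dereference() {
    assert(*arr == 1);        // first element
    assert((*&arr)[2] == 3);  // index into the recovered array
}
```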
Pointers and arrays can often be treated identically, but there are differences. A pointer does have a memory location, so you can take the address of a pointer. But an array has nothing pointing to it, at runtime. So taking the address of an array is, to the compiler, syntactically defined to be the same as the address of the first element. Which makes sense, reading that sentence aloud.
I found Graham Perks' answer to be very insightful, I even went ahead and tested this in an online compiler:
#include <cstdio>
int main()
{
int arr[3] = {1,2,3};
int *arrPointer = arr; // this is equivalent to: int *arrPointer = &arr[0];
printf("address of arr: %p\n", static_cast<void*>(&arr)); // %p expects a void*
printf("address of arrPointer: %p\n", static_cast<void*>(&arrPointer));
printf("arr: %p\n", static_cast<void*>(arr));
printf("arrPointer: %p\n", static_cast<void*>(arrPointer));
printf("*arr: %d\n", *arr);
printf("*arrPointer: %d\n", *arrPointer);
return 0;
}
Outputs:
address of arr: 0x7ffed83efbac
address of arrPointer: 0x7ffed83efba0
arr: 0x7ffed83efbac
arrPointer: 0x7ffed83efbac
*arr: 1
*arrPointer: 1
It seems the confusion was the assumption that arr and arrPointer are equivalent. However, as Graham Perks detailed in his answer, they are not.
Visually, the memory looks something like this:
[Memory View]
[memory address: value stored]
arrPointer:
0x7ffed83efba0: 0x7ffed83efbac
arr:
0x7ffed83efbac: 1
0x7ffed83efbb0: 2
0x7ffed83efbb4: 3
As you can see, arrPointer is a label for memory address 0x7ffed83efba0, which has 8 bytes of allocated memory (the size of a pointer on this 64-bit system) that hold the memory address of arr[0].
On the other hand, arr is a label for memory address 0x7ffed83efbac, and as per Jerry Coffin's answer, since the type of variable arr is "array of type", it gets converted to a "pointer to type" (which points to the array's starting address), and thus printing arr yields 0x7ffed83efbac.
The key difference is that arrPointer is an actual pointer and has its own memory slot allocated to hold the address it points to, so &arrPointer != arrPointer. Since arr is not technically a pointer but an array, the memory address we see when printing arr is not stored elsewhere, but rather determined by the conversion mentioned above. So, the values (not types) of &arr and arr are equal.
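The same conclusion can be checked without comparing printed addresses by eye. A sketch with hypothetical function names; all pointers are converted to void* because their types differ.

```cpp
#include <cassert>

// The pointer variable is an object of its own, stored apart from the array.
bool pointer_has_own_slot() {
    int arr[3] = {1, 2, 3};
    int* arrPointer = arr;  // decays to &arr[0]
    return static_cast<void*>(&arrPointer) != static_cast<void*>(arrPointer);
}

// The array has no separate slot: its address is where element 0 lives.
bool array_address_is_first_element() {
    int arr[3] = {1, 2, 3};
    return static_cast<void*>(&arr) == static_cast<void*>(arr)
        && static_cast<void*>(&arr) == static_cast<void*>(&arr[0]);
}

void check_storage() {
    assert(pointer_has_own_slot());
    assert(array_address_is_first_element());
}
```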