How does sizeof know the size of array? [duplicate] - c++

I have the following code:
#include <stdio.h>
int main(void) {
    int array[5] = {3,6,9,-8,1};
    /* sizeof yields a size_t, so print it with %zu; %p expects a void * */
    printf("the size of the array is %zu\n", sizeof(array));
    printf("the address of array is %p\n", (void *)array);
    printf("the address of array is %p\n", (void *)&array);
    int * x = array;
    printf("the address of x is %p\n", (void *)x);
    printf("the size of x is %zu\n", sizeof(x));
    return 0;
}
The output is
the size of the array is 20
the address of array is 0x7fff02309560
the address of array is 0x7fff02309560
the address of x is 0x7fff02309560
the size of x is 8
I know the variable array will be seen as a pointer to the first element of the array, so I understand why the size of x is 8. But I don't know why the size of the array is 20. Shouldn't it be 8 (on a 64-bit machine)?
Besides, how does the program know that it is 20? As far as I know, C doesn't store the number of elements. Why are sizeof(array) and sizeof(x) different? I read several posts about array decay but found no answer to this problem.

The name of an array decays to a pointer to the first element of the array in most situations. There are a couple of exceptions to that rule though. The two most important are when the array name is used as the operand of either the sizeof operator or the address-of operator (&). In these cases, the name of the array remains an identifier for the array as a whole.
For a non-VLA array, this means that the size of the array can be determined statically (at compile time) and the result of the expression will be the size of the array (in bytes), not the size of a pointer.
When you take the address of the array, you get the same value (i.e., the same address) as if you'd just used the name of the array without taking the address. The type is different though--when you explicitly take the address, what you get is a pointer of type "pointer to array of N items of type T". That means (for one example) that while array+1 points to the second element of the array, &array+1 points to another array just past the end of the entire array.
Assuming an array of at least two items, *(array+1) will refer to the second element of the array. Regardless of the array size, &array+1 will yield an address past the end of the array, so attempting to dereference that address gives undefined behavior.
In your case, given that the size of the array is 20, and the size of one element of the array is 4, if array was, say, 0x1000, then array+1 would be 0x1004 and &array+1 would be 0x1014 (0x14 = 20).

Your array has a fixed length, so its size can be determined at compile time. Your compiler knows that sizeof(int) is 4 and that the declared array length is 5, so 4 * 5 = 20.
Edit: Your compiler's int is probably 32 bits wide, while addresses are 64 bits wide. That is why sizeof(pointer) returns 8.

Note that sizeof is not a library function. sizeof is
"a compile-time unary operator [...] that can be used to compute the size of any object" (K&R)
So sizeof doesn't know how big the array is; the compiler knows how big the array is, and by definition
"when applied to an array, the result is the total number of bytes in the array." (K&R)

A pointer and an array are two different data types.
An array holds elements of the same data type in contiguous memory.
A pointer holds the address of a memory location.
sizeof(type) gives you the number of bytes occupied by the type you pass.
If you pass an array, the compiler knows that it is an array and how many elements it has, so it multiplies the number of elements by the size of the element type.
In this case:
5*4 = 20
Again, sizeof(int) and sizeof(pointer) are platform dependent. In this case sizeof(pointer) is 8.

No, arrays do not decay as operands of the sizeof operator. This is one of the few places where arrays don't decay. If an int is 4 bytes on your machine, then the total number of bytes of the array should be 20 (4 * 5). We don't even need an object to test this.
sizeof(int[5]) // 20
sizeof(int*) // 8 on a 64-bit machine

C11: 6.5.3.4 (p2)
The sizeof operator yields the size (in bytes) of its operand, which may be an
expression or the parenthesized name of a type. The size is determined from the type of
the operand. [...]
In the declaration
int array[5]
the type of array is an array of 5 ints. The compiler will determine the size of array from this type.

Try this
int x = sizeof(array)/sizeof(int);
printf("the size of the array is %d\n", x);

Related

Calculate array length via pointer arithmetic

I was wondering how *(&array + 1) actually works. I saw this as an easy way to calculate the array length and want to understand it properly before using it. I'm not very experienced with pointer arithmetic, but with my understanding &array gives the address of the first element of the array. (&array + 1) would go to end of the array in terms of address. But shouldn't *(&array + 1) give the value, which is at this address. Instead it prints out the address. I would really appreciate your help to get the pointer stuff clear in my head.
Here is the simple example I'm working on:
int numbers[] = {5,8,9,3,4,6,1};
int length = *(&numbers + 1) - numbers;
(This answer is for C++.)
&numbers is a pointer to the array itself. It has type int (*)[7].
&numbers + 1 is a pointer to the byte right after the array, where another array of 7 ints would be located. It still has type int (*)[7].
*(&numbers + 1) dereferences this pointer, yielding an lvalue of type int[7] referring to the byte right after the array.
*(&numbers + 1) - numbers: Using the - operator forces both operands to undergo the array-to-pointer conversion, so pointers can be subtracted. *(&numbers + 1) is converted to an int* pointing at the byte after the array. numbers is converted to an int* pointing at the first byte of the array. Their difference is the number of ints between the two pointers---which is the number of ints in the array.
Edit: Although there's no valid object pointed to by &numbers + 1, this is what's called a "past the end" pointer. If p is a pointer to T, pointing to a valid object of type T, then it's always valid to compute p + 1, even though *p may be a single object, or the object at the end of an array. In that case, you get a "past the end" pointer, which does not point to a valid object, but is still a valid pointer. You can use this pointer for pointer arithmetic, and even dereference it to yield an lvalue, as long as you do not try to read or write through that lvalue. Note that you can only go one position past the end of an object; attempting to go any further leads to undefined behaviour.
The expression &numbers gives you the address of the array, not the first member (although numerically they are the same). The type of this expression is int (*)[7], i.e. a pointer to an array of size 7.
The expression &numbers + 1 adds sizeof(int[7]) bytes to the address of the array. The resulting pointer points right after the array.
The problem however is when you then dereference this pointer with *(&numbers + 1). Dereferencing a pointer that points one element past the end of an array invokes undefined behavior.
The proper way to get the number of elements of an array is sizeof(numbers)/sizeof(numbers[0]). This assumes that the array was defined in the current scope and is not a parameter to a function.
but with my understanding &array gives the address of the first element of the array.
This understanding is misleading. &array gives the address of the array. Sure, the value of that address is the same as the address of the first element, but the type of the expression is different. The type of the expression &array is "pointer to array of N elements of type T" (where N is the length that you're looking for and T is int).
But shouldn't *(&array + 1) give the value, which is at this address.
Well yes... but it's here that the type of the expression becomes important. Indirecting a pointer to an array (rather than pointer to an element of the array) will result in the array itself.
In the subtraction expression, both array operands decay into pointer to first element. Since the subtraction uses decayed pointers, the unit of the pointer arithmetic is in terms of the element size.
I saw this as an easy way to calculate the array length
There are easier ways:
std::size(numbers)
And in C:
sizeof(numbers)/sizeof(numbers[0])

Pointer in 2D Array [duplicate]

As a beginner programmer, I am working through some simple problems related to pointers. In the following code I found that *a and a print the same hexadecimal value, but I can't understand why.
#include <stdio.h>
#include <stdlib.h>
int main(void){
    int a[5][5];
    a[0][0] = 1;
    /* %p expects a void *, so both pointers are cast before printing */
    printf("*a=%p a=%p \n", (void *)*a, (void *)a);
    return 0;
}
Here is the output:
*a=0x7ffddb8919f0 a=0x7ffddb8919f0
An array and its first element have the same address.:)
For this declaration
int a[5][5];
the expression a used in the printf call is implicitly converted to a pointer to its first element. The expression *a yields the first element of the array, which is itself a one-dimensional array and is in turn converted to a pointer to its first element.
Thus the expressions a and *a have the same value as the expression &a[0][0].
In C and C++ languages values of array type T [N] are implicitly converted to values of pointer type T * in most contexts (with few exceptions). The resultant pointer points to the first element of the original array (index 0). This phenomenon is informally known as array type decay.
printf argument is one of those contexts when array type decay happens.
A 2D array of type int [5][5] is nothing else than an "1D array of 1D arrays", i.e. it is an array of 5 elements, with each element itself being an array of 5 ints.
The above array type decay rule naturally applies to this situation.
The expression a, which originally has array type int [5][5], decays to a pointer of type int (*)[5]. The pointer points to element a[0], which is the beginning of sub-array a[0] in memory. This is the first pointer you print.
The expression *a is a dereference operator applied to sub-expression a. Sub-expression a in this context behaves in exactly the same way as before: it decays to pointer of type int (*)[5] that points to a[0]. Thus the result of *a is a[0] itself. But a[0] is also an array. It is an array of int[5] type. It is also subject to array type decay. It decays to pointer of type int *, which points to the first element of a[0], i.e. to a[0][0]. This is the second pointer you print.
The reason both pointer values are the same numerically is that the beginning of sub-array a[0] corresponds to the same memory location as element a[0][0].
a can loosely be thought of as a pointer to a pointer to an int, but that is only an analogy: in reality it is an array of arrays of int, which decays to int (*)[5] rather than int **.
So a and *a both evaluate to the same address (which happens to be the address of a[0][0]).
*a is still a pointer, and a[0] is the same address as a[0][0].

Argument of sizeof()

The output of sizeof for
#include<iostream>
using namespace std;
struct node
{
int k;
struct node *next;
};
int main()
{
int arr[3];
cout<<sizeof(struct node)<<endl;
cout<<sizeof(struct node *)<<endl;
cout<<sizeof(arr)<<endl;
cout<<sizeof(arr[0])<<endl;
cout<<sizeof(int *)<<endl;
return 0;
}
is
8
4
12
4
4
I understand that struct node * is a pointer so its output should be 4. So similarly arr is also a pointer, so its output should also be 4 but why is it showing the size of arr array as 12?
So similarly arr is also a pointer
No, arr is not a pointer, it is an array. Although it can be freely converted to a pointer, it is a different kind of object. The results of sizeof indicate the amount of memory necessary to store the array of the specified size, i.e. three times the size of an int.
So similarly arr is also a pointer, so its output should also be 4 but why is it showing the size of arr array as 12?
When an array name is the operand of sizeof or of the unary & operator, it is not converted to a pointer to its first element. The conversion rule does not apply in these cases.
In case of
cout<<sizeof(arr)<<endl;
The sizeof returns the size of the type of arr, which is of type int[3] (array of 3 ints).
C11: 6.5.3.4:
The sizeof operator yields the size (in bytes) of its operand, which may be an
expression or the parenthesized name of a type. The size is determined from the type of
the operand.
This is because the type of arr is int[3], not int* as you've assumed. So sizeof returns the size of the entire array.

Sizeof Pointer to Array

If I have an array declared like this:
int a[3][2];
then why is:
sizeof(a+0) == 8
whereas:
sizeof(a) == 24
I don't understand how adding 0 to the pointer changes the sizeof output. Is there maybe some implicit type cast?
If you add 0 to a, then a is first converted to a pointer value of type int(*)[2] (pointing to the first element of an array of type int[3][2]). Then 0 is added to that, which adds 0 * sizeof(int[2]) bytes to the address represented by that pointer value. Since that multiplication yields 0, it will yield the same pointer value. Since it is a pointer, sizeof(a+0) yields the size of a pointer, which is 8 bytes on your box.
If you do sizeof(a), there is no reason for the compiler to convert a to a pointer value (that only makes sense if you want to index elements or do pointer arithmetic involving the addresses of the elements). So the expression a keeps its array type, and you get the size of int[3][2] instead of the size of int(*)[2]. That is 3 * 2 * sizeof(int), which on your box is 24 bytes.
Hope this clarifies things.
sizeof tells you the size of the type of the expression. When you add 0 to a, the type becomes a pointer (8 bytes on 64-bit systems).

Pointer incrementing in C++

What does this mean: that a pointer increment points to the address of the next base type of the pointer?
For example:
p1++; // p1 is a pointer to an int
Does this statement mean that the address pointed to by p1 should change to the address of the next int or it should just be incremented by 2 (assuming an int is 2 bytes), in which case the particular address may not contain an int?
I mean, if p1 is, say, 0x442012, will p1++ be 0x442014 (which may be part of the address of a double) or will it point to the next int which is in an address like 0x44201F?
Thanks
Pointer arithmetic doesn’t care about the content – or validity – of the pointee. It will simply increment the pointer address using the following formula:
new_value = reinterpret_cast<char*>(p) + sizeof(*p);
(Assuming a pointer to non-const – otherwise the cast wouldn’t work.)
That is, it will increment the pointer by an amount of sizeof(*p) bytes, regardless of things like pointee value and memory alignment.
The compiler will add sizeof(int) (usually 4) to the numeric value of the pointer. If p1 is 0x442012 before the increment, then after the increment it will be 0x442012 + 4 = 0x442016.
Mind you, 0x442012 is not a multiple of 4, so it is unlikely to be the address of a valid four-byte int, though it would be fine for your two-byte ints.
It certainly won't go looking for the next integer. That would require magic.
p1++ gives rise to assembly language instructions which increment p1 by the size of what it points to. So you get
(char *)p1 = (char *)p1 + sizeof (object pointed to by p1)
(When this question was answered) Typically an int is 4 bytes, so it would increment by 4, but it depends on the sizeof() on your machine.
It does not go to "the next int".
An example: assume a 4 byte address and p1 = 0x20424 (where p1 is an int*). Then
p1++
would set the new value of p1 to 0x20428. NOT 0x20425.
If p1 is pointing into the element of index n of an array of objects of type int (a non-array object counts as an array of length 1 for this purpose), then after p1++, p1 is either:
Pointing to the element of index n+1 if the array is of length greater than n+1.
The 'past-the-end' address of the array, if the array is of length exactly n+1.
p1++ causes undefined behavior if p1 is not pointing to an element of an array of objects of type int.
The only meaning that the C and C++ languages give to the notion of "address" is the value of a pointer object.
Any relationship that C/C++'s notion of address has to the notion of numeric addresses you'd consider in assembly language is purely an implementation detail (albeit an extremely common implementation detail).
Pointer arithmetic is done in multiples of sizeof(*pointer); that is, for a pointer to int, an increment advances to the next int (4 bytes for 32-bit integers).