Why is this pointer 8 bytes? - c++

I am learning C++, and read that when an array is passed into a function it decays into a pointer. I wanted to play around with this and wrote the following function:
void size_print(int a[]){
cout << sizeof(a)/sizeof(a[0]) << endl;
cout << "a ->: " << sizeof(a) << endl;
cout << "a[0] ->" << sizeof(a[0]) << endl;
}
I tried inputting an array with three elements, let's say
int test_array[3] = {1, 2, 3};
With this input, I was expecting this function to print 1, as I thought a would be an integer pointer (4 bytes) and a[0] would also be 4 bytes. However, to my surprise the result is 2 and sizeof(a) = 8.
I cannot figure out why a takes up 8 bytes, but a[0] takes up 4. Shouldn't they be the same?

Shouldn't they be the same?
No. a is declared like an array, but because it is a function parameter its type is adjusted to a pointer to the first element, so it has the size of a pointer. Your machine seems to have 64-bit addresses, so each address (and hence each pointer) is 64 bits (8 bytes) long.
a[0], on the other hand, is of the type that an element of that array has (an int), and that type has 32 bits (4 bytes) on your machine.

A pointer is just an address of memory where the start of the variable is located. That address is 8 bytes.
a[0] is a variable in the first place of the array. It technically could be anything of whatever size. When you take a pointer to it, the pointer just contains an address of memory (integer) without knowing or caring what this address contains. (This is just to illustrate the concept, in the example in the question, a[] is an integer array but the same logic works with anything).
Note, the size of the pointer is actually different on different architectures. This is where the 32-bit, 64-bit, etc. comes in. It can also depend on the compiler but this is beyond the question.

The size of a pointer depends on the system and implementation. Yours uses 64 bits (8 bytes).
a[0] is an int, and the standard only guarantees a minimum range of values it must be able to store. It can be anything from 2 bytes up. Most modern implementations use 32-bit (4-byte) ints.
sizeof(a)/sizeof(a[0]) will not work on function parameters. An array argument decays to a pointer to its first element, so this division only tells you how many times larger a pointer is than an int, not the number of elements in the array the pointer refers to.

Related

Casting a void pointer to check memory alignment

One of the solutions which often turns up in several posts on how to determine if a void * points to aligned memory involves casting the void pointer. That is say I get a void *ptr = CreateMemory() and now I want to check if memory pointed to by ptr is a multiple of some value, say 16.
See How to determine if memory is aligned? which makes a specific claim.
A pointer p is aligned on a 16-byte boundary iff ((unsigned long)p & 15) == 0.
A solution in a similar vein appears in this post.
How to check if a pointer points to a properly aligned memory location?
Can someone clarify, how does this casting work? I mean, if I had a void *ptr = CreateMem();, it seems to me that (unsigned long)(ptr) would give me the pointer itself reinterpreted as an unsigned long. Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
EDIT: Thanks for all the useful comments. Please bear with me a bit more. Perhaps a simplified example will help me understand this better.
#include <cstdint>
#include <iostream>
using namespace std;
struct __attribute__((aligned(16))) Data0 {
uint8_t a;
};
struct Data1 {
uint8_t a;
};
int main() {
std::cout<<sizeof(Data0) << "\n"; //< --(1)
std::cout<<sizeof(Data1) << "\n";
Data0 ptr0[10];
Data1 ptr1[10];
std::cout<<(unsigned long) (ptr0 + 1) - (unsigned long) ptr0 << "\n"; //< --- (2)
std::cout<<(unsigned long) (ptr1 + 1) - (unsigned long) ptr1 << "\n";
return 0;
}
To date, I always interpreted aligned memory to have the following two requirements. sizeof() should return a multiple of the specified size (see condition (1)). And while incrementing a pointer to an aligned struct array, the stride would also end up being a multiple of the specified size (see condition (2)).
So I am somewhat surprised to see there is a third requirement on the actual value of ptr0 as well. Or I might have misunderstood this entirely. Would the memory pointed to by ptr0 in the above example be considered aligned and not so for ptr1?
As I am typing this, I realize I do not actually understand what aligned memory itself means. Since I have mostly dealt with this while allocating cuda buffers, I tend to relate it to some sort of padding required for my data structs.
Consider a second example. That of aligned_alloc. https://en.cppreference.com/w/c/memory/aligned_alloc
Allocate size bytes of uninitialized storage whose alignment is specified by alignment.
I am not sure how to interpret this. Say, I do a void *p0 = aligned_alloc(16, 16 * 2), what is different about the memory pointed to by p0 as compared to say p1 where std::vector<char> buffer(32); char *p1 = buffer.data().
(unsigned long)(ptr): casting a pointer to an unsigned integer yields the pointer's memory address as a number.
p & 15 is equivalent to p % 16, which is the remainder of the division by 16. If it's 0, the address is a multiple of 16, so the memory is 16-byte aligned.
((unsigned long)(ptr) & 15) == 0 returns true if the memory is aligned to multiple of 16.
uintptr_t is a better type for such casting, as it should be more platform-agnostic.
Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
Because on common architectures, converting a void pointer to an unsigned long yields the address of the memory pointed to by the pointer. Or, because of the way conversion to an unsigned type works, the low bits of the address if it is too big to fit in an unsigned long. So it is enough to test for alignment, because in that case you only need the lowest bits of the address.
I am not sure whether it is really mandated by the standard, because the actual representation of a memory address is left to the implementation, and I can remember the segment+offset representation used by 16-bit Intel processors. On those architectures the addressable memory used 20 bits, and a far pointer (able to represent any address) was represented as a 32-bit unsigned value where the high 16 bits were the segment and the low 16 bits were the offset. Those were then combined as:
Segment S S S S _ (4 * 4 = 16 bits)
Offset _ O O O O (4 * 4 = 16 bits)
Address A A A A A (5 * 4 = 20 bits)
You can see that this architecture did not make it easy to test for alignment higher than 16. For example, the pointer value 0x00010040 is divisible by 64, but the actual address is 0x00050 (80), which is only divisible by 16.
But only dinosaurs can remember the Intel 8086 and its segmented address representation, and even with it, testing alignment up to 16 bytes was straightforward...

What does *(int*) mean in C++?

I encountered the following line in an OpenGL tutorial and I wanna know what *(int*) means and what its value is
if ( *(int*)&(header[0x1E])!=0 )
Let's take this a step at a time:
header[0x1E]
header must be an array of some kind, and here we are getting a reference to the 0x1Eth element in the array.
&(header[0x1E])
We take the address of that element.
(int*)&(header[0x1E])
We cast that address to a pointer-to-int.
*(int*)&(header[0x1E])
We dereference that pointer-to-int, which interprets the sizeof(int) bytes starting at the address of header[0x1E] as an int and yields the value found there.
if ( *(int*)&(header[0x1E])!=0 )
It compares that resulting value to 0 and if it isn't 0, executes whatever is in the body of the if statement.
Note that this is potentially very dangerous. Consider what would happen if header were declared as:
double header [0xFF];
...or as:
int header [5];
It's truly a terrible piece of code, but what it's doing is:
&(header[0x1E])
takes the address of the (0x1E + 1)th element of array header, let's call it addr:
(int *)addr
C-style cast this address into a pointer to an int, let's call this pointer p:
*p
dereferences this memory location as an int.
Assuming header is an array of bytes, and the original code has been tested only on Intel, it's equivalent to:
header[0x1E] + (header[0x1F] << 8) + (header[0x20] << 16) + (header[0x21] << 24);
However, besides the potential alignment issues the other posters mentioned, it has at least two more portability problems:
on a platform with 64 bit ints, it will make an int out of bytes 0x1E to 0x25 instead of the above; it will be also wrong on a platform with 16 bit ints, but I suppose those are too old to matter
on a big endian platform the number will be wrong, because the bytes will get reversed and it will end up as:
(header[0x1E] << 24) + (header[0x1F] << 16) + (header[0x20] << 8) + header[0x21];
Also, if it's a bmp file header as rici assumed, the field is probably unsigned and the cast is done to a signed int. In this case it doesn't matter as it's being compared to zero, but in some other case it may.

What is the size of a pointer? [duplicate]

Is the size of a pointer the same as the size as the type it's pointing to, or do pointers always have a fixed size? For example...
int x = 10;
int * xPtr = &x;
char y = 'a';
char * yPtr = &y;
std::cout << sizeof(x) << "\n";
std::cout << sizeof(xPtr) << "\n";
std::cout << sizeof(y) << "\n";
std::cout << sizeof(yPtr) << "\n";
What would the output of this be? Would sizeof(xPtr) return 4 and sizeof(yPtr) return 1, or would the 2 pointers actually return the same size?
The reason I ask this is because the pointers are storing a memory address and not the values of their respective stored addresses.
Function Pointers can have very different sizes, from 4 to 20 bytes on an x86 machine, depending on the compiler. So the answer is no - sizes can vary.
Another example: take an 8051 program. It has three memory ranges and thus has three different pointer sizes, from 8 bit, 16 bit, 24 bit, depending on where the target is located, even though the target's size is always the same (e.g., char).
Pointers generally have a fixed size, for ex. on a 32-bit executable they're usually 32-bit. There are some exceptions, like on old 16-bit windows when you had to distinguish between 32-bit pointers and 16-bit... It's usually pretty safe to assume they're going to be uniform within a given executable on modern desktop OS's.
Edit: Even so, I would strongly caution against making this assumption in your code. If you're going to write something that absolutely has to have a pointers of a certain size, you'd better check it!
Function pointers are a different story -- see Jens' answer for more info.
On a 32-bit machine, the size of a pointer is typically 32 bits (4 bytes), while on a 64-bit machine it's 8 bytes. Regardless of what data type they are pointing to, they have a fixed size.
To answer your other question. The size of a pointer and the size of what it points to are not related. A good analogy is to consider them like postal addresses. The size of the address of a house has no relationship to the size of the house.
Pointers are not always the same size on the same architecture.
You can read more on the concept of "near", "far" and "huge" pointers, just as an example of a case where pointer sizes differ...
http://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer_sizes
They can be different on word-addressable machines (e.g., Cray PVP systems).
Most computers today are byte-addressable machines, where each address refers to a byte of memory. There, all data pointers are usually the same size, namely the size of a machine address.
On word-addressable machines, each machine address refers instead to a word larger than a byte. On these, a (char *) or (void *) pointer to a byte of memory has to contain both a word address and a byte offset within the addressed word.
http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html
Recently came upon a case where this was not true, TI C28x boards can have a sizeof pointer == 1, since a byte for those boards is 16-bits, and pointer size is 16 bits. To make matters more confusing, they also have far pointers which are 22-bits. I'm not really sure what sizeof far pointer would be.
In general, DSP boards can have weird integer sizes.
So pointer sizes can still be weird in 2020 if you are looking in weird places
The size of a pointer is the size required by your system to hold a unique memory address (since a pointer just holds the address it points to)

Creating integer variable of a defined size

I want to define an integer variable in C/C++ such that my integer can store 10 bytes of data or may be a x bytes of data as defined by me in the program.
for now..!
I tried the
int *ptr;
ptr = (int *)malloc(10);
code. Now if I'm finding the sizeof ptr, it is showing as 4 and not 10. Why?
C and C++ compilers implement several sizes of integer (typically 1, 2, 4, and 8 bytes {8, 16, 32, and 64 bits}), but without some helper code to perform arithmetic operations you can't really make arbitrary sized integers.
The declarations you did:
int *ptr;
ptr = (int *)malloc(10);
Made what is probably a broken array of integers. Broken because unless you are on a system where (10 % sizeof(int)) == 0, you have extra bytes at the end which can't be used to store an entire integer.
There are several big-number class libraries you should be able to locate for C++ which implement many of the operations you may want to perform on your 10-byte (80-bit) integers. With C you would have to do the operations as function calls because it lacks operator overloading.
Your sizeof(ptr) evaluated to 4 because you are using a machine that uses 4 byte pointers (a 32 bit system). sizeof tells you nothing about the size of the data that a pointer points to. The only place where this should get tricky is when you use sizeof on an array's name which is different from using it on a pointer. I mention this because arrays names and pointers share so many similarities.
Because on your machine, the size of a pointer is 4 bytes. Please note that the type of the variable ptr is int *. You cannot get the allocated size from the sizeof operator if you malloc or new the memory, because sizeof is a compile-time operator, meaning that its value is evaluated at compile time.
It is showing 4 bytes because a pointer on your platform is 4 bytes. The block of memory the pointer addresses may be of any arbitrary size, in your case it is 10 bytes. You need to create a data structure if you need to track that:
struct VariableInteger
{
int *ptr;
size_t size;
};
Also, using an int type for your ptr variable doesn't mean the language will allow you to do arithmetic operations on anything of a size different than the size of int on your platform.
Because the size of the pointer is 4. Try something like:
typedef struct
{
int a[10];
} big_int_t;
big_int_t x;
printf("%zu\n", sizeof(x));
Note also that an int is typically not 1 byte in size, so this will probably print 20 or 40, depending on your platform.
Integers in C++ are of a fixed size. Do you mean an array of integers? As for sizeof, the way you are using it, it tells you that your pointer is four bytes in size. It doesn't tell you the size of a dynamically allocated block.
Few or no compilers support 10-byte integer arithmetic. If you want to use integers bigger than the values specified in <limits.h>, you'll need to either find a library with support for big integers or make your own class which defines the mathematical operators.
I believe what you're looking for is known as "Arbitrary-precision arithmetic". It allows you to have numbers of any size and any number of decimals. Instead of using fixed-size assembly level math functions, these libraries are coded to do math how one would do them on paper.
Here's a link to a list of arbitrary-precision arithmetic libraries in a few different languages, courtesy of Wikipedia: link.

What does sizeof do?

What is the main function of sizeof (I am new to C++). For instance
int k=7;
char t='Z';
What do sizeof (k) or sizeof (int) and sizeof (char) mean?
sizeof(x) returns the amount of memory (in bytes) that the variable or type x occupies. It has nothing to do with the value of the variable.
For example, if you have an array of some arbitrary type T then the distance between elements of that array is exactly sizeof(T).
int a[10];
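// Compare as char* so that the addition counts bytes, not ints.
assert(reinterpret_cast<char*>(&a[0]) + sizeof(int) == reinterpret_cast<char*>(&a[1]));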
When used on a variable, it is equivalent to using it on the type of that variable:
T x;
assert(sizeof(T) == sizeof(x));
As a rule-of-thumb, it is best to use the variable name where possible, just in case the type changes:
int x;
std::cout << "x uses " << sizeof(x) << " bytes." << std::endl;
// If x is changed to a char, then the statement doesn't need to be changed.
// If we used sizeof(int) instead, we would need to change 2 lines of code
// instead of one.
When used on user-defined types, sizeof still returns the amount of memory used by instances of that type, but it's worth pointing out that this does not necessarily equal the sum of the sizes of its members.
struct Foo { int a; char b; };
While sizeof(int) + sizeof(char) is typically 5, on many machines, sizeof(Foo) may be 8 because the compiler needs to pad out the structure so that it lies on 4 byte boundaries. This is not always the case, and it's quite possible that on your machine sizeof(Foo) will be 5, but you can't depend on it.
To add to Peter Alexander's answer: sizeof yields the size of a value or type in multiples of the size of a char---char being defined as the smallest unit of memory addressable (by C or C++) for a given architecture (and, in C++ at least, at least 8 bits in size according to the standard). This is what's generally meant by "bytes" (smallest addressable unit for a given architecture) but it never hurts to clarify, and there are occasionally questions about the variability of sizeof (char), which is of course always 1.
sizeof() returns the size of the argument passed to it.
sizeof() cpp reference
sizeof is a compile-time unary operator that returns the size of a data type or expression.
For example:
sizeof(int)
will return the size of int in bytes.
Also remember that type sizes are platform dependent.
Check this page for more details: sizeof in C/C++