When I increment an int pointer, its address moves forward by 4 bytes. Why is that? And why does an int pointer take 4 bytes to store, whereas a char takes 2 bytes?
When you increment a pointer of a type A, you move that pointer forward in the memory by the size of the type it points to. On your machine, int takes 4 bytes, so the pointer moves by 4 bytes.
As for "why does int take 4 bytes on my machine?":
The C++ standard says (3.9.1, paragraph 2):
There are five standard signed integer types : “signed char”, “short
int”, “int”, “long int”, and “long long int”. In this list, each type
provides at least as much storage as those preceding it in the list.
<...> Plain ints have the natural size suggested by the architecture
of the execution environment[44]; the other signed integer types are
provided to meet special needs.
[44]: that is, large enough to contain any value in the
range of INT_MIN and INT_MAX, as defined in the header <climits>.
Basically, the sizes of fundamental types are not set in stone, and are implementation-defined. The accepted answer to this SO question has some information about it.
Here is the general rule:
If the type is T, its size N is sizeof(T) bytes. So a pointer of type T* advances by N bytes when you increment it by 1.
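In code:
T *p = getT();
std::uintptr_t diff = reinterpret_cast<std::uintptr_t>(p + 1) - reinterpret_cast<std::uintptr_t>(p); // needs <cstdint>; static_cast cannot convert a pointer to an integer
bool alwaysTrue = (diff == sizeof(T)); // true wherever pointers convert to plain byte addresses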
The size of a pointer does not depend on the type it points to; it is determined by the platform.
On a typical 32-bit system, every data pointer is 4 bytes.
In pointer arithmetic, ptr++ and ptr-- move the pointer by the size of the data type that ptr points to.
char *cptr;
int *iptr;
char c[5];
int a[5];
cptr=c;
iptr=a;
After cptr++, cptr points to c[1]; the pointer advances by only one byte.
You can check the address of each char.
Similarly, after iptr++, iptr points to a[1]; here the pointer advances by 4 bytes (assuming a 4-byte int).
#include <stdio.h>
int main()
{
    int i;
    for(i=0;i<5;i++)
    {
        printf("%p\t",(void*)&c[i]); // byte address: (char*)c + sizeof(char)*i
        printf("%p\n",(void*)&a[i]); // byte address: (char*)a + sizeof(int)*i
    }
    return 0;
}
The size of int and of the other data types is implementation-defined.
Pointers increment by the size in bytes of the things they point to. ints typically take 4 bytes on a 32-bit machine.
Because, on your computer, sizeof (int) == 4, so stepping from one int to the next requires an increment of four bytes.
Most integer types have different sizes on different computers. int must have at least 16 bits, and is supposed to be a "natural" size for the computer. Most 32 or 64-bit platforms choose 32 bits as a "natural" size, and most computers have 8-bit bytes, so 4 bytes is a very common size for int.
However, sizeof (char) == 1 on all computers, so I'm rather surprised that you say "a char takes 2 bytes". It should only take one.
Because the data (an int) that the pointer points to is 4 bytes in size, the pointer is incremented by 4 bytes (the size of an int).
Another example: if you have a structure of size 8 bytes and a pointer to that structure, incrementing the pointer moves it by 8 bytes:
struct test {
    int x;
    int y;
};
struct test ARRAY[50];
struct test *p = ARRAY; // p points to the first element, ARRAY[0], which is 8 bytes in size (with 4-byte ints)
p++; // advances p by sizeof(struct test), here 8 bytes, so p now points to the second element, ARRAY[1]
The value of a pointer is the address of a variable. Why does the value of an int pointer increase by 4 bytes when the pointer is incremented by 1?
I expected the value of the pointer (the address of the variable) to increase by only 1 byte after the increment.
Test code:
int a = 1, *ptr;
ptr = &a;
printf("%p\n", (void *)ptr); // %p expects a void *
ptr++;
printf("%p\n", (void *)ptr);
Expected output:
0xBF8D63B8
0xBF8D63B9
Actual output:
0xBF8D63B8
0xBF8D63BC
EDIT:
Another question - How to visit the 4 bytes an int occupies one by one?
When you increment a T*, it moves sizeof(T) bytes.† This is because it doesn't make sense to move any other value: if I'm pointing at an int that's 4 bytes in size, for example, what would incrementing less than 4 leave me with? A partial int mixed with some other data: nonsensical.
Consider this in memory:
    [↓      ]
[...|0 1 2 3|0 1 2 3|...]
[...|int    |int    |...]
Which makes more sense when I increment that pointer? This:
            [↓      ]
[...|0 1 2 3|0 1 2 3|...]
[...|int    |int    |...]
Or this:
      [↓      ]
[...|0 1 2 3|0 1 2 3|...]
[...|int    |int    |...]
The last doesn't actually point at any sort of int. (Technically, then, using that pointer is UB.)
If you really want to move one byte, increment a char*: the size of char is always one:
int i = 0;
int* p = &i;
char* c = (char*)p;
char x = c[1]; // one byte into an int
†A corollary of this is that you cannot increment void*, because void is an incomplete type.
Pointers are increased by the size of the type they point to. If the pointer points to a char, pointer++ increments it by 1; if it points to a 1234-byte struct, pointer++ increments it by 1234.
This may be confusing the first time you meet it, but it actually makes a lot of sense. It is not a special processor feature: the compiler works it out during compilation, so when you write pointer+1, the compiler generates code that adds sizeof(*pointer) bytes to the address.
As you said, an int pointer points to an int. An int usually takes up 4 bytes, so when you increment the pointer, it points to the "next" int in memory, i.e., its value increases by 4 bytes. It works this way for a type of any size: if you have a pointer to type A, incrementing an A* advances it by sizeof(A).
Think about it: if you incremented the pointer by only 1 byte, it would point into the middle of an int, and I can't think of a situation where that is desired.
This behavior is very convenient when iterating over an array, for example.
The idea is that after incrementing, the pointer points to the next int in memory. Since ints are 4 bytes wide here, it is incremented by 4 bytes. In general, a pointer to type T will increment by sizeof(T).
A pointer points at the beginning of something in memory. An int typically occupies 4 bytes (32 bits) and a double 8 bytes (64 bits). So if you have a double stored and you want, at a very low level, to point to the next available memory location, the pointer has to be increased by 8 bytes. If for some reason you pointed 4 bytes past the start of a double and wrote there, you would corrupt its value. Memory is a very large flat field that has no awareness of itself, so it's up to the software to divide it properly and to "respect the borders" of the items located in that field.
One of the solutions that often turns up in posts on how to determine whether a void * points to aligned memory involves casting the void pointer. That is, say I get a void *ptr = CreateMemory() and now I want to check whether the memory pointed to by ptr is aligned to a multiple of some value, say 16.
See How to determine if memory is aligned? which makes a specific claim.
A pointer p is aligned on a 16-byte boundary iff ((unsigned long)p & 15) == 0.
A solution in a similar vein appears in this post.
How to check if a pointer points to a properly aligned memory location?
Can someone clarify, how does this casting work? I mean, if I had a void *ptr = CreateMem();, it seems to me that (unsigned long)(ptr) would give me the pointer itself reinterpreted as an unsigned long. Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
EDIT: Thanks for all the useful comments. Please bear with me a bit more. Perhaps a simplified example will help me understand this better.
#include <cstdint> // uint8_t
#include <iostream>
using namespace std;
struct __attribute__((aligned(16))) Data0 {
uint8_t a;
};
struct Data1 {
uint8_t a;
};
int main() {
std::cout<<sizeof(Data0) << "\n"; //< --(1)
std::cout<<sizeof(Data1) << "\n";
Data0 ptr0[10];
Data1 ptr1[10];
std::cout<<(unsigned long) (ptr0 + 1) - (unsigned long) ptr0 << "\n"; //< --- (2)
std::cout<<(unsigned long) (ptr1 + 1) - (unsigned long) ptr1 << "\n";
return 0;
}
To date, I have always interpreted aligned memory as having the following two requirements. sizeof() should return a multiple of the specified size (see condition (1)). And when incrementing a pointer into an array of the aligned struct, the stride should also be a multiple of the specified size (see condition (2)).
So I am somewhat surprised to see there is a third requirement on the actual value of ptr0 as well. Or I might have misunderstood this entirely. Would the memory pointed to by ptr0 in the above example be considered aligned and not so for ptr1?
As I am typing this, I realize I do not actually understand what aligned memory itself means. Since I have mostly dealt with this while allocating CUDA buffers, I tend to relate it to some sort of padding required for my data structs.
Consider a second example. That of aligned_alloc. https://en.cppreference.com/w/c/memory/aligned_alloc
Allocate size bytes of uninitialized storage whose alignment is specified by alignment.
I am not sure how to interpret this. Say, I do a void *p0 = aligned_alloc(16, 16 * 2), what is different about the memory pointed to by p0 as compared to say p1 where std::vector<char> buffer(32); char *p1 = buffer.data().
(unsigned long)(ptr): casting a pointer to an unsigned integer gives you the memory address the pointer holds.
p & 15 is equivalent to p % 16, i.e. the remainder of the division by 16. If it's 0, the address is a multiple of 16.
((unsigned long)(ptr) & 15) == 0 is therefore true if the memory is aligned to a 16-byte boundary.
uintptr_t is a better type for such casting, as it should be more platform-agnostic.
Why would the value of this reinterpreted void pointer have any bearing on the alignment of the memory pointed to?
Because on common architectures, converting a void pointer to an unsigned long yields the address of the memory pointed to by the pointer, or, because of the way conversion to an unsigned type works, the low bits of that address if it is too big to fit in an unsigned long. So it is enough for testing alignment, because for that you only need the lowest bits of the address.
I am not sure whether this is really mandated by the standard, because the actual representation of a memory address is left to the implementation, and I can remember the segment+offset representation used by 16-bit Intel processors. On those architectures the addressable memory used 20 bits, and a far pointer (able to represent any address) was represented as a 32-bit unsigned value where the high 16 bits were the segment and the low 16 bits were the offset. Those were then combined as:
Segment  S S S S _  (4 * 4 = 16 bits)
Offset   _ O O O O  (4 * 4 = 16 bits)
Address  A A A A A  (5 * 4 = 20 bits)
You can see that this architecture did not make it easy to test for alignment higher than 16. For example, the pointer value 0x00010040 is divisible by 64, but the actual address is 0x00050 (80), which is only divisible by 16.
But only dinosaurs can remember the Intel 8086 and its segmented address representation, and even with it, testing alignment up to 16 bytes was straightforward...
Here is the problem program:
#include <stdio.h>
int main()
{
int apricot[2][3][5];
int (*r)[5]=apricot[0];
int *t=apricot[0][0];
printf("%p\n%p\n%p\n%p\n",(void*)r,(void*)(r+1),(void*)t,(void*)(t+1));
}
The output of it is:
# ./a.out
0xbfa44000
0xbfa44014
0xbfa44000
0xbfa44004
I think the value for t's dimension should be 5 because t is the last dimension, and that matches (0xbfa44004-0xbfa44000+1=5).
But the value for r's dimension is 0xbfa44014-0xbfa44000+1=21. I think it should be 3*5=15, because 3 and 5 are the last two dimensions, so why is the difference 21?
r is a pointer to an array of 5 ints.
Assuming 1 int is 4 bytes on your system (judging from t and t+1), "stepping" that pointer by 1 (r+1) means an increase of 5*4 = 20 bytes, which is what you get here.
You are being tricked by the C syntax. r is a pointer to an array of int; t is a plain int pointer. When doing any kind of pointer arithmetic, you do it in units of the pointed-at type.
Thus t+1 means the address in t plus the size of one pointed-at object. Since t points at int and int is 4 bytes on your system, you get an address 4 bytes from t.
The same rule applies to r. It is a pointer to an array of 5 int. When you do pointer arithmetic on it with r+1, the address advances by the size of the pointed-at object, which is 5*sizeof(int), or 20 bytes on your computer. So r+1 gives you an address 20 bytes (0x14) from r.
When I run sizeof(r) on my Mac, it says sizeof(r) = 1. My understanding is that the size of a union is the size of its largest element. In this case, shouldn't the largest element be the struct s?
union
{
struct
{
char i:1;
char j:2;
char m:3;
}s;
char ch;
}r;
Your union is composed of two parts, a struct and a char. Since the members share memory, the size of the union is the size of its largest member, plus any padding the compiler adds (which in your case is 0 bytes).
First, let's see the size ideone reports for each:
http://ideone.com/LAhop
Okay, both are 1. Therefore, the union's size must be 1 as well.
The struct is composed of bitfields. One is 1 bit, one is 2, and one is 3. This gives a total of 6 out of the 8 bits in one byte. Since it has to be at least one byte anyway (bitfields aren't really sized in bits), the size is 1.
As for char, here's what the C++11 standard says in § 3.9.1/1 [basic.fundamental]:
Objects declared as characters (char) shall be large enough to store any member
of the implementation’s basic character set.
For pretty much every platform, this is one byte.
This is one byte.
The struct s takes up 1 + 2 + 3 = 6 bits, which fit into 1 byte, and it is unioned with a char, which is 1 byte. Hence the answer: 1 byte.