I'm trying to understand C pointers, I set up the following example:
int main()
{
int temp[] = {45,67,99};
int* address;
address = &temp[0]; //equivalent to address = temp;
std::cout << "element 0: " << *address << std::endl;
address = (address + sizeof(int));
std::cout << "element 1: " << *address << std::endl;
std::getchar();
return 0;
}
Now, the first element is printed correctly while the second one is garbage. I know that I can just use:
address = (address + 1);
in order to move the pointer to the second element of the array, but I don't understand why the first method is wrong.
The operator sizeof returns the size of the object in bytes. So (depending on your compiler) you may have just done
address + sizeof(int) == address + 4
This will obviously access out of bounds when you dereference the pointer. The problem is that the way you are trying to use sizeof is already accounted for in the pointer arithmetic. When you add an integral (e.g. 1) to an int* it already knows to move 1 int over to the next address.
Incrementing a pointer by N will not increment it by N bytes, but by N * sizeof(pointed_type) bytes.
So when you do:
address = (address + sizeof(int));
You are incrementing the address by (probably) 4 times the size of int, which is past your original array.
What you intended to do is:
address = reinterpret_cast<int*>(reinterpret_cast<char*>(address) + sizeof(int));
Which is pretty horrendous.
An array is a group of elements of the Same Type mapped consecutively in memory one-next to the other.
The size of the array is:
sizeof(any element) * the number of elements
So the vice-vers to get the number of elements of an array you can:
Number of elements = sizeof(array) / sizeof(any element).
These elements are addressable and each one is addressed with the first byte's address.
int a[] = {1, 10, 100, 1000, 10000};
std::cout << sizeof(int) << std::endl; // 4
std::cout << sizeof(a) << std::endl; // 20: 5 * 4
The addresses:
cout << &a[0] << " : " << &a[1] << " : " <<
&a[2] << " : " << &a[3] << endl;
The outPut:
0018FF38 : 0018FF3C : 0018FF40 : 0018FF44
As you can see the address is incrementing by sizeof(int) // 4
To do it on your way:
for(int* tmp = array; tmp <= a + nElement; tmp++)
cout << tmp << " : ";
The result:
0018FF38 : 0018FF3C : 0018FF40 : 0018FF44
As you can see the result is identical.
To move from one element to another just increment / decrement the address as I did above:
tmp++; // incrementing the address by 1 means moving to the next element, element means an integer which means 4 bytes. 1 here is not a single byte but a unit or element which is 4 byte here;
To move to the n-element just:
tmp += n;
In your example:
address = (address + sizeof(int)); :
address = address + 4; // moving to the 0 + 4 element which means fifth element (here outbound).
Related
Say I declared an arrayint a[]={1,2,3,4,5};, when I do (*(&a+1)-a), it prints 5.
I came to know that *(&a+1) takes me to the end of an array, and as sizeof(a)=20.
So does pointer arithmetic takes me ahead of size of allocated container?
Also I am little bit confused on pointer arithmetic, why it prints 5 rather than 20?
The result of pointer arithmetic is is in units of the dereferenced type.
If you have a pointer to an int then the units will be in int elements.
If you have a pointer to an int[5] then the units will be in int[5] elements, which are exactly 5 times as big.
For your program above, a and &a will have the same numerical value,and I believe that's where your whole confusion lies.You may wonder that if they are the same,the following should give the next address after a in both cases,going by pointer arithmetic:
(&a+1) and (a+1)
But it's not so!!Base address of an array (a here) and Address of an array are not same! a and &a might be same numerically ,but they are not the same type. a is of type * while &a is of type (*)[5],ie , &a is a pointer to (address of ) and array of size 5.But a as you know is the address of the first element of the array.Numerically they are the same as you can see from the illustration using ^ below.
1 2 3 4 5
^ // ^ stands at &a
1 2 3 4 5
^ // ^ stands at (&a+1)
1 2 3 4 5
^ //^ stands at a
1 2 3 4 5
^ // ^ stands at (a+1)
Hope this will clear the doubts.
Pointer arithmetic is not same as arithmetic operation (addition/subtraction) between hexadecimal values. Following example demonstrates both.
int main()
{
int a[] = { 1, 2, 3, 4, 5 };
int * pintx = *(&a + 1);
int * pinty = a;
cout << "pintx = " << pintx << endl;
cout << "pinty = " << pinty << endl;
cout << "Pointer Arithmetic : Ans = " << (*(&a + 1) - a) << endl;
// Prints 5
cout << "Pointer Arithmetic : Ans = " << (pintx - pinty) << endl;
// Save as above, Print 5
cout << "Hexadecimal Subtraction: Ans = " << ((int)pintx - (int)pinty) << endl;
// Prints 20, as you expect
return 0;
}
Hope this helps.
So does pointer arithmetic takes me ahead of size of allocated container?
No. Not the "size of allocated container", but the size of the dereferenced type. sizeof(a) is 20 because as an object the type of a is int[5] instead of int*, and the type of &a is int[5]*. It will be clearer if I rewrite your example as below:
typedef int Int5[5];
Int5 a = { 1, 2, 3, 4, 5 };
Also I am little bit confused on pointer arithmetic, why it prints 5 rather than 20?
(*(&a+1)-a) is 5 because in this case a is interpreted as int*.
I have a data type called 'tod', with which I create arrays. To iterate between elements of this type inside function (with pointer), I want to use an offset. However, this offset gets squared during operation:
tod theTime[] = { {12,0,0, "noon"}, {0,0,0, "midnight"}, {11,30,0, "lunch time"}, {18,45,0, "supper time"}, {23,59,59, "bed time"} };
auto tsize = sizeof(tod);
auto p1 = &theTime[0];
auto p2 = &theTime[0] + tsize;
cout << "size of tod = " << tsize << endl;
cout << "p1 = " << p1 << endl;
cout << "p2 = " << p2 << endl;
That gives me:
size of tod = 44
p1 = 0x7ffd3e3b0bf0
p2 = 0x7ffd3e3b1380
The difference between the two hex values comes down to 0x790, which is 1936 (decimal) and 44^2. How is this happening? Someone please help.
You're misunderstanding about pointer arithmetic:
If the pointer P points to the ith element of an array, then the expressions P+n, n+P, and P-n are pointers of the same type that point to the i+nth, i+nth, and i-nth element of the same array, respectively.
e.g. &theTime[0] + 1 will return the pointer pointing to the 2nd element of the array theTime; then &theTime[0] + tsize will try to return the pointer pointing to the tsize + 1th element (note it's getting out of the bound and leads to UB). That's why you got 1936, i.e. 44 * 44.
I want to compare the memory address and pointer value of p, p + 1, q , and q + 1.
I want to understand, what the following values actually mean. I can't quite wrap my head around whats going on.
When I run the code:
I get an answer of 00EFF680 for everytime I compare the adresss p with another pointer.
I get an answer of 00EFF670 for everytime I compare the address of q with another pointer.
I get an answer of 15726208 when I look at the pointer value of p.
And I get an answer of 15726212 When I look at the pointer value of p + 1.
I get an answer of 15726192 when I look at the pointer value of q
And I get an answer of 15726200 Wehn I look at the pointer value of q + 1.
Code
#include <iostream>
#include <string>
using namespace std;
int main()
{
int val = 20;
double valD = 20;
int *p = &val;
double *q;
q = &valD;
cout << "Memory Address" << endl;
cout << p == p + 1;
cout << endl;
cout << q == q + 1;
cout << endl;
cout << p == q;
cout << endl;
cout << q == p;
cout << endl;
cout << p == q + 1;
cout << endl;
cout << q == p + 1;
cout << endl;
cout << "Now Compare Pointer Value" << endl;
cout << (unsigned long)(p) << endl;
cout << (unsigned long) (p + 1) << endl;
cout << (unsigned long)(q) << endl;
cout << (unsigned long) (q + 1) << endl;
cout <<"--------" << endl;
return 0;
}
There are a few warnings and/or errors.
The first is that overloaded operator << has higher precedence than the comparison operator (on clang++ -Woverloaded-shift-op-parentheses is the flag).
The second is that there is a comparison of distinct pointer types ('int *' and 'double *').
For the former, parentheses must be placed around the comparison to allow for the comparison to take precedence. For the latter, the pointers should be cast to a type that allows for safe comparison (e.g., size_t).
For instance on line 20, the following would work nicely.
cout << ((size_t) p == (size_t) (q + 1));
As for lines 25-28, this is standard pointer arithmetic. See the explanation here.
As to your question:
I want to compare p, p +1 , q , and q + 1. And Understand what the results mean.
If p is at address 0x80000000 then p+1 is at address 0x80000000 + sizeof(*p). If *p is int then this is 0x80000000 + 0x8 = 0x80000008. And the same reasoning applies for q.
So if you do p == p + 1 then compiler will first do the additon: p+1 then comparison, so you will have 0x80000000 == 0x80000008 which results in false.
Now to your code:
cout << p == p + 1;
is actually equivalent to:
(cout << p) == p + 1;
and that is because << has higher precedence than ==. Actually you should get a compilation error for this.
Another thing is comparision of pointers of non related types like double* with int*, without cast it should not compile.
In C and C++ pointer arithmetic is very closely tied with array manipulation. The goal is that
int array[3] = { 1, 10, 100 };
int *ptr = { 1, 10, 100 };
std::cout << array[2] << '\n';
std::cout << *(ptr + 2) << '\n';
outputs two 100s. This allows the language to treat arrays and pointers as equivalent - that's not the same thing as "the same" or "equal", see the C FAQ for clarification.
This means that the language allows:
int array[3] = { 1, 10, 100 };
int *ptr = { 1, 10, 100 };
And then
std::cout << (void*)array << ", " << (void*)&array[0] << '\n';
outputs the address of the first element twice, the first array behaves like a pointer.
std::cout << (void*)(array + 1) << ", " << (void*)&array[1] << '\n';
prints the address of the second element of array, again array behaving like a pointer in the first case.
std::cout << ptr[2] << ", " << *(ptr + 2) << '\n';
prints element #3 of ptr (100) twice, here ptr is behaving like an array in the first use,
std::cout << (void*)ptr << ", " << (void*)&ptr[0] << '\n';
prints the value of ptr twice, again ptr behaving like an array in the second use,
But this can catch people unaware.
const char* h = "hello"; // h points to the character 'h'.
std::cout << (void*)h << ", " << (void*)(h+1);
This prints the value of h and then a value one higher. But this is purely because the type of h is a pointer to a one-byte-sized data type.
h + 1;
is
h + (sizeof(*h)*1);
If we write:
const char* hp = "hello";
short int* sip = { 1 };
int* ip = { 1 };
std::cout << (void*)hp << ", " << (void*)(hp + 1) << "\n";
std::cout << (void*)sip << ", " << (void*)(sip + 1) << "\n";
std::cout << (void*)ip << ", " << (void*)(ip + 1) << "\n";
The first line of output will show two values 1 byte (sizeof char) apart, the second two values will be 2 bytes (sizeof short int) apart and the last will be four bytes (sizeof int) apart.
The << operator invokes
template<typename T>
std::ostream& operator << (std::ostream& stream, const T& instance);
The operator itself has very high precedence, higher than == so what you are actually writing is:
(std::cout << p) == p + 1
what you need to write is
std::cout << (p == p + 1)
this is going to print 0 (the result of int(false)) if the values are different and 1 (the result of int(true)) if the values are the same.
Perhaps a picture will help (For a 64bit machine)
p is a 64bit pointer to a 32bit (4byte) int. The green pointer p takes up 8 bytes. The data pointed to by p, the yellow int val takes up 4 bytes. Adding 1 to p goes to the address just after the 4th byte of val.
Similar for pointer q, which points to a 64bit (8byte) double. Adding 1 to q goes to the address just after the 8th byte of valD.
If you want to print the value of a pointer, you can cast it to void *, for example:
cout << static_cast<void*>(p) << endl;
A void* is a pointer of indefinite type. C code uses it often to point to arbitrary data whose type isn’t known at compile time; C++ normally uses a class hierarchy for that. Here, though, it means: treat this pointer as nothing but a memory location.
Adding an integer to a pointer gets you another pointer, so you want to use the same technique there:
cout << static_cast<void*>(p+1) << endl;
However, the difference between two pointers is a signed whole number (the precise type, if you ever need it, is defined as ptrdiff_t in <cstddef>, but fortunately you don’t need to worry about that with cout), so you just want to use that directly:
cout << (p+1) - p << endl;
cout << reinterpret_cast<char*>(p+1) - reinterpret_cast<char*>(p) << endl;
cout << (q - p) << endl;
That second line casts to char* because the size of a char is always 1. That’s a big hint what’s going on.
As for what’s going on under the hood: compare the numbers you get to sizeof(*p) and sizeof(*q), which are the sizes of the objects p and q point to.
The pointer values that are printed are likely to change on every execution (see why the addresses of local variables can be different every time and Address Space Layout Randomization)
I get an answer of 00EFF680 for everytime I compare the adresss p with another pointer.
int val = 20;
double valD = 20;
int *p = &val;
cout << p == p + 1;
It is translated into (cout << p) == p + 1; due to the higher precedence of operator << on operator ==.
It print the hexadecimal value of &val, first address on the stack frame of the main function.
Note that in the stack, address are decreasing (see why does the stack address grow towards decreasing memory addresses).
I get an answer of 00EFF670 for everytime I compare the address of q with another pointer.
double *q = &valD;
cout << q == q + 1;
It is translated into (cout << q) == q + 1; due to the precedence of operator << on operator ==.
It prints the hexadecimal value of &valD, second address on the stack frame of the main function.
Note that &valD <= &val - sizeof(decltype(valD) = double) == &val - 8 since val is just after valD on the stack. It is a compiler choice that respects some alignment constraints.
I get an answer of 15726208 when I look at the pointer value of p.
cout << (unsigned long)(p) << endl;
It just prints the decimal value of &val
And I get an answer of 15726212 When I look at the pointer value of p + 1.
int *p = &val;
cout << (unsigned long) (p + 1) << endl;
It prints the decimal value of &val + sizeof(*decltype(p)) = &val + sizeof(int) = &val + 4 since on your machine int = 32 bits
Note that if p is a pointer to type t, p+1 is p + sizeof(t) to avoid memory overlapping in array indexing.
Note that if p is a pointer to void, p+1 should be undefined (see void pointer arithmetic)
I get an answer of 15726192 when I look at the pointer value of q
cout << (unsigned long)(q) << endl;
It prints the decimal value of &valD
And I get an answer of 15726200 Wehn I look at the pointer value of q + 1.
cout << (unsigned long) (q + 1) << endl;
It prints the decimal value of &val + sizeof(*decltype(p)) = &valD + sizeof(double) = &valD + 8
When print a pointer without * or & shows an address, I don't know what is this address.
For example:
int *n;
int num = 10;
n = #
cout << n << endl; // Prints 0020F81C
cout << &n << endl; // Prints 0020f828
The result:
I know cout << &n << endl; to print the address of place in the memory.
But what about cout << n << endl; ?
n is referring to "&num", therefore it's the address to the place in memory where num is pointing to.
You've answered your own question. It is an address. The address in memory of whatever the pointer points to, or zero: in this case, the address of 'num'.
Your second 'cout' prints the address of the pointer itself.
#include <iostream>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; //Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
//These two should be the same
cout << "start address of the object: " << &te << endl;
cout << "address of i member: " << &te.i << endl;
//These two should be the same
cout << "start address of the double field: " << &te.h << endl;
cout << "calculate the offset of the double field: " << (&te + sizeof(double)) << endl; //NOT THE SAME
return 0;
}
Output:
size of an int: 4
size of a double: 8
size of test: 24
start address of the object: 0x7fffb9fd44e0
address of i member: 0x7fffb9fd44e0
start address of the double field: 0x7fffb9fd44e8
calculate the offset of the double field: 0x7fffb9fd45a0
Why do the last two lines produce different values? Something I am doing wrong with pointer arithmetic?
(&te + sizeof(double))
This is the same as:
&((&te)[sizeof(double)])
You should do:
(char*)(&te) + sizeof(int)
You are correct -- the problem is with pointer arithmetic.
When you add to a pointer, you increment the pointer by a multiple of that pointer's type
Therefore, &te + 1 will be 24 bytes after &te.
Your code &te + sizeof(double) will add 24 * sizeof(double) or 192 bytes.
Firstly, your code is wrong, you'd want to add the size of the fields before h (i.e. an int), there's no reason to assume double. Second, you need to normalise everything to char * first (pointer arithmetic is done in units of the thing being pointed to).
More generally, you can't rely on code like this to work. The compiler is free to insert padding between fields to align things to word boundaries and so on. If you really want to know the offset of a particular field, there's an offsetof macro that you can use. It's defined in <stddef.h> in C, <cstddef> in C++.
Most compilers offer an option to remove all padding (e.g. GCC's __attribute__ ((packed))).
I believe it's only well-defined to use offsetof on POD types.
struct test
{
int i;
int j;
double h;
};
Since your largest data type is 8 bytes, the struct adds padding around your ints, either put the largest data type first, or think about the padding on your end! Hope this helps!
&te + sizeof(double) is equivalent to &te + 8, which is equivalent to &((&te)[8]). That is — since &te has type test *, &te + 8 adds eight times the size of a test.
You can see what's going on more clearly using the offsetof() macro:
#include <iostream>
#include <cstddef>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; // Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
cout << "i: size = " << sizeof te.i << ", offset = " << offsetof(test, i) << endl;
cout << "h: size = " << sizeof te.h << ", offset = " << offsetof(test, h) << endl;
cout << "j: size = " << sizeof te.j << ", offset = " << offsetof(test, j) << endl;
return 0;
}
On my system (x86), I get the following output:
size of an int: 4
size of a double: 8
size of test: 16
i: size = 4, offset = 0
h: size = 8, offset = 4
j: size = 4, offset = 12
On another system (SPARC), I get:
size of an int: 4
size of a double: 8
size of test: 24
i: size = 4, offset = 0
h: size = 8, offset = 8
j: size = 4, offset = 16
The compiler will insert padding bytes between struct members to ensure that each member is aligned properly. As you can see, alignment requirements vary from system to system; on one system (x86), double is 8 bytes but only requires 4-byte alignment, and on another system (SPARC), double is 8 bytes and requires 8-byte alignment.
Padding can also be added at the end of a struct to ensure that everything is aligned properly when you have an array of the struct type. On SPARC, for example, the compile adds 4 bytes pf padding at the end of the struct.
The language guarantees that the first declared member will be at an offset of 0, and that members are laid out in the order in which they're declared. (At least that's true for simple structs; C++ metadata might complicate things.)
Compilers are free to space out structs however they want past the first member, and usually use padding to align to word boundaries for speed.
See these:
C struct sizes inconsistence
Struct varies in memory size?
et. al.