Self-referential pointer arithmetic - c++

So given the following code:
#include <iostream>
#include <vector>
int main(int argc, char* argv[]) {
int i = 42;
int* p = &i;
std::cout << "*p: " << *p << std::endl;
std::cout << "&p: " << &p << std::endl;
std::cout << "p: " << p << std::endl;
std::cout << "p + 1: " << (p + 1) << std::endl;
std::cout << "p + 1: " << ((p + 1) == (int*)(&p)) << std::endl;
std::cout << "*(p + 1): " << *(p + 1) << std::endl;
return 0;
}
It might produce the following output:
*p: 42
&p: 0x7fff38d8a888
p: 0x7fff38d8a884
p + 1: 0x7fff38d8a888
p + 1: 1
*(p + 1): 953723012
Is (p + 1) a pointer to the memory location where p is stored? Is it possible to get the value pointed to by p this way?

p is the pointer to an int object.
&p is the address of p.
The stack from your example looks like:
Address Type Name Value
0x7fff38d8a884 int i 42
0x7fff38d8a888 int* p 0x7fff38d8a884
Given the way the stack happens to be laid out here, p is stored right after i. In this particular case, adding 1 to p moves 4 bytes (sizeof(int)) further and lands on the storage of p itself, whose value happens to be the address of i.
What is happening in the line
std::cout << "p + 1: " << ((p + 1) == (int*)(&p)) << std::endl;
is p+1 --> compiler gets address for the "second element" of array p
(int*)(&p) --> &p is an int**, but is being cast to an int*; in this particular instance, that happens to be the same address as the value stored in p plus 4 bytes
What is happening in the line
std::cout << "*(p + 1): " << *(p + 1) << std::endl;
is *(p+1) --> the compiler accesses the "second element" of the "array" p. Because you are likely using an x86_64 system, which is little-endian, the 4 bytes stored there hold 0x38D8A884, the lower half of the pointer stored in p (which is 953723012 in decimal).

In your example (p + 1) does not point to any storage you have allocated, so dereferencing it produces undefined behavior and should be avoided.
EDIT: Also, your second output for (p + 1) itself is unreliable, since pointer arithmetic should be used only if the pointer is a pointer to an array. Consequently, the expression evaluates to false on my machine.

If you remember that pointers and arrays can be used interchangeably, you might figure out that e.g.
p[1]
is the same as
*(p + 1)
That means that the expression (p + 1) is a pointer to the int that would follow the one p points to. As p doesn't point into an array, (p + n) for n greater than 1 is a pointer to something you haven't allocated (it's out of bounds), and reading a value through (p + n) for any positive n leads to undefined behavior. Assigning to it is also undefined behavior, and can even overwrite other variables' data.
To get the address of where p is stored, you use the address-of operator: &p. That returns a pointer to the pointer (i.e. of type int **).

While the standard gives you no guarantee that ((p + 1) == (int*)(&p)) you seem to be lucky here.
Yet since you are on a 64-bit machine, when dereferencing (p+1) you get only the lower 32 bits of p.
0x38D8A884 == 953723012
The right hand side of the equation is the output that you received. The left hand side is the lower 32 bits of p as witnessed by the output of your program.

No.
Pointer arithmetic, although unchecked, is very limited by the Standard. In general, it should only be used within an array, and you may use it to point to either an array element or one past the end of the array. Furthermore, although pointing one past the end of an array is allowed, the so-obtained pointer is a sentinel value which should not be dereferenced.
So, what is it that you observe? Simply put, &p, p + 1, etc. are temporary expressions whose results have to be materialized somewhere. With optimizations on, said results would probably be materialized in CPU registers, but without them they are (in general) materialized on the stack within the function frame.
Of course, this location is not prescribed by the Standard, so trying to obtain it produces undefined behavior; the fact that it appears to work on your compiler with this set of compile options means nothing for any other compiler, or even for this very same compiler with any other set of options.
That is the true meaning of undefined behavior: it does not mean the program crashes, it just means anything may happen, and this encompasses the "seems to work" situations.

It is pure coincidence that p + 1 is equal to &p. It happens only in code such as yours, where the pointer p is placed directly after the object it points to, that is, where the address of p itself is sizeof( int ) greater than the address of the object it points to. If, for example, you insert one more definition between i and p, the equality p + 1 == &p may no longer hold. For example
int i = 42;
int j = 62;
int* p = &i;

p just so happened to get allocated on the stack at the address right after (well, 4 bytes after) the address of the integer i. some_ptr+1 (which in bytes is really some_ptr + 1*sizeof(int)) is not a consistent way to get the address of some_ptr; it is just a coincidence in this case.
So, to answer your question: in general, some_ptr+1 != &some_ptr.

Related

char pointer doesn't increment?

after ptr++ pointer does not increment
#include <iostream>
int main() {
char *ptr;
char ch = 'A';
ptr = &ch;
std::cout << "pointer :" << &ptr << "\n";
ptr++;
std::cout << "pointer after ++ :" << &ptr << "\n";
return 0;
}
ikar$ g++ pointer_arth.cpp
ikar$ ./a.out
pointer :0x7ffeed9f19a0
pointer after ++ :0x7ffeed9f19a0
ikar$
You're incrementing the pointer, but outputting the address of the variable that holds the pointer itself (&ptr). You should output just ptr (and format it accordingly - see edit below).
Example:
#include <iostream>
int main() {
char data;
char *ptr = &data;
std::cout << "pointer:" << (unsigned long long)ptr << std::endl;
ptr++;
std::cout << "pointer incremented: " << (unsigned long long)ptr << std::endl;
}
Output:
pointer:140732831185615
pointer incremented: 140732831185616
Yes, printing just ptr will output garbage, so I converted the pointer to an integer (since pointers are memory addresses anyway).
As suggested in the comments, you can cast the pointer to void * when printing, which gives nicer formatting:
pointer:0x7ffee5467acf
pointer incremented: 0x7ffee5467ad0
Note how 0x7ffee5467acf == 140732745022159 != 140732831185615 - you'll get different outputs on each run because the kernel will load the executable into different places in memory.
EDIT: yes, the first version of this answer, about simply outputting ptr with std::cout << ptr, was incorrect, because the << operator is overloaded in such a way that it treats pointers to char as C-strings. Thus, that version would access potentially invalid memory and output garbage.
But the concept remains the same. Pointers to int, for example, don't have this "problem" and are printed as hexadecimal numbers, even without casting them to void *. The output then shows that pointers are still incremented correctly by sizeof(int), which equals 4 on that machine.
The pointer is incremented successfully in your code.
What you print is the address of the location that holds the pointer variable, which does not change.
If you print ptr itself, cout treats it as a C-string, so you see garbage after the character 'A'; pointing to such unmanaged memory is not good.

How come incrementing a pointer gives a random number as opposed to a memory address when?

How come when I increment a pointer and then dereference it I get a random number?
Here is my code:
#include <iostream>
using namespace std;
int main(){
int reference = 10;
int *health = &reference;
int *health1 = health;
cout << "Health Address: " << health <<
"\nHealth1 Address: " << health1 <<
"\nReference Address: " << &reference << endl;
health1++;
cout << "Health1 value after being incremented then dereferenced: " << *health1 << endl;
}
My output is:
Health Address: 0x7fff5e930a9c
Health1 Address: 0x7fff5e930a9c
Reference Address: 0x7fff5e930a9c.
Health1 value after being incremented then dereferenced: 197262882
I was expecting to get a 0 since the next value of the next memory address would be null, but that is not the case in this situation.
I was expecting to get a 0 since the next spot in memory would be null, but that is not the case in this situation.
Your expectation is wrong. After you have incremented your pointer, it no longer points to a valid memory buffer. Dereferencing it invokes undefined behavior.
After you increase the pointer, it points to the memory not allocated and initialized by your program, so it's not null as you expected.
Each time you run your program, you may get a random integer.
I think your misunderstanding comes from the fact that you expect the pointer to point at the address of the next element of an array:
int myarray[] = { 1, 2, 3, 4, 5 };
int* parray = myarray;
std::cout << *parray << std::endl;
parray++; // increment the pointer, now points at the address of number 2
std::cout << *parray << std::endl;
But since there is no array in your example the incremented pointer points at the next memory block. The size in bytes of the type it points to is added to the pointer. What value lies there is anyone's guess:
int myvalue = 10;
int* mypointer = &myvalue;
mypointer++;
Dereferencing such pointer is undefined behavior. When you exceed the memory allocated to your process you will not get a pointer null value or a dereferenced value of 0.
Suppose X is the value of a pointer T *p;.
If p is incremented then, p will point to address X + sizeof(T)
*p will then give you the value stored at address X + sizeof(T). Now depending upon the validity of address X + sizeof(T), you will get the result which can be Undefined Behavior in case X + sizeof(T) is invalid.
(*health1)++;
If you meant to increment the value, you need to dereference the pointer first, as shown above; health1++ increments the address itself and not the value it is pointing at.

Comparison Of Pointers

I want to compare the memory address and pointer value of p, p + 1, q , and q + 1.
I want to understand what the following values actually mean. I can't quite wrap my head around what's going on.
When I run the code:
I get an answer of 00EFF680 every time I compare the address p with another pointer.
I get an answer of 00EFF670 every time I compare the address of q with another pointer.
I get an answer of 15726208 when I look at the pointer value of p.
And I get an answer of 15726212 when I look at the pointer value of p + 1.
I get an answer of 15726192 when I look at the pointer value of q.
And I get an answer of 15726200 when I look at the pointer value of q + 1.
Code
#include <iostream>
#include <string>
using namespace std;
int main()
{
int val = 20;
double valD = 20;
int *p = &val;
double *q;
q = &valD;
cout << "Memory Address" << endl;
cout << p == p + 1;
cout << endl;
cout << q == q + 1;
cout << endl;
cout << p == q;
cout << endl;
cout << q == p;
cout << endl;
cout << p == q + 1;
cout << endl;
cout << q == p + 1;
cout << endl;
cout << "Now Compare Pointer Value" << endl;
cout << (unsigned long)(p) << endl;
cout << (unsigned long) (p + 1) << endl;
cout << (unsigned long)(q) << endl;
cout << (unsigned long) (q + 1) << endl;
cout <<"--------" << endl;
return 0;
}
There are a few warnings and/or errors.
The first is that overloaded operator << has higher precedence than the comparison operator (on clang++ -Woverloaded-shift-op-parentheses is the flag).
The second is that there is a comparison of distinct pointer types ('int *' and 'double *').
For the former, parentheses must be placed around the comparison to allow for the comparison to take precedence. For the latter, the pointers should be cast to a type that allows for safe comparison (e.g., size_t).
For instance on line 20, the following would work nicely.
cout << ((size_t) p == (size_t) (q + 1));
As for lines 25-28, this is standard pointer arithmetic. See the explanation here.
As to your question:
I want to compare p, p +1 , q , and q + 1. And Understand what the results mean.
If p holds the address 0x80000000 then p+1 is the address 0x80000000 + sizeof(*p). If *p is a 4-byte int (as on your machine) then this is 0x80000000 + 0x4 = 0x80000004. The same reasoning applies to q, with sizeof(double), which is typically 8.
So if you do p == p + 1, the compiler will first do the addition p+1 and then the comparison, so you will have 0x80000000 == 0x80000004, which results in false.
Now to your code:
cout << p == p + 1;
is actually equivalent to:
(cout << p) == p + 1;
and that is because << has higher precedence than ==. Actually you should get a compilation error for this.
Another thing is the comparison of pointers of unrelated types like double* with int*; without a cast it should not compile.
In C and C++ pointer arithmetic is very closely tied with array manipulation. The goal is that
int array[3] = { 1, 10, 100 };
int *ptr = new int[3]{ 1, 10, 100 };
std::cout << array[2] << '\n';
std::cout << *(ptr + 2) << '\n';
outputs two 100s. This allows the language to treat arrays and pointers as equivalent - that's not the same thing as "the same" or "equal", see the C FAQ for clarification.
This means that the language allows:
int array[3] = { 1, 10, 100 };
int *ptr = new int[3]{ 1, 10, 100 };
And then
std::cout << (void*)array << ", " << (void*)&array[0] << '\n';
outputs the address of the first element twice, the first array behaves like a pointer.
std::cout << (void*)(array + 1) << ", " << (void*)&array[1] << '\n';
prints the address of the second element of array, again array behaving like a pointer in the first case.
std::cout << ptr[2] << ", " << *(ptr + 2) << '\n';
prints element #3 of ptr (100) twice, here ptr is behaving like an array in the first use,
std::cout << (void*)ptr << ", " << (void*)&ptr[0] << '\n';
prints the value of ptr twice, again ptr behaving like an array in the second use,
But this can catch people unaware.
const char* h = "hello"; // h points to the character 'h'.
std::cout << (void*)h << ", " << (void*)(h+1);
This prints the value of h and then a value one higher. But this is purely because the type of h is a pointer to a one-byte-sized data type.
h + 1;
is
h + (sizeof(*h)*1);
If we write:
const char* hp = "hello";
short int* sip = new short int[1]{ 1 };
int* ip = new int[1]{ 1 };
std::cout << (void*)hp << ", " << (void*)(hp + 1) << "\n";
std::cout << (void*)sip << ", " << (void*)(sip + 1) << "\n";
std::cout << (void*)ip << ", " << (void*)(ip + 1) << "\n";
The first line of output will show two values 1 byte (sizeof char) apart, the second two values will be 2 bytes (sizeof short int) apart and the last will be four bytes (sizeof int) apart.
The << operator invokes
template<typename T>
std::ostream& operator << (std::ostream& stream, const T& instance);
The operator itself has very high precedence, higher than == so what you are actually writing is:
(std::cout << p) == p + 1
what you need to write is
std::cout << (p == p + 1)
this is going to print 0 (the result of int(false)) if the values are different and 1 (the result of int(true)) if the values are the same.
Perhaps a picture will help (for a 64-bit machine).
p is a 64-bit pointer to a 32-bit (4-byte) int. The green pointer p takes up 8 bytes. The data pointed to by p, the yellow int val, takes up 4 bytes. Adding 1 to p goes to the address just after the 4th byte of val.
The same holds for pointer q, which points to a 64-bit (8-byte) double. Adding 1 to q goes to the address just after the 8th byte of valD.
If you want to print the value of a pointer, you can cast it to void *, for example:
cout << static_cast<void*>(p) << endl;
A void* is a pointer of indefinite type. C code uses it often to point to arbitrary data whose type isn’t known at compile time; C++ normally uses a class hierarchy for that. Here, though, it means: treat this pointer as nothing but a memory location.
Adding an integer to a pointer gets you another pointer, so you want to use the same technique there:
cout << static_cast<void*>(p+1) << endl;
However, the difference between two pointers is a signed whole number (the precise type, if you ever need it, is defined as ptrdiff_t in <cstddef>, but fortunately you don’t need to worry about that with cout), so you just want to use that directly:
cout << (p+1) - p << endl;
cout << reinterpret_cast<char*>(p+1) - reinterpret_cast<char*>(p) << endl;
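cout << reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p) << endl; // q - p itself does not compile (different pointer types); the result for unrelated objects is not portable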
That second line casts to char* because the size of a char is always 1. That’s a big hint what’s going on.
As for what’s going on under the hood: compare the numbers you get to sizeof(*p) and sizeof(*q), which are the sizes of the objects p and q point to.
The pointer values that are printed are likely to change on every execution (see why the addresses of local variables can be different every time and Address Space Layout Randomization)
I get an answer of 00EFF680 every time I compare the address p with another pointer.
int val = 20;
double valD = 20;
int *p = &val;
cout << p == p + 1;
It is parsed as (cout << p) == p + 1; because operator << has higher precedence than operator ==.
It prints the hexadecimal value of &val, the first address on the stack frame of the main function.
Note that on the stack, addresses are decreasing (see why does the stack address grow towards decreasing memory addresses).
I get an answer of 00EFF670 every time I compare the address of q with another pointer.
double *q = &valD;
cout << q == q + 1;
It is parsed as (cout << q) == q + 1; because operator << has higher precedence than operator ==.
It prints the hexadecimal value of &valD, the second address on the stack frame of the main function.
Note that &valD <= &val - sizeof(double) == &val - 8 (counted in bytes), since valD lies below val on the stack; here the gap is 0x00EFF680 - 0x00EFF670 = 16 bytes. The exact placement is a compiler choice that respects some alignment constraints.
I get an answer of 15726208 when I look at the pointer value of p.
cout << (unsigned long)(p) << endl;
It just prints the decimal value of &val
And I get an answer of 15726212 when I look at the pointer value of p + 1.
int *p = &val;
cout << (unsigned long) (p + 1) << endl;
It prints the decimal value of &val + sizeof(*p) = &val + sizeof(int) = &val + 4 (counted in bytes), since on your machine int is 32 bits.
Note that if p is a pointer to type t, p+1 is sizeof(t) bytes past p, so that consecutive array elements do not overlap.
Note that if p is a pointer to void, p+1 is not allowed in standard C++, although some compilers accept it as an extension (see void pointer arithmetic).
I get an answer of 15726192 when I look at the pointer value of q
cout << (unsigned long)(q) << endl;
It prints the decimal value of &valD
And I get an answer of 15726200 when I look at the pointer value of q + 1.
cout << (unsigned long) (q + 1) << endl;
It prints the decimal value of &valD + sizeof(*q) = &valD + sizeof(double) = &valD + 8

Unable to increment char* pointer

I'm a little rusty on pointers so bear with me. I have declared an array of chars and a char* pointer which is set in the following manner:
char* memPointer = &mem_arr[0];
When I run the program, it looks like my starting address is 6E2D200C (this address is always the same for some reason. I'm not sure if this is right or not. Please let me know if I'm doing something wrong)
Then, at some point when the user requests some memory, I try to increment the pointer and print out the new address. But the address remains the same as mentioned above. Here is what i have tried with no success:
//*(memPointer += actualSize); //no change
//memPointer += actualSize; //no change
//*memPointer += actualSize;// no change
&memPointer += actualSize; //no change
actualSize is an int by the way. If there is any other information that I need to provide, please let me know. Thank you in advance!
In your comments you indicate that your code is along the lines of:
char mem_arr[] = "hello world";
char* memPointer = &mem_arr[0];
memPointer += actualSize;
printf("memPointer = %p\n", &memPointer);
This line
char* memPointer = &mem_arr[0];
says: Let there be a variable of type char* called memPointer and let it contain the address of the 1st element of mem_arr. After this line, the variable memPointer contains the address of mem_arr[0].
&memPointer expands to the address-of memPointer rather than the address that memPointer contains. memPointer is a named variable, and saying &memPointer evaluates to the address - location in memory - of that variable, not its value.
memPointer += actualSize;
printf("&mem_arr[0] = %p, memPointer = %p\n", (void*)&mem_arr[0], (void*)memPointer);
will set memPointer to point to the actualSizeth element of mem_arr, as your code appears to be trying to do.
The address of the pointer will not change, but the pointer may point to something else if you start to increment it:
const char* x = "test";
// *x is now 't'
x++;
// *x is now 'e'
There's no meaningful way to manipulate the address of the pointer itself unless you're using a pointer to a pointer, like char** x.
Your remark about "when the user requests some memory" and implying you must manipulate the pointer seems to be based on flawed assumptions. If you're doing this C-style you'd use realloc() where you get a pointer to the new allocation back. If you're doing C++ properly you'd new some new memory, copy the contents over, and delete the old allocation.
Maybe you're confusing the address of memPointer with the address that memPointer is pointing to.
This code shows how the address that my_pointer points to changes whenever I increment the pointer or assign it a new memory address. Take a look:
int mem_arr[256];
int * my_pointer = &mem_arr[0];
// let's give some values to the array just
// for fun
for(int i = 0; i < 256; i++)
{
mem_arr[i] = 10 + (i * 10);
}
// print original value & address
std::cout << "Value 1: " << *my_pointer << " Address 1: " << my_pointer << std::endl;
// increase pointer (+4 bytes)
my_pointer++;
// print current value & address
std::cout << "Value 2: " << *my_pointer << " Address 2: " << my_pointer << std::endl;
// let's seek into the 8th element
// of the array
my_pointer = &mem_arr[8];
// and print current value & address once again
std::cout << "Value 3: " << *my_pointer << " Address 3: " << my_pointer << std::endl;
getchar();
my_pointer
by itself returns the address that my_pointer is pointing to, while
&my_pointer
returns the address where the pointer my_pointer is stored

Narrowing type conversion in C++ using pointers

I have been having some problems with downward type conversion in C++ using pointers, and before I came up with the idea of doing it this way Google basically told me this is impossible and it wasn't covered in any books I learned C++ from. I figured this would work...
long int TheLong=723330;
int TheInt1=0;
int TheInt2=0;
long int * pTheLong1 = &TheLong;
long int * pTheLong2 = &TheLong + 0x4;
TheInt1 = *pTheLong1;
TheInt2 = *pTheLong2;
cout << "The double is " << TheLong << " which is "
<< TheInt1 << " * " << TheInt2 << "\n";
The increment on line five might not be correct but the output has me worried that my C compiler I am using gcc 3.4.2 is automatically turning TheInt1 into a long int or something. The output looks like this...
The double is 723330 which is 723330 * 4067360
The output from TheInt1 is impossibly high, and the output from TheInt2 is absent.
I have three questions...
Am I even on the right track?
What is the proper increment for line five?
Why the hell is TheInt1/TheInt2 allowing such a large value?
int is probably 32 bit, which gives it a range of about -2*10^9 to 2*10^9 (exactly -2^31 to 2^31 - 1), so 723330 fits easily.
In the line long int * pTheLong2 = &TheLong + 0x4; you are doing pointer arithmetic to a long int*, which means the address will increase by the size of 0x4 long ints. I guess you are assuming that long int is twice the size of int. This is absolutely not guaranteed, but probably true if you are compiling in 64 bit mode. So you want to add half the size of a long int -- exactly the size of an int under your assumption -- to your pointer. int * pTheLong2 = (int*)(&TheLong) + 1; achieves this.
You are on the right track, but please keep in mind, as others have pointed out, that you are now exploring undefined behaviour. This means that portability is broken and optimization flags may very well change the behaviour.
By the way, a more correct thing to output (assuming that the machine is little-endian) would be:
cout << "The long is " << TheLong << " which is "
<< TheInt1 << " + " << TheInt2 << " * 2^32" << endl;
For completeness' sake, a well-defined conversion of a 32 bit integer to two 16 bit ones:
#include <cstdint>
#include <iostream>
int main() {
uint32_t fullInt = 723330;
uint16_t lowBits = (fullInt >> 0) & 0x0000FFFF;
uint16_t highBits = (fullInt >> 16) & 0x0000FFFF;
std::cout << fullInt << " = "
<< lowBits << " + " << highBits << " * 2^16"
<< std::endl;
return 0;
}
Output: 723330 = 2434 + 11 * 2^16
Am I even on the right track?
Probably not. You seem confused.
What is the proper increment for line five?
There is none. Pointer arithmetic is possible only inside arrays (a single object counts as an array of one element), and you have no arrays here. So
long int * pTheLong2 = &TheLong + 0x4;
is undefined behavior, and so is any value other than 0 or 1 in place of 0x4; with 1 you get a one-past-the-end pointer that may be formed but not dereferenced.
Why the hell is TheInt1/TheInt2 allowing such a large value?
int and long int often have the same range of possible values.
TheInt2 = *pTheLong2;
This invokes undefined behavior, because the C++ Standard does not give any guarantee as to which memory location pTheLong2 is pointing to, as it's initialized as:
long int * pTheLong2 = &TheLong + 0x4;
&TheLong is the memory location of the variable TheLong, and pTheLong2 is initialized to a memory location which is either not part of the program, hence illegal, or it's pointing to a memory location within the program itself, though you don't know where exactly; the C++ Standard gives no guarantee about where it's pointing.
Hence, dereferencing such a pointer invokes undefined behavior.