I have a question about pointers, and memory addresses:
Supposing I have the following code:
int * array = (int *) malloc(sizeof(int) * 4);
Now in array im storing a memory address, I know that c++ takes already care when adding +1 to this pointer it will add 4 bytes, but what If I want to add manually 4 bytes?
array + 0x004
If im correct this will lead to add 4*4 (16) bytes, but my Idea is to add manually those 4 bytes.
Why? Just playing around, i've tried this and I got a totally different result from what I expected, then i've researched and i've seen that c++ takes already care when you add +1 to a pointer (it sums 4 bytes in this case).
Any idea?
For a pointer p to a type T with value v, the expression p+n will (on most systems anyway) result in a pointer to the address v+n*sizeof(T). To get a fixed-byte offset to the pointer, you can first cast it to a character pointer, like this:
reinterpret_cast<T*>(reinterpret_cast<char*>(p) + n)
In c++, sizeof(char) is defined to be equal to 1.
Do note that accessing improperly aligned values can have large performance penalties.
Another thing to note is that, in general, casting pointers to different types is not allowed (called the strict aliasing rule), but an exception is explicitly made for casting any pointer type to char* and back.
The trick is convert the type of array into any pointer-type with a size of 1 Byte, or store the pointer value in an integer.
#include <stdint.h>
int* increment_1(int* ptr) {
//C-Style
return (int*)(((char*)ptr) + 4);
}
int* increment_2(int* ptr) {
//C++-Style
char* result = reinterpret_cast<char*>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
int* increment_3(int* ptr) {
//Store in integer
intptr_t result = reinterpret_cast<intptr_t>(ptr);
result += 4;
return reinterpret_cast<int*>(result);
}
Consider that if you add an arbitrary number of bytes to an address of an object of type T, it no longer makes sense to use a pointer of type T, since there might not be an object of type T at the incremented memory address.
If you want to access a particular byte of an object, you can do so using a pointer to a char, unsigned char or std::byte. Such objects are the size of a byte, so incrementing behaves just as you would like. Furthermore, while rules of C++ disallow accessing objects using incompatible pointers, these three types are excempt of that rule and are allowed to access objects of any type.
So, given
int * array = ....
You can access the byte at index 4 like this:
auto ptr = reinterpret_cast<unsigned char*>(array);
auto byte_at_index_4 = ptr + 4;
array + 0x004
If im correct this will lead to add 4*4 (16) bytes
Assuming sizeof(int) happens to be 4, then yes. But size of int is not guaranteed to be 4.
Related
Value of a pointer is address of a variable. Why value of an int pointer increased by 4-bytes after the int pointer increased by 1.
In my opinion, I think value of pointer(address of variable) only increase by 1-byte after pointer increment.
Test code:
int a = 1, *ptr;
ptr = &a;
printf("%p\n", ptr);
ptr++;
printf("%p\n", ptr);
Expected output:
0xBF8D63B8
0xBF8D63B9
Actually output:
0xBF8D63B8
0xBF8D63BC
EDIT:
Another question - How to visit the 4 bytes an int occupies one by one?
When you increment a T*, it moves sizeof(T) bytes.† This is because it doesn't make sense to move any other value: if I'm pointing at an int that's 4 bytes in size, for example, what would incrementing less than 4 leave me with? A partial int mixed with some other data: nonsensical.
Consider this in memory:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
Which makes more sense when I increment that pointer? This:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
Or this:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
The last doesn't actually point an any sort of int. (Technically, then, using that pointer is UB.)
If you really want to move one byte, increment a char*: the size of of char is always one:
int i = 0;
int* p = &i;
char* c = (char*)p;
char x = c[1]; // one byte into an int
†A corollary of this is that you cannot increment void*, because void is an incomplete type.
Pointers are increased by the size of the type they point to, if the pointer points to char, pointer++ will increment pointer by 1, if it points to a 1234 bytes struct, pointer++ will increment the pointer by 1234.
This may be confusing first time you meet it, but actually it make a lot of sense, this is not a special processor feature, but the compiler calculates it during compilation, so when you write pointer+1 the compiler compiles it as pointer + sizeof(*pointer)
As you said, an int pointer points to an int. An int usually takes up 4 bytes and therefore, when you increment the pointer, it points to the "next" int in the memory - i.e., increased by 4 bytes. It acts this way for any size of type. If you have a pointer to type A, then incrementing a A* it will increment by sizeof(A).
Think about it - if you only increment the pointer by 1 byte, than it will point to a middle of an int and I can't think of an opportunity where this is desired.
This behavior is very comfortable when iterating over an array, for example.
The idea is that after incrementing, the pointer points to the next int in memory. Since ints are 4 bytes wide, it is incremented by 4 bytes. In general, a pointer to type T will increment by sizeof(T)
A pointer points at the BEGINNING of something in memory. An INT occupies 4 bytes (32bit) and a DOUBLE occupies 8 bytes (64bit) in memory. So if you have a DOUBLE number stored, and you wish at a very low level pointing to the next available memory location, the pointer wooud be increased by 8 bytes. If for some reason you pointed at +4bytes from the start of a DOUBLE value, you would corrupt it's value. Memory is a very large flat field that has no conscience of itself, so it's up to the software to divides it properly and to "respect the borders" of items located in that field.
I have a return value from a library which is a void pointer. I know that it points to a short int; I try to obtain the int value in the following way (replacing the function call with a simple assignment to a void *):
short n = 1;
void* s = &n;
int k = *(int*)s;
I try to cast a void pointer that points to an address in which there is a short and I try to cast the pointer to point to an int and when I do so the output becomes a rubbish value. While I understand why it's behaving like that I don't know if there's a solution to this.
If the problem you are dealing with truly deals with short and int, you can simply avoid the pointer and use:
short n = 1;
int k = n;
If the object types you are dealing with are different, then the solution will depend on what those types are.
Update, in response to OP's comment
In a comment, you said,
I have a function that returns a void pointer and I would need to cast the value accordingly.
If you know that the function returns a void* that truly points to a short object, then, your best bet is:
void* ptr = function_returning_ptr();
short* sptr = reinterpret_cast<short*>(ptr);
int k = *sptr;
The last line work since *sptr evaluates to a short and the conversion of a short to an int is a valid operation. On the other hand,
int k = *(int*)sptr;
does not work since conversion of short* to an int* is not a valid operation.
Your code is subject to undefined behavior, as it violates the so-called strict aliasing rules. Without going into too much detail and simplifying a bit, the rule states that you can not access an object of type X though a pointer to type Z unless types X and Z are related. There is a special exception for char pointer, but it doesn't apply here.
In your example, short and int are not related types, and as such, accessing one through pointer to another is not allowed.
The size of a short is only 16 bits the size of a int is 32 bits ( in most cases not always) this means that you are tricking the computer into thinking that your pointer to a short is actually pointing to an integer. This causes it to read more memory that it should and is reading garbage memory. If you cast s to a pointer to a short then deference it it will work.
short n = 1;
void* s = &n;
int k = *(short*)s;
Assuming you have 2 byte shorts and 4 byte ints, There's 3 problems with casting pointers in your method.
First off, the 4 byte int will necessarily pick up some garbage memory when using the short's pointer. If you're lucky the 2 bytes after short n will be 0.
Second, the 4 byte int may not be properly aligned. Basically, the memory address of a 4 byte int has to be a multiple of 4, or else you risk bus errors. Your 2 byte short is not guaranteed to be properly aligned.
Finally, you have a big-endian/little-endian dependency. You can't turn a big-endian short into a little-endian int by just tacking on some 0's at the end.
In the very fortunate circumstance that the bytes following the short are 0, AND the short is integer aligned, AND the system uses little-endian representation, then such a cast will probably work. It would be terrible, but it would (probably) work.
The proper solution is to use the original type and let the compiler cast. Instead of int k = *(int*)s;, you need to use int k = *(short *)s;
I'm trying to store a couple of ints in memory using void* & then retrieve them but it keeps throwing "pointer of type ‘void *’ used in arithmetic" warning.
void *a = new char[4];
memset(a, 0 , 4);
unsigned short d = 7;
memcpy(a, (void *)&d, 2);
d=8;
memcpy(a+2, (void *)&d, 2); //pointer of type ‘void *’ used in arithmetic
/*Retrieving*/
unsigned int *data = new unsigned int();
memcpy(data, a, 2);
cout << (unsigned int)(*data);
memcpy(data, a+2, 2); //pointer of type ‘void *’ used in arithmetic
cout << (unsigned int)(*data);
The results are as per expectation but I fear that these warnings might turn into errors on some compiler. Is there another way to do this that I'm not aware of?
I know this is perhaps a bad practice in normal scenario but the problem statement requires that unsigned integers be stored and sent in 2 byte packets. Please correct me if I'm wrong but as per my understanding, using a char* instead of a void* would have taken up 3 bytes for 3-digit numbers.
a+2, with a being a pointer, means that the pointer is increased to allow space for two items of the pointer type. V.g., if a was int32 *, a + 2 would mean "a position plus 8 bytes".
Since void * has no type, it can only try to guess what do you mean by a + 2, since it does not know the size of the type being referred.
The problem is that the compiler doesn't know what to do with
a+2
This instruction means "Move pointer 'a' forward by 2 * (sizeof-what-is-pointed-to-by-'a')".
If a is void *, the compiler doesn't know the size of the target object (there isn't one!), so it gives an error.
You need to do:
memcpy(data, ((char *)a)+2, 2);
This way, the compiler knows how to add 2 - it knows the sizeof(char).
Please correct me if I'm wrong but as per my understanding, using a char* instead of a void* would have taken up 3 bytes for 3-digit numbers.
Yes, you are wrong, that would be the case if you were transmitting the numbers as chars. 'char*' is just a convenient way of referring to 8-bit values - and since you are receiving pairs of bytes, you could treat the destination memory are char's to do simple math. But it is fairly common for people to use 'char' arrays for network data streams.
I prefer to use something like BYTE or uint8_t to indicate clearly 'I'm working with bytes' as opposed to char or other values.
void* is a pointer to an unknown, more importantly, 0 sized type (void). Because the size is zero, offset math is going to result in zeros, so the compiler tells you it's invalid.
It is possible that your solution could be as simple as to receive the bytes from the network into a byte-based array. An int is 32 bits, 4 bytes. They're not "char" values, but quads of a 4-byte integer.
#include <cstdint>
uint8_t buffer[4];
Once you know you've filled the buffer, you can simply say
uint32_t integer = *(static_cast<uint32*>(buffer));
Whether this is correct will depend on whether the bytes are in network or host order. I'm guessing you'll probably need:
uint32_t integer = ntohl(*(static_cast<uint32*>(buffer)));
If I declare
int x = 5 ;
int* p = &x;
unsigned int y = 10 ;
cout << p+y ;
Is this a valid thing to do in C++, and if not, why?
It has no practical use, but is it possible?
The math is valid; the resulting pointer isn't.
When you say ptr + i (where ptr is an int*), that evaluates to the address of an int that's i * sizeof(int) bytes past ptr. In this case, since your pointer points to a single int rather than an array of them, you have no idea (and C++ doesn't say) what's at p+10.
If, however, you had something like
int ii[20] = { 0 };
int *p = ii;
unsigned int y = 10;
cout << p + y;
Then you'd have a pointer you could actually use, because it still points to some location within the array it originally pointed into.
What you are doing in your code snippet is not converting unsigned int to pointer. Instead you are incrementing a pointer by an integer offset, which is a perfectly valid thing to do. When you access the index of an array, you basically take the pointer to the first element and increase it by the integer index value. The result of this operation is another pointer.
If p is a pointer/array, the following two lines are equivalent and valid (supposing the pointed-to-array is large enough)
p[5] = 1;
*(p + 5) = 1;
To convert unsigned int to pointer, you must use a cast
unsigned int i = 5;
char *p = reinterpret_cast<char *>(i);
However this is dangerous. How do you know 5 is a valid address?
A pointer is represented in memory as an unsigned integer type, the address. You CAN store a pointer in an integer. However you must be careful that the integer data type is large enough to hold all the bits in a pointer. If unsigned int is 32-bits and pointers are 64-bits, some of the address information will be lost.
C++11 introduces a new type uintptr_t which is guaranteed to be big enough to hold a pointer. Thus it is safe to cast a pointer to uintptr_t and back again.
It is very rare (should be never in run-of-the-mill programming) that you need to store pointers in integers.
However, modifying pointers by integer offsets is totally valid and common.
Is this a valid thing to do in c++, and if not why?
Yes. cout << p+y; is valid as you can see trying to compile it. Actually p+y is so valid that *(p+y) can be translated to p[y] which is used in C-style arrays (not that I'm suggesting its use in C++).
Valid doesn't mean it actually make sense or that the resulting pointer is valid. Since p points to an int the resulting pointer will be an offset of sizeof(int) * 10 from the location of x. And you are not certain about what's in there.
A variable of type int is a variable capable of containing an integer value. A variable of type int* is a pointer to a variable copable of containing an integer value.
Every pointer type has the same size and contains the same stuff: A memory address, which the size is 4 bytes for 32-bit arquitectures and 8 bytes for 64-bit arquitectures. What distinguish them is the type of the variable they are poiting to.
Pointers are useful to address buffers and structures allocated dynamically at run time or any sort of variable that is to be used but is stored somewhere else and you have to tell where.
Arithmetic operations with pointers are possible, but they won't do what you think. For instance, summing + 1 to a pointer of type int will increase its value by sizeof(int), not by literally 1, because its a pointer, and the logic here is that you want the next object of this array.
For instance:
int a[] = { 10, 20, 30, 40 };
int *b = a;
printf("%d\n", *b);
b = b + 1;
printf("%d\n", *b);
It will output:
10
20
Because b is pointing to the integer value 10, and when you sum 1 to it, or any variable containing an integer, its then poiting to the next value, 20.
If you want to perform operations with the variable stored at b, you can use:
*b = *b + 3;
Now b is the same pointer, the address has not changed. But the array 10, 20, 30, 40 now contains the values 13, 20, 30, 40, because you increased the element b was poiting to by 3.
Value of a pointer is address of a variable. Why value of an int pointer increased by 4-bytes after the int pointer increased by 1.
In my opinion, I think value of pointer(address of variable) only increase by 1-byte after pointer increment.
Test code:
int a = 1, *ptr;
ptr = &a;
printf("%p\n", ptr);
ptr++;
printf("%p\n", ptr);
Expected output:
0xBF8D63B8
0xBF8D63B9
Actually output:
0xBF8D63B8
0xBF8D63BC
EDIT:
Another question - How to visit the 4 bytes an int occupies one by one?
When you increment a T*, it moves sizeof(T) bytes.† This is because it doesn't make sense to move any other value: if I'm pointing at an int that's 4 bytes in size, for example, what would incrementing less than 4 leave me with? A partial int mixed with some other data: nonsensical.
Consider this in memory:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
Which makes more sense when I increment that pointer? This:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
Or this:
[↓ ]
[...|0 1 2 3|0 1 2 3|...]
[...|int |int |...]
The last doesn't actually point an any sort of int. (Technically, then, using that pointer is UB.)
If you really want to move one byte, increment a char*: the size of of char is always one:
int i = 0;
int* p = &i;
char* c = (char*)p;
char x = c[1]; // one byte into an int
†A corollary of this is that you cannot increment void*, because void is an incomplete type.
Pointers are increased by the size of the type they point to, if the pointer points to char, pointer++ will increment pointer by 1, if it points to a 1234 bytes struct, pointer++ will increment the pointer by 1234.
This may be confusing first time you meet it, but actually it make a lot of sense, this is not a special processor feature, but the compiler calculates it during compilation, so when you write pointer+1 the compiler compiles it as pointer + sizeof(*pointer)
As you said, an int pointer points to an int. An int usually takes up 4 bytes and therefore, when you increment the pointer, it points to the "next" int in the memory - i.e., increased by 4 bytes. It acts this way for any size of type. If you have a pointer to type A, then incrementing a A* it will increment by sizeof(A).
Think about it - if you only increment the pointer by 1 byte, than it will point to a middle of an int and I can't think of an opportunity where this is desired.
This behavior is very comfortable when iterating over an array, for example.
The idea is that after incrementing, the pointer points to the next int in memory. Since ints are 4 bytes wide, it is incremented by 4 bytes. In general, a pointer to type T will increment by sizeof(T)
A pointer points at the BEGINNING of something in memory. An INT occupies 4 bytes (32bit) and a DOUBLE occupies 8 bytes (64bit) in memory. So if you have a DOUBLE number stored, and you wish at a very low level pointing to the next available memory location, the pointer wooud be increased by 8 bytes. If for some reason you pointed at +4bytes from the start of a DOUBLE value, you would corrupt it's value. Memory is a very large flat field that has no conscience of itself, so it's up to the software to divides it properly and to "respect the borders" of items located in that field.