Assume the value of a uint32_t x is an address. How can I get the value stored at that address?
My attempt was to simply assign the address to a pointer.
int*y = x;
But x is not an int pointer; it's just an integer with an address as its value.
An integer type which is large enough to represent all data pointers can be converted into a pointer using reinterpret_cast or an explicit conversion. The pointer can be indirected to get the pointed value using the indirection operator.
Note that uint32_t is not guaranteed to be large enough to represent all pointer values (and in fact it is not large enough on modern 64-bit CPUs). uintptr_t is meant precisely for this purpose.
Note that if the pointed address does not contain an object (of compatible type), then behaviour will be undefined.
In C++, this can be done using reinterpret_cast:
8.5.1.10 Reinterpret cast [expr.reinterpret.cast]
...
5. A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [Note: Except as described in 6.6.4.4.3, the result of such a conversion will not be a safely-derived pointer value. —end note]
And the rules in 6.6.4.4.3 state:
An integer value is an integer representation of a safely-derived pointer only if its type is at least as large as std::intptr_t and it is one of the following:
—(3.1) the result of a reinterpret_cast of a safely-derived pointer value;
—(3.2) the result of a valid conversion of an integer representation of a safely-derived pointer value;
—(3.3) the value of an object whose value was copied from a traceable pointer object, where at the time of the copy the source object contained an integer representation of a safely-derived pointer value;
—(3.4) the result of an additive or bitwise operation, one of whose operands is an integer representation of a safely-derived pointer value P, if that result converted by reinterpret_cast<void*> would compare equal to a safely-derived pointer computable from reinterpret_cast<void*>(P).
So if x (in the question) has a type at least as large as std::intptr_t and is already an integer representation of a safely-derived pointer as per the rules above, you will be able to get the value behind the address stored in x.
Related
Consider this example
int main(){
    std::intptr_t value = /* a special integer value */;
    int* ptr = reinterpret_cast<int*>(value); // #1
    int v = *ptr; // #2
}
[expr.reinterpret.cast] p5 says
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.
Step #1 is, at the least, implementation-defined. For step #2, I think there are four possibilities:
1. The implementation does not support such a conversion, so it may do anything.
2. The pointer value is exactly the address of an object of type int, so the indirection is well-defined.
3. The pointer value is the address of an object of a type other than int, so the result is UB.
4. The pointer value is an invalid pointer value, so the indirection is UB.
This means the behavior of the indirection through ptr depends on the implementation. It is not definitely UB if the implementation takes option 2. So, is this case definitely UB or not? If it is, which provisions state that?
The standard has nothing more to say on it than what you quoted. The standard only guarantees the meaning of an integer-to-pointer cast if that integer value was taken from a pointer-to-integer cast. The meaning of every other integer-to-pointer conversion is implementation-defined.
And that means everything about them is "implementation-defined": what they result in and what using those results will do. After all, the pointer-to-integer-to-pointer round trip spells out that you get the "original value" back. The fact that the resulting pointer has the "original value" means that it will behave exactly as if you had copied the original pointer itself. So the standard needs to say nothing more on the matter.
The behavior of the pointer taken from an implementation-defined integer-to-pointer cast is... implementation-defined. If an implementation says that it supports such conversions, it must spell out what the result of each supported conversion is. That is, if there is some "special integer value" for which a cast to an int* is supported by the implementation, it must say what the result of that cast is. This includes things like whether it points to an actual int or not.
Is performing indirection from a pointer acquired from converting an integer value definitely UB?
Not always. Here is an example that is definitely not UB:
int i = 42;
std::intptr_t value = reinterpret_cast<std::intptr_t>(&i);
int* ptr = reinterpret_cast<int*>(value);
int v = *ptr;
This is because converting a pointer to an integer of sufficient size and back to the same pointer type is guaranteed to yield the same pointer value as stated in the rule you quoted. Since the original pointer value was valid for indirection, so is the converted one.
Can a pointer hold a plain integer value?
If so, in which cases is this used?
int num=100;
int* iptr=NULL;
iptr=reinterpret_cast<int*>(num);
printf("%d \n",num);
printf("%d \n",num);
result
100
100
Mappings between pointers and integers are implementation-defined.
Converting an integer to a pointer using reinterpret_cast does not yield a safely-derived pointer value except under certain conditions. Those conditions are not met in your example.
Citation from the C++ draft (N4713):
8.5.1.10 Reinterpret cast
...
5. A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note:
Except as described in 6.6.4.4.3, the result of such a conversion will not be a safely-derived pointer value.
—end note ]
The conditions for safely-derived pointers:
6.6.4.4.3 Safely-derived pointers
...
2 A pointer value is a safely-derived pointer to a dynamic object only if it has an object pointer type and it is one of the following:
(2.1) — the value returned by a call to the C++ standard library implementation of ::operator new(std::size_t) or ::operator new(std::size_t, std::align_val_t);
(2.2) — the result of taking the address of an object (or one of its subobjects) designated by an lvalue resulting from indirection through a safely-derived pointer value;
(2.3) — the result of well-defined pointer arithmetic using a safely-derived pointer value;
(2.4) — the result of a well-defined pointer conversion of a safely-derived pointer value;
(2.5) — the result of a reinterpret_cast of a safely-derived pointer value;
(2.6) — the result of a reinterpret_cast of an integer representation of a safely-derived pointer value;
(2.7) — the value of an object whose value was copied from a traceable pointer object, where at the time of the copy the source object contained a copy of a safely-derived pointer value.
I'm fairly new to C++ and I'm having difficulty wrapping my head around what is going on in the final line of the code below:
int numToSend = bs->GetSize();
char * tBuf = new char[NUM_LENGTH_BYTES + numToSend];
*(WORD*)tBuf = htons((WORD)numToSend);
So htons is returning a u_short or WORD, but the cast on tBuf is somewhat confusing to me. Is it something along the lines of "the value pointed to by tBuf is cast as a WORD pointer and assigned the return from htons"?
I believe this is a fairly unsafe operation in most cases, what would be the best practice here?
It may not be a recommended practice, but AFAIK, it is safe. It is true that in general, taking a pointer to P, casting it to a pointer to Q and using it as a pointer to Q leads to undefined behaviour. Here it looks even worse, because the alignment requirement of char is known to be the weakest possible.
But the char * tBuf pointer has been obtained through a new expression. Such a new expression internally relies on an allocation function to obtain storage, and draft n4296 for C++14 says in 3.7.4.1 Allocation functions [basic.stc.dynamic.allocation] §2:
The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall
return the address of the start of a block of storage whose length in bytes shall be at least as large as
the requested size... The pointer returned shall be suitably aligned so that it can be converted
to a pointer of any complete object type with a fundamental alignment requirement (3.11) and then used
to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call
to a corresponding deallocation function).
So the line *(WORD*)tBuf = htons((WORD)numToSend); performs only well-defined operations:
It converts numToSend from an integer type to an unsigned type, and 4.7 Integral conversions [conv.integral] says:
A prvalue of an integer type can be converted to a prvalue of another integer type...
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source
integer (modulo 2^n where n is the number of bits used to represent the unsigned type)
It calls htons with a WORD (or uint16_t) argument, which returns a uint16_t (or WORD).
It converts a pointer obtained by new to a WORD * and uses that pointer to access the object in the storage allocated.
Simply put, the value of the first two bytes of the allocated array is now unspecified. More precisely, it is the byte representation of the WORD in the particular implementation.
But it is still allowed to access the allocated array as a character array, even if the first bytes now contain a WORD, because this is explicitly allowed by the so-called strict aliasing rule, 3.10 Lvalues and rvalues [basic.lval] §10:
If a program attempts to access the stored value of an object through a glvalue of other than one of the
following types the behavior is undefined:...
(10.8) — a char or unsigned char type.
If the tBuf pointer had not been obtained through a new expression, the only correct way would have been to do a memcpy:
WORD n_numToSend = htons(numToSend);
memcpy(tBuf, &n_numToSend, sizeof(WORD));
As this one is allowed for any pointer provided the storage is big enough, it is what I would call the recommended practice.
In the case of pointers, we know that their size is always the same irrespective of the data type of the variable being pointed to.
The data type is needed when dereferencing the pointer so that it knows how much data it should read. So why can't I assign the address of a variable of type double to a pointer of type int?
Why can't dereferencing an int pointer simply read the next 4 bytes from the variable of type double and print that value?
Many computers have alignment requirements, so (for example) to read a 2-byte value, the address at which it's located must be a multiple of 2 (and likewise, a 4-byte value must be located at an address that's a multiple of 4, and so on). In fact, this alignment requirement is common enough that it's frequently referred to as "natural alignment".
Likewise, some types (e.g., floating point types) impose requirements on the bit sequence that can be read as that type, so if you try to take some arbitrary data and treat it as a double, you might trigger something like a floating point exception.
If you want to do this badly enough, you can use a cast to turn the pointer into the target type (but the results, if any, aren't usually portable).
You are guaranteed that you can convert a pointer to any other type of object to a pointer to unsigned char, and use that to read the bytes that represent the pointee object.
Also, if you primarily want an opaque pointer, without type information attached, you can assign a pointer to some other type to a void *.
Finally: no, not all pointers are actually the same. Pointers to different types can be different sizes (e.g., on the early Cray compilers, a char * was substantially different from an int *).
In the case of pointers, we know that their size is always the same irrespective of the data type of the variable being pointed to.
No, we do not know that.
Chapter and verse for C
6.2.5 Types
...
28 A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type.48) Similarly, pointers to qualified or unqualified versions of
compatible types shall have the same representation and alignment requirements. All
pointers to structure types shall have the same representation and alignment requirements
as each other. All pointers to union types shall have the same representation and
alignment requirements as each other. Pointers to other types need not have the same
representation or alignment requirements.
48) The same representation and alignment requirements are meant to imply interchangeability as
arguments to functions, return values from functions, and members of unions.
Emphasis added.
Chapter and verse for C++
3.9.2 Compound types
...
3 The type of a pointer to void or a pointer to an object type is called an object pointer type. [ Note: A pointer
to void does not have a pointer-to-object type, however, because void is not an object type. — end note ]
The type of a pointer that can designate a function is called a function pointer type. A pointer to objects
of type T is referred to as a “pointer to T.” [Example: a pointer to an object of type int is referred to as
“pointer to int ” and a pointer to an object of class X is called a “pointer to X.” — end example ] Except
for pointers to static members, text referring to “pointers” does not apply to pointers to members. Pointers
to incomplete types are allowed although there are restrictions on what can be done with them (3.11).
A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null
pointer (4.10). If an object of type T is located at an address A, a pointer of type cv T* whose value is the
address A is said to point to that object, regardless of how the value was obtained. [ Note: For instance,
the address one past the end of an array (5.7) would be considered to point to an unrelated object of the
array’s element type that might be located at that address. There are further restrictions on pointers to
objects with dynamic storage duration; see 3.7.4.3. — end note ] The value representation of pointer types
is implementation-defined. Pointers to layout-compatible types shall have the same value representation and
alignment requirements (3.11). [ Note: Pointers to over-aligned types (3.11) have no special representation,
but their range of valid values is restricted by the extended alignment requirement. This International
Standard specifies only two ways of obtaining such a pointer: taking the address of a valid object with
an over-aligned type, and using one of the runtime pointer alignment functions. An implementation may
provide other means of obtaining a valid pointer value for an over-aligned type. — end note ]
4 A pointer to cv-qualified (3.9.3) or cv-unqualified void can be used to point to objects of unknown type.
Such a pointer shall be able to hold any object pointer. An object of type cv void* shall have the same
representation and alignment requirements as cv char*.
Emphasis added. It is entirely possible to have different sizes and representations for different pointer types. There is no reason to expect a pointer to int to have the same size and representation as a pointer to double, or a pointer to a struct type, or a pointer to a function type. It's true for commodity platforms like x86, but not all the world runs on x86.
This is why you can't assign pointer values of one type to pointer values of another type without an explicit cast (except for converting between void * and other pointer types in C), since a representation change may be required.
Secondly, pointer arithmetic depends on the size of the pointed-to type. Assume you have pointers to a 32-bit int and a 64-bit double:
int *ip;
double *dp;
The expression ip + 1 will return the address of the next integer object (current address plus 4), while the expression dp + 1 will return the address of the next double object (current address plus 8).
If I assign the address of a double to a pointer to int, incrementing that int pointer won't take me to the next double object.
What is the integer representation of a pointer?
A pointer value is a safely-derived pointer to a dynamic object only
if it has an object pointer type and it is one of the following:
[...]
— the result of a reinterpret_cast of an integer representation of a safely-derived pointer value;
[...]
My doubt is about the following:
The type int is smaller than any pointer type. In particular, a pointer cannot be cast to int using reinterpret_cast.
The term is defined in the very next paragraph of the standard.
An integer value is an integer representation of a safely-derived pointer only if its type is at least as large as
std::intptr_t and it is one of the following:
— the result of a reinterpret_cast of a safely-derived pointer value;
— the result of a valid conversion of an integer representation of a safely-derived pointer value;
— the value of an object whose value was copied from a traceable pointer object, where at the time of
the copy the source object contained an integer representation of a safely-derived pointer value;
— the result of an additive or bitwise operation, one of whose operands is an integer representation of a
safely-derived pointer value P, if that result converted by reinterpret_cast<void*> would compare
equal to a safely-derived pointer computable from reinterpret_cast<void*>(P).