How can I put the address of the array into a variable?
char * str1 = "Hello";
int add = 0;
Now I want to put the address of the array into add.
I know I can print out the address of the array by the following way:
printf("Address = %p", str1);
But, I want to store the address in the variable.
If you want to store a memory address in a variable, the correct way is to type the variable as std::intptr_t or std::uintptr_t. That is because these types are guaranteed large enough to hold any memory address:
char * str1 = "Hello";
uintptr_t p = (uintptr_t)str1;
Apart from that, note that the value of str1 is already a memory address (it points to H) albeit a different one from the value of &str1 (which points to str1).
Turning a pointer into a number requires reinterpretation:
add = reinterpret_cast<int>(str1);
But there are all kinds of problems associated with this approach:
If sizeof(int) < sizeof(char*) then part of the pointer is lost, you won't be able to restore it.
Some optimizations might turn your code invalid due to unexpected aliasing.
If you need a variable which can hold pointers or integers, it would be better to use a union instead.
Use reinterpret_cast (see 5.2.10/p4):
4 A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is
implementation-defined. [ Note: It is intended to be unsurprising to those who know the addressing structure
of the underlying machine. —end note ]
static_assert( sizeof( unsigned int ) >= sizeof( &str1[ 0 ] ), "warning: use a wider type!" );
unsigned int add = reinterpret_cast< unsigned int >( &str1[ 0 ] );
From your comment on Jon's accepted answer:
char str1[] = "Hello";
char* str2 = &str1[0];
uintptr_t p = (uintptr_t)str2;
std::cout << std::hex << p << std::endl;
p = (uintptr_t)&str1[1];
std::cout << std::hex << p << std::endl;
p = (uintptr_t)&str1[0];
std::cout << std::hex << p << std::endl;
This implies your objective is to be able to stream the pointer value in a readable format. The Standard provides for this in an implementation-defined fashion as follows:
basic_ostream<charT,traits>& operator<<(const void* p);
So, perhaps what you really want is satisfied by the C++ style or (concise but less self-doncumenting and compiler-checked) C-style code below:
std::cout << static_cast<void*>(str1) << '\n';
std::cout << (void*)str1 << '\n';
(But, if you specifically want a numeric versions, or to ensure it's displayed in hex with no leading 0x or whatever else an implementation may decide upon, then you're back to Jon's suggestion or your own (possibly compile-time checked) logic to find a big enough integral type.
Related
I am little confused with the applicability of reinterpret_cast vs static_cast. From what I have read the general rules are to use static cast when the types can be interpreted at compile time hence the word static. This is the cast the C++ compiler uses internally for implicit casts also.
reinterpret_casts are applicable in two scenarios:
convert integer types to pointer types and vice versa
convert one pointer type to another. The general idea I get is this is unportable and should be avoided.
Where I am a little confused is one usage which I need, I am calling C++ from C and the C code needs to hold on to the C++ object so basically it holds a void*. What cast should be used to convert between the void * and the Class type?
I have seen usage of both static_cast and reinterpret_cast? Though from what I have been reading it appears static is better as the cast can happen at compile time? Though it says to use reinterpret_cast to convert from one pointer type to another?
The C++ standard guarantees the following:
static_casting a pointer to and from void* preserves the address. That is, in the following, a, b and c all point to the same address:
int* a = new int();
void* b = static_cast<void*>(a);
int* c = static_cast<int*>(b);
reinterpret_cast only guarantees that if you cast a pointer to a different type, and then reinterpret_cast it back to the original type, you get the original value. So in the following:
int* a = new int();
void* b = reinterpret_cast<void*>(a);
int* c = reinterpret_cast<int*>(b);
a and c contain the same value, but the value of b is unspecified. (in practice it will typically contain the same address as a and c, but that's not specified in the standard, and it may not be true on machines with more complex memory systems.)
For casting to and from void*, static_cast should be preferred.
One case when reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control. Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data:
// vendor.hpp
typedef struct _Opaque * VendorGlobalUserData;
void VendorSetUserData(VendorGlobalUserData p);
VendorGlobalUserData VendorGetUserData();
To use this API, the programmer must cast their data to VendorGlobalUserData and back again. static_cast won't work, one must use reinterpret_cast:
// main.cpp
#include "vendor.hpp"
#include <iostream>
using namespace std;
struct MyUserData {
MyUserData() : m(42) {}
int m;
};
int main() {
MyUserData u;
// store global data
VendorGlobalUserData d1;
// d1 = &u; // compile error
// d1 = static_cast<VendorGlobalUserData>(&u); // compile error
d1 = reinterpret_cast<VendorGlobalUserData>(&u); // ok
VendorSetUserData(d1);
// do other stuff...
// retrieve global data
VendorGlobalUserData d2 = VendorGetUserData();
MyUserData * p = 0;
// p = d2; // compile error
// p = static_cast<MyUserData *>(d2); // compile error
p = reinterpret_cast<MyUserData *>(d2); // ok
if (p) { cout << p->m << endl; }
return 0;
}
Below is a contrived implementation of the sample API:
// vendor.cpp
static VendorGlobalUserData g = 0;
void VendorSetUserData(VendorGlobalUserData p) { g = p; }
VendorGlobalUserData VendorGetUserData() { return g; }
The short answer:
If you don't know what reinterpret_cast stands for, don't use it. If you will need it in the future, you will know.
Full answer:
Let's consider basic number types.
When you convert for example int(12) to unsigned float (12.0f) your processor needs to invoke some calculations as both numbers has different bit representation. This is what static_cast stands for.
On the other hand, when you call reinterpret_cast the CPU does not invoke any calculations. It just treats a set of bits in the memory like if it had another type. So when you convert int* to float* with this keyword, the new value (after pointer dereferecing) has nothing to do with the old value in mathematical meaning (ignoring the fact that it is undefined behavior to read this value).
Be aware that reading or modifying values after reinterprt_cast'ing are very often Undefined Behavior. In most cases, you should use pointer or reference to std::byte (starting from C++17) if you want to achieve the bit representation of some data, it is almost always a legal operation. Other "safe" types are char and unsigned char, but I would say it shouldn't be used for that purpose in modern C++ as std::byte has better semantics.
Example: It is true that reinterpret_cast is not portable because of one reason - byte order (endianness). But this is often surprisingly the best reason to use it. Let's imagine the example: you have to read binary 32bit number from file, and you know it is big endian. Your code has to be generic and works properly on big endian (e.g. some ARM) and little endian (e.g. x86) systems. So you have to check the byte order. It is well-known on compile time so you can write constexpr function: You can write a function to achieve this:
/*constexpr*/ bool is_little_endian() {
std::uint16_t x=0x0001;
auto p = reinterpret_cast<std::uint8_t*>(&x);
return *p != 0;
}
Explanation: the binary representation of x in memory could be 0000'0000'0000'0001 (big) or 0000'0001'0000'0000 (little endian). After reinterpret-casting the byte under p pointer could be respectively 0000'0000 or 0000'0001. If you use static-casting, it will always be 0000'0001, no matter what endianness is being used.
EDIT:
In the first version I made example function is_little_endian to be constexpr. It compiles fine on the newest gcc (8.3.0) but the standard says it is illegal. The clang compiler refuses to compile it (which is correct).
The meaning of reinterpret_cast is not defined by the C++ standard. Hence, in theory a reinterpret_cast could crash your program. In practice compilers try to do what you expect, which is to interpret the bits of what you are passing in as if they were the type you are casting to. If you know what the compilers you are going to use do with reinterpret_cast you can use it, but to say that it is portable would be lying.
For the case you describe, and pretty much any case where you might consider reinterpret_cast, you can use static_cast or some other alternative instead. Among other things the standard has this to say about what you can expect of static_cast (§5.2.9):
An rvalue of type “pointer to cv void” can be explicitly converted to a pointer to object type. A value of type pointer to object converted to “pointer to cv void” and back to the original pointer type will have its original value.
So for your use case, it seems fairly clear that the standardization committee intended for you to use static_cast.
One use of reinterpret_cast is if you want to apply bitwise operations to (IEEE 754) floats. One example of this was the Fast Inverse Square-Root trick:
https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
It treats the binary representation of the float as an integer, shifts it right and subtracts it from a constant, thereby halving and negating the exponent. After converting back to a float, it's subjected to a Newton-Raphson iteration to make this approximation more exact:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the deuce?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
This was originally written in C, so uses C casts, but the analogous C++ cast is the reinterpret_cast.
Here is a variant of Avi Ginsburg's program which clearly illustrates the property of reinterpret_cast mentioned by Chris Luengo, flodin, and cmdLP: that the compiler treats the pointed-to memory location as if it were an object of the new type:
#include <iostream>
#include <string>
#include <iomanip>
using namespace std;
class A
{
public:
int i;
};
class B : public A
{
public:
virtual void f() {}
};
int main()
{
string s;
B b;
b.i = 0;
A* as = static_cast<A*>(&b);
A* ar = reinterpret_cast<A*>(&b);
B* c = reinterpret_cast<B*>(ar);
cout << "as->i = " << hex << setfill('0') << as->i << "\n";
cout << "ar->i = " << ar->i << "\n";
cout << "b.i = " << b.i << "\n";
cout << "c->i = " << c->i << "\n";
cout << "\n";
cout << "&(as->i) = " << &(as->i) << "\n";
cout << "&(ar->i) = " << &(ar->i) << "\n";
cout << "&(b.i) = " << &(b.i) << "\n";
cout << "&(c->i) = " << &(c->i) << "\n";
cout << "\n";
cout << "&b = " << &b << "\n";
cout << "as = " << as << "\n";
cout << "ar = " << ar << "\n";
cout << "c = " << c << "\n";
cout << "Press ENTER to exit.\n";
getline(cin,s);
}
Which results in output like this:
as->i = 0
ar->i = 50ee64
b.i = 0
c->i = 0
&(as->i) = 00EFF978
&(ar->i) = 00EFF974
&(b.i) = 00EFF978
&(c->i) = 00EFF978
&b = 00EFF974
as = 00EFF978
ar = 00EFF974
c = 00EFF974
Press ENTER to exit.
It can be seen that the B object is built in memory as B-specific data first, followed by the embedded A object. The static_cast correctly returns the address of the embedded A object, and the pointer created by static_cast correctly gives the value of the data field. The pointer generated by reinterpret_cast treats b's memory location as if it were a plain A object, and so when the pointer tries to get the data field it returns some B-specific data as if it were the contents of this field.
One use of reinterpret_cast is to convert a pointer to an unsigned integer (when pointers and unsigned integers are the same size):
int i;
unsigned int u = reinterpret_cast<unsigned int>(&i);
You could use reinterprete_cast to check inheritance at compile time.
Look here:
Using reinterpret_cast to check inheritance at compile time
template <class outType, class inType>
outType safe_cast(inType pointer)
{
void* temp = static_cast<void*>(pointer);
return static_cast<outType>(temp);
}
I tried to conclude and wrote a simple safe cast using templates.
Note that this solution doesn't guarantee to cast pointers on a functions.
First you have some data in a specific type like int here:
int x = 0x7fffffff://==nan in binary representation
Then you want to access the same variable as an other type like float:
You can decide between
float y = reinterpret_cast<float&>(x);
//this could only be used in cpp, looks like a function with template-parameters
or
float y = *(float*)&(x);
//this could be used in c and cpp
BRIEF: it means that the same memory is used as a different type. So you could convert binary representations of floats as int type like above to floats. 0x80000000 is -0 for example (the mantissa and exponent are null but the sign, the msb, is one. This also works for doubles and long doubles.
OPTIMIZE: I think reinterpret_cast would be optimized in many compilers, while the c-casting is made by pointerarithmetic (the value must be copied to the memory, cause pointers couldn't point to cpu- registers).
NOTE: In both cases you should save the casted value in a variable before cast! This macro could help:
#define asvar(x) ({decltype(x) __tmp__ = (x); __tmp__; })
Quick answer: use static_cast if it compiles, otherwise resort to reinterpret_cast.
Read the FAQ! Holding C++ data in C can be risky.
In C++, a pointer to an object can be converted to void * without any casts. But it's not true the other way round. You'd need a static_cast to get the original pointer back.
Is it possible to cast pointers to arrays in C++?
Is casting pointers to arrays in C++, problem free? and / or, is it considered to be a good practice?
If it is possible, then what is the method of casting pointers to arrays in C++? and if there are problems with doing so, then how to avoid them?
Basically I am trying to cast a part of a const char[x] to const char[y], where y <= x and the Original const char[x] can not be accessed (in order to change it's forms to std::array or anything else) and (it's content) should not be copied to another array or pointer.
But to simplify the problem, lets consider the simple case of casting a pointerconst char *pointer; to an arrayconst char array[n];. I (guess) this can be accomplished by casting the address of the pointer to address of the array by (possibly) one of the following loosely described methods:
void *voidPointer = &pointer;
static_cast<const char (*)[n]>(voidPointer);
(const char (*)[n]) &pointer;
reinterpret_cast<const char (*)[n]>(&pointer);
And then reading the content of that address.
In order to test these cases of casting a pointer to an array , I ran the following code:
const char string[] = "123";
const char *pointer = string;
const char **pointerAddress = &pointer;
typedef const char array_type[sizeof(string) / sizeof(char)];
typedef array_type *arrayPointer_type;
void *voidPointer = pointerAddress;
arrayPointer_type arrayPointer[3];
arrayPointer[0] = static_cast<arrayPointer_type>(voidPointer);
arrayPointer[1] = (arrayPointer_type) pointerAddress;
arrayPointer[2] = reinterpret_cast<arrayPointer_type>(pointerAddress);
for(const arrayPointer_type& element : arrayPointer)
{
const char *array = *element;
bool sameValue = true;
bool sameAddress = true;
for(size_t idx = (sizeof(string) / sizeof(char)) - 1; idx--;) {
std::cout << "idx[" << idx << "]" << "pointer[" << pointer[idx]<< "]" << "array[" << array[idx]<< "], ";
if(pointer[idx] != array[idx]) {
sameValue = false;
}
if(&pointer[idx] != &array[idx]) {
sameAddress = false;
}
}
std::cout << std::endl << "sameValue[" << sameValue << "]" << "sameAddress[" << sameAddress << "]" <<std::endl <<std::endl;
}
And I got the following result:
idx[2]pointer[3]array[V], idx[1]pointer[2]array[], idx[0]pointer[1]array[�],
sameValue[0]sameAddress[0]
idx[2]pointer[3]array[V], idx[1]pointer[2]array[], idx[0]pointer[1]array[�],
sameValue[0]sameAddress[0]
idx[2]pointer[3]array[V], idx[1]pointer[2]array[], idx[0]pointer[1]array[�],
sameValue[0]sameAddress[0]
Which shows that non of the casts where correct in term of keeping(not changing) the content of string!
I studied Casting pointer to Array (int* to int[2]) and Passing a char pointer to a function accepting a reference to a char array , but I was not able to find the correct way!
So is it OK to cast pointers to arrays in C++? if so What is the correct way to cast pointers to arrays without changing their contents, in C++?
Update:
The library that I am working on is going to be used only on a very low resourced embedded platform and is going to be compiled only with GCC.
The reason for casting a portion of the const char [x] to const char [y] and then passing it to another functions or methods or ... is simply conformance with my other template functions or methods or ..., which can use the extra information in order to boost the speed of a process which it's current (non-optimized) version has already failed due to lack of speed. Please also note that the types of the strings that I want to deploy are not necessarily Null-terminated.
The following method is one example of such methods:
template<size_t STRING_SIZE>
requires conceptLowSize<STRING_SIZE>
self &operator<<(const char (&string)[STRING_SIZE])
{
.
.
.
return *this;
}
I am aware of the fact that deploying template functions/methods in this way might has the overhead of program size, but in this specific case, higher speed and lower memory consumption is more preferable than lower binary size of the compiled program.
Also I tried many other speed optimization solutions including compiling with -o3 option and the speed did not observably improve.
Your variable pointer contains the address of the first character in string; while, pointerAddress is a pointer to a variable containing the address of the first character in string. Basically pointerAddress has nothing to do with string so
void *voidPointer = pointerAddress;
arrayPointer_type arrayPointer[3];
arrayPointer[0] = static_cast<arrayPointer_type>(voidPointer);
arrayPointer[1] = (arrayPointer_type) pointerAddress;
arrayPointer[2] = reinterpret_cast<arrayPointer_type>(pointerAddress);
is all wrong in that it is all casting pointerAddress which contains the address of some other variable. The address of string is the address of its first character i.e. try this
void* voidPointer = const_cast<char*>(pointer);
arrayPointer_type arrayPointer[3];
arrayPointer[0] = static_cast<arrayPointer_type>(voidPointer);
arrayPointer[1] = (arrayPointer_type)pointer;
arrayPointer[2] = reinterpret_cast<arrayPointer_type>(pointer);
Not really sure what's going on here, I'm using Clion as my IDE which I don't believe has anything to do with this but I figured I'd add that information. My confusion comes from a function that I wrote
int Arry()
{
int Mynumbers [5] = {10};
std::cout << Mynumbers;
}
something simple. It should be assigning 5 integers the value of 10. But when I print out Mynumbers I am shown the memory address. Why is this happening, I thought that was what calling pointers was for. Thank you for your time.
Sincerely,
Nicholas
It is a bit complicated, and there are a few issues at play:
std::cout (actually, std::ostream, of which std::cout is an instance, does not have an overload of operator<< that understands plain arrays. It does have overloads that understand pointers.
In C++ (and C) an array name can be used as an expression in a place where a pointer is needed. When there is no better option, the array name will decay to a pointer. That is what makes the following legal: int a[10] = {}; int* p = a;.
The overload that takes a pointer prints it as a hexadecimal address, unless the pointer is of type char* or const char* (or wchar versions), in which case it treats it as a null terminated string.
This is what is happening here: because there isn't an operator<< overload that matches the array, it decays to the overload taking a pointer. And as it isn't a character type pointer, you see the hexadecimal address. You are seeing the equivalent of cout << &MyNumbers[0];.
Some notes:
void Arry() // use void if nothing is being returned
{
int Mynumbers[5] = {10}; // first element is 10, the rest are 0
//std::cout << Mynumbers; // will print the first address because the array decays to a pointer which is then printed
for (auto i : Mynumbers) // range-for takes note of the size of the array (no decay)
std::cout << i << '\t';
}
In C++, you can think of an array as a pointer to a memory address (this isn't strictly true, and others can explain the subtle differences). When you are calling cout on your array name, you are asking for it's contents: the memory address.
If you wish to see what's in the array, you can use a simple for loop:
for (int i = 0; i < 5; i++)
std::cout << Mynumbers[i] << " ";
The value of Mynumbers is in fact the adress of the first element in the array.
try the following:
for(int i=0; i<5;i++) {
cout << Mynumbers[i];
}
In "void pointers" example in the tutorial on cplusplus.com, I try to compare like following. Why do we still need * in parenthesis? What is happening when no *?
void increase(void* data, int psize) {
if (psize == sizeof(char)) {
char* pchar;
pchar = (char*) data;
cout << "pchar=" << pchar << endl;
cout << "*pchar=" << *pchar << endl;
//++(*pchar); // increases the value pointed to, as expected
++(pchar); // the value pointed to doesn't change
} else if (psize == sizeof(int)) {
int* pint;
pint = (int*) data;
//++(*pint); // increases the value pointed to, as expected
++(pint); // the value pointed to doesn't change
}
}
int main() {
char a = 'x';
int b = 1602;
increase(&a, sizeof(a));
increase(&b, sizeof(b));
cout << a << ", " << b << endl;
return 0;
}
Update after accepting solution) I try to make clear what I didn't get, based on #Cody Gray's answer. The address of pchar is incremented to point to nonsense location. But because variable a in main is coutted instead of pchar, this cout still prints a value that somewhat makes sense (in this example it would be 'x').
The * operator dereferences the pointer.
Therefore, this code:
++(*pint)
increments the value pointed to by pint.
By contrast, this code:
++(pint)
increments the pointer itself, pint. This is why the value pointed to by the pointer doesn't change. You're not changing the value pointed to by the pointer, but rather the value of the pointer itself.
Incrementing a pointer will cause it to point to an entirely different value in memory. In this case, since you only allocated space for a single integer, it will point to a nonsense value.
When you make a cast from pointer to other pointer
char* pchar;
pchar = (char*) data;
you telling to compiler to process this memory as char pointer, in which you could perform pointer arithmetic to this variable. Therefore, you are not working with the value of the pointer just with pointer, that's is a memory address in which you are making the arithmetic.
To work with the value you need to use "*" or access to memory pointer by the char.
I am little confused with the applicability of reinterpret_cast vs static_cast. From what I have read the general rules are to use static cast when the types can be interpreted at compile time hence the word static. This is the cast the C++ compiler uses internally for implicit casts also.
reinterpret_casts are applicable in two scenarios:
convert integer types to pointer types and vice versa
convert one pointer type to another. The general idea I get is this is unportable and should be avoided.
Where I am a little confused is one usage which I need, I am calling C++ from C and the C code needs to hold on to the C++ object so basically it holds a void*. What cast should be used to convert between the void * and the Class type?
I have seen usage of both static_cast and reinterpret_cast? Though from what I have been reading it appears static is better as the cast can happen at compile time? Though it says to use reinterpret_cast to convert from one pointer type to another?
The C++ standard guarantees the following:
static_casting a pointer to and from void* preserves the address. That is, in the following, a, b and c all point to the same address:
int* a = new int();
void* b = static_cast<void*>(a);
int* c = static_cast<int*>(b);
reinterpret_cast only guarantees that if you cast a pointer to a different type, and then reinterpret_cast it back to the original type, you get the original value. So in the following:
int* a = new int();
void* b = reinterpret_cast<void*>(a);
int* c = reinterpret_cast<int*>(b);
a and c contain the same value, but the value of b is unspecified. (in practice it will typically contain the same address as a and c, but that's not specified in the standard, and it may not be true on machines with more complex memory systems.)
For casting to and from void*, static_cast should be preferred.
One case when reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control. Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data:
// vendor.hpp
typedef struct _Opaque * VendorGlobalUserData;
void VendorSetUserData(VendorGlobalUserData p);
VendorGlobalUserData VendorGetUserData();
To use this API, the programmer must cast their data to VendorGlobalUserData and back again. static_cast won't work, one must use reinterpret_cast:
// main.cpp
#include "vendor.hpp"
#include <iostream>
using namespace std;
struct MyUserData {
MyUserData() : m(42) {}
int m;
};
int main() {
MyUserData u;
// store global data
VendorGlobalUserData d1;
// d1 = &u; // compile error
// d1 = static_cast<VendorGlobalUserData>(&u); // compile error
d1 = reinterpret_cast<VendorGlobalUserData>(&u); // ok
VendorSetUserData(d1);
// do other stuff...
// retrieve global data
VendorGlobalUserData d2 = VendorGetUserData();
MyUserData * p = 0;
// p = d2; // compile error
// p = static_cast<MyUserData *>(d2); // compile error
p = reinterpret_cast<MyUserData *>(d2); // ok
if (p) { cout << p->m << endl; }
return 0;
}
Below is a contrived implementation of the sample API:
// vendor.cpp
static VendorGlobalUserData g = 0;
void VendorSetUserData(VendorGlobalUserData p) { g = p; }
VendorGlobalUserData VendorGetUserData() { return g; }
The short answer:
If you don't know what reinterpret_cast stands for, don't use it. If you will need it in the future, you will know.
Full answer:
Let's consider basic number types.
When you convert for example int(12) to unsigned float (12.0f) your processor needs to invoke some calculations as both numbers has different bit representation. This is what static_cast stands for.
On the other hand, when you call reinterpret_cast the CPU does not invoke any calculations. It just treats a set of bits in the memory like if it had another type. So when you convert int* to float* with this keyword, the new value (after pointer dereferecing) has nothing to do with the old value in mathematical meaning (ignoring the fact that it is undefined behavior to read this value).
Be aware that reading or modifying values after reinterprt_cast'ing are very often Undefined Behavior. In most cases, you should use pointer or reference to std::byte (starting from C++17) if you want to achieve the bit representation of some data, it is almost always a legal operation. Other "safe" types are char and unsigned char, but I would say it shouldn't be used for that purpose in modern C++ as std::byte has better semantics.
Example: It is true that reinterpret_cast is not portable because of one reason - byte order (endianness). But this is often surprisingly the best reason to use it. Let's imagine the example: you have to read binary 32bit number from file, and you know it is big endian. Your code has to be generic and works properly on big endian (e.g. some ARM) and little endian (e.g. x86) systems. So you have to check the byte order. It is well-known on compile time so you can write constexpr function: You can write a function to achieve this:
/*constexpr*/ bool is_little_endian() {
std::uint16_t x=0x0001;
auto p = reinterpret_cast<std::uint8_t*>(&x);
return *p != 0;
}
Explanation: the binary representation of x in memory could be 0000'0000'0000'0001 (big) or 0000'0001'0000'0000 (little endian). After reinterpret-casting the byte under p pointer could be respectively 0000'0000 or 0000'0001. If you use static-casting, it will always be 0000'0001, no matter what endianness is being used.
EDIT:
In the first version I made example function is_little_endian to be constexpr. It compiles fine on the newest gcc (8.3.0) but the standard says it is illegal. The clang compiler refuses to compile it (which is correct).
The meaning of reinterpret_cast is not defined by the C++ standard. Hence, in theory a reinterpret_cast could crash your program. In practice compilers try to do what you expect, which is to interpret the bits of what you are passing in as if they were the type you are casting to. If you know what the compilers you are going to use do with reinterpret_cast you can use it, but to say that it is portable would be lying.
For the case you describe, and pretty much any case where you might consider reinterpret_cast, you can use static_cast or some other alternative instead. Among other things the standard has this to say about what you can expect of static_cast (§5.2.9):
An rvalue of type “pointer to cv void” can be explicitly converted to a pointer to object type. A value of type pointer to object converted to “pointer to cv void” and back to the original pointer type will have its original value.
So for your use case, it seems fairly clear that the standardization committee intended for you to use static_cast.
One use of reinterpret_cast is if you want to apply bitwise operations to (IEEE 754) floats. One example of this was the Fast Inverse Square-Root trick:
https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
It treats the binary representation of the float as an integer, shifts it right and subtracts it from a constant, thereby halving and negating the exponent. After converting back to a float, it's subjected to a Newton-Raphson iteration to make this approximation more exact:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the deuce?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
This was originally written in C, so uses C casts, but the analogous C++ cast is the reinterpret_cast.
Here is a variant of Avi Ginsburg's program which clearly illustrates the property of reinterpret_cast mentioned by Chris Luengo, flodin, and cmdLP: that the compiler treats the pointed-to memory location as if it were an object of the new type:
#include <iostream>
#include <string>
#include <iomanip>
using namespace std;
class A
{
public:
int i;
};
class B : public A
{
public:
virtual void f() {}
};
int main()
{
string s;
B b;
b.i = 0;
A* as = static_cast<A*>(&b);
A* ar = reinterpret_cast<A*>(&b);
B* c = reinterpret_cast<B*>(ar);
cout << "as->i = " << hex << setfill('0') << as->i << "\n";
cout << "ar->i = " << ar->i << "\n";
cout << "b.i = " << b.i << "\n";
cout << "c->i = " << c->i << "\n";
cout << "\n";
cout << "&(as->i) = " << &(as->i) << "\n";
cout << "&(ar->i) = " << &(ar->i) << "\n";
cout << "&(b.i) = " << &(b.i) << "\n";
cout << "&(c->i) = " << &(c->i) << "\n";
cout << "\n";
cout << "&b = " << &b << "\n";
cout << "as = " << as << "\n";
cout << "ar = " << ar << "\n";
cout << "c = " << c << "\n";
cout << "Press ENTER to exit.\n";
getline(cin,s);
}
Which results in output like this:
as->i = 0
ar->i = 50ee64
b.i = 0
c->i = 0
&(as->i) = 00EFF978
&(ar->i) = 00EFF974
&(b.i) = 00EFF978
&(c->i) = 00EFF978
&b = 00EFF974
as = 00EFF978
ar = 00EFF974
c = 00EFF974
Press ENTER to exit.
It can be seen that the B object is built in memory as B-specific data first, followed by the embedded A object. The static_cast correctly returns the address of the embedded A object, and the pointer created by static_cast correctly gives the value of the data field. The pointer generated by reinterpret_cast treats b's memory location as if it were a plain A object, and so when the pointer tries to get the data field it returns some B-specific data as if it were the contents of this field.
One use of reinterpret_cast is to convert a pointer to an unsigned integer (when pointers and unsigned integers are the same size):
int i;
unsigned int u = reinterpret_cast<unsigned int>(&i);
You could use reinterprete_cast to check inheritance at compile time.
Look here:
Using reinterpret_cast to check inheritance at compile time
template <class outType, class inType>
outType safe_cast(inType pointer)
{
void* temp = static_cast<void*>(pointer);
return static_cast<outType>(temp);
}
I tried to conclude and wrote a simple safe cast using templates.
Note that this solution doesn't guarantee to cast pointers on a functions.
First you have some data in a specific type like int here:
int x = 0x7fffffff://==nan in binary representation
Then you want to access the same variable as an other type like float:
You can decide between
float y = reinterpret_cast<float&>(x);
//this could only be used in cpp, looks like a function with template-parameters
or
float y = *(float*)&(x);
//this could be used in c and cpp
BRIEF: it means that the same memory is used as a different type. So you could convert binary representations of floats as int type like above to floats. 0x80000000 is -0 for example (the mantissa and exponent are null but the sign, the msb, is one. This also works for doubles and long doubles.
OPTIMIZE: I think reinterpret_cast would be optimized in many compilers, while the c-casting is made by pointerarithmetic (the value must be copied to the memory, cause pointers couldn't point to cpu- registers).
NOTE: In both cases you should save the casted value in a variable before cast! This macro could help:
#define asvar(x) ({decltype(x) __tmp__ = (x); __tmp__; })
Quick answer: use static_cast if it compiles, otherwise resort to reinterpret_cast.
Read the FAQ! Holding C++ data in C can be risky.
In C++, a pointer to an object can be converted to void * without any casts. But it's not true the other way round. You'd need a static_cast to get the original pointer back.