C++ Read access violation for vector - c++

Im getting an exception when I`m trying to use vector[int_number] and my program stop working.
uint64_t data = 0xffeeddccbbaa5577;
uint16_t *vector = (uint16_t*) data;
int currentPosition = 0;
while (currentPosition <= 3) {
uint16_t header = vector[currentPosition]; // problem here
Visual Studio 2017 returns me: Unhandled exception thrown: read access violation.
vector was 0x6111F12.
Im stuck here. If you have any idea what I should do I`ll be grateful. Thanks in advance!

Setting aside all the undefined behaviour you get due to strict aliasing violations, in the current crop of Intel chips and MSVC runtime, all pointers are 48 bits.
So 0xffeeddccbbaa5577 is never a valid pointer value.
So the behaviour on dereferencing that value will be undefined.
If you wanted to break up data, into four elements of an appropriate type, then one method is to create a uint16_t foo[4] say and memcpy the data starting at &data to foo.

By accessing the data through a pointer of different type you obtained by casting you wander off into undefined-behavior-land. Instead of this, try the following (note I also replaced your while loop with a ranged for loop avoiding to have to keep a counter)
#include <iostream>
#include <cstring>
int main() {
uint64_t data = 0xffeeddccbbaa5577;
uint16_t vector[4];
memcpy(vector, &data, sizeof(uint64_t));
for (uint16_t header : vector)
{
std::cout << std::hex << header << std::endl;
}
}
yielding
5577
bbaa
ddcc
ffee
If you use reinterpret_cast you hold two pointers of different type pointing to same address which may easily lead to undefined behavior. memcpy avoids that by creating a copy of the memory location and you may safly access it with a pointer of a different type. Also take a look into type-punning (as pointed out by #DanielLangr)

It's really very easy, but you were so far off with your original attempt that you've confused everyone.
uint16_t vector[] = { 0x5577, 0xbbaa, 0xddcc, 0xffee };
Ask the right question, if you'd asked the question you have in the comments we'd have got there a lot quicker.

Here's a concrete example that should avoid any undefined behavior due to strict aliasing / "illegal" casts / etc., since this seems to be what you're actually interested in.
This code takes a std::uint64_t, copies it into an array of four std::uint16_ts, modifies the values in the array, and then copies them back into the original std::uint64_t.
#include <cstdint>
#include <cstring>
#include <iostream>
int main() {
std::uint64_t data = 0xffeeddccbbaa5577;
std::uint16_t data_spliced[4];
std::memcpy(&data_spliced, &data, sizeof(data));
std::cout << "Original data:\n" << data << "\nOriginal, spliced data:\n";
for (const auto spliced_value : data_spliced) {
std::cout << spliced_value << " ";
}
std::cout << "\n\n";
data_spliced[2] = 0xd00d;
memcpy(&data, &data_spliced, sizeof(data));
std::cout << "Modified data:\n" << data << "\nModified, spliced data:\n";
for (const auto spliced_value : data_spliced) {
std::cout << spliced_value << " ";
}
std::cout << '\n';
}
With output (on my machine):
Original data:
18441921395520329079
Original, spliced data:
21879 48042 56780 65518
Modified data:
18441906281530414455
Modified, spliced data:
21879 48042 53261 65518

You need to take the address of that variable if you want to assign it to a pointer
const uint16_t* vector = reinterpret_cast<const uint16_t*>( &data ) ;
NOTE:
This works on MSVC 2017, but...
This is a truck load of undefined behaviour! – Bathsheba
As cpprefrence for reinterpret_cast says:
5) Any pointer to object of type T1 can be converted to pointer to object of another type cv T2. This is exactly equivalent to static_cast<cv T2*>(static_cast<cv void*>(expression)) (which implies that if T2's alignment requirement is not stricter than T1's, the value of the pointer does not change and conversion of the resulting pointer back to its original type yields the original value). In any case, the resulting pointer may only be dereferenced safely if allowed by the type aliasing rules (see below)
...
Type aliasing.
Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:
AliasedType and DynamicType are similar.
AliasedType is the
(possibly cv-qualified) signed or unsigned variant of DynamicType.
AliasedType is std::byte (since C++17), char, or unsigned char: this
permits examination of the object representation of any object as an
array of bytes.
Note that many C++ compilers relax this rule, as a non-standard language extension, to allow wrong-type access through the inactive member of a union (such access is not undefined in C)
The above code does not fulfill any of the Aliasing rules.

Related

Is reinterpret_cast only made for type punning? [duplicate]

I am little confused with the applicability of reinterpret_cast vs static_cast. From what I have read the general rules are to use static cast when the types can be interpreted at compile time hence the word static. This is the cast the C++ compiler uses internally for implicit casts also.
reinterpret_casts are applicable in two scenarios:
convert integer types to pointer types and vice versa
convert one pointer type to another. The general idea I get is this is unportable and should be avoided.
Where I am a little confused is one usage which I need, I am calling C++ from C and the C code needs to hold on to the C++ object so basically it holds a void*. What cast should be used to convert between the void * and the Class type?
I have seen usage of both static_cast and reinterpret_cast? Though from what I have been reading it appears static is better as the cast can happen at compile time? Though it says to use reinterpret_cast to convert from one pointer type to another?
The C++ standard guarantees the following:
static_casting a pointer to and from void* preserves the address. That is, in the following, a, b and c all point to the same address:
int* a = new int();
void* b = static_cast<void*>(a);
int* c = static_cast<int*>(b);
reinterpret_cast only guarantees that if you cast a pointer to a different type, and then reinterpret_cast it back to the original type, you get the original value. So in the following:
int* a = new int();
void* b = reinterpret_cast<void*>(a);
int* c = reinterpret_cast<int*>(b);
a and c contain the same value, but the value of b is unspecified. (in practice it will typically contain the same address as a and c, but that's not specified in the standard, and it may not be true on machines with more complex memory systems.)
For casting to and from void*, static_cast should be preferred.
One case when reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control. Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data:
// vendor.hpp
typedef struct _Opaque * VendorGlobalUserData;
void VendorSetUserData(VendorGlobalUserData p);
VendorGlobalUserData VendorGetUserData();
To use this API, the programmer must cast their data to VendorGlobalUserData and back again. static_cast won't work, one must use reinterpret_cast:
// main.cpp
#include "vendor.hpp"
#include <iostream>
using namespace std;
struct MyUserData {
MyUserData() : m(42) {}
int m;
};
int main() {
MyUserData u;
// store global data
VendorGlobalUserData d1;
// d1 = &u; // compile error
// d1 = static_cast<VendorGlobalUserData>(&u); // compile error
d1 = reinterpret_cast<VendorGlobalUserData>(&u); // ok
VendorSetUserData(d1);
// do other stuff...
// retrieve global data
VendorGlobalUserData d2 = VendorGetUserData();
MyUserData * p = 0;
// p = d2; // compile error
// p = static_cast<MyUserData *>(d2); // compile error
p = reinterpret_cast<MyUserData *>(d2); // ok
if (p) { cout << p->m << endl; }
return 0;
}
Below is a contrived implementation of the sample API:
// vendor.cpp
static VendorGlobalUserData g = 0;
void VendorSetUserData(VendorGlobalUserData p) { g = p; }
VendorGlobalUserData VendorGetUserData() { return g; }
The short answer:
If you don't know what reinterpret_cast stands for, don't use it. If you will need it in the future, you will know.
Full answer:
Let's consider basic number types.
When you convert for example int(12) to unsigned float (12.0f) your processor needs to invoke some calculations as both numbers has different bit representation. This is what static_cast stands for.
On the other hand, when you call reinterpret_cast the CPU does not invoke any calculations. It just treats a set of bits in the memory like if it had another type. So when you convert int* to float* with this keyword, the new value (after pointer dereferecing) has nothing to do with the old value in mathematical meaning (ignoring the fact that it is undefined behavior to read this value).
Be aware that reading or modifying values after reinterprt_cast'ing are very often Undefined Behavior. In most cases, you should use pointer or reference to std::byte (starting from C++17) if you want to achieve the bit representation of some data, it is almost always a legal operation. Other "safe" types are char and unsigned char, but I would say it shouldn't be used for that purpose in modern C++ as std::byte has better semantics.
Example: It is true that reinterpret_cast is not portable because of one reason - byte order (endianness). But this is often surprisingly the best reason to use it. Let's imagine the example: you have to read binary 32bit number from file, and you know it is big endian. Your code has to be generic and works properly on big endian (e.g. some ARM) and little endian (e.g. x86) systems. So you have to check the byte order. It is well-known on compile time so you can write constexpr function: You can write a function to achieve this:
/*constexpr*/ bool is_little_endian() {
std::uint16_t x=0x0001;
auto p = reinterpret_cast<std::uint8_t*>(&x);
return *p != 0;
}
Explanation: the binary representation of x in memory could be 0000'0000'0000'0001 (big) or 0000'0001'0000'0000 (little endian). After reinterpret-casting the byte under p pointer could be respectively 0000'0000 or 0000'0001. If you use static-casting, it will always be 0000'0001, no matter what endianness is being used.
EDIT:
In the first version I made example function is_little_endian to be constexpr. It compiles fine on the newest gcc (8.3.0) but the standard says it is illegal. The clang compiler refuses to compile it (which is correct).
The meaning of reinterpret_cast is not defined by the C++ standard. Hence, in theory a reinterpret_cast could crash your program. In practice compilers try to do what you expect, which is to interpret the bits of what you are passing in as if they were the type you are casting to. If you know what the compilers you are going to use do with reinterpret_cast you can use it, but to say that it is portable would be lying.
For the case you describe, and pretty much any case where you might consider reinterpret_cast, you can use static_cast or some other alternative instead. Among other things the standard has this to say about what you can expect of static_cast (§5.2.9):
An rvalue of type “pointer to cv void” can be explicitly converted to a pointer to object type. A value of type pointer to object converted to “pointer to cv void” and back to the original pointer type will have its original value.
So for your use case, it seems fairly clear that the standardization committee intended for you to use static_cast.
One use of reinterpret_cast is if you want to apply bitwise operations to (IEEE 754) floats. One example of this was the Fast Inverse Square-Root trick:
https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
It treats the binary representation of the float as an integer, shifts it right and subtracts it from a constant, thereby halving and negating the exponent. After converting back to a float, it's subjected to a Newton-Raphson iteration to make this approximation more exact:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the deuce?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
This was originally written in C, so uses C casts, but the analogous C++ cast is the reinterpret_cast.
Here is a variant of Avi Ginsburg's program which clearly illustrates the property of reinterpret_cast mentioned by Chris Luengo, flodin, and cmdLP: that the compiler treats the pointed-to memory location as if it were an object of the new type:
#include <iostream>
#include <string>
#include <iomanip>
using namespace std;
class A
{
public:
int i;
};
class B : public A
{
public:
virtual void f() {}
};
int main()
{
string s;
B b;
b.i = 0;
A* as = static_cast<A*>(&b);
A* ar = reinterpret_cast<A*>(&b);
B* c = reinterpret_cast<B*>(ar);
cout << "as->i = " << hex << setfill('0') << as->i << "\n";
cout << "ar->i = " << ar->i << "\n";
cout << "b.i = " << b.i << "\n";
cout << "c->i = " << c->i << "\n";
cout << "\n";
cout << "&(as->i) = " << &(as->i) << "\n";
cout << "&(ar->i) = " << &(ar->i) << "\n";
cout << "&(b.i) = " << &(b.i) << "\n";
cout << "&(c->i) = " << &(c->i) << "\n";
cout << "\n";
cout << "&b = " << &b << "\n";
cout << "as = " << as << "\n";
cout << "ar = " << ar << "\n";
cout << "c = " << c << "\n";
cout << "Press ENTER to exit.\n";
getline(cin,s);
}
Which results in output like this:
as->i = 0
ar->i = 50ee64
b.i = 0
c->i = 0
&(as->i) = 00EFF978
&(ar->i) = 00EFF974
&(b.i) = 00EFF978
&(c->i) = 00EFF978
&b = 00EFF974
as = 00EFF978
ar = 00EFF974
c = 00EFF974
Press ENTER to exit.
It can be seen that the B object is built in memory as B-specific data first, followed by the embedded A object. The static_cast correctly returns the address of the embedded A object, and the pointer created by static_cast correctly gives the value of the data field. The pointer generated by reinterpret_cast treats b's memory location as if it were a plain A object, and so when the pointer tries to get the data field it returns some B-specific data as if it were the contents of this field.
One use of reinterpret_cast is to convert a pointer to an unsigned integer (when pointers and unsigned integers are the same size):
int i;
unsigned int u = reinterpret_cast<unsigned int>(&i);
You could use reinterprete_cast to check inheritance at compile time.
Look here:
Using reinterpret_cast to check inheritance at compile time
template <class outType, class inType>
outType safe_cast(inType pointer)
{
void* temp = static_cast<void*>(pointer);
return static_cast<outType>(temp);
}
I tried to conclude and wrote a simple safe cast using templates.
Note that this solution doesn't guarantee to cast pointers on a functions.
First you have some data in a specific type like int here:
int x = 0x7fffffff://==nan in binary representation
Then you want to access the same variable as an other type like float:
You can decide between
float y = reinterpret_cast<float&>(x);
//this could only be used in cpp, looks like a function with template-parameters
or
float y = *(float*)&(x);
//this could be used in c and cpp
BRIEF: it means that the same memory is used as a different type. So you could convert binary representations of floats as int type like above to floats. 0x80000000 is -0 for example (the mantissa and exponent are null but the sign, the msb, is one. This also works for doubles and long doubles.
OPTIMIZE: I think reinterpret_cast would be optimized in many compilers, while the c-casting is made by pointerarithmetic (the value must be copied to the memory, cause pointers couldn't point to cpu- registers).
NOTE: In both cases you should save the casted value in a variable before cast! This macro could help:
#define asvar(x) ({decltype(x) __tmp__ = (x); __tmp__; })
Quick answer: use static_cast if it compiles, otherwise resort to reinterpret_cast.
Read the FAQ! Holding C++ data in C can be risky.
In C++, a pointer to an object can be converted to void * without any casts. But it's not true the other way round. You'd need a static_cast to get the original pointer back.

Cast an object value without pointers

Let's assume that A and B are two classes (or structures) having no inheritance relationships (thus, object slicing cannot work). I also have an object b of the type B. I would like to interpret its binary value as a value of type A:
A a = b;
I could use reinterpret_cast, but I would need to use pointers:
A a = reinterpret_cast<A>(b); // error: invalid cast
A a = *reinterpret_cast<A *>(&b); // correct [EDIT: see *footnote]
Is there a more compact way (without pointers) that does the same? (Including the case where sizeof(A) != sizeof(B))
Example of code that works using pointers: [EDIT: see *footnote]
#include <iostream>
using namespace std;
struct C {
int i;
string s;
};
struct S {
unsigned char data[sizeof(C)];
};
int main() {
C c;
c.i = 4;
c.s = "this is a string";
S s = *reinterpret_cast<S *>(&c);
C s1 = *reinterpret_cast<C *>(&s);
cout << s1.i << " " << s1.s << endl;
cout << reinterpret_cast<C *>(&s)->i << endl;
return 0;
}
*footnote: It worked when I tried it, but it is actually an undefined behavior (which means that it may work or not) - see comments below
No. I think there's nothing in the C++ syntax that allows you to implicitly ignore types. First, that's against the notion of static typing. Second, C++ lacks standardization at binary level. So, whatever you do to trick the compiler about the types you're using might be specific to a compiler implementation.
That being said, if you really wanna do it, you should check how your compiler's data alignment/padding works (i.e.: struct padding in c++) and if there's a way to control it (i.e.: What is the meaning of "__attribute__((packed, aligned(4))) "). If you're planning to do this across compilers (i.e.: with data transmitted across the network), then you should be extra careful. There are also platform issues, like different addressing models and endianness.
Yes, you can do it without a pointer:
A a = reinterpret_cast<A &>(b); // note the '&'
Note that this may be undefined behaviour. Check out the exact conditions at http://en.cppreference.com/w/cpp/language/reinterpret_cast

const char* to int cast?

I suppose the behaviour of the following snippet is supposed to be undefined but I just wanted to make sure I am understanding things right.
Let's say we have this code:
#include <iostream>
int main()
{
std::cout << "mamut" - 8 << std::endl;
return 0;
}
So what I think this does is (char*)((int)(const char*) - (int)), though the output after this is pretty strange, not that I expect it to make any real sense. So my question is about the casting between char* and int - is it undefined, or is there some logic behind it?
EDIT:
Let me just add this:
#include <iostream>
int main ()
{
const char* a = "mamut";
int b = int(a);
std::cout << b << std::endl;
std::cout << &a <<std::endl;
// seems b!= &a
for( int i = 0; i<100;i++)
{
std::cout<<(const char*)((int)a - i)<<std::endl;
}
return 0;
}
The output after i gets big enough gives me a something like _Jv_RegisterClasses etc.
Just for the record:
std::cout << a - i << std::endl;
produces the same result as:
std::cout<<(const char*)((int)a - i)<<std::endl;
There is no cast, you are merely telling cout that you want to print the string at the address of the string literal "mamut" minus 8 bytes. You are doing pointer arithmetic. cout will then print whatever happens to be at that address, or possibly crash & burn, since accessing arrays out of bounds leads to undefined behavior.
EDIT
Regarding the edit by the op: converting an address to int doesn't necessarily result in a correct number identical to the address. An address doesn't necessarily fit in an int and on top of that, int is a signed type and it doesn't make any sense to store addresses in signed types.
To guarantee a conversion from pointer to integer without losses, you need to use uintptr_t from stdint.h.
To quote the C standard 6.3.2.3 (I believe C++ is identical in this case):
Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
There is no casting going on. "mamut" is a pointer to characters, and - 8 will do pointer arithmetic on it. You are right that it's undefined behavior, so even though the semantic behavior is pointer arithmetic, the runtime behavior can be literally anything.
You are printing string starting from address of "mamut" minus 8 bytes till null terminator i.e. in total 8+5 = 13 chars

Why strange behavior with casting back pointer to the original class?

Assume that in my code I have to store a void* as data member and typecast it back to the original class pointer when needed. To test its reliability, I wrote a test program (linux ubuntu 4.4.1 g++ -04 -Wall) and I was shocked to see the behavior.
struct A
{
int i;
static int c;
A () : i(c++) { cout<<"A() : i("<<i<<")\n"; }
};
int A::c;
int main ()
{
void *p = new A[3]; // good behavior for A* p = new A[3];
cout<<"p->i = "<<((A*)p)->i<<endl;
((A*&)p)++;
cout<<"p->i = "<<((A*)p)->i<<endl;
((A*&)p)++;
cout<<"p->i = "<<((A*)p)->i<<endl;
}
This is just a test program; in actual for my case, it's mandatory to store any pointer as void* and then cast it back to the actual pointer (with help of template). So let's not worry about that part. The output of the above code is,
p->i = 0
p->i = 0 // ?? why not 1
p->i = 1
However if you change the void* p; to A* p; it gives expected behavior. WHY ?
Another question, I cannot get away with (A*&) otherwise I cannot use operator ++; but it also gives warning as, dereferencing type-punned pointer will break strict-aliasing rules. Is there any decent way to overcome warning ?
Well, as the compiler warns you, you are violating the strict aliasing rule, which formally means that the results are undefined.
You can eliminate the strict aliasing violation by using a function template for the increment:
template<typename T>
void advance_pointer_as(void*& p, int n = 1) {
T* p_a(static_cast<T*>(p));
p_a += n;
p = p_a;
}
With this function template, the following definition of main() yields the expected results on the Ideone compiler (and emits no warnings):
int main()
{
void* p = new A[3];
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
advance_pointer_as<A>(p);
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
advance_pointer_as<A>(p);
std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
}
You have already received the correct answer and it is indeed the violation of the strict aliasing rule that leads to the unpredictable behavior of the code. I'd just note that the title of your question makes reference to "casting back pointer to the original class". In reality your code does not have anything to do with casting anything "back". Your code performs reinterpretation of raw memory content occupied by a void * pointer as a A * pointer. This is not "casting back". This is reinterpretation. Not even remotely the same thing.
A good way to illustrate the difference would be to use and int and float example. A float value declared and initialized as
float f = 2.0;
cab be cast (explicitly or implicitly converted) to int type
int i = (int) f;
with the expected result
assert(i == 2);
This is indeed a cast (a conversion).
Alternatively, the same float value can be also reinterpreted as an int value
int i = (int &) f;
However, in this case the value of i will be totally meaningless and generally unpredictable. I hope it is easy to see the difference between a conversion and a memory reinterpretation from these examples.
Reinterpretation is exactly what you are doing in your code. The (A *&) p expression is nothing else than a reinterpretation of raw memory occupied by pointer void *p as pointer of type A *. The language does not guarantee that these two pointer types have the same representation and even the same size. So, expecting the predictable behavior from your code is like expecting the above (int &) f expression to evaluate to 2.
The proper way to really "cast back" your void * pointer would be to do (A *) p, not (A *&) p. The result of (A *) p would indeed be the original pointer value, that can be safely manipulated by pointer arithmetic. The only proper way to obtain the original value as an lvalue would be to use an additional variable
A *pa = (A *) p;
...
pa++;
...
And there's no legal way to create an lvalue "in place", as you attempted to by your (A *&) p cast. The behavior of your code is an illustration of that.
As others have commented, your code appears like it should work. Only once (in 17+ years of coding in C++) I ran across something where I was looking straight at the code and the behavior, like in your case, just didn't make sense. I ended up running the code through debugger and opening a disassembly window. I found what could only be explained as a bug in VS2003 compiler because it was missing exactly one instruction. Simply rearranging local variables at the top of the function (30 lines or so from the error) made the compiler put the correct instruction back in. So try debugger with disassembly and follow memory/registers to see what it's actually doing?
As far as advancing the pointer, you should be able to advance it by doing:
p = (char*)p + sizeof( A );
VS2003 through VS2010 never give you complaints about that, not sure about g++

When to use reinterpret_cast?

I am little confused with the applicability of reinterpret_cast vs static_cast. From what I have read the general rules are to use static cast when the types can be interpreted at compile time hence the word static. This is the cast the C++ compiler uses internally for implicit casts also.
reinterpret_casts are applicable in two scenarios:
convert integer types to pointer types and vice versa
convert one pointer type to another. The general idea I get is this is unportable and should be avoided.
Where I am a little confused is one usage which I need, I am calling C++ from C and the C code needs to hold on to the C++ object so basically it holds a void*. What cast should be used to convert between the void * and the Class type?
I have seen usage of both static_cast and reinterpret_cast? Though from what I have been reading it appears static is better as the cast can happen at compile time? Though it says to use reinterpret_cast to convert from one pointer type to another?
The C++ standard guarantees the following:
static_casting a pointer to and from void* preserves the address. That is, in the following, a, b and c all point to the same address:
int* a = new int();
void* b = static_cast<void*>(a);
int* c = static_cast<int*>(b);
reinterpret_cast only guarantees that if you cast a pointer to a different type, and then reinterpret_cast it back to the original type, you get the original value. So in the following:
int* a = new int();
void* b = reinterpret_cast<void*>(a);
int* c = reinterpret_cast<int*>(b);
a and c contain the same value, but the value of b is unspecified. (in practice it will typically contain the same address as a and c, but that's not specified in the standard, and it may not be true on machines with more complex memory systems.)
For casting to and from void*, static_cast should be preferred.
One case when reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control. Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data:
// vendor.hpp
typedef struct _Opaque * VendorGlobalUserData;
void VendorSetUserData(VendorGlobalUserData p);
VendorGlobalUserData VendorGetUserData();
To use this API, the programmer must cast their data to VendorGlobalUserData and back again. static_cast won't work, one must use reinterpret_cast:
// main.cpp
#include "vendor.hpp"
#include <iostream>
using namespace std;
struct MyUserData {
MyUserData() : m(42) {}
int m;
};
int main() {
MyUserData u;
// store global data
VendorGlobalUserData d1;
// d1 = &u; // compile error
// d1 = static_cast<VendorGlobalUserData>(&u); // compile error
d1 = reinterpret_cast<VendorGlobalUserData>(&u); // ok
VendorSetUserData(d1);
// do other stuff...
// retrieve global data
VendorGlobalUserData d2 = VendorGetUserData();
MyUserData * p = 0;
// p = d2; // compile error
// p = static_cast<MyUserData *>(d2); // compile error
p = reinterpret_cast<MyUserData *>(d2); // ok
if (p) { cout << p->m << endl; }
return 0;
}
Below is a contrived implementation of the sample API:
// vendor.cpp
static VendorGlobalUserData g = 0;
void VendorSetUserData(VendorGlobalUserData p) { g = p; }
VendorGlobalUserData VendorGetUserData() { return g; }
The short answer:
If you don't know what reinterpret_cast stands for, don't use it. If you will need it in the future, you will know.
Full answer:
Let's consider basic number types.
When you convert for example int(12) to unsigned float (12.0f) your processor needs to invoke some calculations as both numbers has different bit representation. This is what static_cast stands for.
On the other hand, when you call reinterpret_cast the CPU does not invoke any calculations. It just treats a set of bits in the memory like if it had another type. So when you convert int* to float* with this keyword, the new value (after pointer dereferecing) has nothing to do with the old value in mathematical meaning (ignoring the fact that it is undefined behavior to read this value).
Be aware that reading or modifying values after reinterprt_cast'ing are very often Undefined Behavior. In most cases, you should use pointer or reference to std::byte (starting from C++17) if you want to achieve the bit representation of some data, it is almost always a legal operation. Other "safe" types are char and unsigned char, but I would say it shouldn't be used for that purpose in modern C++ as std::byte has better semantics.
Example: It is true that reinterpret_cast is not portable because of one reason - byte order (endianness). But this is often surprisingly the best reason to use it. Let's imagine the example: you have to read binary 32bit number from file, and you know it is big endian. Your code has to be generic and works properly on big endian (e.g. some ARM) and little endian (e.g. x86) systems. So you have to check the byte order. It is well-known on compile time so you can write constexpr function: You can write a function to achieve this:
/*constexpr*/ bool is_little_endian() {
std::uint16_t x=0x0001;
auto p = reinterpret_cast<std::uint8_t*>(&x);
return *p != 0;
}
Explanation: the binary representation of x in memory could be 0000'0000'0000'0001 (big) or 0000'0001'0000'0000 (little endian). After reinterpret-casting the byte under p pointer could be respectively 0000'0000 or 0000'0001. If you use static-casting, it will always be 0000'0001, no matter what endianness is being used.
EDIT:
In the first version I made example function is_little_endian to be constexpr. It compiles fine on the newest gcc (8.3.0) but the standard says it is illegal. The clang compiler refuses to compile it (which is correct).
The meaning of reinterpret_cast is not defined by the C++ standard. Hence, in theory a reinterpret_cast could crash your program. In practice compilers try to do what you expect, which is to interpret the bits of what you are passing in as if they were the type you are casting to. If you know what the compilers you are going to use do with reinterpret_cast you can use it, but to say that it is portable would be lying.
For the case you describe, and pretty much any case where you might consider reinterpret_cast, you can use static_cast or some other alternative instead. Among other things the standard has this to say about what you can expect of static_cast (§5.2.9):
An rvalue of type “pointer to cv void” can be explicitly converted to a pointer to object type. A value of type pointer to object converted to “pointer to cv void” and back to the original pointer type will have its original value.
So for your use case, it seems fairly clear that the standardization committee intended for you to use static_cast.
One use of reinterpret_cast is if you want to apply bitwise operations to (IEEE 754) floats. One example of this was the Fast Inverse Square-Root trick:
https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code
It treats the binary representation of the float as an integer, shifts it right and subtracts it from a constant, thereby halving and negating the exponent. After converting back to a float, it's subjected to a Newton-Raphson iteration to make this approximation more exact:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the deuce?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
This was originally written in C, so uses C casts, but the analogous C++ cast is the reinterpret_cast.
Here is a variant of Avi Ginsburg's program which clearly illustrates the property of reinterpret_cast mentioned by Chris Luengo, flodin, and cmdLP: that the compiler treats the pointed-to memory location as if it were an object of the new type:
#include <iostream>
#include <string>
#include <iomanip>
using namespace std;
class A
{
public:
int i;
};
class B : public A
{
public:
virtual void f() {}
};
int main()
{
string s;
B b;
b.i = 0;
A* as = static_cast<A*>(&b);
A* ar = reinterpret_cast<A*>(&b);
B* c = reinterpret_cast<B*>(ar);
cout << "as->i = " << hex << setfill('0') << as->i << "\n";
cout << "ar->i = " << ar->i << "\n";
cout << "b.i = " << b.i << "\n";
cout << "c->i = " << c->i << "\n";
cout << "\n";
cout << "&(as->i) = " << &(as->i) << "\n";
cout << "&(ar->i) = " << &(ar->i) << "\n";
cout << "&(b.i) = " << &(b.i) << "\n";
cout << "&(c->i) = " << &(c->i) << "\n";
cout << "\n";
cout << "&b = " << &b << "\n";
cout << "as = " << as << "\n";
cout << "ar = " << ar << "\n";
cout << "c = " << c << "\n";
cout << "Press ENTER to exit.\n";
getline(cin,s);
}
Which results in output like this:
as->i = 0
ar->i = 50ee64
b.i = 0
c->i = 0
&(as->i) = 00EFF978
&(ar->i) = 00EFF974
&(b.i) = 00EFF978
&(c->i) = 00EFF978
&b = 00EFF974
as = 00EFF978
ar = 00EFF974
c = 00EFF974
Press ENTER to exit.
It can be seen that the B object is built in memory as B-specific data first, followed by the embedded A object. The static_cast correctly returns the address of the embedded A object, and the pointer created by static_cast correctly gives the value of the data field. The pointer generated by reinterpret_cast treats b's memory location as if it were a plain A object, and so when the pointer tries to get the data field it returns some B-specific data as if it were the contents of this field.
One use of reinterpret_cast is to convert a pointer to an unsigned integer (when pointers and unsigned integers are the same size):
int i;
unsigned int u = reinterpret_cast<unsigned int>(&i);
You could use reinterprete_cast to check inheritance at compile time.
Look here:
Using reinterpret_cast to check inheritance at compile time
template <class outType, class inType>
outType safe_cast(inType pointer)
{
void* temp = static_cast<void*>(pointer);
return static_cast<outType>(temp);
}
I tried to conclude and wrote a simple safe cast using templates.
Note that this solution doesn't guarantee to cast pointers on a functions.
First you have some data in a specific type like int here:
int x = 0x7fffffff://==nan in binary representation
Then you want to access the same variable as an other type like float:
You can decide between
float y = reinterpret_cast<float&>(x);
//this could only be used in cpp, looks like a function with template-parameters
or
float y = *(float*)&(x);
//this could be used in c and cpp
BRIEF: it means that the same memory is used as a different type. So you could convert binary representations of floats as int type like above to floats. 0x80000000 is -0 for example (the mantissa and exponent are null but the sign, the msb, is one. This also works for doubles and long doubles.
OPTIMIZE: I think reinterpret_cast would be optimized in many compilers, while the c-casting is made by pointerarithmetic (the value must be copied to the memory, cause pointers couldn't point to cpu- registers).
NOTE: In both cases you should save the casted value in a variable before cast! This macro could help:
#define asvar(x) ({decltype(x) __tmp__ = (x); __tmp__; })
Quick answer: use static_cast if it compiles, otherwise resort to reinterpret_cast.
Read the FAQ! Holding C++ data in C can be risky.
In C++, a pointer to an object can be converted to void * without any casts. But it's not true the other way round. You'd need a static_cast to get the original pointer back.