C++ strict-aliasing agnostic cast

I've read lots of Q&As about strict aliasing here on Stack Overflow, but they are all fairly general, and the discussion always tends to drift into deep details of the C++ standard, which are almost always difficult to understand properly, especially when the standard does not say things directly but describes them in a muddy, unclear way.
So my question is probably a duplicate of tons of Q&As here, but please just answer this specific question:
Is it a correct way to do a "nonalias_cast"?:
template<class OUT, class IN>
inline auto nonalias_cast(IN *data) {
    char *tmp = reinterpret_cast<char *>(data);
    return reinterpret_cast<OUT>(tmp);
}
float f = 3.14;
unsigned *u = nonalias_cast<unsigned *>(&f);
*u = 0x3f800000;
// now f should be equal 1.0
I guess the answer is no. But is there any nice workaround, except disabling the strict-aliasing compiler flag, of course? A union is not a handy option either, unless there is a way to fit the union hack inside the nonalias_cast function body. memcpy is not an option here either - the data change has to stay synchronised.
An impossible dream or an elusive reality?
UPD:
Okay, since we've got a negative answer to the "is it possible?" question, I'd like to ask an extra question that bothers me:
How would you solve this task? I mean, there are plenty of practical tasks that more or less demand a "play with the bits" approach. For instance, assume you have to write an IEEE-754 floating-point converter like this one. I'm more concerned with the practical side of the question: what workaround reaches the goal in the least painful way?

As the other answers have correctly pointed out: This is not possible as you are not allowed to access the float object through an unsigned pointer and there is no cast that will remove that rule.
So how do you work around this issue? Don't access the object through an unsigned pointer! Use a float* or char* for passing the object around, as those are the only pointer types that are allowed under strict aliasing. Then when you actually need to access the object under unsigned semantics, you do a memcpy from the float* to a local unsigned (and memcpy back once you are done). Your compiler will be smart enough to generate efficient code for this.
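For illustration, a minimal sketch of that memcpy approach (the helper names are mine, and it assumes float and a 32-bit unsigned type have the same size, as on typical IEEE-754 platforms):
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");

// Read the bits of a float as an unsigned 32-bit integer.
inline std::uint32_t read_bits(const float *p) {
    std::uint32_t u;
    std::memcpy(&u, p, sizeof u);   // well-defined: copies the object representation
    return u;
}

// Write an unsigned bit pattern back into the float object.
inline void write_bits(float *p, std::uint32_t u) {
    std::memcpy(p, &u, sizeof u);
}
With these, write_bits(&f, 0x3f800000u) leaves f equal to 1.0f, which is the effect the question is after, and the memcpy calls typically compile down to plain moves.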
Note that this means that you will have float* everywhere on your interfaces instead of unsigned*. And that is exactly what makes this work: The type system is aware of the correct data types at all times. Things only start to crumble if you try to smuggle a float through the type system as an unsigned*, which you'll hopefully agree is kind of a fishy idea in the first place.

Is it a correct way to do a "nonalias_cast"?
No.
But is there any nice workaround?
Again, no.
The reason for both is simply that &f is not the address of any object of type unsigned int, and no amount of casting of the pointer is going to change that.

No, your nonalias_cast does not work, and cannot work.
Type aliasing rules are not (directly) about converting pointers. In fact, none of your conversions have undefined behaviour. The rules are about accessing an object of certain type, through a pointer of another type.
No matter how you convert the pointer, the pointed object is still a float object, and accessing it through an unsigned pointer violates type aliasing rules.
An impossible dream or an elusive reality?
In standard C++, it is impossible.

Properly casting a `void*` to an integer in C++

I'm dealing with some code that uses an external library in which you can pass values to callbacks via a void* value.
Unfortunately, the previous person working on this code decided to just pass integers to these callbacks by casting an integer to a void pointer ((void*)val).
I'm now working on cleaning up this mess, and I'm trying to determine the "proper" way to cast an integer to/from a void*. Unfortunately, fixing the use of the void pointers is somewhat beyond the scope of the rework I'm able to do here.
Right now, I'm doing two casts to convert from/to a void pointer:
static_cast<int>(reinterpret_cast<intptr_t>(void_p))
and
reinterpret_cast<void *>(static_cast<intptr_t>(dat_val))
Since I'm on a 64 bit machine, casting directly ((int)void_p) results in the error:
error: cast from 'void*' to 'int' loses precision [-fpermissive]
The original implementation did work with -fpermissive, but I'm trying to get away from that for maintainability and bug-related reasons, so I'm trying to do this "properly", i.e. with C++ casts.
Casting directly to an int (static_cast<int>(void_p)) fails (error: invalid static_cast from type 'void*' to type 'int'). My understanding of reinterpret_cast is that it basically just causes the compiler to treat the address of the value in question as the cast-to data type without actually emitting any machine code, so casting an int directly to a void* would be a bad idea because the void* is larger than the int (8 and 4 bytes, respectively).
I think using intptr_t is the correct intermediate here, since it's guaranteed to be large enough to contain the integral value of the void*, and once I have an integer value I can then truncate it without causing the compiler to complain.
Is this the correct, or even a sane approach given I'm stuck having to push data through a void pointer?
I think using intptr_t is the correct intermediate here, since it's guaranteed to be large enough to contain the integral value of the void*, and once I have an integer value I can then truncate it without causing the compiler to complain.
Yes, for the reason you mentioned that's the proper intermediate type. By now, if your implementation doesn't offer it, you probably have more problems than just a missing typedef.
Is this the correct, or even a sane approach given I'm stuck having to push data through a void pointer?
Yes, given the constraints, it's quite sane.
You might consider checking, in debug mode, that the value fits instead of simply truncating it when unpacking it from the void*, or even making all further processing of that integer use intptr_t instead of int to avoid truncation.
You could also consider pushing a pointer to an actual int, instead of the int itself, through that parameter. Be aware that's less efficient though, and it opens you up to lifetime issues.
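For illustration, a rough sketch of that check-before-truncating idea, with hypothetical helper names (pack_int/unpack_int are not from the original code):
#include <cassert>
#include <cstdint>
#include <limits>

// Pack an int into a void* for the callback's user-data parameter.
inline void *pack_int(int value) {
    return reinterpret_cast<void *>(static_cast<std::intptr_t>(value));
}

// Unpack it again, checking in debug builds that truncation loses nothing.
inline int unpack_int(void *p) {
    std::intptr_t wide = reinterpret_cast<std::intptr_t>(p);
    assert(wide >= std::numeric_limits<int>::min() &&
           wide <= std::numeric_limits<int>::max());
    return static_cast<int>(wide);
}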
Based on your question, I am assuming that you call a function in some library, passing it a void*, and at some point later in time, it calls one of your functions, passing it that same void*.
There are basically two possible ways to do this; the first is through explicit casting, as you showed in your current code.
The other, which Deduplicator alluded to, is a little less efficient, but allows you to maintain control of the data, and possibly modify it between when you call the library function, and when it calls your callback function. This could be achieved with code similar to this:
void callbackFunction(void* dataPtr){
    int data = *static_cast<int*>(dataPtr);
    /* DO SOMETHING WITH data */
    delete static_cast<int*>(dataPtr);  // delete through the correct type, not through void*
}

void callLibraryFunction(int dataToPass){
    int* ptrToPass = new int(dataToPass);
    libraryFunction(ptrToPass, callbackFunction);
}
Which one you should use depends on what you need to do with the data, and whether the ability to modify the data could be useful in the future.
"Is this the correct, or even a sane approach given I'm stuck having to push data through a void pointer?"
Well, whether this is correct and sane is seriously debatable, especially if you are the author of the code taking the void* in the interface.
I think using intptr_t is the correct intermediate here, since it's guaranteed to be large enough to contain the integral value of the void*, and once I have an integer value I can then truncate it without causing the compiler to complain.
Yes, that's the right type to use with a reinterpret_cast<intptr_t>, but you'll need to be sure that an intptr_t-compatible value was actually passed in, and that any address involved is valid and doesn't go out of scope.
It's not so unusual to stumble over this problem when interacting with C APIs that offer callbacks which allow you to pass in user data; that data is handled transparently by the API and never touched, except at your entry points.¹
So it's left up to the client code to be sure about how that void* should be reinterpreted safely.
¹ A classical example of this kind of situation is the pthread_create() function.
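As an illustration, a small sketch of that pthread_create() pattern, smuggling an int through the void* user-data argument (the worker function and the value are of course made up; build with -pthread):
#include <pthread.h>
#include <cstdint>
#include <cstdio>

// Thread entry point: recover the int that was packed into the void*.
static void *worker(void *arg) {
    int id = static_cast<int>(reinterpret_cast<std::intptr_t>(arg));
    std::printf("worker got %d\n", id);
    return nullptr;
}

int main() {
    pthread_t t;
    void *arg = reinterpret_cast<void *>(static_cast<std::intptr_t>(42));
    pthread_create(&t, nullptr, worker, arg);
    pthread_join(t, nullptr);
}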
You have little choice but to use static_cast and reinterpret_cast here. Casting a pointer directly to an int will result in loss of precision, which is never ideal. Explicit casting is best avoided, because sooner or later what is being cast can change and there will be no compiler warnings then. But in this case you understandably have no choice. Or do you?
You can change the callback definitions on your side to be intptr_t or long int rather than void*, and it should then work and you will not have to do any type casts...

Strict aliasing rule and 'char *' pointers

The accepted answer to What is the strict aliasing rule? mentions that you can use char * to alias another type but not the other way.
It doesn't make sense to me — if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?
if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?
It does, but that's not the point.
The point is that if you have one or more struct somethings then you may use a char* to read their constituent bytes, but if you have one or more chars then you may not use a struct something* to read them.
The wording in the referenced answer is slightly erroneous, so let’s get that ironed out first:
One object never aliases another object, but two pointers can “alias” the same object (meaning, the pointers point to the same memory location — as M.M. pointed out, this is still not 100% correct wording but you get the idea). Also, the standard itself doesn’t (to the best of my knowledge) actually talk about strict aliasing at all, but merely lays out rules that govern through which kinds of expressions an object may be accessed or not. Compiler flags like -fno-strict-aliasing tell the compiler whether it can assume the programmer followed those rules (so it can perform optimizations based on that assumption) or not.
Now to your question: Any object can be accessed through a pointer to char, but a char object (especially a char array) may not be accessed through most other pointer types.
Based on that, the compiler is required to make the following assumptions:
If the type of the actual object itself is not known, both char* and T* pointers may always point to the same object (alias each other) — symmetric relationship.
If types T1 and T2 are not “related” and neither is char, then T1* and T2* may never point to the same object — symmetric relationship.
A char* pointer may point to a char object or an object of any type T.
A T* pointer may not point to a char object — asymmetric relationship.
I believe the main rationale behind the asymmetric rules about accessing objects through pointers is that a char array might not satisfy the alignment requirements of, e.g., an int.
So, even without compiler optimizations based on the strict aliasing rule, writing an int to the location of a 4-byte char array at addresses 0x1, 0x2, 0x3, 0x4, for instance, will — in the best case — result in poor performance and — in the worst case — access a different memory location, because the CPU instructions might ignore the lowest two address bits when writing a 4-byte value (so here this might result in a write to 0x0, 0x1, 0x2, and 0x3).
Please also be aware that the meaning of “related” differs between C and C++, but that is not relevant to your question.
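To make that asymmetry concrete, here is a small sketch (the Pixel struct is just an example, not from the question):
#include <cstdio>

struct Pixel { unsigned char r, g, b, a; };

int main() {
    Pixel p{1, 2, 3, 4};

    // Fine: any object's bytes may be inspected through a char/unsigned char pointer.
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&p);
    std::printf("first byte: %d\n", bytes[0]);

    // Not fine: a plain char array may not be accessed as a Pixel.
    char buffer[sizeof(Pixel)];
    Pixel *q = reinterpret_cast<Pixel *>(buffer);
    // reading *q here would access char objects through a Pixel lvalue - not allowed
    (void)q;
}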
if we have two pointers, one of type char * and another of type struct something * pointing to the same location, how is it possible that the first aliases the second but the second doesn't alias the first?
Pointers don't alias each other; that's sloppy use of language. Aliasing is when an lvalue is used to access an object of a different type. (Dereferencing a pointer gives an lvalue).
In your example, what's important is the type of the object being aliased. For a concrete example let's say that the object is a double. Accessing the double by dereferencing a char * pointing at the double is fine because the strict aliasing rule permits this. However, accessing a double by dereferencing a struct something * is not permitted (unless, arguably, the struct starts with double!).
If the compiler is looking at a function which takes a char * and a struct something *, and it does not have information available about the object being pointed to (this is actually unlikely, as aliasing passes are done at a whole-program optimization stage), then it would have to allow for the possibility that both pointers refer to the same struct something, so no aliasing-based optimization could be done inside this function.
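A short sketch of that optimization point (struct and function names are mine): because a char* may alias anything, the compiler must assume the store below can change s->value.
struct something { int value; };

int touch(something *s, char *bytes) {
    int before = s->value;
    bytes[0] = 0;               // might modify *s, since a char* may alias it
    return before + s->value;   // s->value has to be re-read from memory
}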
Many aspects of the C++ Standard are derived from the C Standard, which needs to be understood in the historical context when it was written. If the C Standard were being written to describe a new language which included type-based aliasing, rather than describing an existing language which was designed around the idea that accesses to lvalues were accesses to bit patterns stored in memory, there would be no reason to give any kind of privileged status to the type used for storing characters in a string. Having explicit operations to treat regions of storage as bit patterns would allow optimizations to be simultaneously more effective and safer. Had the C Standard been written in such fashion, the C++ Standard presumably would have been likewise.
As it is, however, the Standard was written to describe a language in which a very common idiom was to copy the values of objects by copying all of the bytes thereof, and the authors of the Standard wanted to allow such constructs to be usable within portable programs.
Further, the authors of the Standard intended that implementations process many non-portable constructs "in a documented manner characteristic of the environment" in cases where doing so would be useful, but waived jurisdiction over when that should happen, since compiler writers were expected to understand their customers' and prospective customers' needs far better than the Committee ever could.
Suppose that in one compilation unit, one has the function:
void copy_thing(char *dest, char *src, int size)
{
    while(size--)
        *(char volatile *)(dest++) = *(char volatile *)(src++);
}
and in another compilation unit:
float f1,f2;
float test(void)
{
    f1 = 1.0f;
    f2 = 2.0f;
    copy_thing((char*)&f2, (char*)&f1, sizeof f1);
    return f2;
}
I think there would have been a consensus among Committee members that no quality implementation should treat the fact that copy_thing never writes to an object of type float as an invitation to assume that the return value will always be 2.0f. There are many things about the above code that should prevent or discourage an implementation from consolidating the read of f2 with the preceding write, with or without a special rule regarding character types, but different implementations would have different reasons for their forbearance.
It would be difficult to describe a set of rules which would require that all implementations process the above code correctly without blocking some existing or plausible implementations from implementing what would otherwise be useful optimizations. An implementation that treated all inter-module calls as opaque would handle such code correctly even if it was oblivious to the fact that a cast from T1 to T2 is a sign that an access to a T2 may affect a T1, or the fact that a volatile access might affect other objects in ways a compiler shouldn't expect to understand. An implementation that performed cross-module inlining and was oblivious to the implications of typecasts or volatile would process such code correctly if it refrained from making any aliasing assumptions about accesses via character pointers.
The Committee wanted to recognize something in the above construct that compilers would be required to treat as implying that f2 might be modified, since the alternative would be to view such a construct as Undefined Behavior despite the fact that it should be usable within portable programs. That they chose access via a character pointer as the aspect that forces the issue was never intended to imply that compilers should be oblivious to everything else, even though unfortunately some compiler writers interpret the Standard as an invitation to do just that.

Accessing buffer values via 2 differently typed pointers

I have two questions, a general one about pointer type-manipulation in general, and then one for a specific case I have.
What happens when you access a buffer of memory using pointers of different types?
In practice on many different compilers, it seems to work out as my brain would like to envision it. However, I sort-of know it's UB in many (if not all cases). For example:
typedef unsigned char byte;
struct color { /* stuff */};
std::vector<color> colors( 512 * 512 );
// pointer of one type
color* colordata = colors.data();
// pointer to another type?
byte* bytes = reinterpret_cast<byte*>( colordata );
// Proceed to read from (potentially write into)
// the "bytes" of the 512 * 512 heap array
The first question would be: Is there any point where doing this kind of conversion is legal/safe/standard-sanctioned?
The second question: spinning off the first, if you knew that the struct named color was defined as:
struct color { byte c[4]; };
Now, is it legal/safe/standard-sanctioned? Read-safe? Read/write safe? I'd like to know, as my intuition tells me that for these very simple structs, the above naughty pointer manipulation isn't that bad, or is it?
[ Reopen Reasons: ]
While the linked question about strict aliasing applies somewhat here, it is mostly about C. The one answer referencing the C++03 standard may be outdated when compared to the C++11 standard (unless absolutely nothing has changed). This question has a practical application and I and others would benefit from more answers. Finally, this question is very specific in asking whether it is not only read-safe, write-safe, or both (or neither, and in two different scenarios (PoD data where the underlying types match and a more general case of arbitrary internal data).
Both are legal.
Firstly, since byte is a typedef for unsigned char, it has a magical get-out-of-jail-free card when it comes to strict aliasing. You can alias any type as char or one of its signed or unsigned variants.
Secondly, it is entirely legal in both C and C++ to cast a pointer to a struct to a pointer to the type of its first member, as long as the struct meets certain guarantees, such as being POD. This means that
struct x {
    int f;
};

int main() {
    x var;
    int* p = (int*)&var;
}
does not violate strict aliasing either, even without the get-out clause used for char.
As has been stated in the comments: Accessing the same piece of memory as two different types is UB. So, that's the formal answer (note that "UB" does include "doing precisely what you would expect if you are a sane person reading the code" as well as "just about anything other than what a sane person reading the code would expect")
Having said that, it appears that all popular compilers tend to cope with this fairly well. It is not unusual to see these sort of constructs (in "good" production code - even if the code isn't strictly language-lawyer correct). However, you are at the mercy of the compiler "doing the right thing", and it's definitely a case where you may find compiler bugs if you stress things too harshly.
There are several reasons that the standard defines this as UB - the main one being that "different types of data may be stored in different memory" and "it can be hard for the compiler to figure out what is safe when someone is mucking about casting pointers to the same data with different types" - e.g. if we have a pointer to a 32-bit integer and another pointer to char, both pointing to the same address, when is it safe to read the integer value after the char value has been written. By defining it as UB, it's entirely up to the compiler vendor to decide how precisely they want to treat these conditions. If it was "defined" that this will work, compilers may not be viable for certain processor types (or code would become horribly slow due to the effect of the liberal sprinkling of "make sure partial memory writes have completed before I read" operations, even when those are generally not needed).
So, in summary: It will most likely work on most processors, but don't expect any language lawyer to approve of your code.

Few doubts about casting operators in C++

The reinterpret_cast, as we know, can cast any pointer type to any other pointer type. The questions I want to ask regarding this cast operator are:
How does reinterpret_cast work? What is the magic (the internal implementation) that allows reinterpret_cast to work?
How to ensure safety when using reinterpret_cast? As far as I know, it doesn't guarantee safe casting, so what precautions should be taken while using reinterpret_cast?
What is the practical usage of this operator? I have not really encountered a situation in my professional programming experience where I couldn't get by without using this operator. Any practical examples apart from the usual int* to char* will be highly helpful and appreciated.
One other Question regarding casting operators in general:
Casting operators (static_cast, dynamic_cast, const_cast, reinterpret_cast) are, to the best of my understanding, all called operators. So is it correct to say that casting operators cannot be overloaded, unlike most other operators? (I am aware that not all operators can be overloaded, and I know which ones can't be, apart from the ones I am asking about, so please refrain from flaming me on that.) Since they are operators, what does the standard say about them?
There is no magic. reinterpret_cast normally just means (at least try to) treat what you find at this address as if it was the type I've specified. The standard defines little enough about what it does that it could be different from that, but it rarely (if ever) really is.
In a few cases, you can get safety from something like a discriminated union. For example, if you're reading network packets, and read enough to see that what you've received is a TCP packet, then you can (fairly) safely do a reinterpret_cast from IPHdr to TCPHdr (or whatever names you happen to have used). The compiler won't (again, normally) do much though -- any safety is up to you to implement and enforce.
I've used code like I describe in 2), dealing with different types of network packets.
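As a rough illustration of that packet idea (the header layouts and field names here are made up, and the formal aliasing caveats discussed elsewhere on this page still apply):
#include <cstdint>

// Simplified, hypothetical header layouts; real code would use the platform's headers.
struct IPHdr  { std::uint8_t version_ihl, tos; std::uint16_t length;
                std::uint8_t ttl, protocol; /* ... */ };
struct TCPHdr { std::uint16_t src_port, dst_port; /* ... */ };

// Treat the payload as a TCP header only after checking the discriminating field.
const TCPHdr *as_tcp(const IPHdr *ip, const std::uint8_t *payload) {
    if (ip->protocol != 6)      // 6 == TCP in the IP protocol field
        return nullptr;
    return reinterpret_cast<const TCPHdr *>(payload);
}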
For your final question: you can overload casting for a class:
class XXX {
public:
    operator YYY() { return whatever; }
};
This can be used for conversions in general though -- whether done by a static_cast, a C-style cast, or even an implicit conversion. C++0x allows you to add an explicit qualifier so it won't be used for implicit conversions, but there's still no way to differentiate between a static_cast and a C-style cast.
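For example, a minimal sketch of such a conversion operator with the explicit qualifier mentioned above (the Celsius type is made up):
struct Celsius {
    double value;
    explicit operator double() const { return value; }  // C++11: not used for implicit conversions
};

int main() {
    Celsius c{20.0};
    // double bad = c;                    // error: the conversion is explicit
    double ok = static_cast<double>(c);   // fine: explicitly requested
    (void)ok;
}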
First, it's unclear what you mean by "non-standard pointer". I think your premise is flawed. Happily it doesn't seem to affect the questions.
"How does [it] work?" Well, the intent, as you can guess from the name, is to just change the interpretation of a bitpattern, perhaps extending or shorting as appropriate. This is a kind of change of type where the bitpattern is left unchanged but the interpretation and hence conceptual value is changed. And it's in contrast to a kind of change of type where the conceptual value is kept (e.g. int converted to double) while the bitpattern is changed as necessary to keep the conceptual value. But most cases of reinterpret_cast have implementation defined effect, so for those cases your compiler can do whatever it wants -- not necessarily keeping the bitpattern -- as long as it is documented.
"How to ensure safety" That is about knowing what your compiler does, and about avoiding reinterpret_cast. :-)
"What is the practical usage". Mostly it is about recovering type information that's been lost in C-oriented code where void* pointers are used to sort of emulate polymorphism.
reinterpret_cast generally lets you do some very bad things. In the case of casting a pointer, it will permit casting from one type to another even when there is absolutely no reason to assume this should work. It's like saying "trust me, I really want to do this". What exactly this does is unpredictable from one system to the next. On your system it might just copy the bit patterns, whereas on another one it could transform them in some (potentially useful) way.
e.g.
class Foo {
    int a;
};

class Bar {
    int a;
};

int main() {
    Foo a;
    // No inheritance relationship and not void*, so it must be reinterpret_cast
    // if you really want to do it
    Bar *b = reinterpret_cast<Bar*>(&a);

    char buffer[sizeof(Bar)];
    Bar *c = reinterpret_cast<Bar*>(buffer); // alignment?
}
The compiler will quite happily let you do that, no matter what the scenario. Sometimes, if you're doing low-level manipulation of things, this might actually be what you want to do. (Imagine casting a char* buffer to some user-defined type.)
Potential pitfalls are huge, even in the simplest case like a buffer, where alignment may well be a problem.
With dlsym() on Linux it's useful to be able to cast void* to a function pointer, which is otherwise undefined behaviour in C++. (Some systems might use separate address spaces or different size pointers!). This can only be done with reinterpret_cast in C++.
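A small sketch of that dlsym() case, looking up cos in the math library (the library name is platform-specific, and you link with -ldl):
#include <dlfcn.h>
#include <cstdio>

int main() {
    void *handle = dlopen("libm.so.6", RTLD_LAZY);   // library name varies by platform
    if (!handle) return 1;

    using cos_fn = double (*)(double);
    // dlsym returns void*, so a reinterpret_cast is needed to obtain a function pointer.
    cos_fn my_cos = reinterpret_cast<cos_fn>(dlsym(handle, "cos"));
    if (my_cos) std::printf("cos(0) = %f\n", my_cos(0.0));

    dlclose(handle);
}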
reinterpret_cast only works on pointers. The way it works is that it leaves the value of the pointer alone and changes the assumed type information about it. It says, "I know these types are not equivalent, but I want you to just pretend this is now a pointer to T2." Of course, this can cause any number of problems if you use the T2 pointer and it does not point to a T2.
There are very few guarantees about reinterpret_cast, which is why it is to be so avoided. You're really only allowed to cast from T1 to T2 and then back to T1 and know that, given some assumptions, the final result will be the same as what you started with.
The only one I can think of is casting a char* to an unsigned char*. I know that the underlying representation is the same in my implementation so I know the cast is safe. I can't use a static cast though because it's a pointer to a buffer. In reality, you'll find very little legitimate use of reinterpret_cast in the real world.
Yes, they are operators. AFAIK you can't override them.
One "practical" use of reinterpret_cast.
I have a class where the members are not meant to be read. Example below
class ClassWithHiddenVariables
{
private:
    int a;
    double m;
public:
    void SetVariables(int s, double d)  // d is a double, so the 20.02 below is not truncated
    {
        a = s;
        m = d;
    }
};
This class is used in a thousand places in an application without a problem.
Now, for some reason, I want to see the members in one specific part of the code. However, I don't want to touch the existing class. So I break the rules as follows.
Create another class with the same bit pattern but with public visibility. Here the original class contains an int and a double.
class ExposeAnotherClass
{
public:
    int a_exposed;
    double m_exposed;
};
When you want to see members of the ClassWithHiddenVariables object, use reinterpret_cast to cast to ExposeAnotherClass. Example follows
ClassWithHiddenVariables obj;
obj.SetVariables(10, 20.02);

ExposeAnotherClass *ptrExposedClass;
ptrExposedClass = reinterpret_cast<ExposeAnotherClass*>(&obj);
cout << ptrExposedClass->a_exposed << "\n" << ptrExposedClass->m_exposed;
I don't think this situation ever occurs in the real world. But this is just an illustration of how reinterpret_cast treats objects as bit patterns.
reinterpret_cast tells the compiler "shut up, it's a variable of type T*" and there's no safety unless it is really a variable of type T*. On most implementations just nothing is done - the same value in the variable is passed to the destination.
Your class can have conversion operators to any type T*, and those conversions will either be invoked implicitly under certain conditions or you can invoke them explicitly using static_cast.
I've used reinterpret_cast a lot in Windows programming. Message handling uses WPARAM and LPARAM parameters that need casting to the correct types.
reinterpret_cast is pretty much equivalent to a C-style cast. It doesn't guarantee anything; it's there to allow you to do what you need to, in the hope that you know what you're doing.
If you're looking to ensure safety, use dynamic_cast, as that's what it does. If a pointer cast cannot be completed safely, dynamic_cast returns NULL or nullptr (C++0x).
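For instance, a quick sketch of that behaviour (the Base/Derived/Unrelated hierarchy is just an example):
struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct Unrelated : Base {};

int main() {
    Base *b = new Derived;
    Derived   *d = dynamic_cast<Derived *>(b);    // succeeds: b really points at a Derived
    Unrelated *u = dynamic_cast<Unrelated *>(b);  // fails: yields a null pointer
    bool ok = (d != nullptr) && (u == nullptr);
    delete b;
    return ok ? 0 : 1;
}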
Casting using the "casting operators" such as static_cast, dynamic_cast, etc. cannot be overloaded. Plain conversions can, such as:
class Degrees
{
public:
    operator double() { ... }
};
The reinterpret_cast as we know can cast any non-standard pointer to another non-standard pointer.
Almost, but not exactly. For example, you can't use reinterpret_cast to cast a const int* to an int*. For that, you need const_cast.
How does reinterpret_cast work? What is the magic (the internal implementation) that allows reinterpret_cast to work?
There's no magic at all. Ultimately, all data is just bytes. The C++ type system is merely an abstraction layer which tells the compiler how to "interpret" each byte. A reinterpret_cast is similar to a plain C-cast, in that it simply says "to hell with the type system: interpret these bytes as type X instead of type Y!"
How to ensure safety when using reinterpret_cast? As far as I know, it doesn't guarantee safe casting, so what precautions should be taken while using reinterpret_cast?
Well, reinterpret_cast is inherently dangerous. You shouldn't use it unless you really know what you're doing. Try to use static_cast instead. The C++ type system will protect you from doing anything too dangerous if you use static_cast.
What is the practical usage of this operator? I have not really encountered a situation in my professional programming experience where I couldn't get by without using this operator. Any practical examples apart from the usual int* to char* will be highly helpful and appreciated.
It has many uses, but usually these uses are somewhat "advanced". For example, if you are creating a memory pool of linked blocks, and storing pointers to free blocks on the blocks themselves, you'll need to reinterpret_cast a block from a T* to a T** to interpret the block as a pointer to the next block, rather than a block itself.
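A minimal sketch of that free-list trick (simplified: it assumes blocks are at least pointer-sized and suitably aligned, and it ignores how fresh blocks are obtained):
// A freed block's own storage is reused to hold the pointer to the next free block.
template <class T>
class FreeList {
    static_assert(sizeof(T) >= sizeof(T *), "a block must be able to hold a pointer");
    T *head_ = nullptr;   // first free block, or nullptr when empty
public:
    void put(T *block) {
        *reinterpret_cast<T **>(block) = head_;   // treat the block as a "next" slot
        head_ = block;
    }
    T *get() {
        if (!head_) return nullptr;               // a real pool would allocate here
        T *block = head_;
        head_ = *reinterpret_cast<T **>(block);   // read the "next" pointer back out
        return block;
    }
};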