Do function pointer addresses hold after conversions? - c++

From what I understand, casting function pointers to different types is allowed by the C++ standard (as long as one never invokes them):
int my_func(int v) { return v; }
int main() {
using from_type = int(int);
using to_type = void(void);
from_type *from = &my_func;
to_type *to = reinterpret_cast<to_type *>(from);
// ...
}
Moreover, there is no undefined behavior if I cast the pointer back to its original type and invoke it.
So far, so good. What about the following, then?
const bool eq = (to == reinterpret_cast<to_type *>(my_func));
Does the address hold, too, after the conversion, or is this not guaranteed by the standard?
While this is irrelevant to the question, a possible scenario is when one goes hard on type erasure. If the address holds, something can be done without having to know the original function type.

From [expr.reinterpret.cast].6 (emphasis mine):
A function pointer can be explicitly converted to a function pointer of a different type.
[...]
Except that converting a prvalue of type “pointer to T1” to the type
“pointer to T2” (where T1 and T2 are function types) and back to its
original type yields the original pointer value, the result of such a
pointer conversion is unspecified.
So, the standard explicitly allows casting function pointers to different FP types and then back. This is an exception to the general rule that reinterpret_casting function pointers is unspecified.
In my understanding, that means to == reinterpret_cast<to_type *>(my_func) need not necessarily be true.
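A minimal sketch of what the quoted paragraph does guarantee, under the assumption that the original function type is still known at the point of comparison. The names erased_fn, erase, restore and refers_to are hypothetical and not from the question. The comparison is done only after casting back to the original type, so it relies on the round trip rather than on the unspecified intermediate value.

```cpp
#include <cassert>

int my_func(int v) { return v; }
int other_func(int v) { return v + 1; }

// Hypothetical "erased" function pointer type used only for storage.
using erased_fn = void (*)();

// Convert to the erased type. The resulting value is unspecified,
// apart from the round-trip guarantee.
template <typename Fn>
erased_fn erase(Fn *fn) {
    return reinterpret_cast<erased_fn>(fn);
}

// Convert back to the original type. This yields the original pointer
// value, provided Fn is the type the pointer was erased from.
template <typename Fn>
Fn *restore(erased_fn fn) {
    return reinterpret_cast<Fn *>(fn);
}

// Compare in the original type instead of comparing erased values.
template <typename Fn>
bool refers_to(erased_fn stored, Fn *candidate) {
    return restore<Fn>(stored) == candidate;
}
```

Comparing two erased values directly, as in the question, probably works on mainstream implementations, but this sketch does not depend on it.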

Related

Does C++ guarantee consistent pointer representations?

Does C++ guarantee consistent pointer representations when a pointer is cast to other pointer types?
For example, does C++ guarantee anything about the following program?
#include <stdint.h>
struct Foo {};
struct Bar : Foo {};
int main() {
Bar obj;
Foo * a = &obj;
Bar * b = &obj;
void *c = &obj;
#if MAYBE_UB
int * d = reinterpret_cast<int *>(&obj);
#endif
auto aa = reinterpret_cast<uintptr_t>(a);
auto bb = reinterpret_cast<uintptr_t>(b);
auto cc = reinterpret_cast<uintptr_t>(c);
#if MAYBE_UB
auto dd = reinterpret_cast<uintptr_t>(d); // UB? Not reading the pointee...
#endif
if (aa != bb) printf("bb differs\n");
if (aa != cc) printf("cc differs\n");
#if MAYBE_UB
if (aa != dd) printf("dd differs\n");
#endif
return 0;
}
Does C++ guarantee consistent pointer representations?
No. An implementation may represent pointers however it likes, and different pointer types need not share a representation. The language is deliberately abstract and says as little as possible about how things are represented; that is left to the implementation.
Does C++ guarantee consistent pointer representations when a pointer is cast to other pointer types?
No, pointers can be represented in any way they want. Using rocks and sticks, for example.
does C++ guarantee anything about the following program?
TL;DR Well, yes: that the code should compile (ignoring the missing #include at the top). But there is no guarantee about the output.
Anyway, I don't think you are really interested in the representation of pointers, but in whether the value of a pointer changes when it is converted to a different type. That has nothing to do with how the value is "represented"; the representation can change however the compiler wants. That said, we do know that char * and void * have the same representation, from https://eel.is/c++draft/basic.compound#5 :
A pointer to cv void can be used to point to objects of unknown type.
Such a pointer shall be able to hold any object pointer.
An object of type “pointer to cv void” shall have the same representation and alignment requirements as an object of type “pointer to cv char”.
For conversions to void *, we know from https://eel.is/c++draft/expr#conv.ptr :
A prvalue of type “pointer to cv T”, where T is an object type, can be converted to a prvalue of type “pointer to cv void”.
The pointer value is unchanged by this conversion.
And this also is relevant:
A prvalue of type “pointer to cv D”, where D is a complete class type, can be converted to a prvalue of type “pointer to cv B”, where B is a base class ([class.derived]) of D.
If B is an inaccessible ([class.access]) or ambiguous ([class.member.lookup]) base class of D, a program that necessitates this conversion is ill-formed.
The result of the conversion is a pointer to the base class subobject of the derived class object.
But there is nothing about the value of the result. It can be the same, it can be different, it's not specified.
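As an illustration only, here is a small sketch with multiple inheritance, where a derived-to-base conversion often has to adjust the address. The types First, Second and Derived and both helper functions are hypothetical. Whether the addresses actually differ is up to the implementation, so the first function just reports what it observes; only the derived-to-base-to-derived round trip is something the language pins down.

```cpp
#include <cassert>

struct First  { int a; };
struct Second { int b; };
struct Derived : First, Second { int c; };

// Reports whether the Second base subobject starts at the same address
// as the complete object. The answer is implementation-specific.
bool second_base_shares_address(Derived &d) {
    Second *base = &d;  // implicit derived-to-base conversion
    return static_cast<void *>(base) == static_cast<void *>(&d);
}

// Converting to the base and back with static_cast yields a pointer
// to the original Derived object.
bool base_round_trip(Derived &d) {
    Second *base = &d;
    return static_cast<Derived *>(base) == &d;
}
```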
A further problem is that your code uses uintptr_t. cstdint is defined to match C's stdint.h, and all that C says about this type is (C99 7.18.1.4):
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
We know nothing about the value of (uintptr_t)(Bar*)&obj. We also do not know whether two uintptr_t values compare equal, even when they were created from two equal void pointers; we only know that the pointers compare equal once you convert them back. So we fall back to the general rule for reinterpret_cast, https://eel.is/c++draft/expr#reinterpret.cast-4 :
A pointer can be explicitly converted to any integral type large enough to hold all values of its type. The mapping function is implementation-defined.
[Note 2: It is intended to be unsurprising to those who know the addressing structure of the underlying machine.
— end note]
The "mapping function" can be anything. It can produce inconsistent values that depend on anything - it's implementation defined. So, basically, we know nothing about the resulting uintptr_t values, except there should be some values there.
Anyway, from the above, I conclude that all uintptr_t conversions in your code result in an implementation defined value and can have any value the implementation wants them to have. The program has defined, but implementation-defined behavior - any of those ifs can be true or false, depending on the "pointers to integers mapping function" the compiler uses.
However, we know that any sane compiler will have a sane "mapping function" from pointers to integral types, and that this mapping function will be "unsurprising to those who know the addressing structure of the underlying machine" and will produce consistent values. You might also be interested in pointer provenance.
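A short sketch of the two checks that do not depend on the integer mapping, assuming the implementation provides uintptr_t at all (it is an optional type). The function names are hypothetical. The first uses only the void * round trip quoted from C99; the second compares the pointers themselves instead of their integer images.

```cpp
#include <cassert>
#include <cstdint>

struct Foo {};
struct Bar : Foo {};

// void* -> uintptr_t -> void* compares equal to the original pointer.
// Nothing is assumed about the integer value in between.
bool survives_integer_round_trip(void *p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void *>(bits) == p;
}

// Compare as pointers rather than as integers. The conversion to
// void* leaves the pointer value unchanged.
bool same_address(const void *a, const void *b) {
    return a == b;
}
```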
As for MAYBE_UB, there is nothing undefined in it, we know that https://eel.is/c++draft/expr#reinterpret.cast-7 :
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).
But we know from https://eel.is/c++draft/expr#static.cast-13 :
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified.
Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b.
Otherwise, the pointer value is unchanged by the conversion.
The resulting pointer value can be unspecified or unchanged, depending on whether the address of obj satisfies the alignment requirement of int, but the behavior of the code is still defined.

Is conversion between vectors defined behavior?

In order to serialize components in my game, I need to be able to access the data in various vectors only given a pointer and a size for the vector.
I want to get the data() pointer from a vector if I have only a void * pointing to the vector. I am attempting to convert from std::vector<T> to std::vector<char> to get the data() pointer. I want to know if the following code is defined behavior and not going to act any different in different situations.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> ints = { 0, 1, 2, 3, 4 };
std::vector<char>* memory = reinterpret_cast<std::vector<char>*>(&ints);
int *intArray = reinterpret_cast<int *>(memory->data());
std::cout << intArray[0] << intArray[1] << intArray[2] << intArray[3] << intArray[4] << std::endl; //01234 Works on gcc and vc++
std::getchar();
}
This seems to work in this isolated case, but I don't know if it will give errors or undefined behavior inside the serialization code.
This is an aliasing violation:
std::vector<char>* memory = reinterpret_cast<std::vector<char>*>(&ints);
int *intArray = reinterpret_cast<int *>(memory->data());
Per [basic.life], accessing memory->data() here has undefined behavior.
The way to get around this is to call ints.data() to obtain an int* pointer to the underlying contiguous array. Afterwards, you are allowed to cast it to void*, char*, or unsigned char* (or std::byte* in C++17).
From there you can cast back to int* to access the elements again.
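A minimal sketch of that approach for the serialization use case, assuming the element type is trivially copyable. RawSpan, raw_view, serialize and deserialize are hypothetical names, not part of any library. The byte view is produced from the typed vector's own data() pointer, so std::vector<T> is never reinterpreted as std::vector<char>; the copies go through std::memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical type-erased view: a byte pointer plus a size in bytes.
struct RawSpan {
    const unsigned char *data;
    std::size_t size_bytes;
};

// Build the view while the element type is still known.
template <typename T>
RawSpan raw_view(const std::vector<T> &v) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw bytes are only meaningful for trivially copyable types");
    return RawSpan{reinterpret_cast<const unsigned char *>(v.data()),
                   v.size() * sizeof(T)};
}

// Copy the viewed bytes out without knowing the element type.
std::vector<unsigned char> serialize(RawSpan span) {
    std::vector<unsigned char> out(span.size_bytes);
    if (span.size_bytes != 0) std::memcpy(out.data(), span.data, span.size_bytes);
    return out;
}

// Rebuild a typed vector from the bytes.
template <typename T>
std::vector<T> deserialize(const std::vector<unsigned char> &bytes) {
    std::vector<T> out(bytes.size() / sizeof(T));
    if (!out.empty()) std::memcpy(out.data(), bytes.data(), out.size() * sizeof(T));
    return out;
}
```

The view must be created at a point where T is known, for example when the component vector is registered with the serializer.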
I don't think that it is UB.
With reinterpret_cast<std::vector<char>*>(&ints), you are casting a pointer to a vector<int> object to a pointer to another vector object of a different (and actually incompatible) type. Yet you do not dereference the resulting pointer, and - as both vector objects will very likely have the same alignment requirements - the cast will be OK (cf., for example, the online C++ draft quoted below). Note that a vector does not store its elements "in place" but holds a pointer to the values.
5.2.10 Reinterpret cast
(7) An object pointer can be explicitly converted to an object pointer of
a different type. When a prvalue v of type “pointer to T1” is
converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout
types ([basic.types]) and the alignment requirements of T2 are no
stricter than those of T1, or if either type is void. Converting a
prvalue of type “pointer to T1” to the type “pointer to T2” (where T1
and T2 are object types and where the alignment requirements of T2 are
no stricter than those of T1) and back to its original type yields the
original pointer value. The result of any other such pointer
conversion is unspecified.
So casting a pointer to the vector object back and forth should work in a defined manner here.
Second, you cast a pointer that originally points (and is aliased to) int "back" to its original type int. So aliasing is obviously not violated.
I don't see any UB here (unless a vector<char> object had stricter alignment requirements than a vector<int> object, which is very likely not the case).

Are different function pointers compatible with each other?

Today I learned that function pointers and data pointers are not the same and are therefore not compatible with each other (Why are function pointers and data pointers incompatible in C/C++?). My question, however, is: are different (non-member) function pointers compatible with each other (i.e. are they implemented the same way)?
In code:
typedef void(*FuncPtr0)();
typedef void(*FuncPtr1)(int);
FuncPtr0 p0;
FuncPtr1 p1;
p0 = reinterpret_cast<FuncPtr0>(p1); // will this always work, if p1 really
p0(); // points to a function of type FuncPtr0
Thanks for your help!
n3376 5.2.10/6
A function pointer can be explicitly converted to a function pointer of a different type. The effect of calling
a function through a pointer to a function type (8.3.5) that is not the same as the type used in the definition
of the function is undefined. Except that converting a prvalue of type “pointer to T1” to the type “pointer to
T2” (where T1 and T2 are function types) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.
No, they are not compatible. Calling a function through a pointer of the wrong type invokes undefined behavior, and the value produced by the cast itself is unspecified.
In fact you can cast them to each other, but you shouldn't call a function pointer which points to a non-compatible function signature.
For example, see this code :
#include <iostream>
typedef void(*FuncPtr0)();
void p1f() { std::cout << "ONE"; }
void p2f(int x) { std::cout << "TWO " << x ; }
int main()
{
FuncPtr0 p0 = reinterpret_cast<FuncPtr0>(p2f);
p0();
}
Output
TWO 1
The question is: who set the argument x to 1? The program may run, but the behavior is undefined, so any output is possible. On my system the result is something else (garbage): TWO 39.
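For contrast, here is a sketch of the case the question asks about, where the pointer really refers to a function of type void() and is converted back before the call. The counter calls and the function names are hypothetical. This relies only on the round-trip guarantee in 5.2.10/6 quoted above.

```cpp
#include <cassert>

typedef void (*FuncPtr0)();
typedef void (*FuncPtr1)(int);

int calls = 0;                // hypothetical counter, for observation only
void no_args() { ++calls; }   // a function whose real type is void()

// Store the pointer under an unrelated function pointer type,
// then restore the original type before calling.
void call_via_detour() {
    FuncPtr1 p1 = reinterpret_cast<FuncPtr1>(&no_args);  // must not be called as FuncPtr1
    FuncPtr0 p0 = reinterpret_cast<FuncPtr0>(p1);        // back to the original type
    p0();                                                // calls no_args with its real type
}
```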

reinterpret_cast an iterator to a pointer

I've got an iterator of Things. If I want to convert the current item to a pointer to the item, why does this work:
thing_pointer = &(*it);
But this not:
thing_pointer = reinterpret_cast<Thing*>(it);
This is the compiler error I'm trying to comprehend: http://msdn.microsoft.com/en-us/library/sy5tsf8z(v=vs.90).aspx
Just in case, the type of the iterator is std::_Vector_iterator<std::_Vector_val<Thing,std::allocator<Thing> > >
In
&(*it);
the * is overloaded to do what you logically mean: convert the iterator type to its pointed-to object. You can then safely take the address of this object.
Whereas in
reinterpret_cast<Thing*>(it);
you are telling the compiler to literally reinterpret the it object as a pointer. But it might not be a pointer at all -- it might be a 50-byte struct, for all you know! In that case, the first sizeof (Thing*) bytes of it will absolutely not happen to point at anything sensible.
Tip: reinterpret_cast<> is nearly always the wrong thing.
Obligatory Standard quotes, emphasis mine:
5.2.10 Reinterpret cast
1/ [...] Conversions that can be performed explicitly using
reinterpret_cast are listed below. No other conversion can be
performed explicitly using reinterpret_cast.
4/ A pointer can be explicitly converted to any integral type large
enough to hold it. [...]
5/ A value of integral type or enumeration type can be explicitly
converted to a pointer. [...]
6/ A function pointer can be explicitly converted to a function
pointer of a different type. [...]
7/ An object pointer can be explicitly converted to an object pointer
of a different type. [...]
8/ Converting a function pointer to an object pointer type or vice
versa is conditionally-supported. [...]
9/ The null pointer value (4.10) is converted to the null pointer
value of the destination type. [...]
10/ [...] “pointer to member of X of type T1” can be explicitly
converted to [...] “pointer to member of Y of type T2” [...]
11/ A [...] T1 can be cast to the type “reference to T2” if an
expression of type “pointer to T1” can be explicitly converted to the
type “pointer to T2” using a reinterpret_cast. [...]
With the exception of the integral-to-pointer and value-to-reference conversions noted in 4/, 5/ and 11/ the only conversions that can be performed using reinterpret_cast are pointer-to-pointer conversions.
However in:
thing_pointer = reinterpret_cast<Thing*>(it);
it is not a pointer, but an object. It just so happens that this object was designed to emulate a pointer in many ways, but it's still not a pointer.
Because the iterator's operator* is overloaded, and it returns a reference to the object it points to.
You can force it by thing_pointer = *(reinterpret_cast<Thing**>(&it));. But it's undefined behavior.
Because an iterator is not a pointer. It is a class with an implementation-defined structure, and if you try to reinterpret it as a pointer, the raw data of the iterator object will be taken as a memory address, which may, but probably will not, point to valid memory.
The first gets a reference to the object, then takes the address of it, giving the pointer.
The second tries to cast the iterator to a pointer, which is likely to fail because most types can't be cast to pointers - only other pointers, integers, and class types with a conversion operator.
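A small sketch of the working conversion, assuming the iterator is dereferenceable (so not end()). Thing here is a stand-in type and to_pointer is a hypothetical helper. It uses std::addressof rather than a plain & so that it still works if Thing overloads operator&.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Thing { int id; };  // stand-in for the question's Thing

// Dereference the iterator to get a Thing&, then take its address.
Thing *to_pointer(std::vector<Thing>::iterator it) {
    return std::addressof(*it);
}
```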

casting via void* instead of using reinterpret_cast [duplicate]

I'm reading a book and I found that reinterpret_cast should not be used directly, but rather casting to void* in combination with static_cast:
T1 * p1=...
void *pv=p1;
T2 * p2= static_cast<T2*>(pv);
Instead of:
T1 * p1=...
T2 * p2= reinterpret_cast<T2*>(p1);
However, I can't find an explanation of why this is better than the direct cast. I would very much appreciate it if someone could give me an explanation or point me to the answer.
Thanks in advance
P.S. I know what reinterpret_cast is used for, but I have never seen it used in this way.
For types for which such cast is permitted (e.g. if T1 is a POD-type and T2 is unsigned char), the approach with static_cast is well-defined by the Standard.
On the other hand, reinterpret_cast is entirely implementation-defined - the only guarantee that you get for it is that you can cast a pointer type to any other pointer type and then back, and you'll get the original value; and also, you can cast a pointer type to an integral type large enough to hold a pointer value (which varies depending on implementation, and need not exist at all), and then cast it back, and you'll get the original value.
To be more specific, I'll just quote the relevant parts of the Standard, highlighting important parts:
5.2.10[expr.reinterpret.cast]:
The mapping performed by reinterpret_cast is implementation-defined. [Note: it might, or might not, produce a representation different from the original value.] ... A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that converting an rvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.
So something like this:
struct pod_t { int x; };
pod_t pod;
char* p = reinterpret_cast<char*>(&pod);
memset(p, 0, sizeof pod);
is effectively unspecified.
Explaining why static_cast works is a bit more tricky. Here's the above code rewritten to use static_cast which I believe is guaranteed to always work as intended by the Standard:
struct pod_t { int x; };
pod_t pod;
char* p = static_cast<char*>(static_cast<void*>(&pod));
memset(p, 0, sizeof pod);
Again, let me quote the sections of the Standard that, together, lead me to conclude that the above should be portable:
3.9[basic.types]:
For any object (other than a base-class subobject) of POD type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.
The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T).
3.9.2[basic.compound]:
Objects of cv-qualified (3.9.3) or cv-unqualified type void* (pointer to void), can be used to point to objects of unknown type. A void* shall be able to hold any object pointer. A cv-qualified or cv-unqualified (3.9.3) void* shall have the same representation and alignment requirements as a cv-qualified or cv-unqualified char*.
3.10[basic.lval]:
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:
...
a char or unsigned char type.
4.10[conv.ptr]:
An rvalue of type “pointer to cv T,” where T is an object type, can be converted to an rvalue of type “pointer to cv void.” The result of converting a “pointer to cv T” to a “pointer to cv void” points to the start of the storage location where the object of type T resides, as if the object is a most derived object (1.8) of type T (that is, not a base class subobject).
5.2.9[expr.static.cast]:
The inverse of any standard conversion sequence (clause 4), other than the lvalue-to-rvalue (4.1), array-to-pointer (4.2), function-to-pointer (4.3), and boolean (4.12) conversions, can be performed explicitly using static_cast.
[EDIT] On the other hand, we have this gem:
9.2[class.mem]/17:
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment. ]
which seems to imply that reinterpret_cast between pointers somehow implies "same address". Go figure.
There is not the slightest doubt that the intent is that both forms are well defined, but the wording fails to capture that.
Both forms will work in practice.
reinterpret_cast is more explicit about the intent and should be preferred.
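A sketch comparing the two spellings, assuming C++11 or later, where reinterpret_cast between object pointers is worded in terms of the two static_casts for standard-layout types such as pod_t. The helper names casts_agree and zero_fill are hypothetical. Under that assumption both casts should produce the same pointer value.

```cpp
#include <cassert>
#include <cstring>

struct pod_t { int x; };

// Obtain a byte pointer both ways and check that they agree.
bool casts_agree(pod_t &pod) {
    unsigned char *via_void =
        static_cast<unsigned char *>(static_cast<void *>(&pod));
    unsigned char *direct = reinterpret_cast<unsigned char *>(&pod);
    return via_void == direct;
}

// Zero the object through the void* route used in the answer above.
void zero_fill(pod_t &pod) {
    std::memset(static_cast<void *>(&pod), 0, sizeof pod);
}
```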
The real reason this is so is because of how C++ defines inheritance, and because of member pointers.
In C, a pointer is pretty much just an address, as it should be. In C++ it has to be more complex because of some of the language's features.
Member pointers are really an offset into a class, so casting them is always a disaster using C style.
If you have multiply inherited two virtual objects that also have some concrete parts, that's also a disaster for C style. This is the case in multiple inheritance that causes all the problems, though, so you should not ever want to use this anyway.
Hopefully you never run into these cases in the first place. Also, if you are casting a lot, that's another sign you are messing up in your design.
The only time I end up casting is with the primitives in areas C++ decides are not the same but where obviously they have to be. For actual objects, any time you want to cast something, start to question your design because you should be 'programming to the interface' most of the time. Of course, you can't change how 3rd party APIs work so you don't always have much choice.