Assigning a function pointer to a void pointer,
double f_dummy(double x) { return x; }
...
void *pv = f_dummy; //compilation error
is illegal as explained in this FAQ. The answer, however, closes with the statement :
Please do not email me if the above seems to work on your particular version of your particular compiler on your particular operating system. I don’t care. It’s illegal, period.
Edit : As a warning to others, I did encounter this "seems to work" behavior, through a case of inheritance involving class templates. No compiler warning, no unexpected run-time behavior.
This tickles my OCD bone and makes me wonder if what I've been doing, e.g.,
...
auto l_func = [](double x){ return f_dummy(x); };
void *pv = &l_func;
auto pl = static_cast<decltype(&l_func)>(pv);
cout << (*pl)(5.) << endl;
which compiles and runs cleanly (g++ -std=c++11 -Wall), is truly legal. Is it?
Yes, it's legal, because:
pointers to objects may be cast to void* and back again;
l_func is an object (a functor, with unspecified class type) — that's how lambdas are implemented, by standard mandate.
The FAQ text you cite is unrelated, as it refers to pointers to functions. _yourUnspecifiedLambdaType::operator() is the equivalent* function, but you're not doing anything with that here.
* Well, it's not even equivalent, because it's a member function!
Related
This is a slightly esoteric question, but I was curious whether the following class extension pattern is legal (as in, does not constitute UB) in modern C++ (for all intends and purposes I am fine with restricting the discussion to C++17 and later).
template<typename T>
struct AddOne {
T add_one() const {
T const& tref = *reinterpret_cast<T const*>(this);
return tref + 1;
}
};
template<template<typename> typename E, typename T>
E<T> const& as(T const& obj) {
return reinterpret_cast<E<T> const&>(obj);
}
auto test(float x) {
return as<AddOne>(x).add_one();
}
auto test1(int x) {
return as<AddOne>(x).add_one();
}
// a main() to make this an MVCE
// will return with the exit code 16
int main(int argc, const char * argv[]) {
return test1(15);
}
The above code is a complete example, it compiles, runs and produces the expected result with at least clang in C++17 mode. Check the disassembly code on compiler explorer: https://godbolt.org/z/S3ZX2Y
My interpretation is as follows: the standard states that reinterpret_cast can convert between pointers/references of any types, but that accessing there resulting references might be UB (as per to aliasing rules). At the same time, converting the resulting value back to the original type is guaranteed to yield the original value.
Based on this, merely reinterepting a reference to float as a reference to AddOne<float> does not invoke UB. Since we never attempt to access any memory behind that reference as instance of AddOne<float>, there is no UB here either. We only use the type information of that reference to select the correct implementation of add_one() member function. The function itself coverts the reference back to the original type, so again, no UB. Essentially, this pattern is semantically equivalent to this:
template<typename T>
struct AddOne {
static T add_one(T const& x) {
return x + 1;
}
};
auto test(float x) {
return AddOne<Int>::add_one(x);
}
Am I correct or is there something I miss here?
Consider this as an academic exercise in exploring the C++ standard.
Edit: this is not a duplicate of When to use reinterpret_cast? since that question does not discuss casting this pointer or using reinterpret_cast to dispatch on the reinterpreted type.
No, that's definitely not legal. For a number of reasons.
The first reason is, you've got *this dereferencing an AddOne<int>* which doesn't actually point to an AddOne<int>. It doesn't matter that the operation doesn't really require a dereference "behind the scenes"; *foo is only legal if foo points to an object of compatible type.
The second reason is similar: You're calling a member function on an AddOne<int> which isn't. It likewise doesn't matter that you don't access any of AddOne's (nonexistent) members: the function call itself is an access of the object value, running afoul of the strict aliasing rule.
The full answer was provided by user #n.m in the comments, so I will copy it here for the sake of completeness.
A paragraph from [basic.life] c++ standard sections states:
Before the lifetime of an object has started [...] The program has undefined
behavior if [...]
the pointer is used to access a non-static data member or call a non-static member function of the object
As to seems, this prohibits dispatch via the reinterpreted reference since that reference does not refer to a live object.
In my program, I would like to take the address of a temporary. Here is an example:
#include <iostream>
struct Number {
int value;
Number(int n) {
value = n;
}
};
void print(Number *number) {
std::cout << number->value << std::endl;
}
int main() {
Number example(123);
print(&example);
print(&Number(456)); // Is this safe and reliable?
}
This would output:
123
456
To compile, the -fpermissive flag is requied.
Here is my question: is this safe and reliable? In what possible case could something go wrong?
If your definition of "safe and reliable" includes "will compile and produce the same results if the compiler is updated" then your example is invalid.
Your example is ill-formed in all C++ standards.
This means, even if a compiler can be coerced to accept it now, there is no guarantee that a future update of your compiler will accept it or, if the compiler does accept the code, will produce the same desired effect.
Most compiler vendors have form for supporting non-standard features in compilers, and either removing or altering support of those features in later releases of the compiler.
Consider changing your function so it accepts a const Number & rather than a pointer. A const reference CAN be implicitly bound to a temporary without needing to bludgeon the compiler into submission (e.g. with command line options). A non-const reference cannot.
&Number(456) is an error because the built-in & operator cannot be applied to an rvalue. Since it is an error, it is neither safe nor reliable. "What could go wrong" is that the code could be rejected and/or behave unexpectedly by a compiler which follows the C++ Standard. You are relying on your compiler supporting some C++-like dialect in which this code is defined.
You can output the address of the temporary object in various ways. For example add a member function auto operator&() { return this; } . The overloaded operator& can be applied to prvalues of class type.
Another way would be to have a function that is like the opposite of move:
template<typename T>
T& make_lvalue(T&& n)
{
return n;
}
and then you can do print(&make_lvalue(Number(456)));
If you are feeling evil, you could make a global template overload of operator&.
This is fine but..
Number *a;
print(a); // this would be a null ptr error
How I would change it is
void print(const Number num) // make the paramater const so it doesnt change
{
std::cout << num.value << std::endl; // note the . instead of -> cuz this is a reference not a pointer
}
You would remove the "&" from your code like:
Number example(123);
print(example);
print(Number(456));
and if you need to pass a pointer you just put a "*" to dereference it.
chasester
I have stumbled upon the following code structure and I'm wondering whether this is intentional or just poor understanding of casting mechanisms:
struct AbstractBase{
virtual void doThis(){
//Basic implementation here.
};
virtual void doThat()=0;
};
struct DerivedA: public AbstractBase{
virtual void doThis(){
//Other implementation here.
};
virtual void doThat(){
// some stuff here.
};
};
// More derived classes with similar structure....
// Dubious stuff happening here:
void strangeStuff(AbstractBase* pAbstract, int switcher){
AbstractBase* a = NULL;
switch(switcher){
case TYPE_DERIVED_A:
// why would someone use the abstract base pointer here???
a = dynamic_cast<DerivedA*>(pAbstract);
a->doThis();
a->doThat();
break;
// similar case statement with other derived classes...
}
}
// "main"
DerivedA* pDerivedA = new DerivedA;
strangeStuff( pDerivedA, TYPE_DERIVED_A );
My guess is, that this dynamic_cast statement is just the result of poor understanding and very bad programming style in general (the whole way the code works, just feels painful to me) and that it doesn't cause any change in behaviour for this specific use case.
However, since I'm not an expert on casting, I'd like to know whether there are any subtle side-effects that I'm not aware of.
Blockquote [C++11: 5.2.7/9]: The value of a failed cast to pointer type is the null pointer value of the required result type.
The dynamic_cast can return NULL if the type was wrong, making the following lines crash. Hence, this can be either 1. an attempt to make (logical) errors more explicit, or 2. some sort of in-code documentation.
So while it doesn't look like the best design, it is not exactly true that the cast has no effect whatsoever.
My guess would be that the coder screwed up.
A second guess would be that you skipped a check for a being null in your simplification.
A third, and highly unlikely possibility, is that the coder was exploiting undefined behavior to optimize.
With this code:
a = dynamic_cast<DerivedA*>(pAbstract);
a->doThis();
if a is not of type DerivedA* (or more derived), the a->doThis() is undefined behavior. And if it is of type DerivedA*, then the dynamic_cast does absolutely nothing (guaranteed).
A compiler can, in theory, optimize out any other possibility away, even if you did not change the type of a, and remain conforming behavior. Even if someone later checks if a is null, the execution of undefined behavior on the very next line means that the compiler is free not to set a to null on the dynamic_cast line.
I would doubt that a given compiler would do this, but I could be wrong.
There are compilers that detect certain paths cause undefined behavior (in the future), eliminate such possibilities from happening backwards in execution to the point where the undefined behavior would have been set in motion, and then "know" that the code in question cannot be in the state that would trigger undefined behavior. It can then use this knowledge to optimize the code in question.
Here is an example:
std::string foo( unsigned int x ) {
std::string r;
if (x == (unsigned)-1)) {
r = "hello ";
}
int y = x;
std::stringstream ss;
ss << y;
r += ss.str();
return r;
}
The compiler can see the y=x line above. If x would overflow an int, then the conversion y=x is undefined behavior. It happens regardless of the result of the first branch.
In short, if the first branch runs, undefined behavior would result. And undefined behavior can do anything, including time travel -- it can go back in time and prevent that branch from being taken.
So the
if (x == (unsigned)-1)) {
r = "hello ";
}
branch can be eliminated by the optimizer, legally in C++.
While the above is just a toy case, gcc does optimizations very much like this. There is a flag to tell it not do.
(unsigned -1 is defined behavior, but overflowing an int is not, in C++. In practice, this is because there are platforms in which signed int overflow causes problems, and C++ doesn't want to impose extra costs on them to make a conforming implementation.)
dynamic_cast will confirm that the dynamic type does match the type indicated by the switcher variable, making the code slightly less dangerous. However, it will give a null pointer in the case of a mismatch, and the code neglects to check for that.
But it seems more likely that the author didn't really understand the use of virtual functions (for uniform treatment of polymorphic types) and RTTI (for the rarer cases where you need to distinguish between types), and attempted to invent their own form of manual, error-prone type identification.
I have recently encountered a behavior in C++ regarding function pointers, that I can't fully understand. I asked Google for help as well as some of my more experienced colleagues, but even they couldn't help.
The following code showcases this mystique behavior:
class MyClass{
private:
int i;
public:
MyClass(): i(0) {}
MyClass(int i): i(i) {}
void PrintText() const { std::cout << "some text " << std::endl;}
};
typedef void (*MyFunction) (void*);
void func(MyClass& mc){
mc.PrintText();
}
int main(){
void* v_mc = new MyClass;
MyFunction f = (MyFunction) func; //It works!
f(v_mc); //It works correctly!!!
return 0;
}
So, first I define a simple class that will be used later (especially, it's member method PrintText). Then, I define name object void (*) (void*) as MyFunction - a pointer to function that has one void* parameter and doesn't return a value.
After that, I define function func() that accepts a reference to MyClass object and calls its method PrintText.
And finally, magic happens in main function. I dynamically allocate memory for new MyClass object casting the returned pointer to void*. Then, I cast pointer to func() function to MyFunction pointer - I didn't expect this to compile at all but it does.
And finally, I call this new object with a void* argument even though underlying function (func()) accepts reference to MyClass object. And everything works correctly!
I tried compiling this code with both Visual Studio 2010 (Windows) and XCode 5 (OSX) and it works in the same manner - no warnings are reported whatsoever. I imagine the reason why this works is that C++ references are actually implemented as pointers behind the scenes but this is not an explanation.
I hope someone can explain this behavior.
The formal explanation is simple: undefined behaviour is undefined. When you call a function through a pointer to a different function type, it's undefined behaviour and the program can legally do anything (crash, appear to work, order pizza online ... anyting goes).
You can try reasoning about why the behaviour you're experiencing happens. It's probably a combination of one or more of these factors:
Your compiler internally implements references as pointers.
On your platform, all pointers have the same size and binary representation.
Since PrintText() doesn't access *this at all, the compiler can effectively ignore the value of mc altogether and just call the PrintText() function inside func.
However, you must remember that while you're currently experiencing the behaviour you've described on your current platform, compiler version and under this phase of the moon, this could change at any time for no apparent reason whatsoever (such as a change in surrounding code triggering different optimisations). Remember that undefined behaviour is simply undefined.
As to why you can cast &func to MyFunction - the standard explicitly allows that (with a reinterpret_cast, to which the C-style cast translates in this context). You can legally cast a pointer to function to any other pointer to function type. However, pretty much the only thing you can legally do with it is move it around or cast it back to the original type. As I said above, if you call through a function pointer of the wrong type, it's undefined behaviour.
I hope someone can explain this behavior.
The behaviour is undefined.
MyFunction f = (MyFunction) func; //It works!
It "works" because you use c-style cast which has the same effect as reinterpret_cast in this case I think. If you had used static_cast or simply not cast at all, the compiler would have warned of your mistake and failed. When you call the wrongly interpreted function pointer, you get undefined behaviour.
It's only by chance that it works. Compilers are not guaranteed to make it work. Behind the scenes, your compiler is treating the reference as a pointer, so your alternative function signature just happens to work.
I'm sorry, to me isn't clear why you call this a strange behavior, I don't see a undefined behavior that depends on moon cycle here, is the way to use function pointers in C.
Adding some debug output you may see that the pointer to the object remain the same in all the calls.
void PrintText() const { std::cout << "some text " << this << std::endl;}
^^^^
void func(MyClass& mc){
std::cout << (void *)&mc << std::endl;
^^^
void *v_mc = new MyClass;
std::cout << (void *)v_mc << std::endl;
^^^^
I'm curious why C++ does not define void via :
typedef struct { } void;
I.e. what is the value in a type that cannot be instantiated, even if that installation must produce no code?
If we use gcc -O3 -S, then both the following produce identical assembler :
int main() { return 0; }
and
template <class T> T f(T a) { }
typedef struct { } moo;
int main() { moo a; f(a); return 0; }
This makes perfect sense. A struct { } simply takes an empty value, easy enough to optimize away. In fact, the strange part is that they produce different code without the -O3.
You cannot however pull this same trick with simply typedef void moo because void cannot assume any value, not even an empty one. Does this distinction have any utility?
There are various other strongly typed languages like Haskell, and presumably the MLs, that have a value for their void type, but offer no valueless types overtly, although some posses native pointer types, which act like void *.
I see the rationale for void being unable to be instantiated coming from the C roots of C++. In the old gory days, type safety wasn't that big a deal and void*s were constantly passed around. However, you could always be looking at code that does not literally say void* (due to typedefs, macros, and in C++, templates) but still means it.
It is a tiny bit of safety if you cannot dereference a void* but have to cast it to a pointer to a proper type, first. Whether you accomplish that by using an incomplete type as Ben Voigt suggests, or if you use the built-in void type doesn't matter. You're protected from incorrectly guessing that you are dealing with a type, which you are not.
Yes, void introduces annoying special cases, especially when designing templates. But it's a good thing (i.e. intentional) that the compiler doesn't silently accept attempts to instantiate void.
Because that wouldn't make void an incomplete type, now would it?
Try your example code again with:
struct x; // incomplete
typedef x moo;
Why should void be an incomplete type?
There are many reasons.
The simplest is this: moo FuncName(...) must still return something. Whether it is a neutral value, whether it is the "not a value value", or whatever; it still must say return value;, where value is some actual value. It must say return moo();.
Why write a function that technically returns something, when it isn't actually returning something? The compiler can't optimize the return out, because it's returning a value of a complete type. The caller might actually use it.
C++ isn't all templates, you know. The compiler doesn't necessarily have the knowledge to throw everything away. It often has to perform in-function optimizations that have no knowledge of the external use of that code.
void in this case means "I don't return anything." There is a fundamental difference between that and "I return a meaningless value." It is legal to do this:
moo FuncName(...) { return moo(); }
moo x = FuncName(...);
This is at best misleading. A cursory scan suggests that a value is being returned and stored. The identifier x now has a value and can be used for something.
Whereas this:
void FuncName(...) {}
void x = FuncName(...);
is a compiler error. So is this:
void FuncName(...) {}
moo x = FuncName(...);
It's clear what's going on: FuncName has no return value.
Even if you were designing C++ from scratch, with no hold-overs from C, you would still need some keyword to indicate the difference between a function that returns a value and one that does not.
Furthermore, void* is special in part because void is not a complete type. Because the standard mandates that it isn't a complete type, the concept of a void* can mean "pointer to untyped memory." This has specialized casting semantics. Pointers to typed memory can be implicitly cast to untyped memory. Pointers to untyped memory can be explicitly cast back to the typed pointer that it used to be.
If you used moo* instead, then the semantics get weird. You have a pointer to typed memory. Currently, C++ defines casting between unrelated typed pointers (except for certain special cases) to be undefined behavior. Now the standard has to add another exception for the moo type. It has to say what happens when you do this:
moo *m = new moo;
*moo;
With void, this is a compiler error. What is it with moo? It's still useless and meaningless, but the compiler has to allow it.
To be honest, I would say that the better question would be "Why should void be a complete type?"