How to load function with dlsym() without reinterpret_cast? - c++

I'm trying to use clang-tidy to enforce the C++ Core Guidelines. While it does have a lot of valid points, there is one thing I cannot really work around: dlsym returns a void*, which I need to turn into a proper function pointer somehow. To do that I use reinterpret_cast, and since the guidelines forbid it, I get warnings about it.
Of course I can put //NOLINT comments everywhere, but I'm looking for a solution that doesn't use reinterpret_cast so the warnings go away.
Are there any workarounds for this problem?

There is no way in the language other than reinterpret_cast to cast between a function pointer type and an object pointer type. Doing so is conditionally-supported, with implementation-defined behavior, [expr.reinterpret.cast]/8:
Converting a function pointer to an object pointer type or vice versa is conditionally-supported. The meaning of such a conversion is implementation-defined, except that if an implementation supports conversions in both directions, converting a prvalue of one type to the other type and back, possibly with different cv-qualification, shall yield the original pointer value.
That means that a conforming C++ compiler must document if it does not support this feature. And, if it does support it, it must document how exactly it behaves. You can rely on it working (or not being available) in the documented way on that compiler.
Concerning the Core Guidelines linting: If you would have to put //NOLINT "everywhere", then that would seem to imply that you're calling naked dlsym() in many places. Consider wrapping it, for example
#include <dlfcn.h>    // dlsym
#include <stdexcept>  // std::runtime_error
#include <string>     // std::string

template <typename T>
inline T* lookupSymbol(void* module, const char* name)
{
    // The one place in the code base that needs the cast (and the single NOLINT).
    auto symbol = reinterpret_cast<T*>(dlsym(module, name)); // NOLINT
    if (!symbol)
        throw std::runtime_error(std::string("failed to find symbol '") + name + '\'');
    return symbol;
}
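A hypothetical call site might then look like this (the library path and the symbol name are invented for illustration):

// Hypothetical usage: load int add(int, int) from a shared library.
void* handle = dlopen("libmath.so", RTLD_NOW);          // illustrative library name
if (!handle)
    throw std::runtime_error(dlerror());

auto* add = lookupSymbol<int(int, int)>(handle, "add"); // the only reinterpret_cast lives in the wrapper
int sum = add(2, 3);                                    // sum == 5

dlclose(handle);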

Is it safe to reinterpret_cast from std::function<void()> * to std::function<std::monostate()> *?

Example:
#include <functional>  // std::function
#include <variant>     // std::monostate

std::function<std::monostate()> convert(std::function<void()> func){
    return *reinterpret_cast<std::function<std::monostate()>*>(&func);
}
Are std::function<void()> and std::function<std::monostate()> considered "similar" enough for reinterpret_cast to be safe?
Edit: someone asked me to clarify what I am asking. I am not asking if the general case of foo<X> and foo<Y> are similar but whether foo<void> and foo<std::monostate> are.
No, this is unsafe and leads to undefined behavior. In particular, there is no guarantee that the two layouts will be compatible. Of course, you might get away with it with some compiler and runtime combinations, but it might break if some future release of your compiler decides to implement certain forms of control-flow integrity.
The safe way to do what you want, albeit at a small cost in performance, is just to return a new lambda, as in:
std::function<std::monostate()> convert(std::function<void()> func){
    return [func = std::move(func)]() -> std::monostate { func(); return {}; };
}
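A quick usage sketch of that approach (the lambda and the printed text are purely illustrative):

std::function<void()> hello = [] { std::puts("hello"); }; // needs <cstdio>
std::function<std::monostate()> wrapped = convert(hello); // safe: builds a new wrapper object
std::monostate m = wrapped();                             // prints "hello", returns the single monostate value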
Are std::function<void()> and std::function<std::monostate()> considered "similar" enough for reinterpret_cast to be safe?
No. Given a template foo and distinct types X and Y, the instantiations foo<X> and foo<Y> are not similar, regardless of any perceived relationship between X and Y (as long as they are not the same type, which is why they were qualified as "distinct"). Different template instantiations are unrelated unless documented otherwise. There is no such documentation for std::function.
The rules for "similar" make allowances for digging into pointer types, but there is nothing special for templates. (Nor could there be, since a template specialization could look radically different than its base template.) Different types as template arguments yield dissimilar templated classes. No need to dig deeper into those arguments.
I am not asking if the general case of foo<X> and foo<Y> are similar but whether foo<void> and foo<std::monostate> are.
There is nothing special about void and std::monostate that would make them two names for the same type. (In fact, they cannot be the same type, as the former has zero values, while the latter has exactly one value.) So, asking about foo<void> and foo<std::monostate> is the same as asking about the general case, just with a greater possibility of seeing connections that do not exist.
Also, the question is not about foo<void> and foo<std::monostate> but about foo<void()> and foo<std::monostate()>. The types used as template arguments are function types, not object types. Function types are very particular in that two function types are the same only when all of their parameter and return types are exact matches; none of the conversions allowed when invoking a function are considered. (Not that there is a conversion from void to std::monostate.) The function types are different, so again the templates instantiated from those types are not similar.
Perhaps a more focused version of this question would have asked about function pointers instead of std::function objects.
(from a comment:) I was looking at the assembly code of std::monostate() functions and void() functions and they generate the same assembly verbatim.
Generated assembly means nothing as far as the language is concerned. At best, you have evidence that with your compiler, it seems likely that you could get away with invoking a function pointer after casting it from void (*)() to std::monostate (*)(). Not "safe" so much as "works for now". And that assumes that you use the function pointer directly instead of burying it inside a std::function (a complex adapter of types).
C++ is a strongly typed language. Different types are different even if they are treated the same at the level of assembly code. This might be more readily apparent if we switch to more familiar types. On many common systems, char is signed, making it equivalent to signed char at the assembly code level. However, this does not affect the similarity of functions. The following code is illegal, even if changing char to signed char has no effect on the assembly code generated for foo().
char foo() { return 'c'; }

int main()
{
    signed char (*fun)() = foo; // <-- Error: invalid conversion
    // because the pointer's return type is signed char, not char
}
One can downgrade this error to a warning with a reinterpret_cast. After all, it is legal to cast a function pointer to any function pointer type. However, it is not safe to invoke the function through the cast pointer (unless cast back to the original type), hence the warning. Invoking it might work very reliably on your system, but that is due to your system, not the language. When you ask about "safe", you are asking for guidance from the language specs, not merely what will probably work on your system.
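As a sketch of that distinction (this compiles, but the commented-out call would be undefined behavior; foo is the function from the example above):

using CharFn  = char (*)();
using SCharFn = signed char (*)();

CharFn original = foo;
auto cast = reinterpret_cast<SCharFn>(original); // legal: any function pointer type may be the target
// cast();                                       // undefined behavior: called through the wrong type
auto back = reinterpret_cast<CharFn>(cast);      // cast back to the original type
char c = back();                                 // fine: c == 'c'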

C++ casting for C-style downcasting

When working with a C API that uses C-style inheritance, (taking advantage of the standard layout of C-structs), such as GLib, we usually use C-style casts to downcast:
struct base_object
{
    int x;
    int y;
    int z;
};

struct derived_object
{
    base_object base;
    int foo;
    int bar;
};

void func(base_object* b)
{
    derived_object* d = (derived_object*) b; /* Downcast */
}
But if we're writing new C++ code that uses a C-API like this, should we continue to use C-style casts, or should we prefer C++ casts? If the latter, what type of C++ casts should we use to emulate C downcasting?
At first, I thought reinterpret_cast would be suitable:
derived_object* d = reinterpret_cast<derived_object*>(b);
However, I'm always wary of reinterpret_cast because the C++ standard guarantees very little about what will happen. It may be safer to use static_cast to void*:
derived_object* d = static_cast<derived_object*>(static_cast<void*>(b));
Of course, this is really cumbersome, making me think it's better to just use C-style casts in this case.
So what is the best practice here?
If you look at the specification for C-style casts in the C++ spec, you'll find that cast notation is defined in terms of the other type conversion operators (dynamic_cast, static_cast, reinterpret_cast, const_cast), and in this case reinterpret_cast is used.
Additionally, reinterpret_cast gives more guarantees than is indicated by the answer you link to. The one you care about is:
§ 9.2/20: A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
If you want to use a cast notation, I think using the C++ type conversion operators explicitly is best. However, rather than littering casts all over the code, you should probably write a function for each conversion (implemented using reinterpret_cast) and then use that.
derived_object* downcast_to_derived(base_object* b) {
    return reinterpret_cast<derived_object*>(b);
}
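At call sites, the cast then disappears behind the named helper; func is the function from the question:

void func(base_object* b)
{
    derived_object* d = downcast_to_derived(b); // intent is explicit, the cast is centralized
    d->foo = 42;
}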
However, I'm always wary of reinterpret_cast because the C++ standard guarantees very little about what will happen.
C++-style casts are no less safe than C-style casts, because a C-style cast is defined in terms of the C++-style casts.
5.4/4 The conversions performed by
— a const_cast (5.2.11),
— a static_cast (5.2.9),
— a static_cast followed by a const_cast,
— a reinterpret_cast (5.2.10), or
— a reinterpret_cast followed by a const_cast,
can be performed using the cast notation of explicit type conversion.
[...]
If a conversion can be interpreted in more than one of the ways listed above, the interpretation that appears first in the list is used, even if a cast resulting from that interpretation is ill-formed.
The sad answer is that you can't avoid casts in code like the code you have written, because the compiler knows very little about the relations between these classes. One way or another, you may want to refactor something (the casts, the classes, or the code that uses them).
The bottom line is:
If you can, use proper inheritance.
If you can't, use reinterpret_cast.
new C++ code that uses a C-API like this
Don't write new C++ code in a C style; it doesn't make use of the C++ language features, and it also forces the user of your wrapper to use the same "C" style. Instead, create a proper C++ class that wraps the C API details and hides them (a rough sketch follows at the end of this answer).
should we continue to use C-style casts
No
or should we prefer C++ casts
Yes, but only when you have to.
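Here is that rough sketch; the class and its member functions are invented for illustration, and it assumes the base_object/derived_object structs from the question:

// Hypothetical thin C++ wrapper around the C-style structs.
class Derived
{
public:
    explicit Derived(derived_object* d) : d_(d) {}

    int foo() const { return d_->foo; }
    int x() const   { return d_->base.x; }

    // The single, documented place where the C-style downcast happens.
    static Derived from_base(base_object* b)
    {
        return Derived(reinterpret_cast<derived_object*>(b));
    }

private:
    derived_object* d_;
};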
Use C++ inheritance and virtual accessor functions (probably). Please show how you plan to use the derived object in func; that may allow a better answer for you.
If func expects to use the methods of the derived object, then it should receive a derived object. If it expects to use the methods of a base_object, but the methods are somehow changed because the pointer is to a derived_object, then virtual functions are the C++ way to do this.
Also, you want to pass a reference to func, not a pointer.
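As a sketch of that virtual-function approach (the class names and the printed text are invented for illustration):

#include <iostream>

struct Shape
{
    virtual ~Shape() = default;
    virtual void describe() const { std::cout << "shape\n"; }
};

struct Circle : Shape
{
    void describe() const override { std::cout << "circle\n"; }
};

void func(Shape& s)
{
    s.describe(); // no cast needed: virtual dispatch picks the right override at run time
}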
dynamic_cast requires certain conditions to be met:
http://www.cplusplus.com/doc/tutorial/typecasting/
If you are just converting struct pointers to struct pointers and you know what you want, then static_cast or reinterpret_cast may be the best choice.
However, if you truly are interested in writing C++ code, then casts should be your last resort, since there are better patterns. The two common situations in which I would consider casting are:
You are interfacing with some event passing mechanism that passes a generic base class to an event handler.
You have a container of objects. The container requires that it contain homogeneous types (i.e. every element contains the same "thing"), but you want to store different types in the container.
I think dynamic_cast is exactly what you want.

Casts between pointer-to-function and pointer-to-object in C and C++

Am I wrong about the following?
The C++ standard says that conversion between pointer-to-function and pointer-to-object (and back) is conditionally-supported with implementation-defined semantics, while all C standards say that this is illegal in all cases, right?
void foo() {}

int main(void)
{
    void (*fp)() = foo;
    void* ptr = (void*)fp;
    return 0;
}
ISO/IEC 14882:2011
5.2.10 Reinterpret cast [expr.reinterpret.cast]
8 Converting a function pointer to an object pointer type or vice versa is conditionally-supported. The meaning of such a conversion is implementation-defined, except that if an implementation supports conversions in both directions, converting a prvalue of one type to the other type and back, possibly with different cv-qualification, shall yield the original pointer value.
I can't find anything about it in the C standard right now...
In C++03, such conversions were illegal (not UB). The compiler was supposed to issue a diagnostic. A lot of compilers on Unix systems didn't issue a diagnostic. This was essentially a clash between standards, POSIX vs C++.
In C++11, such conversions are "conditionally supported". No diagnostic is required if the system does support such conversions; there's nothing to diagnose.
In C, such conversions officially are undefined behavior, so no diagnostic is required. If the system happens to do the "right" thing, well that's one way to implement UB.
In C99, this is once again UB. However, the standard also lists such conversions as one of the "common extensions" to the language:
J.5.7 Function pointer casts
A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).
You're right: the C(99) standard says nothing about conversion from pointer-to-function to pointer-to-object, so it's undefined behaviour.*
*Note, however, that it does define behaviour between pointer-to-function types.
In all C standards, the conversion between pointer-to-function and pointer-to-object is not defined. In C++ before C++11, the conversion was not allowed and compilers had to issue a diagnostic, but there were compilers which accepted the conversion anyway, for C and backward compatibility, and because it is useful for things like access to dynamically loaded libraries (for instance, the POSIX dlsym function mandates its use). C++11 introduced the notion of conditionally-supported features and used it to align the standard with existing practice: now the compiler must either reject a program attempting such a conversion or respect the limited constraints given.
Though the behavior of the cast is not defined by the "core" of the standard, this case is explicitly described as invalid in the C99 rationale document (6.3.2.3, Pointers):
Nothing is said about pointers to functions, which may be incommensurate with object pointers and/or integers.
Even with an explicit cast, it is invalid to convert a function pointer to an object pointer or a pointer to void, or vice versa.
And since it may be useful, it is also mentioned in the Annex J of the standard as a "common extension" (C11 Standard J.5.7, Function pointer casts):
A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).
Describing this as an extension means that it is not part of the standard's requirements (although an explicit statement would not be needed anyway: the omission of any defined behavior is enough).
The POSIX standard specifies:
2.12.3 Pointer Types
All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.
Note: The ISO C standard does not require this, but it is required for POSIX conformance.
This requirement ends up meaning that any C or C++ compiler that wants to be able to support POSIX will support this kind of casting.
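As a small sketch of what that round trip looks like on a POSIX-conforming implementation (the guarantee comes from POSIX, not from ISO C or ISO C++):

#include <cassert>

int add(int a, int b) { return a + b; }

int main()
{
    int (*fp)(int, int) = add;
    void* p = reinterpret_cast<void*>(fp);              // function pointer -> void*
    auto back = reinterpret_cast<int (*)(int, int)>(p); // and back again
    assert(back == fp);                                 // POSIX: no information lost
    return back(1, 2) == 3 ? 0 : 1;
}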

What are the uses of the type `std::nullptr_t`?

I learned that nullptr, in addition to being convertible to any pointer type (but not to any integral type), also has its own type, std::nullptr_t. So it is possible to have a method overload that accepts std::nullptr_t.
Exactly why is such an overload required?
If more than one overload accepts a pointer type, an overload for std::nullptr_t is necessary to accept a nullptr argument. Without the std::nullptr_t overload, it would be ambiguous which pointer overload should be selected when passed nullptr.
Example:
#include <cstddef> // std::nullptr_t

void f(int* intp)
{
    // Passed an int pointer
}

void f(char* charp)
{
    // Passed a char pointer
}

void f(std::nullptr_t nullp)
{
    // Passed a null pointer
}
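A minimal sketch of how the calls then resolve (assuming the three overloads above):

int main()
{
    int i = 0;
    char c = 'c';

    f(&i);      // calls f(int*)
    f(&c);      // calls f(char*)
    f(nullptr); // calls f(std::nullptr_t); without that overload, this call would be ambiguous
}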
There are some special cases in which comparison with the nullptr_t type is useful to indicate whether an object is valid.
For example, the operator== and operator!= overloads of std::function take only nullptr_t as the parameter to tell whether the function object is empty. For more details you could read this question.
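A short sketch of that null test (the variable names are illustrative):

#include <functional>

std::function<void()> task;          // empty
bool isEmpty = (task == nullptr);    // true: uses the nullptr_t overload of operator==
task = [] { /* do something */ };
bool stillEmpty = (task == nullptr); // false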
Also, what other type would you give it that doesn't simply re-introduce the problems we had with NULL? The whole point is to get rid of the nasty implicit conversions, but we can't actually change the behaviour of old programs, so here we are.
The type was introduced to avoid confusion between the integer zero and the null pointer. And, as always, C++ gives you access to the type, whereas Java only gives you access to the value. It really doesn't matter what purpose you find for it; I normally use it as a token in function overloading.
But I have some issues with the implementation of the C++ null pointer constant.
Why didn't they just continue with NULL or null? That name was already being used for the purpose. And what about code that was already using nullptr for something else?
Not to mention that nullptr is just too long: annoying to type and ugly to look at most of the time. Seven characters just to default-initialize a variable.
With the introduction of nullptr, you would think zero would no longer be both an integer and a null pointer constant. However, zero still holds that annoying ambiguity, so I don't see the point of the new nullptr value. If you define a function that can accept an integer or a char pointer and pass zero to that call, the compiler will complain that it is totally ambiguous! And I don't think casting to an integer will help.
Finally, it sucks that nullptr_t is part of the std namespace and not simply a keyword. In fact, I am only just learning this, after however long I have been using nullptr_t in my functions. The MinGW32 compiler that comes with Code::Blocks lets you get away with using nullptr_t without qualifying it with the std namespace. In fact, MinGW32 allows incrementing a void* and a whole lot of other things.
Which leads me to this: C++ has too many dialects and too much confusion, to the point where code that is compatible with one compiler is not compatible with another compiler of the same C++ version, and a static library built by one compiler cannot be used with a different compiler. There is no reason why it has to be this way, and I think this is just one more thing that helps kill C++.