I learned that templates are the void* equivalents in C++. Is this true?
I have this issue in polling "events" off some procedure, when I have an EventType variable and I may also need to pass raw data that is related to that event.
struct WindowEvent {
enum Type {WINDOW_SIZE_CHANGE, USER_CLICK, ...};
void* data;
};
The user may then cast data to the necessary type, depending on the event type.
Is this approach okay in C++? Are there any better approaches?
In C, which generally lacks support for polymorphism, void* pointers can be used to accept data of any type, along with some run-time representation of the actual type, or just knowledge that the data will be cast back to the correct type.
In C++ and other languages with support for polymorphism one will generally instead use either dynamic polymorphism (classes with virtual functions) or static polymorphism (function overloads and templates).
The main difference is that the C approach yields dynamic (run time) manual type checking, while the C++ approaches yield mostly static (compile time) and fully automated type checking. This means less time spent on testing and hunting down silly, easily preventable bugs. The cost is more verbose code, so there is a break-even code size somewhere: below it the C approach probably wins on productivity, and above it the C++ approaches win.
"I learned that templates are the void* equivalents in C++. Is this true?"
No - Templates maintain type safety
"Is this approach okay in C++?"
No
"Are there any better approaches?"
Depending on the use case one could use (for example)
class EventData {
public:
virtual int getData() = 0;
};
And then use the appropriate inherited class. Perhaps using smart pointers.
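A minimal sketch of that suggestion, assuming a hypothetical Event base class and a stand-in pollEvent function (none of these names come from the question). The receiver calls a virtual function instead of casting a void*:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical event hierarchy; the names are illustrative only.
struct Event {
    virtual ~Event() = default;
    virtual std::string describe() const = 0;
};

struct WindowSizeChange : Event {
    int width, height;
    WindowSizeChange(int w, int h) : width(w), height(h) {}
    std::string describe() const override {
        return "resize " + std::to_string(width) + "x" + std::to_string(height);
    }
};

struct UserClick : Event {
    int x, y;
    UserClick(int px, int py) : x(px), y(py) {}
    std::string describe() const override {
        return "click " + std::to_string(x) + "," + std::to_string(y);
    }
};

// Stand-in for the polling procedure: returns an owning pointer to the base.
std::unique_ptr<Event> pollEvent(bool clicked) {
    if (clicked) return std::make_unique<UserClick>(10, 20);
    return std::make_unique<WindowSizeChange>(800, 600);
}

// Usage: no cast is needed, the dynamic type picks the behaviour.
void usage() {
    std::unique_ptr<Event> e = pollEvent(true);
    assert(e->describe() == "click 10,20");
}
```

The smart pointer also settles who frees the event data, which a raw void* member leaves open.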
Related
Is there a legitimate use of void* in C++? Or was this introduced because C had it?
Just to recap my thoughts:
Input: If we want to allow multiple input types we can overload functions and methods, alternatively we can define a common base class, or template (thanks for mentioning this in the answers). In both cases the code gets more descriptive and less error prone (provided the base class is implemented in a sane way).
Output: I can't think of any situation where I would prefer to receive void* as opposed to something derived from a known base class.
Just to make it clear what I mean: I'm not specifically asking if there is a use-case for void*, but if there is a case where void* is the best or only available choice. Which has been perfectly answered by several people below.
void* is at least necessary as the result of ::operator new (also every operator new...) and of malloc and as the argument of the placement new operator.
void* can be thought of as the common supertype of every pointer type. So it does not exactly mean pointer to void, but pointer to anything.
BTW, if you wanted to keep some data for several unrelated global variables, you might use some std::map<void*,int> score; then, after having declared global int x; and double y; and std::string s; do score[&x]=1; and score[&y]=2; and score[&s]=3;
memset wants a void* address (the most generic ones)
Also, POSIX systems have dlsym and its return type evidently should be void*
There are multiple reasons to use void*, the 3 most common being:
interacting with a C library using void* in its interface
type-erasure
denoting un-typed memory
In reverse order, denoting un-typed memory with void* (3) instead of char* (or variants) helps prevent accidental pointer arithmetic; there are very few operations available on void*, so it usually requires casting before being useful. And of course, much like with char*, there is no issue with aliasing.
Type-erasure (2) is still used in C++, in conjunction with templates or not:
non-generic code helps reduce binary bloat; it's useful in cold paths even in generic code
non-generic code is sometimes necessary for storage, even in generic containers such as std::function
And obviously, when the interface you deal with uses void* (1), you have little choice.
Oh yes. Even in C++ sometimes we go with void * rather than a template<class T> taking T*, because sometimes the extra code from the template expansion weighs too much.
Commonly I would use it as the actual implementation of the type, and the template type would inherit from it and wrap the casts.
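A rough sketch of that layering, with hypothetical names (PtrStackImpl and PtrStack are not from the answer). The untyped core is compiled once, and the template only adds the casts:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template core: compiled once, stores untyped pointers.
class PtrStackImpl {
public:
    void push(void* p) { items_.push_back(p); }
    void* pop() {
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<void*> items_;
};

// Thin typed wrapper: only the casts are instantiated per T.
// Assumes T is not const-qualified, since T* must convert to void*.
template <class T>
class PtrStack : private PtrStackImpl {
public:
    void push(T* p) { PtrStackImpl::push(p); }
    T* pop() { return static_cast<T*>(PtrStackImpl::pop()); }
    using PtrStackImpl::size;
};

void usage() {
    int a = 1, b = 2;
    PtrStack<int> s;
    s.push(&a);
    s.push(&b);
    assert(s.pop() == &b);  // last in, first out
    assert(s.size() == 1);
}
```

Private inheritance keeps the void* interface out of reach of callers, so the casts in the wrapper are the only place where the type is restored.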
Also, custom slab allocators (operator new implementations) must use void *. This is one of the reasons why g++ added an extension permitting pointer arithmetic on void * as though it were of size 1.
Input: If we want to allow multiple input types we can overload
functions and methods
True.
alternatively we can define a common base
class.
This is partially true: what if you can't define a common base class, an interface or similar? To define those you need to have access to the source code, which is often not possible.
You didn't mention templates. However, templates cannot help you with polymorphism: they work with static types i.e. known at compile time.
void* may be considered the lowest common denominator. In C++, you typically don't need it because (i) you can't inherently do much with it and (ii) there are almost always better solutions.
Even further, you will typically end up converting it to other concrete types. That's why char * is usually better, although it may indicate that you're expecting a C-style string, rather than a pure block of data. That's why void* is better than char* for that, because it allows implicit conversion from other pointer types.
You're supposed to receive some data, work with it and produce an output; to achieve that, you need to know the data you're working with, otherwise you have a different problem which is not the one you were originally solving. Many languages don't have void* and have no problem with that, for instance.
Another legitimate use
When printing pointer addresses with functions like printf the pointer shall have void* type and, therefore, you may need a cast to void*
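A small sketch of that cast, using a hypothetical helper formatAddress. The %p conversion expects a void*, and arguments passed through the variadic part of printf-style functions get no implicit pointer conversion, so the cast is written out:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats an address with %p. The exact text produced is
// implementation-defined, so nothing here relies on its shape.
std::string formatAddress(int* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", static_cast<void*>(p));
    return buf;
}

void usage() {
    int x = 0;
    assert(!formatAddress(&x).empty());
}
```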
Yes, it is as useful as any other thing in the language.
As an example, you can use it to erase the type of a class that you are able to statically cast to the right type when needed, in order to have a minimal and flexible interface.
In that response there is an example of use that should give you an idea.
I copy and paste it below for the sake of clarity:
class Dispatcher {
    Dispatcher() { }
    // Restores the erased type and calls the member function.
    template<class C, void(C::*M)() = &C::receive>
    static void invoke(void *instance) {
        (static_cast<C*>(instance)->*M)();
    }
public:
    // Erases the type of instance, remembering how to call it back.
    template<class C, void(C::*M)() = &C::receive>
    static Dispatcher create(C *instance) {
        Dispatcher d;
        d.fn = &invoke<C, M>;
        d.instance = instance;
        return d;
    }
    void operator()() {
        (fn)(instance);
    }
private:
    using Fn = void(*)(void *);
    Fn fn;
    void *instance;
};
Obviously, this is only one of the bunch of uses of void*.
Interfacing with an external library function which returns a pointer. Here is one for an Ada application.
extern "C" { void* ada_function();}
void* m_status_ptr = ada_function();
This returns a pointer to whatever it was Ada wanted to tell you about. You don't have to do anything fancy with it, you can give it back to Ada to do the next thing.
In fact disentangling an Ada pointer in C++ is non-trivial.
In short, C++ as a strict language (not taking into account C relics like malloc()) requires void* since it has no common parent of all possible types. Unlike ObjC, for example, which has object.
The first thing that comes to mind (which I suspect is a concrete case of a couple of the answers above) is the capability to pass an object instance to a threadproc in Windows.
I've got a couple of C++ classes which need to do this. They have worker thread implementations: the start-routine parameter of the CreateThread() API gets the address of a static method of the class, and the LPVOID parameter carries the address of a specific instance, so the worker thread can do the work with that instance. A simple static cast back in the threadproc yields the instance to work with, allowing each instantiated object to have a worker thread from a single static method implementation.
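The pattern can be sketched portably. Here runCallback is a hypothetical stand-in for a C API such as CreateThread() and simply calls the callback synchronously; the Worker class and its members are likewise assumptions for illustration:

```cpp
#include <cassert>

// Stand-in for a C-style API taking a callback plus an opaque context
// pointer. A real thread API would run the callback on another thread.
using Callback = void (*)(void*);
void runCallback(Callback cb, void* context) { cb(context); }

class Worker {
public:
    explicit Worker(int id) : id_(id) {}
    // Pass the shared static entry point and this instance as context.
    void start() { runCallback(&Worker::threadProc, this); }
    int result() const { return result_; }
private:
    // One static entry point shared by every instance.
    static void threadProc(void* context) {
        static_cast<Worker*>(context)->run();
    }
    void run() { result_ = id_ * 2; }
    int id_;
    int result_ = 0;
};

void usage() {
    Worker a(3), b(5);
    a.start();
    b.start();
    assert(a.result() == 6);
    assert(b.result() == 10);
}
```

The static_cast is only safe because start() is the sole place that supplies the context pointer, so its real type is known.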
In case of multiple inheritance, if you need to get a pointer to the first byte of a memory chunk occupied by an object, you may dynamic_cast to void*.
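A short sketch of that cast, with hypothetical types A, B and C. Applying dynamic_cast to void* on a pointer to a polymorphic base yields the address of the most derived object:

```cpp
#include <cassert>

struct A { virtual ~A() = default; int a = 0; };
struct B { virtual ~B() = default; int b = 0; };
struct C : A, B { int c = 0; };

// Returns the address of the most derived object that p points into.
const void* startOfObject(const B* p) {
    return dynamic_cast<const void*>(p);
}

void usage() {
    C obj;
    const B* pb = &obj;  // points at the B subobject inside obj
    assert(startOfObject(pb) == static_cast<const void*>(&obj));
}
```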
I created this .h file
#pragma once
namespace Core
{
    class IComparableObject
    {
    public:
        virtual int CompareTo(IComparableObject obj)=0;
    };
}
But compiler doesn't like IComparableObject obj param if the method is virtual pure, while
virtual int CompareTo(IComparableObject obj) {}
It's ok, however I want it as virtual pure. How can I manage to do it? Is it possible?
You are trying to pass obj by value. You cannot pass an abstract class instance by value, because no abstract class can ever be instantiated (directly). To do what you want, you have to pass obj by reference, for example like so:
virtual int CompareTo(IComparableObject const &obj)=0;
It works when you give an implementation for CompareTo because then the class is not abstract any longer. But be aware that slicing occurs! You don't want to pass obj by value.
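A minimal sketch of the by-reference version. The Key() accessor and the Number class are assumptions added so the comparison has something to work with; they are not part of the question:

```cpp
#include <cassert>

class IComparableObject {
public:
    virtual ~IComparableObject() = default;
    // Reference parameter: nothing is copied, so no object of the
    // abstract type is ever created and no slicing can occur.
    virtual int CompareTo(const IComparableObject& obj) const = 0;
    virtual int Key() const = 0;  // hypothetical accessor for this sketch
};

class Number : public IComparableObject {
public:
    explicit Number(int v) : value_(v) {}
    int Key() const override { return value_; }
    int CompareTo(const IComparableObject& obj) const override {
        int other = obj.Key();
        return (value_ > other) - (value_ < other);  // -1, 0 or 1
    }
private:
    int value_;
};

void usage() {
    Number one(1), two(2);
    assert(one.CompareTo(two) == -1);
    assert(two.CompareTo(one) == 1);
}
```

Comparing through a shared key is only one possible design; it works here because every implementation is assumed to be able to produce an int.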
Well I have to give an unexpected answer here! Dennycrane said you can do this:
virtual int CompareTo(IComparableObject const &obj)=0;
but this is not correct either. Oh yes, it compiles, but it is useless because it can never be implemented correctly.
This issue is fundamental to the collapse of (statically typed) Object Orientation, so it is vital that programmers using OO recognize the issue. The problem has a name, it is called the covariance problem and it destroys OO utterly as a general programming paradigm; that is, a way of representing and independently implementing general abstractions.
This explanation will be a bit long and sloppy so bear with me and try to read between the lines.
First, an abstract class with a pure virtual method taking no arguments can be easily implemented in any derived class, since the method has access to the non-static data variables of the derived class via the this pointer. The this pointer has the type of a pointer to the derived class, and so we can say it varies along with the class, in fact it is covariant with the derived class type.
Let me call this kind of polymorphism first order, it clearly supports dispatching predicates on the object type. Indeed, the return type of such a method may also vary down with the object and class type, that is, the return type is covariant.
Now, I will generalise the idea of a method with no arguments to allow arbitrary scalar arguments (such as ints) claiming this changes nothing: this is merely a family of methods indexed by the scalar type. The important property here is that the scalar type is closed. In a derived class exactly the same scalar type must be used. In other words, the type is invariant.
General introduction of invariant parameters to a virtual function still permits polymorphism, but the result is still first order.
Unfortunately, such functions have limited utility, although they are very useful when the abstraction is only first order: a good example is device drivers.
But what if you want to model something which is actually interesting, that is, it is at least a relation?
The answer to this is: you cannot do it. This is a mathematical fact and has nothing to do with the programming language involved. Let's suppose you have an abstraction for say, numbers, and you want to add one number to another number, or compare them (as in the OP's example). Ignoring symmetry, if you have N implementations, you will have to write N^2 functions to perform the operations. If you add a new implementation of the abstraction, you have to write 2N+1 new functions, since (N+1)^2 - N^2 = 2N+1.
Now, I have the first proof that OO is screwed: you cannot fit N^2 methods into a virtual dispatch schema because such a schema is linear. N classes gives you N methods you can implement and for N>1, N^2 > N, so OO is screwed, QED.
In a C++ context you can see the problem: consider :
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { .. }
};
Arggg! We're screwed! We can't fill in the .. part here because we only have a reference to an abstraction, which has no data in it to compare to. Of course this must be the case, because there are an open/indeterminate/infinite number of possible implementations. There's no possible way to write a single comparison routine as an axiom.
Of course, if you have various property routines, or a common universal representation you can do it, but this does not count, because then the mapping to the universal representation is parameterless and thus the abstraction is only first order. For example if you have various integer representations and you add them by converting both to GNU gmp's data type mpz, then you are using two covariant projection functions and a single global non-polymorphic comparison or addition function.
This is not a counter example, it is a non-solution of the problem, which is to represent a relation or method which is covariant in at least two variables (at least self and other).
You may think you could solve this with:
struct MyComparable : IComparableObject {
int CompareTo(MyComparable &other) { .. }
};
After all you can implement this interface because you know the representation of other now, since it is MyComparable.
Do not laugh at this solution, because it is exactly what Bertrand Meyer did in Eiffel, and it is what many people do in C++ with a small change to try to work around the fact it isn't type safe and doesn't actually override the base-class function:
struct MyComparable : IComparableObject {
    int CompareTo(IComparableObject &other) {
        try {
            MyComparable &sibling = dynamic_cast<MyComparable &>(other);
            ...
        } catch (...) { return 0; }
    }
};
This isn't a solution. It says that two things aren't equal just because they have different representations. That does not meet the requirement, which is to compare two things in the abstract. Two numbers, for example, cannot fail to be equal just because the representation used is different: zero equals zero, even if one is an mpz and the other an int. Remember: the idea is to properly represent an abstraction, and that means the behaviour must depend only on the abstract value, not the details of a particular implementation.
Some people have tried double dispatch. Clearly, that cannot work either. There is no possible escape from the basic issue here: you cannot stuff a square into a line.
virtual function dispatch is linear, second order problems are quadratic, so OO cannot represent second order problems.
Now I want to be very clear here that C++ and other statically typed OO languages are broken, not because they can't solve this problem, because it cannot be solved, and it isn't a problem: it's a simple fact. The reason these languages and the OO paradigm in general are broken is because they promise to deliver general abstractions and then fail to do so. In the case of C++ this is the promise:
struct IComparableObject { virtual int CompareTo(IComparableObject obj)=0; };
and here is where the implicit contract is broken:
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { throw 00; }
};
because the implementation I gave there is effectively the only possible one.
Well before leaving, you may ask: What is the right way (TM).
The answer is: use functional programming. In C++ that means templates.
template<class T, class U> int compare(T,U);
So if you have N types to compare, and you actually compare all combinations, then yes indeed you have to provide N^2 specialisations. Which shows templates deliver on the promise, at least in this respect. It's a fact: you can't dispatch at run time over an open set of types if the function is variant in more than one parameter.
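To make the N^2 count concrete, here is a sketch with N = 2 hypothetical representations, Int32 and Real, and therefore four specialisations of the compare template. The types and the sign helper are assumptions for illustration:

```cpp
#include <cassert>

// Two hypothetical number representations (N = 2).
struct Int32 { int v; };
struct Real { double v; };

// Primary template: declared but never defined, so an unsupported
// pair of types fails at link time instead of running.
template <class T, class U> int compare(T, U);

int sign(double d) { return (d > 0) - (d < 0); }

// N^2 = 4 specialisations, one per ordered pair of representations.
template <> int compare<Int32, Int32>(Int32 a, Int32 b) {
    return (a.v > b.v) - (a.v < b.v);
}
template <> int compare<Int32, Real>(Int32 a, Real b) {
    return sign(a.v - b.v);
}
template <> int compare<Real, Int32>(Real a, Int32 b) {
    return -compare(b, a);  // reuse the mirrored pair
}
template <> int compare<Real, Real>(Real a, Real b) {
    return sign(a.v - b.v);
}

void usage() {
    assert(compare(Int32{2}, Real{2.5}) == -1);
    assert(compare(Real{3.0}, Int32{3}) == 0);
}
```

The pair of types is resolved at compile time, so this only helps when both static types are known at the call site.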
BTW: in case you aren't convinced by theory .. just go look at the ISO C++ Standard library and see how much virtual function polymorphism is used there, compared to functional programming with templates..
Finally please note carefully that I am not saying classes and such like are useless, I use virtual function polymorphism myself: I'm saying that this is limited to particular problems and not a general way to represent abstractions, and therefore not worthy of being called a paradigm.
From C++03, §10.4/3:
An abstract class shall not be used as a parameter type, as a function return type, or as the type of an explicit conversion. Pointers and references to an abstract class can be declared.
Passing obj as a const reference is allowed.
When the CompareTo member function is pure virtual, IComparableObject is an abstract class.
You can't directly copy an object of an abstract class.
When you pass an object by value you're directly copying that object.
Instead of passing by value, you can pass by reference to const.
That is, formal argument type IComparableObject const&.
By the way, the function should probably be declared const so that it can be called on const object.
Also, instead of #pragma once, which is non-standard (but supported by most compilers), consider an ordinary include guard.
Also, when posting code that illustrates a problem, be sure to post exact code. In this case, there's a missing semicolon at the end, indicating manual typing of the code (and so that there could be other typos not so easily identified as such, but instead misidentified as part of your problem). Simply copy and paste real code.
Cheers & hth.,
Is the type check a mere integer comparison? Or would it make sense to have a GetTypeId virtual function to distinguish types, which would make it an integer comparison?
(Just don't want things to be a string comparison on the class names)
EDIT: What I mean is, if I'm often expecting the wrong type, would it make sense to use something like:
struct Token
{
    enum {
        AND,
        OR,
        IF
    };
    virtual std::size_t GetTokenId() = 0;
};
struct AndToken : public Token
{
    std::size_t GetTokenId() { return AND; }
};
And use the GetTokenId member instead of relying on dynamic_cast.
The functionality of the dynamic_cast goes far beyond a simple type check. If it was just a type check, it would be very easy to implement (something like what you have in your original post).
In addition to type checking, dynamic_cast can perform casts to void * and hierarchical cross-casts. These kinds of casts conceptually require some ability to traverse class hierarchy in both directions (up and down). The data structures needed to support such casts are more complicated than a mere scalar type id. The information the dynamic_cast is using is a part of RTTI.
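A cross-cast can be sketched with two unrelated hypothetical bases, Drawable and Clickable. A single integer id per class would not be enough here, because the cast has to find a sibling subobject inside the complete object:

```cpp
#include <cassert>

struct Drawable { virtual ~Drawable() = default; };
struct Clickable { virtual ~Clickable() = default; };
struct Button : Drawable, Clickable {};
struct Label : Drawable {};

// Cross-cast: Drawable and Clickable are unrelated, yet the cast
// succeeds whenever the dynamic type derives from both.
Clickable* asClickable(Drawable* d) {
    return dynamic_cast<Clickable*>(d);
}

void usage() {
    Button button;
    Label label;
    assert(asClickable(&button) == static_cast<Clickable*>(&button));
    assert(asClickable(&label) == nullptr);  // a Label is not Clickable
}
```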
Trying to describe it here would be counterproductive. I used to have a good link that described one possible implementation of RTTI... will try to find it.
I don't know the exact implementation, but here is an idea how I would do it:
Casting from Derived* to Base* can be done in compile time. Casting between two unrelated polymorphic types can be done in compile time too (just return NULL).
Casting from Base* to Derived* needs to be done in run-time, because multiple derived classes are possible. The identification of dynamic type can be done using the virtual method table bound to the object (that's why it requires polymorphic classes).
This VMT probably contains extra information about the base classes and their data offsets. These data offsets are relevant when multiple inheritance is involved and are added to the source pointer to make it point to the right location.
If the desired type was not found among the base classes, dynamic_cast would return null.
In some of the original compilers, you are correct: they used string comparison.
As a result, dynamic_cast<> was very slow (relatively speaking): as the class hierarchy was traversed, each step up or down the hierarchy chain required a string compare against the class name.
This led a lot of people to develop their own casting techniques. This was nearly always ultimately futile, as it required each class to be annotated correctly, and when things went wrong it was nearly impossible to trace the error.
But that is also ancient history.
I am not sure how it is done now but it definitely does not involve string comparison. Doing it yourself is also a bad idea (never do work that the compiler is already doing). Any attempt you make will not be as fast or as accurate as the compiler, remember that years of development have gone into making the compiler code as quick as possible (and it will always be correct).
The compiler cannot divine additional information you may have and stick it in dynamic_cast. If you know certain invariants about your code and you can show that your manual casting mechanism is faster, do it yourself. It doesn't really matter how dynamic_cast is implemented in that case.
Why is RTTI (Runtime Type Information) necessary?
RTTI, Run-Time Type Information, introduces a [mild] form of reflection for C++.
It makes it possible to know, for example, the type of a derived class, and hence to handle a heterogeneous collection of objects that all derive from the same base type in ways that are specific to the individual derived classes. (Say you have an array of "Vehicle" objects and need to deal differently with the "Truck" objects found amid the array.)
The question whether RTTI is necessary is, however, an open one. Story has it that Bjarne Stroustrup purposefully excluded this feature from the original C++ specification, for fear that it would be misused.
There are indeed opportunities to overuse/misuse reflection features, and this may have been even more of a factor when C++ was initially introduced because there wasn't such an OOP culture in the mainstream programmer community.
This said, with a more OOP savvy community, with the effective demonstration of all the good things reflection can do (eg. with languages such as Java or C#) and with the fancy design patterns in use nowadays, I strongly believe that RTTI and reflection features at large are very important even if sometimes misused.
I can think of exactly one case when it would be appropriate to use RTTI, and it doesn't even work.
It is fairly common for C-compatible APIs which perform callbacks to provide a user-defined void* to communicate a state structure back to the caller. When calling such an API from C++, it is quite common to pass the this pointer through said void* argument. From the callback, one might want to invoke virtual functions on the passed pointer.
In some cases when the callback parameters are insecure (such as LPARAM of a Windows message), it is obviously desirable to validate the pointer before using it for a virtual call, by checking the hidden vfptr. dynamic_cast is the natural way to do this, but results in undefined behavior exactly when the object is invalid (IIRC, it is undefined behavior if the pointer is to anything except an object with a virtual table). So RTTI is utterly useless for preventing a shatter attack in this way.
Feel free to present any other valid use cases for RTTI, cause I'm totally unconvinced.
EDIT: boost::any got mentioned. As far as boost::any is concerned, you can disable RTTI and use the following typeid implementation:
typedef const void* typeinfo_nonrtti;
template <typename T> typeinfo_nonrtti typeid_nonrtti();
template <typename T> class typeinfo_nonrtti_helper
{
    friend typeinfo_nonrtti typeid_nonrtti<T>();
    static char unique;
};
template <typename T> char typeinfo_nonrtti_helper<T>::unique;
template <typename T>
typeinfo_nonrtti typeid_nonrtti() { return &typeinfo_nonrtti_helper<T>::unique; }
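To show how such an id might be used, the sketch below repeats the definitions above so that it compiles on its own and adds a hypothetical TaggedPtr with tag/untag helpers. Those three names are assumptions, not part of boost::any:

```cpp
#include <cassert>

typedef const void* typeinfo_nonrtti;

template <typename T> typeinfo_nonrtti typeid_nonrtti();

template <typename T> class typeinfo_nonrtti_helper {
    friend typeinfo_nonrtti typeid_nonrtti<T>();
    static char unique;  // one distinct object per T
};
template <typename T> char typeinfo_nonrtti_helper<T>::unique;

// The address of the per-type static serves as the type id.
template <typename T>
typeinfo_nonrtti typeid_nonrtti() { return &typeinfo_nonrtti_helper<T>::unique; }

// Hypothetical helper: an untyped pointer paired with its type id.
struct TaggedPtr { void* ptr; typeinfo_nonrtti type; };

template <typename T> TaggedPtr tag(T* p) {
    return TaggedPtr{p, typeid_nonrtti<T>()};
}

// Returns nullptr unless the stored id matches T exactly.
template <typename T> T* untag(const TaggedPtr& t) {
    return t.type == typeid_nonrtti<T>() ? static_cast<T*>(t.ptr) : nullptr;
}

void usage() {
    int i = 1;
    TaggedPtr t = tag(&i);
    assert(untag<int>(t) == &i);
    assert(untag<double>(t) == nullptr);
}
```

The check is an exact match on the static type, so unlike dynamic_cast it knows nothing about base and derived classes.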
I'm in the process of creating a class that stores metadata about a particular data source. The metadata is structured in a tree, very similar to how XML is structured. The metadata values can be integer, decimal, or string values.
I'm curious if there is a good way in C++ to store variant data for a situation like this. I'd like for the variant to use standard libraries, so I'm avoiding the COM, Ole, and SQL VARIANT types that are available.
My current solution looks something like this:
enum MetaValueType
{
    MetaChar,
    MetaString,
    MetaShort,
    MetaInt,
    MetaFloat,
    MetaDouble
};
union MetaUnion
{
    char cValue;
    short sValue;
    int iValue;
    float fValue;
    double dValue;
};
class MetaValue
{
    ...
private:
    MetaValueType ValueType;
    std::string StringValue;
    MetaUnion VariantValue;
};
The MetaValue class has various Get functions for obtaining the currently stored variant value, but it ends up making every query for a value a big block of if/else if statements to figure out which value I'm looking for.
I've also explored storing the value as only a string, and performing conversions to get different variant types out, but as far as I've seen this leads to a bunch of internal string parsing and error handling which isn't pretty, opens up a big old can of precision and data loss issues with floating point values, and still doesn't eliminate the query if/else if issue stated above.
Has anybody implemented or seen something that's cleaner to use for a C++ variant data type using standard libraries?
As of C++17, there’s std::variant.
If you can’t use that yet, you might want Boost.Variant. A similar, but distinct, type for modelling polymorphism is provided by std::any (and, pre-C++17, Boost.Any).
Just as an additional pointer, you can look for “type erasure”.
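A minimal C++17 sketch of how the metadata value could look with std::variant. The MetaValue alias, its three alternatives and the toText helper are assumptions chosen for illustration:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical alias: integer, decimal or string metadata.
using MetaValue = std::variant<int, double, std::string>;

// One visitor replaces the if/else chain over a type enum.
std::string toText(const MetaValue& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, std::string>) return x;
        else return std::to_string(x);
    }, v);
}

void usage() {
    MetaValue v = 42;
    assert(std::holds_alternative<int>(v));
    assert(toText(v) == "42");
    v = std::string("abc");
    assert(std::get_if<int>(&v) == nullptr);  // no longer an int
    assert(toText(v) == "abc");
}
```

std::get_if returns a null pointer on a type mismatch, while std::get throws std::bad_variant_access, so either style of error handling is available.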
While Konrad's answer (using an existing standardized solution) is certainly preferable to writing your own bug-prone version, the boost variant has some overheads, especially in copy construction and memory.
A common customized approach is the following modified Factory Pattern:
Create a Base interface for a generic object that also encapsulates the object type, either as an enum or using 'typeid' (preferable).
Now implement the interface using a template Derived class.
Create a factory class with a templatized create function with signature:
template <typename _T> Base * Factory::create ();
This internally creates a Derived<_T> object on the heap, and returns a dynamic cast pointer. Specialize this for each class you want implemented.
Finally, define a Variant wrapper that contains this Base * pointer and defines template get and set functions. Utility functions like getType(), isEmpty(), assignment and equality operators, etc can be appropriately implemented here.
Depending on the utility functions and the factory implementation, supported classes will need to support some basic functions like assignment or copy construction.
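The steps above can be sketched roughly as follows. This is a simplified take under stated assumptions: the factory step is folded into Variant::set, typeid is used for the type tag, and the clone() member is an addition needed for copying. Only the names Base, Derived and Variant come from the description:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

struct Base {
    virtual ~Base() = default;
    virtual const std::type_info& type() const = 0;
    virtual std::unique_ptr<Base> clone() const = 0;  // assumed, for copies
};

template <typename T>
struct Derived : Base {
    explicit Derived(T v) : value(std::move(v)) {}
    const std::type_info& type() const override { return typeid(T); }
    std::unique_ptr<Base> clone() const override {
        return std::make_unique<Derived<T>>(value);
    }
    T value;
};

class Variant {
public:
    Variant() = default;
    Variant(const Variant& o) { if (o.held_) held_ = o.held_->clone(); }
    Variant& operator=(Variant o) { held_ = std::move(o.held_); return *this; }

    // Plays the role of the factory: heap-allocates a Derived<T>.
    template <typename T> void set(T v) {
        held_ = std::make_unique<Derived<T>>(std::move(v));
    }
    // Returns nullptr when empty or when another type is stored.
    template <typename T> const T* get() const {
        if (!held_ || held_->type() != typeid(T)) return nullptr;
        return &static_cast<const Derived<T>&>(*held_).value;
    }
    bool isEmpty() const { return !held_; }
private:
    std::unique_ptr<Base> held_;
};

void usage() {
    Variant v;
    v.set(42);
    assert(*v.get<int>() == 42);
    assert(v.get<double>() == nullptr);
}
```

Stored types need to be copy constructible here, which matches the remark that supported classes must provide some basic operations.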
You can also go down to a more C-ish solution, which would have a void* the size of a double on your system, plus an enum for which type you're using. It's reasonably clean, but definitely a solution for someone who feels wholly comfortable with the raw bytes of the system.
C++17 now has std::variant which is exactly what you're looking for.
std::variant
The class template std::variant represents a type-safe union. An
instance of std::variant at any given time either holds a value of one
of its alternative types, or in the case of error - no value (this
state is hard to achieve, see valueless_by_exception).
As with unions, if a variant holds a value of some object type T, the
object representation of T is allocated directly within the object
representation of the variant itself. Variant is not allowed to
allocate additional (dynamic) memory.
Although the question had been answered for a long time, for the record I would like to mention that QVariant in the Qt libraries also does this.
Because C++ forbids unions from including types that have non-default
constructors or destructors, most interesting Qt classes cannot be
used in unions. Without QVariant, this would be a problem for
QObject::property() and for database work, etc.
A QVariant object holds a single value of a single type() at a time.
(Some type()s are multi-valued, for example a string list.) You can
find out what type, T, the variant holds, convert it to a different
type using convert(), get its value using one of the toT() functions
(e.g., toSize()) and check whether the type can be converted to a
particular type using canConvert().