Is there a legitimate use for void*? - c++

Is there a legitimate use of void* in C++? Or was this introduced because C had it?
Just to recap my thoughts:
Input: If we want to allow multiple input types we can overload functions and methods; alternatively we can define a common base class or a template (thanks for mentioning this in the answers). In both cases the code gets more descriptive and less error-prone (provided the base class is implemented in a sane way).
Output: I can't think of any situation where I would prefer to receive void* as opposed to something derived from a known base class.
Just to make it clear what I mean: I'm not specifically asking if there is a use-case for void*, but if there is a case where void* is the best or only available choice. Which has been perfectly answered by several people below.

void* is at least necessary as the result of ::operator new (also every operator new...) and of malloc and as the argument of the placement new operator.
void* can be thought of as the common supertype of every pointer type. So it does not exactly mean pointer to void, but pointer to anything.
BTW, if you wanted to keep some data for several unrelated global variables, you might use some std::map<void*,int> score; then, after having declared global int x; and double y; and std::string s; do score[&x]=1; and score[&y]=2; and score[&s]=3;
memset wants a void* address (the most generic ones)
Also, POSIX systems have dlsym and its return type evidently should be void*
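As a minimal sketch of the first point, the snippet below obtains raw storage from ::operator new as a void* and hands it to placement new. The function name construct_in_raw_storage and the alias String are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // ::operator new, placement new
#include <string>

// Hypothetical helper: builds a std::string in raw storage and returns its length.
std::size_t construct_in_raw_storage() {
    using String = std::string;

    // ::operator new returns void*: the storage has no type yet.
    void* raw = ::operator new(sizeof(String));

    // Placement new takes that void* and starts an object's lifetime there.
    String* s = new (raw) String("hello");
    std::size_t length = s->size();

    // Tear down in reverse order: destroy the object, then free the storage.
    s->~String();
    ::operator delete(raw);
    return length;
}
```

Calling construct_in_raw_storage() returns 5, the length of "hello".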

There are multiple reasons to use void*, the 3 most common being:
interacting with a C library using void* in its interface
type-erasure
denoting un-typed memory
In reverse order, denoting un-typed memory with void* (3) instead of char* (or variants) helps prevent accidental pointer arithmetic; there are very few operations available on void*, so it usually requires casting before being useful. And of course, much like with char*, there is no issue with aliasing.
Type-erasure (2) is still used in C++, in conjunction with templates or not:
non-generic code helps reduce binary bloat; it's useful in cold paths even in generic code
non-generic code is necessary for storage sometimes, even in generic container such as std::function
And obviously, when the interface you deal with uses void* (1), you have little choice.
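Case (1) can be illustrated with std::qsort from the C library, whose comparison callback receives its arguments as const void*. This is only a sketch; compare_ints and sort_ints are names invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>  // std::qsort

// Callback in the shape qsort demands: both elements arrive as const void*.
int compare_ints(const void* lhs, const void* rhs) {
    const int a = *static_cast<const int*>(lhs);
    const int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);
}

// Sorts the array in place through the C interface.
void sort_ints(int* values, std::size_t count) {
    std::qsort(values, count, sizeof(int), compare_ints);
}
```

The cast back to const int* inside the callback is the price of the untyped interface: nothing stops a caller from passing an array of some other element type.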

Oh yes. Even in C++ we sometimes go with void * rather than template<class T> because the extra code from the template expansion weighs too much.
Commonly I would use it as the actual implementation of the type, and the template type would inherit from it and wrap the casts.
Also, custom slab allocators (operator new implementations) must use void *. This is one of the reasons why g++ added an extension permitting pointer arithmetic on void * as though it were of size 1.
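The "implementation on void *, thin template on top" pattern described above might look roughly like this. PtrStackImpl and PtrStack are hypothetical names, and the sketch assumes T is a non-const object type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template core: compiled once, stores untyped pointers.
class PtrStackImpl {
public:
    void push(void* p) { items_.push_back(p); }
    void* pop() {
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<void*> items_;
};

// Thin typed wrapper: only the casts are instantiated per T.
template <class T>
class PtrStack : private PtrStackImpl {
public:
    void push(T* p) { PtrStackImpl::push(p); }
    T* pop() { return static_cast<T*>(PtrStackImpl::pop()); }
    using PtrStackImpl::size;
};
```

Because the base is inherited privately, users can only reach the typed push and pop, so every void* that comes back out was put in as a T*.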

Input: If we want to allow multiple input types we can overload
functions and methods
True.
alternatively we can define a common base
class.
This is partially true: what if you can't define a common base class, an interface or similar? To define those you need to have access to the source code, which is often not possible.
You didn't mention templates. However, templates cannot help you with polymorphism: they work with static types i.e. known at compile time.
void* may be considered the lowest common denominator. In C++, you typically don't need it because (i) you can't inherently do much with it and (ii) there are almost always better solutions.
Furthermore, you will typically end up converting it to some other concrete type. That's why char * is often used instead, although it may suggest that you're expecting a C-style string rather than a pure block of data. For a pure block of data void* is better than char*, because other object pointer types convert to it implicitly.
You're supposed to receive some data, work with it and produce an output; to achieve that, you need to know the data you're working with, otherwise you have a different problem which is not the one you were originally solving. Many languages don't have void* and have no problem with that, for instance.
Another legitimate use
When printing pointer addresses with functions like printf the pointer shall have void* type and, therefore, you may need a cast to void*
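A small sketch of that cast, using std::snprintf so the result can be inspected; format_address is a name made up for this example, and the exact text produced by %p is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes the address of *value into buffer and returns what snprintf reports.
int format_address(int* value, char* buffer, std::size_t size) {
    // %p expects a void*; arguments of a variadic call are not converted
    // automatically, so the cast is spelled out.
    return std::snprintf(buffer, size, "%p", static_cast<void*>(value));
}
```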

Yes, it is as useful as any other thing in the language.
As an example, you can use it to erase the type of a class that you are able to statically cast to the right type when needed, in order to have a minimal and flexible interface.
In that response there is an example of use that should give you an idea.
I copy and paste it below for the sake of clarity:
class Dispatcher {
    Dispatcher() { }
    // Restores the erased type and calls the member function on the instance.
    template<class C, void(C::*M)() = &C::receive>
    static void invoke(void *instance) {
        (static_cast<C*>(instance)->*M)();
    }
public:
    template<class C, void(C::*M)() = &C::receive>
    static Dispatcher create(C *instance) {
        Dispatcher d;
        d.fn = &invoke<C, M>;
        d.instance = instance;
        return d;
    }
    void operator()() {
        (fn)(instance);
    }
private:
    using Fn = void(*)(void *);
    Fn fn;
    void *instance;
};
Obviously, this is only one of the bunch of uses of void*.

Interfacing with an external library function which returns a pointer. Here is one for an Ada application.
extern "C" { void* ada_function();}
void* m_status_ptr = ada_function();
This returns a pointer to whatever it was Ada wanted to tell you about. You don't have to do anything fancy with it, you can give it back to Ada to do the next thing.
In fact disentangling an Ada pointer in C++ is non-trivial.

In short, C++ as a strict language (not taking into account C relics like malloc()) requires void* since it has no common parent of all possible types. Unlike ObjC, for example, which has object.

The first thing that occurs to my mind (which I suspect is a concrete case of a couple of the answers above) is the capability to pass an object instance to a threadproc in Windows.
I've got a couple of C++ classes which need to do this: they have worker thread implementations, and the thread procedure passed to the CreateThread() API is a static method of the class, while the LPVOID parameter carries the address of a specific instance so the worker thread can do its work on that instance. A simple static_cast back in the threadproc yields the instance to work with, allowing each instantiated object to have a worker thread from a single static method implementation.

In case of multiple inheritance, if you need to get a pointer to the first byte of a memory chunk occupied by an object, you may dynamic_cast to void*.
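A minimal sketch of that cast, with made-up types Left, Right and Both. Note that dynamic_cast to void* requires a polymorphic type, hence the virtual destructors.

```cpp
#include <cassert>

struct Left  { virtual ~Left() {}  int l = 1; };
struct Right { virtual ~Right() {} int r = 2; };
struct Both : Left, Right { int b = 3; };

// Returns the address of the most-derived object that *r is part of.
const void* most_derived_address(const Right* r) {
    return dynamic_cast<const void*>(r);
}
```

Given Both obj; and Right* r = &obj;, the call most_derived_address(r) yields the address of obj itself, even though r may point somewhere inside it.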

Related

Void pointer in C++

I learned that templates are the void* equivalents in C++. Is this true?
I have this issue in polling "events" off some procedure, when I have an EventType variable and I may also need to pass raw data that is related to that event.
struct WindowEvent {
enum Type {WINDOW_SIZE_CHANGE, USER_CLICK, ...};
void* data;
};
The user may then cast data to the necessary type, depending on the event type.
Is this approach okay in C++? Are there any better approaches?
In C, which generally lacks support for polymorphism, void* pointers can be used to accept data of any type, along with some run-time representation of the actual type, or just knowledge that the data will be cast back to the correct type.
In C++ and other languages with support for polymorphism one will generally instead use either dynamic polymorphism (classes with virtual functions) or static polymorphism (function overloads and templates).
The main difference is that the C approach yields dynamic (run time) manual type checking, while the C++ approaches yield mostly static (compile time) and fully automated type checking. This means less time spent on testing and hunting down silly easily preventable bugs. The cost is more verbose code, and that means that there's a code size offset somewhere, under which the C approach probably rules for productivity, and above which the C++ approaches rule.
"I learned that templates are the void* equivalents in C++. Is this true?"
No - Templates maintain type safety
"Is this approach okay in C++?"
No
"Are there any better approaches?"
Depending on the use case one could use (for example)
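class EventData {
public:
    // Needed so derived event data can be destroyed through an EventData pointer.
    virtual ~EventData() {}
    virtual int getData() = 0;
};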
And then use the appropriate inherited class. Perhaps using smart pointers.
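One way this could be fleshed out is sketched below. The derived classes SizeChange and UserClick and the describe() member are assumptions made for the example, not part of the question's code.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Base class for event payloads; the virtual destructor makes deletion
// through a base pointer safe.
class EventData {
public:
    virtual ~EventData() = default;
    virtual std::string describe() const = 0;
};

class SizeChange : public EventData {
public:
    SizeChange(int w, int h) : width_(w), height_(h) {}
    std::string describe() const override {
        return "resize to " + std::to_string(width_) + "x" + std::to_string(height_);
    }
private:
    int width_, height_;
};

class UserClick : public EventData {
public:
    UserClick(int x, int y) : x_(x), y_(y) {}
    std::string describe() const override {
        return "click at " + std::to_string(x_) + "," + std::to_string(y_);
    }
private:
    int x_, y_;
};

struct WindowEvent {
    std::unique_ptr<EventData> data;  // owning, typed replacement for void* data
};
```

The consumer calls data->describe() and never needs to know which event type it holds, so there is no cast to get wrong.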

Is there a way to have a function return a type?

I have a data class (struct actually) two variables: a void pointer and a string containing the type of the object being pointed to.
struct data{
void* index;
std::string type;
data(): index(0), type("null"){}
data(void* index, std::string type): index(index), type(type){}};
Now I need to use the object being pointed to, by casting the void pointer to a type that is specified by the string, so I thought of using an std::map with strings and functions.
std::unordered_map<std::string, function> cast;
The problem is that the functions must always have the exact same return-type and can't return a type itself.
Edit:
Because I use the data class as a return-type and as arguments, templates won't suffice.
(also added some code to show what I mean)
data somefunction(data a){
//do stuff
return data();}
Currently, I use functions like this to do the trick, but I thought it could be done more easily:
void functionforstring(data a){
dynamic_cast<string*>(data.index)->function();}
Neither thing is possible in C++:
Functions cannot return types (that is to say, types are not values).
Code cannot operate on objects whose type it doesn't know at compile-time (that is to say, C++ is statically typed). Of course there is dynamic polymorphism via virtual functions, but even with that, the type of the pointer you use to call them is known at compile time by the calling code.
So the operation you want, "convert to the pointer type indicated by a string" is not possible. If it were possible, then the result would be a pointer whose type is not known at compile time, and that cannot be.
There's nothing you could do with this "pointer of type unknown at compile time", that you can't do using the void* you started with. void* pretty much already is what C++ has in place of a pointer to unknown type.
While it's not possible to return a type from a function, you could use typeid to get information about the object, and use the string returned by typeid(*obj).name() as an argument to your constructor.
Keep in mind that this string would be implementation defined, so you would have to generate this string at runtime for every type that you might possibly use in the program in order to make your unordered_map useful.
There is almost certainly a much simpler and more idiomatic way to accomplish your goal in C++, however. Perhaps if you explained more about the goals of the program, someone might be able to suggest an alternative approach.

C++ Abstract class can't have a method with a parameter of that class

I created this .h file
#pragma once
namespace Core
{
class IComparableObject
{
public:
virtual int CompareTo(IComparableObject obj)=0;
};
}
But compiler doesn't like IComparableObject obj param if the method is virtual pure, while
virtual int CompareTo(IComparableObject obj) {}
It's ok, however I want it as virtual pure. How can I manage to do it? Is it possible?
You are trying to pass obj by value. You cannot pass an abstract class instance by value, because no abstract class can ever be instantiated (directly). To do what you want, you have to pass obj by reference, for example like so:
virtual int CompareTo(IComparableObject const &obj)=0;
It works when you give an implementation for CompareTo because then the class is not abstract any longer. But be aware that slicing occurs! You don't want to pass obj by value.
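A sketch of the by-reference version with one implementing class follows. Number is a hypothetical class, and returning -1 for an argument of a different concrete type is an arbitrary choice made for this example only.

```cpp
#include <cassert>

class IComparableObject {
public:
    virtual ~IComparableObject() = default;
    // Reference parameter: no copy of the abstract type is made, so no slicing.
    virtual int CompareTo(const IComparableObject& obj) const = 0;
};

class Number : public IComparableObject {
public:
    explicit Number(int v) : value_(v) {}
    int CompareTo(const IComparableObject& obj) const override {
        // Assumption of this sketch: a different concrete type compares as -1.
        const Number* other = dynamic_cast<const Number*>(&obj);
        if (!other) return -1;
        return (value_ > other->value_) - (value_ < other->value_);
    }
private:
    int value_;
};
```

How to handle an argument of a different concrete type is a design decision that the interface alone does not settle.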
Well I have to give an unexpected answer here! Dennycrane said you can do this:
virtual int CompareTo(IComparableObject const &obj)=0;
but this is not correct either. Oh yes, it compiles, but it is useless because it can never be implemented correctly.
This issue is fundamental to the collapse of (statically typed) Object Orientation, so it is vital that programmers using OO recognize the issue. The problem has a name, it is called the covariance problem and it destroys OO utterly as a general programming paradigm; that is, a way of representing and independently implementing general abstractions.
This explanation will be a bit long and sloppy so bear with me and try to read between the lines.
First, an abstract class with a pure virtual method taking no arguments can be easily implemented in any derived class, since the method has access to the non-static data variables of the derived class via the this pointer. The this pointer has the type of a pointer to the derived class, and so we can say it varies along with the class, in fact it is covariant with the derived class type.
Let me call this kind of polymorphism first order, it clearly supports dispatching predicates on the object type. Indeed, the return type of such a method may also vary down with the object and class type, that is, the return type is covariant.
Now, I will generalise the idea of a method with no arguments to allow arbitrary scalar arguments (such as ints) claiming this changes nothing: this is merely a family of methods indexed by the scalar type. The important property here is that the scalar type is closed. In a derived class exactly the same scalar type must be used. in other words, the type is invariant.
General introduction of invariant parameters to a virtual function still permits polymorphism, but the result is still first order.
Unfortunately, such functions have limited utility, although they are very useful when the abstraction is only first order: a good example is device drivers.
But what if you want to model something which is actually interesting, that is, it is at least a relation?
The answer to this is: you cannot do it. This is a mathematical fact and has nothing to do with the programming language involved. Let's suppose you have an abstraction for, say, numbers, and you want to add one number to another number, or compare them (as in the OP's example). Ignoring symmetry, if you have N implementations, you will have to write N^2 functions to perform the operations. If you add a new implementation of the abstraction, you have to write N+1 new functions.
Now, I have the first proof that OO is screwed: you cannot fit N^2 methods into a virtual dispatch schema because such a schema is linear. N classes gives you N methods you can implement and for N>1, N^2 > N, so OO is screwed, QED.
In a C++ context you can see the problem: consider :
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { .. }
};
Arggg! We're screwed! We can't fill in the .. part here because we only have a reference to an abstraction, which has no data in it to compare to. Of course this must be the case, because there are an open/indeterminate/infinite number of possible implementations. There's no possible way to write a single comparison routine as an axiom.
Of course, if you have various property routines, or a common universal representation you can do it, but this does not count, because then the mapping to the universal representation is parameterless and thus the abstraction is only first order. For example if you have various integer representations and you add them by converting both to GNU gmp's data type mpz, then you are using two covariant projection functions and a single global non-polymorphic comparison or addition function.
This is not a counter example, it is a non-solution of the problem, which is to represent a relation or method which is covariant in at least two variables (at least self and other).
You may think you could solve this with:
struct MyComparable : IComparableObject {
int CompareTo(MyComparable &other) { .. }
};
After all you can implement this interface because you know the representation of other now, since it is MyComparable.
Do not laugh at this solution, because it is exactly what Bertrand Meyer did in Eiffel, and it is what many people do in C++ with a small change to try to work around the fact it isn't type safe and doesn't actually override the base-class function:
struct MyComparable : IComparableObject {
    int CompareTo(IComparableObject &other) {
        try {
            // dynamic_cast to a reference type throws std::bad_cast on failure
            MyComparable &sibling = dynamic_cast<MyComparable&>(other);
            ...
        } catch (...) { return 0; }
    }
};
This isn't a solution. It says that two things aren't equal just because they have different representations. That does not meet the requirement, which is to compare two things in the abstract. Two numbers, for example, cannot fail to be equal just because the representation used is different: zero equals zero, even if one is an mpz and the other an int. Remember: the idea is to properly represent an abstraction, and that means the behaviour must depend only on the abstract value, not the details of a particular implementation.
Some people have tried double dispatch. Clearly, that cannot work either. There is no possible escape from the basic issue here: you cannot stuff a square into a line.
virtual function dispatch is linear, second order problems are quadratic, so OO cannot represent second order problems.
Now I want to be very clear here that C++ and other statically typed OO languages are broken, not because they can't solve this problem, because it cannot be solved, and it isn't a problem: it's a simple fact. The reason these languages and the OO paradigm in general are broken is because they promise to deliver general abstractions and then fail to do so. In the case of C++ this is the promise:
struct IComparableObject { virtual int CompareTo(IComparableObject &obj)=0; };
and here is where the implicit contract is broken:
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { throw 00; }
};
because the implementation I gave there is effectively the only possible one.
Well before leaving, you may ask: What is the right way (TM).
The answer is: use functional programming. In C++ that means templates.
template<class T, class U> int compare(T,U);
So if you have N types to compare, and you actually compare all combinations, then yes indeed you have to provide N^2 specialisations. Which shows templates deliver on the promise, at least in this respect. It's a fact: you can't dispatch at run time over an open set of types if the function is variant in more than one parameter.
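For N = 2 this could be sketched as follows. SmallInt and BigInt are hypothetical number representations invented for the example, and the primary template is deliberately left undefined so that an unsupported pair fails to link.

```cpp
#include <cassert>

// Hypothetical number representations, used only for illustration.
struct SmallInt { int value; };
struct BigInt   { long long value; };

// Primary template: declared only, so an unsupported pair fails at link time.
template <class T, class U> int compare(T, U);

// N = 2 types, so N^2 = 4 explicit specialisations.
template <> int compare<SmallInt, SmallInt>(SmallInt a, SmallInt b) {
    return (a.value > b.value) - (a.value < b.value);
}
template <> int compare<SmallInt, BigInt>(SmallInt a, BigInt b) {
    return (a.value > b.value) - (a.value < b.value);
}
template <> int compare<BigInt, SmallInt>(BigInt a, SmallInt b) {
    return (a.value > b.value) - (a.value < b.value);
}
template <> int compare<BigInt, BigInt>(BigInt a, BigInt b) {
    return (a.value > b.value) - (a.value < b.value);
}
```

The specialisation is chosen from the static types of both arguments, which is exactly what a single virtual call cannot do.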
BTW: in case you aren't convinced by theory .. just go look at the ISO C++ Standard library and see how much virtual function polymorphism is used there, compared to functional programming with templates..
Finally please note carefully that I am not saying classes and such like are useless, I use virtual function polymorphism myself: I'm saying that this is limited to particular problems and not a general way to represent abstractions, and therefore not worthy of being called a paradigm.
From C++03, §10.4/3:
An abstract class shall not be used as a parameter type, as a function return type, or as the type of an explicit conversion. Pointers and references to an abstract class can be declared.
Passing obj as a const reference is allowed.
When the CompareTo member function is pure virtual, IComparableObject is an abstract class.
You can't directly copy an object of an abstract class.
When you pass an object by value you're directly copying that object.
Instead of passing by value, you can pass by reference to const.
That is, formal argument type IComparableObject const&.
By the way, the function should probably be declared const so that it can be called on a const object.
Also, instead of #pragma once, which is non-standard (but supported by most compilers), consider an ordinary include guard.
Also, when posting code that illustrates a problem, be sure to post exact code. In this case, there's a missing semicolon at the end, indicating manual typing of the code (and so that there could be other typos not so easily identified as such, but instead misidentified as part of your problem). Simply copy and paste real code.

How is dynamic_cast typically implemented?

Is the type check a mere integer comparison? Or would it make sense to have a GetTypeId virtual function to distinguish types, which would make it an integer comparison?
(Just don't want things to be a string comparison on the class names)
EDIT: What I mean is, if I'm often expecting the wrong type, would it make sense to use something like:
struct Token
{
enum {
AND,
OR,
IF
};
virtual std::size_t GetTokenId() = 0;
};
struct AndToken : public Token
{
std::size_t GetTokenId() { return AND; }
};
And use the GetTokenId member instead of relying on dynamic_cast.
The functionality of the dynamic_cast goes far beyond a simple type check. If it was just a type check, it would be very easy to implement (something like what you have in your original post).
In addition to type checking, dynamic_cast can perform casts to void * and hierarchical cross-casts. These kinds of casts conceptually require some ability to traverse class hierarchy in both directions (up and down). The data structures needed to support such casts are more complicated than a mere scalar type id. The information the dynamic_cast is using is a part of RTTI.
Trying to describe it here would be counterproductive. I used to have a good link that described one possible implementation of RTTI... will try to find it.
I don't know the exact implementation, but here is an idea how I would do it:
Casting from Derived* to Base* can be done at compile time. Casting between two unrelated polymorphic types looks like it could be resolved at compile time too (just return NULL), but a more derived class may inherit from both, so such cross-casts still need a run-time check.
Casting from Base* to Derived* needs to be done at run time, because multiple derived classes are possible. The identification of the dynamic type can be done using the virtual method table bound to the object (that's why it requires polymorphic classes).
This VMT probably contains extra information about the base classes and their data offsets. These data offsets are relevant when multiple inheritance is involved and are added to the source pointer to make it point to the right location.
If the desired type was not found among the base classes, dynamic_cast would return null.
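The observable behaviour (not the implementation) can be sketched with a small hierarchy. The types Base, Other, Derived and Joined and both helper functions are made up for the example.

```cpp
#include <cassert>

struct Base  { virtual ~Base() {} };
struct Other { virtual ~Other() {} };
struct Derived : Base {};
struct Joined  : Base, Other {};

// Downcast: succeeds only when the dynamic type is Derived or derives from it.
bool is_derived(Base* b) { return dynamic_cast<Derived*>(b) != nullptr; }

// Cross-cast: Base and Other are statically unrelated, yet the cast succeeds
// when the object's dynamic type inherits from both.
Other* as_other(Base* b) { return dynamic_cast<Other*>(b); }
```

For a Joined object, as_other returns its Other subobject; for a Derived object it returns a null pointer. Which of the two happens is only known at run time.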
In some of the original compilers you are correct they used string comparison.
As a result dynamic_cast<> was very slow (relatively speaking): as the class hierarchy was traversed, each step up or down the hierarchy chain required a string compare against the class name.
This leads to a lot of people developing their own casting techniques. This was nearly always ultimately futile as it required each class to be annotated correctly and when things went wrong it was nearly impossible to trace the error.
But that is also ancient history.
I am not sure how it is done now but it definitely does not involve string comparison. Doing it yourself is also a bad idea (never do work that the compiler is already doing). Any attempt you make will not be as fast or as accurate as the compiler, remember that years of development have gone into making the compiler code as quick as possible (and it will always be correct).
The compiler cannot divine additional information you may have and stick it in dynamic_cast. If you know certain invariants about your code and you can show that your manual casting mechanism is faster, do it yourself. It doesn't really matter how dynamic_cast is implemented in that case.

Why is it not possible to pass a const set<Derived*> as const set<Base*> to a function?

Before this is marked as duplicate, I'm aware of this question, but in my case we are talking about const containers.
I have 2 classes:
class Base { };
class Derived : public Base { };
And a function:
void register_objects(const std::set<Base*> &objects) {}
I would like to invoke this function as:
std::set<Derived*> objs;
register_objects(objs);
The compiler does not accept this. Why not? The set is not modifiable so there is no risk of non-Derived objects being inserted into it. How can I do this in the best way?
Edit:
I understand that now the compiler works in a way that set<Base*> and set<Derived*> are totally unrelated and therefore the function signature is not found. My question now however is: why does the compiler work like this? Would there be any objections to seeing const set<Derived*> as a derivative of const set<Base*>?
The reason the compiler doesn't accept this is that the standard tells it not to.
The reason the standard tells it not to is that the committee did not want to introduce a rule that const MyTemplate<Derived*> is a related type to const MyTemplate<Base*> even though the non-const types are not related. And they certainly didn't want a special rule for std::set, since in general the language does not make special cases for library classes.
The reason the standards committee didn't want to make those types related, is that MyTemplate might not have the semantics of a container. Consider:
template <typename T>
struct MyTemplate {
T *ptr;
};
template<>
struct MyTemplate<Derived*> {
int a;
void foo();
};
template<>
struct MyTemplate<Base*> {
std::set<double> b;
void bar();
};
Then what does it even mean to pass a const MyTemplate<Derived*> as a const MyTemplate<Base*>? The two classes have no member functions in common, and aren't layout-compatible. You'd need a conversion operator between the two, or the compiler would have no idea what to do whether they're const or not. But the way templates are defined in the standard, the compiler has no idea what to do even without the template specializations.
std::set itself could provide a conversion operator, but that would just have to make a copy(*), which you can do yourself easily enough. If there were such a thing as a std::immutable_set, then I think it would be possible to implement that such that a std::immutable_set<Base*> could be constructed from a std::immutable_set<Derived*> just by pointing to the same pImpl. Even so, strange things would happen if you had non-virtual operators overloaded in the derived class - the base container would call the base version, so the conversion might de-order the set if it had a non-default comparator that did anything with the objects themselves instead of their addresses. So the conversion would come with heavy caveats. But anyway, there isn't an immutable_set, and const is not the same thing as immutable.
Also, suppose that Derived is related to Base by virtual or multiple inheritance. Then you can't just reinterpret the address of a Derived as the address of a Base: in most implementations the implicit conversion changes the address. It follows that you can't just batch-convert a structure containing Derived* as a structure containing Base* without copying the structure. But the C++ standard actually allows this to happen for any non-POD class, not just with multiple inheritance. And Derived is non-POD, since it has a base class. So in order to support this change to std::set, the fundamentals of inheritance and struct layout would have to be altered. It's a basic limitation of the C++ language that standard containers cannot be re-interpreted in the way you want, and I'm not aware of any tricks that could make them so without reducing efficiency or portability or both. It's frustrating, but this stuff is difficult.
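The address change is easy to observe with multiple inheritance. In the sketch below First, Second and Derived are hypothetical types. The result is not mandated by the standard, but common implementations place the second non-empty base at a non-zero offset.

```cpp
#include <cassert>

struct First  { int a = 0; };
struct Second { int b = 0; };
struct Derived : First, Second {};

// True when converting Derived* to Second* yields a different address.
bool conversion_adjusts_address(Derived* d) {
    Second* s = d;  // implicit derived-to-base conversion
    return static_cast<void*>(s) != static_cast<void*>(d);
}
```

A container of Derived* therefore cannot simply be reinterpreted as a container of Second*: each element would need its own adjustment.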
Since your code is passing a set by value anyway, you could just make that copy:
std::set<Derived*> objs;
register_objects(std::set<Base*>(objs.begin(), objs.end()));
[Edit: you've changed your code sample not to pass by value. My code still works, and afaik is the best you can do other than refactoring the calling code to use a std::set<Base*> in the first place.]
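Put together as a compilable sketch: register_objects here is a stand-in that merely reports how many objects it received, and register_derived is a helper name invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class Base { public: virtual ~Base() {} };
class Derived : public Base {};

// Stand-in for the question's function; it just reports how many objects it saw.
std::size_t register_objects(const std::set<Base*>& objects) {
    return objects.size();
}

// Builds a temporary std::set<Base*> from the Derived* elements (each pointer
// is converted individually) and passes it along.
std::size_t register_derived(const std::set<Derived*>& objs) {
    return register_objects(std::set<Base*>(objs.begin(), objs.end()));
}
```

The copy costs time and memory proportional to the number of elements, and the temporary set is ordered by the converted Base* values rather than the original Derived* values.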
Writing a wrapper for std::set<Base*> that ensures all elements are Derived*, the way Java generics work, is easier than arranging for the conversion you want to be efficient. So you could do something like:
template<typename T, typename U>
struct MySetWrapper {
// Requirement: std::less is consistent. The default probably is,
// but for all we know there are specializations which aren't.
// User beware.
std::set<T> content;
void insert(U value) { content.insert(value); }
// might need a lot more methods, and for the above to return the right
// type, depending how else objs is used.
};
MySetWrapper<Base*,Derived*> objs;
// insert lots of values
register_objects(objs.content);
(*) Actually, I guess it could copy-on-write, which in the case of a const parameter used in the typical way would mean it never needs to do the copy. But copy-on-write is a bit discredited within STL implementations, and even if it wasn't I doubt the committee would want to mandate such a heavyweight implementation detail.
If your register_objects function receives an argument, it can put/expect any Base subclass in there. That's what its signature says.
It's a violation of the Liskov substitution principle.
This particular problem is also referred to as Covariance. In this case, where your function argument is a constant container, it could be made to work. In case the argument container is mutable, it can't work.
Take a look here first: Is array of derived same as array of base. In your case a set of derived is a totally different container from a set of base, and since no implicit conversion is available between them, the compiler gives an error.
std::set<Base*> and std::set<Derived*> are basically two different types. Though the Base and Derived classes are linked via inheritance, at the template instantiation level they are two different instantiations of set.
Firstly, It seems a bit odd that you aren't passing by reference ...
Secondly, as mentioned in the other post, you would be better off creating the passed-in set as a std::set< Base* > and then newing a Derived class in for each set member.
Your problem surely arises from the fact that the 2 types are completely different. std::set< Derived* > is in no way inherited from std::set< Base* > as far as the compiler is concerned. They are simply 2 different types of set ...
Well, as stated in the question you mention, set<Base*> and set<Derived*> are different types. Your register_objects() function takes a set<Base*> object, so the compiler does not know about any register_objects() that takes set<Derived*>. The constness of the parameter does not change anything. The solutions stated in the quoted question seem to be the best you can do. It depends on what you need to do ...
As you are aware, the two classes are quite similar once you remove the non-const operations. However, in C++ inheritance is a property of types, whereas const is a mere qualifier on top of types. That means that you can't properly state that const X derives from const Y, even when X derives from Y.
Furthermore, if X does not inherit from Y, that applies to all cv-qualified variants of X and Y as well. This extends to std::set instantiations. Since std::set<Foo> does not inherit from std::set<bar>, std::set<Foo> const does not inherit from std::set<bar> const either.
You are quite right that this is logically allowable, but it would require further language features. They are available in C# 4.0, if you're interested in seeing another language's way of doing it. See here: http://community.bartdesmet.net/blogs/bart/archive/2009/04/13/c-4-0-feature-focus-part-4-generic-co-and-contra-variance-for-delegate-and-interface-types.aspx
Didn't see it linked yet, so here's a bullet point in the C++ FAQ Lite related to this:
http://www.parashift.com/c++-faq-lite/proper-inheritance.html#faq-21.3
I think their Bag-of-Apples != Bag-of-Fruit analogy suits the question.