Is there a legitimate use of void* in C++? Or was this introduced because C had it?
Just to recap my thoughts:
Input: If we want to allow multiple input types we can overload functions and methods; alternatively we can define a common base class, or use a template (thanks for mentioning this in the answers). In each case the code gets more descriptive and less error-prone (provided the base class is implemented in a sane way).
Output: I can't think of any situation where I would prefer to receive void* as opposed to something derived from a known base class.
Just to make it clear what I mean: I'm not specifically asking if there is a use-case for void*, but if there is a case where void* is the best or only available choice. Which has been perfectly answered by several people below.
void* is at least necessary as the result of ::operator new (also every operator new...) and of malloc and as the argument of the placement new operator.
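To make that concrete, here is a minimal sketch of raw allocation followed by placement new. The function name build_in_raw_storage is made up for illustration, and the sketch does no cleanup if the constructor throws.

```cpp
#include <new>      // ::operator new, ::operator delete, placement new
#include <string>

// Hypothetical helper: builds a std::string inside untyped storage,
// copies it out, then destroys the object and releases the storage.
std::string build_in_raw_storage(const char* text) {
    using String = std::string;

    void* raw = ::operator new(sizeof(String));  // returns void*: untyped memory
    String* s = new (raw) String(text);          // placement new takes a void*

    String copy = *s;

    s->~String();                                // explicit destructor call
    ::operator delete(raw);                      // hand the raw storage back
    return copy;
}
```

Both halves of the round trip go through void*: the allocation function knows nothing about std::string, and placement new only needs an address.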
void* can be thought of as the common supertype of every object pointer type. So it does not exactly mean pointer to void, but pointer to anything.
BTW, if you wanted to keep some data for several unrelated global variables, you might use some std::map<void*,int> score; then, after having declared global int x; and double y; and std::string s; do score[&x]=1; and score[&y]=2; and score[&s]=3;
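A small sketch of that idea, assuming three globals named x, y and s. The helpers register_scores and score_of are invented here for illustration.

```cpp
#include <map>
#include <string>

// Three unrelated globals, as in the text above.
int x = 0;
double y = 0.0;
std::string s;

// Keyed by address only; the map neither knows nor cares about the types.
std::map<void*, int> score;

// Hypothetical helper: records one score per global.
void register_scores() {
    score[&x] = 1;   // int* converts to void* implicitly
    score[&y] = 2;   // double* too
    score[&s] = 3;   // and std::string*
}

// Hypothetical helper: returns 0 for an address that was never registered.
int score_of(void* p) {
    auto it = score.find(p);
    return it == score.end() ? 0 : it->second;
}
```

The implicit conversion to void* is what lets pointers to unrelated types share one key type.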
memset takes a void* address (the most generic kind of address).
Also, POSIX systems have dlsym and its return type evidently should be void*
There are multiple reasons to use void*, the 3 most common being:
interacting with a C library using void* in its interface
type-erasure
denoting un-typed memory
In reverse order, denoting un-typed memory with void* (3) instead of char* (or variants) helps prevent accidental pointer arithmetic; there are very few operations available on void*, so it usually requires casting before being useful. And of course, much like with char*, there is no issue with aliasing.
Type-erasure (2) is still used in C++, in conjunction with templates or not:
non-generic code helps reduce binary bloat; it's useful in cold paths even in generic code
non-generic code is sometimes necessary for storage, even in generic wrappers such as std::function
And obviously, when the interface you deal with uses void* (1), you have little choice.
Oh yes. Even in C++ sometimes we go with void * rather than a template<class T> taking T*, because sometimes the extra code from the template expansion weighs too much.
Commonly I would use it as the actual implementation of the type, and the template type would inherit from it and wrap the casts.
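A rough sketch of that arrangement. PtrStackImpl and PtrStack are names invented for this example, not from any library, and pop on an empty stack is not handled.

```cpp
#include <cstddef>
#include <vector>

// Untyped implementation: the real logic lives here and works on void*.
class PtrStackImpl {
public:
    std::size_t size() const { return items_.size(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }

    // Precondition (not checked in this sketch): the stack is not empty.
    void* pop_raw() {
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Thin typed wrapper: inherits the implementation and only wraps the casts,
// so each instantiation adds very little code of its own.
template <class T>
class PtrStack : private PtrStackImpl {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PtrStackImpl::size;
};
```

In a real project the PtrStackImpl member functions would usually be defined out of line in a single .cpp file, which is where the saving in generated code comes from.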
Also, custom slab allocators (operator new implementations) must use void *. This is one of the reasons why g++ added an extension permitting pointer arithmetic on void * as though it were of size 1.
Input: If we want to allow multiple input types we can overload functions and methods
True.
alternatively we can define a common base class.
This is partially true: what if you can't define a common base class, an interface or similar? To define those you need to have access to the source code, which is often not possible.
You didn't mention templates. However, templates cannot help you with polymorphism: they work with static types i.e. known at compile time.
void* may be considered the lowest common denominator. In C++, you typically don't need it because (i) you can't inherently do much with it and (ii) there are almost always better solutions.
Furthermore, you will typically end up converting it to some concrete type. That's why char* is often suggested instead, although char* may indicate that you're expecting a C-style string rather than a pure block of data. For a raw block of data void* is better than char*, because other object pointer types convert to it implicitly.
You're supposed to receive some data, work with it and produce an output; to achieve that, you need to know the data you're working with, otherwise you have a different problem which is not the one you were originally solving. Many languages don't have void* and have no problem with that, for instance.
Another legitimate use
When printing pointer addresses with functions like printf the pointer shall have void* type and, therefore, you may need a cast to void*
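For instance, a minimal sketch using %p. The function name address_text is made up for this example, and the exact text produced is implementation-defined.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats the address held in p.
// %p expects a void*, so the int* is cast explicitly; arguments passed
// through a variadic "..." do not get the usual implicit pointer conversion.
std::string address_text(int* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", static_cast<void*>(p));
    return buf;
}
```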
Yes, it is as useful as any other thing in the language.
As an example, you can use it to erase the type of a class that you are able to statically cast to the right type when needed, in order to have a minimal and flexible interface.
In that response there is an example of use that should give you an idea.
I copy and paste it below for the sake of clarity:
class Dispatcher {
    Dispatcher() { }

    // Recovers the erased type and calls the member function M on it.
    template<class C, void(C::*M)() = &C::receive>
    static void invoke(void *instance) {
        (static_cast<C*>(instance)->*M)();
    }

public:
    template<class C, void(C::*M)() = &C::receive>
    static Dispatcher create(C *instance) {
        Dispatcher d;
        d.fn = &invoke<C, M>;
        d.instance = instance;
        return d;
    }

    void operator()() {
        (fn)(instance);
    }

private:
    using Fn = void(*)(void *);
    Fn fn;
    void *instance;
};
Obviously, this is only one of the bunch of uses of void*.
Interfacing with an external library function which returns a pointer. Here is one for an Ada application.
extern "C" { void* ada_function();}
void* m_status_ptr = ada_function();
This returns a pointer to whatever it was Ada wanted to tell you about. You don't have to do anything fancy with it, you can give it back to Ada to do the next thing.
In fact disentangling an Ada pointer in C++ is non-trivial.
In short, C++ as a strict language (not taking into account C relics like malloc()) requires void* since it has no common parent of all possible types. Unlike ObjC, for example, which has object.
The first thing that occurs to my mind (which I suspect is a concrete case of a couple of the answers above) is the capability to pass an object instance to a threadproc in Windows.
I've got a couple of C++ classes which need to do this: they have worker thread implementations, the thread routine passed to the CreateThread() API is a static method of the class, and the LPVOID parameter carries the address of the instance, so the worker thread can do the work with a specific instance of the class. A simple static_cast back in the threadproc yields the instance to work with, allowing each instantiated object to have a worker thread from a single static method implementation.
In the case of multiple inheritance, if you need a pointer to the first byte of the memory occupied by an object, you can dynamic_cast to void*. The class must be polymorphic, and the result points to the most derived object.
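A short sketch of that cast. The types A, B, C and the helper start_of_object are hypothetical.

```cpp
// Two polymorphic bases and a class deriving from both.
struct A { virtual ~A() = default; int a = 0; };
struct B { virtual ~B() = default; int b = 0; };
struct C : A, B { int c = 0; };

// Hypothetical helper: given a pointer to the B subobject, returns the
// address of the complete (most derived) object it belongs to.
const void* start_of_object(const B* p) {
    return dynamic_cast<const void*>(p);
}
```

With a C object viewed through a B*, the B subobject typically does not sit at the start of the object, so the raw pointer value and the result of the cast can differ; the exact layout is up to the implementation.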
When working with a C API that uses C-style inheritance (taking advantage of the standard layout of C structs), such as GLib, we usually use C-style casts to downcast:
struct base_object
{
    int x;
    int y;
    int z;
};
struct derived_object
{
    base_object base;
    int foo;
    int bar;
};
void func(base_object* b)
{
    derived_object* d = (derived_object*) b; /* Downcast */
}
But if we're writing new C++ code that uses a C-API like this, should we continue to use C-style casts, or should we prefer C++ casts? If the latter, what type of C++ casts should we use to emulate C downcasting?
At first, I thought reinterpret_cast would be suitable:
derived_object* d = reinterpret_cast<derived_object*>(b);
However, I'm always wary of reinterpret_cast because the C++ standard guarantees very little about what will happen. It may be safer to use static_cast to void*:
derived_object* d = static_cast<derived_object*>(static_cast<void*>(b));
Of course, this is really cumbersome, making me think it's better to just use C-style casts in this case.
So what is the best practice here?
If you look at the specification for C-style casts in the C++ spec you'll find that cast notation is defined in terms of the other type conversion operators (dynamic_cast, static_cast, reinterpret_cast, const_cast), and in this case reinterpret_cast is used.
Additionally, reinterpret_cast gives more guarantees than is indicated by the answer you link to. The one you care about is:
§ 9.2/20: A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
If you want to use a cast notation I think using the C++ type conversion operators explicitly is best. However rather than littering casts all over the code you should probably write a function for each conversion (implemented using reinterpret_cast) and then use that.
derived_object *downcast_to_derived(base_object *b) {
return reinterpret_cast<derived_object*>(b);
}
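One way to guard such a helper is to state the layout assumptions next to it, so the build fails if someone later changes the structs. This is only a sketch: the structs repeat the ones from the question, and upcast_to_base is a name added here for symmetry.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

struct base_object { int x; int y; int z; };
struct derived_object { base_object base; int foo; int bar; };

// The reinterpret_cast below relies on these two properties.
static_assert(std::is_standard_layout<derived_object>::value,
              "derived_object must be a standard-layout struct");
static_assert(offsetof(derived_object, base) == 0,
              "base must be the initial member of derived_object");

// Only valid when b really points at the base member of a derived_object.
inline derived_object* downcast_to_derived(base_object* b) {
    return reinterpret_cast<derived_object*>(b);
}

// Hypothetical counterpart: going up needs no cast at all.
inline base_object* upcast_to_base(derived_object* d) {
    return &d->base;
}
```

The static_asserts cannot check that a given base_object* really came from a derived_object; that still has to be guaranteed by the caller.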
However, I'm always wary of reinterpret_cast because the C++ standard guarantees very little about what will happen.
C++-style casts are no less safe than C-style casts, because C-style cast is defined in terms of C++-style casts.
5.4.4 The conversions performed by
— a const_cast (5.2.11),
— a static_cast (5.2.9),
— a static_cast followed by a const_cast,
— a reinterpret_cast (5.2.10), or
— a reinterpret_cast followed by a const_cast,
can be performed using the cast notation of explicit type conversion.
[...]
If a conversion can be interpreted in more than one of the ways listed above, the interpretation that
appears first in the list is used, even if a cast resulting from that interpretation is ill-formed.
The sad answer is that you can't avoid casts in code like the code you have written, because the compiler knows very little about the relations between the classes. One way or another, you may want to refactor it (the casts, the classes, or the code that uses them).
The bottom line is:
If you can, use proper inheritance.
If you can't, use reinterpret_cast.
new C++ code that uses a C-API like this
Don't write new C++ code in a C style, it doesn't make use of the C++ language features, and it also forces the user of your wrapper to use this same "C" style. Instead, create a proper C++ class that wraps the C API interface details and hides them behind a C++ class.
should we continue to use C-style casts
No
or should we prefer C++ casts
Yes, but only when you have to.
Use C++ inheritance and virtual accessor functions (probably). Please show how you plan to use the derived object in func, this may provide a better answer for you.
If func expects to use the methods of the derived object, then it should receive a derived object. If it expects to use the methods of a base_object, but the methods are somehow changed because the pointer is to a derived_object, then virtual functions are the C++ way to do this.
Also, you want to pass a reference to func, not a pointer.
dynamic_cast requires certain conditions to be met:
http://www.cplusplus.com/doc/tutorial/typecasting/
If you are just converting struct ptrs to struct ptrs and you know what you want, then static_cast, or reinterpret_cast may be the best?
However, if you truly are interested in writing C++ code, then the casts should be your last and final resort, since there are better patterns. The two common situations I would consider casting are:
You are interfacing with some event passing mechanism that passes a generic base class to an event handler.
You have a container of objects. The container requires it to contain homogeneous types (i.e. every element contains the same "thing"), but you want to store different types in the container.
I think dynamic_cast is exactly what you want.
I've heard that the static_cast function should be preferred to C-style or simple function-style casting. Is this true? Why?
The main reason is that classic C casts make no distinction between what we call static_cast<>(), reinterpret_cast<>(), const_cast<>(), and dynamic_cast<>(). These four things are completely different.
A static_cast<>() is usually safe. There is a valid conversion in the language, or an appropriate constructor that makes it possible. The only time it's a bit risky is when you cast down to an inherited class; you must make sure that the object is actually the descendant that you claim it is, by means external to the language (like a flag in the object). A dynamic_cast<>() is safe as long as the result is checked (pointer) or a possible exception is taken into account (reference).
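To illustrate the checked form, a minimal sketch with made-up types (Shape, Circle, Square) and helper names:

```cpp
struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Checked downcast: yields nullptr when s is not actually a Circle.
const Circle* as_circle(const Shape* s) {
    return dynamic_cast<const Circle*>(s);
}

// The result is tested before use, which is what makes the cast safe.
double radius_or(const Shape& s, double fallback) {
    if (const Circle* c = as_circle(&s)) {
        return c->radius;
    }
    return fallback;
}
```

A static_cast<const Circle*> in the same place would compile for both shapes and give undefined behaviour for the Square; that is the case where you need an external guarantee about the dynamic type.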
A reinterpret_cast<>() (or a const_cast<>()) on the other hand is always dangerous. You tell the compiler: "trust me: I know this doesn't look like a foo (this looks as if it isn't mutable), but it is".
The first problem is that it's almost impossible to tell which one will occur in a C-style cast without looking at large and dispersed pieces of code and knowing all the rules.
Let's assume these:
class CDerivedClass : public CMyBase {...};
class CMyOtherStuff {...} ;
CMyBase *pSomething; // filled somewhere
Now, these two are compiled the same way:
CDerivedClass *pMyObject;
pMyObject = static_cast<CDerivedClass*>(pSomething); // Safe; as long as we checked
pMyObject = (CDerivedClass*)(pSomething); // Same as static_cast<>
// Safe; as long as we checked
// but harder to read
However, let's see this almost identical code:
CMyOtherStuff *pOther;
pOther = static_cast<CMyOtherStuff*>(pSomething); // Compiler error: Can't convert
pOther = (CMyOtherStuff*)(pSomething); // No compiler error.
// Same as reinterpret_cast<>
// and it's wrong!!!
As you can see, there is no easy way to distinguish between the two situations without knowing a lot about all the classes involved.
The second problem is that the C-style casts are too hard to locate. In complex expressions it can be very hard to see C-style casts. It is virtually impossible to write an automated tool that needs to locate C-style casts (for example a search tool) without a full blown C++ compiler front-end. On the other hand, it's easy to search for "static_cast<" or "reinterpret_cast<".
pOther = reinterpret_cast<CMyOtherStuff*>(pSomething);
// No compiler error.
// but the presence of a reinterpret_cast<> is
// like a Siren with Red Flashing Lights in your code.
// The mere typing of it should cause you to feel VERY uncomfortable.
That means that, not only are C-style casts more dangerous, but it's a lot harder to find them all to make sure that they are correct.
One pragmatic tip: you can search easily for the static_cast keyword in your source code if you plan to tidy up the project.
In short:
static_cast<>() gives you compile-time checking; a C-style cast doesn't.
static_cast<>() can be spotted easily anywhere in C++ source code; in contrast, a C-style cast is harder to spot.
Intentions are conveyed much better using C++ casts.
More Explanation:
The static cast performs conversions between compatible types. It is similar to the C-style cast, but is more restrictive. For example, the C-style cast would allow an integer pointer to point to a char.
char c = 10; // 1 byte
int *p = (int*)&c; // 4 bytes
Since this results in a 4-byte pointer pointing to 1 byte of allocated memory, writing to this pointer will either cause a run-time error or will overwrite some adjacent memory.
*p = 5; // run-time error: stack corruption
In contrast to the C-style cast, the static cast will allow the compiler to check that the pointer and pointee data types are compatible, which allows the programmer to catch this incorrect pointer assignment during compilation.
int *q = static_cast<int*>(&c); // compile-time error
Read more on:
What is the difference between static_cast<> and C style casting
and
Regular cast vs. static_cast vs. dynamic_cast
The question is bigger than just whether to use static_cast<> or C-style casting, because different things happen when using C-style casts. The C++ casting operators are intended to make those different operations more explicit.
On the surface static_cast<> and C-style casts appear to be the same thing, for example when casting one value to another:
int i;
double d = (double)i; //C-style cast
double d2 = static_cast<double>( i ); //C++ cast
Both of those cast the integer value to a double. However when working with pointers things get more complicated. Some examples:
class A {};
class B : public A {};
A* a = new B;
B* b = (B*)a; //(1) what is this supposed to do?
char* c = (char*)new int( 5 ); //(2) that weird?
char* c1 = static_cast<char*>( new int( 5 ) ); //(3) compile time error
In this example (1) may be OK because the object pointed to by a is really an instance of B. But what if you don't know at that point in the code what a actually points to?
(2) may be perfectly legal (you only want to look at one byte of the integer), but it could also be a mistake in which case an error would be nice, like (3).
The C++ casting operators are intended to expose these issues in the code by providing compile-time or run-time errors when possible.
So, for strict "value casting" you can use static_cast<>. If you want run-time polymorphic casting of pointers use dynamic_cast<>. If you really want to forget about types, you can use reinterpret_cast<>. And to just throw const out the window there is const_cast<>.
They just make the code more explicit so that it looks like you know what you were doing.
static_cast means that you can't accidentally const_cast or reinterpret_cast, which is a good thing.
Allows casts to be found easily in your code using grep or similar tools.
Makes it explicit what kind of cast you are doing, and engaging the compiler's help in enforcing it.
If you only want to cast away const-ness, then you can use const_cast, which will not allow you to do other types of conversions.
Casts are inherently ugly -- you as a programmer are overruling how the compiler would ordinarily treat your code. You are saying to the compiler, "I know better than you." That being the case, it makes sense that performing a cast should be a moderately painful thing to do, and that they should stick out in your code, since they are a likely source of problems.
See Effective C++ Introduction
It's about how much type-safety you want to impose.
When you write (bar) foo (which is equivalent to reinterpret_cast<bar> foo if you haven't provided a type conversion operator) you are telling the compiler to ignore type safety, and just do as it's told.
When you write static_cast<bar> foo you are asking the compiler to at least check that the type conversion makes sense and, for integral types, to insert some conversion code.
EDIT 2014-02-26
I wrote this answer more than 5 years ago, and I got it wrong. (See comments.) But it still gets upvotes!
C Style casts are easy to miss in a block of code. C++ style casts are not only better practice; they offer a much greater degree of flexibility.
reinterpret_cast allows integral to pointer type conversions, however can be unsafe if misused.
static_cast offers good conversion for numeric types, e.g. from enums to ints or ints to floats, or between any data types whose relationship you are confident of. It does not perform any run-time checks.
dynamic_cast on the other hand will perform these checks flagging any ambiguous assignments or conversions. It only works on pointers and references and incurs an overhead.
There are a couple of others but these are the main ones you will come across.
static_cast, aside from manipulating pointers to classes, can also be used to perform conversions explicitly defined in classes, as well as to perform standard conversions between fundamental types:
double d = 3.14159265;
int i = static_cast<int>(d);
Having a template method that casts to a particular class is sometimes useful, and I use it quite a lot, but while implementing the 'd-pointer' it stopped working, because I don't know the internals of the 'd' in the header file. Is there any way for the snippet below to work?
class BlahPrivate;
class Blah {
public:
    template<typename T> T* method() { return static_cast<T*>( d->object ); }
private:
    BlahPrivate *d;
};
First of all, if you want to separate the cast logic from the template internals, you can do that with a PIMPL (Pointer to Impl) idiom, adding a layer of indirection. Basically, place this template in its own header that DOES include the definition for BlahPrivate. Make that standalone. Then make a .h file that calls the function you have above, except it forwards the function call to the header file that has the BlahPrivate definition and cast logic.
Secondly, you're probably better off just defining implicit conversion operators in BlahPrivate for the types you'd like to convert it to... for example, putting this in your class:
operator std::string() { return std::string("This is a BlahPRivate"); }
would allow you to use BlahPrivate wherever a string was expected - it's pretty nifty :) Obviously, you'd want to give your casts more meaning though.
Don't go crazy with the implicit cast operators or it will bite you in the butt. Actually, I think this whole thing is probably a bad idea, because even your proposed function would make debugging hard - instead of getting a static cast error for a bad type on your line with the error, you'll get it in this function and have to trace it back.
Similarly, implicit casts may do a cast (and work) when you don't want them to: i.e. you wrote your parameters backwards in a function, and the std::string one was automatically converted to by the implicit function above - implicit casts lessen your type safety. Sometimes doing things by hand (when they're trivial like a cast) is better - after all, you really shouldn't have to cast often - if you do it's often a sign of bad design and you should rethink what you're doing.
What are the benefits of explicit type cast in C++ ?
They're more specific than the full general C-style casts. You don't give up quite as much type safety and the compiler can still double check some aspects of them for you.
They're easy to grep for if you want to try to clean up your code.
The syntax intentionally mimics a templated function call. As a result, you can "extend" the language by defining your own casts (e.g., Boost's lexical_cast).
Clarity in reading the code. There is not too much else of benefit, except for the cases where the compiler cannot infer the implicit cast at all, in which case you'd have to cast explicitly anyway.
The recommended way to cast in c++ is through dynamic_cast, static_cast, and the rest of those casting operators:
http://www.cplusplus.com/doc/tutorial/typecasting/
Notice that all typecasts are explicit in C++ and C. There are, in the language, no "implicit" typecasts. It's called "implicit conversions" and "explicit conversions", the latter of which are also called "casts".
Typecasts are most often used to inhibit warnings by the compiler. For instance if you have a signed and an unsigned value and compare them, the compiler usually warns you. If you know the comparison is correct, you can cast the operands to a signed type.
Another example is overload resolution, where type casts can be used to select the function to be called. For example, to print out the address of a char, you can't just say cout << &c because that would try to interpret it as a C-style string. Instead you would have to cast to void* before you print.
Often implicit conversions are superior to casts. Boost provides boost::implicit_cast to go by implicit conversions. For example the following relies on the implicit conversions of pointers to void*
cout << boost::implicit_cast<void*>(&c);
This has the benefit that it's only allowing conversions that are safe to do. Downcasts are not permitted, for example.
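If Boost is not available, the same effect can be approximated by hand. This is a sketch, not Boost's actual source: implicit_cast_sketch, identity_of and address_of_char are names invented here.

```cpp
// Wrapping To in identity_of makes the parameter type non-deducible,
// so the caller has to spell out the target type explicitly.
template <class T>
struct identity_of { using type = T; };

// Hypothetical stand-in for boost::implicit_cast: the argument is converted
// to To by the normal implicit conversion rules, and nothing stronger.
template <class To>
To implicit_cast_sketch(typename identity_of<To>::type value) {
    return value;
}

// Example use: obtain the address of a char as a void pointer,
// e.g. to print it with an output stream.
const void* address_of_char(const char& c) {
    return implicit_cast_sketch<const void*>(&c);
}
```

Because only implicit conversions apply, something like implicit_cast_sketch<Derived*>(base_ptr) fails to compile, whereas a static_cast would accept it.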
Casts are often used on void* to recover an implicitly known type. This is especially common for callbacks on embedded OSes: when you register for an event you may pass your "this" pointer as a void* context, then when the event triggers it passes the void* context back to you and you convert it back to the type of the "this" pointer.
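A sketch of that pattern with a made-up event API. EventSource, event_callback and Counter are all hypothetical and stand in for whatever the OS or library actually provides.

```cpp
// Hypothetical C-style event API: a plain function pointer plus an opaque context.
using event_callback = void (*)(void* context, int event_id);

struct EventSource {
    event_callback cb = nullptr;
    void* context = nullptr;

    void subscribe(event_callback f, void* ctx) { cb = f; context = ctx; }
    void fire(int id) { if (cb) cb(context, id); }
};

class Counter {
public:
    // Registers a static member function and passes "this" as the context.
    void attach(EventSource& src) { src.subscribe(&Counter::on_event, this); }
    int total() const { return total_; }

private:
    // The trampoline recovers the object: the context is known to be a Counter*
    // because attach() is the only place that registers this callback.
    static void on_event(void* context, int event_id) {
        static_cast<Counter*>(context)->total_ += event_id;
    }

    int total_ = 0;
};
```

The static_cast here is only as safe as the registration: the type is never checked, so the code that passes the context and the code that casts it back have to agree.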
Sometimes explicit casting avoids certain undesirable circumstances by using the explicit keyword.
class ExplicitExample
{
public:
    ExplicitExample(int a){...}
};
ExplicitExample objExp = 'A';//No error.. call the integer constructor
Change as
explicit ExplicitExample(int a){ ... }
Now compile.
ExplicitExample objExp = 'A';
We get this error in VS2005.
error C2440: 'initializing' : cannot convert from 'char' to 'ExplicitExample'
Constructor for class 'ExplicitExample' is declared 'explicit'
To overcome this error, we have to declare as
ExplicitExample objExp = ExplicitExample('A');
It means that, as programmers, we tell the compiler we know which constructor we are calling, so the compiler accepts the conversion.