Difference between void pointers in C and C++

Difference between void pointers in C and C++ - c++

Why the following is wrong in C++ (But valid in C)
void*p;
char*s;
p=s;
s=p; //this is wrong ,should do s=(char*)p;
Why do I need the casting,as p now contains address of char pointer and s is also char pointer?

That's valid C, but not C++; they are two different languages, even if they do have many features in common.
In C++, there is no implicit conversion from void* to a typed pointer, so you need a cast. You should prefer a C++ cast, since they restrict which conversions are allowed and so help to prevent mistakes:
s = static_cast<char*>(p);
Better still, you should use polymorphic techniques (such as abstract base classes or templates) to avoid the need to use untyped pointers in the first place; but that's rather beyond the scope of this question.

The value doesn't matter, the type does. Since p is a void pointer and s a char pointer, you have to cast, even if they have the same value. In C it will be ok, void* is the generic pointer, but this is incorrect in C++.
By the way, p doesn't contains char pointer, it's a void pointer and it contains a memory address.

In general, this rule doesn't even have anything to do with pointers. It's just that you can assign values of some type to variables of other types, but not always vice versa. A similar situation would be this:
double d = 0.0;
int i = 0;
d = i; // Totally OK
i = d; // Warning!
So that's just something you have to live with.

Related

Return type of void *, is it different in C++ and in C? [duplicate]

Why the following is wrong in C++ (But valid in C)
void*p;
char*s;
p=s;
s=p; //this is wrong ,should do s=(char*)p;
Why do I need the casting,as p now contains address of char pointer and s is also char pointer?

That's valid C, but not C++; they are two different languages, even if they do have many features in common.
In C++, there is no implicit conversion from void* to a typed pointer, so you need a cast. You should prefer a C++ cast, since they restrict which conversions are allowed and so help to prevent mistakes:
s = static_cast<char*>(p);
Better still, you should use polymorphic techniques (such as abstract base classes or templates) to avoid the need to use untyped pointers in the first place; but that's rather beyond the scope of this question.

The value doesn't matter, the type does. Since p is a void pointer and s a char pointer, you have to cast, even if they have the same value. In C it will be ok, void* is the generic pointer, but this is incorrect in C++.
By the way, p doesn't contains char pointer, it's a void pointer and it contains a memory address.

In general, this rule doesn't even have anything to do with pointers. It's just that you can assign values of some type to variables of other types, but not always vice versa. A similar situation would be this:
double d = 0.0;
int i = 0;
d = i; // Totally OK
i = d; // Warning!
So that's just something you have to live with.

What real use does a double pointer have?

I have searched and searched for an answer to this but can't find anything I actually "get".
I am very very new to c++ and can't get my head around the use of double, triple pointers etc.. What is the point of them?
Can anyone enlighten me

Honestly, in well-written C++ you should very rarely see a T** outside of library code. In fact, the more stars you have, the closer you are to winning an award of a certain nature.
That's not to say that a pointer-to-pointer is never called for; you may need to construct a pointer to a pointer for the same reason that you ever need to construct a pointer to any other type of object.
In particular, I might expect to see such a thing inside a data structure or algorithm implementation, when you're shuffling around dynamically allocated nodes, perhaps?
Generally, though, outside of this context, if you need to pass around a reference to a pointer, you'd do just that (i.e. T*&) rather than doubling up on pointers, and even that ought to be fairly rare.
On Stack Overflow you're going to see people doing ghastly things with pointers to arrays of dynamically allocated pointers to data, trying to implement the least efficient "2D vector" they can think of. Please don't be inspired by them.
In summary, your intuition is not without merit.

An important reason why you should/must know about pointer-to-pointer-... is that you sometimes have to interface with other languages (like C for instance) through some API (for instance the Windows API).
Those APIs often have functions that have an output-parameter that returns a pointer. However those other languages often don't have references or compatible (with C++) references. That's a situation when pointer-to-pointer is needed.

It's less used in c++. However, in C, it can be very useful. Say that you have a function that will malloc some random amount of memory and fill the memory with some stuff. It would be a pain to have to call a function to get the size you need to allocate and then call another function that will fill the memory. Instead you can use a double pointer. The double pointer allows the function to set the pointer to the memory location. There are some other things it can be used for but that's the best thing I can think of.
int func(char** mem){
*mem = malloc(50);
return 50;
}
int main(){
char* mem = NULL;
int size = func(&mem);
free(mem);
}

I am very very new to c++ and can't get my head around the use of double, triple pointers etc.. What is the point of them?
The trick to understanding pointers in C is simply to go back to the basics, which you were probably never taught. They are:
Variables store values of a particular type.
Pointers are a kind of value.
If x is a variable of type T then &x is a value of type T*.
If x evaluates to a value of type T* then *x is a variable of type T. More specifically...
... if x evaluates to a value of type T* that is equal to &a for some variable a of type T, then *x is an alias for a.
Now everything follows:
int x = 123;
x is a variable of type int. Its value is 123.
int* y = &x;
y is a variable of type int*. x is a variable of type int. So &x is a value of type int*. Therefore we can store &x in y.
*y = 456;
y evaluates to the contents of variable y. That's a value of type int*. Applying * to a value of type int* gives a variable of type int. Therefore we can assign 456 to it. What is *y? It is an alias for x. Therefore we have just assigned 456 to x.
int** z = &y;
What is z? It's a variable of type int**. What is &y? Since y is a variable of type int*, &y must be a value of type int**. Therefore we can assign it to z.
**z = 789;
What is **z? Work from the inside out. z evaluates to an int**. Therefore *z is a variable of type int*. It is an alias for y. Therefore this is the same as *y, and we already know what that is; it's an alias for x.
No really, what's the point?
Here, I have a piece of paper. It says 1600 Pennsylvania Avenue Washington DC. Is that a house? No, it's a piece of paper with the address of a house written on it. But we can use that piece of paper to find the house.
Here, I have ten million pieces of paper, all numbered. Paper number 123456 says 1600 Pennsylvania Avenue. Is 123456 a house? No. Is it a piece of paper? No. But it is still enough information for me to find the house.
That's the point: often we need to refer to entities through multiple levels of indirection for convenience.
That said, double pointers are confusing and a sign that your algorithm is insufficiently abstract. Try to avoid them by using good design techniques.

A double-pointer, is simply a pointer to a pointer. A common usage is for arrays of character strings. Imagine the first function in just about every C/C++ program:
int main(int argc, char *argv[])
{
...
}
Which can also be written
int main(int argc, char **argv)
{
...
}
The variable argv is a pointer to an array of pointers to char. This is a standard way of passing around arrays of C "strings". Why do that? I've seen it used for multi-language support, blocks of error strings, etc.
Don't forget that a pointer is just a number - the index of the memory "slot" inside a computer. That's it, nothing more. So a double-pointer is index of a piece of memory that just happens to hold another index to somewhere else. A mathematical join-the-dots if you like.
This is how I explained pointers to my kids:
Imagine the computer memory is a series of boxes. Each box has a number written on it, starting at zero, going up by 1, to however many bytes of memory there is. Say you have a pointer to some place in memory. This pointer is just the box number. My pointer is, say 4. I look into box #4. Inside is another number, this time it's 6. So now we look into box #6, and get the final thing we wanted. My original pointer (that said "4") was a double-pointer, because the content of its box was the index of another box, rather than being a final result.
It seems in recent times pointers themselves have become a pariah of programming. Back in the not-too-distant past, it was completely normal to pass around pointers to pointers. But with the proliferation of Java, and increasing use of pass-by-reference in C++, the fundamental understanding of pointers declined - particularly around when Java became established as a first-year computer science beginners language, over say Pascal and C.
I think a lot of the venom about pointers is because people just don't ever understand them properly. Things people don't understand get derided. So they became "too hard" and "too dangerous". I guess with even supposedly learned people advocating Smart Pointers, etc. these ideas are to be expected. But in reality there a very powerful programming tool. Honestly, pointers are the magic of programming, and after-all, they're just a number.

In many situations, a Foo*& is a replacement for a Foo**. In both cases, you have a pointer whose address can be modified.
Suppose you have an abstract non-value type and you need to return it, but the return value is taken up by the error code:
error_code get_foo( Foo** ppfoo )
or
error_code get_foo( Foo*& pfoo_out )
Now a function argument being mutable is rarely useful, so the ability to change where the outermost pointer ppFoo points at is rarely useful. However, a pointer is nullable -- so if get_foo's argument is optional, a pointer acts like an optional reference.
In this case, the return value is a raw pointer. If it returns an owned resource, it should usually be instead a std::unique_ptr<Foo>* -- a smart pointer at that level of indirection.
If instead, it is returning a pointer to something it does not share ownership of, then a raw pointer makes more sense.
There are other uses for Foo** besides these "crude out parameters". If you have a polymorphic non-value type, non-owning handles are Foo*, and the same reason why you'd want to have an int* you would want to have a Foo**.
Which then leads you to ask "why do you want an int*?" In modern C++ int* is a non-owning nullable mutable reference to an int. It behaves better when stored in a struct than a reference does (references in structs generate confusing semantics around assignment and copy, especially if mixed with non-references).
You could sometimes replace int* with std::reference_wrapper<int>, well std::optional<std::reference_wrapper<int>>, but note that is going to be 2x as large as a simple int*.
So there are legitimate reasons to use int*. Once you have that, you can legitimately use Foo** when you want a pointer to a non-value type. You can even get to int** by having a contiguous array of int*s you want to operate on.
Legitimately getting to three-star programmer gets harder. Now you need a legitimate reason to (say) want to pass a Foo** by indirection. Usually long before you reach that point, you should have considered abstracting and/or simplifying your code structure.
All of this ignores the most common reason; interacting with C APIs. C doesn't have unique_ptr, it doesn't have span. It tends to use primitive types instead of structs because structs require awkward function based access (no operator overloading).
So when C++ interacts with C, you sometimes get 0-3 more *s than the equivalent C++ code would.

The use is to have a pointer to a pointer, e.g., if you want to pass a pointer to a method by reference.

What real use does a double pointer have?
Here is practical example. Say you have a function and you want to send an array of string params to it (maybe you have a DLL you want to pass params to). This can look like this:
#include <iostream>
void printParams(const char **params, int size)
{
for (int i = 0; i < size; ++i)
{
std::cout << params[i] << std::endl;
}
}
int main()
{
const char *params[] = { "param1", "param2", "param3" };
printParams(params, 3);
return 0;
}
You will be sending an array of const char pointers, each pointer pointing to the start of a null terminated C string. The compiler will decay your array into pointer at function argument, hence what you get is const char ** a pointer to first pointer of array of const char pointers. Since the array size is lost at this point, you will want to pass it as second argument.

One case where I've used it is a function manipulating a linked list, in C.
There is
struct node { struct node *next; ... };
for the list nodes, and
struct node *first;
to point to the first element. All the manipulation functions take a struct node **, because I can guarantee that this pointer is non-NULL even if the list is empty, and I don't need any special cases for insertion and deletion:
void link(struct node *new_node, struct node **list)
{
new_node->next = *list;
*list = new_node;
}
void unlink(struct node **prev_ptr)
{
*prev_ptr = (*prev_ptr)->next;
}
To insert at the beginning of the list, just pass a pointer to the first pointer, and it will do the right thing even if the value of first is NULL.
struct node *new_node = (struct node *)malloc(sizeof *new_node);
link(new_node, &first);

Multiple indirection is largely a holdover from C (which has neither reference nor container types). You shouldn't see multiple indirection that much in well-written C++, unless you're dealing with a legacy C library or something like that.
Having said that, multiple indirection falls out of some fairly common use cases.
In both C and C++, array expressions will "decay" from type "N-element array of T" to "pointer to T" under most circumstances1. So, assume an array definition like
T *a[N]; // for any type T
When you pass a to a function, like so:
foo( a );
the expression a will be converted from "N-element array of T *" to "pointer to T *", or T **, so what the function actually receives is
void foo( T **a ) { ... }
A second place they pop up is when you want a function to modify a parameter of pointer type, something like
void foo( T **ptr )
{
*ptr = new_value();
}
void bar( void )
{
T *val;
foo( &val );
}
Since C++ introduced references, you probably won't see that as often. You'll usually only see that when working with a C-based API.
You can also use multiple indirection to set up "jagged" arrays, but you can achieve the same thing with C++ containers for much less pain. But if you're feeling masochistic:
T **arr;
try
{
arr = new T *[rows];
for ( size_t i = 0; i < rows; i++ )
arr[i] = new T [size_for_row(i)];
}
catch ( std::bad_alloc& e )
{
...
}
But most of the time in C++, the only time you should see multiple indirection is when an array of pointers "decays" to a pointer expression itself.
The exceptions to this rule occur when the expression is the operand of the sizeof or unary & operator, or is a string literal used to initialize another array in a declaration.

In C++, if you want to pass a pointer as an out or in/out parameter, you pass it by reference:
int x;
void f(int *&p) { p = &x; }
But, a reference can't ("legally") be nullptr, so, if the pointer is optional, you need a pointer to a pointer:
void g(int **p) { if (p) *p = &x; }
Sure, since C++17 you have std::optional, but, the "double pointer" has been idiomatic C/C++ code for many decades, so should be OK. Also, the usage is not so nice, you either:
void h(std::optional<int*> &p) { if (p) *p = &x) }
which is kind of ugly at the call site, unless you already have a std::optional, or:
void u(std::optional<std::reference_wrapper<int*>> p) { if (p) p->get() = &x; }
which is not so nice in itself.
Also, some might argue that g is nicer to read at the call site:
f(p);
g(&p); // `&` indicates that `p` might change, to some folks

How does void* work as a universal reference type?

From Programming Language Pragmatics, by Scott
For systems programming, or to facilitate the writing of
general-purpose con- tainer (collection) objects (lists, stacks,
queues, sets, etc.) that hold references to other objects, several
languages provide a universal reference type. In C and C++, this
type is called void *. In Clu it is called any; in Modula-2,
address; in Modula-3, refany; in Java, Object; in C#, object.
In C and C++, how does void * work as a universal reference type?
void * is always only a pointer type, while a universal reference type contains all values, both pointers and nonpointers. So I can't see how void * is a universal reference type.
Thanks.

A void* pointer will generally hold any pointer that is not a C++ pointer-to-member. It's rather inconvenient in practice, since you need to cast it to another pointer type before you can use it. You also need to convert it to the same pointer type that it was converted from to make the void*, otherwise you risk undefined behavior.
A good example would be the qsort function. It takes a void* pointer as a parameter, meaning it can point to an array of anything. The comparison function you pass to qsort must know how to cast two void* pointers back to the types of the array elements in order to compare them.

The crux of your confusion is that neither an instance of void * nor an instance of Modula-3's refany, nor an instance of any other language's "can refer to anything" type, contains the object that it refers to. A variable of type void * is always a pointer and a variable of type refany is always a reference. But the object that they refer to can be of any type.
A purist of programming-language theory would tell you that C does not have references at all, because pointers are not references. It has a nearly-universal pointer type, void *, which can point to an object of any type (including integers, aggregates, and other pointers). As a common but not ubiquitous extension, it can also point to any function (functions are not objects).
The purist would also tell you that C++ does not have a (nearly-)universal pointer type, because of its stricter type system, and doesn't have a universal reference type either.
They would also say that the book you are reading is being sloppy with its terminology, and they would caution you to not take any one such book for the gospel truth on terminological matters, or any other matters. You should instead read widely in both books and CS journals and conference proceedings (collectively known as "the literature") until you gain an "ear" for what is generally-agreed-on terminology, what is specific to a subdiscipline or a community of practice, and so on.
And finally they would remind you that C and C++ are two different languages, and anyone who speaks of them in the same breath is either glossing over the distinctions (which may or may not be relevant in context), decades out of date, or both.

Probably the reason is that you can take address of any variable of any type and cast it to void*.

It does by a silent contract that you know the actual type of object.
So you can store different kinds of elements in a container, but you need to somehow know what is what when taking elements back, to interpret them correctly.
The only convenience void* offers is that it's idiomatic for this, i.e. it's clear that dereferencing the pointer makes no sense, and void* is implicitly convertible to any pointer type. That is for c/
In c++ this is called type erasure techniques preferred. Or special types, like any (there is a boost version of this too.)

void* is no more just a pointer. Thus, it holds an address of an object (or an array and stuffs like that)
When your program is running, every variable should have it owns address in memory, right? And pointer is somethings point to that address.
In normal, each type of pointer should be the same type of object int b = 5; int* p = &b; for example. But that is the case you know what the type is, it means the specific type.
But sometimes, you just want to know that it stores somethings somewhere in memory and you know what "type" of that address, you can cast easily. For example, in OpenCV library which I am learning, there are a lot of functions where user can pass the arguments to instead of declaring global variables and most use in callback functions, like this:
void onChange(int v, void *ptr)
Here, the library does not care about what ptr point to, it just know that when you call the function, if you pass an address to like this onChange(5,&b) then you must cast ptr to the same type before dealing with it int b = static_cast<int*>(ptr);

Probably this explanation from Understanding pointers from Richard Reese will help
A pointer to void is a general-purpose pointer used to hold references to any data type.
It has two interesting properties:
A pointer to void will have the same representation and memory alignment as a pointer to char
A pointer to void will never be equal to another pointer. However, two void pointers assigned a NULL value will be equal.
Any pointer can be assigned to a pointer to void. It can then be cast back to its original pointer type. When this happens the value will be equal to the original pointer value.
This is illustrated in the following sequence, where a pointer to
int is assigned to a pointer to void and then back to a pointer to int
#include<stdio.h>
void main()
{
int num = 100;
int *pi = &num;
printf("value of pi is %p\n", pi);
void* pv = pi;
pi = (int*)pv;
printf("value of pi is %p\n", pi);
}
Pointers to void are used for data pointers, not function pointers

Pointer typecasting with and without reference

Given that A* pA; and B* pB;, is there ANY difference between below type castings (query for all C++ style casts):
pB = reinterpret_cast<B*>(pA); // pointer
pB = reinterpret_cast<B*&>(pA); // pointer-reference

The two are radically different, at least in theory (and possibly on
a few rare machines, in practice). The first takes a pointer to A, and
converts it to a pointer to B; in theory, at least, this may involve
changes in size and representation. (I've actually worked on machines
where char* was larger than int*. I rather doubt that any such
machines still exist, although perhaps in the embedded world...) The
second is really the equivalent of *reinterpret_cast<B**>(&pA); it
takes the bits in pA, and tells the compiler to interpret them as
a B*. If the format is different, tough luck, and if the size is
different, you're likely to only access part of pA or to access memory
which isn't part of pA.
Also, the first is an rvalue, the second an lvalue. Thus, something
like:
++ reinterpret_cast<B*>( pA );
is illegal, but:
++ reinterpret_cast<B*&>( pA );
isn't. This is a useful technique for obfuscating code, and getting
unaligned pointers, pointers into the middle of objects, or other
pointers you don't dare dereference.
In general, the second form should be avoided, but there are rare
exceptions. Posix guarantees that all pointers, including pointers to
functions (but not pointers to members—Posix specifies a C ABI,
which doesn't have pointers to members), have the same size and format,
so the second form is guaranteed to work. And it is the only way you
can legally convert the void* returned by dlsym into a pointer to
a function:
int (*pf)( int );
reinterpret_cast<void*&>( pf ) = dlsym( handle, "functionName" );
(In C, you'd write:
int (*pf)( int );
*(void**)( &pf ) = dlsym( handle, "functionName" );
, see the
official
specification.) Such tricks allow conversions between pointer types
which aren't otherwise allowed, but depends on additional guarantees not
in the standard.

reinterpret_cast is implementation-defined. So in theory results can be different, but in practice you can assume that results will be the same because typeid(B*) is equal to typeid(B*&) => actually you casting to same type in both cases.

There is No difference.
pB = reinterpret_cast<B*>(pA); // pointer
is casting to a pointer of the type B* and
pB = reinterpret_cast<B*&>(pA); // pointer-reference
is casting to reference to a pointer.
Note: there is no such thing as pointer to reference.

No functional difference whatsoever.
There's obviously a maintainability difference if you'd use the latter variant. Your peers would think you had smoked something. ;-)

Why do we have reinterpret_cast in C++ when two chained static_cast can do its job?

Say I want to cast A* to char* and vice-versa, we have two choices (I mean, many of us think we've two choices, because both seems to work! Hence the confusion!):
struct A
{
int age;
char name[128];
};
A a;
char *buffer = static_cast<char*>(static_cast<void*>(&a)); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
Both work fine.
//convert back
A *pA = static_cast<A*>(static_cast<void*>(buffer)); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Even this works fine!
So why do we have reinterpret_cast in C++ when two chained static_cast can do its job?
Some of you might think this topic is a duplicate of the previous topics such as listed at the bottom of this post, but it's not. Those topics discuss only theoretically, but none of them gives even a single example demonstrating why reintepret_cast is really needed, and two static_cast would surely fail. I agree, one static_cast would fail. But how about two?
If the syntax of two chained static_cast looks cumbersome, then we can write a function template to make it more programmer-friendly:
template<class To, class From>
To any_cast(From v)
{
return static_cast<To>(static_cast<void*>(v));
}
And then we can use this, as:
char *buffer = any_cast<char*>(&a); //choice 1
char *buffer = reinterpret_cast<char*>(&a); //choice 2
//convert back
A *pA = any_cast<A*>(buffer); //choice 1
A *pA = reinterpret_cast<A*>(buffer); //choice 2
Also, see this situation where any_cast can be useful: Proper casting for fstream read and write member functions.
So my question basically is,
Why do we have reinterpret_cast in C++?
Please show me even a single example where two chained static_cast would surely fail to do the same job?
Which cast to use; static_cast or reinterpret_cast?
Cast from Void* to TYPE* : static_cast or reinterpret_cast

There are things that reinterpret_cast can do that no sequence of static_casts can do (all from C++03 5.2.10):
A pointer can be explicitly converted to any integral type large enough to hold it.
A value of integral type or enumeration type can be explicitly converted to a pointer.
A pointer to a function can be explicitly converted to a pointer to a function of a different type.
An rvalue of type "pointer to member of X of type T1" can be explicitly converted to an rvalue of type "pointer to member of Y of type T2" if T1 and T2 are both function types or both object types.
Also, from C++03 9.2/17:
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.

You need reinterpret_cast to get a pointer with a hardcoded address (like here):
int* pointer = reinterpret_cast<int*>( 0x1234 );
you might want to have such code to get to some memory-mapped device input-output port.

A concrete example:
char a[4] = "Hi\n";
char* p = &a;
f(reinterpret_cast<char (&)[4]>(p)); // call f after restoring full type
// ^-- any_cast<> can't do this...
// e.g. given...
template <typename T, int N> // <=--- can match this function
void f(T (&)[N]) { std::cout << "array size " << N << '\n'; }

Other than the practical reasons that others have given where there is a difference in what they can do it's a good thing to have because its doing a different job.
static_cast is saying please convert data of type X to Y.
reinterpret_cast is saying please interpret the data in X as a Y.
It may well be that the underlying operations are the same, and that either would work in many cases. But there is a conceptual difference between saying please convert X into a Y, and saying "yes I know this data is declared as a X but please use it as if it was really a Y".

As far as I can tell your choice 1 (two chained static_cast) is dreaded undefined behaviour. Static cast only guarantees that casting pointer to void * and then back to original pointer works in a way that the resulting pointer from these to conversions still points to the original object. All other conversions are UB. For pointers to objects (instances of the user defined classes) static_cast may alter the pointer value.
For the reinterpret_cast - it only alters the type of the pointer and as far as I know - it never touches the pointer value.
So technically speaking the two choices are not equivalent.
EDIT: For the reference, static_cast is described in section 5.2.9 of current C++0x draft (sorry, don't have C++03 standard, the draft I consider current is n3225.pdf). It describes all allowed conversions, and I guess anything not specifically listed = UB. So it can blow you PC if it chooses to do so.

Using of C Style casting is not safer. It never checks for different types can be mixed together.
C++ casts helps you to make sure the type casts are done as per related objects (based on the cast you use). This is the more recommended way to use casts than using the traditional C Style casts that's always harmful.

Look, people, you don't really need reinterpret_cast, static_cast, or even the other two C++ styles casts (dynamic* and const).
Using a C style cast is both shorter and allows you to do everything the four C++-style cast let you do.
anyType someVar = (anyOtherType)otherVar;
So why use the C++-style casts? Readability. Secondly: because the more restrictive casts allow more code safety.
*okay, you might need dynamic

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js