What is void* and to what variables/objects it can point to - c++

Specifically, can it point to int/float etc.?
What about objects like NSString and the like?
Any examples will be greatly appreciated.

void* is such a pointer, that any pointer can be implicitly converted to void*.
For example;
int* p = new int;
void* pv = p; //OK;
p = pv; //Error, the opposite conversion must be explicit in C++ (in C this is OK too)
Also note that pointers to const cannot be converted to void* without a const_cast
E.g.
const int * pc = new const int(4);
void * pv = pc; //Error
const void* pcv = pc; //OK
Hth.

In C any pointer can point to any address in memory, as the type information is in the pointer, not in the target. So an int* is just a pointer to some memory location, which is believed to be an integer. A void* pointer, is just a pointer to a memory location where the type is not defined (could be anything).
Thus any pointer can be cast to void*, but not the other way around, because casting (for example) a void pointer to an int pointer is adding information to it - by performing the cast, you are declaring that the target data is integer, so naturally you have to explicitly say this. The other way around, all you are doing is saying that the int pointer is some kind of pointer, which is fine.
It's probably the same in C++.

A void * can point at any data-like thing in memory, like an integer value, a struct, or whatever.
Do note, however, that you cannot freely convert between void * and function pointers. This is because on some architectures, code is not in the same address space as data, and thus it's possible that address 0x00000000 for code refers to a different set of bits than address 0x00000000 for data does.
It would be possible to implement the compiler so that void * is large enough to remember the difference, but in general I think this is not done, instead the language leaves it undefined.
On typical/mainstream computers, code and data reside in the same address space, and then the compilers typically generate sensible results if you do store a function pointer into a void *, since it can be quite useful.

Besides everything else that was already said by the other users, a void* it's commonly used in callback definitions. This allows your callback to receive user data of any type, including your own defined objects/structs, which should be casted to the appropriate type before using it:
void my_player_cb(int reason, void* data)
{
Player_t* player = (Player_t*)data;
if (reason == END_OF_FILE)
{
if (player->playing)
{
// execute stop(), release allocated resources and
// start() playing the next file on the list
}
}
}

void* can point to an address in memory but the syntax has no type-information. you can cast it to any pointer-type you want but it is your responsibility that that type matches the semantics of the data.

Related

Dereferencing a pointer to constant

Let's say we have a class called object.
int main(){
object a;
const object* b = &a;
(*b);
}
Question:
b is a pointer to const, but the object it is pointing to is actually not a constant object. My question is, when we dereference the pointer "b" why do we get a constant object instead of what it is actually pointing at which is normal non constant object.
Because that's how the built-in dereference operator * works in C++. If you dereference a pointer of type T *, you get an lvalue of type T. In your case T is const object.
Neither * operator, not the pointer itself cares (or knows) that the object the pointer is pointing to is actually non-constant. It just can't be any other way within the concept of static typing used by C++ language.
The whole purpose of const-qualification at the first (and deeper) levels of indirection is to provide you with an ability to create restrictive access paths to objects. I.e. by creating a pointer-to-const you are deliberately and willingly preventing yourself (or someone else) from modifying the pointee, even if the pointee is not constant.
The type of an expression is just based on the declared types of the variables in the expression, it can't depend on dynamic run-time data. For instance, you could write:
object a1;
const object a2;
const object *b;
if (rand() % 2 == 0) {
b = &a1;
} else {
b = &a2;
}
(*b);
The type of *b is determined at compile-time, it can't depend on what rand() returns when you run the program.
Since b is declared to point to const object, that's what the type of *b is.
Because the const keyword means 'you can not modify that object' rather than 'the object can not be modified'.
This is useful when you pass an object to some function to use the object's value only, but but not to modify it:
// We expect PrintMyString to read the string
// and use its contents but not to modify it
void PrintMyString(const char *string);
void myfunction(int number)
{
// build and print a string of stars with given length
if(number>0 && number<=16]
{
char string[17];
int i;
for(i=0, i<number; i++)
string[i] = '*';
string[number] = '\0';
PrintMyString(string);
}
}
The function PrintMyString will get an array of characters, but the array will be passed as 'read only' to the function. The array is certainly modifiable within the owning function, but the PrintMyString can only read what it gets and not alter the contents.
There are plenty of other answers here, but I'm driven to post this because I feel that many of them provide too much detail, wander off topic, or assume knowledge of C++ jargon that the OP almost certainly doesn't have. I think that's unhelpful to the OP and others at a similar phase of their careers so I'm going to try to cut through some of that.
The situation itself is actually very straightforward. The declaration:
const SomeClassOrType *p = ...
simply says that whatever p is pointing to cannot be modified through that pointer, and that is useful to ensure that code that gains access to that object via p doesn't modify it when it isn't supposed to. It is most often used in parameter declarations so that functions and methods know whether or not they can modify the object being passed in (*).
It says nothing at all about the const-ness of what is actually being pointed to. The pointer itself, being just a simple soul, does not carry that information around with it. It only knows that, so far as the pointer is concerned, the object pointed to can be read from but not written to.
Concrete example (stolen from CiaPan):
void PrintString (const char *string);
char string [] = "abcde";
const char *p = string;
PrintString (p);
void PrintString (const char *ptr_to_string)
{
ptr_to_string [0] = 0; // oops! but the compiler will catch this, even though the original string itself is writeable
}
You can just pass string to PrintString directly of course, same difference, because the parameter is also declared const.
Another way to look at this is that const is information provided by the programmer to help the compiler check the correctness / consistency of the code at compile time. It lets the compiler catch mistakes like the one above, and perhaps perform better optimisations. By the time you actually run your code, it is history.
(*) The modern idiom is probably to use a const reference, but I don't want to muddy the waters by throwing that into the main discussion.
const object* b = &a;
This means: b is a pointer to a const (read-only) object
Doesn't say anything about the const-ness of a. It just means b has no right to modify a.
Also known as low-level const.
Contrary to the above, top-level const is when the pointer itself is const.
object *const b = &a; // b is a const pointer to an object
b = &some_other_object; // error - can't assign b to another object
// since b is a const pointer
*b = some_value; // this is fine since object is non-const
Simply put,
For plain poiters, the result of unary operator * is a reference to pointed-to type.
So, if the type pointed-to is const-qualified, the result is a reference to a constant.
References to constants can bind to any objects, even to mutable objects of otherwise the same type. When you bind a constref to a value, you promise to not ad-hoc modify that value through this refence (you still can do it explicitly though, cast it to a non-const reference).
Likewise, const-pointers can point to objects still modifiable otherwise.
By writing
const object* b = &a;
you declare that b is a pointer (*) to a const of type object, to which you then assign the address of a. a is of type object (but not const); you are permitted to use the adress of the non-const object in place of an adress of an const object.
When you dereference * b however, the compiler can only go according to your declaration - thus *b is a const object (however you can still modify a as you like, so beware of thinking that the object b points to cannot change - it mere cannot be changed via b)
const object* b = &a;
b will treat what it points to as const, i.e. it cannot change a
object* const b = &a;
b itself is const, i.e. it cannot point to other object address, but it can change a
Because at run-time it might be a const object, meaning that any operations performed on the dereference must be compatible with const. Ignore the fact that in your example it has been pointed to a non-const object and cannot point to anything else, the language doesn't work like that.
Here’s a specific example of why it is the way it is. Let’s say you declare:
int a[] = {1, 2, 3};
constexpr size_t n_a = sizeof(a)/sizeof(a[0]);
extern int sum( const int* sequence, size_t n );
sum_a = sum( a, n_a );
Now you implement sum() in another module. It doesn’t have any idea how the original object you’re pointing to was declared. It would be possible to write a compiler that tagged pointers with that information, but no compilers in actual use today do, for a number of good reasons.¹
Inside sum(), which might be in a shared library that cannot be recompiled with whole-program optimization, all you see is a pointer to memory that cannot be altered. In fact, on some implementations, trying to write through a pointer to const might crash the program with a memory-protection error. And the ability to reinterpret a block of memory as some other type is important to C/C++. For example, memset() or memcpy() reinterprets it as an array of arbitrary bytes. So the implementation of sum() has no way to tell the provenance of its pointer argument. As far as it’s concerned, it’s just a pointer to const int[].
More importantly, the contract of the function says that it’s not going to modify the object through that pointer.² It could simply cast away the const qualifier explicitly, but that would be a logic error. If you’re declaring a pointer const, it’s like engaging the safety on your gun: you want the compiler to stop you from shooting yourself in the foot.
¹ Including extra instructions to extract the address from a pointer, extra memory to store them, compatibility with the standard calling convention for the architecture, breaking a lot of existing code that assumes things like long being able to hold a pointer, and the lack of any benefit.
² With the pedantic exception of mutable data members.

How does void* work as a universal reference type?

From Programming Language Pragmatics, by Scott
For systems programming, or to facilitate the writing of
general-purpose con- tainer (collection) objects (lists, stacks,
queues, sets, etc.) that hold references to other objects, several
languages provide a universal reference type. In C and C++, this
type is called void *. In Clu it is called any; in Modula-2,
address; in Modula-3, refany; in Java, Object; in C#, object.
In C and C++, how does void * work as a universal reference type?
void * is always only a pointer type, while a universal reference type contains all values, both pointers and nonpointers. So I can't see how void * is a universal reference type.
Thanks.
A void* pointer will generally hold any pointer that is not a C++ pointer-to-member. It's rather inconvenient in practice, since you need to cast it to another pointer type before you can use it. You also need to convert it to the same pointer type that it was converted from to make the void*, otherwise you risk undefined behavior.
A good example would be the qsort function. It takes a void* pointer as a parameter, meaning it can point to an array of anything. The comparison function you pass to qsort must know how to cast two void* pointers back to the types of the array elements in order to compare them.
The crux of your confusion is that neither an instance of void * nor an instance of Modula-3's refany, nor an instance of any other language's "can refer to anything" type, contains the object that it refers to. A variable of type void * is always a pointer and a variable of type refany is always a reference. But the object that they refer to can be of any type.
A purist of programming-language theory would tell you that C does not have references at all, because pointers are not references. It has a nearly-universal pointer type, void *, which can point to an object of any type (including integers, aggregates, and other pointers). As a common but not ubiquitous extension, it can also point to any function (functions are not objects).
The purist would also tell you that C++ does not have a (nearly-)universal pointer type, because of its stricter type system, and doesn't have a universal reference type either.
They would also say that the book you are reading is being sloppy with its terminology, and they would caution you to not take any one such book for the gospel truth on terminological matters, or any other matters. You should instead read widely in both books and CS journals and conference proceedings (collectively known as "the literature") until you gain an "ear" for what is generally-agreed-on terminology, what is specific to a subdiscipline or a community of practice, and so on.
And finally they would remind you that C and C++ are two different languages, and anyone who speaks of them in the same breath is either glossing over the distinctions (which may or may not be relevant in context), decades out of date, or both.
Probably the reason is that you can take address of any variable of any type and cast it to void*.
It does by a silent contract that you know the actual type of object.
So you can store different kinds of elements in a container, but you need to somehow know what is what when taking elements back, to interpret them correctly.
The only convenience void* offers is that it's idiomatic for this, i.e. it's clear that dereferencing the pointer makes no sense, and void* is implicitly convertible to any pointer type. That is for c/
In c++ this is called type erasure techniques preferred. Or special types, like any (there is a boost version of this too.)
void* is no more just a pointer. Thus, it holds an address of an object (or an array and stuffs like that)
When your program is running, every variable should have it owns address in memory, right? And pointer is somethings point to that address.
In normal, each type of pointer should be the same type of object int b = 5; int* p = &b; for example. But that is the case you know what the type is, it means the specific type.
But sometimes, you just want to know that it stores somethings somewhere in memory and you know what "type" of that address, you can cast easily. For example, in OpenCV library which I am learning, there are a lot of functions where user can pass the arguments to instead of declaring global variables and most use in callback functions, like this:
void onChange(int v, void *ptr)
Here, the library does not care about what ptr point to, it just know that when you call the function, if you pass an address to like this onChange(5,&b) then you must cast ptr to the same type before dealing with it int b = static_cast<int*>(ptr);
Probably this explanation from Understanding pointers from Richard Reese will help
A pointer to void is a general-purpose pointer used to hold references to any data type.
It has two interesting properties:
A pointer to void will have the same representation and memory alignment as a pointer to char
A pointer to void will never be equal to another pointer. However, two void pointers assigned a NULL value will be equal.
Any pointer can be assigned to a pointer to void. It can then be cast back to its original pointer type. When this happens the value will be equal to the original pointer value.
This is illustrated in the following sequence, where a pointer to
int is assigned to a pointer to void and then back to a pointer to int
#include<stdio.h>
void main()
{
int num = 100;
int *pi = &num;
printf("value of pi is %p\n", pi);
void* pv = pi;
pi = (int*)pv;
printf("value of pi is %p\n", pi);
}
Pointers to void are used for data pointers, not function pointers

void* to pointer

Normally when calling a dynamically loaded function I usually do a standard straight cast:
typedef int (*GenericFn)(); // matches x86 FARPROC, minus explicit calling convention
typedef bool (*DesiredFn)(int,int);
GenericFn address = GetProcAddress(module, "MyFunction");
DesiredFn target = reinterpret_cast<DesiredFn>(address);
Today I did something a little different (and braindead).
DesiredFn target = nullptr;
void* temp = static_cast<void*>(&target); // pointer to function pointer
GenericFn* address = static_cast<GenericFn*>(temp);
*address = GetProcAddress(module, "MyFunction"); // supposedly valid?
// temp is declared void* because a void** cannot be cast to GenericFn* without
// first doing a void** -> void* conversion
assert(target == MyFunction); // true on VC10, presumably GCC
My questions:
Is the behavior of a void* (note: not a void**) to an object pointer type well-defined?
Why does the compiler allow static_cast<void*> on a void**?
Why am I stupid enough to try this?
Do you see anything else that's wrong with this example?
I've since decided to use method #1 again because of code clarity (and because I know it's supposed to work). I'm still interested in why method #2 worked though :).
In case you're wondering (about my explanation)
Today I was removing <windows.h> dependencies in several public interfaces, and rather than redeclare FARPROC like I should have, I experimentally changed my FARPROC return-type function to instead accept a void* output parameter (I know, it should probably have been a void**).
// implemented in some library cpp file
void detail::FunctionResolve(std::string export, void* output)
{
FARPROC* address = static_cast<FARPROC*>(output);
*address = GetProcAddress(...);
}
// header-defined interface class
template<typename F>
class RuntimeFunction {
F* target;
void SetFunction(std::string export) {
// old: this->target = reinterpret_cast<F*>(detail::FunctionResolve(...));
// new:
detail::FunctionResolve(export, static_cast<void*>(&this->target));
}
};
typedef int (*GenericFn)(); // matches x86 FARPROC, minus explicit calling convention
typedef bool (*DesiredFn)(int,int);
DesiredFn target = nullptr;
void* temp = static_cast<void*>(&target); // pointer to function pointer
There's nothing wrong here, but the cast is unnecessary. A pointer to any object (a pointer to a function is an object) can be converted to a pointer to void without a cast. e.g.
void* temp = &target;
GenericFn* address = static_cast<GenericFn*>(temp);
You can convert from a pointer to void to a pointer to any object type but the results are only defined if you cast a value the was converted to a void* back to the original type that it was converted from. Technically, only a static_cast<DesiredFn*>(temp) would have a well defined result.
*address = GetProcAddress(module, "MyFunction"); // supposedly valid?
This isn't technically correct as you have lied about the type of the value that you assigned to address so address isn't pointing to an object that matches its type information.
Having said all that, in many implementations function pointers are all represented in the same way and any cast and conversions don't have any effect on the value that is actually stored. So long as you call the function throught a pointer that actually matches the type of the pointer you won't have any problems.
After all, you have to rely on your implementation's behaviour of reinterpret_cast and GetProcAddress for the original method to work at all, but - as you say - I would recommend sticking with the reinterpret_cast approach in this case as it is clearer what is going on.
It doesn't matter because void* and void** are the same size. You're just changing the type. Why not just cast directly to the type you want?
DesiredFn target =
reinterpret_cast<DesiredFn>(GetProcAddress(module, "MyFunction"));

Mental model for void* and void**?

Note: I'm a experienced C++ programmer, so I don't need any pointer basics. It's just that I never worked with void** and have kind of a hard time getting my mental model adjusted to void* vs. void**. I am hoping someone can explain this in a good way, so that I can remember the semantics more easily.
Consider the following code: (compiles with e.g. VC++ 2005)
int main() {
int obj = 42;
void* ptr_to_obj = &obj;
void* addr_of_ptr_to_obj = &ptr_to_obj;
void** ptr_to_ptr_to_obj = &ptr_to_obj;
void* another_addr = ptr_to_ptr_to_obj[0];
// another_addr+1; // not allowed : 'void*' unknown size
ptr_to_ptr_to_obj+1; // allowed
}
void* is a pointer to something, but you don't know what. Because you don't know what it is, you don't know how much room it takes up, so you can't increment the pointer.
void** is a pointer to void*, so it's a pointer to a pointer. We know how much room pointers take up, so we can increment the void** pointer to point to the next pointer.
A void* points to an object whose type is unknown to the compiler.
A void** points to a variable which stores such a void*.
A void * can point at anything (except functions). So it can even point at pointers, so it can even point at other void * objects.
A void ** is a pointer-to-void *, so it can only be used to point at void * objects.
void is misleading because it sounds like null. However, it's better to think of void as an unspecified type. So a void* is pointer for an unspecified type, and a void** is a pointer to a pointer for an unspecified type.
void is a type which has no objects.
void * is a conventional scalar type.
void ** is also a conventional scalar type that happens to point to void *.
void * can be used to point to anything, but I prefer to use it only for uninitialized storage. There is usually a better alternative to pointing a void * at an actual object.

What does (void**) mean in C?

I would look this up, but honestly I wouldn't know where to start because I don't know what it is called. I've seen variables passed to functions like this:
myFunction((void**)&variable);
Which confuses the heck out of me cause all of those look familiar to me; I've just never seen them put together like that before.
What does it mean? I am a newb so the less jargon, the better, thanks!
void* is a "pointer to anything". void ** is another level of indirection - "pointer to pointer to anything". Basically, you pass that in when you want to allow the function to return a pointer of any type.
&variable takes the address of variable. variable should already be some kind of a pointer for that to work, but it's probably not void * - it might be, say int *, so taking its address would result in a int **. If the function takes void ** then you need to cast to that type.
(Of course, it needs to actually return an object of the right type, otherwise calling code will fail down the track when it tries to use it the wrong way.)
Take it apart piece by piece...
myFunction takes a pointer to a pointer of type void (which pretty much means it could point to anything). It might be declared something like this:
myFunction(void **something);
Anything you pass in has to have that type. So you take the address of a pointer, and cast it with (void**) to make it be a void pointer. (Basically stripping it of any idea about what it points to - which the compiler might whine about otherwise.)
This means that &variable is the address (& does this) of a pointer - so variable is a pointer. To what? Who knows!
Here is a more complete snippet, to give an idea of how this fits together:
#include <stdio.h>
int myInteger = 1;
int myOtherInt = 2;
int *myPointer = &myInteger;
myFunction(void **something){
*something = &myOtherInt;
}
main(){
printf("Address:%p Value:%d\n", myPointer, *myPointer);
myFunction((void**)&myPointer);
printf("Address:%p Value:%d\n", myPointer, *myPointer);
}
If you compile and run this, it should give this sort of output:
Address:0x601020 Value:1
Address:0x601024 Value:2
You can see that myFunction changed the value of myPointer - which it could only do because it was passed the address of the pointer.
It's a cast to a pointer to a void pointer.
You see this quite often with functions like CoCreateInstance() on Windows systems.
ISomeInterface* ifaceptr = 0;
HRESULT hr = ::CoCreateInstance(CLSID_SomeImplementation, NULL, CLSCTX_ALL,
IID_ISomeInterface, (void**)&ifaceptr);
if(SUCCEEDED(hr))
{
ifaceptr->DoSomething();
}
The cast converts the pointer to an ISomeInterface pointer into a pointer to a void pointer so that CoCreateInstance() can set ifaceptr to a valid value.
Since it is a pointer to a void pointer, the function can output pointers of any type, depending on the interface ID (such as IID_ISomeInterface).
It's a pointer to a pointer to a variable with an unspecified type. All pointers are the same size, so void* just means "a pointer to something but I have no idea what it is". A void** could also be a 2D array of unspecified type.
That casts &variable to a void** (that is, a pointer to a pointer to void).
For example, if you have something along the lines of
void myFunction(void** arg);
int* variable;
This passes the address of variable (that's what the unary-& does, it takes the address) to myFunction().
The variable is a pointer to something of undefined (void) type. The & operator returns the address of that variable, so you now have a pointer to a pointer of something. The pointer is therefore passed into the function by reference. The function might have a side effect which changes the memory referenced by that pointer. In other words, calling this function might change the something that the original pointer is referencing.