Related
I am learning C++ using C++ Primer 5th edition. In particular, i read about void*. There it is written that:
We cannot use a void* to operate on the object it addresses—we don’t know that object’s type, and the type determines what operations we can perform on that object.
void*: Pointer type that can point to any nonconst type. Such pointers may not
be dereferenced.
My question is that if we're not allowed to use a void* to operate on the object it addressess then why do we need a void*. Also, i am not sure if the above quoted statement from C++ Primer is technically correct because i am not able to understand what it is conveying. Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses". So can someone please provide some example to clarify what the author meant and whether he is correct or incorrect in saying the above statement.
My question is that if we're not allowed to use a void* to operate on the object it addressess then why do we need a void*
It's indeed quite rare to need void* in C++. It's more common in C.
But where it's useful is type-erasure. For example, try to store an object of any type in a variable, determining the type at runtime. You'll find that hiding the type becomes essential to achieve that task.
What you may be missing is that it is possible to convert the void* back to the typed pointer afterwards (or in special cases, you can reinterpret as another pointer type), which allows you to operate on the object.
Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses"
Example:
int i;
int* int_ptr = &i;
void* void_ptr = &i;
*int_ptr = 42; // OK
*void_ptr = 42; // ill-formed
As the example demonstrates, we cannot modify the pointed int object through the pointer to void.
so since a void* has no size(as written in the answer by PMF)
Their answer is misleading or you've misunderstood. The pointer has a size. But since there is no information about the type of the pointed object, the size of the pointed object is unknown. In a way, that's part of why it can point to an object of any size.
so how can a int* on the right hand side be implicitly converted to a void*
All pointers to objects can implicitly be converted to void* because the language rules say so.
Yes, the author is right.
A pointer of type void* cannot be dereferenced, because it has no size1. The compiler would not know how much data he needs to get from that address if you try to access it:
void* myData = std::malloc(1000); // Allocate some memory (note that the return type of malloc() is void*)
int value = *myData; // Error, can't dereference
int field = myData->myField; // Error, a void pointer obviously has no fields
The first example fails because the compiler doesn't know how much data to get. We need to tell it the size of the data to get:
int value = *(int*)myData; // Now fine, we have casted the pointer to int*
int value = *(char*)myData; // Fine too, but NOT the same as above!
or, to be more in the C++-world:
int value = *static_cast<int*>(myData);
int value = *static_cast<char*>(myData);
The two examples return a different result, because the first gets an integer (32 bit on most systems) from the target address, while the second only gets a single byte and then moves that to a larger variable.
The reason why the use of void* is sometimes still useful is when the type of data doesn't matter much, like when just copying stuff around. Methods such as memset or memcpy take void* parameters, since they don't care about the actual structure of the data (but they need to be given the size explicitly). When working in C++ (as opposed to C) you'll not use these very often, though.
1 "No size" applies to the size of the destination object, not the size of the variable containing the pointer. sizeof(void*) is perfectly valid and returns, the size of a pointer variable. This is always equal to any other pointer size, so sizeof(void*)==sizeof(int*)==sizeof(MyClass*) is always true (for 99% of today's compilers at least). The type of the pointer however defines the size of the element it points to. And that is required for the compiler so he knows how much data he needs to get, or, when used with + or -, how much to add or subtract to get the address of the next or previous elements.
void * is basically a catch-all type. Any pointer type can be implicitly cast to void * without getting any errors. As such, it is mostly used in low level data manipulations, where all that matters is the data that some memory block contains, rather than what the data represents. On the flip side, when you have a void * pointer, it is impossible to determine directly which type it was originally. That's why you can't operate on the object it addresses.
if we try something like
typedef struct foo {
int key;
int value;
} t_foo;
void try_fill_with_zero(void *destination) {
destination->key = 0;
destination->value = 0;
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
try_fill_with_zero(foo_instance, sizeof(t_foo));
}
we will get a compilation error because it is impossible to determine what type void *destination was, as soon as the address gets into try_fill_with_zero. That's an example of being unable to "use a void* to operate on the object it addresses"
Typically you will see something like this:
typedef struct foo {
int key;
int value;
} t_foo;
void init_with_zero(void *destination, size_t bytes) {
unsigned char *to_fill = (unsigned char *)destination;
for (int i = 0; i < bytes; i++) {
to_fill[i] = 0;
}
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
int test_int;
init_with_zero(foo_instance, sizeof(t_foo));
init_with_zero(&test_int, sizeof(int));
}
Here we can operate on the memory that we pass to init_with_zero represented as bytes.
You can think of void * as representing missing knowledge about the associated type of the data at this address. You may still cast it to something else and then dereference it, if you know what is behind it. Example:
int n = 5;
void * p = (void *) &n;
At this point, p we have lost the type information for p and thus, the compiler does not know what to do with it. But if you know this p is an address to an integer, then you can use that information:
int * q = (int *) p;
int m = *q;
And m will be equal to n.
void is not a type like any other. There is no object of type void. Hence, there exists no way of operating on such pointers.
This is one of my favourite kind of questions because at first I was also so confused about void pointers.
Like the rest of the Answers above void * refers to a generic type of data.
Being a void pointer you must understand that it only holds the address of some kind of data or object.
No other information about the object itself, at first you are asking yourself why do you even need this if it's only able to hold an address. That's because you can still cast your pointer to a more specific kind of data, and that's the real power.
Making generic functions that works with all kind of data.
And to be more clear let's say you want to implement generic sorting algorithm.
The sorting algorithm has basically 2 steps:
The algorithm itself.
The comparation between the objects.
Here we will also talk about pointer functions.
Let's take for example qsort built in function
void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*))
We see that it takes the next parameters:
base − This is the pointer to the first element of the array to be sorted.
nitems − This is the number of elements in the array pointed by base.
size − This is the size in bytes of each element in the array.
compar − This is the function that compares two elements.
And based on the article that I referenced above we can do something like this:
int values[] = { 88, 56, 100, 2, 25 };
int cmpfunc (const void * a, const void * b) {
return ( *(int*)a - *(int*)b );
}
int main () {
int n;
printf("Before sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
qsort(values, 5, sizeof(int), cmpfunc);
printf("\nAfter sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
return(0);
}
Where you can define your own custom compare function that can match any kind of data, there can be even a more complex data structure like a class instance of some kind of object you just define. Let's say a Person class, that has a field age and you want to sort all Persons by age.
And that's one example where you can use void * , you can abstract this and create other use cases based on this example.
It is true that is a C example, but I think, being something that appeared in C can make more sense of the real usage of void *. If you can understand what you can do with void * you are good to go.
For C++ you can also check templates, templates can let you achieve a generic type for your functions / objects.
At xdn-project/digitalnote ./src/crypto/crypto.cpp file there is an error at line 338 when compiling (using cmake):
return sizeof(rs_comm) + pubs_count * sizeof(rs_comm().ab[0]);
^
error: value-initialization of incomplete type
‘Crypto::rs_comm:: []’
I found the solution on cryptonotefoundation/cryptonote:
return sizeof(rs_comm) + pubs_count * sizeof(((rs_comm*)0)->ab[0]);
I can play with Java JDK quite well, but currently at C++ need help :) It would be nice to see detail explanation of this code part:
sizeof(((rs_comm*)0)->ab[0]);
My questions are:
Asterisk after rs_comm - what its for?
0) - what is the purpose of 0 here?
The fragment of code:
struct rs_comm {
Hash h;
struct {
EllipticCurvePoint a, b;
} ab[];
};
static inline size_t rs_comm_size(size_t pubs_count) {
return sizeof(rs_comm) + pubs_count * sizeof(rs_comm().ab[0]);
}
sizeof is an operator that return the size of a specific type. It can work directly with a type or with an expression.
This part (rs_comm*)0 is taking 0 (0 is a valid null pointer constant) and casting it to a pointer of the struct rs_comm (or the class, I don't know the definition of rs_comm, but I am guessing).
Now, it is accessing using the -> operator to the data-member ab. ab has to be define as array, so it can get the first item in the array.
Because, sizeof doesn't really evaluate the expression but just figuring out the type and get the size of it.
So, the final result is the size of the first element in the array ab for the class/struct rs_comm.
So ab is an member of struct rs_comm, and is an array.
If you have a rs_comm object, i.e. rs_comm rs;, but you don't know the type of ab, you want to know its size, sizeof(rs.ab[0]) will do.
If you have a pointer to rs_comm, i.e. rs_comm *p_rs;, then sizeof(p_rs->ab[0]) will do the same thing.
If you don't have a rs_comm object nor a pointer to rs_comm, you can change a NULL pointer to a pointer to rs_comm, this is what ((rs_comm *)0) do.
Replace the p_rs in sizeof(p_rs->ab[0]) with ((rs_comm *)0), you get sizeof(((rs_comm *)0)->ab[0]).
sizeof(variable) .
return the size variable data type .
like
char x ;
cout << sizeof(x) ;
the result would be
I am reading a book called "Teach Yourself C in 21 Days" (I have already learned Java and C# so I am moving at a much faster pace). I was reading the chapter on pointers and the -> (arrow) operator came up without explanation. I think that it is used to call members and functions (like the equivalent of the . (dot) operator, but for pointers instead of members). But I am not entirely sure.
Could I please get an explanation and a code sample?
foo->bar is equivalent to (*foo).bar, i.e. it gets the member called bar from the struct that foo points to.
Yes, that's it.
It's just the dot version when you want to access elements of a struct/class that is a pointer instead of a reference.
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = malloc(sizeof(struct foo));
var.x = 5;
(&var)->y = 14.3;
pvar->y = 22.4;
(*pvar).x = 6;
That's it!
I'd just add to the answers the "why?".
. is standard member access operator that has a higher precedence than * pointer operator.
When you are trying to access a struct's internals and you wrote it as *foo.bar then the compiler would think to want a 'bar' element of 'foo' (which is an address in memory) and obviously that mere address does not have any members.
Thus you need to ask the compiler to first dereference whith (*foo) and then access the member element: (*foo).bar, which is a bit clumsy to write so the good folks have come up with a shorthand version: foo->bar which is sort of member access by pointer operator.
a->b is just short for (*a).b in every way (same for functions: a->b() is short for (*a).b()).
foo->bar is only shorthand for (*foo).bar. That's all there is to it.
Well I have to add something as well. Structure is a bit different than array because array is a pointer and structure is not. So be careful!
Lets say I write this useless piece of code:
#include <stdio.h>
typedef struct{
int km;
int kph;
int kg;
} car;
int main(void){
car audi = {12000, 230, 760};
car *ptr = &audi;
}
Here pointer ptr points to the address (!) of the structure variable audi but beside address structure also has a chunk of data (!)! The first member of the chunk of data has the same address than structure itself and you can get it's data by only dereferencing a pointer like this *ptr (no braces).
But If you want to acess any other member than the first one, you have to add a designator like .km, .kph, .kg which are nothing more than offsets to the base address of the chunk of data...
But because of the preceedence you can't write *ptr.kg as access operator . is evaluated before dereference operator * and you would get *(ptr.kg) which is not possible as pointer has no members! And compiler knows this and will therefore issue an error e.g.:
error: ‘ptr’ is a pointer; did you mean to use ‘->’?
printf("%d\n", *ptr.km);
Instead you use this (*ptr).kg and you force compiler to 1st dereference the pointer and enable acess to the chunk of data and 2nd you add an offset (designator) to choose the member.
Check this image I made:
But if you would have nested members this syntax would become unreadable and therefore -> was introduced. I think readability is the only justifiable reason for using it as this ptr->kg is much easier to write than (*ptr).kg.
Now let us write this differently so that you see the connection more clearly. (*ptr).kg ⟹ (*&audi).kg ⟹ audi.kg. Here I first used the fact that ptr is an "address of audi" i.e. &audi and fact that "reference" & and "dereference" * operators cancel eachother out.
struct Node {
int i;
int j;
};
struct Node a, *p = &a;
Here the to access the values of i and j we can use the variable a and the pointer p as follows: a.i, (*p).i and p->i are all the same.
Here . is a "Direct Selector" and -> is an "Indirect Selector".
I had to make a small change to Jack's program to get it to run. After declaring the struct pointer pvar, point it to the address of var. I found this solution on page 242 of Stephen Kochan's Programming in C.
#include <stdio.h>
int main()
{
struct foo
{
int x;
float y;
};
struct foo var;
struct foo* pvar;
pvar = &var;
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
Run this in vim with the following command:
:!gcc -o var var.c && ./var
Will output:
5 - 14.30
6 - 22.40
#include<stdio.h>
int main()
{
struct foo
{
int x;
float y;
} var1;
struct foo var;
struct foo* pvar;
pvar = &var1;
/* if pvar = &var; it directly
takes values stored in var, and if give
new > values like pvar->x = 6; pvar->y = 22.4;
it modifies the values of var
object..so better to give new reference. */
var.x = 5;
(&var)->y = 14.3;
printf("%i - %.02f\n", var.x, (&var)->y);
pvar->x = 6;
pvar->y = 22.4;
printf("%i - %.02f\n", pvar->x, pvar->y);
return 0;
}
The -> operator makes the code more readable than the * operator in some situations.
Such as: (quoted from the EDK II project)
typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
IN EFI_BLOCK_IO_PROTOCOL *This,
IN UINT32 MediaId,
IN EFI_LBA Lba,
IN UINTN BufferSize,
OUT VOID *Buffer
);
struct _EFI_BLOCK_IO_PROTOCOL {
///
/// The revision to which the block IO interface adheres. All future
/// revisions must be backwards compatible. If a future version is not
/// back wards compatible, it is not the same GUID.
///
UINT64 Revision;
///
/// Pointer to the EFI_BLOCK_IO_MEDIA data for this device.
///
EFI_BLOCK_IO_MEDIA *Media;
EFI_BLOCK_RESET Reset;
EFI_BLOCK_READ ReadBlocks;
EFI_BLOCK_WRITE WriteBlocks;
EFI_BLOCK_FLUSH FlushBlocks;
};
The _EFI_BLOCK_IO_PROTOCOL struct contains 4 function pointer members.
Suppose you have a variable struct _EFI_BLOCK_IO_PROTOCOL * pStruct, and you want to use the good old * operator to call it's member function pointer. You will end up with code like this:
(*pStruct).ReadBlocks(...arguments...)
But with the -> operator, you can write like this:
pStruct->ReadBlocks(...arguments...).
Which looks better?
#include<stdio.h>
struct examp{
int number;
};
struct examp a,*b=&a;`enter code here`
main()
{
a.number=5;
/* a.number,b->number,(*b).number produces same output. b->number is mostly used in linked list*/
printf("%d \n %d \n %d",a.number,b->number,(*b).number);
}
output is 5
5 5
Dot is a dereference operator and used to connect the structure variable for a particular record of structure.
Eg :
struct student
{
int s.no;
Char name [];
int age;
} s1,s2;
main()
{
s1.name;
s2.name;
}
In such way we can use a dot operator to access the structure variable
I have the following structure in C++ :
struct wrapper
{
// Param constructor
wrapper(unsigned int _id, const char* _string1, unsigned int _year,
unsigned int _value, unsigned int _usage, const char* _string2)
:
id(_id), year(_year), value(_value), usage(_usage)
{
int len = strlen(_string1);
string1 = new char[len + 1]();
strncpy(string1, _string1, len);
len = strlen(_string2);
string2 = new char[len + 1]();
strncpy(string2, _string2, len);
};
// Destructor
~wrapper()
{
if(string1 != NULL)
delete [] string1;
if(string2 != NULL)
delete [] string2;
}
// Elements
unsigned int id;
unsigned int year;
unsigned int value;
unsigned int usage;
char* string1;
char* string2;
};
In main.cpp let's say I allocate memory for one object of this structure :
wrapper* testObj = new wrapper(125600, "Hello", 2013, 300, 0, "bye bye");
Can I now delete the entire object using pointer arithmetic and a pointer that points to one of the structure elements ?
Something like this :
void* ptr = &(testObj->string2);
ptr -= 0x14;
delete (wrapper*)ptr;
I've tested myself and apparently it works but I'm not 100% sure that is equivalent to delete testObj.
Thanks.
Technically, the code like this would work (ignoring the fact that wrapper testObj should be wrapper* testObj and that the offset is not necessarily 0x14, e.g. debug builds sometimes pad the structures, and maybe some other detail I missed), but it is a horrible, horrible idea. I can't stress hard enough how horrible it is.
Instead of 0x14 you could use offsetof macro.
If you like spending nights in the company of the debugger, sure, feel free to do so.
I will assume that the reason for the question is sheer curiosity about whether it is possible to use pointer arithmetic to navigate from members to parent, and not that you would like to really do it in production code. Please tell me I am right.
Can I now delete the entire object using pointer arithmetic and a pointer that points to one of the structure elements ?
Theoretically, yes.
The pointer that you give to delete needs to have the correct value, and it doesn't really matter whether that value comes from an existing pointer variable, or by "adjusting" one in this manner.
You also need to consider the type of the pointer; if nothing else, you should cast to char* before performing your arithmetic so that you are moving in steps of single bytes. Your current code will not compile because ISO C++ forbids incrementing a pointer of type 'void*' (how big is a void?).
However, I recommend not doing this at all. Your magic number 0x14 is unreliable, given alignment and padding and the potential of your structure to change shape.
Instead, store a pointer to the actual object. Also stop with all the horrid memory mess, and use std::string. At present, your lack of copy constructor is presenting a nasty bug.
You can do this sort of thing with pointer arithmetic. Whether you should is an entirely different story. Consider this macro (I know... I know...) that will give you the base address of a structure given its type, the name of a structure member and a pointer to that member:
#define ADDRESS_FROM_MEMBER(T, member, ptr) reinterpret_cast<T*>( \
reinterpret_cast<unsigned char *>(ptr) - (ptrdiff_t)(&(reinterpret_cast<T*>(0))->member))
In the following lines of code, I need to adjust the pointer pm by an offset in bytes in one of its fields. Is there an better/easier way to do this, than incessantly casting back and forth from char * and PartitionMap * such that the pointer arithmetic still works out?
PartitionMap *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
for ( ; index > 0 ; --index)
{
pm = (PartitionMap *)(((char *)pm) + pm->partitionMapLength);
}
return pm;
For those that can't grok from the code, it's looping through variable length descriptors in a buffer that inherit from PartitionMap.
Also for those concerned, partitionMapLength always returns lengths that are supported by the system this runs on. The data I'm traversing conforms to the UDF specification.
I often use these templates for this:
template<typename T>
T *add_pointer(T *p, unsigned int n) {
return reinterpret_cast<T *>(reinterpret_cast<char *>(p) + n);
}
template<typename T>
const T *add_pointer(const T *p, unsigned int n) {
return reinterpret_cast<const T *>(reinterpret_cast<const char *>(p) + n);
}
They maintain the type, but add single bytes to them, for example:
T *x = add_pointer(x, 1); // increments x by one byte, regardless of the type of x
Casting is the only way, whether it's to a char* or intptr_t or other some such type, and then to your final type.
You can of course just keep two variables around: a char * to step through the buffer and a PartitionMap * to access it. Makes it a little clearer what's going on.
for (char *ptr = ??, pm = (PartitionMap *)ptr ; index > 0 ; --index)
{
ptr += pm->partitionMapLength;
pm = (PartitionMap *)ptr;
}
return pm;
As others have mentioned you need the casts, but you can hide the ugliness in a macro or function. However, one other thing to keep in mind is alignment requirements. On most processors you can't simply increment a pointer to a type by an arbitrary number of bytes and cast the result back into a pointer to the original type without problems accessing the struct through the new pointer due to misalignment.
One of the few architectures (even if it is about the most popular) that will let you get away with it is the x86 architecture. However, even if you're writing for Windows, you'll want to take this problem into account - Win64 does enforce alignment requirements.
So even accessing the partitionMapLength member through the pointer might crash your program.
You might be able to easily work around this problem using a compiler extension like __unaligned on Windows:
PartitionMap __unaliged *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
for ( ; index > 0 ; --index)
{
pm = (PartitionMap __unaligned *)(((char *)pm) + pm->partitionMapLength);
}
return pm;
Or you can copy the potentially unaligned data into a properly aligned struct:
PartitionMap *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
char* p = reinterpret_cast<char*>( pm);
ParititionMap tmpMap;
for ( ; index > 0 ; --index)
{
p += pm->partitionMapLength;
memcpy( &tmpMap, p, sizeof( newMap));
pm = &tmpMap;
}
// you may need a more spohisticated copy to return something useful
size_t siz = pm->partitionMapLength;
pm = reinterpret_cast<PartitionMap*>( malloc( siz));
if (pm) {
memcpy( pm, p, siz);
}
return pm;
The casting has to be done, but it makes the code nearly unreadable. For readability's sake, isolate it in a static inline function.
What is puzzling me is why you have 'partitionMapLength' in bytes?
Wouldn't it be better if it was in 'partitionMap' units since you anyway cast it?
PartitionMap *pmBase(reinterpret_cast<PartitionMap *>(partitionMaps));
PartitionMap *pm;
...
pm = pmBase + index; // just guessing about your 'index' variable here
Both C and C++ allow you to iterate through an array via pointers and ++:
#include <iostream>
int[] arry = { 0, 1, 2, 3 };
int* ptr = arry;
while (*ptr != 3) {
std::cout << *ptr << '\n';
++ptr;
}
For this to work, adding to pointers is defined to take the memory address stored in the pointer and then add the sizeof whatever the type is times the value being added. For instance, in our example ++ptr adds 1 * sizeof(int) to the memory address stored in ptr.
If you have a pointer to a type, and want to advance a particular number of bytes from that spot, the only way to do so is to cast to char* (because sizeof(char) is defined to be one).