How to Advance void * pointer? - c++

In C++ I had:
MallocMetadata *tmp = static_cast<MallocMetadata *> (p);
But now I want tmp to be 5 bytes before in memory so I tried:
MallocMetadata *tmp = static_cast<MallocMetadata *> (p-5);
But that didn't compile. I then read some articles that suggested the following, which didn't work either:
MallocMetadata *tmp = static_cast<MallocMetadata *> (static_cast<char *> (p) - 5);
How can I fix this? Please note: I am sure that the memory location is valid, and I want tmp to be of type MallocMetadata* so that I can use it later.

You can use reinterpret_cast to convert a pointer of a type other than void* to another pointer type.
MallocMetadata *tmp = reinterpret_cast<MallocMetadata *> (static_cast<char *> (p) - 5);
Another option is to cast the char* back to void* after the subtraction, and then static_cast the result.
MallocMetadata *tmp = static_cast<MallocMetadata *> (static_cast<void *> (static_cast<char *> (p) - 5));
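The second variant can be wrapped in a small helper. This is only a sketch: the MallocMetadata definition below is a stand-in (the question does not show the real one), and offset_bytes and header_before are hypothetical names. It uses sizeof(MallocMetadata) as the offset rather than 5, so that the header stays properly aligned.

```cpp
#include <cstddef>
#include <new>

// Stand-in for the asker's type; the real definition is not shown in the question.
struct MallocMetadata {
    std::size_t size;
    bool is_free;
};

// Move a void* by n bytes (n may be negative). The arithmetic is done on char*,
// because void* has no element size. The result converts back to void* implicitly.
inline void* offset_bytes(void* p, std::ptrdiff_t n) {
    return static_cast<char*>(p) + n;
}

// Recover the header that sits `offset` bytes before the user pointer.
// Only valid if a MallocMetadata object really lives at that address.
inline MallocMetadata* header_before(void* user_ptr, std::ptrdiff_t offset) {
    return static_cast<MallocMetadata*>(offset_bytes(user_ptr, -offset));
}
```

Because offset_bytes returns void*, the final static_cast compiles, which is exactly what the direct char* to MallocMetadata* static_cast in the question could not do.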

C++ How to Advance void * pointer?
It is not possible to advance a void*.
Advancing a pointer by one modifies the pointer to point to the next sibling of the previously pointed object within an array of objects. The distance between two elements of an array differs between objects of different types. The distance is exactly the same as the size of the object.
Thus to advance a pointer, it is necessary to know the size of the pointed object. void* can point to an object of any size, and there is no way to get information about that size from the pointer.
What you can do instead is static cast void* to the dynamic type of the pointed object. The size of the pointed object is then known by virtue of knowing the type of the pointer, as long as the type is complete. You can then use pointer arithmetic to advance the converted pointer to a sibling of the pointed object.
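As a rough illustration of that paragraph, here is a sketch with a hypothetical helper, advance_as. It assumes the caller knows the real element type and that the pointer really points into an array of that type.

```cpp
#include <cstddef>

// Advance a type-erased pointer by `count` elements of type T.
// Assumption: `p` really points into an array of T; otherwise the result is meaningless.
template <typename T>
void* advance_as(void* p, std::ptrdiff_t count) {
    T* typed = static_cast<T*>(p);  // restore the element type
    return typed + count;           // moves by count * sizeof(T) bytes
}
```

For an int array a, advance_as<int>(a, 2) yields the address of a[2]; the step size comes entirely from the type you pass in.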
But now I want tmp to be 5 bytes before in memory
Before we proceed any further, I want to make it clear that this is an unsafe thing to attempt, and you must know the language rules in detail to have even a remote chance of doing this correctly. I urge you to consider whether doing this is necessary.
To get a pointer to the memory address 5 bytes before, you can static_cast void* to unsigned char* and do pointer arithmetic on the converted pointer:
static_cast<unsigned char*>(p) - 5
MallocMetadata *tmp = static_cast<MallocMetadata *> (static_cast<char *> (p) - 5);
char* isn't static-castable to arbitrary object pointer types. If the memory address is properly aligned and ((the address contains an object of similar type) or (MallocMetadata is a trivial type and the address doesn't contain an object of another type and you're going to write to the address and not read, thereby creating a new object)), then you can use reinterpret_cast instead:
MallocMetadata *tmp = reinterpret_cast<MallocMetadata*>(
static_cast<char*>(p) - 5
);
A full example:
// preparation
std::size_t offset = 5;
// extra bytes so that p_mm + offset still points into the allocation
std::size_t padding = sizeof(MallocMetadata) >= offset
? 0
: offset - sizeof(MallocMetadata);
auto align = static_cast<std::align_val_t>(alignof(MallocMetadata));
void* p_storage = ::operator new(sizeof(MallocMetadata) + padding, align);
MallocMetadata* p_mm = new (p_storage) MallocMetadata{};
void* p = reinterpret_cast<char*>(p_mm) + offset;
// same as above
MallocMetadata *tmp = reinterpret_cast<MallocMetadata*>(
static_cast<char*>(p) - offset
);
// cleanup
tmp->~MallocMetadata();
// an aligned allocation must be released with the matching aligned overload
::operator delete(tmp, align);

I don't know what you'll make of this, but I'll try:
The standard requires void * and character pointers to have the same representation and alignment. This falls out of C, where historically you had character pointer types in the places where you now have void *.
If you have a void *, actually a void * and not some other type of pointer, and you wanted to advance it a byte at a time, you should be able to create a reference-to-pointer-to-unsigned-character bound to the pointer-to-void, as in:
auto &ucp = reinterpret_cast<unsigned char *&>(void_pointer);
And now it should be possible to manipulate void_pointer through operations on the ucp reference.
So ++ucp will advance it, and therefore void_pointer, by one.

Related

Why can't we use a void* to operate on the object it addresses

I am learning C++ using C++ Primer 5th edition. In particular, I read about void*. There it is written that:
We cannot use a void* to operate on the object it addresses—we don’t know that object’s type, and the type determines what operations we can perform on that object.
void*: Pointer type that can point to any nonconst type. Such pointers may not
be dereferenced.
My question is: if we're not allowed to use a void* to operate on the object it addresses, then why do we need a void*? Also, I am not sure whether the above quoted statement from C++ Primer is technically correct, because I am not able to understand what it is conveying. Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses". So can someone please provide some examples to clarify what the author meant, and whether he is correct or incorrect in saying the above statement?
My question is that if we're not allowed to use a void* to operate on the object it addresses then why do we need a void*
It's indeed quite rare to need void* in C++. It's more common in C.
But where it's useful is type-erasure. For example, try to store an object of any type in a variable, determining the type at runtime. You'll find that hiding the type becomes essential to achieve that task.
What you may be missing is that it is possible to convert the void* back to the typed pointer afterwards (or in special cases, you can reinterpret as another pointer type), which allows you to operate on the object.
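A common shape this takes is a C-style callback that carries an opaque void* for the caller. The sketch below is an illustration only; Visitor, for_each_value, Sum and add_to_sum are all made-up names.

```cpp
#include <cstddef>

// A C-style callback interface: the generic code only ever sees an opaque void*.
using Visitor = void (*)(int value, void* user_data);

inline void for_each_value(const int* values, std::size_t n,
                           Visitor visit, void* user_data) {
    for (std::size_t i = 0; i < n; ++i) {
        visit(values[i], user_data);  // user_data is passed through untouched
    }
}

// The caller knows the real type behind user_data and converts it back.
struct Sum {
    long total = 0;
};

inline void add_to_sum(int value, void* user_data) {
    Sum* sum = static_cast<Sum*>(user_data);  // void* back to the typed pointer
    sum->total += value;
}
```

for_each_value never needs to know what Sum is; only add_to_sum, which converts the pointer back, operates on the object.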
Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses"
Example:
int i;
int* int_ptr = &i;
void* void_ptr = &i;
*int_ptr = 42; // OK
*void_ptr = 42; // ill-formed
As the example demonstrates, we cannot modify the pointed int object through the pointer to void.
so since a void* has no size(as written in the answer by PMF)
Their answer is misleading or you've misunderstood. The pointer has a size. But since there is no information about the type of the pointed object, the size of the pointed object is unknown. In a way, that's part of why it can point to an object of any size.
so how can a int* on the right hand side be implicitly converted to a void*
All pointers to objects can implicitly be converted to void* because the language rules say so.
Yes, the author is right.
A pointer of type void* cannot be dereferenced, because the object it points to has no known size [1]. The compiler would not know how much data it needs to read from that address if you tried to access it:
void* myData = std::malloc(1000); // Allocate some memory (note that the return type of malloc() is void*)
int value = *myData; // Error, can't dereference
int field = myData->myField; // Error, a void pointer obviously has no fields
The first example fails because the compiler doesn't know how much data to get. We need to tell it the size of the data to get:
int value = *(int*)myData; // Now fine, we have casted the pointer to int*
int value = *(char*)myData; // Fine too, but NOT the same as above!
or, to be more in the C++-world:
int value = *static_cast<int*>(myData);
int value = *static_cast<char*>(myData);
The two examples return a different result, because the first gets an integer (32 bit on most systems) from the target address, while the second only gets a single byte and then moves that to a larger variable.
The reason why the use of void* is sometimes still useful is when the type of data doesn't matter much, like when just copying stuff around. Methods such as memset or memcpy take void* parameters, since they don't care about the actual structure of the data (but they need to be given the size explicitly). When working in C++ (as opposed to C) you'll not use these very often, though.
[1] "No size" applies to the size of the destination object, not the size of the variable containing the pointer. sizeof(void*) is perfectly valid and returns the size of a pointer variable. On most of today's platforms this equals the size of any other object pointer, so sizeof(void*)==sizeof(int*)==sizeof(MyClass*) usually holds, although the standard does not guarantee it. The type of the pointer, however, defines the size of the element it points to. The compiler needs that to know how much data to read or, when the pointer is used with + or -, how much to add or subtract to reach the next or previous element.
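To make the memcpy/memset point concrete, here is a sketch of a function in the same spirit. swap_bytes and Point are hypothetical names, and the function is only meant for trivially copyable types.

```cpp
#include <cstddef>

struct Point {
    int x;
    int y;
};

// Swap the contents of two objects byte by byte. Like memcpy, the function has
// no idea what the bytes mean, so the caller must supply the size explicitly.
// Assumption: both objects are trivially copyable and `bytes` long.
inline void swap_bytes(void* a, void* b, std::size_t bytes) {
    unsigned char* pa = static_cast<unsigned char*>(a);
    unsigned char* pb = static_cast<unsigned char*>(b);
    for (std::size_t i = 0; i < bytes; ++i) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

Calling swap_bytes(&p, &q, sizeof(Point)) exchanges two Point objects; the same function works unchanged for any other trivially copyable type.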
void * is basically a catch-all type. Any object pointer type can be implicitly converted to void * without any errors. As such, it is mostly used in low-level data manipulation, where all that matters is the data that some memory block contains, rather than what the data represents. On the flip side, when you have a void * pointer, it is impossible to determine directly which type it was originally. That's why you can't operate on the object it addresses.
if we try something like
typedef struct foo {
int key;
int value;
} t_foo;
void try_fill_with_zero(void *destination) {
destination->key = 0;
destination->value = 0;
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
try_fill_with_zero(foo_instance);
}
we will get a compilation error because it is impossible to determine what type void *destination was, as soon as the address gets into try_fill_with_zero. That's an example of being unable to "use a void* to operate on the object it addresses"
Typically you will see something like this:
typedef struct foo {
int key;
int value;
} t_foo;
void init_with_zero(void *destination, size_t bytes) {
unsigned char *to_fill = (unsigned char *)destination;
for (size_t i = 0; i < bytes; i++) {
to_fill[i] = 0;
}
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
int test_int;
init_with_zero(foo_instance, sizeof(t_foo));
init_with_zero(&test_int, sizeof(int));
}
Here we can operate on the memory that we pass to init_with_zero represented as bytes.
You can think of void * as representing missing knowledge about the associated type of the data at this address. You may still cast it to something else and then dereference it, if you know what is behind it. Example:
int n = 5;
void * p = (void *) &n;
At this point, we have lost the type information for p, and thus the compiler does not know what to do with it. But if you know that p is the address of an integer, then you can use that information:
int * q = (int *) p;
int m = *q;
And m will be equal to n.
void is not a type like any other. There is no object of type void. Hence, there exists no way of operating on such pointers.
This is one of my favourite kinds of question, because at first I was also very confused about void pointers.
As the other answers say, void * refers to a generic type of data.
A void pointer only holds the address of some data or object, and no other information about the object itself.
At first you may ask yourself why you would even need this if it can only hold an address. The answer is that you can still cast the pointer to a more specific type, and that's the real power:
making generic functions that work with all kinds of data.
And to be more clear let's say you want to implement generic sorting algorithm.
The sorting algorithm has basically 2 steps:
The algorithm itself.
The comparison between the objects.
Here we will also talk about function pointers.
Let's take the built-in qsort function as an example:
void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*))
We see that it takes the next parameters:
base − This is the pointer to the first element of the array to be sorted.
nitems − This is the number of elements in the array pointed by base.
size − This is the size in bytes of each element in the array.
compar − This is the function that compares two elements.
And based on the article that I referenced above we can do something like this:
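#include <stdio.h>
#include <stdlib.h>
int values[] = { 88, 56, 100, 2, 25 };
int cmpfunc (const void * a, const void * b) {
int x = *(const int*)a;
int y = *(const int*)b;
return (x > y) - (x < y); /* avoids the overflow that x - y can cause */
}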
int main () {
int n;
printf("Before sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
qsort(values, 5, sizeof(int), cmpfunc);
printf("\nAfter sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
return(0);
}
Here you can define your own compare function to match any kind of data, even a more complex data structure such as an instance of a class you define. Say you have a Person class with an age field, and you want to sort all Persons by age.
That's one example where you can use void *; you can generalize it to other use cases.
It is true that this is a C example, but since void * originated in C, it shows the real usage of void * more clearly. If you understand what you can do with void *, you are good to go.
For C++ you can also look at templates, which let you write functions and classes that are generic over types.
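For comparison, here is roughly what the "sort Persons by age" idea looks like with std::sort, which is a template. Person and sort_by_age are hypothetical names used only for this sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical type for illustration, matching the "Person with an age" idea above.
struct Person {
    std::string name;
    int age;
};

// Type-safe counterpart to the qsort example: no void*, no casts.
// The comparison receives real Person objects because std::sort is a template.
inline void sort_by_age(std::vector<Person>& people) {
    std::sort(people.begin(), people.end(),
              [](const Person& a, const Person& b) { return a.age < b.age; });
}
```

Unlike qsort, this also works for types that are not safe to move around byte by byte, such as Person with its std::string member.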

Difference between creating a pointer with 'new' and without 'new' apart from memory allocation?

What is the difference between these pointers?
I know that this one is going to be stored on the heap, even though a pointer is only 8 bytes anyways, so the memory is not important for me.
int* aa = new int;
aa = nullptr;
and this one is going to be stored on the stack.
int* bb = nullptr;
They both seem to work the same in my program. Is there any difference apart from memory allocation? I have a feeling that the second one is bad for some reason.
2) Another question which is somewhat related:
Does creating a pointer like that actually take more memory? If we take a look at the first snippet, it creates an int somewhere (4 bytes) and then creates a pointer to it (8 bytes), so is it 12 bytes in total? If yes are they both in the heap then? I can do this, so it means an int exists:
*aa = 20;
Pointers are essentially integers that indicate a memory position, together with a type (so they can only point to variables of that type).
So in your examples, all pointers are stored in the stack (unless they are global variables, but that is another question). What they are pointing to is in the heap, as in the next example.
void foo()
{
int * ptr = new int(42);
// more things...
delete ptr;
}
You can have a pointer pointing into the stack, for example, this way:
void foo()
{
int x = 5;
int * ptr = &x;
// more things...
}
The '&' operator obtains the memory position of the variable x in the example above.
nullptr is the type-safe equivalent of the old NULL. Both are a way to initialize a pointer to a known, safe value, meaning that it does not point to anything, and you can test whether a pointer is null or not.
The program will accept pointers pointing to the stack or the heap: it does not matter.
void addFive(int * x)
{
*x += 5;
}
void foo()
{
int x = 5;
int * ptr1 = &x;
int * ptr2 = new int(42);
addFive( ptr1 );
addFive( ptr2 );
addFive( &x );
printf( "%d\n", *ptr1 );
printf( "%d\n", *ptr2 );
// more things...
delete ptr2;
}
The only difference is that the runtime keeps bookkeeping structures for the memory handed out from the heap, so storing variables in the heap comes at a cost in performance. On the other hand, the stack is typically limited to a fixed, relatively small amount of memory, while the heap is much larger, allowing you to store big arrays, for example.
You could take a look at C-Sim, which simulates memory in C (disclaimer: I wrote it).
Hope this helps.
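One practical difference in the question's first snippet is worth spelling out: assigning nullptr to aa right after new int throws away the only pointer to the heap int, so it can never be deleted. A sketch of two ways to avoid that, with made-up function names and assuming C++14 for std::make_unique:

```cpp
#include <memory>

// In both functions the pointer variable itself is a local (stack) object;
// only the int it points to lives on the heap.
inline int with_raw_pointer() {
    int* aa = new int;  // heap int, value not yet set
    *aa = 20;
    int result = *aa;
    delete aa;          // release the int before dropping the pointer
    aa = nullptr;       // now safe: nothing is leaked
    return result;
}

inline int with_unique_ptr() {
    std::unique_ptr<int> bb = std::make_unique<int>(20);  // C++14
    return *bb;         // the int is deleted automatically when bb goes out of scope
}
```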

Casting SomeType** to SomeType*[] and vice versa

I really need to do this specifically, as I am using SWIG and need to make a cast to match the function definition.
The function definition accepts
SomeType const * const variable_name[]
Also, another question would be-
How to allocate memory to
SomeType * variable[] = <??? using new or malloc >
for x entries?
Edit:
I have searched quite a lot, but I keep stumbling onto posts which allocate memory for SomeType** using new SomeType*[x], i.e.
SomeType** variable = new SomeType*[x];
Can you please tell me a way to do this?
The function wants an array of pointers.
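The statement:
SomeType * variable[];
is not valid as a local variable definition, because the array has no size or initializer (in a function parameter the empty brackets are allowed, and there they mean a pointer).
You will need:
SomeType * * variable;
This declares a pointer to a pointer to SomeType.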
You will need to perform memory allocation in two steps.
First, allocate the array of pointers:
variable = new SomeType * [/* some quantity */];
Remember, the above statement only allocates room for the pointers. The pointers themselves are still uninitialized.
Second, allocate the objects and store their addresses in the array.
for (unsigned int i = 0; i < some_quantity; ++i)
{
variable[i] = new SomeType;
}
When deleting, delete the contents of the array before the array:
for (unsigned int i = 0; i < some_quantity; ++i)
{
delete variable[i];
}
delete[] variable;
The function definition accepts
SomeType const * const variable_name[]
I'm no expert in C++, but if the declaration of arrays in function parameters is the same as in C then this is a synonym for the following type:
SomeType const * const * variable_name
That is, an array in a function parameter is really a pointer. Your book should have explained this early on.
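A small sketch of what that means in practice. SomeType is a placeholder here (the real type is not shown in the question) and sum_values is a made-up function with the same parameter form as the one in the question.

```cpp
#include <cstddef>

struct SomeType {
    int value;  // placeholder member, for illustration only
};

// The parameter is declared with array syntax, but it is adjusted to
// `SomeType const * const * items`, i.e. a pointer to the first element.
inline int sum_values(SomeType const * const items[], std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += items[i]->value;
    }
    return total;
}
```

A SomeType** obtained from new SomeType*[x] can be passed directly as items; adding const at both levels is an implicit conversion, so no cast is needed.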
I have searched quite a lot, but I keep stumbling into post which allocate memory to SomeType** using new SomeType*[x] i.e.
SomeType** variable = new SomeType*[x];
Can you please tell me a way to do this?
You could indeed allocate a SomeType const * const * using similar code to that, so you've answered your own question. You could also use a std::vector<SomeType const *> (the element type of a vector cannot itself be const) like so:
std::vector<SomeType const *> foo;
/* XXX: Add some items to the vector */
SomeType const * const *bar = foo.data();
This would be useful if you're not sure how many items foo should contain, and you expect it to grow. Obtain bar only after adding the items, because growing the vector may move its storage.
I don't understand why people don't read books anymore. It's the fastest and cheapest (if you consider the cost of man hours) way to learn correctly from a reputable figure.

Customized memory allocation and deletion

struct Rational
{
int a;
int b;
};
struct NextOnFreeList
{
NextOnFreeList *next;
};
// Build the linked-list
NextOnFreeList* freeList = NULL; // head of the linked-list
size_t size = (sizeof(Rational) > sizeof(NextOnFreeList *)) ? sizeof(Rational) : sizeof(NextOnFreeList *);
NextOnFreeList *runner = static_cast <NextOnFreeList *> new char [size]; // LineA
freeList = runner;
for (int i = 0; i < EXPANSION_SIZE; i++) {
runner->next = static_cast <NextOnFreeList *> new char [size];
runner = runner->next;
}
runner->next = 0;
Question 1> LineA
Since the size of Rational(i.e. 8 bytes) is larger than NextOnFreeList(i.e. 4 bytes),
each element in the Linked-list will ONLY use partial of the allocated memory. Is that correct?
// Delete the linked-list
NextOnFreeList *nextPtr = NULL;
for (nextPtr = freeList; nextPtr != NULL; nextPtr = freeList) {
freeList = freeList->next;
delete [] nextPtr; // LineB
}
Question 2> LineB
why should we use 'delete [] nextPtr' instead of 'delete nextPtr'?
Question 3>
Rational* ptr = static_cast<Rational*>( freeList ); // LineC
ptr->a = 10;
ptr->b = 20;
Is it true by LineC, we can bring back all the allocated memory original with size of 'size' and
use the memory to store all elements inside the Rational.
Q1: Yes, but I would rather use std::allocator<Rational>::allocate (or union of both - you will avoid alignment problems this way)
Q2: It is actually bad, because you should cast it to char* first, then use delete[]. Again, using std::allocator would be better. And: It does not matter on default implementation (calls free), but forget I said that ;) ... using malloc/free directly is safer.
Q3: It is fine, but I am not sure if static_cast will allow that (reinterpret_cast or casting to void* in between may help)
EDIT: I hope your Q3 is not final, because you need to update the free list first (before using the pointer).
2nd EDIT: Links + note: hiding the free-list inside the allocator would be best for C++ (using malloc/free directly in it, or new[] / delete[])
Q1: That seems correct. In a 32 bit architecture you most likely overallocated and only a portion of the memory will be used. In a 64 bit architecture most likely the two sizes are the same.
Q2: Neither one is appropriate. You allocated that memory as char* so you must delete it as such (by casting back to char* and then delete[]) or you have undefined behavior.
Q3: Line C won't compile because the two pointee types are unrelated. If you did use reinterpret_cast you would be violating the strict aliasing rules, again causing undefined behavior.
Q.1:
Yes, but static_cast is not the best choice here; reinterpret_cast would be preferable.
Q.2:
I don't think there is any defined behavior that lets a plain delete work after your static_cast. The safest way is to treat this as the original array allocation: cast back to char* and use delete[].
Q.3 yep, yes you can.
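The "union of both" idea mentioned in the answers can be sketched as follows. This is not the asker's code: Slot and RationalPool are hypothetical names, and the pool grows one slot at a time instead of using EXPANSION_SIZE.

```cpp
#include <new>

struct Rational {
    int a;
    int b;
};

// One slot is either a link in the free list or storage for a Rational.
// The union gives every slot the size and alignment of the larger member.
union Slot {
    Slot* next;
    Rational value;
};

class RationalPool {
public:
    RationalPool() = default;
    RationalPool(const RationalPool&) = delete;
    RationalPool& operator=(const RationalPool&) = delete;

    ~RationalPool() {
        // Frees only slots that are on the free list; slots still handed out leak.
        while (free_list_ != nullptr) {
            Slot* head = free_list_;
            free_list_ = head->next;
            delete head;  // matches `new Slot` in allocate()
        }
    }

    Rational* allocate() {
        Slot* slot = free_list_;
        if (slot != nullptr) {
            free_list_ = slot->next;  // reuse a released slot
        } else {
            slot = new Slot;          // grow by one slot
        }
        // Start the lifetime of a Rational inside the slot.
        return ::new (static_cast<void*>(&slot->value)) Rational{};
    }

    void release(Rational* r) {
        // A union and its members share one address, so this cast is valid.
        Slot* slot = reinterpret_cast<Slot*>(r);
        slot->next = free_list_;      // the slot becomes a list link again
        free_list_ = slot;
    }

private:
    Slot* free_list_ = nullptr;
};
```

Because every slot is allocated and deleted as a Slot, the new char[] / delete[] mismatch from Question 2 and the unrelated-type cast from Question 3 do not arise.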

Deallocate structure using pointer arithmetics and a pointer to an element of that structure

I have the following structure in C++ :
struct wrapper
{
// Param constructor
wrapper(unsigned int _id, const char* _string1, unsigned int _year,
unsigned int _value, unsigned int _usage, const char* _string2)
:
id(_id), year(_year), value(_value), usage(_usage)
{
int len = strlen(_string1);
string1 = new char[len + 1]();
strncpy(string1, _string1, len);
len = strlen(_string2);
string2 = new char[len + 1]();
strncpy(string2, _string2, len);
};
// Destructor
~wrapper()
{
if(string1 != NULL)
delete [] string1;
if(string2 != NULL)
delete [] string2;
}
// Elements
unsigned int id;
unsigned int year;
unsigned int value;
unsigned int usage;
char* string1;
char* string2;
};
In main.cpp let's say I allocate memory for one object of this structure :
wrapper* testObj = new wrapper(125600, "Hello", 2013, 300, 0, "bye bye");
Can I now delete the entire object using pointer arithmetic and a pointer that points to one of the structure elements ?
Something like this :
void* ptr = &(testObj->string2);
ptr -= 0x14;
delete (wrapper*)ptr;
I've tested myself and apparently it works but I'm not 100% sure that is equivalent to delete testObj.
Thanks.
Technically, the code like this would work (ignoring the fact that wrapper testObj should be wrapper* testObj and that the offset is not necessarily 0x14, e.g. debug builds sometimes pad the structures, and maybe some other detail I missed), but it is a horrible, horrible idea. I can't stress hard enough how horrible it is.
Instead of 0x14 you could use the offsetof macro.
If you like spending nights in the company of the debugger, sure, feel free to do so.
I will assume that the reason for the question is sheer curiosity about whether it is possible to use pointer arithmetic to navigate from members to parent, and not that you would like to really do it in production code. Please tell me I am right.
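Purely to satisfy that curiosity, here is how the offsetof suggestion might look. Record is a hypothetical standard-layout stand-in for wrapper, and record_from_label is a made-up helper; this is the well-known "container of" idiom, shown as a sketch rather than something to put in production code.

```cpp
#include <cstddef>

// Hypothetical standard-layout struct, standing in for `wrapper`.
struct Record {
    unsigned int id;
    unsigned int year;
    char* label;
};

// Given a pointer to the `label` member, step back to the enclosing Record.
// offsetof replaces the hard-coded 0x14 from the question, so padding and
// later layout changes are accounted for by the compiler.
inline Record* record_from_label(char** label_member) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(label_member);
    return reinterpret_cast<Record*>(bytes - offsetof(Record, label));
}
```

For a Record* r, record_from_label(&r->label) yields r again, so deleting the result is equivalent to delete r. Keeping the original pointer around is still the far simpler option.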
Can I now delete the entire object using pointer arithmetic and a pointer that points to one of the structure elements ?
Theoretically, yes.
The pointer that you give to delete needs to have the correct value, and it doesn't really matter whether that value comes from an existing pointer variable, or by "adjusting" one in this manner.
You also need to consider the type of the pointer; if nothing else, you should cast to char* before performing your arithmetic so that you are moving in steps of single bytes. Your current code will not compile because ISO C++ forbids incrementing a pointer of type 'void*' (how big is a void?).
However, I recommend not doing this at all. Your magic number 0x14 is unreliable, given alignment and padding and the potential of your structure to change shape.
Instead, store a pointer to the actual object. Also stop with all the horrid memory mess, and use std::string. At present, your lack of copy constructor is presenting a nasty bug.
You can do this sort of thing with pointer arithmetic. Whether you should is an entirely different story. Consider this macro (I know... I know...) that will give you the base address of a structure given its type, the name of a structure member and a pointer to that member:
#define ADDRESS_FROM_MEMBER(T, member, ptr) reinterpret_cast<T*>( \
reinterpret_cast<unsigned char *>(ptr) - (ptrdiff_t)(&(reinterpret_cast<T*>(0))->member))