Casting Function Pointer to Integer in C++ - c++

I have an array of unsigned integers that needs to store pointers to data and functions as well as some plain data. On the device I am working with, the size of a pointer is the same as sizeof(unsigned int). How can I cast a pointer to a function into an unsigned int? I know that this makes the code non-portable, but it is microcontroller-specific. I tried this:
stackPtr[4] = reinterpret_cast<unsigned int>(task_ptr);
but it gives me an error "invalid type conversion".
Casting it to void pointer and then to int is messy.
stackPtr[4] = reinterpret_cast<unsigned int>(static_cast<void *> (task_ptr));
Is there a clean way of doing it?
Edit - task_ptr is function pointer void task_ptr(void)
Love Barmar's answer, it takes my portability shortcoming away. Also, an array of void pointers actually makes more sense than unsigned ints. Thank you Barmar and isaach1000.
EDIT 2: Got it, my compiler is thinking large memory model so it is using 32 bit pointers not 16 bit that I was expecting (small micros with 17K total memory).

A C-style cast can fit an octagonal peg into a trapezoidal hole, so given your extremely specific target hardware and requirements, I would use that cast, possibly wrapped into a template for greater clarity.
Alternately, the double cast to void* and then int does have the advantage of making the code stand out like a sore thumb so your future maintainers know something's going on and can pay special attention.
EDIT for comment:
It appears your compiler may have a bug. The following code compiles on g++ 4.5:
#include <iostream>
int f()
{
return 0;
}
int main()
{
int value = (int)&f;
std::cout << value << std::endl;
}
EDIT2:
You may also wish to consider using the intptr_t type instead of int. It's an integral type large enough to hold a pointer.
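To make the "wrapped into a template" idea concrete, here is a minimal sketch that uses std::uintptr_t rather than int. The names fn_to_word, word_to_fn and sample_task are made up for illustration, and the whole thing assumes a target where a function pointer fits in std::uintptr_t (the static_assert checks that at compile time).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: store a function pointer in an integer word.
// Assumes function pointers are no wider than std::uintptr_t on the target.
template <typename Fn>
std::uintptr_t fn_to_word(Fn* fn) {
    static_assert(sizeof(Fn*) <= sizeof(std::uintptr_t),
                  "function pointer does not fit in uintptr_t on this target");
    return reinterpret_cast<std::uintptr_t>(fn);
}

// Hypothetical helper: recover the function pointer from the stored word.
template <typename Fn>
Fn* word_to_fn(std::uintptr_t word) {
    return reinterpret_cast<Fn*>(word);
}

// Stand-in for the asker's task function.
int sample_task() { return 7; }
```

With these helpers the store becomes something like stackPtr[4] = fn_to_word(task_ptr); provided the array elements are std::uintptr_t, and the cast is confined to one clearly named place.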

In C++ a pointer can be converted to a value of an integral type large enough to hold it. The conditionally-supported type std::intptr_t is defined such that you can convert a void* to intptr_t and back to get the original value. If void* has a size equal to or larger than function pointers on your platform then you can do the conversion in the following way.
#include <cstdint>
#include <cassert>
void foo() {}
int main() {
void (*a)() = &foo;
std::intptr_t b = reinterpret_cast<std::intptr_t>(a);
void (*c)() = reinterpret_cast<void(*)()>(b);
assert(a==c);
}

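This compiles with an ANSI C compiler, although the result of converting a function pointer to int is implementation-defined:
#include <stddef.h>
int MyFunc(void* p)
{
return 1;
}
int main()
{
int arr[2];
int (*foo)(void*); /* must match MyFunc's signature */
arr[0] = (int)(MyFunc);
foo = (int (*)(void*))(arr[0]);
arr[1] = (*foo)(NULL);
}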

Related

Why can't we use a void* to operate on the object it addresses

I am learning C++ using C++ Primer 5th edition. In particular, i read about void*. There it is written that:
We cannot use a void* to operate on the object it addresses—we don’t know that object’s type, and the type determines what operations we can perform on that object.
void*: Pointer type that can point to any nonconst type. Such pointers may not
be dereferenced.
My question is: if we're not allowed to use a void* to operate on the object it addresses, then why do we need a void*? Also, I am not sure whether the statement quoted above from C++ Primer is technically correct, because I am not able to understand what it is conveying. Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses". Can someone please provide an example to clarify what the author meant, and whether he is correct in saying so?
My question is: if we're not allowed to use a void* to operate on the object it addresses, then why do we need a void*?
It's indeed quite rare to need void* in C++. It's more common in C.
But where it's useful is type-erasure. For example, try to store an object of any type in a variable, determining the type at runtime. You'll find that hiding the type becomes essential to achieve that task.
What you may be missing is that it is possible to convert the void* back to the typed pointer afterwards (or in special cases, you can reinterpret as another pointer type), which allows you to operate on the object.
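A minimal sketch of that type-erasure idea, assuming a hand-rolled tag next to the pointer (Kind, AnyRef and describe are hypothetical names, not from any library): the void* hides the type, and the tag tells us which typed pointer to convert back to before touching the object.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag recording which type was erased.
enum class Kind { Int, Text };

// A reference to "an object of some type", decided at runtime.
struct AnyRef {
    Kind kind;
    void* ptr;  // the static type of the pointee is hidden here
};

// Convert the void* back to the typed pointer before operating on the object.
std::string describe(const AnyRef& ref) {
    switch (ref.kind) {
        case Kind::Int:
            return "int: " + std::to_string(*static_cast<int*>(ref.ptr));
        case Kind::Text:
            return "text: " + *static_cast<std::string*>(ref.ptr);
    }
    return "unknown";
}
```

Casting back to the wrong type would be undefined behaviour, which is why the tag (or some other out-of-band knowledge) has to travel with the pointer.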
Maybe some examples can help me understand what the author meant when he said that "we cannot use a void* to operate on the object it addresses"
Example:
int i;
int* int_ptr = &i;
void* void_ptr = &i;
*int_ptr = 42; // OK
*void_ptr = 42; // ill-formed
As the example demonstrates, we cannot modify the pointed int object through the pointer to void.
so since a void* has no size(as written in the answer by PMF)
Their answer is misleading or you've misunderstood. The pointer has a size. But since there is no information about the type of the pointed object, the size of the pointed object is unknown. In a way, that's part of why it can point to an object of any size.
so how can a int* on the right hand side be implicitly converted to a void*
All pointers to objects can implicitly be converted to void* because the language rules say so.
Yes, the author is right.
A pointer of type void* cannot be dereferenced, because it has no size (see note 1 below). The compiler would not know how much data it needs to read from that address if you tried to access it:
void* myData = std::malloc(1000); // Allocate some memory (note that the return type of malloc() is void*)
int value = *myData; // Error, can't dereference
int field = myData->myField; // Error, a void pointer obviously has no fields
The first example fails because the compiler doesn't know how much data to get. We need to tell it the size of the data to get:
int value = *(int*)myData; // Now fine, we have cast the pointer to int*
int value = *(char*)myData; // Fine too, but NOT the same as above!
or, to be more in the C++-world:
int value = *static_cast<int*>(myData);
int value = *static_cast<char*>(myData);
The two examples return a different result, because the first gets an integer (32 bit on most systems) from the target address, while the second only gets a single byte and then moves that to a larger variable.
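To see the difference in isolation, here is a small sketch. The helper names are made up, and it assumes a 32-bit int. Which byte comes first depends on the machine's endianness, so no single expected value is given for the second function.

```cpp
#include <cassert>

// Hypothetical helper: read a whole int from the address.
int read_as_int(void* p) {
    return *static_cast<int*>(p);
}

// Hypothetical helper: read only the first byte at the address and
// widen it to int. unsigned char avoids sign surprises.
int read_first_byte(void* p) {
    return *static_cast<unsigned char*>(p);
}
```

For an int holding 0x01020304, read_as_int gives the full value back, while read_first_byte gives 0x04 on a little-endian machine and 0x01 on a big-endian one.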
void* is still sometimes useful when the type of the data doesn't matter much, like when just copying stuff around. Functions such as memset or memcpy take void* parameters, since they don't care about the actual structure of the data (but they need to be given the size explicitly). When working in C++ (as opposed to C) you'll not use these very often, though.
1 "No size" applies to the size of the destination object, not the size of the variable containing the pointer. sizeof(void*) is perfectly valid and returns the size of a pointer variable. On practically all of today's compilers this equals the size of any other object pointer, so sizeof(void*), sizeof(int*) and sizeof(MyClass*) all yield the same value. The type of the pointer, however, defines the size of the element it points to. The compiler requires that so it knows how much data it needs to get, or, when used with + or -, how much to add or subtract to get the address of the next or previous element.
void * is basically a catch-all type. Any pointer type can be implicitly cast to void * without getting any errors. As such, it is mostly used in low level data manipulations, where all that matters is the data that some memory block contains, rather than what the data represents. On the flip side, when you have a void * pointer, it is impossible to determine directly which type it was originally. That's why you can't operate on the object it addresses.
if we try something like
typedef struct foo {
int key;
int value;
} t_foo;
void try_fill_with_zero(void *destination) {
destination->key = 0;
destination->value = 0;
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
try_fill_with_zero(foo_instance);
}
we will get a compilation error because it is impossible to determine what type void *destination was, as soon as the address gets into try_fill_with_zero. That's an example of being unable to "use a void* to operate on the object it addresses"
Typically you will see something like this:
typedef struct foo {
int key;
int value;
} t_foo;
void init_with_zero(void *destination, size_t bytes) {
unsigned char *to_fill = (unsigned char *)destination;
for (size_t i = 0; i < bytes; i++) {
to_fill[i] = 0;
}
}
int main() {
t_foo *foo_instance = malloc(sizeof(t_foo));
int test_int;
init_with_zero(foo_instance, sizeof(t_foo));
init_with_zero(&test_int, sizeof(int));
}
Here we can operate on the memory that we pass to init_with_zero represented as bytes.
You can think of void * as representing missing knowledge about the associated type of the data at this address. You may still cast it to something else and then dereference it, if you know what is behind it. Example:
int n = 5;
void * p = (void *) &n;
At this point, we have lost the type information for p, and thus the compiler does not know what to do with it. But if you know that p is the address of an integer, then you can use that information:
int * q = (int *) p;
int m = *q;
And m will be equal to n.
void is not a type like any other. There is no object of type void. Hence, there exists no way of operating on such pointers.
This is one of my favourite kinds of question, because at first I was also confused about void pointers.
As the other answers say, void * refers to a generic type of data.
A void pointer only holds the address of some kind of data or object.
It carries no other information about the object itself. At first you may ask yourself why you even need this if it can only hold an address. That's because you can still cast the pointer to a more specific type, and that's the real power:
making generic functions that work with all kinds of data.
And to be more clear let's say you want to implement generic sorting algorithm.
The sorting algorithm has basically 2 steps:
The algorithm itself.
The comparison between the objects.
Here we will also talk about function pointers.
Let's take, for example, the standard library function qsort:
void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*))
We see that it takes the next parameters:
base − This is the pointer to the first element of the array to be sorted.
nitems − This is the number of elements in the array pointed by base.
size − This is the size in bytes of each element in the array.
compar − This is the function that compares two elements.
And based on the article that I referenced above we can do something like this:
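#include <stdio.h>
#include <stdlib.h>
int values[] = { 88, 56, 100, 2, 25 };
int cmpfunc (const void * a, const void * b) {
const int x = *(const int*)a;
const int y = *(const int*)b;
return (x > y) - (x < y); // avoids the overflow that x - y can cause
}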
int main () {
int n;
printf("Before sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
qsort(values, 5, sizeof(int), cmpfunc);
printf("\nAfter sorting the list is: \n");
for( n = 0 ; n < 5; n++ ) {
printf("%d ", values[n]);
}
return(0);
}
You can define your own custom compare function to match any kind of data, even a more complex data structure such as an instance of a class you define. Let's say a Person class that has a field age, and you want to sort all Persons by age.
And that's one example where you can use void * , you can abstract this and create other use cases based on this example.
It is true that this is a C example, but since void * comes from C, I think a C example makes the most sense of its real usage. If you can understand what you can do with void *, you are good to go.
For C++ you can also check templates, templates can let you achieve a generic type for your functions / objects.
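As a rough C++ counterpart to the qsort example, here is a sketch of the Person-by-age sort using std::sort, which is a template and therefore needs no void * at all. The Person struct and sort_by_age are hypothetical, just for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type, as in the "sort all Persons by age" example.
struct Person {
    std::string name;
    int age;
};

// std::sort is a template: the comparator receives typed references,
// so no casts from void * are needed.
void sort_by_age(std::vector<Person>& people) {
    std::sort(people.begin(), people.end(),
              [](const Person& a, const Person& b) { return a.age < b.age; });
}
```

The compiler checks the comparator against the element type here, whereas with qsort a wrong cast inside the compare function goes unnoticed until runtime.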

Why can I static_cast void* to int* but not to int*&?

An API uses void* to store untyped pointer offsets. It's a bit hacky, but okay whatever.
To express my offset arithmetic, I tried doing something like this
int main ()
{
void * foo;
foo = static_cast <int *> (nullptr) + 100;
static_cast <int * &> (foo) += 100;
}
The last line fails to compile (gcc)
x.cpp:7:28: error: invalid static_cast from type ‘void*’ to type ‘int*&’
The fix is simple:
foo = static_cast <int *> (foo) + 100;
But why isn't the first one allowed?
Before you answer "because the standard says so", why does the standard say so? Is the first method somehow dangerous? Or is it just an oversight?
It's not allowed for the same reason that int i; static_cast<long &>(i) = 3L; isn't allowed.
Sure, on a lot of implementations (where int and long have the same size, representation and alignment), it could work. But the rules for which casts are valid are mostly the same for all implementations, and clearly this could never work on platforms where int and long have different sizes, meaning it'd be impossible to allow accessing one as the other on those platforms.
Historically, there have been implementations on which void * and int * have different representations.
Later, after the standard stated that accessing a void * as if it were an int * is invalid, implementations also started optimising on the assumption that valid programs do not do that:
void *f (void **ppv, int **ppi) {
void *result = *ppv;
*ppi = nullptr;
return result;
}
The implementation is allowed to optimise this to
void *f (void **ppv, int **ppi) {
*ppi = nullptr;
return *ppv;
}
and such optimisations, when they reduce code size or increase efficiency, are commonplace nowadays. If f were allowed to be called as void *pv = &pv; f (&pv, &static_cast<int*&>(pv));, this optimisation would be invalid. Because such optimisations have proved useful, the rules are unlikely to change.
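If the goal is just to keep the offset arithmetic readable, the "convert, add, assign back" fix from the question can be wrapped in a small helper. This is only a sketch; advance_as is a made-up name. It reads the void * as a value and writes a new value back, so the void * object is never accessed as if it were an int *.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: advance an untyped pointer by `count` elements of T.
// The void* is converted by value, adjusted, and converted back on assignment.
template <typename T>
void advance_as(void*& p, std::ptrdiff_t count) {
    p = static_cast<T*>(p) + count;
}
```

Calling advance_as<int>(foo, 100) then reads much like the += the question wanted, as long as foo really points into an array of int.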

Portable tagged pointers

Is there a portable way to implement a tagged pointer in C/C++, like some documented macros that work across platforms and compilers? Or when you tag your pointers you are at your own peril? If such helper functions/macros exist, are they part of any standard or just are available as open source libraries?
Just for those who do not know what tagged pointer is but are interested, it is a way to store some extra data inside a normal pointer, because on most architectures some bits in pointers are always 0 or 1, so you keep your flags/types/hints in those extra bits, and just erase them right before you want to use pointer to dereference some actual value.
#include <stdint.h>
const uintptr_t gc_flag = 1;
const uintptr_t flag_mask = 7; // aka 0b00000000000111, because on some theoretical CPU under some arbitrary OS compiled with some random compiler and using some particular malloc last three bits are always zero in pointers.
struct value {
void *data;
};
struct value val;
val.data = (void *)((uintptr_t)&data | gc_flag); // set the flag; data is some suitably aligned int
int result = *(int *)((uintptr_t)val.data & ~flag_mask); // clear the flag bits before dereferencing
https://en.wikipedia.org/wiki/Pointer_tagging
You can get the lowest N bits of an address for your personal use by guaranteeing that the objects are aligned to multiples of 1 << N. This can be achieved platform-independently by different ways (alignas and aligned_storage for stack-based objects or std::aligned_alloc for dynamic objects), depending on what you want to achieve:
struct Data { ... };
alignas(1 << 4) Data d; // 4-bits, 16-byte alignment
assert(reinterpret_cast<std::uintptr_t>(&d) % 16 == 0);
// dynamic (preferably with a unique_ptr or alike)
void* ptr = std::aligned_alloc(1 << 4, sizeof(Data));
auto obj = new (ptr) Data;
...
obj->~Data();
std::free(ptr);
You pay by throwing away a lot of memory, growing exponentially with the number of bits required. Also, if you plan to allocate many such objects contiguously, even a comparatively small array won't fit in the processor's cache, possibly slowing down the program considerably. This solution therefore does not scale.
If you're sure that the addresses you are passing around always have certain bits unused, then you could use uintptr_t as a transport type. This is an integer type that maps to pointers in the expected way (and will fail to exist on an obscure platform that offers no such possible map).
There aren't any standard macros but you can roll your own easily enough. The code (sans macros) might look like:
void T_func(uintptr_t t)
{
uint8_t tag = (t & 7);
T *ptr = (T *)(t & ~(uintptr_t)7);
// ...
}
int main()
{
T *ptr = new T;
assert( ((uintptr_t)ptr % 8) == 0 );
T_func( (uintptr_t)ptr + 3 );
}
This may defeat compiler optimizations that involve tracking pointer usage.
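A slightly fuller sketch of the roll-your-own approach, with the packing and unpacking split into helpers. All names here (Node, kTagMask, pack, pointer_of, tag_of) are made up for illustration. It assumes std::uintptr_t exists on the platform and that converting the masked integer back yields the original pointer, which is implementation-defined but holds on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kTagMask = 7;  // low three bits carry the tag

// alignas(8) guarantees the low three address bits of every Node are zero.
struct alignas(8) Node {
    int payload;
};

// Combine pointer and tag into one word.
std::uintptr_t pack(Node* p, unsigned tag) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & kTagMask) == 0);  // pointer must be 8-byte aligned
    assert(tag <= kTagMask);         // tag must fit in three bits
    return bits | tag;
}

// Strip the tag bits before using the word as a pointer again.
Node* pointer_of(std::uintptr_t word) {
    return reinterpret_cast<Node*>(word & ~kTagMask);
}

unsigned tag_of(std::uintptr_t word) {
    return static_cast<unsigned>(word & kTagMask);
}
```

Keeping the packed value as an integer rather than a pointer type means it cannot be dereferenced by accident with the tag bits still set.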
Well, GCC at least can compute the size of bit-fields, so you can get portability across platforms (I don't have an MSVC available to test with). You can use this to pack the pointer and tag into an intptr_t, and intptr_t is guaranteed to be able to hold a pointer.
#include <limits.h>
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <inttypes.h>
struct tagged_ptr
{
intptr_t ptr : (sizeof(intptr_t)*CHAR_BIT-3);
intptr_t tag : 3;
};
int main(int argc, char *argv[])
{
struct tagged_ptr p;
p.tag = 3;
p.ptr = (intptr_t)argv[0];
printf("sizeof(p): %zu <---WTF MinGW!\n", sizeof p);
printf("sizeof(p): %lu\n", (unsigned long int)sizeof p);
printf("sizeof(void *): %u\n", (unsigned int)sizeof (void *));
printf("argv[0]: %p\n", argv[0]);
printf("p.tag: %" PRIxPTR "\n", p.tag);
printf("p.ptr: %" PRIxPTR "\n", p.ptr);
printf("(void *)*(intptr_t*)&p: %p\n", (void *)*(intptr_t *)&p);
}
Gives:
$ ./tag.exe
sizeof(p): zu <---WTF MinGW!
sizeof(p): 8
sizeof(void *): 8
argv[0]: 00000000007613B0
p.tag: 3
p.ptr: 7613b0
(void *)*(intptr_t*)&p: 60000000007613B0
I've put the tag at the top, but changing the order of the struct would put it at the bottom. Then shifting the pointer-to-be-stored right by 3 would implement the OP's use case. Probably make macros for access to make it easier.
I also kinda like the struct because you can't accidentally dereference it as if it were a plain pointer.

Why will this not dynamic_cast?

I am trying to learn about some of the C++ features and coded up a little test. However, when I try to compile, I get the following error (below). Why is this happening and what is the correct way to do it? I'm trying to cast a 32 bit pointer to an 8 bit pointer and print out the contents after the conversion.
cast3.cpp:22: error: cannot dynamic_cast 'bigptr' (of type 'uint32_t*') to type 'uint8_t*' (target is not pointer or reference to class)
Code:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void castme(uint8_t small[], int size);
int main(void)
{
uint8_t *small;
uint32_t big = 0x01234567;
uint32_t *bigptr = &big;
small = dynamic_cast<uint8_t *>(bigptr); // Line 22
castme(small, sizeof(big));
return 0;
}
void castme(uint8_t small[], int size)
{
for (int i = 0; i < size; i++)
{
printf("0x%x\n", small[i]);
}
}
dynamic_cast only works on classes with virtual member functions. To cast raw pointer types between each other, you need reinterpret_cast.
You are using the wrong cast. dynamic_cast only works with polymorphic class types, as it performs RTTI lookups at runtime. You are not using polymorphic class types in your code. To simply treat one pointer type as another pointer type, you need to use reinterpret_cast instead:
small = reinterpret_cast<uint8_t *>(bigptr);
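For contrast, here is a minimal sketch of the situation dynamic_cast is meant for: a class hierarchy with at least one virtual function, where the check happens at runtime. Shape, Circle, Square and is_circle are hypothetical names.

```cpp
#include <cassert>

// A polymorphic base: the virtual destructor is what enables RTTI lookups.
struct Shape {
    virtual ~Shape() = default;
};
struct Circle : Shape {
    double radius = 1.0;
};
struct Square : Shape {
    double side = 2.0;
};

// dynamic_cast returns nullptr when the object is not actually a Circle.
bool is_circle(Shape* s) {
    return dynamic_cast<Circle*>(s) != nullptr;
}
```

uint32_t and uint8_t have no such runtime type information, which is why the compiler rejects the cast in the question.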
To make the code compile, you can do the following:
small = reinterpret_cast<uint8_t *>(bigptr);
but I wouldn't do that. You should probably dereference the uint32_t pointer and then convert to the type you desire; there's no point casting to a uint8_t pointer, in my mind.
ie.
uint8_t small_one = *bigptr;
Use reinterpret_cast. dynamic_cast has a different purpose.
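If the aim is just to look at the individual bytes, another option is to copy them out with std::memcpy instead of casting the pointer. A small sketch follows (bytes_of is a made-up name). The order of the bytes depends on the machine's endianness.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a uint32_t
// into an array of bytes, in memory order.
std::array<std::uint8_t, sizeof(std::uint32_t)> bytes_of(std::uint32_t value) {
    std::array<std::uint8_t, sizeof(std::uint32_t)> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}
```

For 0x01234567 a little-endian machine gives 0x67 as the first byte and 0x01 as the last; a big-endian machine gives the reverse.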

Returning an array ... rather a reference or pointer to an array

I am a bit confused. There are two ways to return an array from a method. The first suggests the following:
typedef int arrT[10];
arrT *func(int i);
However, how do I capture the return which is an int (*)[]?
Another way is through a reference or pointer:
int (*func(int i)[10];
or
int (&func(int i)[10];
The return types are either int (*)[] or int (&)[].
The trouble I am having is how to declare a variable that accepts the returned pointer; I continue to get errors such as:
can't convert int* to int (*)[]
Any idea what I am doing wrong or what is lacking in my knowledge?
If you want to return an array by value, put it in a structure.
The Standard committee already did that, and thus you can use std::array<int,10>.
std::array<int,10> func(int i);
std::array<int,10> x = func(77);
This makes it very straightforward to return by reference also:
std::array<int,10>& func2(int i);
std::array<int,10>& y = func2(5);
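The declarations above have no bodies, so here is a sketch of what the definitions might look like. make_filled and shared_table are made-up names; the by-reference version returns a function-local static so that the reference stays valid after the call.

```cpp
#include <array>
#include <cassert>

// Return by value: the whole array is copied (or moved) out to the caller.
std::array<int, 10> make_filled(int value) {
    std::array<int, 10> result{};
    result.fill(value);
    return result;
}

// Return by reference: the referenced array must outlive the call,
// so a function-local static is used here.
std::array<int, 10>& shared_table() {
    static std::array<int, 10> table{};
    return table;
}
```

Returning a reference to a non-static local array would leave the caller with a dangling reference.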
First, the information you give is incorrect.
You write,
“There are two ways to return an array from a method”
and then you give as examples of the ways
typedef int arrT[10];
arrT *func(int i);
and
int (*func(int i))[10];
(I’ve added the missing right parenthesis), where you say that this latter way, in contrast to the first, is an example of
“through a reference or pointer”
Well, these two declarations mean exactly the same, to wit:
typedef int A[10];
A* fp1( int i ) { return 0; }
int (*fp2( int i ))[10] { return 0; }
int main()
{
int (*p1)[10] = fp1( 100 );
int (*p2)[10] = fp2( 200 );
}
In both cases a pointer to the array is returned, and this pointer is typed as "pointer to array". Dereferencing that pointer yields the array itself, which decays to a pointer to itself again, but now typed as "pointer to item". It’s a pointer to the first item of the array. At the machine code level these two pointers are, in practice, exactly the same. Coming from a Pascal background that confused me for a long time, but the upshot is, since it’s generally impractical to carry the array size along in the type (which precludes dealing with arrays of different runtime sizes), most array handling code deals with the pointer-to-first-item instead of the pointer-to-the-whole-array.
I.e., normally such a low-level, C-like function would be declared as just
int* func()
returning a pointer to the first item of an array whose size is established at run time.
Now, if you want to return an array by value then you have two choices:
Returning a fixed size array by value: put it in a struct.
The standard already provides a templated class that does this, std::array.
Returning a variable size array by value: use a class that deals with copying.
The standard already provides a templated class that does this, std::vector.
For example,
#include <vector>
using namespace std;
vector<int> foo() { return vector<int>( 10 ); }
int main()
{
vector<int> const v = foo();
// ...
}
This is the most general. Using std::array is more of an optimization for special cases. As a beginner, keep in mind Donald Knuth’s advice: “Premature optimization is the root of all evil.” I.e., just use std::vector unless there is a really really good reason to use std::array.
using arrT10 = int[10]; // Or you can use typedef if you want
arrT10 * func(int i)
{
arrT10 a10;
return &a10;
// int a[10];
// return a; // ERROR: can't convert int* to int (*)[]
}
This will give you a warning because func returns the address of a local variable, so we should NEVER write code like this, but I'm sure this code can help you see how the types fit together.
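A variant without the dangling pointer, as a sketch: give the array static storage so it outlives the call. squares and squares_ref are made-up names, and the caller-side declarations in the note below show how to capture both return types.

```cpp
#include <cassert>

using arrT10 = int[10];

// Returns a pointer to the whole array; `static` keeps it alive after return.
arrT10* squares() {
    static arrT10 table = {};
    for (int i = 0; i < 10; ++i) {
        table[i] = i * i;
    }
    return &table;
}

// Same array, returned by reference instead of by pointer.
arrT10& squares_ref() {
    return *squares();
}
```

On the calling side, int (*p)[10] = squares(); captures the pointer and int (&r)[10] = squares_ref(); captures the reference; in both cases the size 10 stays part of the type.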