I have this old C code that my compiler is warning me about (old C-style cast).
volatile uint32_t* map;
void** argForSomeAPIfunction = (void**)&map;
How can I convert this to C++ cast style? I need to convert volatile uint32_t** to void**.
The reason I need this is that a closed-source vendor API expects me to pass a void** pointer, and I cannot change that function signature. On the other hand, the vendor told me that I can access this map of registers, which are always 32-bit, using a pointer to uint32_t, and he provided a working C-style example that used (void**)&map. I tried this C example and it works fine.
Side note: This vendor API communicates with a PCIe card via a custom kernel module. There are two options to read the internal memory of the PCIe card: a safe API function that uses "ioctl" and built-in checks, or unsupervised access to this 32-bit register map, which is initialized by calling API::InitializeRegisterMap((void**)&map);. This is done only once, at the very beginning of the program. After that, I access the memory directly using map[offset], where offset goes in steps of 4 bytes, instead of the supervised function API::ReadRegister(offset), which in some cases is slower and critically delays the data acquisition (while the card is doing other tasks). The computer does not change the register contents, it just reads them. The external independent card can change the register contents at any time, which is why the keyword volatile was introduced in the vendor's example, I believe.
If this is for initialization, then I guess what it does is fill in a void *, not anything else, i.e. as if in C++ you'd have a parameter of type void *&.
I believe it is intended to be used as
void *ptr;
if (API::InitializeRegisterMap(&ptr)) {
...
}
then afterwards you will take the value in ptr and convert that value to volatile uint32_t *map:
void *ptr;
volatile std::uint32_t *map;
if (API::InitializeRegisterMap(&ptr)) {
map = static_cast<std::uint32_t *>(ptr);
}
Most notably in C, as opposed to C++, this can be written completely castless, which should be considered a good sign.
Also, you want to avoid reinterpret_cast like the plague.
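A minimal sketch of that pattern, assuming the vendor function only writes a pointer value into the void* it is handed. fake_InitializeRegisterMap and fake_registers are hypothetical stand-ins for the closed-source API, not its real behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the vendor's register block.
static std::uint32_t fake_registers[4] = {0x11u, 0x22u, 0x33u, 0x44u};

// Hypothetical stand-in for API::InitializeRegisterMap: it only stores an
// address in the caller's void*.
bool fake_InitializeRegisterMap(void** out) {
    if (out == nullptr) return false;
    *out = fake_registers;
    return true;
}

// Obtain the map without ever casting the address of a uint32_t pointer.
volatile std::uint32_t* open_register_map() {
    void* ptr = nullptr;
    if (!fake_InitializeRegisterMap(&ptr)) return nullptr;
    // void* -> object pointer is a well-defined static_cast.
    return static_cast<volatile std::uint32_t*>(ptr);
}
```

The only cast left is a static_cast on the pointer value itself, so no reinterpret_cast or const_cast is needed.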
After several trial and error steps, I noticed that this cannot be done with a simple cast step, you need to concatenate two:
void** argForSomeAPIfunction = reinterpret_cast<void**>(const_cast<uint32_t**>(&map));
The inner cast is needed to remove the volatile qualifier; the outer one converts uint32_t** to void**.
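The chained cast can be wrapped in a small helper, sketched below. The name as_void_pp is hypothetical. The sketch only shows that the conversion compiles and preserves the address; whether the callee may then write a void* through it relies on the platform giving all object pointers the same representation.

```cpp
#include <cassert>
#include <cstdint>

// Remove volatile with const_cast, then reinterpret the pointer-to-pointer.
void** as_void_pp(volatile std::uint32_t** p) {
    return reinterpret_cast<void**>(const_cast<std::uint32_t**>(p));
}
```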
Pointer types like int*, char*, and float* point to different types. But I have heard that pointers are simply implemented as links to other addresses - then how is this link associated with a type that the compiler can match with the type of the linked address (the variable at this location)?
Types are mostly compile-time things in C++. A variable's type is used at compile time to determine what the operations (in other C++ code) do on that variable.
So a variable bob of type int* when you ++ it, maps at runtime to a generic pointer-sized integer being increased by sizeof(int).
To a certain extent this is a lie; C++'s behavior is specified in terms of an abstract machine, not a concrete one. The compiler interprets your code as expressing operations on that abstract machine (that doesn't exist), then writes concrete assembly code that realizes those operations (insofar as they are defined) on concrete hardware.
In that abstract machine, int* and double* are not just numbers. If you dereference an int* and write to some memory, then do the same with a double*, and the memory overlaps, in the abstract machine the result is undefined behavior.
In the concrete implementation of that abstract machine, pointers-as-numbers as int* or double* dereferenced with the same address results in quite well defined behavior.
This difference is important. The compiler is free to assume the abstract machine (where int* and double* are very distinct things) is the only reality that matters. So if you write to an int*, write to a double*, then read back from the int*, the compiler can skip the read back, because it can prove that in the abstract machine writing to a double* cannot change the value that an int* points to.
So
int buf[10]={0};
int* a = &buf[0];
double* d = reinterpret_cast<double*>(&buf[0]);
*a = 77;
*d = 3.14;
std::cout << *a;
the apparent read at std::cout << *a can be skipped by the compiler. Meanwhile, if it actually happened on real hardware, it would read bits generated by the *d write.
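When the goal really is to look at the bits of one type as another, copying the bytes avoids the aliasing problem described above. A small sketch, assuming a 32-bit IEEE 754 float (checked with static_assert); float_bits is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 float");

// Read the object representation of a float without aliasing it as another type.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Because no pointer of the wrong type is ever dereferenced, the compiler has no license to drop or reorder the access.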
When reasoning about C++ you have to think of 3 things at once; what happens at compile time, the abstract machine behavior, and the concrete implementation of your code. In two of these (compile time and abstract machine) int* is implemented differently than float*. At actual runtime, int* and float* are both going to be 64 or 32 bit integers in a register or in memory somewhere.
Type checking is done at compile time. The error happens then, or never, excluding cases of RTTI (runtime type information).
RTTI is things like dynamic_cast, which does not work on pointers to primitives like float* or int*.
At compile time that variable carries with it the fact it is a int* everywhere it goes. In the abstract machine, ditto. In the concrete compiled output, it has forgotten it is an int*.
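The point that ++ on a typed pointer becomes a plain address increase of sizeof(T) can be checked with a short sketch; stride_in_bytes is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Byte distance covered by one ++ on a T*.
template <typename T>
std::ptrdiff_t stride_in_bytes() {
    T storage[2] = {};
    T* p = &storage[0];
    T* q = p;
    ++q;  // the compiler scales this step by sizeof(T)
    return reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p);
}
```

The scaling factor is chosen at compile time from the static type; nothing stored in the pointer at runtime records it.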
There's no particular "link" at this stage, nor any hidden meta-data stored somewhere. Since C and C++ are compiled and eventually produce a standalone executable, the compiler "trusts" the programmer and simply provides him with a data type that represents a memory address.
If there's nothing explicitly defined at this address, you can use a void * pointer. If you know that this will be the location of something in particular, you can qualify it with a certain data type like int * or char *. The compiler will therefore be able to directly access the object that lies behind it, but the way this address is stored remains the same in every case and keeps the same format.
Note that this qualification is done at compilation time only. It disappears entirely from the final executable code. This means that the generated code will be produced to handle certain kinds of objects, but nothing will tell you which ones at first if you disassemble the machine code. You'll have to figure this out by yourself.
Variables represent data which is stored in one or more memory cells or "bytes". The compiler will associate this group of bytes with a name and a type when the variable is defined.
The hardware uses a binary number to access a memory cell. This is known as the "address" of the memory cell.
When you store some data in a variable, the compiler will look up the name of the variable and check that the data you want to store is compatible with its type. If it is, it then generates code which will save it in the memory cell(s) at that address.
Since this address is a number, it can itself be stored in a variable. The type of this address variable will be "pointer to T", where T is the type of the data stored in that address.
It is the responsibility of the programmer to make sure that this address variable does correspond to valid data and not some random area of memory. The compiler will not check this for you.
I am working on a system that passes data from one "location" to another. The passing of the data is to be sent as a data block where the sending mechanism knows nothing of its contents but the end points do.
Generally for this type of application I store my data as a block of uint8_t (unsigned 8-bit integers) i.e. bytes.
In one scenario the end points may store int16_t data elements and I will have to use something like:
pWord = reinterpret_cast<int16_t *>(pData);
In another scenario it could be:
pMyClass = reinterpret_cast<MyClass *>(pData);
Note: uint8_t *pData;
I have seen others use void* as the "generic" data block, or even char*. In my mind a uint8_t is the most basic kind of element and is the obvious choice, but I am wondering if there is any advantage in using other data types like void*. For example, are there conversion rules for void* such that you can use static_cast instead of reinterpret_cast?
There are some subtle differences. A void * pointer can not be used to access the contents of the data directly. It will ALWAYS have to be cast [perhaps implicitly by passing it to another function, such as memcpy, fread, or similar] into a different type. Whether this is "important" or not is a different question. For most intents and purposes, it is not particularly important.
In C++ there are really not that many places where you (should) need to do this, since templates and inheritance in various combinations allow you to do similar things with proper type safety.
From a strict C++ standard perspective, you can't arbitrarily cast "any data" to int16_t. This may well work with some compilers on some types of processors, but on others it may fail (e.g. some processors are picky about addresses being aligned, and if you have data of type char or uint8_t that you are converting to int16_t, the compiler is not obliged to make sure it is aligned correctly for use in the latter form; this is just one scenario, and there are many others where it could go wrong). The language does define accessing the bytes of an object of any type through char or unsigned char (which uint8_t usually is), so that direction is safe.
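One way to sidestep the alignment and aliasing concerns is to copy the bytes instead of casting the pointer. A minimal sketch; load_i16 and store_i16 are hypothetical helper names, and both use the host byte order, so sender and receiver are assumed to agree on endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy an int16_t out of a byte buffer at any offset, aligned or not.
std::int16_t load_i16(const std::uint8_t* data, std::size_t offset) {
    std::int16_t value;
    std::memcpy(&value, data + offset, sizeof value);
    return value;
}

// Copy an int16_t into a byte buffer using the host byte order.
void store_i16(std::uint8_t* data, std::size_t offset, std::int16_t value) {
    std::memcpy(data + offset, &value, sizeof value);
}
```

Compilers typically turn such a fixed-size memcpy into a single load or store where the hardware allows it.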
Passing an input buffer using char* or void* is a C idiom rather than a C++ one (C++ provides other mechanisms). Here are a few points as to why you would use one or the other:
void* is used to pass a buffer of RAW data - you always need to do a cast to interpret the data
you may choose void* instead of char* because of conversion convenience: void* converts to any other object pointer type (implicitly in C, with static_cast in C++), whereas converting char* to struct myType* needs an explicit cast (reinterpret_cast in C++)
you use char* when you want a straightforward way of specifying the size of a buffer: void foo(char* array, int size) will give you an immediate idea of the size of the buffer.
There may be other reasons for using one or the other but the main thing is that most of these reasons are useful in C
In principle, char* or uint8_t* implies memory that can be read or written at the granularity of bytes, which is not always true. That's why void* is best used for untyped memory.
For example, the VRAM of the Gameboy Advance can only be used with 16-bit read/writes.
I have read that converting a function pointer to a data pointer and vice versa works on most platforms but is not guaranteed to work. Why is this the case? Shouldn't both be simply addresses into main memory and therefore be compatible?
An architecture doesn't have to store code and data in the same memory. With a Harvard architecture, code and data are stored in completely different memory. Most architectures are Von Neumann architectures with code and data in the same memory but C doesn't limit itself to only certain types of architectures if at all possible.
Some computers have (had) separate address spaces for code and data. On such hardware it just doesn't work.
The language is designed not only for current desktop applications, but to allow it to be implemented on a large set of hardware.
It seems like the C language committee never intended void* to be a pointer to function, they just wanted a generic pointer to objects.
The C99 Rationale says:
6.3.2.3 Pointers
C has now been implemented on a wide range of architectures. While some of these
architectures feature uniform pointers which are the size of some integer type, maximally
portable code cannot assume any necessary correspondence between different pointer types and the integer types. On some implementations, pointers can even be wider than any integer type.
The use of void* (“pointer to void”) as a generic object pointer type is an invention of the C89 Committee. Adoption of this type was stimulated by the desire to specify function prototype arguments that either quietly convert arbitrary pointers (as in fread) or complain if the argument type does not exactly match (as in strcmp). Nothing is said about pointers to functions, which may be incommensurate with object pointers and/or integers.
Note Nothing is said about pointers to functions in the last paragraph. They might be different from other pointers, and the committee is aware of that.
For those who remember MS-DOS, Windows 3.1 and older the answer is quite easy. All of these used to support several different memory models, with varying combinations of characteristics for code and data pointers.
So for instance for the Compact model (small code, large data):
sizeof(void *) > sizeof(void(*)())
and conversely in the Medium model (large code, small data):
sizeof(void *) < sizeof(void(*)())
In this case you didn't have separate storage for code and data but still couldn't convert between the two pointers (short of using non-standard __near and __far modifiers).
Additionally there's no guarantee that even if the pointers are the same size, that they point to the same thing - in the DOS Small memory model, both code and data used near pointers, but they pointed to different segments. So converting a function pointer to a data pointer wouldn't give you a pointer that had any relationship to the function at all, and hence there was no use for such a conversion.
Pointers to void are supposed to be able to accommodate a pointer to any kind of data -- but not necessarily a pointer to a function. Some systems have different requirements for pointers to functions than pointers to data (e.g, there are DSPs with different addressing for data vs. code, medium model on MS-DOS used 32-bit pointers for code but only 16-bit pointers for data).
In addition to what is already said here, it is interesting to look at POSIX dlsym():
The ISO C standard does not require that pointers to functions can be cast back and forth to pointers to data. Indeed, the ISO C standard does not require that an object of type void * can hold a pointer to a function. Implementations supporting the XSI extension, however, do require that an object of type void * can hold a pointer to a function. The result of converting a pointer to a function into a pointer to another data type (except void *) is still undefined, however. Note that compilers conforming to the ISO C standard are required to generate a warning if a conversion from a void * pointer to a function pointer is attempted as in:
fptr = (int (*)(int))dlsym(handle, "my_function");
Due to the problem noted here, a future version may either add a new function to return function pointers, or the current interface may be deprecated in favor of two new functions: one that returns data pointers and the other that returns function pointers.
C++11 has a solution to the long-standing mismatch between C/C++ and POSIX with regard to dlsym(). One can use reinterpret_cast to convert a function pointer to/from a data pointer so long as the implementation supports this feature.
From the standard, 5.2.10 para. 8, "converting a function pointer to an object pointer type or vice versa is conditionally-supported." 1.3.5 defines "conditionally-supported" as a "program construct that an implementation is not required to support".
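A small sketch of that conditionally-supported conversion. It compiles only on implementations that choose to support it (mainstream desktop compilers are assumed to). The names twice, IntFn, to_data_pointer and to_function_pointer are made up for the illustration.

```cpp
#include <cassert>

int twice(int x) { return 2 * x; }

using IntFn = int (*)(int);

// Conditionally-supported: accepted only where the implementation allows it.
void* to_data_pointer(IntFn f) { return reinterpret_cast<void*>(f); }
IntFn to_function_pointer(void* p) { return reinterpret_cast<IntFn>(p); }
```

Where the conversion is supported, converting to void* and back to the original function pointer type yields the original pointer value.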
Depending on the target architecture, code and data may be stored in fundamentally incompatible, physically distinct areas of memory.
undefined doesn't necessarily mean not allowed, it can mean that the compiler implementor has more freedom to do it how they want.
For instance it may not be possible on some architectures - undefined allows them to still have a conforming 'C' library even if you can't do this.
Another solution:
Assuming POSIX guarantees function and data pointers to have the same size and representation (I can't find the text for this, but the example OP cited suggests they at least intended to make this requirement), the following should work:
double (*cosine)(double);
void *tmp;
handle = dlopen("libm.so", RTLD_LAZY);
tmp = dlsym(handle, "cos");
memcpy(&cosine, &tmp, sizeof cosine);
This avoids violating the aliasing rules by going through the char [] representation, which is allowed to alias all types.
Yet another approach:
union {
double (*fptr)(double);
void *dptr;
} u;
u.dptr = dlsym(handle, "cos");
cosine = u.fptr;
But I would recommend the memcpy approach if you want absolutely 100% correct C.
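The memcpy variant can be tried without loading a library, as in the following sketch. fake_dlsym is a hypothetical stand-in for dlsym(), and the static_assert states the same-size assumption that the approach depends on.

```cpp
#include <cassert>
#include <cstring>

double half(double x) { return x / 2.0; }

using DoubleFn = double (*)(double);

static_assert(sizeof(DoubleFn) == sizeof(void*),
              "sketch assumes function and data pointers have the same size");

// Stand-in for dlsym(): hands back a function's address as a void*.
void* fake_dlsym() {
    DoubleFn f = &half;
    void* p;
    std::memcpy(&p, &f, sizeof p);
    return p;
}

// Recover the function pointer by copying its representation, not by casting.
DoubleFn symbol_as_function(void* sym) {
    DoubleFn f;
    std::memcpy(&f, &sym, sizeof f);
    return f;
}
```

On a platform where the two pointer kinds differ in size, the static_assert fails at compile time instead of producing a truncated pointer.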
They can be different types with different space requirements. Assigning to one can irreversibly slice the value of the pointer so that assigning back results in something different.
I believe they can be different types because the standard doesn't want to limit possible implementations that save space when it's not needed or when the size could cause the CPU to have to do extra crap to use it, etc...
The only truly portable solution is not to use dlsym for functions, and instead use dlsym to obtain a pointer to data that contains function pointers. For example, in your library:
struct module foo_module = {
.create = create_func,
.destroy = destroy_func,
.write = write_func,
/* ... */
};
and then in your application:
struct module *foo = dlsym(handle, "foo_module");
foo->create(/*...*/);
/* ... */
Incidentally, this is good design practice anyway, and makes it easy to support both dynamic loading via dlopen and static linking all modules on systems that don't support dynamic linking, or where the user/system integrator does not want to use dynamic linking.
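The table-of-function-pointers idea can be sketched without a real shared library. Here fake_lookup is a hypothetical stand-in for dlsym(handle, "foo_module"), and the struct is named Module with two made-up entry points.

```cpp
#include <cassert>

// Table of entry points exported by a module as a single data object.
struct Module {
    int (*create)(int);
    int (*destroy)(int);
};

static int create_func(int id) { return id + 1; }
static int destroy_func(int id) { return id - 1; }

Module foo_module = {create_func, destroy_func};

// Hypothetical stand-in for dlsym(handle, "foo_module"): returns a data pointer.
void* fake_lookup() { return &foo_module; }

// Only a data pointer is converted; the function pointers are read as members.
Module* load_module() { return static_cast<Module*>(fake_lookup()); }
```

No function pointer ever passes through void*, so the conversion stays within what the language guarantees for object pointers.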
A modern example of where function pointers can differ in size from data pointers: C++ class member function pointers
Directly quoted from https://blogs.msdn.microsoft.com/oldnewthing/20040209-00/?p=40713/
class Base1 { int b1; void Base1Method(); };
class Base2 { int b2; void Base2Method(); };
class Derived : public Base1, Base2 { int d; void DerivedMethod(); };
There are now two possible this pointers.
A pointer to a member function of Base1 can be used as a pointer to a
member function of Derived, since they both use the same this
pointer. But a pointer to a member function of Base2 cannot be used
as-is as a pointer to a member function of Derived, since the this
pointer needs to be adjusted.
There are many ways of solving this. Here's how the Visual Studio
compiler decides to handle it:
A pointer to a member function of a multiply-inherited class is really
a structure.
[Address of function]
[Adjustor]
The size of a pointer-to-member-function of a class that uses multiple inheritance is the size of a pointer plus the size of a size_t.
tl;dr: When using multiple inheritance, a pointer to a member function may (depending on compiler, version, architecture, etc) actually be stored as
struct {
void * func;
size_t offset;
}
which is obviously larger than a void *.
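The sizes can be inspected directly. The sketch below only exposes them as constants; the actual numbers depend on the compiler and ABI, so none are claimed here.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 { int b1; void Base1Method() {} };
struct Base2 { int b2; void Base2Method() {} };
struct Derived : Base1, Base2 { int d; void DerivedMethod() {} };

// Both sizes are implementation-specific.
constexpr std::size_t data_pointer_size = sizeof(void*);
constexpr std::size_t member_function_pointer_size = sizeof(void (Derived::*)());
```

Printing the two constants on a given toolchain shows whether a pointer to member function of Derived would fit in a void* there.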
On most architectures, pointers to all normal data types have the same representation, so casting between data pointer types is a no-op.
However, it's conceivable that function pointers might require a different representation, perhaps they're larger than other pointers. If void* could hold function pointers, this would mean that void*'s representation would have to be the larger size. And all casts of data pointers to/from void* would have to perform this extra copy.
As someone mentioned, if you need this you can achieve it using a union. But most uses of void* are just for data, so it would be onerous to increase all their memory use just in case a function pointer needs to be stored.
I know that this hasn't been commented on since 2012, but I thought it would be useful to add that I do know an architecture that has very incompatible pointers for data and functions since a call on that architecture checks privilege and carries extra information. No amount of casting will help. It's The Mill.
i'm pretty new to box2d and i'm trying to use the userdata (of type void*) field in the b2body object to store an int value (an enum value, so i know the type of the object).
right now i'm doing something like this:
int number = 1023;
void* data = (void*)(&number);
int rNumber = *(int*)data;
and i get the value correctly, but as i've been reading around, casting to void* is not portable or recommended... is my code cross-platform? is its behavior defined or implementation dependent?
Thanks!
Casting to void * is portable. It is not recommended because you are losing type safety. Anything can be put into a void * and anything can be gotten out. It makes it easier to shoot yourself in the foot. Otherwise void * is fine as long as you are careful and extra cautious.
You are actually not casting int to void*, you cast int* to void*, which is totally different.
A pointer to any type can be stored in a void*, and be cast back again to the same type. That is guaranteed to work.
Casting to any other type is not portable, as the language standard doesn't say that different pointers have to be the same size, or work the same way. Just that void* has to be wide enough to contain them.
One of the problems with void* is that you need to know (keep track of) what type it originally was in order to cast it properly. If it originally was a float and you cast it to an int, the compiler would just take your word for it.
To avoid this you could create a wrapper for your data which contains the type, that way you will be able to always cast it to the right type.
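Such a wrapper might look like the following sketch. TaggedData, Kind and get_int are hypothetical names, and only an int accessor is shown.

```cpp
#include <cassert>

enum class Kind { Int, Float };

// Hypothetical wrapper: records what the void* really points at.
struct TaggedData {
    Kind kind;
    void* payload;
};

// Returns true and writes the value only if the tag says the payload is an int.
bool get_int(const TaggedData& d, int& out) {
    if (d.kind != Kind::Int) return false;
    out = *static_cast<int*>(d.payload);
    return true;
}
```

The tag turns a silent misinterpretation into a checked failure, at the cost of keeping the tag in sync with the payload by hand.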
Edit: you should also make a habit of using C++ casting style instead of C i.e. reinterpret_cast
void * is something of a relic of the past (ISO C), but very convenient. You can use it safely as long as you are careful to cast back to the same type you started from. Consider alternatives such as the C++ class system or function overloading.
Either way, C++ gives you better cast operators. Sometimes there is no way around void*, and at other times it is just too convenient.
It can lead to non-portable code, not because of the casting system, but because some people are tempted to do non-portable operations with it. The biggest problem lies in the fact that void* is as big as a memory address, which on many platforms also happens to be the size of the platform's integers.
However, on some platforms sizeof(void*) != sizeof(int)
If you try to do some type of operations/magic with them without casting back to the type you want, you might have problems. You might be surprised at how many times I have seen people wanting to store an integer in a void* pointer
To answer your question, yes, it's safe to do.
To answer the question you didn't ask, that void pointer isn't meant for keeping an int in. Box2D has that pointer for you to point back to an Entity object in your game engine, so you can associate an Entity in your game with a b2Body in the physics simulation. It makes it easier to program how your entities interact with one another when one b2Body interacts with another.
So you shouldn't just be putting an enum in that void*. You should be pointing it directly to the game object represented by that b2body, which could have an enum in it.
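A rough sketch of that arrangement follows. Entity, EntityType and FakeBody are hypothetical; FakeBody merely imitates a physics body with a void* user-data slot and is not the Box2D API.

```cpp
#include <cassert>

enum class EntityType { Player, Wall };

// Hypothetical game-side object; the type tag lives inside it.
struct Entity {
    EntityType type;
    int health;
};

// Hypothetical stand-in for a physics body with a void* user-data slot.
struct FakeBody {
    void* userData = nullptr;
};

// Recover the Entity that was attached to a body.
Entity* entity_of(const FakeBody& body) {
    return static_cast<Entity*>(body.userData);
}
```

The slot always holds the same pointer type, so the cast back is always to Entity*, and the enum is reached through the object.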
I've often used pointers to const objects, like so...
const int *p;
That simply means that you can't change the integer that p is pointing at through p. But I've also seen reference to const pointers, declared like this...
int* const p;
As I understand it, that means that the pointer variable itself is constant -- you can change the integer it points at all day long, but you can't make it point at something else.
What possible use would that have?
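For reference, the two declarations behave as in this short sketch; the commented-out lines are the ones a compiler rejects. demo_const_positions is a name made up for the illustration.

```cpp
#include <cassert>

int demo_const_positions() {
    int a = 1;
    int b = 2;

    const int* p = &a;  // pointer to const int: the pointee is read-only via p
    p = &b;             // OK: p itself may be re-pointed
    // *p = 5;          // error: cannot modify the int through p

    int* const q = &a;  // const pointer to int: q always points at a
    *q = 10;            // OK: the pointee may change
    // q = &b;          // error: q itself cannot be re-pointed

    return *p + *q;     // b + a, i.e. 2 + 10
}
```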
When you're designing C programs for embedded systems, or special purpose programs that need to refer to the same memory (multi-processor applications sharing memory) then you need constant pointers.
For instance, I have a 32-bit MIPS processor that has a little LCD attached to it. I have to write my LCD data to a specific port in memory, which then gets sent to the LCD controller.
I could #define that number, but then I also have to cast it as a pointer, and the C compiler doesn't have as many options when I do that.
Further, I might need it to be volatile, which can also be cast, but it's easier and clearer to use the syntax provided - a const pointer to a volatile memory location.
For PC programs, an example would be: If you design DOS VGA games (there are tutorials online which are fun to go through to learn basic low level graphics) then you need to write to the VGA memory, which might be referenced as an offset from a const pointer.
-Adam
It allows you to protect the pointer from being changed. This means you can protect assumptions you make based on the pointer never changing or from unintentional modification, for example:
int* const p = &i;
...
p++; /* Compiler error, oops you meant */
(*p)++; /* Increment the number */
another example:
if you know where it was initialized, you can avoid future NULL checks.
The compiler guarantees you that the pointer never changed (to NULL)…
In any non-const C++ member function, the this pointer is of type C * const, where C is the class type -- you can change what it points to (i.e. its members), but you can't change it to point to a different instance of a C. For const member functions, this is of type const C * const. There are also (rarely encountered) volatile and const volatile member functions, for which this also has the volatile qualifier.
One use is in low-level (device driver or embedded) code where you need to reference a specific address that's mapped to an input/output device like a hardware pin. Some languages allow you to link variables at specific addresses (e.g. Ada has use at). In C the most idiomatic way to do this is to declare a constant pointer. Note that such usages should also have the volatile qualifier.
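A sketch of the const-pointer-to-volatile idiom. Since a real device address is not available here, fake_lcd_port is an ordinary variable standing in for the register; real code would take the address from the hardware documentation.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a device register; real code would use a documented address.
static std::uint32_t fake_lcd_port = 0;

// Const pointer to volatile data: the address is fixed, the contents are not.
volatile std::uint32_t* const lcd_port = &fake_lcd_port;

void lcd_write(std::uint32_t value) {
    *lcd_port = value;      // volatile: the store is not optimized away
    // lcd_port = nullptr;  // error: the pointer itself is const
}

std::uint32_t lcd_read() { return *lcd_port; }
```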
Other times it's just defensive coding. If you have a pointer that shouldn't change it's wise to declare it such that it cannot change. This will allow the compiler (and lint tools) to detect erroneous attempts to modify it.
I've always used them when I wanted to avoid unintended modification to the pointer (such as pointer arithmetic, or inside a function). You can also use them for Singleton patterns.
'this' is a hardcoded constant pointer.
Same as a "const int" ... if the compiler knows it's not going to change, it can make optimization assumptions based on that.
struct MyClass
{
char* const ptr;
MyClass(char* str) :ptr(str) {}
void SomeFunc(MyOtherClass moc)
{
for(int i=0; i < 100; ++i)
{
printf("%c", ptr[i]);
moc.SomeOtherFunc(this);
}
}
};
Now, the compiler could do quite a bit to optimize that loop --- provided it knows that SomeOtherFunc() does not change the value of ptr. With the const, the compiler knows that, and can make the assumptions. Without it, the compiler has to assume that SomeOtherFunc will change ptr.
I have seen some OLE code where there was an object passed in from outside the code and to work with it, you had to access the specific memory that it passed in. So we used const pointers to make sure that functions always manipulated the values that came in through the OLE interface.
Several good reasons have been given as answers to this question (memory-mapped devices and just plain old defensive coding), but I'd be willing to bet that most instances where you see this it's actually an error and that the intent was to have the item be a pointer-to-const.
I certainly have no data to back up this hunch, but I'd still make the bet.
Think of type* and const type* as types themselves. Then, you can see why you might want to have a const of those types.
always think of a pointer as an int. this means that
object* var;
actually can be thought of as
int var;
so, a const pointer simply means that:
object* const var;
becomes
const int var;
and hence you can't change the address that the pointer points to, and that's all. To prevent data change, you must make it a pointer to a const object.
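Which position of const makes the pointer itself constant can be checked with type traits. In this sketch, object is an empty placeholder type.

```cpp
#include <cassert>
#include <type_traits>

struct object {};

// const applies to the pointee here, so the pointer type itself is not const.
constexpr bool pointer_to_const_is_const = std::is_const<const object*>::value;

// const applies to the pointer here, matching the "const int" analogy.
constexpr bool const_pointer_is_const = std::is_const<object* const>::value;
```

Only object* const corresponds to const int in the analogy; const object* leaves the pointer free to be re-pointed.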