I've been running the following code through different compilers:
int main()
{
float **a;
void **b;
b = a;
}
From what I've been able to gather, void ** is not a generic pointer which means that any conversion from another pointer should not compile or at least throw a warning. However, here are my results (all done on Windows):
gcc - Throws a warning, as expected.
g++ - Throws an error, as expected (this is due to the less permissive typing of C++, right?)
MSVC (cl.exe) - Throws no warnings whatsoever, even with /Wall specified.
My question is: Am I missing something about the whole thing and is there any specific reason why MSVC does not produce a warning? MSVC does produce a warning when converting from void ** to float **.
Another thing of note: If I replace b = a with the explicit conversion b = (void **)a, none of the compilers throw a warning. I thought this should be an invalid cast, so why wouldn't there be any warnings?
The reason I am asking this question is because I was starting to learn CUDA and in the official Programming Guide (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-memory) the following code can be found:
// Allocate vectors in device memory
float* d_A;
cudaMalloc(&d_A, size);
which should perform an implicit conversion to void ** for &d_A, as the first argument of cudaMalloc is of type void **. Similar code can be found all over the documentation. Is this just sloppy work on NVIDIA's end or am I, again, missing something? Since nvcc uses MSVC, the code compiles without warnings.
Am I missing something about the whole thing and is there any specific reason why MSVC does not produce a warning? MSVC does produce a warning when converting from void ** to float **
This assignment without a cast is a constraint violation, so a standard-compliant compiler will print a warning or error. However, MSVC is not a fully compliant C implementation.
Another thing of note: If I replace b = a with the explicit conversion b = (void **)a, none of the compilers throw a warning. I thought this should be an invalid cast, so why wouldn't there be any warnings?
Pointer conversions via a cast are allowed in some situations. The C standard says the following in section 6.3.2.3p7:
A pointer to an object type may be converted to a pointer to a different object type. If the
resulting pointer is not correctly aligned for the referenced type, the behavior is
undefined. Otherwise, when converted back again, the result shall compare equal to the
original pointer. When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
So you can convert between pointer types provided there are no alignment issues and you only convert back (unless the target is a char *).
float* d_A;
cudaMalloc(&d_A, size);
...
Is this just sloppy work on NVIDIA's end or am I, again, missing something?
Presumably, this function is dereferencing the given pointer and writing the address of some allocated memory. That would mean it is trying to write to a float * as if it were a void *. This is not the same as the typical conversion to/from a void *. Strictly speaking this looks like undefined behavior, although it "works" because modern x86 processors (when not in real mode) use the same representation for all pointer types.
Related
In a recent question, someone mentioned that when printing a pointer value with printf, the caller must cast the pointer to void *, like so:
int *my_ptr = ....
printf("My pointer is: %p", (void *)my_ptr);
For the life of me I can't figure out why. I found this question, which is almost the same. The answer to that question is correct - it explains that ints and pointers are not necessarily the same length.
This is, of course, true, but when I already have a pointer, like in the case above, why should I cast from int * to void *? When is an int * different from a void *? In fact, when does (void *)my_ptr generate any machine code that's different from simply my_ptr?
UPDATE:
Multiple knowledgeable responders quoted the standard, saying passing the wrong type may result in undefined behavior. How? I expect printf("%p", (int *)ptr) and printf("%p", (void *)ptr) to generate the exact same stack-frame. When will the two calls generate different stack frames?
The %p conversion specifier requires an argument of type void *. If you don't pass an argument of type void *, the function call invokes undefined behavior.
From the C Standard (C11, 7.21.6.1p8 Formatted input/output functions):
"p - The argument shall be a pointer to void."
Pointer types in C are not required to have the same size or the same representation.
An example of an implementation with different pointer types representation is Cray PVP where the representation of pointer types is 64-bit for void * and char * but 32-bit for the other pointer types.
See "Cray C/C++ Reference Manual", Table 3. in "9.1.2.2" http://docs.cray.com/books/004-2179-003/004-2179-003-manual.pdf
In the C language all pointer types potentially differ in their representations. So, yes, int * is different from void *. A real-life platform that would illustrate this difference might be difficult (or impossible) to find, but at the conceptual level the difference is still there.
In other words, in the general case different pointer types have different representations. int * is different from void * and different from double *. The fact that your platform uses the same representation for void * and int * is nothing more than a coincidence, as far as the C language is concerned.
The language states that some pointer types are required to have identical representations, which includes void * vs. char *, pointers to different struct types or, say, int * and const int *. But these are just exceptions from the general rule.
Other people have adequately addressed the case of passing an int * to a prototyped function with a fixed number of arguments that expects a different pointer type.
printf is not such a function. It is a variadic function, so the default argument promotions are used for its anonymous arguments (i.e. everything after the format string) and if the promoted type of each argument does not exactly match the type expected by the format effector, the behavior is undefined. In particular, even if int * and void * have identical representation,
int a;
printf("%p\n", &a);
has undefined behavior.
This is because the layout of the call frame may depend on the exact concrete type of each argument. ABIs that specify different argument areas for pointer and non-pointer types have occurred in real life (e.g. the Motorola 68000 would like you to keep pointers in the address registers and non-pointers in the data registers to the maximum extent possible). I'm not aware of any real-world ABI that segregates distinct pointer types, but it's allowed and it would not surprise me to hear of one.
c11: 7.21.6 Formatted input/output functions (p8):
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
In reality except on ancient mainframes/minis, different pointer types are extremely unlikely to have different sizes. However they have different types, and per the specification for printf, calling it with the wrong type argument for the format specifier results in undefined behavior. This means don't do it.
when printing a pointer value with printf, the caller must cast the pointer to void *
Even casting to void * is not sufficient for all pointers.
C has 2 kinds of pointers: Pointers to objects and pointers to functions.
Any object pointer can convert to void* with no problem:
printf("My pointer is: %p", (void *)my_ptr); // OK when my_ptr points to an object
A conversion of a pointer to a function to void * is not defined.
Consider a system in 2021 where void * is 64-bit and a function pointer is 128 bit.
C does specify (my emphasis)
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type. C17dr § 6.3.2.3 6
To print a function pointer could attempt:
printf("My function pointer is: %ju", (uintmax_t) my_function_ptr); // Selectively OK
C lacks a truly universal pointer and lacks a clean way to print function pointers.
Addressing the question:
When will the two calls generate different stack frames?
The compiler may notice that the behaviour is undefined, and issue an exception, illegal instruction, etc. There's no requirement for the compiler to attempt to generate a stack frame, function call or whatever.
See here for an example of the compiler doing this in another case of UB. Instead of generating a dereference instruction with a null argument, it generates a ud2 illegal instruction.
When the behaviour is undefined according to the language standard, there are no requirements on the compiler's behaviour.
The behavior of my code is different in c and c++.
void *(*funcPtr)() = dlsym(some symbol..) ; // (1) works in c but not c++
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
I don't understand how the 2nd cast works in C++ while the 1st one doesn't. The error message shown for (1) in C++ is: invalid conversion from void* to void* (*)().
The problem is that dlsym returns a void* pointer. While in C, any such pointer is implicitly convertible into any other (object!) pointer type (for comparison: casting the result of malloc), this is not the case in C++ (here you need to cast the result of malloc).
For function pointers, though, this conversion is not implicit even in C. Apparently, as your code compiles, your compiler performs this implicit conversion for function pointers, too, as a compiler extension (consistent with object pointers); however, to be fully standard compliant, it should actually issue a diagnostic, e.g. a compiler warning, as mandated by C17 6.5.4.3:
Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.
But now instead of casting the target pointer, you rather should cast the result of dlsym into the appropriate function pointer:
int (*funcPtr)() = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or simpler:
auto funcPtr = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or even:
int (*funcPtr)() = reinterpret_cast<decltype(funcPtr)>(dlsym(some symbol..));
(Last one especially interesting if funcPtr has been declared previously.)
Additionally, you should prefer C++ over C style casts, see my variants above. You get more precise control over what type of cast actually occurs.
Side note: Have you noticed that the two function pointers declared in your question differ in return type (void* vs. int)? Additionally, a function not accepting any arguments in C needs to be declared as void f(void); accordingly, function pointers: void*(*f)(void). C++ allows usage of void for compatibility, while omitting void in C has a totally different meaning: the function could accept anything, and you need to know from elsewhere (documentation) how many arguments of which types can actually be passed.
Strictly speaking, none of this code is valid in either language.
void *(*funcPtr)() = dlsym(some symbol..) ;
dlsym returns type void* so this is not valid in either language. The C standard only allows for implicit conversions between void* and pointers to object type, see C17 6.3.2.3:
A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
Specifically, your code is a language violation of one of the rules of simple assignment, C17 6.5.16.1, emphasis mine:
the left operand has atomic, qualified, or unqualified pointer type, and (considering
the type the left operand would have after lvalue conversion) one operand is a pointer
to an object type, and the other is a pointer to a qualified or unqualified version of
void, and the type pointed to by the left has all the qualifiers of the type pointed to
by the right;
The reason why it might compile on certain compilers is overly lax compiler settings. If you are for example using gcc, you need to compile with gcc -std=c17 -pedantic-errors to get a language compliant compiler, otherwise it defaults to a non-standard GNU dialect (such as "gnu11" or "gnu17", depending on the gcc version).
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
Here you explicitly force the function pointer to become type void**. The cast is fine syntax-wise, but this may or may not be a valid pointer conversion on the specific compiler. Again, conversions between object pointers and function pointers are not supported by the C or C++ standard, but you are relying on non-standard extensions. Formally this code invokes undefined behavior in both C and C++.
In practice, lots of compilers have well-defined behavior here, because most systems have the same representation of object pointers and function pointers.
Given that the conversion is valid, you can of course de-reference a void** to get a void* and then assign another void* to it.
I have come across some code that looks like it is forward declaring a struct but I can not find any definition for the struct in the code base. It seems to be used as though the struct was defined. Could someone explain why the below code is valid c++?
What type is Frame? What is the size? I cannot use sizeof() as it will complain it is undefined.
I am trying to convert a similar piece of code to Visual Studio 2015 from 2010. The reinterpret_cast cast is complaining that it cannot be converted due to the fact that
'reinterpret_cast': conversion from 'unsigned long' to 'Frame *' of
greater size
#include <stdio.h>
struct Frame;
int main()
{
unsigned long currentFrame = 5;
Frame* frame = reinterpret_cast<Frame*>(currentFrame);
printf("%p", frame);
}
GCC 4.9.2 was used to compile this example.
I understand the error, but do not understand how the struct is being used. Is it defaulting to int?
The program behaviour is undefined, as a conversion from an unsigned long to Frame* where the former is set to a value not associated with a pointer value that you can set is not in accordance with one of the possibilities mentioned in http://en.cppreference.com/w/cpp/language/reinterpret_cast.
The fact that printf appears to output the address of a pointer is a manifestation of that undefined behaviour.
The fact that Frame is an incomplete type does not matter here. With the exception of nullptr, one past the address of a scalar (i.e. single object or a plain-old-data object), and one past the end of an array, the behaviour on setting a pointer type to memory you don't own is also undefined.
Since you are using Frame just as a pointer the compiler doesn't need to know anything about Frame structure itself. It's like using an opaque pointer to something without caring what's pointed.
The cast fails because unsigned long is not guaranteed to be the same size as a pointer; that depends on the operating system and data model (e.g. LLP64 vs LP64). You should consider using intptr_t from <stdint.h>, which (where the implementation provides it) is guaranteed to be able to hold any valid void * value, but I don't see why you would need to reinterpret a literal as a memory address.
In the first comment to Python C Module - Malloc fails in specific version of Python, @user694733 mentions that casting char** to void** is not valid. I read Invalid conversion from Foo** to void** - why is implicit type conversion allowed to void* but not to void**? and http://c-faq.com/ptrs/genericpp.html but there is a reference to standard, but no real example, in which case this might be incorrect, leading to errors. Thinking of e.g. void** to double** or vice versa, is there a case where it can go wrong? Why (technically, not just because it is UB)?
If that was allowed, it would create a loop hole in the type system:
T* ptr;
void **vptr = &ptr; // &ptr is of type T**
int value;
*vptr = &value; // &value is int*, can be converted to void*
At this point, ptr, which according to the type system is a pointer to T, is pointing to value, which is an int. While the language allows you to circumvent the type system, you have to request it explicitly. Implicit conversions are designed to avoid this type of issue.
but there is a reference to standard, but no real example, in which case this might be incorrect, leading to errors
This is not accurate. Page http://c-faq.com/ptrs/genericpp.html which you mentioned points to another page http://c-faq.com/null/machexamp.html which contains an example of machines with different pointer sizes for different types:
The Eclipse MV series from Data General has three architecturally supported pointer formats (word, byte, and bit pointers), two of which are used by C compilers: byte pointers for char * and void *, and word pointers for everything else. For historical reasons during the evolution of the 32-bit MV line from the 16-bit Nova line, word pointers and byte pointers had the offset, indirection, and ring protection bits in different places in the word. Passing a mismatched pointer format to a function resulted in protection faults. Eventually, the MV C compiler added many compatibility options to try to deal with code that had pointer type mismatch errors.
The biggest practical problem is with multiple inheritance. When you use a pointer to a class with multiple base classes, the actual value of the pointer will depend on the type of the pointer, and the compiler inserts fix-up code to adjust it when you assign from one pointer type to another. When you have a pointer to the pointer, the compiler no longer has the opportunity to do those fixups, so the operation is disallowed by the standard.
What would this statement yield?
void *p = malloc(sizeof(void));
Edit: An extension to the question.
If sizeof(void) yields 1 in GCC compiler, then 1 byte of memory is allocated and the pointer p points to that byte and would p++ be incremented to 0x2346? Suppose p was 0x2345. I am talking about p and not *p.
The type void has no size; that would be a compilation error. For the same reason you can't do something like:
void n;
EDIT.
To my surprise, doing sizeof(void) actually does compile in GNU C:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc -w - && ./a.out
1
However, in C++ it does not:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc++ -w - && ./a.out
<stdin>: In function 'int main()':
<stdin>:1: error: invalid application of 'sizeof' to a void type
<stdin>:1: error: 'printf' was not declared in this scope
If you are using GCC and you are not using compilation flags that remove compiler specific extensions, then sizeof(void) is 1. GCC has a nonstandard extension that does that.
In general, void is an incomplete type, and you cannot use sizeof for incomplete types.
Although void may stand in place for a type, it cannot actually hold a value. Therefore, it has no size in memory. Getting the size of a void isn’t defined.
A void pointer is simply a language construct meaning a pointer to untyped memory.
void has no size. In both C and C++, the expression sizeof (void) is invalid.
In C, quoting N1570 6.5.3.4 paragraph 1:
The sizeof operator shall not be applied to an expression that
has function type or an incomplete type, to the parenthesized name of
such a type, or to an expression that designates a bit-field member.
(N1570 is a draft of the 2011 ISO C standard.)
void is an incomplete type. This paragraph is a constraint, meaning that any conforming C compiler must diagnose any violation of it. (The diagnostic message may be a non-fatal warning.)
The C++ 11 standard has very similar wording. Both editions were published after this question was asked, but the rules go back to the 1989 ANSI C standard and the earliest C++ standards. In fact, the rule that void is an incomplete type to which sizeof may not be applied goes back exactly as far as the introduction of void into the language.
gcc has an extension that treats sizeof (void) as 1. gcc is not a conforming C compiler by default, so in its default mode it doesn't warn about sizeof (void). Extensions like this are permitted even for fully conforming C compilers, but the diagnostic is still required.
Taking the size of void is a GCC extension.
sizeof() cannot be applied to incomplete types. And void is an incomplete type that cannot be completed.
In C, sizeof(void) == 1 in GCC, but this appears to depend on your compiler.
In C++, I get:
In function 'int main()':
Line 2: error: invalid application of 'sizeof' to a void type
compilation terminated due to -Wfatal-errors.
To the 2nd part of the question: Note that sizeof(void *)!= sizeof(void).
On a 32-bit arch, sizeof(void *) is 4 bytes, but that is the size of the pointer itself, not the step used by p++. The amount by which a pointer is incremented depends on the size of the type it points to. So, with GCC's extension where sizeof(void) is 1, p will be increased by 1 byte.
while sizeof(void) perhaps makes no sense in itself, it is important when you're doing any pointer math.
eg.
void *p;
while(...)
p++;
If sizeof(void) is considered 1 then this will work.
If sizeof(void) is considered 0 then you hit an infinite loop.
Most C++ compilers chose to raise a compile error when trying to get sizeof(void).
When compiling C, gcc is not conforming and chose to define sizeof(void) as 1. It may look strange, but it has a rationale. When you do pointer arithmetic, adding or removing one unit means adding or removing the size of the pointed-to object. Thus defining sizeof(void) as 1 helps define void* as a pointer to a byte (an untyped memory address). Otherwise you would get surprising behavior from pointer arithmetic, like p+1 == p when p is a void*. Such pointer arithmetic on void pointers is not allowed in C++ but works fine when compiling C with gcc.
The standard recommended way would be to use char* for that kind of purpose (pointer to byte).
Another similar difference between C and C++ when using sizeof occurs when you defined an empty struct like:
struct Empty {
} empty;
Using gcc as my C compiler sizeof(empty) returns 0.
Using g++ the same code will return 1.
I'm not sure what the C and C++ standards state on this point, but I believe giving empty structs/objects a nonzero size helps with reference management: it prevents two references to distinct consecutive objects, the first one being empty, from getting the same address. If references are implemented using hidden pointers, as is often done, ensuring different addresses helps when comparing them.
But this merely avoids one surprising behavior (corner-case comparison of references) by introducing another one (empty objects, even PODs, consume at least 1 byte of memory).