So apparently a std::nullptr_t argument is converted to a null pointer of type void * (Section 5.2.2/7 of N3337) when passed without a parameter (via ...). This means that to properly pass a null char * pointer, for example, a cast is still needed:
some_variadic_function("a", "b", "c", (const char *) nullptr);
since there is no guarantee that a null void * has the same bit pattern as a null char *. Correct?
This also means that there is no advantage to nullptr over 0 in such cases, except perhaps for clarity.
You ask:
since there is no guarantee that a null void * has the same bit pattern as a null char *. Correct?
Well, actually, that guarantee does exist; Deduplicator's answer already shows where the standard requires this. But that is not relevant to your question.
Passing void * to a variadic function, and accessing it using va_arg as char *, is specifically allowed as a special exception.
C++11:
18.10 Other runtime support [support.runtime]
1 Headers <csetjmp> (nonlocal jumps), <csignal> (signal handling), <cstdalign> (alignment), <cstdarg> (variable arguments), <cstdbool> (__bool_true_false_are_defined), <cstdlib> (runtime environment
getenv(), system()), and <ctime> (system clock clock(), time()) provide further compatibility with C code.
2 The contents of these headers are the same as the Standard C library headers <setjmp.h>, <signal.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stdlib.h>, and <time.h>, respectively, with the following
changes:
[... nothing about va_arg]
C99:
7.15.1.1 The va_arg macro
[...] If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:
-- one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types;
-- one type is pointer to void and the other is a pointer to a character type.
However, this does mean that in other cases where two types T1 and T2 have the same representation and alignment requirements, the behaviour is undefined if T1 is passed to a variadic function, and it is retrieved as T2.
An example of this: passing (void *) 0 and accessing it as char *, is allowed, passing (void *) 0 and accessing it as unsigned char * is also allowed, but passing (char *) 0 and accessing it as unsigned char * is not allowed. If a compiler is capable of inlining calls to variadic functions, and optimises based on the strict requirements of the standard, such mismatches could break badly.
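To make the two exceptions concrete, here is a minimal sketch. The helper names next_is_null_char_ptr and next_as_unsigned are made up for illustration; they only show which pass/retrieve pairings fall under the quoted wording.

```cpp
#include <cstdarg>

// Hypothetical helper: reads one variadic argument as char* and reports
// whether it is null. The caller may pass a char* or, per the exception
// quoted above, a void*.
bool next_is_null_char_ptr(int tag, ...) {
    va_list args;
    va_start(args, tag);
    char *p = va_arg(args, char *);  // retrieved as a character pointer
    va_end(args);
    return p == nullptr;
}

// Hypothetical helper: reads one variadic argument as unsigned int.
// Passing a non-negative int is covered by the signed/unsigned exception.
unsigned next_as_unsigned(int tag, ...) {
    va_list args;
    va_start(args, tag);
    unsigned value = va_arg(args, unsigned);
    va_end(args);
    return value;
}
```

Calling next_is_null_char_ptr(0, (void *) 0) or next_as_unsigned(0, 42) stays within the exceptions; passing an unsigned char * to the first helper would not.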
This also means that there is no advantage to nullptr over 0 in such cases, except perhaps for clarity.
I would definitely not use nullptr without casting it, even though in this one special case it is valid. It is far too hard to see that it is valid. And if a cast is included anyway, (char *) 0 is just as clear as a null pointer value.
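As a rough sketch of what that looks like in practice, assume a sentinel-terminated function such as the hypothetical count_strings below (it is not from the question). The callee retrieves const char *, so the caller passes a sentinel that already has that type.

```cpp
#include <cstdarg>
#include <cstddef>

// Hypothetical variadic function: counts const char* arguments up to a
// null const char* sentinel.
std::size_t count_strings(const char *first, ...) {
    std::size_t count = 0;
    va_list args;
    va_start(args, first);
    for (const char *p = first; p != nullptr; p = va_arg(args, const char *)) {
        ++count;
    }
    va_end(args);
    return count;
}
```

A call then reads count_strings("a", "b", "c", (const char *) 0), and the reader does not have to know about the void * exception to see that it is correct.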
You are wrong. One of the few guarantees is that a char* has the same size and representation as the corresponding void*.
3.9.2 Compound Types §4
A pointer to cv-qualified (3.9.3) or cv-unqualified void can be used to point to objects of unknown type.
Such a pointer shall be able to hold any object pointer. An object of type cv void* shall have the same
representation and alignment requirements as cv char*.
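If you want to see that guarantee in code, a small sketch follows. The static_asserts only restate what the paragraph above requires, and same_representation is a hypothetical helper that I would expect to return true on ordinary implementations (it compares raw bytes, so it assumes no padding bits in pointers).

```cpp
#include <cstring>

// The standard already guarantees these; the assertions only restate it.
static_assert(sizeof(void *) == sizeof(char *), "same size");
static_assert(alignof(void *) == alignof(char *), "same alignment");

// Hypothetical helper: compares the object representations of the same
// address stored as char* and as void*.
bool same_representation(char *p) {
    void *v = p;  // implicit conversion, always allowed
    return std::memcmp(&v, &p, sizeof p) == 0;
}
```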
Edit: Looks like this answer by hvd is better, showing a few more traps specific to the variadic function part of the question.
In a recent question, someone mentioned that when printing a pointer value with printf, the caller must cast the pointer to void *, like so:
int *my_ptr = ....
printf("My pointer is: %p", (void *)my_ptr);
For the life of me I can't figure out why. I found this question, which is almost the same. The answer to that question is correct: it explains that ints and pointers are not necessarily the same length.
This is, of course, true, but when I already have a pointer, like in the case above, why should I cast from int * to void *? When is an int * different from a void *? In fact, when does (void *)my_ptr generate any machine code that's different from simply my_ptr?
UPDATE:
Multiple knowledgeable responders quoted the standard, saying passing the wrong type may result in undefined behavior. How? I expect printf("%p", (int *)ptr) and printf("%p", (void *)ptr) to generate the exact same stack-frame. When will the two calls generate different stack frames?
The %p conversion specifier requires an argument of type void *. If you don't pass an argument of type void *, the function call invokes undefined behavior.
From the C Standard (C11, 7.21.6.1p8 Formatted input/output functions):
"p - The argument shall be a pointer to void."
Pointer types in C are not required to have the same size or the same representation.
An example of an implementation with different pointer type representations is Cray PVP, where the representation of pointer types is 64-bit for void * and char * but 32-bit for the other pointer types.
See "Cray C/C++ Reference Manual", Table 3. in "9.1.2.2" http://docs.cray.com/books/004-2179-003/004-2179-003-manual.pdf
In C language all pointer types potentially differ in their representations. So, yes, int * is different from void *. A real-life platform that would illustrate this difference might be difficult (or impossible) to find, but at the conceptual level the difference is still there.
In other words, in the general case different pointer types have different representations. int * is different from void * and different from double *. The fact that your platform uses the same representation for void * and int * is nothing more than a coincidence, as far as the C language is concerned.
The language states that some pointer types are required to have identical representations, which includes void * vs. char *, pointers to different struct types or, say, int * and const int *. But these are just exceptions from the general rule.
Other people have adequately addressed the case of passing an int * to a prototyped function with a fixed number of arguments that expects a different pointer type.
printf is not such a function. It is a variadic function, so the default argument promotions are used for its anonymous arguments (i.e. everything after the format string) and if the promoted type of each argument does not exactly match the type expected by the format effector, the behavior is undefined. In particular, even if int * and void * have identical representation,
int a;
printf("%p\n", &a);
has undefined behavior.
This is because the layout of the call frame may depend on the exact concrete type of each argument. ABIs that specify different argument areas for pointer and non-pointer types have occurred in real life (e.g. the Motorola 68000 would like you to keep pointers in the address registers and non-pointers in the data registers to the maximum extent possible). I'm not aware of any real-world ABI that segregates distinct pointer types, but it's allowed and it would not surprise me to hear of one.
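The fix is a one-word cast. As a minimal sketch (format_int_address is a made-up helper, and it uses snprintf only so the result can be inspected):

```cpp
#include <cstdio>

// Hypothetical helper: formats the address held in p into buf using %p.
// The explicit cast gives the argument the void* type that %p requires.
int format_int_address(char *buf, std::size_t size, int *p) {
    return std::snprintf(buf, size, "%p", static_cast<void *>(p));
}
```

The text written to buf is implementation-defined, so nothing should be assumed about its exact form.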
C11: 7.21.6 Formatted input/output functions (p8):
p The argument shall be a pointer to void. The value of the pointer is
converted to a sequence of printing characters, in an implementation-defined
manner.
In reality except on ancient mainframes/minis, different pointer types are extremely unlikely to have different sizes. However they have different types, and per the specification for printf, calling it with the wrong type argument for the format specifier results in undefined behavior. This means don't do it.
when printing a pointer value with printf, the caller must cast the pointer to void *
Even casting to void * is not sufficient for all pointers.
C has two kinds of pointers: pointers to objects and pointers to functions.
Any object pointer can convert to void* with no problem:
printf("My pointer is: %p", (void *)my_ptr); // OK when my_ptr points to an object
A conversion of a pointer to a function to void * is not defined.
Consider a system in 2021 where void * is 64-bit and a function pointer is 128 bit.
C does specify (my emphasis)
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type. C17dr § 6.3.2.3 6
To print a function pointer, one could attempt:
printf("My function pointer is: %ju", (uintmax_t) my_function_ptr); // Selectively OK
C lacks a truly universal pointer and lacks a clean way to print function pointers.
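One workaround that avoids both conversions is to print the bytes of the pointer object itself. This is only a sketch: function_pointer_bytes and sample_function are names I made up, and the output is the in-memory byte order, not a numeric address.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

using fn_ptr = void (*)();

// Hypothetical helper: renders the object representation of a function
// pointer as hex digits, one byte at a time, without converting it to
// void* or to an integer type.
std::string function_pointer_bytes(fn_ptr fp) {
    unsigned char bytes[sizeof(fn_ptr)];
    std::memcpy(bytes, &fp, sizeof(fn_ptr));

    std::string out;
    char hex[3];
    for (unsigned char b : bytes) {
        std::snprintf(hex, sizeof hex, "%02x", static_cast<unsigned>(b));
        out += hex;
    }
    return out;
}

// Sample function whose address can be printed.
void sample_function() {}
```

Reading any object as an array of unsigned char is always allowed, which is why this works regardless of how wide a function pointer is.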
Addressing the question:
When will the two calls generate different stack frames?
The compiler may notice that the behaviour is undefined, and issue an exception, illegal instruction, etc. There's no requirement for the compiler to attempt to generate a stack frame, function call or whatever.
See here for an example of the compiler doing this in another case of UB. Instead of generating a dereference instruction with a null argument, it generates a ud2 illegal instruction.
When the behaviour is undefined according to the language standard, there are no requirements on the compiler's behaviour.
The behavior of my code is different in C and C++.
void *(*funcPtr)() = dlsym(some symbol..) ; // (1) works in c but not c++
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
I don't understand how the 2nd cast works in C++ while the 1st one doesn't. The error message shown for (1) in C++ is: invalid conversion from void* to void*().
The problem is that dlsym returns a void* pointer. While in C, any such pointer is implicitly convertible into any other (object!) pointer type (for comparison: casting the result of malloc), this is not the case in C++ (here you need to cast the result of malloc).
For function pointers, though, this conversion is not implicit even in C. Apparently, as your code compiles, your compiler added this implicit conversion for function pointers too, as a compiler extension (consistent with object pointers); however, to be fully standard compliant, it should actually issue a diagnostic, e.g. a compiler warning, as mandated by C17 6.5.4.3:
Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.
But now instead of casting the target pointer, you rather should cast the result of dlsym into the appropriate function pointer:
int (*funcPtr)() = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or simpler:
auto funcPtr = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or even:
int (*funcPtr)() = reinterpret_cast<decltype(funcPtr)>(dlsym(some symbol..));
(Last one especially interesting if funcPtr has been declared previously.)
Additionally, you should prefer C++ over C style casts, see my variants above. You get more precise control over what type of cast actually occurs.
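A compilable sketch of the pattern, with one loud assumption: lookup_symbol is a hypothetical stand-in for dlsym so that the example does not need a shared library. Like dlsym, it hands back a function address as void*, a conversion that C++ only supports conditionally (POSIX platforms do support it).

```cpp
// Function whose address the stand-in lookup hands back.
int answer() { return 42; }

// Hypothetical stand-in for dlsym: returns a function address as void*.
void *lookup_symbol(const char * /*name*/) {
    return reinterpret_cast<void *>(&answer);
}

// Convert the void* result back to the real function pointer type
// before calling through it.
int call_looked_up_symbol() {
    auto func = reinterpret_cast<int (*)()>(lookup_symbol("answer"));
    return func();
}
```

With the real dlsym you would additionally check the result against a null pointer before casting and calling.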
Side note: Have you noticed that the two function pointers declared in your question differ in return type (void* vs. int)? Additionally, a function not accepting any arguments in C needs to be declared as void f(void); accordingly, function pointers are declared as void*(*f)(void). C++ allows usage of void for compatibility, while omitting void in C has a totally different meaning: the function could accept anything, and you need to know from elsewhere (documentation) how many arguments of which types can actually be passed to it.
Strictly speaking, none of this code is valid in either language.
void *(*funcPtr)() = dlsym(some symbol..) ;
dlsym returns type void* so this is not valid in either language. The C standard only allows for implicit conversions between void* and pointers to object type, see C17 6.3.2.3:
A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
Specifically, your code is a language violation of one of the rules of simple assignment, C17 6.5.16.1, emphasis mine:
the left operand has atomic, qualified, or unqualified pointer type, and (considering
the type the left operand would have after lvalue conversion) one operand is a pointer
to an object type, and the other is a pointer to a qualified or unqualified version of
void, and the type pointed to by the left has all the qualifiers of the type pointed to
by the right;
The reason why it might compile on certain compilers is overly lax compiler settings. If you are for example using gcc, you need to compile with gcc -std=c17 -pedantic-errors to get a language compliant compiler; otherwise it defaults to a non-standard GNU dialect such as "gnu11".
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
Here you explicitly force the address of the function pointer to be treated as type void**. The cast is fine syntax-wise, but this may or may not be a valid pointer conversion on the specific compiler. Again, conversions between object pointers and function pointers are not supported by the C or C++ standard, so you are relying on non-standard extensions. Formally this code invokes undefined behavior in both C and C++.
In practice, lots of compilers have well-defined behavior here, because most systems have the same representation of object pointers and function pointers.
Given that the conversion is valid, you can of course de-reference a void** to get a void* and then assign another void* to it.
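If you want the same effect without writing through a void**, copying the bytes is one option. This is a sketch under the same assumption as above, namely that object pointers and function pointers share a representation on your platform; raw_symbol is a hypothetical stand-in for a dlsym result.

```cpp
#include <cstring>

// Function whose address plays the role of the looked-up symbol.
int seven() { return 7; }

// Hypothetical stand-in for a dlsym result.
void *raw_symbol() { return reinterpret_cast<void *>(&seven); }

// The byte copy below only makes sense if both pointer kinds have the
// same size; this is an assumption about the platform, not a guarantee.
static_assert(sizeof(void *) == sizeof(int (*)()),
              "pointer sizes must match for the byte copy");

int call_via_memcpy() {
    void *raw = raw_symbol();
    int (*func)() = nullptr;
    // Copy the representation instead of assigning through a void**.
    std::memcpy(&func, &raw, sizeof func);
    return func();
}
```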
More specifically, if I have the following function pointer type:
typedef void (*callback_type) (intptr_t context, void* buffer, size_t count);
can I safely and without "undefined behavior" do:
callback_type func_ptr = (callback_type)write;
intptr_t context = fd;
func_ptr(context, some_buffer, buffer_size);
?
Where write() is the system call (EDIT: has the signature ssize_t write(int fd, const void *buf, size_t count);, thus takes an int as the first argument), and fd is an int file descriptor. I assume the answer would be the same for C and C++, so I am tagging both.
No
That won't be portable because you are passing a parameter that will be a different size in the common LP64 paradigm.
Also, you aren't dereferencing the function pointer with the correct type, and the results of that are undefined.
Now, as you seem to have concluded, the function pointer will work as expected and the only practical problem is going to be: how will write(2) interpret the intptr_t first parameter?
And the actual run-time problem is that, on LP64, you are passing a 64-bit value to a 32-bit parameter. This might misalign the subsequent parameters. On a system with register parameters it would probably work just fine.
Let's have a look at C standard.
C11 (n1570), § 6.3.2.3 Pointers
A pointer to a function of one type may be converted to a pointer to a
function of another type and back again; the result shall compare
equal to the original pointer. If a converted pointer is used to call
a function whose type is not compatible with the referenced type, the
behavior is undefined.
C11 (n1570), § 6.7.6.3 Function declarators (including prototypes)
For two function types to be compatible, both shall specify compatible
return types. Moreover, the parameter type lists, if both are present,
shall agree in the number of parameters and in use of the ellipsis
terminator; corresponding parameters shall have compatible types.
C11 (n1570), § 6.2.7 Compatible type and composite type
Two types have compatible type if their types are the same.
Conclusion:
void (*) (intptr_t context, void* buffer, size_t count);
cannot be converted to:
void (*) (int context, void* buffer, size_t count);
The problem is not with passing the argument back and forth between functions, since automatic promotion from one integral type to another is done.
The problem is, what if intptr_t is shorter than int, so that not every value of int can be represented by an intptr_t? In such a case, some of the highest bits in the int will be truncated when converting to intptr_t, so you'll end up write()ing to an invalid file descriptor. Although that should not invoke undefined behavior, it's still erroneous.
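A way to sidestep the whole question is a small adapter whose type really is callback_type. The following is a sketch for a POSIX system; write_adapter is a name I made up, and it drops the return value of write() because the callback type returns void.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

typedef void (*callback_type)(intptr_t context, void *buffer, size_t count);

// Hypothetical adapter with exactly the callback's signature. It narrows
// the context back to the int file descriptor and forwards to write().
void write_adapter(intptr_t context, void *buffer, size_t count) {
    int fd = static_cast<int>(context);
    ssize_t written = write(fd, buffer, count);
    (void)written;  // the callback type returns void, so the result is dropped
}
```

Then callback_type func_ptr = write_adapter; needs no cast at all, and the call through func_ptr uses the function's own type.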
When printing a pointer using printf, is it necessary to cast the pointer to void *? In other words, in code like
#include <stdio.h>
int main() {
int a;
printf("address of a = %p\n", &a);
}
should the argument really be (void *) &a? gcc doesn't seem to give any warnings when no explicit cast is made.
Yes, the cast to void* is required.
int a;
printf("address of a = %p\n", &a);
&a is of type int*; printf's "%p" format requires an argument of type void*. The int* argument is not implicitly converted to void*, because the declaration of printf doesn't provide type information for parameters other than the first (the format string). All arguments after the format string have the default argument promotions applied to them; these promotions do not convert int* to void*.
The likely result is that printf sees an argument that's really of type int* and interprets it as if it were of type void*. This is type-punning, not conversion, and it has undefined behavior. It will likely happen to work if int* and void* happen to have the same representation, but the language standard does not guarantee that, even by implication. And the type-punning I described is only one possible behavior; the standard says literally nothing about what can happen.
(If you do the same thing with a non-variadic function with a visible prototype, so the compiler knows at the point of the call that the parameter is of type void*, then it will generate code to do an implicit int*-to-void* conversion. That's not the case here.)
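For contrast, a sketch of that prototyped case. print_address is a hypothetical wrapper, not a library function; the point is only that the int* is converted at the call to the wrapper, so printf receives a genuine void*.

```cpp
#include <cstdio>

// Hypothetical wrapper with a fixed parameter list. Because the parameter
// type is visible at the call site, an int* argument is converted to
// void* implicitly before printf ever sees it.
int print_address(void *p) {
    return std::printf("address = %p\n", p);
}
```

Here print_address(&a) is fine without a cast, whereas printf("%p\n", &a) is not.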
Is this a C or a C++ question? For C++, it seems that according to 5.2.2 [expr.call] paragraph 7 there isn't any implicit conversion to void*. It seems that C99's 6.5.2.2 paragraph 6 also doesn't imply any explicit promotion of pointer types. This would mean that an explicit cast to void* is required as pointer types can have different size (at least in C++): if the layout of the different pointer types isn't identical you'd end up with undefined behavior. Can someone point out where it is guaranteed that a pointer is passed with the appropriate size when using variable argument lists?
Of course, being a C++ programmer this isn't much of a problem: just don't use functions with variable number of arguments. That's not a viable approach in C, though.
I think it might be necessary to cast. Are we certain that the size of pointers is always the same? I'm sure I read on stackoverflow recently that the size (or maybe just the alignment?) of a struct* can be different to that of a union*. This would suggest that one or both can be different from the size of a void*.
So even if the value doesn't change much, or at all, in the conversion, maybe the cast is needed to ensure the size of the pointer itself is correct.
In printf, %p expects a void*, so you should explicitly cast it. If you don't do so, and if you are lucky, then the pointer size and pointer representation might save the day. But you should explicitly cast it to be certain; anything else is technically undefined behaviour.
Considering this code fragment:
struct My {
operator const char*()const{ return "my"; }
} my;
CStringA s( "aha" );
printf("%s %s", s, my );
// another variadic function to get rid of comments about printf :)
void foo( int i, ... ) {
va_list vars;
va_start(vars, i);
for( const char* p = va_arg(vars,const char*)
; p != NULL
; p=va_arg(vars,const char*) )
{
std::cout << p << std::endl;
}
va_end(vars);
}
foo( 1, s, my );
This snippet results in the 'intuitive' output "aha". But I haven't got a clue how this can work:
if the variadic-function call is translated into pushing the pointers of the arguments, printf will receive a CStringA* that is interpreted as a const char*
if the variadic-function call is calling operator (const char*) on it, why wouldn't it do so for my own class?
Can someone explain this?
EDIT: added a dummy variadic function that treats its arguments as const char*s. Behold - it even crashes when it reaches the my argument...
The relevant text of C++98 standard §5.2.2/7:
The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are performed on the argument expression. After these conversions, if the argument does not have arithmetic, enumeration, pointer, pointer to member, or class type, the program is ill-formed. If the argument has a non-POD class type (clause 9), the behavior is undefined.
So formally the behavior is undefined.
However, a given compiler can provide any number of language extensions, and Visual C++ does. The MSDN Library documents the behavior of Visual C++ as follows, with respect to passing arguments to ...:
If the actual argument is of type float, it is promoted to type double prior to the function call.
Any signed or unsigned char, short, enumerated type, or bit field is converted to either a signed or an unsigned int using integral promotion.
Any argument of class type is passed by value as a data structure; the copy is created by binary copying instead of by invoking the class's copy constructor (if one exists).
This doesn’t mention anything about Visual C++ applying user defined conversions.
MS CString is "cleverly" laid out so that its POD representation is exactly the pointer to its null-terminated character string (sizeof(CStringA) == sizeof(char*)). When it is used in any printf-style function, the function just gets passed the character pointer.
So this works because of the last point above and the way CString is laid out.
What you're doing is undefined behaviour, and is either a non-standard extension provided by your compiler or works by sheer luck. I'm guessing that the CString stores the string data as the first element in the structure, and thus that reading from the CString as if it were a char * yields a valid null-terminated string.
You cannot insert Non-POD data into variadic functions.
if the variadic-function call is calling operator (const char*) on it, why wouldn't it do so for my own class?
Yes but you should explicitly cast it in your code: printf("%s", (LPCSTR)s, ...);.
It doesn't. It doesn't even call the operator const char*. Visual C++ just passes the class data to printf as if by memcpy. It works because of the layout of the CString class: It only contains one member variable which is a pointer to the character data.
If the variadic-function call is translated into pushing the pointers of the arguments, …
That is not how variadic functions work. The values of the arguments, rather than pointers to the arguments, are passed, after special conversion rules for built-in types (such as char to int).
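A minimal sketch of those conversion rules in action; sum_char_and_float is a hypothetical helper. The caller passes a char and a float, but the callee has to retrieve an int and a double because that is what actually arrives.

```cpp
#include <cstdarg>

// Hypothetical helper: expects one char and one float after the fixed
// parameter. Both arrive promoted, so they are read as int and double.
double sum_char_and_float(int tag, ...) {
    va_list args;
    va_start(args, tag);
    int c = va_arg(args, int);        // char is promoted to int
    double f = va_arg(args, double);  // float is promoted to double
    va_end(args);
    return c + f;
}
```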
C++03 §5.2.2p7:
When there is no parameter for a given argument, the argument is passed in such a way that the receiving function can obtain the value of the argument by invoking va_arg (18.7). The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are performed on the argument expression. After these conversions, if the argument does not have arithmetic, enumeration, pointer, pointer to member, or class type, the program is ill-formed. If the argument has a non-POD class type (clause 9), the behavior is undefined. If the argument has integral or enumeration type that is subject to the integral promotions (4.5), or a floating point type that is subject to the floating point promotion (4.6), the value of the argument is converted to the promoted type before the call. These promotions are referred to as the default argument promotions.
In particular from the above:
If the argument has a non-POD class type (clause 9), the behavior is undefined.
C++ punts to C for the definition of va_arg, and C99 TC3 §7.15.1.2p2 says:
… if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases: [list of cases that don't apply here]
Thus, if you pass a class type, it must be POD, and the receiving function must apply the correct type, otherwise the behavior is undefined. This means that in the worst case, it may work exactly as you expect.
Printf will not apply the correct type for any user-defined class type as it has no knowledge of them, so you cannot pass any UDT class type to printf. Your foo does the same thing by using a char pointer instead of the correct class type.
Your printf statement is wrong:
printf("%s", s, my );
Should be:
printf("%s %s", s, my);
Which will print out "aha my".
CString has a conversion operator for const char* (it's actually for LPCTSTR, which is a const TCHAR*; CStringA has a conversion function for LPCSTR).
The printf call will not convert your CStringA object to a CStringA* pointer. It essentially treats it like a void*. In the case of CString, it is sheer luck (or perhaps design of Microsoft's developers taking advantage of something that isn't in the standard) that it will give you the NULL-terminated string. If you were to use a _bstr_t instead (which has the size of the string first), despite having the conversion function, it would fail horribly.
It is good practice (and required in many cases) to explicitly cast your objects/pointers to what you want them to be when you call printf (or any variadic function for that matter).
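As a sketch of that advice applied to the My class from the question: first_string_length is a made-up variadic function standing in for printf-style code, and the explicit cast is what causes the conversion operator to run.

```cpp
#include <cstdarg>
#include <cstring>

struct My {
    operator const char *() const { return "my"; }
};

// Hypothetical variadic function: reads a single const char* argument
// and returns its length.
std::size_t first_string_length(int tag, ...) {
    va_list args;
    va_start(args, tag);
    const char *p = va_arg(args, const char *);
    va_end(args);
    return std::strlen(p);
}

// The cast runs the conversion operator, so a real const char* is passed
// instead of the class object itself.
std::size_t length_of_my() {
    My my;
    return first_string_length(0, static_cast<const char *>(my));
}
```

Without the cast, the object itself would be passed through the ellipsis and the conversion operator would never be called.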