#include <iostream>
using namespace std;

int func(int n)
{ return n; }

int main()
{ cout << func[4];
  cout << func[4,3,5]; }
What do these actually mean? I guess it is about accessing func + 4, as if func were allocated array space that func[4] indexes into.
But func[4,3,5] is just absurd.
The reason this code compiles, and func[4] is not a syntax error, is:
1. Function types can implicitly convert to pointers to the same function type.
So, if we have code like this:
int f(int);
using func_t = int(*)(int);
void g(func_t);
we can write
g(f)
and aren't forced to write g(&f). The &, which takes us from type int(int) to int(*)(int), is applied implicitly.
2. In C (and necessarily in C++ for compatibility) pointers are connected to arrays, and when p is a pointer, p[x] is the same as *(p + x). So func[4] is the same as *(func + 4).
3. *(p + x) has the function type int(int), but again can implicitly decay to a pointer type whenever necessary. So *(func + 4) can implicitly become just (func + 4).
4. Pointers of any type can be streamed to std::cout (function pointers via a conversion to bool, as a later answer explains).
Note that just because it isn't a syntax error doesn't mean it is valid. Of course it is undefined behavior, and as the compiler warnings emitted by gcc and clang indicate, pointer arithmetic with a function pointer is generally wrong, because you cannot make an array of functions; the implementation places functions however it likes. (You can make an array of function pointers, but that is something else entirely.)
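The defined pieces of that chain can be demonstrated without any pointer arithmetic; a minimal sketch (the output is what the decay and bool-conversion rules above predict):

#include <iostream>

int func(int n) { return n; }

int main() {
    int (*p)(int) = func;          // point 1: function-to-pointer decay, no & needed
    std::cout << (*p)(7) << '\n';  // *p is a function designator; calling it prints 7
    // No << overload takes a function pointer, so p converts to bool;
    // it is non-null, so this prints 1.
    std::cout << p << '\n';
}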
Edit: I should correct myself -- this answer is not entirely correct. func[4] is not valid, because the pointer is not a pointer to an object type. holyblackcat's answer is correct; see it for the reference in the standard.
This code should be ill-formed, and GCC only compiles it without an error because it enables a nonstandard extension by default. Clang and MSVC correctly reject this code.
I'm surprised no answer mentions it, but:
The code in the question is simply not valid C++.
It's rejected by Clang and MSVC with no flags. GCC rejects it with -pedantic-errors.
a[b] (in the absence of operator overloading) is defined as *(a + b), and the builtin operator + requires the pointer operand to be a pointer to an object type (which function pointers are not).
[expr.add]/1
...either both operands shall have arithmetic or unscoped enumeration type, or one operand shall be a pointer to a completely-defined object type and the other shall have integral or unscoped enumeration type.
(Emphasis mine.)
GCC compiles the code because an extension allowing arithmetic on function pointers is enabled by default.
Due to function-to-pointer decay, func[4] is treated as &(&func)[4], which effectively means &func + 4, which (as the documentation for that extension explains) simply adds 4 to the numerical value of the pointer. Calling the resulting pointer will most likely cause a crash or unpredictable results.
std::cout doesn't have an overload of << suitable for printing function pointers, and the best overload the compiler is able to find is the one for printing bools. The pointer gets converted to bool, and since it's non-null it becomes true, which is then printed as 1.
Lastly, func[4,3,5] has the same effect as func[5], since in this context , is the comma operator: x, y evaluates and discards x, then yields y.
Since it has not been mentioned yet: func[4,3,5] is identical to func[5] - the commas in there are the builtin comma operator, which evaluates the left-hand expression, discards the result, and then evaluates the right-hand expression. There is no function call happening here, and the commas in the code are not delimiting function arguments.
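The comma-operator reading is easy to verify with an ordinary array instead of a function; a small sketch (the index is parenthesized because C++20 deprecates an unparenthesized comma inside []):

#include <iostream>

int main() {
    int arr[6] = {0, 10, 20, 30, 40, 50};
    // 4 and 3 are evaluated and discarded; only 5 is used as the index.
    std::cout << arr[(4, 3, 5)] << '\n';  // prints 50, exactly like arr[5]
}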
Yes, it is about accessing func + 4, which is not a defined object, so you get a garbage value. The compiler indicates this with the following warning message:
Program: In function 'int main()':
Program:7: warning: pointer to a function used in arithmetic
Related
Consider the following code.
void f(double p) {}
void f(double* p) {}
int main()
{ f(1-1); return 0; }
MSVC 2017 doesn't compile that. It figures there is an ambiguous overloaded call, as 1-1 is the same as 0 and therefore can be converted into double*. Other tricks, like 0x0, 0L, or static_cast<int>(0), do not work either. Even declaring a const int Zero = 0 and calling f(Zero) produces the same error. It only works properly if Zero is not const.
It looks like the same issue applies to GCC 5 and below, but not GCC 6. I am curious whether this is part of the C++ standard, a known MSVC bug, or a compiler setting. A cursory Google search did not yield results.
MSVC considers 1-1 to be a null pointer constant. This was correct by the standard for C++03, where all integral constant expressions with value 0 were null pointer constants, but it was changed so that only zero integer literals are null pointer constants for C++11 with CWG issue 903. This is a breaking change, as you can see in your example and as is also documented in the standard, see [diff.cpp03.conv] of the C++14 standard (draft N4140).
MSVC applies this change only in conformance mode, so your code will compile with the /permissive- flag; however, I think the change was implemented only in MSVC 2019.
In the case of GCC, GCC 5 defaults to C++98 mode, while GCC 6 and later default to C++14 mode, which is why the change in behavior seems to depend on the GCC version.
If you call f with a null pointer constant as the argument, the call is ambiguous, because the null pointer constant can be converted to a null pointer value of any pointer type, and this conversion has the same rank as the conversion of int (or any integral type) to double.
The compiler works correctly, in accordance with [over.match] and [conv], more specifically [conv.fpint] and [conv.ptr].
A standard conversion sequence is [blah blah] Zero or one [...] floating-integral conversions, pointer conversions, [...].
and
A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating-point type. The result is exact if possible [blah blah]
and
A null pointer constant is an integer literal with value zero or [...]. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type [blah blah]
Now, overload resolution is to choose the best match among all candidate functions (which, as a fun feature, need not even be accessible at the call location!). The best match is the one with exact parameters or, alternatively, the fewest possible conversions. Zero or one standard conversions may happen (... for every parameter), and zero is "better" than one.
(1-1) is an integral constant expression with value 0, which C++03 treated as a null pointer constant.
You can convert that zero to either double or double* (or even nullptr_t) with exactly one conversion each. So, assuming more than one of these functions is declared (as is the case in the example), there is more than a single candidate, all candidates are equally good, and no best match exists. It's ambiguous, and the compiler is right to complain.
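For completeness, a sketch of calls that are unambiguous under the C++11 (post-CWG-903) rules, because each argument is viable for only one overload:

void f(double) {}
void f(double*) {}

int main() {
    f(0.0);        // exact match: f(double)
    f(nullptr);    // nullptr converts only to the pointer: f(double*)
    int zero = 0;  // a non-constant zero is not a null pointer constant...
    f(zero);       // ...so only int -> double is viable: f(double)
    return 0;
}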
The behavior of my code is different in C and C++.
void *(*funcPtr)() = dlsym(some symbol..) ; // (1) works in c but not c++
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
I don't understand how the 2nd cast works in C++ while the 1st doesn't. The error message for (1) in C++ is: invalid conversion from 'void*' to 'void* (*)()'.
The problem is that dlsym returns a void* pointer. While in C any such pointer is implicitly convertible into any other (object!) pointer type (compare: not casting the result of malloc), this is not the case in C++ (there you do need to cast the result of malloc).
For function pointers, though, this conversion is not implicit even in C. Apparently, as your code compiles, your compiler adds this implicit conversion for function pointers, too, as a compiler extension (consistent with object pointers); however, to be fully standard compliant, it actually should issue a diagnostic, e.g. a compiler warning, as mandated by C17 6.5.4.3:
Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.
But now, instead of casting the target pointer, you should rather cast the result of dlsym into the appropriate function pointer type:
int (*funcPtr)() = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or simpler:
auto funcPtr = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or even:
int (*funcPtr)() = reinterpret_cast<decltype(funcPtr)>(dlsym(some symbol..));
(The last one is especially interesting if funcPtr has been declared previously.)
Additionally, you should prefer C++ casts over C-style casts, as in my variants above; you get more precise control over what type of conversion actually occurs.
Side note: have you noticed that the two function pointers declared in your question differ in return type (void* vs. int)? Additionally, a function not accepting any arguments needs to be declared in C as void f(void); accordingly for function pointers: void *(*f)(void). C++ allows the use of void here for compatibility, whereas omitting void in C has a totally different meaning: the function could accept anything, and you need to know from elsewhere (documentation) how many arguments of which types can actually be passed to it.
Strictly speaking, none of this code is valid in either language.
void *(*funcPtr)() = dlsym(some symbol..) ;
dlsym returns type void*, so this is not valid in either language. The C standard only allows implicit conversions between void* and pointers to object types, see C17 6.3.2.3:
A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
Specifically, your code is a language violation of one of the rules of simple assignment, C17 6.5.16.1, emphasis mine:
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version of void, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
The reason why it might compile on certain compilers is overly lax compiler settings. If you are for example using gcc, you need to compile with gcc -std=c17 -pedantic-errors to get a language compliant compiler, otherwise it defaults to the non-standard "gnu11" language.
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
Here you explicitly treat the address of the function pointer as a void**. The cast is fine syntax-wise, but this may or may not be a valid pointer conversion on the specific compiler. Again, conversions between object pointers and function pointers are not supported by the C or C++ standard; you are relying on non-standard extensions. Formally this code invokes undefined behavior in both C and C++.
In practice, lots of compilers have well-defined behavior here, because most systems have the same representation of object pointers and function pointers.
Given that the conversion is valid, you can of course de-reference a void** to get a void* and then assign another void* to it.
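A common way to spell the same idea without the double-pointer cast is a bytewise copy; a minimal sketch, assuming a POSIX system (where object and function pointers share a representation) and a hypothetical handle and symbol name:

#include <cstring>
#include <dlfcn.h>  // POSIX only

using func_t = int (*)();

func_t load_func(void* handle) {
    void* sym = dlsym(handle, "my_func");  // "my_func" is a made-up symbol name
    func_t fp = nullptr;
    std::memcpy(&fp, &sym, sizeof fp);     // copy the pointer bytes instead of type-punning
    return fp;
}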
The following code compiles without any error in VC++ 2017 but doesn't compile in gcc 7.3.0, which reports for the line void* p = static_cast<void*>(func):
error: invalid static_cast from type 'int(int)' to type 'void*'
#include <iostream>

int func(int x) { return 2 * x; }

int main() {
    void* p = static_cast<void*>(func);
    return 0;
}
Functions are implicitly convertible only to function pointers. A function pointer is not a pointer in the strict meaning of the word in the language, which refers only to pointers to objects.
Function pointers cannot be converted to void* using static_cast. The shown program is ill-formed. If a compiler does not warn, then it fails to conform to the standard. Failing to compile an ill-formed program does not violate the standard.
On systems where void* is guaranteed to be able to point to a function (such as POSIX), you can use reinterpret_cast instead:
void* p = reinterpret_cast<void*>(func);
But this is not portable to systems that lack the guarantee. (I know of no system that has a C++ compiler and does not have this guarantee, but that does not mean such a system does not exist.)
Standard quote:
[expr.reinterpret.cast]
Converting a function pointer to an object pointer type or vice versa is conditionally-supported. The meaning of such a conversion is implementation-defined, except that if an implementation supports conversions in both directions, converting a prvalue of one type to the other type and back, possibly with different cv-qualification, shall yield the original pointer value.
Note that this conditional support does not extend to pointers to member functions. Pointers to member functions are not function pointers.
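A short sketch of that distinction (the first cast is conditionally-supported and fine on POSIX-style implementations; the commented-out one is ill-formed everywhere):

#include <cstdio>

int f(int x) { return 2 * x; }
struct S { int m(int x) { return x + 1; } };

int main() {
    void* p = reinterpret_cast<void*>(&f);  // conditionally-supported
    // void* q = reinterpret_cast<void*>(&S::m);  // ill-formed: a pointer to member
    //                                            // function is not a function pointer
    std::printf("%p\n", p);
}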
Code sample:
#include <stddef.h>

struct name
{
    int a, b;
};

int main()
{
    &(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null"; however, C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however, it doesn't say the same for ->, and it also does not define a -> b as being (*a).b; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
However, I feel that the bolded text is defective and should read "lvalue that potentially designates an object", as per 6.3.2.1/1 (the definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it, and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially designates an object; if an lvalue does not designate an object when it is evaluated, the behavior is undefined
however, the & operator does evaluate its operand. (It doesn't access the stored value, but that is different.)
This long chain of reasoning seems to suggest that the code causes UB; however, it is fairly tenuous, and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a language-lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since you cannot find a path through it in which no UB occurs. IMHO the root cause is that at some point you apply the -> operator to an expression that does not point to an object.
From a compiler point of view, assuming the compiler writer did not overcomplicate things, it is clear that the expression returns the same value as offsetof(struct name, b) would, and I'm pretty sure that, provided it compiles without error, any existing compiler will give that result.
As written, we could not blame a compiler that noticed that in the inner part you use operator -> on an expression that cannot point to an object (since it is null) and issued a warning or an error.
My conclusion is that until there is a special paragraph saying that it is legal to dereference a null pointer provided you only take the address of the result, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object, and not defined (= undefined) otherwise. In general you shouldn't read more into the term undefined; it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is a slackness that is introduced to help compiler builders deal with things. They may define a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code, or similar, for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
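In ordinary user code, the portable spelling of the same idea is the standard offsetof macro, leaving any null-pointer machinery to the implementation; a minimal sketch:

#include <cstddef>  // offsetof
#include <cstdio>

struct name { int a, b; };

int main() {
    // No null pointer appears in user code; the macro hides the machinery.
    std::printf("offset of b: %zu\n", offsetof(name, b));
}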
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type "pointer to type", the result has type "type". If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is always true that if E is a function designator or an lvalue that is a valid operand of the unary & operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined; but the question is whether the same is true for &(*E).m, where E is a null pointer whose type is a pointer to a struct that has a member m.
The C standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. The C standard is correct to keep it undefined, and instead provides a macro offsetof that handles the problem internally.
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. 66) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
This means that an integer constant expression with the value 0 is a null pointer constant.
But the resulting null pointer is not required to have the value 0; its representation is implementation-defined.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer has all bits set, and using member access on that value would result in an overflow, which is undefined behavior.
Another problem is how to evaluate &(*E).m: do the parentheses apply, so that * is evaluated first? Keeping it undefined sidesteps this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue.96) If the first expression is a pointer to a qualified type, the result has the so-qualified version of the type of the designated member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.66) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
Nothing in the C standard would impose any requirements on what a system could do with the expression. When the standard was written, it would have been perfectly reasonable for the expression to cause the following sequence of events at runtime:
1. The code loads a null pointer into the addressing unit.
2. The code asks the addressing unit to add the offset of field b.
3. The addressing unit triggers a trap when attempting to add an integer to a null pointer (which, for robustness, should be a run-time trap, even though many systems don't catch it).
4. The system starts executing essentially random code after being dispatched through a trap vector that was never set, because code to set it would have been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard that would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not be.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0, which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel at address 0x4, so you could use a method like this to call kernel functions.
In fact, the same approach is used by the classic definition of the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standards point of view, the expression is legal C (C++ is a different beast, since you can overload the & operator).
What is the unary-& doing here?
int * a = 1990;
int result = &5[a];
If you were to print result you would get the value 2010.
You have to compile it with -fpermissive or it will stop due to errors.
In C, x[y] and y[x] are identical. So &5[a] is the same as &a[5].
&5[a] is the same as &a[5] and the same as a + 5. In your case it's undefined behavior because a points to nowhere.
C11 standard chapter 6.5.6 Additive operators/8 (the same in C++):
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
"...unary & on numeric literal"?
Postfix operators in C always have higher precedence than prefix ones. In the case of &5[a], the [] has higher precedence than the &, which means that the unary & is not applied to a "numeric literal" as you seem to incorrectly believe; it is applied to the entire 5[a] subexpression. I.e., &5[a] is equivalent to &(5[a]).
As for what 5[a] means - this is a beaten-to-death FAQ. Look it up.
And no, you don't have "to compile it with -fpermissive" (my compiler tells me it doesn't even know what -fpermissive is). You have to figure out that this
int * a = 1990;
is not legal code in either C or C++. If anything, it requires an explicit cast
int * a = (int *) 1990;
not some obscure switch of some specific compiler you happen to be using at the moment. The same applies to the other illegal initialization, int result = &5[a].
Finally, even if we overlook the illegal code and the undefined behavior triggered by that 5[a], the behavior of this code is still highly implementation-dependent. (The 2010 in the question comes from pointer arithmetic scaling: with a 4-byte int, 1990 + 5 * sizeof(int) = 2010.) I.e., the answer is no: in the general case you will not get 2010 in result.
You cannot apply the unary & operator to an integer literal, because a literal is not an lvalue.
Due to operator precedence, your code doesn't do that. Since the indexing operator [] binds more tightly than unary &, &5[a] is equivalent to &(5[a]).
Here's a program similar to yours, except that it's valid code, not requiring -fpermissive to compile:
#include <stdio.h>

int main(void) {
    int arr[6];
    int *ptr1 = arr;
    int *ptr2 = &5[ptr1];
    printf("%p %p\n", (void *)ptr1, (void *)ptr2);
}
As explained in this question and my answer, the indexing operator is commutative (because it's defined in terms of addition, and addition is commutative), so 5[a] is equivalent to a[5]. So the expression &5[ptr1] computes the address of element 5 of arr.
In your program:
int * a = 1990;
int result = &5[a];
the initialization of a is invalid because a is of type int* and 1990 is of type int, and there is no implicit conversion from int to int*. Likewise, the initialization of result is invalid because &5[a] is of type int*. Apparently -fpermissive causes the compiler to violate the rules of the language and permit these invalid implicit conversions.
At least in the version of gcc I'm using, the -fpermissive option is valid only for C++ and Objective-C, not for C. In C, gcc permits such implicit conversions (with a warning) anyway. I strongly recommend not using this option. (Your question is tagged both C and C++. Keep in mind that C and C++ are two distinct, though closely related, languages. They happen to behave similarly in this case, but it's usually best to pick one language or the other.)