C++ function returning function - c++

Where in the standard are functions returning functions disallowed? I understand they are conceptually ridiculous, but it seems to me that the grammar would allow them. According to this webpage, a "noptr-declarator [is] any valid declarator" which would include the declarator of a function:
int f()();
Regarding the syntax.
It seems to me that the syntax, as spelled out in [dcl.decl], allows
int f(char)(double)
which could be interpreted as the function f that takes a char and returns a function with same signature as int g(double).
1 declarator:
2 ptr-declarator
3 noptr-declarator parameters-and-qualifiers trailing-return-type
4 ptr-declarator:
5 noptr-declarator
6 ptr-operator ptr-declarator
7 noptr-declarator:
8 declarator-id attribute-specifier-seq opt
9 noptr-declarator parameters-and-qualifiers
10 noptr-declarator [ constant-expression opt ] attribute-specifier-seq opt
11 ( ptr-declarator )
12 parameters-and-qualifiers:
13 ( parameter-declaration-clause ) cv-qualifier-seqAfter
Roughly speaking, after
1->2, 2=4, 4->6, 4->6
you should have
ptr-operator ptr-operator ptr-operator
Then, use 4->5, 5=7, 7->8 for the first declarator; use 4->5, 5=7, 7->9 for the second and third declarators.

From [dcl.fct], pretty explicitly:
Functions shall not have a return type of type array or function, although they may have a return type of
type pointer or reference to such things. There shall be no arrays of functions, although there can be arrays
of pointers to functions.
With C++11, you probably just want:
std::function<int()> f();
std::function<int(double)> f(char);
There is some confusion regarding the C++ grammar. The statement int f(char)(double); can be parsed according to the grammar. Here is a parse tree:
Furthermore such a parse is even meaningful based on [dcl.fct]/1:
In a declaration T D where D has the form
D1 ( parameter-declaration-clause ) cv-qualifier-seqopt
ref-qualifieropt exception-specificationopt attribute-specifier-seqopt
and the type of the contained declarator-id in the declaration T D1 is “derived-declarator-type-list T”, the
type of the declarator-id in D is “derived-declarator-type-list function of (parameter-declaration-clause ) cv-qualifier-seqopt
ref-qualifieropt returning T”.
In this example T == int, D == f(char)(double), D1 == f(char). The type of the declarator-id in T D1 (int f(char)) is "function of (char) returning int". So derived-declarator-type-list is "function of (char) returning". Thus, the type of f would be read as "function of (char) returning function of (double) returning int."
It's ultimately much ado about nothing, as this is an explicitly disallowed declarator form. But not by the grammar.

With C++11 (but not previous versions of C++) you can not only return C-like function pointers, but also C++ closures, notably with anonymous functions. See also std::function
The standard disallows (semantically, not syntactically - so it is not a question of grammar ; see Barry's answer for the citation) returning functions (and also disallow sizeof on functions!) but permits to return function pointers.
BTW, I don't think that you could return entire functions. What would that mean? How would you implement that? Practically speaking, a function is some code block, and its name is (like for arrays) a pointer to the start of the function's machine code.
A nice trick might be to build (using mechanisms outside of the C++ standard) a function at runtime (and then handling its function pointer). Some external libraries might permit that: you could use a JIT library (e.g. asmjit, gccjit, LLVM ...) or simply generate C++ code, then compile and dlopen & dlsym it on POSIX systems, etc.
PS. You are probably right in understanding that the C++11 grammar (the EBNF rules in the standard) does not disallow returning functions. It is a semantic rule stated in plain English which disallows that (it is not any grammar rule). I mean that the EBNF alone would allow:
// semantically wrong... but perhaps not syntactically
typedef int sigfun_T(std::string);
sigfun_T foobar(int);
and it is for semantics reasons (not because of EBNF rules) that a compiler is rightly rejecting the above code. Practically speaking, the symbol table matters a lot to the C++ compiler (and it is not syntax or context-free grammar).
The sad fact about C++ is that (for legacy reasons) its grammar (alone) is very ambiguous. Hence C++11 is difficult to read (for humans), difficult to write (for developers), difficult to parse (for compilers), ....

using namespace std;
auto hello()
{
char name[] = "asdf";
return [&]() -> void
{
printf("%s\n", name);
};
}
int main()
{
hello()();
auto r = hello();
r();
return 0;
}

Actually in C one cannot pass or return function. Only a pointer/address of the function can be passed/returned, which conceptually is pretty close. To be honest, thanks to possiblity of ommiting & and * with function pointers one shouldn't really care if function or pointer is passed (unless it contains any static data). Here is simple function pointer declaration:
void (*foo)();
foo is pointer to function returning void and taking no arguments.
In C++ it is not that much different. One can still use C-style function pointers or new useful std::function object for all callable creations. C++ also adds lambda expressions which are inline functions which work somehow similar to closures in functional languages. It allows you not to make all passed functions global.
And finally - returning function pointer or std::function may seem ridiculous, but it really is not. For example, the state machine pattern is (in most cases) based on returning pointer to function handling next state or callable next state object.

The formal grammar of C actually disallows returning functions but, one can always return a function pointer which, for all intents and purposes, seems like what you want to do:
int (*getFunc())(int, int) { … }
I am fixated in saying, the grammar, is a more fundamental explanation for the lack of support of such feature. The standard's rules are a latter concern.
If the grammar does not provide for a way to accomplish something, I don't think it matters what the semantics or the standard says for any given context-free language.

Unless you are returning a pointer or reference to a function which is ok, the alternative is returning a copy of a function. Now, think about what a copy of a function looks like, acts like, behaves like. That, first of all would be an array of bytes, which isn't allowed either, and second of all those bytes would be the equivalence of a piece of code literally returning a piece of code....nearly all heuristic virus scanners would consider that a virus because there would also be no way to verify the viability of the code being returned by the runtime system or even at compile time. Even if you could return an array, how would you return a function? The primary issue with returning an array (which would be a copy on the stack) is that the size is not known and so there's no way to remove it from the stack, and the same dilemma exists for functions (where the array would be the machine language binary code). Also, if you did return a function in that fashion, how would you turn around and call that function?
To sum up, the notion of returning a function rather than a point to a function fails because the notion of that is a unknown size array of machine code placed (copied) onto the stack. It's not something the C or C++ was designed to allow, and with code there is now way to turn around and call that function, especially i you wanted to pass arguments.
I hope this makes sense

in kinda sense,function_pointer is funciton its self,
and "trailing return type" is a good and hiding thing in c++11,i will rather write like this:
#include<iostream>
auto return0(void)->int
{return 0;}
auto returnf(void)->decltype(&return0)
{return &return0;}
/*Edit: or...
auto returnf(void)->auto (*)(void)->int
{return &return0;}
//that star means function pointer
*/
auto main(void)->int
{
std::cout<<returnf()();
return 0;
}
(if type unmatch,try apply address operator "&")
look,so much of sense.
but down side of this is that "auto" thing in head of function,
that`s non-removable and no type could match it (even lambda could match template type std::function<>)
but if you wish,macro could do magic(sometime curse) to you
#define func auto

Related

what is the meaning of (*(int (*)())a)()?

I am the beginner of learning C++. Today, I saw a pointer function like that
(*(int (*)())a)()
I was very confused with what the meaning of this and how I can understand it easily.
Let's add a typedef, to help make heads or tails out of it:
typedef int (*int_func_ptr)();
(*(int_func_ptr)a)();
So a is being cast to a function pointer of a particular prototype, dereferenced (which is redundant), and then called.
int (*)() is a function pointer type that returns int and accepts no parameters.
I presume that a is a function pointer whose type "erases" the actual type (perhaps so that one can store a bunch of different function pointers in a vector) that we need to cast to this pointer type, so (int(*)())a) will perform that casting.
Afterwards we want to call the function, so the provided code dereferences the pointer * and then calls it with parenthesis ()
Example
I have a function foo that looks like this:
int foo()
{
std::cout << "foo\n";
return 1;
}
And then via reinterpret_cast I get a pointer to a function that instead returns void (for type-erasure reasons):
void(*fptr)() = reinterpret_cast<void(*)()>(&::foo); //§5.2.10/6
Later, I want to call that function, so I need to re-cast it back to its original type, and then call it:
(*(int (*)())fptr)(); // prints `foo`
Demo
De-referencing it is actually unecessary and the following is equivalent:
((int (*)())fptr)();
The explanation for why they're equivalent boils down to "The standard says that both a function type and a function pointer type can be callable"
If you're standard savvy, you can check out §5.2.2[expr.call] that states
A function call is a postfix expression followed by parentheses containing a possibly empty, comma-separated list of initializer-clauses which constitute the arguments to the function. The postfix expression shall have function type or pointer to function type
StoryTeller and Andy have given correct answers. I'll give general rules, additionally.
StoryTeller makes a correct and useful typedef with typedef int (*int_func_ptr)();, which defines a function pointer type. Two things are to remember here.
The general language design for typedefs: it exactly mimics a declaration of an object of the given type! Simply prefixing a declaration with typedef makes the declared identifier a type alias instead of a variable. That is, if int i; declares an integer variable, typedefint i; declares i as a synonym for the type int. So declaring a function pointer variable would simply read int (*int_func_ptr)();. Prefixing this with the typedef, as StoryTeller did, makes it a type alias instead.
Casts of function pointers are notoriously confusing. One reason are the necessary parentheses:
Parentheses serve several unrelated purposes:
They group expressions to indicate subexpression precedence, as in (a+b) * c.
They delimit function arguments, both in declarations and in calls.
They delimit type names used in casts.
We have parentheses for all three purposes here!
The operator precedence is "unnatural" for function pointer declarations. This is so, of course, because they are natural for the much more frequent uses: Without parentheses, the declaration would be the more familiar looking int *int_func();, which declares a function proper which returns an int pointer. The reason is that the argument parentheses have higher priority than the dereferencing asterisk, so that in order to infer the type we mentally execute the call first, and not the dereferencing. And something that can be called is a function.1 The result of the call can be dereferenced, and that result is an int.
Compare that to the original int (*int_func_ptr)();: The additional parentheses force us to dereference first, so the identifier must be a pointer of some kind. The result of the dereferencing can be called, so it must be a function; the result of the call is an int.
Another reason why function pointer declarations or typedefs look unnatural is that the declared identifier tends to be at the center of the expression. The reason is that operators to the left and to the right of the identifier are applied (the dereferencing, the function call, and then there is finally the result type declaration all the way to the left).
The next rule is about constructing casts. The type names used in casts are constructed from corresponding variable declarations simply by omitting the variable name! This is obvious in the simple cases: since int i declares an int variable, (int) without the i is the corresponding cast.
If we apply that to the function pointer type, int (*int_func_ptr)() is transformed to the weird-looking (int (*)()) by omitting the variable name and putting the type name in parentheses as required for a cast. Note that the parentheses which force precedence of the asterisk are still there, even though there is nothing to dereference! Without them, (int *()) would be derived from int *int_func() and therefore denote a function which returns a pointer.2
It is perhaps surprising that there is exactly one place in a declaration where a variable name can syntactically be, so that even very complicated type expressions in casts are well-defined: It is this one place where a variable name fits which defines the cast type.
With these rules, let's re-examine the original expression:
(*(int (*)())a)()
On the outermost level we have two pairs of parentheses. The second pair is empty and thus must be a function call operator. That implies that the operand to the left has function type:
*(int (*)())a
The operand is an expression in parentheses for precedence. It has three parts: The asterisk, an expression in parentheses, and an identifier a. Since there is no operator between the parenthesized expression and the variable a, it must be a cast, actually the one we scrutinized above. * and the type cast have the same precedence and are evaluated right-to-left: first a is cast to a function pointer, and then * dereferences that in order to obtain the "function proper" (which are no real objects in C++). This fits because the function call operator from above will be applied to this result.
1 That C permits calling function pointers directly as well, without dereferencing first, is syntactic sugar and not considered in declarations.
2 While the expression is syntactically valid, such a cast to function is not allowed in C or C++.

Difference between f and &f [duplicate]

It's interesting that using the function name as a function pointer is equivalent to applying the address-of operator to the function name!
Here's the example.
typedef bool (*FunType)(int);
bool f(int);
int main() {
FunType a = f;
FunType b = &a; // Sure, here's an error.
FunType c = &f; // This is not an error, though.
// It's equivalent to the statement without "&".
// So we have c equals a.
return 0;
}
Using the name is something we already know in array. But you can't write something like
int a[2];
int * b = &a; // Error!
It seems not consistent with other parts of the language. What's the rationale of this design?
This question explains the semantics of such behavior and why it works. But I'm interested in why the language was designed this way.
What's more interesting is the function type can be implicitly converted to pointer to itself when using as a parameter, but will not be converted to a pointer to itself when using as a return type!
Example:
typedef bool FunctionType(int);
void g(FunctionType); // Implicitly converted to void g(FunctionType *).
FunctionType h(); // Error!
FunctionType * j(); // Return a function pointer to a function
// that has the type of bool(int).
Since you specifically ask for the rationale of this behavior, here's the closest thing I can find (from the ANSI C90 Rationale document - http://www.lysator.liu.se/c/rat/c3.html#3-3-2-2):
3.3.2.2 Function calls
Pointers to functions may be used either as (*pf)() or as pf().
The latter construct, not sanctioned in the Base Document, appears in
some present versions of C, is unambiguous, invalidates no old code,
and can be an important shorthand. The shorthand is useful for
packages that present only one external name, which designates a
structure full of pointers to object s and functions : member
functions can be called as graphics.open(file) instead of
(*graphics.open)(file). The treatment of function designators can
lead to some curious , but valid , syntactic forms . Given the
declarations :
int f ( ) , ( *pf ) ( ) ;
then all of the following expressions are valid function calls :
( &f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();
The first expression on each line was discussed in the previous
paragraph . The second is conventional usage . All subsequent
expressions take advantage of the implicit conversion of a function
designator to a pointer value , in nearly all expression contexts .
The Committee saw no real harm in allowing these forms ; outlawing
forms like (*f)(), while still permitting *a (for int a[]),
simply seemed more trouble than it was worth .
Basically, the equivalence between function designators and function pointers was added to make using function pointers a little more convenient.
It's a feature inherited from C.
In C, it's allowed primarily because there's not much else the name of a function, by itself, could mean. All you can do with an actual function is call it. If you're not calling it, the only thing you can do is take the address. Since there's no ambiguity, any time a function name isn't followed by a ( to signify a call to the function, the name evaluates to the address of the function.
That actually is somewhat similar to one other part of the language -- the name of an array evaluates to the address of the first element of the array except in some fairly limited circumstances (being used as the operand of & or sizeof).
Since C allowed it, C++ does as well, mostly because the same remains true: the only things you can do with a function are call it or take its address, so if the name isn't followed by a ( to signify a function call, then the name evaluates to the address with no ambiguity.
For arrays, there is no pointer decay when the address-of operator is used:
int a[2];
int * p1 = a; // No address-of operator, so type is int*
int (*p2)[2] = &a; // Address-of operator used, so type is int (*)[2]
This makes sense because arrays and pointers are different types, and it is possible for example to return references to arrays or pass references to arrays in functions.
However, with functions, what other type could be possible?
void foo(){}
&foo; // #1
foo; // #2
Let's imagine that only #2 gives the type void(*)(), what would the type of &foo be? There is no other possibility.

Why is using the function name as a function pointer equivalent to applying the address-of operator to the function name?

It's interesting that using the function name as a function pointer is equivalent to applying the address-of operator to the function name!
Here's the example.
typedef bool (*FunType)(int);
bool f(int);
int main() {
FunType a = f;
FunType b = &a; // Sure, here's an error.
FunType c = &f; // This is not an error, though.
// It's equivalent to the statement without "&".
// So we have c equals a.
return 0;
}
Using the name is something we already know in array. But you can't write something like
int a[2];
int * b = &a; // Error!
It seems not consistent with other parts of the language. What's the rationale of this design?
This question explains the semantics of such behavior and why it works. But I'm interested in why the language was designed this way.
What's more interesting is the function type can be implicitly converted to pointer to itself when using as a parameter, but will not be converted to a pointer to itself when using as a return type!
Example:
typedef bool FunctionType(int);
void g(FunctionType); // Implicitly converted to void g(FunctionType *).
FunctionType h(); // Error!
FunctionType * j(); // Return a function pointer to a function
// that has the type of bool(int).
Since you specifically ask for the rationale of this behavior, here's the closest thing I can find (from the ANSI C90 Rationale document - http://www.lysator.liu.se/c/rat/c3.html#3-3-2-2):
3.3.2.2 Function calls
Pointers to functions may be used either as (*pf)() or as pf().
The latter construct, not sanctioned in the Base Document, appears in
some present versions of C, is unambiguous, invalidates no old code,
and can be an important shorthand. The shorthand is useful for
packages that present only one external name, which designates a
structure full of pointers to object s and functions : member
functions can be called as graphics.open(file) instead of
(*graphics.open)(file). The treatment of function designators can
lead to some curious , but valid , syntactic forms . Given the
declarations :
int f ( ) , ( *pf ) ( ) ;
then all of the following expressions are valid function calls :
( &f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();
The first expression on each line was discussed in the previous
paragraph . The second is conventional usage . All subsequent
expressions take advantage of the implicit conversion of a function
designator to a pointer value , in nearly all expression contexts .
The Committee saw no real harm in allowing these forms ; outlawing
forms like (*f)(), while still permitting *a (for int a[]),
simply seemed more trouble than it was worth .
Basically, the equivalence between function designators and function pointers was added to make using function pointers a little more convenient.
It's a feature inherited from C.
In C, it's allowed primarily because there's not much else the name of a function, by itself, could mean. All you can do with an actual function is call it. If you're not calling it, the only thing you can do is take the address. Since there's no ambiguity, any time a function name isn't followed by a ( to signify a call to the function, the name evaluates to the address of the function.
That actually is somewhat similar to one other part of the language -- the name of an array evaluates to the address of the first element of the array except in some fairly limited circumstances (being used as the operand of & or sizeof).
Since C allowed it, C++ does as well, mostly because the same remains true: the only things you can do with a function are call it or take its address, so if the name isn't followed by a ( to signify a function call, then the name evaluates to the address with no ambiguity.
For arrays, there is no pointer decay when the address-of operator is used:
int a[2];
int * p1 = a; // No address-of operator, so type is int*
int (*p2)[2] = &a; // Address-of operator used, so type is int (*)[2]
This makes sense because arrays and pointers are different types, and it is possible for example to return references to arrays or pass references to arrays in functions.
However, with functions, what other type could be possible?
void foo(){}
&foo; // #1
foo; // #2
Let's imagine that only #2 gives the type void(*)(), what would the type of &foo be? There is no other possibility.

Why is the dereference operator (*) also used to declare a pointer?

I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.
When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.
Consider the following:
void foo(int* bar)
{
}
int main()
{
int x = 5;
int* y = NULL;
y = &x;
*y = 15;
foo(y);
}
The function foo(int*) takes an int pointer as parameter. Since I've declared y as int pointer, I can pass y to foo, but when first learning C++ I associated the * symbol with dereferencing, as such I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
Wouldn't it have been easier to have a separate operator for declaring a pointer? (or for dereferencing). For example:
void test(int# x)
{
}
In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:
The second innovation that most clearly distinguishes C from its
predecessors is this fuller type structure and especially its
expression in the syntax of declarations... given an object of any
type, it should be possible to describe a new object that gathers
several into an array, yields it from a function, or is a pointer to
it.... [This] led to a
declaration syntax for names mirroring that of the expression syntax
in which the names typically appear. Thus,
int i, *pi, **ppi; declare an integer, a pointer to an integer, a
pointer to a pointer to an integer. The syntax of these declarations
reflects the observation that i, *pi, and **ppi all yield an int type
when used in an expression.
Similarly, int f(), *f(), (*f)(); declare
a function returning an integer, a function returning a pointer to an
integer, a pointer to a function returning an integer. int *api[10],
(*pai)[10]; declare an array of pointers to integers, and a pointer to
an array of integers.
In all these cases the declaration of a
variable resembles its usage in an expression whose type is the one
named at the head of the declaration.
An accident of syntax contributed to the perceived complexity of the
language. The indirection operator, spelled * in C, is syntactically a
unary prefix operator, just as in BCPL and B. This works well in
simple expressions, but in more complex cases, parentheses are
required to direct the parsing. For example, to distinguish
indirection through the value returned by a function from calling a
function designated by a pointer, one writes *fp() and (*pf)()
respectively. The style used in expressions carries through to
declarations, so the names might be declared
int *fp(); int (*pf)();
In more ornate but still realistic cases,
things become worse: int *(*pfp)(); is a pointer to a function
returning a pointer to an integer.
There are two effects occurring.
Most important, C has a relatively rich set of ways of describing
types (compared, say, with Pascal). Declarations in languages as
expressive as C—Algol 68, for example—describe objects equally hard to
understand, simply because the objects themselves are complex. A
second effect owes to details of the syntax. Declarations in C must be
read in an `inside-out' style that many find difficult to grasp.
Sethi [Sethi 81] observed that many of the nested
declarations and expressions would become simpler if the indirection
operator had been taken as a postfix operator instead of prefix, but
by then it was too late to change.
The reason is clearer if you write it like this:
int x, *y;
That is, both x and *y are ints. Thus y is an int *.
That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.
Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:
A pointer type
The dereference operator
Multiplication
You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that it they are a significant barrier to understanding for language newcomers.
Why not choose another symbol for C++?
Backwards-compatibility is the root cause... best to re-use existing symbols in a new context than to break C programs by translating previously-not-operators into new meanings.
Why not choose another symbol for C?
It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:
when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}
This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:
int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style
though both varieties are found in programs of both languages, varyingly.
Page 65 of Expert C Programming: Deep C Secrets includes the following: And then, there is the C philosophy that the declaration of an object should look like its use.
Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.
I prefer the way van der Linden puts it.
Haha, I feel your pain, I had the exact same problem.
I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.
After a while I thought for myself, every type in C has to be read backwards, like
int * const a
is for me
a constant something, when dereferenced equals an int.
Something that has to be dereferenced, has to be a pointer.

Is it legal to cast function pointers? [duplicate]

Let's say I have a function that accepts a void (*)(void*) function pointer for use as a callback:
void do_stuff(void (*callback_fp)(void*), void* callback_arg);
Now, if I have a function like this:
void my_callback_function(struct my_struct* arg);
Can I do this safely?
do_stuff((void (*)(void*)) &my_callback_function, NULL);
I've looked at this question and I've looked at some C standards which say you can cast to 'compatible function pointers', but I cannot find a definition of what 'compatible function pointer' means.
As far as the C standard is concerned, if you cast a function pointer to a function pointer of a different type and then call that, it is undefined behavior. See Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to
type (6.3.2.3).
Section 6.3.2.3, paragraph 8 reads:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted
pointer is used to call a function whose type is not compatible with the pointed-to type,
the behavior is undefined.
So in other words, you can cast a function pointer to a different function pointer type, cast it back again, and call it, and things will work.
The definition of compatible is somewhat complicated. It can be found in section 6.7.5.3, paragraph 15:
For two function types to be compatible, both shall specify compatible return types127.
Moreover, the parameter type lists, if both are present, shall agree in the number of
parameters and in use of the ellipsis terminator; corresponding parameters shall have
compatible types. If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that contains an empty
identifier list, the parameter list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the application of the
default argument promotions. If one type has a parameter type list and the other type is
specified by a function definition that contains a (possibly empty) identifier list, both shall
agree in the number of parameters, and the type of each prototype parameter shall be
compatible with the type that results from the application of the default argument
promotions to the type of the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with function or array
type is taken as having the adjusted type and each parameter declared with qualified type
is taken as having the unqualified version of its declared type.)
127) If both function types are ‘‘old style’’, parameter types are not compared.
The rules for determining whether two types are compatible are described in section 6.2.7, and I won't quote them here since they're rather lengthy, but you can read them on the draft of the C99 standard (PDF).
The relevant rule here is in section 6.7.5.1, paragraph 2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Hence, since a void* is not compatible with a struct my_struct*, a function pointer of type void (*)(void*) is not compatible with a function pointer of type void (*)(struct my_struct*), so this casting of function pointers is technically undefined behavior.
In practice, though, you can safely get away with casting function pointers in some cases. In the x86 calling convention, arguments are pushed on the stack, and all pointers are the same size (4 bytes in x86 or 8 bytes in x86_64). Calling a function pointer boils down to pushing the arguments on the stack and doing an indirect jump to the function pointer target, and there's obviously no notion of types at the machine code level.
Things you definitely can't do:
Cast between function pointers of different calling conventions. You will mess up the stack and at best, crash, at worst, succeed silently with a huge gaping security hole. In Windows programming, you often pass function pointers around. Win32 expects all callback functions to use the stdcall calling convention (which the macros CALLBACK, PASCAL, and WINAPI all expand to). If you pass a function pointer that uses the standard C calling convention (cdecl), badness will result.
In C++, cast between class member function pointers and regular function pointers. This often trips up C++ newbies. Class member functions have a hidden this parameter, and if you cast a member function to a regular function, there's no this object to use, and again, much badness will result.
Another bad idea that might sometimes work but is also undefined behavior:
Casting between function pointers and regular pointers (e.g. casting a void (*)(void) to a void*). Function pointers aren't necessarily the same size as regular pointers, since on some architectures they might contain extra contextual information. This will probably work ok on x86, but remember that it's undefined behavior.
I asked about this exact same issue regarding some code in GLib recently. (GLib is a core library for the GNOME project and written in C.) I was told the entire slots'n'signals framework depends upon it.
Throughout the code, there are numerous instances of casting from type (1) to (2):
typedef int (*CompareFunc) (const void *a,
const void *b)
typedef int (*CompareDataFunc) (const void *b,
const void *b,
void *user_data)
It is common to chain-thru with calls like this:
int stuff_equal (GStuff *a,
GStuff *b,
CompareFunc compare_func)
{
return stuff_equal_with_data(a, b, (CompareDataFunc) compare_func, NULL);
}
int stuff_equal_with_data (GStuff *a,
GStuff *b,
CompareDataFunc compare_func,
void *user_data)
{
int result;
/* do some work here */
result = compare_func (data1, data2, user_data);
return result;
}
See for yourself here in g_array_sort(): http://git.gnome.org/browse/glib/tree/glib/garray.c
The answers above are detailed and likely correct -- if you sit on the standards committee. Adam and Johannes deserve credit for their well-researched responses. However, out in the wild, you will find this code works just fine. Controversial? Yes. Consider this: GLib compiles/works/tests on a large number of platforms (Linux/Solaris/Windows/OS X) with a wide variety of compilers/linkers/kernel loaders (GCC/CLang/MSVC). Standards be damned, I guess.
I spent some time thinking about these answers. Here is my conclusion:
If you are writing a callback library, this might be OK. Caveat emptor -- use at your own risk.
Else, don't do it.
Thinking deeper after writing this response, I would not be surprised if the code for C compilers uses this same trick. And since (most/all?) modern C compilers are bootstrapped, this would imply the trick is safe.
A more important question to research: Can someone find a platform/compiler/linker/loader where this trick does not work? Major brownie points for that one. I bet there are some embedded processors/systems that don't like it. However, for desktop computing (and probably mobile/tablet), this trick probably still works.
The point really isn't whether you can. The trivial solution is
void my_callback_function(struct my_struct* arg);
void my_callback_helper(void* pv)
{
my_callback_function((struct my_struct*)pv);
}
do_stuff(&my_callback_helper);
A good compiler will only generate code for my_callback_helper if it's really needed, in which case you'd be glad it did.
You have a compatible function type if the return type and parameter types are compatible - basically (it's more complicated in reality :)). Compatibility is the same as "same type" just more lax to allow to have different types but still have some form of saying "these types are almost the same". In C89, for example, two structs were compatible if they were otherwise identical but just their name was different. C99 seem to have changed that. Quoting from the c rationale document (highly recommended reading, btw!):
Structure, union, or enumeration type declarations in two different translation units do not formally declare the same type, even if the text of these declarations come from the same include file, since the translation units are themselves disjoint. The Standard thus specifies additional compatibility rules for such types, so that if two such declarations are sufficiently similar they are compatible.
That said - yeah strictly this is undefined behavior, because your do_stuff function or someone else will call your function with a function pointer having void* as parameter, but your function has an incompatible parameter. But nevertheless, i expect all compilers to compile and run it without moaning. But you can do cleaner by having another function taking a void* (and registering that as callback function) which will just call your actual function then.
As C code compiles to instruction which do not care at all about pointer types, it's quite fine to use the code you mention. You'd run into problems when you'd run do_stuff with your callback function and pointer to something else then my_struct structure as argument.
I hope I can make it clearer by showing what would not work:
int my_number = 14;
do_stuff((void (*)(void*)) &my_callback_function, &my_number);
// my_callback_function will try to access int as struct my_struct
// and go nuts
or...
void another_callback_function(struct my_struct* arg, int arg2) { something }
do_stuff((void (*)(void*)) &another_callback_function, NULL);
// another_callback_function will look for non-existing second argument
// on the stack and go nuts
Basically, you can cast pointers to whatever you like, as long as the data continue to make sense at run-time.
Well, unless I understood the question wrong, you can just cast a function pointer this way.
void print_data(void *data)
{
// ...
}
((void (*)(char *)) &print_data)("hello");
A cleaner way would be to create a function typedef.
typedef void(*t_print_str)(char *);
((t_print_str) &print_data)("hello");
If you think about the way function calls work in C/C++, they push certain items on the stack, jump to the new code location, execute, then pop the stack on return. If your function pointers describe functions with the same return type and the same number/size of arguments, you should be okay.
Thus, I think you should be able to do so safely.
Void pointers are compatible with other types of pointer. It's the backbone of how malloc and the mem functions (memcpy, memcmp) work. Typically, in C (Rather than C++) NULL is a macro defined as ((void *)0).
Look at 6.3.2.3 (Item 1) in C99:
A pointer to void may be converted to or from a pointer to any incomplete or object type