Is it safe to "play" with parameter constness in extern "C" declarations?

Is it safe to "play" with parameter constness in extern "C" declarations? - c++

Suppose I'm using some C library which has a function:
int foo(char* str);
and I know for a fact that foo() does not modify the memory pointed to by str. It's just poorly written and doesn't bother to declare str being constant.
Now, in my C++ code, I currently have:
extern "C" int foo(char* str);
and I use it like so:
foo(const_cast<char*>("Hello world"));
My question: Is it safe - in principle, from a language-lawyering perspective, and in practice - for me to write:
extern "C" int foo(const char* str);
and skip the const_cast'ing?
If it is not safe, please explain why.
Note: I am specifically interested in the case of C++98 code (yes, woe is me), so if you're assuming a later version of the language standard, please say so.

Is it safe for me to write: and skip the const_cast'ing?
No.
If it is not safe, please explain why.
-- From language side:
After reading the dcl.link I think exactly how the interoperability works between C and C++ is not exactly specified, with many "no diagnostic required" cases. The most important part is:
Two declarations for a function with C language linkage with the same function name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same function.
Because they refer to the same function, I believe a sane assumption would be that the declaration of a identifier with C language linkage on C++ side has to be compatible with the declaration of that symbol on C side. In C++ there is no concept of "compatible types", in C++ two declarations have to be identical (after transformations), making the restriction actually more strict.
From C++ side, we read c++draft basic#link-11:
After all adjustments of types (during which typedefs are replaced by their definitions), the types specified by all declarations referring to a given variable or function shall be identical, [...]
Because the declaration int foo(const char *str) with C language linkage in a C++ translation unit is not identical to the declaration int foo(char *str) declared in C translation unit (thus it has C language linkage), the behavior is undefined (with famous "no diagnostic required").
From C side (I think this is not even needed - the C++ side is enough to make the program have undefined behavior. anyway), the most important part would be C99 6.7.5.3p15:
For two function types to be compatible, both shall specify compatible return types. Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types [...]
Because from C99 6.7.5.1p2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
and C99 6.7.3p9:
For two qualified types to be compatible, both shall have the identically qualified version of a compatible type [...]
So because char is not compatible with const char, thus const char * is not compatible with char *, thus int foo(const char *) is not compatible with int foo(char*). Calling such a function (C99 6.5.2.2p9) would be undefined behavior (you may see also C99 J.2)
-- From practical side:
I do not believe will be able to find a compiler+architecture combination where one translation unit sees int foo(const char *) and the other translation unit defines a function int foo(char *) { /* some stuff */ } and it would "not work".
Theoretically, an insane implementation may use a different register to pass a const char* argument and a different one to pass a char* argument, which I hope would be well documented in that insane architecture ABI and compiler. If that's so, wrong registers will be used for parameters, it will "not work".
Still, using a simple wrapper costs nothing:
static inline int foo2(const char *var) {
return foo(static_cast<char*>(var));
}

I think the base answer is:
Yes, you can cast off const even if the referenced object is itself const such as a string literal in the example.
Undefined behaviour is only specified to arise in the event of an attempt to modify the const object not as a result of the cast.
Those rules and their reason to exist is 'old'. I'm sure they predate C++98.
Contrast it with volatile where any attempt to access a volatile object through a non-volatile reference is undefined behaviour. I can only read 'access' as read and/or write here.
I won't repeat the other suggestions but here is the most paranoid solution.
It's paranoid not because the C++ semantics aren't clear. They are clear. At least if you accept something being undefined behaviour is clear!
But you've described it as 'poorly written' and you want to put some sandbags round it!
The paranoid solution relies on the fact that if you are passing a constant object it will be constant for the whole execution (if the program doesn't risk UB).
So make a single copy of "hello world" lower in the call-stack or even initialised as a file scope object. You can declare it static in a function and it will (with minimal overhead) only be constructed once.
This recovers almost all of the benefits of string literal. The lower down the call stack including file-scope (global you put it the better.
I don't know how long the lifetime of the pointed-to object passed to foo() needs to be.
So it needs to be at least low enough in the chain to satisfy that condition.
NB: C++98 has std::string but it won't quite do here because you're still forbidden for modifying the result of c_str().
Here the semantics are defined.
#include <cstring>
#include <iostream>
class pseudo_const{
public:
pseudo_const(const char*const cstr): str(NULL){
const size_t sz=strlen(cstr)+1;
str=new char[sz];
memcpy(str,cstr,sz);
}
//Returns a pointer to a life-time permanent copy of
//the string passed to the constructor.
//Modifying the string through this value will be reflected in all
// subsequent calls.
char* get_constlike() const {
return str;
}
~pseudo_const(){
delete [] str;
}
private:
char* str;
};
const pseudo_const str("hello world");
int main() {
std::cout << str.get_constlike() << std::endl;
return 0;
}

Related

Why member function parameter const mismatch allowed? [duplicate]

From the C++ Primer 5th Edition, it says:
int f(int){ /* can write to parameter */}
int f(const int){ /* cannot write to parameter */}
The two functions are indistinguishable. But as you know, the two functions really differ in how they can update their parameters.
Can someone explains to me?
EDIT
I think I didn't interpret my question well. What I really care is why C++ doesn't allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
EDIT
The nature of pass by value is actually pass by copying argument values to parameter values. Even for references and pointers where thee copied values are addresses. From the caller's viewpoint, whether const or non-const is passed to the function does not influence values (and of course types of) copied to parameters.
The distinction between top-level const and low-level const matters when copying objects. More specifically, top-level const(not the case of low-level const) is ignored when copying objects since copying won't influence the copied object. It is immaterial whether the object copied to or copied from is const or not.
So for the caller, differentiating them is not necessary. Likely, from the function viewpoint, the top-level const parameters doesn't influence the interface and/or the functionality of function. The two function actually accomplish the same thing. Why bother implementing two copies?

allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
Overloading of functions is based on the parameters the caller provides. Here, it's true that the caller may provide a const or non-const value but logically it should make no difference to the functionality that the called function provides. Consider:
f(3);
int x = 1 + 2;
f(x);
If f() does different thing in each of these situations, it would be very confusing! The programmer of this code calling f() can have a reasonable expectation of identical behaviour, freely adding or removing variables that pass parameters without it invalidating the program. This safe, sane behaviour is the point of departure that you'd want to justify exceptions to, and indeed there is one - behaviours can be varied when the function's overloaded ala:
void f(const int&) { ... }
void f(int&) { ... }
So, I guess this is what you find non-intuitive: that C++ provides more "safety" (enforced consistent behaviour through supporting only a single implementation) for non-references than references.
The reasons I can think of are:
So when a programmer knows a non-const& parameter will have a longer lifetime, they can select an optimal implementation. For example, in the code below it may be faster to return a reference to a T member within F, but if F is a temporary (which it might be if the compiler matches const F&) then a by-value return is needed. This is still pretty dangerous as the caller has to be aware that the returned reference is only valid as long as the parameter's around.
T f(const F&);
T& f(F&); // return type could be by const& if more appropriate
propagation of qualifiers like const-ness through function calls as in:
const T& f(const F&);
T& f(F&);
Here, some (presumably F member-) variable of type T is being exposed as const or non-const based on the const-ness of the parameter when f() is called. This type of interface might be chosen when wishing to extend a class with non-member functions (to keep the class minimalist, or when writing templates/algos usable on many classes), but the idea is similar to const member functions like vector::operator[](), where you want v[0] = 3 allowed on a non-const vector but not a const one.
When values are accepted by value they go out of scope as the function returns, so there's no valid scenario involving returning a reference to part of the parameter and wanting to propagate its qualifiers.
Hacking the behaviour you want
Given the rules for references, you can use them to get the kind of behaviour you want - you just need to be careful not to modify the by-non-const-reference parameter accidentally, so might want to adopt a practice like the following for the non-const parameters:
T f(F& x_ref)
{
F x = x_ref; // or const F is you won't modify it
...use x for safety...
}
Recompilation implications
Quite apart from the question of why the language forbids overloading based on the const-ness of a by-value parameter, there's the question of why it doesn't insist on consistency of const-ness in the declaration and definition.
For f(const int) / f(int)... if you are declaring a function in a header file, then it's best NOT to include the const qualifier even if the later definition in an implementation file will have it. This is because during maintenance the programmer may wish to remove the qualifier... removing it from the header may trigger a pointless recompilation of client code, so it's better not to insist they be kept in sync - and indeed that's why the compiler doesn't produce an error if they differ. If you just add or remove const in the function definition, then it's close to the implementation where the reader of the code might care about the constness when analysing the function behaviour. If you have it const in both header and implementation file, then the programmer wishes to make it non-const and forgets or decides not to update the header in order to avoid client recompilation, then it's more dangerous than the other way around as it's possible the programmer will have the const version from the header in mind when trying to analyse the current implementation code leading to wrong reasoning about the function behaviour. This is all a very subtle maintainence issue - only really relevant to commercial programming - but that's the basis of the guideline not to use const in the interface. Further, it's more concise to omit it from the interface, which is nicer for client programmers reading over your API.

Since there is no difference to the caller, and no clear way to distinguish between a call to a function with a top level const parameter and one without, the language rules ignore top level consts. This means that these two
void foo(const int);
void foo(int);
are treated as the same declaration. If you were to provide two implementations, you would get a multiple definition error.
There is a difference in a function definition with top level const. In one, you can modify your copy of the parameter. In the other, you can't. You can see it as an implementation detail. To the caller, there is no difference.
// declarations
void foo(int);
void bar(int);
// definitions
void foo(int n)
{
n++;
std::cout << n << std::endl;
}
void bar(const int n)
{
n++; // ERROR!
std::cout << n << std::endl;
}
This is analogous to the following:
void foo()
{
int = 42;
n++;
std::cout << n << std::endl;
}
void bar()
{
const int n = 42;
n++; // ERROR!
std::cout << n << std::endl;
}

In "The C++ Programming Language", fourth edition, Bjarne Stroustrup writes (§12.1.3):
Unfortunately, to preserve C compatibility, a const is ignored at the highest level of an argument type. For example, this is two declarations of the same function:
void f(int);
void f(const int);
So, it seems that, contrarily to some of the other answers, this rule of C++ was not chosen because of the indistinguishability of the two functions, or other similar rationales, but instead as a less-than-optimal solution, for the sake of compatibility.
Indeed, in the D programming language, it is possible to have those two overloads. Yet, contrarily to what other answers to this question might suggest, the non-const overload is preferred if the function is called with a literal:
void f(int);
void f(const int);
f(42); // calls void f(int);
Of course, you should provide equivalent semantics for your overloads, but that is not specific to this overloading scenario, with nearly indistinguishable overloading functions.

As the comments say, inside the first function the parameter could be changed, if it had been named. It is a copy of the callee's int. Inside the second function, any changes to the parameter, which is still a copy of the callee's int, will result in a compile error. Const is a promise you won't change the variable.

A function is useful only from the caller's perspective.
Since there is no difference to the caller, there is no difference, for these two functions.

I think the indistinguishable is used in the terms of overloading and compiler, not in terms if they can be distinguished by caller.
Compiler does not distinguish between those two functions, their names are mangled in the same way. That leads to situation when compiler treats those two declarations as redefinition.

Answering this part of your question:
What I really care is why C++ doesn't allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
If you think about it a little more, it isn't at all intinuitive - in fact, it doesn't make much sense. As everybody else has said, a caller is in no way influenced when a functon takes it's parameter by value and it doesn't care, either.
Now, let's suppose for a moment that overload resolution worked on top level const, too. Two declarations like this
int foo(const int);
int foo(int);
would declare two different functions. One of the problems would be which functions would this expression call: foo(42). The language rules could say that literals are const and that the const "overload" would be called in this case. But that's the least of a problem.
A programmer feeling sufficiently evil could write this:
int foo(const int i) { return i*i; }
int foo(int i) { return i*2; }
Now you'd have two overloads that are appear semanticaly equivalent to the caller but do completely different things. Now that would be bad. We'd be able to write interfaces that limit the user by the way they do things, not by what they offer.

Top-level const doesn't influence a function signature

From the C++ Primer 5th Edition, it says:
int f(int){ /* can write to parameter */}
int f(const int){ /* cannot write to parameter */}
The two functions are indistinguishable. But as you know, the two functions really differ in how they can update their parameters.
Can someone explains to me?
EDIT
I think I didn't interpret my question well. What I really care is why C++ doesn't allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
EDIT
The nature of pass by value is actually pass by copying argument values to parameter values. Even for references and pointers where thee copied values are addresses. From the caller's viewpoint, whether const or non-const is passed to the function does not influence values (and of course types of) copied to parameters.
The distinction between top-level const and low-level const matters when copying objects. More specifically, top-level const(not the case of low-level const) is ignored when copying objects since copying won't influence the copied object. It is immaterial whether the object copied to or copied from is const or not.
So for the caller, differentiating them is not necessary. Likely, from the function viewpoint, the top-level const parameters doesn't influence the interface and/or the functionality of function. The two function actually accomplish the same thing. Why bother implementing two copies?

allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
Overloading of functions is based on the parameters the caller provides. Here, it's true that the caller may provide a const or non-const value but logically it should make no difference to the functionality that the called function provides. Consider:
f(3);
int x = 1 + 2;
f(x);
If f() does different thing in each of these situations, it would be very confusing! The programmer of this code calling f() can have a reasonable expectation of identical behaviour, freely adding or removing variables that pass parameters without it invalidating the program. This safe, sane behaviour is the point of departure that you'd want to justify exceptions to, and indeed there is one - behaviours can be varied when the function's overloaded ala:
void f(const int&) { ... }
void f(int&) { ... }
So, I guess this is what you find non-intuitive: that C++ provides more "safety" (enforced consistent behaviour through supporting only a single implementation) for non-references than references.
The reasons I can think of are:
So when a programmer knows a non-const& parameter will have a longer lifetime, they can select an optimal implementation. For example, in the code below it may be faster to return a reference to a T member within F, but if F is a temporary (which it might be if the compiler matches const F&) then a by-value return is needed. This is still pretty dangerous as the caller has to be aware that the returned reference is only valid as long as the parameter's around.
T f(const F&);
T& f(F&); // return type could be by const& if more appropriate
propagation of qualifiers like const-ness through function calls as in:
const T& f(const F&);
T& f(F&);
Here, some (presumably F member-) variable of type T is being exposed as const or non-const based on the const-ness of the parameter when f() is called. This type of interface might be chosen when wishing to extend a class with non-member functions (to keep the class minimalist, or when writing templates/algos usable on many classes), but the idea is similar to const member functions like vector::operator[](), where you want v[0] = 3 allowed on a non-const vector but not a const one.
When values are accepted by value they go out of scope as the function returns, so there's no valid scenario involving returning a reference to part of the parameter and wanting to propagate its qualifiers.
Hacking the behaviour you want
Given the rules for references, you can use them to get the kind of behaviour you want - you just need to be careful not to modify the by-non-const-reference parameter accidentally, so might want to adopt a practice like the following for the non-const parameters:
T f(F& x_ref)
{
F x = x_ref; // or const F is you won't modify it
...use x for safety...
}
Recompilation implications
Quite apart from the question of why the language forbids overloading based on the const-ness of a by-value parameter, there's the question of why it doesn't insist on consistency of const-ness in the declaration and definition.
For f(const int) / f(int)... if you are declaring a function in a header file, then it's best NOT to include the const qualifier even if the later definition in an implementation file will have it. This is because during maintenance the programmer may wish to remove the qualifier... removing it from the header may trigger a pointless recompilation of client code, so it's better not to insist they be kept in sync - and indeed that's why the compiler doesn't produce an error if they differ. If you just add or remove const in the function definition, then it's close to the implementation where the reader of the code might care about the constness when analysing the function behaviour. If you have it const in both header and implementation file, then the programmer wishes to make it non-const and forgets or decides not to update the header in order to avoid client recompilation, then it's more dangerous than the other way around as it's possible the programmer will have the const version from the header in mind when trying to analyse the current implementation code leading to wrong reasoning about the function behaviour. This is all a very subtle maintainence issue - only really relevant to commercial programming - but that's the basis of the guideline not to use const in the interface. Further, it's more concise to omit it from the interface, which is nicer for client programmers reading over your API.

Since there is no difference to the caller, and no clear way to distinguish between a call to a function with a top level const parameter and one without, the language rules ignore top level consts. This means that these two
void foo(const int);
void foo(int);
are treated as the same declaration. If you were to provide two implementations, you would get a multiple definition error.
There is a difference in a function definition with top level const. In one, you can modify your copy of the parameter. In the other, you can't. You can see it as an implementation detail. To the caller, there is no difference.
// declarations
void foo(int);
void bar(int);
// definitions
void foo(int n)
{
n++;
std::cout << n << std::endl;
}
void bar(const int n)
{
n++; // ERROR!
std::cout << n << std::endl;
}
This is analogous to the following:
void foo()
{
int = 42;
n++;
std::cout << n << std::endl;
}
void bar()
{
const int n = 42;
n++; // ERROR!
std::cout << n << std::endl;
}

In "The C++ Programming Language", fourth edition, Bjarne Stroustrup writes (§12.1.3):
Unfortunately, to preserve C compatibility, a const is ignored at the highest level of an argument type. For example, this is two declarations of the same function:
void f(int);
void f(const int);
So, it seems that, contrarily to some of the other answers, this rule of C++ was not chosen because of the indistinguishability of the two functions, or other similar rationales, but instead as a less-than-optimal solution, for the sake of compatibility.
Indeed, in the D programming language, it is possible to have those two overloads. Yet, contrarily to what other answers to this question might suggest, the non-const overload is preferred if the function is called with a literal:
void f(int);
void f(const int);
f(42); // calls void f(int);
Of course, you should provide equivalent semantics for your overloads, but that is not specific to this overloading scenario, with nearly indistinguishable overloading functions.

As the comments say, inside the first function the parameter could be changed, if it had been named. It is a copy of the callee's int. Inside the second function, any changes to the parameter, which is still a copy of the callee's int, will result in a compile error. Const is a promise you won't change the variable.

A function is useful only from the caller's perspective.
Since there is no difference to the caller, there is no difference, for these two functions.

I think the indistinguishable is used in the terms of overloading and compiler, not in terms if they can be distinguished by caller.
Compiler does not distinguish between those two functions, their names are mangled in the same way. That leads to situation when compiler treats those two declarations as redefinition.

Answering this part of your question:
What I really care is why C++ doesn't allow these two functions simultaneously as different function since they are really different as to "whether parameter can be written or not". Intuitively, it should be!
If you think about it a little more, it isn't at all intinuitive - in fact, it doesn't make much sense. As everybody else has said, a caller is in no way influenced when a functon takes it's parameter by value and it doesn't care, either.
Now, let's suppose for a moment that overload resolution worked on top level const, too. Two declarations like this
int foo(const int);
int foo(int);
would declare two different functions. One of the problems would be which functions would this expression call: foo(42). The language rules could say that literals are const and that the const "overload" would be called in this case. But that's the least of a problem.
A programmer feeling sufficiently evil could write this:
int foo(const int i) { return i*i; }
int foo(int i) { return i*2; }
Now you'd have two overloads that are appear semanticaly equivalent to the caller but do completely different things. Now that would be bad. We'd be able to write interfaces that limit the user by the way they do things, not by what they offer.

Different declarations of the same function/global variable in two files

I have 2 questions regarding different declarations of the same function and global variable in two files in case of C and C++ as well.
Different function declarations
Consider the following code fragments:
file_1.c
void foo(int a);
int main(void)
{
foo('A');
}
file_2.c
#include <stdio.h>
void foo(char a)
{
printf("%c", a); //prints 'A' (gcc)
}
As we can see, the prototype differs from the definition located in
file_2.c, however, the function prints expected value.
If it comes to C++, the above program is invalid due to undefined
reference to foo(int) at link time. It's probably caused by
presence of other function signatures - in comparison with C, where
a function name doesn't contain any extra characters indicating the
type of function arguments.
But when it comes to C then what? Since the prototypes with the same
name have the same signature regardless of the number of arguments
and its types, linker won't issue an error. But which type
conversions are performed in here? Does it look like this: 'A' ->
int -> back to char? Or maybe this behavior is
undefined/implementation-defined ?
Different declarations of a global variable
We've got two files and two different declarations of the same
global variable:
file_1.c
#include <stdio.h>
extern int a;
int main(void)
{
printf("%d", a); //prints 65 (g++ and gcc)
}
file_2.c
char a = 'A';
Both in C and C++ the output is 65.
Though I'd like to know what both standards say about that kind of
situation.
In the C11 standard I've found the following fragment:
J.5.11 Multiple external definitions (Annex J.5 Common extensions)
There may be more than one external definition for the identifier of
an object, with or without the explicit use of the keyword extern; if
the definitions disagree, or more than one is initialized, the
behavior is undefined (6.9.2).
Notice that it refers to presence of two and more definitions, in
my code there is only one, so I'm not sure whether this article is a good point of reference in
this case...

Q1. According to C99 specification, section 6.5.2.2.9, it is an undefined behavior in C:
If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
The expression "points to" a function taking an int, while the function is defined as taking a char.
Q2. The case with variables is also undefined behavior, because you are reading or assigning an int to/from char. Assuming 4-byte integers, this will access three bytes past the memory location where it is valid. You can test this by declaring more variables, like this:
char a = 'A';
char b = 'B';
char c = 'C';
char d = 'D';

That's why you put declarations into headers, so even a C compiler can catch the problem.
1)
The results of this is pretty much random; in your case, the "char" parameter might be passed as an int (like in a register, or even on the stack to keep alignment, or whatever). Or you got lucky due to endianess, which keeps the lowest order byte first.
2)
Likely to be a lucky outcome due to endianess and some added '0' bytes to fill up the segment. Again, don't rely on it.

Overloaded functions in C++ work because the compiler encodes each unique method and parameter list combination into a unique name for the linker. This encoding process is called mangling,
and the inverse process demangling.
But there is no such thing in C. When the compiler encounters a symbol (either a variable or function name) that is not defined in the current module, it assumes that it is defined in some other module, generates a linker symbol table entry, and leaves it for the linker to handle. In here we have no parameter checking.
And also if there is no type conversion in here. In main, you send a value to foo. Here it's assembly code :
movl $65, (%esp)
call foo
And foo reads it by taking it away from stack. Since it's input value defined as char, It store the input value in al register ( one byte ):
movb %al, -4(%ebp)
So for given inputs greater than 256, you will see variable a in foo, circulates over 256.
About your second question, In C symbols for initialized variables and functions are defined as strong and multiple strong symbbols are not allowed, but I not sure whether is it the case with C++ or not.

Just so you know, I've accidentally found the paragraph in C11 standard that covers both issues - it's 6.2.7.2:
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.

Is it legal to cast function pointers? [duplicate]

Let's say I have a function that accepts a void (*)(void*) function pointer for use as a callback:
void do_stuff(void (*callback_fp)(void*), void* callback_arg);
Now, if I have a function like this:
void my_callback_function(struct my_struct* arg);
Can I do this safely?
do_stuff((void (*)(void*)) &my_callback_function, NULL);
I've looked at this question and I've looked at some C standards which say you can cast to 'compatible function pointers', but I cannot find a definition of what 'compatible function pointer' means.

As far as the C standard is concerned, if you cast a function pointer to a function pointer of a different type and then call that, it is undefined behavior. See Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to
type (6.3.2.3).
Section 6.3.2.3, paragraph 8 reads:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted
pointer is used to call a function whose type is not compatible with the pointed-to type,
the behavior is undefined.
So in other words, you can cast a function pointer to a different function pointer type, cast it back again, and call it, and things will work.
The definition of compatible is somewhat complicated. It can be found in section 6.7.5.3, paragraph 15:
For two function types to be compatible, both shall specify compatible return types127.
Moreover, the parameter type lists, if both are present, shall agree in the number of
parameters and in use of the ellipsis terminator; corresponding parameters shall have
compatible types. If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that contains an empty
identifier list, the parameter list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the application of the
default argument promotions. If one type has a parameter type list and the other type is
specified by a function definition that contains a (possibly empty) identifier list, both shall
agree in the number of parameters, and the type of each prototype parameter shall be
compatible with the type that results from the application of the default argument
promotions to the type of the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with function or array
type is taken as having the adjusted type and each parameter declared with qualified type
is taken as having the unqualified version of its declared type.)
127) If both function types are ‘‘old style’’, parameter types are not compared.
The rules for determining whether two types are compatible are described in section 6.2.7, and I won't quote them here since they're rather lengthy, but you can read them on the draft of the C99 standard (PDF).
The relevant rule here is in section 6.7.5.1, paragraph 2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Hence, since a void* is not compatible with a struct my_struct*, a function pointer of type void (*)(void*) is not compatible with a function pointer of type void (*)(struct my_struct*), so this casting of function pointers is technically undefined behavior.
In practice, though, you can safely get away with casting function pointers in some cases. In the x86 calling convention, arguments are pushed on the stack, and all pointers are the same size (4 bytes in x86 or 8 bytes in x86_64). Calling a function pointer boils down to pushing the arguments on the stack and doing an indirect jump to the function pointer target, and there's obviously no notion of types at the machine code level.
Things you definitely can't do:
Cast between function pointers of different calling conventions. You will mess up the stack and at best, crash, at worst, succeed silently with a huge gaping security hole. In Windows programming, you often pass function pointers around. Win32 expects all callback functions to use the stdcall calling convention (which the macros CALLBACK, PASCAL, and WINAPI all expand to). If you pass a function pointer that uses the standard C calling convention (cdecl), badness will result.
In C++, cast between class member function pointers and regular function pointers. This often trips up C++ newbies. Class member functions have a hidden this parameter, and if you cast a member function to a regular function, there's no this object to use, and again, much badness will result.
Another bad idea that might sometimes work but is also undefined behavior:
Casting between function pointers and regular pointers (e.g. casting a void (*)(void) to a void*). Function pointers aren't necessarily the same size as regular pointers, since on some architectures they might contain extra contextual information. This will probably work ok on x86, but remember that it's undefined behavior.

I asked about this exact same issue regarding some code in GLib recently. (GLib is a core library for the GNOME project and written in C.) I was told the entire slots'n'signals framework depends upon it.
Throughout the code, there are numerous instances of casting from type (1) to (2):
typedef int (*CompareFunc) (const void *a,
const void *b)
typedef int (*CompareDataFunc) (const void *b,
const void *b,
void *user_data)
It is common to chain-thru with calls like this:
int stuff_equal (GStuff *a,
GStuff *b,
CompareFunc compare_func)
{
return stuff_equal_with_data(a, b, (CompareDataFunc) compare_func, NULL);
}
int stuff_equal_with_data (GStuff *a,
GStuff *b,
CompareDataFunc compare_func,
void *user_data)
{
int result;
/* do some work here */
result = compare_func (data1, data2, user_data);
return result;
}
See for yourself here in g_array_sort(): http://git.gnome.org/browse/glib/tree/glib/garray.c
The answers above are detailed and likely correct -- if you sit on the standards committee. Adam and Johannes deserve credit for their well-researched responses. However, out in the wild, you will find this code works just fine. Controversial? Yes. Consider this: GLib compiles/works/tests on a large number of platforms (Linux/Solaris/Windows/OS X) with a wide variety of compilers/linkers/kernel loaders (GCC/CLang/MSVC). Standards be damned, I guess.
I spent some time thinking about these answers. Here is my conclusion:
If you are writing a callback library, this might be OK. Caveat emptor -- use at your own risk.
Else, don't do it.
Thinking deeper after writing this response, I would not be surprised if the code for C compilers uses this same trick. And since (most/all?) modern C compilers are bootstrapped, this would imply the trick is safe.
A more important question to research: Can someone find a platform/compiler/linker/loader where this trick does not work? Major brownie points for that one. I bet there are some embedded processors/systems that don't like it. However, for desktop computing (and probably mobile/tablet), this trick probably still works.

The point really isn't whether you can. The trivial solution is
void my_callback_function(struct my_struct* arg);
void my_callback_helper(void* pv)
{
my_callback_function((struct my_struct*)pv);
}
do_stuff(&my_callback_helper);
A good compiler will only generate code for my_callback_helper if it's really needed, in which case you'd be glad it did.

You have a compatible function type if the return type and parameter types are compatible - basically (it's more complicated in reality :)). Compatibility is the same as "same type" just more lax to allow to have different types but still have some form of saying "these types are almost the same". In C89, for example, two structs were compatible if they were otherwise identical but just their name was different. C99 seem to have changed that. Quoting from the c rationale document (highly recommended reading, btw!):
Structure, union, or enumeration type declarations in two different translation units do not formally declare the same type, even if the text of these declarations come from the same include file, since the translation units are themselves disjoint. The Standard thus specifies additional compatibility rules for such types, so that if two such declarations are sufficiently similar they are compatible.
That said - yeah strictly this is undefined behavior, because your do_stuff function or someone else will call your function with a function pointer having void* as parameter, but your function has an incompatible parameter. But nevertheless, i expect all compilers to compile and run it without moaning. But you can do cleaner by having another function taking a void* (and registering that as callback function) which will just call your actual function then.

As C code compiles to instruction which do not care at all about pointer types, it's quite fine to use the code you mention. You'd run into problems when you'd run do_stuff with your callback function and pointer to something else then my_struct structure as argument.
I hope I can make it clearer by showing what would not work:
int my_number = 14;
do_stuff((void (*)(void*)) &my_callback_function, &my_number);
// my_callback_function will try to access int as struct my_struct
// and go nuts
or...
void another_callback_function(struct my_struct* arg, int arg2) { something }
do_stuff((void (*)(void*)) &another_callback_function, NULL);
// another_callback_function will look for non-existing second argument
// on the stack and go nuts
Basically, you can cast pointers to whatever you like, as long as the data continue to make sense at run-time.

Well, unless I understood the question wrong, you can just cast a function pointer this way.
void print_data(void *data)
{
// ...
}
((void (*)(char *)) &print_data)("hello");
A cleaner way would be to create a function typedef.
typedef void(*t_print_str)(char *);
((t_print_str) &print_data)("hello");

If you think about the way function calls work in C/C++, they push certain items on the stack, jump to the new code location, execute, then pop the stack on return. If your function pointers describe functions with the same return type and the same number/size of arguments, you should be okay.
Thus, I think you should be able to do so safely.

Void pointers are compatible with other types of pointer. It's the backbone of how malloc and the mem functions (memcpy, memcmp) work. Typically, in C (Rather than C++) NULL is a macro defined as ((void *)0).
Look at 6.3.2.3 (Item 1) in C99:
A pointer to void may be converted to or from a pointer to any incomplete or object type

using a static const int in a struct/class

struct A {
static const int a = 5;
struct B {
static const int b = a;
};
};
int main() {
return A::B::b;
}
The above code compiles. However if you go by Effective C++ book by Scott Myers(pg 14);
We need a definition for a in addition to the declaration.
Can anyone explain why this is an exception?

C++ compilers allow static const integers (and integers only) to have their value specified at the location they are declared. This is because the variable is essentially not needed, and lives only in the code (it is typically compiled out).
Other variable types (such as static const char*) cannot typically be defined where they are declared, and require a separate definition.
For a tiny bit more explanation, realize that accessing a global variable typically requires making an address reference in the lower-level code. But your global variable is an integer whose size is this typically around the size of an address, and the compiler realizes it will never change, so why bother adding the pointer abstraction?

By really pedantic rules, yes, your code needs a definition for that static integer.
But by practical rules, and what all compilers implement because that's how the rules of C++03 are intended - no, you don't need a definition.
The rules for such static constant integers are intended to allow you to omit the definition if the integer is used only in such situations where a value is immediately read, and if the static member can be used in constant expressions.
In your return statement, the value of the member is immediately read, so you can omit the definition of the static constant integer member if that's the only use of it. The following situation needs a definition, however:
struct A {
static const int a = 5;
struct B {
static const int b = a;
};
};
int main() {
int *p = &A::B::b;
}
No value is read here - but instead the address of it is taken. Therefore, the intent of the C++03 Standard is that you have to provide a definition for the member like the following in some implementation file.
const int A::B::b;
Note that the actual rules appearing in the C++03 Standard says that a definition is not required only where the variable is used where a constant expression is required. That rule, however, if strictly applied, is too strict. It would only allow you to omit a definition for situation like array-dimensions - but would require a definition in cases like a return statement. The corresponding defect report is here.
The wording of C++0x has been updated to include that defect report resolution, and to allow your code as written.

However, if you try the ternary operand without "defining" static consts, you get a linker error in GCC 4x:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13795
So, although constructs like int k = A::CONSTVAL; are illegal in the current standard, they are supported. But the ternary operand is not. Some operators are more equal than others, if you get my drift :)
So much for "lax" rules. I suggest you write code conforming to the standard if you do not want surprises.

In general, most (and recent) C++ compilers allow static const ints
You just lucky, perhaps not. Try older compiler, such as gcc 2.0 and it will vehemently punish you with-less-than-pretty error message.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Is it safe to "play" with parameter constness in extern "C" declarations? - c++

Related

Why member function parameter const mismatch allowed? [duplicate]

Top-level const doesn't influence a function signature

Different declarations of the same function/global variable in two files

Is it legal to cast function pointers? [duplicate]

using a static const int in a struct/class

Categories

Resources