forward declaration using extern (in context of C/C++) - c++

I was confused about whether to use extern for forward declarations of functions in C. The scenario is that each function is in a separate .c/.cpp file. From going through this question - External linkage of const in C - I understood that if all the files are .c files, I should not use extern for forward declarations, irrespective of whether the function is defined in the same file or not.
But I would like to know more about when to explicitly use extern for forward declarations (I think when the function being forward declared is defined in a different language than the calling one, extern would be required), and any caveats to be aware of.

I think when the function being forward declared is defined in a different language than calling one, extern would be required.
I should warn you that your question is poorly phrased. I'm pretty sure you're interested in when to use
extern "C" int foo(int);
in C++, but I'm not sure, and I have to hope I'm not wasting my time.
Let's distinguish between compiler and linker. I'm leaving out some details, but nothing that affects the answer to your question.
A forward declaration is used by the compiler. In declaring the function, you give the compiler information about how it is used. You declare function F, and when the compiler runs across F being used, it knows what to do. (In the K&R days, in the absence of a declaration the compiler used defaults, sometimes leading to hilarious results. So now they're mandatory in both C and C++.)
Usually a forward declaration of a function is a function prototype: it provides the parameter types. In C, that's not strictly necessary. You may write
int foo();
which tells the compiler foo is a function returning int, but not what parameters it takes. It then becomes your responsibility to get those right, because the compiler cannot check.
An extern declaration -- whether of a function or a variable -- notifies the compiler of a symbol and its type, and says the definition will be provided by another module. The compiler leaves a placeholder for the linker to fill later.
If you look at the getopt(3) man page, for example, you'll see optarg is declared extern; it's defined in the C runtime library, but you can use it because it's been declared.
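The declaration/definition split described above can be sketched in a single file. The names request_count and record_request are hypothetical, and the definition is placed in the same file only so the sketch links on its own; normally it would sit in another module.

```cpp
#include <cassert>

// Declaration only: promises that an int named request_count is defined
// somewhere in the program (normally in another source file).
extern int request_count;

// Code may use the variable as soon as it has been declared.
int record_request() {
    return ++request_count;
}

// The definition. In a real project this line would live in a different
// .cpp file; it is here only so the sketch links as a single file.
int request_count = 0;

// Small self-check: two calls observe the same shared object.
void check_request_count() {
    int first = record_request();
    int second = record_request();
    assert(second == first + 1);
}
```

Calling check_request_count() from main exercises the sketch.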
Now we come to the linker, and the extern "C" of C++.
To the linker, a module exposes symbols. Global variables and functions not declared static have external linkage, meaning they can be used by other modules.
In C, there's a 1:1 correspondence between function names and external symbols. The name is the symbol. For example, getopt(3) has a symbol of the same name in the C standard library, libc:
$ nm -g $(find /usr/lib/ -name libc.a) 2>/dev/null | grep 'T getopt'
0000000000001620 T getopt
0000000000000000 T getopt_long
0000000000000040 T getopt_long_only
In C++, the name is not the symbol. A C++ function can be overloaded; the same name can represent different functions taking different parameters. The compiler constructs a symbol that encodes the parameter types. Compare:
$ echo 'int foo(int foo) { return foo * foo; }' > a.c && cc -c a.c && nm a.o
0000000000000000 T foo
$ echo 'int foo(int foo) { return foo * foo; }' > a.C && c++ -c a.C && nm a.o
0000000000000000 T _Z3fooi
The encoding of the symbol name is often called name mangling. nm(1) has a demangling feature:
$ echo 'int foo(int foo) { return foo * foo; }' > a.C && c++ -c a.C && nm -C a.o
0000000000000000 T foo(int)
Note that the parameter type appears next to the name.
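The same demangling can be done from inside a program. This is a minimal sketch that assumes a GCC- or Clang-style toolchain using the Itanium C++ ABI, where <cxxabi.h> provides abi::__cxa_demangle; the helper name demangle is hypothetical, and other toolchains (MSVC, for example) use a different scheme.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>   // assumption: Itanium C++ ABI toolchain (GCC, Clang)

// Returns the human-readable form of a mangled symbol, or the input
// unchanged when the demangler rejects it.
std::string demangle(const char* symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);   // the buffer is allocated with malloc
    return result;
}

// Self-check using the symbol that nm printed above.
void check_demangle() {
    assert(demangle("_Z3fooi") == "foo(int)");
}
```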
In C++, extern "C" declares or defines the function as a C function. Here's a definition:
$ echo 'extern "C" int foo(int foo) { return foo * foo; }' > a.C && c++ -c a.C && nm a.o
0000000000000000 T foo
In a declaration, it tells the compiler that the function is defined elsewhere, and to use a C symbol.
In a definition, it tells the compiler to emit a C symbol, i.e. not mangle the name, so that the function can be used by other C modules.
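A common way to combine the two roles is a header that both C and C++ compilers can read, plus a definition compiled as C++. The sketch below shows that shape in one file; the function name square_c is hypothetical, and the __cplusplus guard is the usual idiom rather than something taken from the question.

```cpp
#include <cassert>

// Typical shape of a header shared by C and C++ code: the guard makes the
// linkage block visible only to C++ compilers.
#ifdef __cplusplus
extern "C" {
#endif

int square_c(int value);   /* hypothetical function name */

#ifdef __cplusplus
}
#endif

// Definition in a C++ file: extern "C" asks for an unmangled C symbol,
// so C modules can call it by the plain name square_c.
extern "C" int square_c(int value) {
    return value * value;
}

// Self-check of the definition.
void check_square_c() {
    assert(square_c(5) == 25);
}
```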
I hope that answers your question.

In C there is such a thing as extern inline, which has a special meaning and in which an explicit extern makes a difference.
Other than that, there's never any point in specifying an explicit extern in function declarations, since functions in C and C++ always have external linkage by default.
The matter of "different language" is probably about such things as extern "C", but that's a completely different construct from plain extern.

Functions have external linkage by default, which means that
extern int foo(void);
and
int foo(void);
have exactly the same meaning. This is true for all function declarations, except those involving static in some way.
In particular, the presence or absence of extern has no effect on type checking, and whether or not the function is defined in a different language is irrelevant.
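A small sketch of that equivalence: both spellings can appear in the same file because they redeclare the same function. The helper check_foo is hypothetical.

```cpp
#include <cassert>

extern int foo(void);  // explicit extern
int foo(void);         // no extern: redeclares the very same function

// One definition satisfies both declarations.
int foo(void) { return 7; }

// Self-check through a function pointer.
void check_foo() {
    int (*fp)(void) = foo;
    assert(fp() == 7);
}
```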

Related

how does overloading with C linkage work in C++?

Consider this example:
int foo(void);
extern "C" int foo(void);
int main()
{
return foo();
}
It errors out with:
$ g++ -c main.cpp
main.cpp:2:16: error: conflicting declaration of ‘int foo()’ with ‘C’ linkage
2 | extern "C" int foo(void);
| ^~~
main.cpp:1:5: note: previous declaration with ‘C++’ linkage
1 | int foo(void);
| ^~~
Which is perfectly fine.
But let's swap the first two lines:
extern "C" int foo(void);
int foo(void);
int main()
{
return foo();
}
Now it compiles without any error, but C linkage is chosen, even though the C++ declaration comes last.
Questions:
Why does case 2 compile while case 1 fails? I would expect them to behave the same way.
For case 2, which compiles, why is C linkage chosen, and is it possible to change that?
Background:
I have a C++ program in which a function name occasionally clashed with one in the C standard library. I would expect an error in such a case, but it compiled fine and the wrong linkage was chosen. I need either a way to make it fail consistently (so I can fix all such cases) or a way to force C++ linkage for all conflicting functions.
extern "C" int foo(void);
int foo(void);
Here foo is declared with extern "C". Re-declaration does not specify a different calling convention, so no problem.
But,
int foo(void);
extern "C" int foo(void);
Here the first line does not explicitly specify a calling convention, so the default, C++, is implicitly chosen. The second line then explicitly specifies a different calling convention, creating a conflict.
So, for question 2: it is not possible to change the calling convention after it has been set by the first declaration.
How to solve the problem... Use a different name for your C++ function. Or put it in a namespace. Or maybe isolate conflicting C++ name in a separate .cpp file, and export a different function (or just function pointer) for calling it elsewhere (see last paragraph below).
Another solution is to be careful about #include order, so the C++ function is always declared first. There is no "nice" solution to this, other than being pedantic about include order.
The only proper, valid solution is to not have any symbols that conflict with anything in the C++ standard library; the C++ standard does not allow such conflicts. Rename your own functions, or put them in your own namespace.
An important detail is, that the linker symbol for these functions is different. C++ does symbol name mangling to enable namespaces, methods and overloading. So it is possible for both functions to exist and be callable in a linked/running program. Problem is only at source code level.
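The namespace suggestion can be sketched as follows. The namespace app is hypothetical, and the extern "C" definition stands in for the C library so that the sketch links on its own.

```cpp
#include <cassert>

// Stand-in for a declaration that would normally arrive via a C header.
extern "C" int foo(void);

namespace app {
// The project's own function: C++ linkage, a different entity from ::foo.
int foo(void) { return 2; }
}

// Stand-in for the C library's definition so the sketch links on its own.
extern "C" int foo(void) { return 1; }

// Both functions exist side by side and are selected by qualification.
void check_namespaced_foo() {
    assert(::foo() == 1);
    assert(app::foo() == 2);
}
```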

g++ signature/symbol : no difference between static and non-static member function?

Two libs define the same class A in different ways (this is legacy-crap-code)
Prototypes for A:
in lib A:
#include <string>
struct A
{
static void func( const std::string& value);
};
in lib B:
#include <string>
struct A
{
void func( const std::string& value);
};
main.cpp uses A:s header from lib A (component A)
#include "liba.h"
int main()
{
A::func( "some stuff");
return 0;
}
main is linked with both lib A and lib B.
If lib B is linked before lib A (in the link directive) we get a core dump; hence lib B's definition is picked.
This is not the behavior I expected. I thought that there would be some difference between the symbols, so the loader/runtime linker could pick the right symbol. That is, the hidden this-pointer for non-static member functions is somehow included in the symbol.
Is this really conformant behavior?
Same behavior on both:
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
RHEL devtool with g++ 4.8.1
It is not possible to overload a non-static member function with a static one or vice versa. From the standard:
ISO 14882:2003 C++ Standard 13.1/2 – Overloadable declarations
Certain function declarations cannot
be overloaded:
Function declarations that differ only in the return type cannot be overloaded.
Member function declarations with the same name and the same parameter types cannot be overloaded if any of them is a static member function declaration (9.4).
More details and references might be found in question 5365714.
So you have two definitions of the same class A in the same program, which should be identical, and they are not.
Linkers are not required to signal an error when separate translation units contain inconsistent definitions. The program is ill-formed (updated as per #jonathan's comment). An illustrative example from Stroustrup in the C++ FAQ describes it as undefined behavior.
In the case of GCC, as you said, the definition used depends on the order of the libraries in the link command (assuming lib A and lib B are each compiled on their own and then linked with the main program). The linker uses the first definition found in the libraries passed from left to right.
A discussion on the link order options for GCC is in 409470.
You cannot overload functions in C++ based on the return type, so I would guess that you cannot do it on the basis of static versus non-static member functions either.
You will need to fix one of the header files -- preferably by not declaring the same type twice.
To illustrate look at this;
struct A {
int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
and compare it to this....
struct A {
static int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
And observe two things;
When I wrote the definition of A::X, I never specified whether it was a static member function. The compiler didn't care; it took that information from the definition of the struct.
The mangled symbol is the same, _ZN1A1XEi, whether the function is static or not. It encodes the name of the class, the name of the method, and the types of the arguments.
So in conclusion, using incorrect headers against compiled code would lead to undefined behavior....
Since a class cannot have both a static member function and non-static member function with the same name, there's no need to include that information in the mangled name.
You will need to solve this problem by including namespaces for your classes, renaming them, or being careful not to use the libraries together.
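Although the mangled symbol is the same, the two declarations differ inside the C++ type system, which is why calling one through the other's header goes wrong. A minimal sketch, using two hypothetical struct names so both variants fit in one file:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical names: one struct per variant of the member function.
struct WithMember { int X(int b); };
struct WithStatic { static int X(int b); };

int WithMember::X(int b) { return b + 8; }
int WithStatic::X(int b) { return b + 8; }

// A non-static member yields a pointer-to-member-function (it needs an
// object, i.e. the hidden this pointer)...
static_assert(
    std::is_member_function_pointer<decltype(&WithMember::X)>::value,
    "non-static member needs an object");

// ...while a static member yields an ordinary function pointer.
static_assert(
    std::is_same<decltype(&WithStatic::X), int (*)(int)>::value,
    "static member is an ordinary function");

// Self-check: each variant works when called the way it was declared.
void check_member_calls() {
    WithMember m;
    assert(m.X(1) == 9);
    assert(WithStatic::X(1) == 9);
}
```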

Typedef for a pointer to a C function inside C++

I'm trying to declare a function pointer type (not a variable) that would specify C calling convention. Both
extern "C" typedef void (*PFunc)();
and
typedef extern "C" void (*PFunc)();
produce a syntax error, when used on function level.
extern "C" { typedef void (*PFunc)(); }
extern "C" typedef void (*PFunc)();
both work when used on the global scope; I'd rather keep it local.
What's the proper way, please? I don't want to use compiler-specific extensions.
According to this, matching calling conventions between the pointer and the target is the safe thing to do when indirectly calling functions that are declared as extern "C", because C and C++ calling conventions might mismatch. In real life they mostly don't, but still, correctness.
Contrary to what some commenters have said here, calling convention/linkage is part of a function type. It has to be, otherwise this information would not be known when calling the function through the function pointer:
[C++11: 7.5/1]: All function types, function names with external linkage, and variable names with external linkage have a language linkage. [..]
The correct declaration is:
extern "C" typedef void (*PFunc)();
However, at block scope, you cannot use a linkage specification at all:
[C++11: 7.5/4]: Linkage specifications nest. When linkage specifications nest, the innermost one determines the language linkage. A linkage specification does not establish a scope. A linkage-specification shall occur only in namespace scope (3.3). In a linkage-specification, the specified language linkage applies to the function types of all function declarators, function names with external linkage, and variable names with external linkage declared within the linkage-specification. [..]
So, you will have to stick with a namespace-scope declaration. If you still want to restrict the visibility of the declaration, you could "shelter" it from other code in the same translation unit, using an unnamed namespace.
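A minimal sketch of the unnamed-namespace approach. The pointer type is given an int return value purely so the result can be checked; answer and call_through_pointer are hypothetical names.

```cpp
#include <cassert>

namespace {
// A linkage specification must appear at namespace scope; the unnamed
// namespace keeps the alias private to this translation unit.
extern "C" typedef int (*PFunc)(void);
}

// Hypothetical target function with C language linkage.
extern "C" int answer(void) { return 42; }

// The pointer type and the target agree on language linkage.
int call_through_pointer() {
    PFunc p = answer;
    return p();
}

// Self-check of the indirect call.
void check_pfunc() {
    assert(call_through_pointer() == 42);
}
```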
Yes, linkage specification (extern "C") should be in namespace scope (or global) by Standard.
extern "C" typedef void (*pf)();
is right. If your compiler doesn't accept it, it is a bug of the compiler.
Guessing, but it might be because you are declaring the function as (), in C that means it can take both an INT and VOID... Try explicitly saying (void)

Possible ambiguity with extern "C", overloading, and function pointers

With normal functions, one can write
extern "C" int Frotz(int); // in a header
int Frotz(int x) { return x; }
With function pointers, however, this appears to have been implemented inconsistently between compilers.
extern "C" int Klutz(int (*)(int), int);
int Klutz(int (*fptr)(int), int x) { return (*fptr)(x); }
In the declaration, the argument is also extern "C". In the definition, most compilers appear to match these functions and make Klutz an extern "C" function. The Sun and Cray compilers, however, interpret these functions as being different, producing an overloaded int Klutz(int (*fptr)(int), int x), which later generates a link-time error.
Although Section 7.5.5 of C++98 and C++11 guarantees the interpretation of Frotz, I cannot tell if the standard is ambiguous about whether extern "C" matching should occur before or after checking for overloading.
Should Klutz above generate a mangled (C++) symbol or an extern "C" symbol?
EDIT 1
I could use a typedef to disambiguate the function pointer to have C or C++ ABI, but I'm interested in whether the code here (a) defines Klutz to have C++ linkage, (b) defines it to have C linkage, or (c) is ambiguous according to the standard, so that compilers are free to choose how to interpret it.
EDIT 2
This appears to be a known issue, at least by those compilers with searchable bug trackers. In my tests, GCC, Clang, Intel, MSVC, IBM XL, PathScale, PGI, and Open64 all fail to distinguish function types that are identical except for language linkage, as explicitly required by the standard (see section 7.5.1, quoted in the accepted answer). Fixing this would break a lot of existing code and require an ABI change. I'm not aware of any compiler that actually uses a different calling convention for C versus C++ language linkage.
GCC bug: "Finding reasons to ask for the removal of this feature from the next standard is kind of relevant ;-)" ... "And we may even decide on an official WONTFIX."
Clang bug: "I'm terrified of actually enforcing this rule, because doing it properly means making language linkage part of the canonical type, which is going to break a ton of code."
The C ABI and the C++ ABI are not guaranteed to be the same. So, an extern "C" function pointer is a different type from a C++ function pointer. You need something like this:
extern "C" {
typedef int (*KlutzFuncType)(int);
int Klutz (KlutzFuncType, int);
}
int Klutz (KlutzFuncType fptr, int x) { return (*fptr)(x); }
There is some discussion of this issue here.
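For a version that should behave the same on compilers that enforce the rule and on those that ignore it, the callback can be given C linkage as well. This is a sketch; twice is a hypothetical callback.

```cpp
#include <cassert>

extern "C" {
typedef int (*KlutzFuncType)(int);  // pointer to a function with C linkage
int Klutz(KlutzFuncType, int);
int twice(int);                     // hypothetical callback, also C linkage
}

// The parameter type matches the declaration exactly, so this defines the
// extern "C" Klutz rather than introducing a C++ overload.
int Klutz(KlutzFuncType fptr, int x) { return (*fptr)(x); }

// Redeclaration without a linkage specification keeps the C linkage
// established by the earlier declaration.
int twice(int x) { return 2 * x; }

// Self-check of the call through the pointer.
void check_klutz() {
    assert(Klutz(twice, 21) == 42);
}
```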
I only have a copy of the draft. From 7.5p1:
Two function types with different language linkages are distinct types even if they are otherwise identical.
My reading of this is that the first parameter of your first Klutz has a different type than the first parameter of your second Klutz, and so your second Klutz should have C++ linkage.
There are C++ implementations that do not take language linkage into account for function types, despite what the standard says. In the following code snippet, KlutzCxxFuncType refers to a function with C++ linkage, while KlutzCFuncType refers to a function with C linkage.
typedef int (*KlutzCxxFuncType)(int);
extern "C" {
typedef int (*KlutzCFuncType)(int);
int Klutz (KlutzCFuncType, int);
}
int Klutz (KlutzCxxFuncType fptr, int x) { return (*fptr)(x); }
int Klutz (KlutzCFuncType fptr, int x) { return (*fptr)(x); }
A compiler that does not distinguish function types based on language linkage will generate a redefinition error on this code. For example, g++ 4.7.2 will emit:
prog.cpp: In function ‘int Klutz(KlutzCFuncType, int)’:
prog.cpp:9:5: error: redefinition of ‘int Klutz(KlutzCFuncType, int)’
prog.cpp:8:5: error: ‘int Klutz(KlutzCxxFuncType, int)’ previously defined here

Overload resolution with extern "C" linkage

In a mixed C/C++ project, we need to make a call from C to a C++ function. The function to be called is overloaded as three separate functions, but we can ignore that from the C-side, we just pick the one most suitable and stick to that one.
There are two ways to do this: (1) write a small C++ wrapper with an extern "C" function that forwards the call to the chosen overloaded function, or (2) the hackish way: just declare the one function we want to call from C as extern "C".
The question is: are there any disadvantages (apart from nightmares and bad karma) to going with the second variant? In other words, given three overloaded functions, one of which is declared extern "C", should we expect trouble on the C++ side, or is this well defined according to the standard?
I believe the language in the standard is specifically written to allow exactly one function with "C" linkage, and an arbitrary number of other functions with "C++" linkage that overload the same name (§[dcl.link]/6):
At most one function with a particular name can have C language linkage. Two declarations for a function with C language linkage with the same function name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same function. Two declarations for an object with C language linkage with the same name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same object.
The standard shows the following example:
complex sqrt(complex); // C++ linkage by default
extern "C" {
double sqrt(double); // C linkage
}
Even if it was allowed by the standard, future maintainers of the code will probably be extremely confused and might even remove the extern "C", breaking the C code (possibly far enough later that the events aren't linkable).
Just write the wrapper.
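A minimal sketch of such a wrapper. The namespace geometry, the overloads of scale, and the wrapper name geometry_scale_int are all hypothetical.

```cpp
#include <cassert>

namespace geometry {   // hypothetical overload set on the C++ side
int scale(int x) { return 2 * x; }
long scale(long x) { return 2L * x; }
double scale(double x) { return 2.0 * x; }
}

// The only symbol the C code needs to know about. The argument type picks
// the int overload; the other overloads are untouched.
extern "C" int geometry_scale_int(int x) {
    return geometry::scale(x);
}

// Self-check of the forwarding call.
void check_wrapper() {
    assert(geometry_scale_int(4) == 8);
}
```

On the C side, a declaration such as int geometry_scale_int(int); is all that is needed.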
EDIT:
From C++03 7.5/5:
If two declarations of the same function or object specify different linkage specifications (that is, the linkage specifications of these declarations specify different string literals), the program is ill-formed if the declarations appear in the same translation unit, and the one definition rule (3.2) applies if the declarations appear in different translation units...
I interpret this to not apply since C and C++ functions with the same name aren't actually the same function but this interpretation may be wrong.
Then from C++03 7.5/6:
At most one function with a particular name can have C language linkage...
This then implies that you could have other, non-C-linkage, functions with the same name. In this case, C++ overloads.
As long as you follow the other rules for extern-C functions (such as their special name requirements), specifying one of the overloads as extern-C is fine according to the standard. If you happen to use function pointers to these functions, be aware that language linkage is part of the function type, and needing a function pointer to this function may decide the issue for you.
Otherwise, I don't see any significant disadvantages. Even the potential disadvantage of copying parameters and return value can be mitigated by compiler- and implementation-specifics that allow you to inline the function – if that is determined to be a problem.
namespace your_project { // You do use one, right? :)
void f(int x);
void f(char x);
void f(other_overloads x);
}
extern "C"
void f(int x) {
your_project::f(x);
}
(This answer applies to C++14; other answers so far are C++03).
It is permitted to use overloading. If there is an extern "C" function definition of some particular name then the following conditions apply (references to C++14 in brackets):
The declaration of the extern "C" function must be visible at the point of any declaration or definition of overloads of that function name (7.5/5)
There must be no other extern "C" definition of a function or variable with the same name, anywhere. (7.5/6)
An overloaded function with the same name must not be declared at global scope. (7.5/6)
Within the same namespace as the extern "C" function, there must not be another function declaration with the same name and parameter list. (7.5/5)
If any violation of the above rules occurs in the same translation unit the compiler must diagnose it; otherwise it is undefined behaviour with no diagnostic required.
So your header file might look something like:
namespace foo
{
extern "C" void bar();
void bar(int);
void bar(std::string);
}
The last bullet point says that you cannot overload solely on linkage; this is ill-formed:
namespace foo
{
extern "C" void bar();
void bar(); // error
}
However you can do this at different namespaces:
extern "C" void bar();
namespace foo
{
void bar();
}
in which case the normal rules of unqualified lookup determine whether a call bar() in some code finds ::bar, finds foo::bar, or is ambiguous.
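Putting the header example together with definitions gives a sketch like the one below. The return values are hypothetical and exist only to show which overload is selected.

```cpp
#include <cassert>
#include <string>

namespace foo {
// Exactly one function named bar has C language linkage...
extern "C" int bar() { return 0; }

// ...and the remaining overloads have ordinary C++ linkage.
int bar(int n) { return n; }
int bar(const std::string& s) { return static_cast<int>(s.size()); }
}

// Self-check: overload resolution picks by argument as usual.
void check_bar_overloads() {
    assert(foo::bar() == 0);
    assert(foo::bar(5) == 5);
    assert(foo::bar(std::string("abc")) == 3);
}
```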