Why does the C++ compiler allow the extern keyword combined with a definition?

I accidentally made an error using the extern keyword and then discovered that the compiler allowed my line of code. Why is the following program allowed? Does the compiler strip off the extern keyword? It does not even give a warning.
#include <iostream>
extern void test() { std::cout << "Hello world!" << std::endl; };
int main()
{
test();
}

This is not unusual. In fact, the C++ standard library does exactly the same thing; note what 1.4/6 says:
The templates, classes, functions, and objects in the library have external linkage
The templates in the library very obviously have definitions, too. And why wouldn't they!
See what 3.1/2 and 3.5/2 (emphasis added) have to say:
A declaration is a definition unless it declares a function without specifying the function’s body (8.4), [or] it contains the extern specifier (7.1.1) or a linkage-specification (7.5) and neither an initializer nor a function-body
[...]
When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
Your declaration has a function body, so it is a definition, and that's explicitly, perfectly allowable. The function has external linkage, which means you could refer to it by a name from a scope in another translation unit, but you're not required to do that.
You're still perfectly allowed to call it by its name in the current translation unit, and that is what you are doing.
Note also that a function declared at namespace scope already has external linkage by default (unless it is declared static or sits in an unnamed namespace), so your use of the extern keyword is redundant anyway.
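To make the distinction concrete, here is a minimal sketch of the three cases discussed above. All function names (defined_later, with_extern, without_extern, call_all, check_extern_demo) are made up for illustration and do not come from the question.

```cpp
#include <cassert>

// Declaration only: `extern` and no body, so this is not a definition.
extern int defined_later();

// `extern` plus a function body: a definition with external linkage.
extern int with_extern() { return 1; }

// No `extern`: also a definition with external linkage, because functions
// at namespace scope have external linkage by default.
int without_extern() { return 1; }

// All three can be called by name inside this translation unit.
int call_all() { return with_extern() + without_extern() + defined_later(); }

// The definition that the declaration at the top promised.
int defined_later() { return 40; }

// Small self-check, meant to be called from main().
void check_extern_demo() { assert(call_all() == 42); }
```

Calling check_extern_demo() from main() should pass silently; the extern keyword on with_extern changes nothing compared with without_extern.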

Related

How to use P1787 to interpret why the static local variable in inline function refers to the same object

P1787 has an excellent description of when two declarations declare the same entity.
Two declarations of entities declare the same entity if, considering declarations of unnamed types to introduce their names for linkage purposes, if any ([dcl.typedef], [dcl.enum]), they correspond ([basic.scope.scope]), have the same target scope that is not a function or template parameter scope, and either
they appear in the same translation unit, or
they both declare names with module linkage and are attached to the same module, or
they both declare names with external linkage.
So, consider this example:
// a.hpp
inline int& function(){
static int value = 0; // #1
return value;
}
----------------------
//b.cpp
#include "a.hpp"
void g(){
auto&& rf = function();
}
----------------------
//c.cpp
#include "a.hpp"
int main(){
auto&& rf0 = function();
}
Apart from that rule, there is only a note, which says:
[ Note: An inline function or variable with external or module linkage can be defined in multiple translation units ([basic.def.odr]), but is one entity with one address. A type or variable defined in the body of such a function is therefore a single entity. -- end note ]
However, consider the value declared at #1. In the translation units of b.cpp and c.cpp, the two declarations of value correspond, and they have the same target scope, which is introduced by the compound-statement of function. However, a local variable has no linkage, so none of the bullets in that list is satisfied. So why do the two declarations of value (in the body of the function) in two different translation units declare the same entity? How can that be derived from the rule in P1787?
The behavior of the program (assuming that the usual ODR constraints are satisfied) is as if there were one definition of function. Whichever definition that is contains the only (operative) declaration of value, which of course declares only one entity.
Note that this singularity of definition is so strong that it is able to make “two different” lambda expressions produce the same closure type without any notion of linkage; it is certainly capable of suppressing a declaration for the purposes of object identity without the assistance of [basic.link].
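The observable consequence can be sketched as follows. A single file can only show one translation unit, so seen_from_b and seen_from_c are hypothetical stand-ins for code that would live in b.cpp and c.cpp; the cross-translation-unit guarantee itself comes from the standard, not from this sketch.

```cpp
#include <cassert>

// Same shape as function() in a.hpp. In a real program this definition
// would live in a header included into several translation units.
inline int& function() {
    static int value = 0;  // behaves as one object for the whole program
    return value;
}

// Hypothetical stand-ins for callers in b.cpp and c.cpp.
int& seen_from_b() { return function(); }
int& seen_from_c() { return function(); }

// Self-check: both callers reach the same object, so a write through one
// reference is visible through the other.
void check_single_entity() {
    seen_from_b() = 7;
    assert(&seen_from_b() == &seen_from_c());
    assert(seen_from_c() == 7);
}
```

With the real two-file layout from the question, the same address comparison would be expected to hold across b.cpp and c.cpp, because the program behaves as if there were a single definition of function.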

Why do I get warnings both that a function is used but not defined and defined but not used?

I encountered an unusual pair of compiler warnings that seem mutually contradictory. Here's the code I've written:
#include <functional>
#include <iostream>
namespace {
std::function<void()> callback = [] {
void theRealCallback(); // Forward-declare theRealCallback
theRealCallback(); // Invoke theRealCallback
};
void theRealCallback() {
std::cout << "theRealCallback was called." << std::endl;
}
}
int main() {
callback();
}
In other words, I define a lambda function in an unnamed namespace that forward-declares a function defined later in that unnamed namespace.
When I compile this code with g++ 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1), I get these two warnings:
CompilerWarningsNamespace.cpp:6:10: warning: ‘void {anonymous}::theRealCallback()’ used but never defined
void theRealCallback();
^~~~~~~~~~~~~~~
CompilerWarningsNamespace.cpp:10:8: warning: ‘void {anonymous}::theRealCallback()’ defined but not used [-Wunused-function]
void theRealCallback() {
^~~~~~~~~~~~~~~
This is strange, because the first warning says that I'm using a function without defining it, and the second warning says that I'm defining a function without using it.
Running this program indeed produces the output
theRealCallback was called.
and the program terminates normally.
Interestingly, these warnings go away if instead of using an unnamed namespace, I give the namespace a name, as shown here:
#include <functional>
#include <iostream>
namespace NamedNamespace {
std::function<void()> callback = [] {
void theRealCallback();
theRealCallback();
};
void theRealCallback() {
std::cout << "theRealCallback was called." << std::endl;
}
}
int main() {
NamedNamespace::callback();
}
The modified version of the code, like the original code, prints out a message indicating that theRealCallback was invoked.
Can someone explain why I'm getting these warnings? My guess is that the forward declaration inside the lambda is being interpreted as declaring something other than the function defined later in the unnamed namespace, but if that's the case I'm not sure I see why this links in the end and why I'm getting those warnings.
I think the open CWG issue 2058 is relevant here to decide whether your program is well-formed.
By the current wording of the standard, I think your program is ill-formed.
Based on the C++17 standard (final draft):
According to [basic.link]/6, the declaration in your lambda declares theRealCallback with external linkage, because it is a function declaration at block scope that doesn't match any other, already declared, entity.
At the same time according to [basic.link]/4.2 the second declaration of theRealCallback has internal linkage because it is a namespace scope declaration of a function in an unnamed namespace.
According to [basic.link]/6 a program is ill-formed if it declares an entity with both internal and external linkage in the same translation unit. This ill-formedness was added only recently though, as resolution of CWG issue 426.
As mentioned in the notes of CWG issue 426, however, [basic.link]/9 says that declarations refer to the same entity only if they have the same linkage, so the ill-formedness condition introduced by that resolution doesn't apply here.
So if we interpret this strictly, then the program really has two independent functions void theRealCallback(), one with external and one with internal linkage. The one with internal linkage has a definition, but the one with external linkage does not. If that is the case the program violates the one-definition rule, because the call theRealCallback(); in the lambda ODR-uses the function with external linkage which doesn't have a definition. This would make the program ill-formed, no diagnostic required.
While this is probably the correct interpretation by strict reading of the standard, I think that the resolution of CWG issue 426 was meant to apply here, because with the interpretation above it wouldn't ever apply. I don't know why the mentioned issue wasn't fixed in the resolution.
Should CWG issue 2058 be resolved to say that the declaration at block scope will match the linkage of the enclosing namespace, then the program would be well-formed and theRealCallback would have internal linkage.
If you add the -pedantic-errors flag, GCC emits an error rather than a warning, which shows that it considers the program ill-formed under a strict reading of the standard.
Of course the simplest fix to work around this issue is to forward declare void theRealCallback(); outside the lambda at namespace scope, in which case linkage is definitely internal and any block scope declaration would refer to that declaration, taking on its linkage.
This is not an issue if the namespace is named, because then the namespace scope declaration will have external linkage as well.
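The workaround described above (forward declaring at namespace scope instead of inside the lambda) might look like the sketch below. The callCount counter and the checkCallback helper are hypothetical additions that only make the call observable; they are not part of the original program.

```cpp
#include <cassert>
#include <functional>
#include <iostream>

namespace {
    // Forward declaration at namespace scope: the name has internal linkage
    // from the start, so no block-scope declaration is needed in the lambda.
    void theRealCallback();

    int callCount = 0;  // hypothetical counter, for illustration only

    std::function<void()> callback = [] {
        theRealCallback();  // refers to the declaration above
    };

    void theRealCallback() {
        ++callCount;
        std::cout << "theRealCallback was called." << std::endl;
    }
}

// Self-check, meant to be called from main().
void checkCallback() {
    const int before = callCount;
    callback();
    assert(callCount == before + 1);
}
```

Because the only declaration of theRealCallback is now at namespace scope, there is a single function with internal linkage, and the pair of contradictory warnings should no longer have anything to refer to.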

Why do `static` functions in different TUs not break the ODR?

The ODR allows us to define the same inline function several times (with some restrictions).
However, what about the simpler case of static functions?
// First TU
static int foo() { return 0; }
int bar1() { return foo(); }
// Second TU
static int foo() { return 1; }
int bar2() { return foo(); }
If we do a quick read of [basic.def.odr]p4, we could naively conclude that this would be UB:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program outside of a discarded statement (9.4.1); no diagnostic required.
Where in the C++ standard is it specified that each foo is a different function and therefore does not break the ODR, even though they have the same name?
Is it simply a matter of reading [basic.link]p2.2 (i.e. due to internal linkage the names do not refer to the same entity and therefore [basic.def.odr]p4 does not apply here)? Or are there more nuances/rules involved to make this determination (like something in [basic.scope])?
Note that, with unnamed namespaces, the outcome is clear, because the name would be already different/unique.
Correct — even though they have the same name locally, those are two different functions/entities, so there's no violation.
[basic.link]/4.3: When a name has internal linkage, the entity it denotes can be referred to by names from other scopes in the same translation unit.
[basic.link]/5: A name having namespace scope has internal linkage if it is the name of a variable, variable template, function, or function template that is explicitly declared static; or [..]
I can't immediately find any further wording (normative or otherwise) that applies, but I don't think we need any.

C++ function declarations

I'm a newbie to C++. I don't understand why it is okay (i.e. why the compiler allows it) for one function to be declared twice. For example, the following code is legal:
#include <iostream>
#include <string>
int hello();
int hello();
int main(){
std::cout << "hello, world" << std::endl;
}
int hello(){
return 1;
}
Why does the compiler not complain?
In C and C++, forward declarations are very weak. A declaration is a formal "promise" to the compiler that if a function with that name and parameter list is defined at all, it will match the declaration. The function is not even guaranteed to exist: unless you call or otherwise reference the declared function, nothing complains that there is a declaration with no definition. The standard requires compilers to treat identical forward declarations as a single declaration.
Unlike definitions, which must be unique according to the one definition rule
3.2 No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template
declarations are merely required to refer to the same definition, i.e. be equivalent to each other:
3.3.4 Given a set of declarations in the same declarative region, each of which specifies the same unqualified name, they shall all refer to the same entity, or all refer to functions or function templates, [...]
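A minimal sketch of these rules, assuming a hypothetical never_defined function and check_hello helper that are not part of the question:

```cpp
#include <cassert>

int hello();              // declaration
int hello();              // identical redeclaration: allowed, same function
int never_defined(int);   // declared but never used, so no definition is needed

int hello() { return 1; } // the single definition permitted in this translation unit

// Self-check, meant to be called from main().
void check_hello() { assert(hello() == 1); }
```

Adding a second body for hello in the same file would be rejected as a redefinition, while adding further identical declarations would not.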
The "One Definition Rule" answers this question. It is defined in section 3.2 of the ISO C++ Standard (ISO/IEC 14882:2003).
It states that:
In any translation unit, a template, type, function, or object can have no more than one definition. Some of these can have any number of declarations.
Read more about it on Wikipedia (http://en.wikipedia.org/wiki/One_Definition_Rule)

Overload resolution with extern "C" linkage

In a mixed C/C++ project, we need to make a call from C to a C++ function. The function to be called is overloaded as three separate functions, but we can ignore that from the C-side, we just pick the one most suitable and stick to that one.
There are two ways to do this: (1) write a small C++ wrapper with an extern "C" function that forwards the call to the chosen overloaded function, or (2) the hackish way: just declare the one function we want to call from C as extern "C".
The question is, are there any disadvantages (apart from nightmares and bad karma) to going for the second variant? In other words, given three overloaded functions, where one is declared as extern "C", should we expect trouble on the C++ side, or is this well defined according to the standard?
I believe the language in the standard is specifically written to allow exactly one function with "C" linkage, and an arbitrary number of other functions with "C++" linkage that overload the same name (§[dcl.link]/6):
At most one function with a particular name can have C language linkage. Two declarations for a function with C language linkage with the same function name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same function. Two declarations for an object with C language linkage with the same name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same object.
The standard shows the following example:
complex sqrt(complex); // C++ linkage by default
extern "C" {
double sqrt(double); // C linkage
}
Even if it was allowed by the standard, future maintainers of the code will probably be extremely confused and might even remove the extern "C", breaking the C code (possibly far enough later that the events aren't linkable).
Just write the wrapper.
EDIT:
From C++03 7.5/5:
If two declarations of the same function or object specify different linkage specifications (that is, the linkage specifications of these declarations specify different string literals), the program is ill-formed if the declarations appear in the same translation unit, and the one definition rule (3.2) applies if the declarations appear in different translation units...
I interpret this to not apply since C and C++ functions with the same name aren't actually the same function but this interpretation may be wrong.
Then from C++03 7.5/6:
At most one function with a particular name can have C language linkage...
This then implies that you could have other, non-C-linkage, functions with the same name. In this case, C++ overloads.
As long as you follow the other rules for extern-C functions (such as their special name requirements), specifying one of the overloads as extern-C is fine according to the standard. If you happen to use function pointers to these functions, be aware that language linkage is part of the function type, and needing a function pointer to this function may decide the issue for you.
Otherwise, I don't see any significant disadvantages. Even the potential disadvantage of copying parameters and return value can be mitigated by compiler- and implementation-specifics that allow you to inline the function – if that is determined to be a problem.
namespace your_project { // You do use one, right? :)
void f(int x);
void f(char x);
void f(other_overloads x);
}
extern "C"
void f(int x) {
your_project::f(x);
}
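A self-contained variant of the wrapper idea is sketched below. The describe overload set, the your_project_describe_int wrapper name and the check_wrapper helper are all hypothetical; the return values only identify which overload was chosen. Giving the wrapper its own C name sidesteps the question of mixing language linkages within one overload set.

```cpp
#include <cassert>

namespace your_project {
    // Hypothetical overload set; each overload returns a tag for itself.
    int describe(int)    { return 1; }
    int describe(char)   { return 2; }
    int describe(double) { return 3; }
}

// Wrapper with C language linkage and a distinct name. C code would declare
// it as: int your_project_describe_int(int);
extern "C" int your_project_describe_int(int x) {
    return your_project::describe(x);  // overload resolution picks describe(int)
}

// Self-check, meant to be called from main().
void check_wrapper() { assert(your_project_describe_int(65) == 1); }
```

The C side then sees one plain function, and the C++ overload set can change without affecting the exported symbol.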
(This answer applies to C++14; other answers so far are C++03).
It is permitted to use overloading. If there is an extern "C" function definition of some particular name then the following conditions apply (references to C++14 in brackets):
The declaration of the extern "C" function must be visible at the point of any declaration or definition of overloads of that function name (7.5/5)
There must be no other extern "C" definition of a function or variable with the same name, anywhere. (7.5/6)
An overloaded function with the same name must not be declared at global scope. (7.5/6)
Within the same namespace as the extern "C" function, there must not be another function declaration with the same name and parameter list. (7.5/5)
If any violation of the above rules occurs in the same translation unit the compiler must diagnose it; otherwise it is undefined behaviour with no diagnostic required.
So your header file might look something like:
namespace foo
{
extern "C" void bar();
void bar(int);
void bar(std::string);
}
The last bullet point says that you cannot overload solely on linkage; this is ill-formed:
namespace foo
{
extern "C" void bar();
void bar(); // error
}
However, you can do this in different namespaces:
extern "C" void bar();
namespace foo
{
void bar();
}
in which case the normal rules of unqualified lookup determine whether a call bar() in some code finds ::bar, finds foo::bar, or is ambiguous.
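The header layout shown earlier in this answer (one extern "C" function overloaded with ordinary C++ functions in the same namespace) can be turned into a small compilable sketch. The function bodies, the int return types and the check_overloads helper are assumptions made so that the chosen overload is observable; the original header uses void.

```cpp
#include <cassert>
#include <string>

namespace foo {
    // The single function of this name with C language linkage. It is
    // declared first so that it is visible to the overloads that follow.
    extern "C" int bar() { return 0; }

    // Ordinary C++ overloads with different parameter lists.
    int bar(int x) { return x + 1; }
    int bar(const std::string& s) { return static_cast<int>(s.size()); }
}

// Self-check, meant to be called from main().
void check_overloads() {
    assert(foo::bar() == 0);                    // the extern "C" function
    assert(foo::bar(41) == 42);                 // the int overload
    assert(foo::bar(std::string("abc")) == 3);  // the std::string overload
}
```

From C, only bar() with no arguments is reachable, under the unmangled name bar; the other two overloads remain ordinary C++ functions.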