I'm using clang-3.6 and compiling a sizeable project. After a massive re-factoring, a small number of seemingly random methods in a few classes cause warnings such as this:
warning: function 'namespace::X::do_something' has internal linkage but is not defined [-Wundefined-internal]
The same functions also show up as missing in the linker stage.
Here is the anonymized header for one such function definition in X.hpp:
class X {
// ...
void do_something(
foo::Foo& foo,
double a,
double b,
double c,
uint8_t d,
const bar::Bar& bar,
int dw, int dh);
// ...
};
The function is implemented in X.cpp as normal. When Y.cpp includes X.hpp and does x->do_something, the warning about internal linkage appears.
How can a method defined as above have internal linkage? What are all the circumstances under which a method gets internal linkage?
And seeing as this function, and others, used to compile just fine and have not even been touched during the refactoring, what kind of side effects (include order, type Foo and Bar) can cause a method to switch to internal linkage?
EDIT:
When I do a full source grep of do_something, it gives three results:
X.hpp: void do_something(
X.cpp: void X::do_something(
Y.cpp: x->do_something(
I also tried to change the name of do_something into a long guaranteed unique name to rule out any possibility of name conflicts.
As far as I can tell, the posted method definition is unquestionably the one being flagged as having internal linkage.
It was due to one of the types (i.e. foo::Foo) being a template type where one of the template arguments had erroneously gotten internal linkage.
template <char const* Name> struct Foo{...};
const char constexpr FooBarName[] = "Bar";
using FooBar = Foo<FooBarName>;
The presence of FooBar in any argument list or as a return type would give that method internal linkage. Not even clang-3.8 with -Weverything complained about having a template parameter with internal linkage.
The obvious fix was to properly use extern const char for the template string parameter.
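A minimal sketch of that fix, assuming Foo takes a plain char const* non-type parameter as above. The name() accessor and check_foobar() are hypothetical additions for illustration only; in a real project the definition of FooBarName would live in exactly one source file, with an extern declaration in the header.

```cpp
#include <cassert>
#include <cstring>

// Non-type template parameter: a pointer to a character array.
template <char const* Name>
struct Foo {
    // Hypothetical accessor, added only to make the example observable.
    static char const* name() { return Name; }
};

// extern gives the const array external linkage, so Foo<FooBarName>
// names the same type in every translation unit.
extern const char FooBarName[] = "Bar";

using FooBar = Foo<FooBarName>;

// Hypothetical helper: checks that the template argument is the expected string.
inline void check_foobar() {
    assert(std::strcmp(FooBar::name(), "Bar") == 0);
}
```

With this, a method that takes or returns FooBar no longer depends on an entity with internal linkage.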
Related
I am trying to understand why I get a warning -Wsubobject-linkage when trying to compile this code:
base.hh
#pragma once
#include <iostream>
template<char const *s>
class Base
{
public:
void print()
{
std::cout << s << std::endl;
}
};
child.hh
#pragma once
#include "base.hh"
constexpr char const hello[] = "Hello world!";
class Child : public Base<hello>
{
};
main.cc
#include "child.hh"
int main(void)
{
Child c;
c.print();
return 0;
}
When running g++ main.cc I get this warning:
In file included from main.cc:1:
child.hh:7:7: warning: ‘Child’ has a base ‘Base<(& hello)>’ whose type uses the anonymous namespace [-Wsubobject-linkage]
7 | class Child : public Base<hello>
| ^~~~~
This warning does not occur if the templated value is an int or if child.hh is copied in main.cc
I do not understand what justifies this warning here, and what I'm supposed to understand from it.
The warning is appropriate, just worded a bit unclearly.
hello is declared constexpr, as well as redundantly const. A const variable generally has internal linkage (with some exceptions like variable templates, inline variables, etc.).
Therefore hello in the template argument to Base<hello> is a pointer to an internal linkage object. In each translation unit it will refer to a different hello local to that translation unit.
Consequently Base<hello> will be different types in different translation units and so if you include child.hh in more than one translation unit you will violate ODR on Child's definition and therefore your program will have undefined behavior.
The warning is telling you that. The use of "anonymous namespace" is not good. It should say something like "whose type refers to an entity with internal linkage". But otherwise it is exactly stating the problem.
This happens only if you use a pointer or reference to the object as template argument. If you take just the value of a variable, then it doesn't matter whether that variable has internal or external linkage since the type Base<...> will be determined by the value of the variable, not the identity of the variable.
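The difference between value and identity can be checked at compile time. The following is only a sketch: ByValue, ByAddress, a and b are hypothetical names, not taken from the question.

```cpp
#include <type_traits>

// Parameterized on a value: only the number matters.
template <int N>
struct ByValue {};

// Parameterized on an address: the identity of the object matters.
template <int const* P>
struct ByAddress {};

// Two distinct const objects with internal linkage and equal values.
constexpr int a = 42;
constexpr int b = 42;

// Same value, therefore the same type.
static_assert(std::is_same<ByValue<a>, ByValue<b>>::value,
              "value arguments compare by value");

// Different objects, therefore different types, even though the values match.
static_assert(!std::is_same<ByAddress<&a>, ByAddress<&b>>::value,
              "pointer arguments compare by identity");
```

In the question, each translation unit gets its own hello, much as a and b are separate objects here, so Base<hello> would name a different type in each one.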
I don't know why you need the template argument to be a pointer here, but if you really need it and you actually want to include the .hh file in multiple .cc files (directly or indirectly), then you need to give hello external linkage. Then you will have the problem that an external linkage variable may only be defined in one translation unit, which you can circumvent by making the variable inline:
inline constexpr char hello[] = "Hello world!";
inline implies external linkage (even if the variable is const) and constexpr implies const.
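Put together, a C++17 version of the headers might look like the sketch below. The text() member and check_child() are hypothetical, added so the result can be checked without printing.

```cpp
#include <cassert>
#include <string>

template <char const* s>
class Base {
public:
    // Hypothetical accessor standing in for print().
    std::string text() const { return s; }
};

// inline: one entity with external linkage across all translation units.
// constexpr: usable as a template argument, and implies const.
inline constexpr char hello[] = "Hello world!";

class Child : public Base<hello> {};

// Hypothetical helper showing the usage.
inline void check_child() {
    Child c;
    assert(c.text() == "Hello world!");
}
```

Because hello is now the same object everywhere, Base<hello> is the same type in every translation unit that includes the header.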
If you only need the value of the string, not the identity of the variable, then since C++20 you can pass a std::array<char, N> as template argument:
#include<array>
#include<concepts>
/*...*/
template<std::array s>
requires std::same_as<typename decltype(s)::value_type, char>
class Base
{
public:
void print()
{
std::cout << s.data() << std::endl;
}
};
/*...*/
constexpr auto hello = std::to_array("Hello world!");
With this linkage becomes irrelevant. Writing Base<std::to_array("Hello world!")> is also possible.
Before C++20 there isn't really any nice way to pass a string by-value as template argument.
Even in C++20 you might want to consider writing your own std::array-like class specifically for holding strings. Such a class needs to satisfy the requirements for a structural type. (std::string does not satisfy them.) On the other hand you might want to widen the allowed template parameter types by using the std::ranges::range concept and std::ranges::range_value_t instead of std::array to allow any kind of range that could represent a string. If you don't want any restrictions just auto also works.
If child.hh is included in multiple TUs, it would violate the ODR, making the program ill-formed, no diagnostic required (NDR), as explained here.
A new bug for better wording of the warning has been submitted here:
Bug 106141 - Better wording for warning: ‘Child’ has a base ‘Base<(& hello)>’ whose type uses the anonymous namespace [-Wsubobject-linkage]
There is also an old bug submitted as:
Bug 86491 - bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace.
The following code (taken from Wikipedia) defines the variable template pi<>:
template<typename T=double>
constexpr T pi = T(3.14159265358979323846264338328);
template<>
constexpr const char* pi<const char*> = "π";
With the clang compiler (Apple clang version 12.0.0) (with C++14), this triggers a warning (with -Weverything):
no previous extern declaration for non-static variable 'pi<const char *>'
declare 'static' if the variable is not intended to be used outside of this translation unit
Moreover, since this was defined in a header, multiple instances of 'myNameSpace::pi<char const*>' were created, causing linker errors.
So, as suggested, I added the static keyword, which silenced the warning:
template<>
static constexpr const char* pi<const char*> = "π";
But now gcc (9.3.0) is unhappy, giving an error pointing at the static keyword:
error: explicit template specialization cannot have a storage class
What is the correct way to avoid either warning and error?
The warning from (this old version of) Clang is partly misleading, but does indicate the real problem that you eventually encountered with the linker. The warning describes the good rule of thumb that a global variable ought to
appear with extern in a header and then without in a source file, or
appear with static in a source file (avoiding collisions with any other symbol).
The latter choice doesn't apply to explicit specializations: since linkage applies to templates as a whole (the standard says that it pertains to the name of the template, which is evocative even if it doesn't work well for overloaded functions), you can't make just one specialization static and Clang is incorrect to accept it. (MSVC also incorrectly accepts this.) The only way to make a "file-local specialization" is to use a template argument that is a local type, template, or object. You can of course make the whole variable template have internal linkage with static or an unnamed namespace.
However, the former choice does apply: an explicit specialization is not a template, so it must be defined exactly once (in a source file). Like any other global variable, you use extern to reduce the definition to a declaration:
// pi.hh (excerpt)
template<typename T=double>
constexpr T pi = T(3.14159265358979323846264338328);
template<>
extern constexpr const char* pi<const char*>;
// pi.cc
#include"pi.hh"
template<>
constexpr const char* pi<const char*> = "π";
(Since the primary template is, well, a template, it is defined in the header file.)
As mentioned in the comments, C++17 allows inline variables; your explicit specialization again behaves like an ordinary global variable and can be defined with inline in a header if desired.
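A sketch of that C++17 variant, assuming everything sits in one header. check_pi() is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstring>

// Primary variable template, defined in the header as before.
template <typename T = double>
constexpr T pi = T(3.14159265358979323846264338328);

// C++17: inline is not a storage class, so it is allowed on an explicit
// specialization, and it permits one definition per translation unit.
template <>
inline constexpr const char* pi<const char*> = "π";

// Hypothetical helper showing that both forms are usable.
inline void check_pi() {
    assert(std::strcmp(pi<const char*>, "π") == 0);
    assert(pi<int> == 3);  // T(3.14...) truncates for int
}
```

This should avoid the duplicate-symbol link errors without needing a separate pi.cc.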
I have a header file where string are defined as static global.
namespace space {
#define NAME(P) static std::string const s_##P = #P
NAME(foo); NAME(bar); //... other values
#undef NAME
}
In another header, an enum is defined and a template specialization provides the mapping between the enum and a string in space.
enum class letter { alpha, beta };
template<letter> std::string const & mapping();
#define MAPPING(P1,P2) template<> std::string const & mapping<letter::P1>() { return space::s_##P2; }
MAPPING(alpha,foo)
MAPPING(beta,bar)
#undef MAPPING
The above code doesn't link when the header is included in more than one translation unit because the specialization definitions do not match - due to global redefinition per translation unit (I guess).
Wrapping the mapping functions in anonymous namespace or adding static keyword solves the linking issue but then the compiler complains that the functions are defined but not used [-Wunused-function].
template<letter> static std::string const & mapping();
But, defining the specializations as constexpr, there is no longer any link or warning issue.
template<letter> std::string const & mapping();
#define MAPPING(P1,P2) template<> constexpr std::string const & mapping<letter::P1>() { return space::s_##P2; }
I understand why the non-static version fails at link time and why the static version works and triggers warnings. But I don't understand why the constexpr specifier solves both issues.
Can you please give an explanation and, even better, a rationale from the standard?
Function template specializations are functions, and are therefore subject to the one-definition rule in the same manner as functions that are not template specializations.
The linker errors you saw when the functions were declared neither static nor constexpr were due to multiple definitions of the same function template specializations which each had external linkage.
When you added static, you made the linkage internal. This made it safe for each translation unit to contain its own copy of the definitions. However, in any TU in which those functions were not called, the compiler knew that (due to internal linkage) they could not be called from any other TU either, making them unused.
With constexpr, the functions become inline implicitly according to the standard, but their linkage is not affected. Since they are inline, you can have multiple definitions, but since they have external linkage, the compiler does not complain when one TU does not use them.
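If inline is the property that matters, the specializations can also be marked inline directly. The sketch below assumes that function-local statics are an acceptable stand-in for space::s_foo and space::s_bar, and check_mapping() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

enum class letter { alpha, beta };

template <letter>
std::string const& mapping();

// inline: multiple identical definitions across translation units are allowed.
// External linkage is kept, so there is no unused-function warning.
template <>
inline std::string const& mapping<letter::alpha>() {
    static std::string const s = "foo";  // assumption: replaces space::s_foo
    return s;
}

template <>
inline std::string const& mapping<letter::beta>() {
    static std::string const s = "bar";  // assumption: replaces space::s_bar
    return s;
}

// Hypothetical helper showing the usage.
inline void check_mapping() {
    assert(mapping<letter::alpha>() == "foo");
    assert(mapping<letter::beta>() == "bar");
}
```

Using function-local statics also means every translation unit refers to the same string object, rather than to its own static copy from namespace space.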
Functions declared with the constexpr specifier are inline functions.
From the C++ 20 Standard (9.2.5 The constexpr and consteval specifiers)
1 The constexpr specifier shall be applied only to the definition of a
variable or variable template or the declaration of a function or
function template. The consteval specifier shall be applied only to
the declaration of a function or function template. A function or
static data member declared with the constexpr or consteval specifier
is implicitly an inline function or variable (
So, as far as I know, this is legal in C:
foo.c
struct foo {
int a;
};
bar.c
struct foo {
char a;
};
But the same thing with functions is illegal:
foo.c
int foo() {
return 1;
}
bar.c
int foo() {
return 0;
}
and will result in linking error (multiple definition of function foo).
Why is that? What's the difference between struct names and function names that makes C able to handle one but not the other?
Also does this behavior extend to C++?
Why is that?
struct foo {
int a;
};
defines a template for creating objects. It does not create any objects or functions. Unless struct foo is used somewhere in your code, as far as the compiler/linker is concerned, those lines of code may as well not exist.
Please note that there is a difference in how C and C++ deal with incompatible struct definitions.
The differing definitions of struct foo in your posted code are OK in a C program as long as you don't mix their usage.
However, it is not legal in C++. In C++, they have external linkage and must be defined identically. See 3.2 One definition rule/5 for further details.
The distinguishing concept in this case is called linkage.
In C struct, union or enum tags have no linkage. They are effectively local to their scope.
6.2.2 Linkages of identifiers
6 The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.
They cannot be re-declared in the same scope (except for so called forward declarations). But they can be freely re-declared in different scopes, including different translation units. In different scopes they may declare completely independent types. This is what you have in your example: in two different translation units (i.e. in two different file scopes) you declared two different and unrelated struct foo types. This is perfectly legal.
Meanwhile, functions have linkage in C. In your example these two definitions define the same function foo with external linkage. And you are not allowed to provide more than one definition of any external linkage function in your entire program:
6.9 External definitions
5 [...] If an identifier declared with external
linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire
program there shall be exactly one external definition for the identifier; otherwise, there
shall be no more than one.
In C++ the concept of linkage is extended: it assigns specific linkage to a much wider variety of entities, including types. In C++ class types have linkage. Classes declared in namespace scope have external linkage. And One Definition Rule of C++ explicitly states that if a class with external linkage has several definitions (across different translation units) it shall be defined equivalently in all of these translation units (http://eel.is/c++draft/basic.def.odr#12). So, in C++ your struct definitions would be illegal.
Your function definitions remain illegal in C++ as well because of C++ ODR rule (but essentially for the same reasons as in C).
Your function definitions both declare an entity called foo with external linkage, and the C standard says there must not be more than one definition of an entity with external linkage. The struct types you defined are not entities with external linkage, so you can have more than one definition of struct foo.
If you declared objects with external linkage using the same name then that would be an error:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
struct foo obj;
Now you have two objects called obj that both have external linkage, which is not allowed.
It would still be wrong even if one of the objects is only declared, not defined:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
extern struct foo obj;
This is undefined, because the two declarations of obj refer to the same object, but they don't have compatible types (because struct foo is defined differently in each file).
C++ has similar, but more complex rules, to account for inline functions and inline variables, templates, and other C++ features. In C++ the relevant requirements are known as the One-Definition Rule (or ODR). One notable difference is that C++ doesn't even allow the two different struct definitions, even if they are never used to declare objects with external linkage or otherwise "shared" between translation units.
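If a C++ program really does need a differently shaped struct foo in each source file, one option is an unnamed namespace, which gives the type internal linkage. A sketch for one of the two files, where check_local_foo() is a hypothetical helper:

```cpp
#include <cassert>

// Everything in an unnamed namespace has internal linkage, so this foo is
// a distinct type local to this translation unit. Another source file may
// define its own, different foo in its own unnamed namespace.
namespace {
struct foo {
    int a;
};
}  // namespace

// Hypothetical helper using the file-local type.
static void check_local_foo() {
    foo f{7};
    assert(f.a == 7);
}
```

This only works as long as the type stays inside its source file; it cannot appear in the interface of anything with external linkage that other files use.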
The two declarations for struct foo are incompatible with each other because the types of the members are not the same. Using each one within its own translation unit is fine as long as you don't do anything to confuse the two.
If for example you did this:
foo.c:
struct foo {
char a;
};
void bar_func(struct foo *f);
void foo_func()
{
struct foo f;
bar_func(&f);
}
bar.c:
struct foo {
int a;
};
void bar_func(struct foo *f)
{
f->a = 1000;
}
You would be invoking undefined behavior because the struct foo that bar_func expects is not compatible with the struct foo that foo_func is supplying.
The compatibility of structs is detailed in section 6.2.7 of the C standard:
1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are
described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers,
and in 6.7.6 for declarators. Moreover, two structure, union, or
enumerated types declared in separate translation units are compatible
if their tags and members satisfy the following requirements: If one
is declared with a tag, the other shall be declared with the same tag.
If both are completed anywhere within their respective translation
units, then the following additional requirements apply: there shall
be a one-to-one correspondence between their members such that each
pair of corresponding members are declared with compatible types; if
one member of the pair is declared with an alignment specifier, the
other is declared with an equivalent alignment specifier; and if one
member of the pair is declared with a name, the other is declared with
the same name. For two structures, corresponding members shall be
declared in the same order. For two structures or unions,
corresponding bit-fields shall have the same widths. For two
enumerations, corresponding members shall have the same values.
2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
To summarize, the two instances of struct foo must have members with the same name and type and in the same order to be compatible.
Such rules are needed so that a struct can be defined once in a header file and that header is subsequently included in multiple source files. This results in the struct being defined in multiple source files, but with each instance being compatible.
The difference isn't so much in the names as in existence; a struct definition isn't stored anywhere and its name only exists during compilation.
(It is the programmer's responsibility to ensure that there is no conflict in the uses of identically named structs. Otherwise, our dear old friend Undefined Behaviour comes calling.)
On the other hand, a function needs to be stored somewhere, and if it has external linkage, the linker needs its name.
If you make your functions static, so they're "invisible" outside their respective compilation unit, the linking error will disappear.
To hide the function definition from the linker use the keyword static.
foo.c
static int foo() {
return 1;
}
bar.c
static int foo() {
return 0;
}
I noticed a very curious behavior that, if standard, I would be very happy to exploit (what I'd like to do with it is fairly complex to explain and irrelevant to the question).
The behavior is:
static void name();
void name() {
/* This function is now static, even if in the declaration
* there is no static keyword. Tested on GCC and VS. */
}
What's curious is that the inverse produces a compile time error:
void name();
static void name() {
/* Illegal */
}
So, is this standard and can I expect other compilers to behave the same way? Thanks!
C++ standard:
7.1.1/6: "A name declared in a namespace scope without a
storage-class-specifier has external
linkage unless it has internal linkage
because of a previous declaration" [or unless it's const].
In your first case, name is declared in a namespace scope (specifically, the global namespace). The first declaration therefore alters the linkage of the second declaration.
The inverse is banned because:
7.1.1/7: "The linkages implied by successive declarations for a given
entity shall agree".
So, in your second example, the first declaration has external linkage (by 7.1.1/6), and the second has internal linkage (explicitly), and these do not agree.
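A compilable sketch of the permitted order. The return value and check_name() are hypothetical additions so that the function can be exercised.

```cpp
#include <cassert>

// First declaration: static gives the name internal linkage.
static int name();

// No storage-class specifier here, but the earlier declaration already
// fixed the linkage, so this definition is still internal.
int name() {
    return 42;
}

// Hypothetical helper calling the function.
static void check_name() {
    assert(name() == 42);
}
```

Swapping the two declarations, so that the plain one comes first and the static one second, makes the linkages disagree, which is the case the second quoted rule forbids.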
You also ask about C, and I imagine it's the same sort of thing. But I have the C++ book right here, whereas you're as capable of looking in a draft C standard online as I am ;-)
Qualifiers that you put on the function prototype (or that are implied) are automatically used when the function is defined.
So in your second case the lack of static on the prototype meant that the function was declared as NOT static, and then when it was later defined as static, that was an error.
If you were to leave off the return type in the prototype, then the default would be int and then you would get an error again with the void return type. The same thing happens with __crtapi and __stdcall and __declspec() (in the Microsoft C compiler).