I'm working on a static analyzer for C++11. There is an interaction between static const members of a class and linkage for which I am not sure whether it is defined. My static analyzer should warn for it only if this construct is not defined.
The example is this one:
in file f1.cpp:
struct Foo {
static const int x = 2;
};
int main(void) {
return *&Foo::x;
}
and in file f2.cpp:
struct Foo {
static int x;
};
int Foo::x;
The two files compiled and linked with clang++ -std=c++11 -Weverything f1.cpp f2.cpp cause no warning and produce a binary that returns 0. The same files when compiled with g++ -std=c++11 -Wall -Wextra -pedantic f1.cpp f2.cpp cause no warning and return 2.
My intuition is that this program is ill-defined but no warning is required, as:
both names Foo::x have external linkage following N3376[basic.link]p5:
In addition, a member function, static data member,[...]
has the typedef name for linkage purposes (7.1.3), has external linkage if the name of the class has external
linkage.
but they break the N3376[basic.link]p10 requirement:
After all adjustments of types (during which typedefs (7.1.3) are replaced by their definitions), the types
specified by all declarations referring to a given variable or function shall be identical [...] A violation of this rule on type identity does not require a diagnostic.
To be 100% sure about this, a definition for these "all adjustments of types" is needed, but seems nowhere to be found in the C++11 standard. Is there any, and is the reasoning above correct?
It's an ODR violation. The Foo type has different declarations in each file.
One definition says x is declared with external linkage (can be anything, determined when linking) and the other that it's a compile-time constant with value 2.
Related
Consider a program with the following two translation units:
// TU 1
#include <typeinfo>
struct S {
enum { } x;
};
const std::type_info& ti1 = typeid(decltype(S::x));
// TU 2
#include <iostream>
#include <typeinfo>
struct S {
enum { } x;
};
extern std::type_info& ti1;
const std::type_info& ti2 = typeid(decltype(S::x));
int main() {
std::cout << (ti1 == ti2) << '\n';
}
I compiled it with GCC and Clang and in both cases the result was 1, and I'm not sure why. (GCC also warns that "ISO C++ forbids empty unnamed enum", which I don't think is true.)
[dcl.enum]/11 states that if an unnamed enumeration does not have a typedef name for linkage purposes but has at least one enumerator, then it has its first enumerator as its name for linkage purposes. These enums have no enumerators, so they have no name for linkage purposes. The same paragraph also has the following note which seems to be a natural consequence of not giving the enums names for linkage purposes:
[Note 3: Each unnamed enumeration with no enumerators is a distinct type. — end note]
Perhaps both compilers have a bug. Or, more likely, I just misunderstood the note. The note is non-normative anyway, so let's look at some normative wording.
[basic.link]/8
Two declarations of entities declare the same entity if, considering declarations of unnamed types to introduce their names for linkage purposes, if any ([dcl.typedef], [dcl.enum]), they correspond ([basic.scope.scope]), have the same target scope that is not a function or template parameter scope, and
[irrelevant]
[irrelevant]
they both declare names with external linkage.
[basic.scope.scope]/4
Two declarations correspond if they (re)introduce the same name, both declare constructors, or both declare destructors, unless [irrelevant]
It seems that, when an unnamed enum is not given a typedef name for linkage purposes, and has no enumerators, it can't be the same type as itself in a different translation unit.
So is it really just a compiler bug? One last thing I was thinking is that if the two enum types really are distinct, then the multiple definitions of S violate the one-definition rule and make the program ill-formed NDR. But I couldn't find anything in the ODR that actually says that.
This program is well-formed and prints 1, as seen. Because S is defined identically in both translation units with external linkage, it is as if there is one definition of S ([basic.def.odr]/14) and thus only one enumeration type is defined. (In practice it is mangled based on the name S or S::x.)
This is just the same phenomenon as static local variables and lambdas being shared among the definitions of an inline function:
// foo.hh
inline int* f() {static int x; return &x;}
inline auto g(int *p) {return [p] {return p;};}
inline std::vector<decltype(g(nullptr))> v;
// bar.cc
#include"foo.hh"
void init() {v.push_back(g(f()));}
// main.cc
#include"foo.hh"
void init();
int main() {
init();
return v.front()()!=f(); // 0
}
So, as far as I know, this is legal in C:
foo.c
struct foo {
int a;
};
bar.c
struct foo {
char a;
};
But the same thing with functions is illegal:
foo.c
int foo() {
return 1;
}
bar.c
int foo() {
return 0;
}
and will result in linking error (multiple definition of function foo).
Why is that? What's the difference between struct names and function names that makes C unable to handle one but not the other?
Also does this behavior extend to C++?
Why is that?
struct foo {
int a;
};
defines a template for creating objects. It does not create any objects or functions. Unless struct foo is used somewhere in your code, as far as the compiler/linker is concerned, those lines of code may as well not exist.
Please note that there is a difference in how C and C++ deal with incompatible struct definitions.
The differing definitions of struct foo in your posted code, is ok in a C program as long as you don't mix their usage.
However, it is not legal in C++. In C++, they have external linkage and must be defined identically. See 3.2 One definition rule/5 for further details.
The distinguishing concept in this case is called linkage.
In C struct, union or enum tags have no linkage. They are effectively local to their scope.
6.2.2 Linkages of identifiers
6 The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.
They cannot be re-declared in the same scope (except for so called forward declarations). But they can be freely re-declared in different scopes, including different translation units. In different scopes they may declare completely independent types. This is what you have in your example: in two different translation units (i.e. in two different file scopes) you declared two different and unrelated struct foo types. This is perfectly legal.
Meanwhile, functions have linkage in C. In your example these two definitions define the same function foo with external linkage. And you are not allowed to provide more than one definition of any external linkage function in your entire program
6.9 External definitions
5 [...] If an identifier declared with external
linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire
program there shall be exactly one external definition for the identifier; otherwise, there
shall be no more than one.
In C++ the concept of linkage is extended: it assigns specific linkage to a much wider variety of entities, including types. In C++ class types have linkage. Classes declared in namespace scope have external linkage. And One Definition Rule of C++ explicitly states that if a class with external linkage has several definitions (across different translation units) it shall be defined equivalently in all of these translation units (http://eel.is/c++draft/basic.def.odr#12). So, in C++ your struct definitions would be illegal.
Your function definitions remain illegal in C++ as well because of C++ ODR rule (but essentially for the same reasons as in C).
Your function definitions both declare an entity called foo with external linkage, and the C standard says there must not be more than one definition of an entity with external linkage. The struct types you defined are not entities with external linkage, so you can have more than one definition of struct foo.
If you declared objects with external linkage using the same name then that would be an error:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
struct foo obj;
Now you have two objects called obj that both have external linkage, which is not allowed.
It would still be wrong even if one of the objects is only declared, not defined:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
extern struct foo obj;
This is undefined, because the two declarations of obj refer to the same object, but they don't have compatible types (because struct foo is defined differently in each file).
C++ has similar, but more complex rules, to account for inline functions and inline variables, templates, and other C++ features. In C++ the relevant requirements are known as the One-Definition Rule (or ODR). One notable difference is that C++ doesn't even allow the two different struct definitions, even if they are never used to declare objects with external linkage or otherwise "shared" between translation units.
The two declarations for struct foo are incompatible with each other because the types of the members are not the same. Using them both within each translation unit is fine as long as you don't do anything to confuse the two.
If for example you did this:
foo.c:
struct foo {
char a;
};
void bar_func(struct foo *f);
void foo_func()
{
struct foo f;
bar_func(&f);
}
bar.c:
struct foo {
int a;
};
void bar_func(struct foo *f)
{
f.a = 1000;
}
You would be invoking undefined behavior because the struct foo that bar_func expects is not compatible with the struct foo that foo_func is supplying.
The compatibility of structs is detailed in section 6.2.7 of the C standard:
1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are
described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers,
and in 6.7.6 for declarators. Moreover, two structure, union, or
enumerated types declared in separate translation units are compatible
if their tags and members satisfy the following requirements: If one
is declared with a tag, the other shall be declared with the same tag.
If both are completed anywhere within their respective translation
units, then the following additional requirements apply: there shall
be a one-to-one correspondence between their members such that each
pair of corresponding members are declared with compatible types; if
one member of the pair is declared with an alignment specifier, the
other is declared with an equivalent alignment specifier; and if one
member of the pair is declared with a name, the other is declared with
the same name. For two structures, corresponding members shall be
declared in the same order. For two structures or unions,
corresponding bit-fields shall have the same widths. For two
enumerations, corresponding members shall have the same values.
2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
To summarize, the two instances of struct foo must have members with the same name and type and in the same order to be compatible.
Such rules are needed so that a struct can be defined once in a header file and that header is subsequently included in multiple source files. This results in the struct being defined in multiple source files, but with each instance being compatible.
The difference isn't so much in the names as in existence; a struct definition isn't stored anywhere and its name only exists during compilation.
(It is the programmer's responsibility to ensure that there is no conflict in the uses of identically named structs. Otherwise, our dear old friend Undefined Behaviour comes calling.)
On the other hand, a function needs to be stored somewhere, and if it has external linkage, the linker needs its name.
If you make your functions static, so they're "invisible" outside their respective compilation unit, the linking error will disappear.
To hide the function definition from the linker use the keyword static.
foo.c
static int foo() {
return 1;
}
bar.c
static int foo() {
return 0;
}
I'm using clang-3.6 and compiling a sizeable project. After a massive re-factoring, a small number of seemingly random methods in a few classes cause warnings such as this:
warning: function 'namespace::X::do_something' has internal linkage but is not defined [-Wundefined-internal]
The same functions also show up as missing in the linker stage.
Here is the anonymized header for one such function definition in X.hpp:
class X {
// ...
void do_something(
foo::Foo& foo,
double a
double b,
double c,
uint8_t d,
const bar::Bar& bar,
int dw, int dh);
// ...
}
The function is implemented in X.cpp as normal. When Y.cpp includes X.hpp and does x->do_something, the warning about internal linkage appears.
How can a method defined as above have internal linkage? What are all the circumstances under which a method gets internal linkage?
And seeing as this function, and others, used to compile just fine and have not even been touched during the refactoring, what kind of side effects (include order, type Foo and Bar) can cause a method to switch to internal linkage?
EDIT:
When I do a full source grep of do_something, it gives three results:
X.hpp: void do_something(
X.cpp: void X::do_something(
Y.cpp: x->do_something(
I also tried to change the name of do_something into a long guaranteed unique name to rule out any possibility of name conflicts.
As far as I can tell, the posted method definition is unquestionably the one being flagged as having internal linkage.
It was due to one of the types (i.e. foo:Foo) being a template type where one of the template parameters had erroneously gotten internal linkage.
template <typename char const* Name> struct Foo{...};
const char constexpr FooBarName[] = "Bar";
using FooBar = Foo<FooBarName>;
The presence of FooBar in any argument list or as a return type would give that method internal linkage. Not even clang-3.8 with -Weverything complained about having a template parameter with internal linkage.
The obvious fix was to properly use extern const char for the template string parameter.
Two libs that defines the same class A, in different ways (this is legacy-crap-code)
Prototypes for A:
in lib A:
#include <string>
struct A
{
static void func( const std::string& value);
};
in lib B:
#include <string>
struct A
{
void func( const std::string& value);
};
main.cpp uses A:s header from lib A (component A)
#include "liba.h"
int main()
{
A::func( "some stuff");
return 0;
}
main is linked with both lib A and lib B.
If lib B is "linked before" lib A (in the link-directive) we get a core, hence, lib B:s definition is picket.
This is not the behavior I expected. I thought that there would be some difference between the symbols, so the loader/runtime linker could pick the right symbol. That is, the hidden this-pointer for non-static member functions is somehow included in the symbol.
Is this really conformant behavior?
Same behavior on both:
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
RHEL devtool with g++ 4.8.1
It is not possible to overload a non-static member function with a static one or viceversa. From the standard:
ISO 14882:2003 C++ Standard 13.1/2 – Overloadable declarations
Certain function declarations cannot
be overloaded:
Function declarations that differ only in the return type cannot be overloaded.
Member function declarations with the same name and the same parameter types cannot be overloaded if any of them is a static member function declaration (9.4).
More details and references might be found in question 5365714.
So you have two definitions of the same class A in the same program, which should be identical, and they are not.
To signal an error when there are inconsistent definitions in separate translation units is not mandatory for the linker. The result is implementation defined The program is ill-formed (updated as per #jonathan's comment). In an illustrating example from Stroustrup in the C++ faq it is described as undefined behavior.
In the case of GCC, as you said, the definition used depends on the order of the libraries in the link command (assuming lib A and lib B are compiled on itself, and then linked with the main program). The linker uses the first definition found in the libraries passed from left to right.
A discussion on the link order options for GCC is in 409470.
You cannot overload functions in C++ based on the return type, so I would guess that you cannot do it on basis of static/v-non-static member functions.
You will need to fix one of the header files -- preferably by not declaring the same type twice.
To illustrate look at this;
struct A {
int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
and compare it to this....
struct A {
static int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
And observe two things;
Nowhere when I declared the actual implementation of A::X did I specify that it was a static member function -- the compiler didn't care, but took what ever information from the definition of struct.
The name mangling of the symbol, whether static or not is the same _ZN1A1XEi which encodes the name of the class the name of the method and the type of the arguments.
So in conclusion, using incorrect headers against compiled code would lead to undefined behavior....
Since a class cannot have both a static member function and non-static member function with the same name, there's no need to include that information in the mangled name.
You will need to solve this problem by including namespaces for your classes, renaming them, or being careful not to use the libraries together.
I have some very simple (C++11) code which the latest clang (version 3.4 trunk 187493) fails to compile, but GCC compiles fine.
The code (below) instantiates the function-template foo with the function-local type Bar and then tries to use its address as a non-type template parameter for the class-template Func:
template<void(*FUNC_PTR)(void)>
struct Func {};
template<typename T> extern inline
void foo() {
using Foo = Func<foo<T>>;
}
int main() {
struct Bar {}; // function-local type
foo<Bar>();
return 0;
}
clang emits the following error:
error: non-type template argument refers to function 'foo' that
does not have linkage
However, if I move type Bar to global scope (by taking it out of the function), then clang compiles it fine, proving the issue is with the type being function-local.
So is clang correct to emit this error, or does the standard not support this (in which case GCC is being too lenient by allowing it)?
EDIT #1 : To be clear, this is not a duplicate of this question since the 'cannot use local types as template parameters' restriction was removed in C++11. However, it's still unclear if there are linkage implications involved with using a local type, and whether clang is correct or not in emitting this error.
EDIT #2 : It has been determined that clang was correct to emit the error for the above code (see answer from #jxh), but that it incorrectly also emits an error for the following code (with using declaration moved from foo<Bar>() scope to main() scope):
template<void(*FUNC_PTR)(void)>
struct Func {};
template<typename T> extern inline
void foo() {}
int main() {
struct Bar {};
using F = Func<foo<Bar>>;
return 0;
}
By definition of no linkage in C++.11 §3.5 Program and linkage ¶2, I originally believed foo<Bar> has no linkage since it cannot be referred to by name by any other scope except that which defined the type Bar (ie, main()). However, this is not correct. This is because the definition of a name with external linkage is described as:
When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
And for a template function, this will always be the case. This is because there is one other scope from which the name can be referred. Namely, the template function can refer to itself. Therefore, foo<Bar> has external linkage. zneak's answer, EDIT 2, has an e-mail thread with the clang developers confirming that foo<Bar> should have external linkage.
Thus, from C++.11 §14.3.2 Template non-type arguments ¶1:
A template-argument for a non-type, non-template template-parameter shall be one of: ...
a constant expression (5.19) that designates the address of an object with static storage duration and external or internal linkage or a function with external or internal linkage, including function templates and function template-ids but excluding non-static class members, expressed (ignoring parentheses) as &id-expression, except that the & may be omitted if the name refers to a function or array and shall be omitted if the corresponding template-parameter is a reference; ...
The most relevant bullet is the third bullet. Since foo<bar> has external linkage, passing it as a non-type template-parameter should be fine.
I'm late to the party, but standard says that type-local functions don't have linkage (§3.5:8):
Names not covered by these rules have no linkage. Moreover, except as noted, a name declared at block scope (3.3.3) has no linkage.
Same section goes on to say:
A type without linkage shall not be used as the type of a variable or function with external linkage unless
the entity has C language linkage (7.5), or
the entity is declared within an unnamed namespace (7.3.1), or
the entity is not odr-used (3.2) or is defined in the same translation unit.
And, as a matter of fact, Clang will allow this:
namespace
{
template<void (*FUNC_PTR)(void)>
struct Func {};
template<typename T>
void foo() {}
}
int main() {
struct Bar {}; // function-local type
Func<foo<Bar>> x;
}
And will reject it without the anonymous namespace.
EDIT 1: As jxh notes, the names are also defined in the same translation unit, so I'm not sure what to think of this one.
EDIT 2: The guys at clang confirm it's a bug.