g++ signature/symbol : no difference between static and non-static member function? - c++

Two libs define the same class A in different ways (this is legacy crap code).
Prototypes for A:
in lib A:
#include <string>
struct A
{
static void func( const std::string& value);
};
in lib B:
#include <string>
struct A
{
void func( const std::string& value);
};
main.cpp uses A's header from lib A (component A)
#include "liba.h"
int main()
{
A::func( "some stuff");
return 0;
}
main is linked with both lib A and lib B.
If lib B is "linked before" lib A (in the link directive) we get a core dump; hence, lib B's definition is picked.
This is not the behavior I expected. I thought that there would be some difference between the symbols, so the loader/runtime linker could pick the right symbol. That is, I assumed the hidden this pointer for non-static member functions would somehow be encoded in the symbol.
Is this really conformant behavior?
Same behavior on both:
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
RHEL devtool with g++ 4.8.1

It is not possible to overload a non-static member function with a static one or vice versa. From the standard:
ISO 14882:2003 C++ Standard 13.1/2 – Overloadable declarations
Certain function declarations cannot
be overloaded:
Function declarations that differ only in the return type cannot be overloaded.
Member function declarations with the same name and the same parameter types cannot be overloaded if any of them is a static member function declaration (9.4).
More details and references might be found in question 5365714.
So you have two definitions of the same class A in the same program, which should be identical, and they are not.
The linker is not required to signal an error when there are inconsistent definitions in separate translation units. The program is ill-formed (updated as per #jonathan's comment). An illustrating example from Stroustrup in the C++ FAQ describes it as undefined behavior.
In the case of GCC, as you said, the definition used depends on the order of the libraries in the link command (assuming lib A and lib B are each compiled on their own, and then linked with the main program). The linker uses the first definition found in the libraries passed from left to right.
A discussion on the link order options for GCC is in 409470.

You cannot overload functions in C++ based on the return type, so I would guess that you cannot do it on the basis of static versus non-static member functions either.
You will need to fix one of the header files, preferably by not declaring the same type twice.
To illustrate, look at this:
struct A {
int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
and compare it to this....
struct A {
static int X(int b);
};
int A::X(int b)
{
return b+8;
}
$ g++ x.cc -c
$ nm x.o
0000000000000000 T _ZN1A1XEi
And observe two things:
Nowhere when I defined the actual implementation of A::X did I specify that it was a static member function. The compiler didn't care; it took that information from the definition of the struct.
The mangled name of the symbol is the same, _ZN1A1XEi, whether the function is static or not. It encodes the name of the class, the name of the method, and the types of the arguments.
So in conclusion, using incorrect headers against compiled code leads to undefined behavior.
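To see what the mangled name actually encodes, one can demangle it programmatically. This is a minimal sketch that assumes a GCC or Clang toolchain providing the Itanium ABI helper abi::__cxa_demangle from <cxxabi.h>; the wrapper name demangle is made up for the example.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer is allocated with malloc by the ABI helper
    return result;
}
```

Calling demangle("_ZN1A1XEi") gives A::X(int): class, function name and parameter types, with no trace of the return type or of whether the function is static.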

Since a class cannot have both a static member function and non-static member function with the same name, there's no need to include that information in the mangled name.
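The difference the question was looking for does exist, but only in the type system, not in the symbol. A small sketch, with the two variants given different class names (WithStatic and WithMember, both made up here) so they can live in one translation unit:

```cpp
#include <type_traits>

struct WithStatic { static int X(int b) { return b + 8; } };
struct WithMember { int X(int b) { return b + 8; } };

// A static member function has an ordinary function type: no hidden `this`.
static_assert(std::is_same<decltype(&WithStatic::X), int (*)(int)>::value,
              "static member function yields a plain function pointer");
// A non-static member function has a pointer-to-member type tied to its class.
static_assert(std::is_same<decltype(&WithMember::X), int (WithMember::*)(int)>::value,
              "non-static member function yields a pointer to member");

int call_static(int b) {
    int (*fn)(int) = &WithStatic::X;
    return fn(b);                      // no object needed
}

int call_member(int b) {
    int (WithMember::*fn)(int) = &WithMember::X;
    WithMember object;
    return (object.*fn)(b);            // an object supplies `this`
}
```

The compiler knows the calling convention from the class definition it was shown, which is exactly why showing it the wrong header goes wrong silently.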
You will need to solve this problem by including namespaces for your classes, renaming them, or being careful not to use the libraries together.
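A minimal sketch of the namespace approach, assuming you are able to change both libraries. The namespace names liba and libb are invented for the example, and func returns a string here (instead of void as in the question) only so the result can be checked.

```cpp
#include <string>

namespace liba {
struct A {
    // Static version: called without an object.
    static std::string func(const std::string& value) { return "liba:" + value; }
};
}  // namespace liba

namespace libb {
struct A {
    // Non-static version: called on an object.
    std::string func(const std::string& value) { return "libb:" + value; }
};
}  // namespace libb
```

Now liba::A::func("x") and libb::A().func("x") are distinct entities with distinct mangled names, so the link order no longer matters.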

Related

Template static member definition depends on order passed to linker

The following code, which has 2 definitions for a template static field member, each definition defines template1<int>::x with a different value.
One would expect the linker to reject such redefinitions as they have different values.
But compilation & linkage passes for both g++ and MSVC, and which definition is used is dependent on the order in which the sources are passed to the linker.
Is this behavior compliant to the C++ standard, undefined behavior, or a linker bug?
my_template.h
template <class T>
class template1
{
public:
static int x;
};
Src2.cpp
#include <stdio.h>
#include "my_template.h"
template <class T>
int template1<T>::x = 2;
void my_func() // definition
{
printf("my_func: template1<int>::x = %d\n", template1<int>::x); // definition of X to 2.
printf("my_func: template1<char>::x = %d\n", template1<char>::x); // definition of X to 2.
}
Main.cpp
#include <cstdio>
#include "my_template.h"
template <class T>
int template1<T>::x = 1;
void my_func();
int main()
{
printf("main: template1<int>::x = %d\n", template1<int>::x); // definition of X to 1.
my_func();
return 0;
}
Compile with g++ (MinGW.org GCC Build-20200227-1) 9.2.0+
Compile1
g++ -o prog Src2.cpp Main.cpp
Output1
main: template1<int>::x = 2
my_func: template1<int>::x = 2
my_func: template1<char>::x = 2
Compile2
g++ -o prog Main.cpp Src2.cpp
Output2
main: template1<int>::x = 1
my_func: template1<int>::x = 1
my_func: template1<char>::x = 2
Observed also with
Microsoft (R) C/C++ Optimizing Compiler Version 19.25.28612 for x86
When I disassembled the code with -S flag, each compilation unit defined the same symbol name.
Co-work with Nightra.
This violates ODR (which requires that an entity must have exactly one definition, if it's used). So the program has UB.
The compiler couldn't diagnose this, because each translation unit is fine. In theory, the linker could diagnose this, but in practice it won't do that.
Is this behavior compliant to the C++ standard, undefined behavior, or a linker bug?
This is undefined behaviour (UB).
From [basic.def.odr]/4 of N4659 [emphasis mine]:
Every program shall contain exactly one definition of every
non-inline function or variable that is odr-used in that program
outside of a discarded statement; no diagnostic required. The
definition can appear explicitly in the program, it can be found in
the standard or a user-defined library, or (when appropriate) it is
implicitly defined (see [class.ctor], [class.dtor] and [class.copy]).
An inline function or variable shall be defined in every translation
unit in which it is odr-used outside of a discarded statement.
Non-constexpr static member variables of templates are not implicitly inline, and thus this is UB, no diagnostic required.
We may also turn to [basic.def.odr]/6 for an even stronger statement (not even requiring ODR-use) [quoting a selected extract, emphasis mine]:
There can be more than one definition of a [...] static data member
of a class template [...] in a program provided that each definition
appears in a different translation unit, and provided the
definitions satisfy the following requirements. Given such an entity
named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
[...]
If the definitions of D satisfy all these requirements, then the
behavior is as if there were a single definition of D. If the
definitions of D do not satisfy these requirements, then the
behavior is undefined.
With two different definitions of D (in your case, template1<int>::x) "each definition of D shall consist of the same sequence of tokens" is not fulfilled, and it follows that we naturally cannot possibly fulfill "[...] as if there were a single definition of D"; thus UB.
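One way to stay within these rules, sketched under the assumption that a single initial value is acceptable for every specialization, is to keep the definition of the static member next to the class template in my_template.h, so every translation unit sees the same sequence of tokens. The reader functions below are only there for illustration.

```cpp
// Contents that would live in my_template.h.
template <class T>
class template1
{
public:
    static int x;
};

// Single definition, identical in every translation unit that includes the header.
template <class T>
int template1<T>::x = 1;

// Hypothetical readers standing in for main() and my_func().
int read_int_x()  { return template1<int>::x; }
int read_char_x() { return template1<char>::x; }
```

With this layout neither Src2.cpp nor Main.cpp defines template1<T>::x itself, and the result no longer depends on the order of the files on the command line.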

Static const data member defined in another file

I'm working on a static analyzer for C++11. There is an interaction between static const members of a class and linkage for which I am not sure whether it is defined. My static analyzer should warn for it only if this construct is not defined.
The example is this one:
in file f1.cpp:
struct Foo {
static const int x = 2;
};
int main(void) {
return *&Foo::x;
}
and in file f2.cpp:
struct Foo {
static int x;
};
int Foo::x;
The two files compiled and linked with clang++ -std=c++11 -Weverything f1.cpp f2.cpp cause no warning and produce a binary that returns 0. The same files when compiled with g++ -std=c++11 -Wall -Wextra -pedantic f1.cpp f2.cpp cause no warning and return 2.
My intuition is that this program is ill-defined but no warning is required, as:
both names Foo::x have external linkage following N3376[basic.link]p5:
In addition, a member function, static data member,[...]
has the typedef name for linkage purposes (7.1.3), has external linkage if the name of the class has external
linkage.
but they break the N3376[basic.link]p10 requirement:
After all adjustments of types (during which typedefs (7.1.3) are replaced by their definitions), the types
specified by all declarations referring to a given variable or function shall be identical [...] A violation of this rule on type identity does not require a diagnostic.
To be 100% sure about this, a definition for these "all adjustments of types" is needed, but seems nowhere to be found in the C++11 standard. Is there any, and is the reasoning above correct?
It's an ODR violation. The Foo type has different declarations in each file.
One definition says x is declared with external linkage (can be anything, determined when linking) and the other that it's a compile-time constant with value 2.
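For contrast, here is a sketch of a consistent version: every translation unit sees the same definition of Foo, and exactly one of them provides the out-of-class definition of x, which is needed because *&Foo::x odr-uses it. The function name read_through_pointer is made up for the example.

```cpp
// Would live in a shared header, seen identically by f1.cpp and f2.cpp.
struct Foo {
    static const int x = 2;  // declaration with an in-class initializer
};

// Would live in exactly one .cpp file: the definition (no initializer here).
const int Foo::x;

// Stand-in for main() in f1.cpp: takes the address, so Foo::x is odr-used.
int read_through_pointer() {
    return *&Foo::x;
}
```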

My template specialization differs debug version from release version, is this gcc bug?

First of all, I've got a header file for a class, with a specialization declaration without a definition (code samples from the internet)
$ cat foo.h
template<typename T>
class foo{
public:
static void init(){
return;
}
};
template<> void foo<int>::init();
Then there are 2 implementation files for the template specializations
$ cat foo_int.cpp
#include "foo.h"
#include<stdio.h>
template<>
void foo<int>::init(){
printf("init int foo\n");
}
$ cat foo_float.cpp
#include "foo.h"
#include<stdio.h>
template<>
void foo<float>::init(){
printf("init float foo\n");
}
Finally I got a main file
$ cat main.cpp
#include "foo.h"
int main(){
foo<int>::init();
foo<float>::init();
}
If I compile it without optimization and run it, it gives:
$ g++ foo_int.cpp foo_float.cpp main.cpp && a.out
init int foo
init float foo
If I add optimization, then the result is different:
$ g++ foo_int.cpp foo_float.cpp main.cpp -O2 && a.out
init int foo
Some explanations on the internet say this is due to the internal "weak symbol" mechanism in the gcc implementation, but my questions are:
Is "weak symbol"/"strong symbol" a concept of gcc/g++, or is it part of the C/C++ language specification?
If debug and release results are different, should I say this is a bug/issue of gcc/g++, in regard with "weak symbol" mechanism? As a developer, I wouldn't expect my debug version to behave differently from release version.
I tried clang, unfortunately same error. Is this an "acceptable" case for C/C++ that debug/release "should" behave so differently?
The language definition requires that you declare an explicit specialization before it is used:
If a template, a member template or a member of a class template is
explicitly specialized then that specialization shall be declared
before the first use of that specialization that would cause an
implicit instantiation to take place, in every translation unit in
which such a use occurs; no diagnostic is required. [temp.expl.spec]/6.
There is no declaration of the explicit specialization of foo<float>::init() at the point where it is called from main, but there is an explicit specialization in foo_float.cpp, so the behavior of the program is undefined.
You've violated the one definition rule — your program contains two definitions of foo<float>::init.
One definition occurs in the compilation unit foo_float.cpp, and the other appears in the compilation unit main.cpp.
Violating the one definition rule means undefined behavior — in this case, what likely happens is:
With optimizations off, the program generates an actual function call, and the linker happened to put foo_float.cpp's version of the function in the executable.
With optimizations on, when compiling main.cpp the compiler inlined the function — naturally, it would inline main.cpp's version of the function.
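A sketch of the fix implied by both answers: declare every explicit specialization in foo.h, before any use. Everything is shown in one file here; init returns a string instead of calling printf only so the result can be checked, and call_both is a made-up stand-in for main.

```cpp
#include <string>

// --- foo.h ---
template <typename T>
class foo {
public:
    static std::string init() { return "generic"; }
};
// Both specializations are declared before their first use.
template <> std::string foo<int>::init();
template <> std::string foo<float>::init();

// --- main.cpp (stand-in) ---
std::string call_both() {
    return foo<int>::init() + "," + foo<float>::init();
}

// --- foo_int.cpp and foo_float.cpp ---
template <> std::string foo<int>::init()   { return "int"; }
template <> std::string foo<float>::init() { return "float"; }
```

Because the declaration of foo<float>::init() is visible at the call site, the compiler no longer instantiates (and possibly inlines) the generic version there.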

How do inline variables work?

At the 2016 Oulu ISO C++ Standards meeting, a proposal called Inline Variables was voted into C++17 by the standards committee.
In layman's terms, what are inline variables, how do they work and what are they useful for? How should inline variables be declared, defined and used?
The first sentence of the proposal:
"The inline specifier can be applied to variables as well as to functions."
The ¹guaranteed effect of inline as applied to a function, is to allow the function to be defined identically, with external linkage, in multiple translation units. In practice that means defining the function in a header, that can be included in multiple translation units. The proposal extends this possibility to variables.
So, in practical terms the (now accepted) proposal allows you to use the inline keyword to define an external linkage const namespace scope variable, or any static class data member, in a header file, so that the multiple definitions that result when that header is included in multiple translation units are OK with the linker – it just chooses one of them.
Up until and including C++14 the internal machinery for this has been there, in order to support static variables in class templates, but there was no convenient way to use that machinery. One had to resort to tricks like
template< class Dummy >
struct Kath_
{
static std::string const hi;
};
template< class Dummy >
std::string const Kath_<Dummy>::hi = "Zzzzz...";
using Kath = Kath_<void>; // Allows you to write `Kath::hi`.
From C++17 and onwards I believe one can write just
struct Kath
{
static std::string const hi;
};
inline std::string const Kath::hi = "Zzzzz..."; // Simpler!
… in a header file.
The proposal includes the wording
"An inline static data member can be defined in the class definition and may specify a brace-or-equal-initializer. If the member is declared with the constexpr specifier, it may be redeclared in namespace scope with no initializer (this usage is deprecated; see D.X). Declarations of other static data members shall not specify a brace-or-equal-initializer."
… which allows the above to be further simplified to just
struct Kath
{
static inline std::string const hi = "Zzzzz..."; // Simplest!
};
… as noted by T.C in a comment to this answer.
Also, the constexpr specifier implies inline for static data members as well as functions.
Notes:
¹ For a function `inline` also has a hinting effect about optimization, that the compiler should prefer to replace calls of this function with direct substitution of the function's machine code. This hinting can be ignored.
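Putting the pieces together, a header using the feature might look like the sketch below. It assumes a C++17 compiler (for example -std=c++17); visit_count and record_visit are names invented for the example.

```cpp
#include <string>

// Everything here could sit in a header included from many translation units.
struct Kath
{
    static inline const std::string hi = "Zzzzz...";  // defined in the class itself
};

// Namespace-scope inline variable: one object shared by the whole program.
inline int visit_count = 0;

// Inline function using the shared variable.
inline int record_visit() { return ++visit_count; }
```

Every translation unit that includes this header refers to the same Kath::hi and the same visit_count.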
Inline variables are very similar to inline functions. It signals the linker that only one instance of the variable should exist, even if the variable is seen in multiple compilation units. The linker needs to ensure that no more copies are created.
Inline variables can be used to define globals in header only libraries. Before C++17, they had to use workarounds (inline functions or template hacks).
For instance, one workaround is to use the Meyers’ singleton with an inline function:
inline T& instance()
{
static T global;
return global;
}
There are some drawbacks with this approach, mostly in terms of performance. This overhead could be avoided by template solutions, but it is easy to get them wrong.
With inline variables, you can directly declare it (without getting a multiple definition linker error):
inline T global;
Apart from header only libraries, there are other cases where inline variables can help. Nir Friedman covers this topic in his talk at CppCon: What C++ developers should know about globals (and the linker). The part about inline variables and the workarounds starts at 18m9s.
Long story short, if you need to declare global variables that are shared between compilation units, declaring them as inline variables in the header file is straightforward and avoids the problems with pre-C++17 workarounds.
(There are still use cases for the Meyers’ singleton, for instance, if you explicitly want to have lazy initialization.)
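To make the lazy-initialization point concrete, here is a small sketch in which the object is only constructed on the first call to instance(). The Tracker type and its counter are made up for the illustration.

```cpp
// Hypothetical type that counts how many times it has been constructed.
struct Tracker {
    Tracker() { ++constructions(); }
    static int& constructions() {
        static int count = 0;
        return count;
    }
};

// Meyers' singleton: the function-local static is constructed on first use.
inline Tracker& instance()
{
    static Tracker global;
    return global;
}
```

Before the first call Tracker::constructions() is 0; after any number of calls it is 1, and every call returns a reference to the same object. An inline variable, in contrast, is initialized during program start-up whether or not it is ever used.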
Minimal runnable example
This awesome C++17 feature allows us to:
conveniently use just a single memory address for each constant
store it as a constexpr: How to declare constexpr extern?
do it in a single line from one header
main.cpp
#include <cassert>
#include "notmain.hpp"
int main() {
// Both files see the same memory address.
assert(&notmain_i == notmain_func());
assert(notmain_i == 42);
}
notmain.hpp
#ifndef NOTMAIN_HPP
#define NOTMAIN_HPP
inline constexpr int notmain_i = 42;
const int* notmain_func();
#endif
notmain.cpp
#include "notmain.hpp"
const int* notmain_func() {
return &notmain_i;
}
Compile and run:
g++ -c -o notmain.o -std=c++17 -Wall -Wextra -pedantic notmain.cpp
g++ -c -o main.o -std=c++17 -Wall -Wextra -pedantic main.cpp
g++ -o main -std=c++17 -Wall -Wextra -pedantic main.o notmain.o
./main
C++ standard on inline variables
The C++ standard guarantees that the addresses will be the same. C++17 N4659 standard draft
10.1.6 "The inline specifier":
6 An inline function or variable with external linkage shall have the same address in all translation units.
cppreference https://en.cppreference.com/w/cpp/language/inline explains that if static is not given, then it has external linkage.
GCC inline variable implementation
We can observe how it is implemented with:
nm main.o notmain.o
which contains:
main.o:
U _GLOBAL_OFFSET_TABLE_
U _Z12notmain_funcv
0000000000000028 r _ZZ4mainE19__PRETTY_FUNCTION__
U __assert_fail
0000000000000000 T main
0000000000000000 u notmain_i
notmain.o:
0000000000000000 T _Z12notmain_funcv
0000000000000000 u notmain_i
and man nm says about u:
"u" The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process
there is just one symbol with this name and type in use.
so we see that there is a dedicated ELF extension for this.
Pre-C++ 17: extern const
Before C++ 17, and in C, we can achieve a very similar effect with an extern const, which will lead to a single memory location being used.
The downsides over inline are:
it is not possible to make the variable constexpr with this technique, only inline allows that: How to declare constexpr extern?
it is less elegant as you have to declare and define the variable separately in the header and cpp file
main.cpp
#include <cassert>
#include "notmain.hpp"
int main() {
// Both files see the same memory address.
assert(&notmain_i == notmain_func());
assert(notmain_i == 42);
}
notmain.cpp
#include "notmain.hpp"
const int notmain_i = 42;
const int* notmain_func() {
return &notmain_i;
}
notmain.hpp
#ifndef NOTMAIN_HPP
#define NOTMAIN_HPP
extern const int notmain_i;
const int* notmain_func();
#endif
Pre-C++17 header only alternatives
These are not as good as the extern solution, but they work and only take up a single memory location:
A constexpr function, because constexpr implies inline and inline allows (forces) the definition to appear on every translation unit:
constexpr int shared_inline_constexpr() { return 42; }
and I bet that any decent compiler will inline the call.
You can also use a const or constexpr static integer variable as in:
#include <iostream>
struct MyClass {
static constexpr int i = 42;
};
int main() {
std::cout << MyClass::i << std::endl;
// undefined reference to `MyClass::i'
//std::cout << &MyClass::i << std::endl;
}
but you can't do things like taking its address, or else it becomes odr-used, see also: https://en.cppreference.com/w/cpp/language/static "Constant static members" and Defining constexpr static data members
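If the address is needed before C++17, the usual remedy is an out-of-class definition in exactly one translation unit. A minimal sketch (address_of_i is a made-up helper); since C++17 constexpr static data members are implicitly inline, so there the extra line is redundant and deprecated, but as far as I know still accepted:

```cpp
struct MyClass {
    static constexpr int i = 42;  // declaration with initializer
};

// Out-of-class definition, no initializer. Before C++17 this is what makes
// odr-uses such as &MyClass::i link; it belongs in a single .cpp file.
constexpr int MyClass::i;

// Taking the address odr-uses MyClass::i.
const int* address_of_i() {
    return &MyClass::i;
}
```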
C
In C the situation is the same as C++ pre C++ 17, I've uploaded an example at: What does "static" mean in C?
The only difference is that in C++, const implies static for globals, but it does not in C: C++ semantics of `static const` vs `const`
Any way to fully inline it?
TODO: is there any way to fully inline the variable, without using any memory at all?
Much like what the preprocessor does.
This would require somehow:
forbidding or detecting if the address of the variable is taken
add that information to the ELF object files, and let LTO optimize it up
Related:
C++11 enum with class members and constexpr link-time optimization
Tested in Ubuntu 18.10, GCC 8.2.0.

Static global variable used by inline member function

When you have a static global variable in a C++ header file, each translation unit that includes the header file ends up with its own copy of the variable.
However, if I declare a class in that same header file, and create a member function of that class, implemented inline within the class declaration, that uses the static global variable, for example:
#include <iostream>
static int n = 10;
class Foo {
public:
void print() { std::cout << n << std::endl; }
};
then I see slightly odd behavior under gcc 4.4:
If I compile without optimization, all uses of the member function use the copy of the variable from one of the translation units (the first one mentioned on the g++ command line).
If I compile with -O2, each use of the member function uses the copy of the variable from the translation unit in which the call is made.
Obviously this is really bad design, so this question is just out of curiosity. But my question, nonetheless, is what does the C++ standard say about this case? Is g++ behaving correctly by giving different behavior with and without optimization enabled?
The standard says (3.2/5):
There can be more than one definition
of a class type (clause 9),
... provided the definitions satisfy
the following requirements ... in each
definition of D, corresponding names,
looked up according to 3.4, shall
refer to an entity defined within the
definition of D, or shall refer to the
same entity
This is where your code loses. The uses of n in the different definitions of Foo do not refer to the same object. Game over, undefined behavior, so yes gcc is entitled to do different things at different optimization levels.
3.2/5 continues:
except that a name can refer to a
const object with internal or no
linkage if the object has the same
integral or enumeration type in all
definitions of D, and the object is
initialized with a constant expression
(5.19), and the value (but not the
address) of the object is used, and
the object has the same value in all
definitions of D
So in your example code you could make n into a static const int and all would be lovely. It's not a coincidence that this clause describes conditions under which it makes no difference whether the different TUs "refer to" the same object or different objects - all they use is a compile-time constant value, and they all use the same one.
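A sketch of that variant of the header. The member function returns the value instead of printing it, purely so the result is easy to check; the name value is an invention for the example.

```cpp
// Header contents: each translation unit still gets its own copy of n,
// but n is a const integer initialized with a constant expression.
static const int n = 10;

class Foo {
public:
    // Only the value of n is used, never its address, so the exception
    // quoted from 3.2/5 applies and all definitions of Foo agree.
    int value() const { return n; }
};
```

Taking &n inside the member function would fall outside that exception again, because each translation unit's n has a different address.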