struct method linking


I'm updating some old code that has several POD structs that were getting zero'd out by memset (don't blame me...I didn't write this part). Part of the update changed some of them to classes that use private internal pointers that are now getting wiped out by the memset.
So I added a [non-virtual] reset() method to the various structs and refactored the code to call that instead.
One particular struct developed an "undefined reference to `blah::reset()'" error.
Changing it from a struct to a class fixed the error.
Calling nm on the .o files, the mangled function names for that method look the same (whether it's a class or a struct).
I'm using g++ 4.4.1, on Ubuntu.
I hate the thought that this might be a compiler/linker bug, but I'm not sure what else it could be. Am I missing some fundamental difference between structs and classes? I thought the only meaningful ones were the public/private defaults and the way everyone thinks about them.
Update:
It actually depends on the way it's declared:
typedef struct
{
...
void reset();
} foo;
won't link.
struct foo
{
...
void reset();
};
links fine.
So, maybe just a lack of understanding on my part about the way typedefs work in this context?

I think that your problem (and I don't have a standards quote to back this up) is that because your struct doesn't have a name, your member function also does not have a globally identifiable name.
Although you're allowed to use a typedef name to introduce a member function definition, that member function must be part of a named type if you are going to be able to link it to a definition in a different TU.
typedef struct S_ { void reset(); } S;
void S::reset() // OK, but the function actually has id: S_::reset()
{
// ...
}
typedef struct { void reset(); } T;
void T::reset() // OK, definition of anonymous struct's reset(),
// but this isn't an id that can cross TUs.
{
// ...
}
Edit: This could be a gcc bug, though.
7.1.3 [dcl.typedef] If the typedef declaration defines an unnamed class (...), the first typedef-name declared by the declaration to be that class type (...) is used to denote the class type (...) for linkage purposes only (3.5).
Edit:
Or gcc might be right. While the class has external linkage via its typedef name (3.5/4), a member function has external linkage only if the name of the class has external linkage. Although the class has external linkage and it has a name for linkage purposes only, it is still an unnamed class, so its member functions have no linkage.
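As a minimal sketch of the workaround discussed above, the struct can be given a tag name so that its member function has a name the linker can match. The member names (data, count) and the helper count_after_reset() are hypothetical and only stand in for the real code; a C-style typedef alias can still be layered on top for old call sites.

```cpp
#include <cassert>

// Hypothetical type for illustration; the members are not from the original code.
struct foo {                  // named class, so foo::reset() has a linkable name
    int* data = nullptr;      // stand-in for the internal pointers that memset wiped
    int count = 0;
    void reset();             // declared here, defined out of line
};

// In a real project this definition would live in its own .cpp file.
void foo::reset() { count = 0; }

typedef foo foo_t;            // a C-style alias can still be provided

// Hypothetical helper that exercises reset() through the alias.
int count_after_reset() {
    foo_t f;
    f.count = 3;
    f.reset();
    assert(f.data == nullptr);
    return f.count;
}
```

Calling count_after_reset() from another function returns 0, and the same layout split across a header and two .cpp files should link, because the class is no longer unnamed.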

Of course, the second declaration is the "proper" C++ way of doing things. Still, this links for me with g++ 4.4.1:
typedef struct {
void f() {}
} S;
int main() {
S s;
s.f();
}
I must confess that the whole struct namespace thing has always been one of the darker corners of C for me, but I'm sure someone else will come up with an explanation.
Edit: OK, minimalist code that reproduces the problem:
// str.h
typedef struct {
void f();
} S;
// str.cpp
#include "str.h"
void S :: f() {
}
// sm.cpp
#include "str.h"
int main() {
S s;
s.f();
}
Compiled with:
g++ sm.cpp str.cpp
Error:
undefined reference to S::f()
Now if someone can give us chapter and verse from the standard on why this doesn't work - obviously a struct namespace issue, I would have thought.

Related

Warning about shadowed type definitions

Situation as follows:
In a codebase there is a header that defines some types, mostly convenient shortcuts like u64 for uint64_t. These definitions are global, as general type definitions do not really make sense when encapsulated in a namespace. This header is used in almost every other file to unify the codebase.
A publicly available header-only library is also used, which happens to define the same name. In fact, the other definition is not a type but a parameter name, like:
void function(uint64_t u64);
Now the compiler warns that the parameter name shadows the global type definition, which is correct in some respect.
But as the common global type-management header is needed and the include for the library is also needed, my question is how to resolve that problem in a sane way.
Also, why can a type name be shadowed by a function parameter/argument name, given that a variable name and a type name are two very different things? That sounds more like a bug in the compiler to me. Without sound knowledge of the standard on that topic, I would assume
using mytype = unsigned int;
void myfunc(mytype mytype);
to be valid, as I have just defined a variable with the same name as a type, and the compiler can always distinguish the two because of the syntax.
Additional question:
Why does the include order of both headers seem to have no effect?
I tried to solve the problem by including the external library header first (so the global type shortcut should not exist in that module) and including the common global header that defines the types later. But the warnings persist.
Edited to clarify the problem (2 times)

But as the common global type-management header is needed and the include for the library is also needed, my question is how to resolve that problem in a sane way.
One thing you could do is isolate the code that uses the headers producing that warning and move it into its own compilation unit, for which you silence the shadowing warning.
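A narrower variant of the same idea, sketched here under the assumption that the compiler is GCC or Clang, is to suppress the warning only around the library include with diagnostic pragmas. The stand-in function below replaces the real library header, whose name is not given in the question, and call_library() is a hypothetical caller.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;          // project-wide shorthand, as in the question

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wshadow"
// The third-party include would go here; the function below is a stand-in for it.
inline std::uint64_t function(std::uint64_t u64) { return u64 + 1; }
#pragma GCC diagnostic pop

// Hypothetical caller: the alias is still usable outside the library code.
std::uint64_t call_library() {
    u64 input = 41;
    std::uint64_t out = function(input);
    assert(out == 42);
    return out;
}
```

Whether the pragma silences every instance depends on where the compiler reports the warning, so this is worth checking against the actual library before relying on it.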
Also, why can a type name be shadowed by a function parameter/argument name, given that a variable name and a type name are two very different things?
They are two different things, but there are cases where the syntax for both would be the same, so the compiler has to decide whether to use the parameter or the type.
Here is one example with static_assert. It might be odd, but there may be other situations where this is actually hard to debug:
#include <utility>
struct Type {
constexpr bool operator == (int i) {
return i==102;
}
};
template <typename X>
constexpr void f1(X Type) {
// does not fail because lambda is called which returns 101
static_assert(Type() == 101, "");
}
void f2() {
// fails because the operator == checks against 102
static_assert(Type() == 101, "");
}
int main() {
f1([] { return 101; });
f2();
return 0;
}
Here is probably a better example. It may still be an odd one, but I could imagine this one being more likely to occur:
bar(TypeA()); does not construct a TypeA directly; instead, TypeB's operator(), which returns a TypeA, is called on the parameter named TypeA:
#include <utility>
struct TypeA {
void operator () () {
}
};
struct TypeB {
TypeA operator () () {
return TypeA();
}
};
void bar(TypeA a) {
}
void foo(TypeB TypeA ) {
bar(TypeA());
}
int main() {
foo(TypeB());
}
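To connect this back to the snippet in the question, here is a small sketch of what the shadowing means inside the function body. The names mytype and myfunc come from the question; the body and the local variable factor are made up for illustration. Inside the function the unqualified name refers to the parameter, while the qualified name ::mytype still reaches the global alias.

```cpp
#include <cassert>

using mytype = unsigned int;

// In the parameter list, the first "mytype" is the type and the second is the
// parameter. From that point on, the unqualified name means the parameter.
unsigned int myfunc(mytype mytype) {
    // mytype x = 2;          // would not compile: mytype is the parameter here
    ::mytype factor = 2;      // qualified lookup still finds the alias
    assert(factor == 2);
    return mytype * factor;
}
```

This compiles because the declaration is valid, which is why the compiler only warns rather than rejecting it.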

When is definition of class' static data member (un/-)necessary

I have a big project and am refactoring it. A major task is rewriting the logger. The new logger is (as far as I can tell) API-compatible with the old one, so I believed that after changing the header include directory, recompiling and relinking, everything should work. But no. I get multiple errors of the kind undefined reference to <static_data_member>. I can't paste the actual code, but it looks, for example, like this:
// Foo.h
class Foo {
static const int bar = 0;
int baz; // assigned in c-tor
void updateBaz() { baz = bar; }
// ....
};
static const int bar is NOT defined in Foo.cpp. It is sometimes printed by log macros. It used to work (with the old logger), but now I have to define it. What change could have caused that?
Another example occurs with variables declared by boost:
(...)/blog_adaptor.h:50: error: undefined reference to `boost::serialization::version<CA::CReyzinSignature>::value'
So: when are definitions to static members required and when can they be omitted?

Unless the variables are declared inline (a C++17 feature), definitions of static member variables that are odr-used are not optional, as far as the C++ standard is concerned. Failure to provide one violates the one definition rule with no diagnostic required, which in practice means undefined behavior.
Compilers and linkers may vary on exactly what will make them check to see if definitions exist, but that is the nature of undefined behavior.

As Nicol Bolas answered, the code in my project had undefined behavior because static data members were initialized but not defined anywhere. To summarize and extend:
A static data member doesn't need to be defined when:
it is not used, or is used only in discarded branches (non-instantiated templates and discarded branches of constexpr-if)
in C++17, the member is inline
also, clang-tidy says that "out-of-line definition of constexpr static data member is redundant in C++17 and is deprecated"; constexpr static data members are implicitly inline since C++17, so they don't need one either
Further, the following code shows why my bad project was not triggering a linker error before. I don't know whether it is "not odr-use" or "Undefined Behavior that doesn't hurt you yet":
#include <boost/serialization/version.hpp>
class Klass {};
//BOOST_CLASS_VERSION(Klass, 3);
// would be expanded to:
namespace boost { namespace serialization {
template<>
struct version<Klass> {
static const int value = 3; // not defined anywhere
};
} }
int foo (int val) { // was used by old logger
return val;
}
int bar (const int &val) { // is used by new logger
return val;
}
int main () {
// return bar(boost::serialization::version<Klass>::value); // link error
return foo(boost::serialization::version<Klass>::value); // works fine
}
So there is no link error if the member is used but its address is not taken. Passing the value by reference counts as taking the address.
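For completeness, a minimal sketch of the pre-C++17 fix: add the out-of-line definition, after which both kinds of use link. Foo and bar follow the question's example; by_value, by_ref and use_both are hypothetical helpers standing in for the old and new logger calls.

```cpp
#include <cassert>

struct Foo {
    static const int bar = 3;   // in-class initializer: this is only a declaration
};

// The definition. The initializer is not repeated here.
// In a real project this line belongs in exactly one .cpp file.
const int Foo::bar;

int by_value(int v) { return v; }         // only reads the constant's value
int by_ref(const int& v) { return v; }    // binds a reference, which odr-uses the argument

// Hypothetical helper that exercises both kinds of use.
int use_both() {
    assert(by_value(Foo::bar) == 3);
    return by_ref(Foo::bar);    // links because Foo::bar is defined above
}
```

Removing the definition line would typically bring back the undefined reference for by_ref only, which matches the behaviour described above.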

Static data member in an unnamed class C++

I read at the link below that an unnamed (anonymous) class should not have static data members. Could anyone please let me know the reason for this?
https://www-01.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.cbclx01/cplr038.htm
It says:
You can only have one definition of a static member in a program.
Unnamed classes, classes contained within unnamed classes, and local
classes cannot have static data members.

All static member data, if they are ODR-used, must be defined outside the class/struct.
struct Foo
{
static int d;
};
int Foo::d = 0;
If the class/struct is unnamed, there is no way to define the member outside the class.
int ::d = 0;
cannot be used to define the static member of an unnamed class.
Update for C++17
If you are able to use C++17 or newer, you may use
static inline int d = 10;
That will allow a static member variable to be defined in an anonymous class/struct.
Sample code to demonstrate that a static member variable need not be defined outside the class definition:
#include <iostream>
struct foo
{
static inline int d = 10;
};
int main()
{
auto ptr = &foo::d;
std::cout << *ptr << std::endl;
}
Command to build:
g++ -std=c++17 -Wall socc.cc -o socc
Output of running the program:
10
Thanks are due to @Jean-MichaëlCelerier for the suggestion for the update.

Are you sure that the standard actually forbids this?
As mentioned, the problem arises because you need an actual definition of the static member, and the language provides no way to write it. There are no other problems in referring to it, as we can do so from within the struct or via an instance of it.
However GCC for example will accept the following:
static struct {
static int j;
} a;
int main() {
return a.j; // Here we actually refer to the static variable
}
but it can't be linked, as a.j refers to an undefined symbol (._0::j). There is a way around this: define it in assembler or use compiler extensions. For example, adding the line
int j asm("_ZN3._01jE") = 42;
will make it work. _ZN3._01jE is the real mangled name of the static variable in this case; neither the mangled nor the unmangled name can be used directly as an identifier in standard C++ (but it can via a GCC extension or assembler).
As you probably realize, this would only work with specific compilers. Other compilers would mangle the name in other ways (or even do other things that may make the trick not work at all).
You should really question why you would want to use this trick. If you can do the work using standard methods, you should most probably choose that. For example, you could reduce the visibility by using an anonymous namespace instead:
namespace {
static struct Fubar {
static int j;
} a;
int Fubar::j = 0;
}
Now Fubar is not really anonymous, but it will at least be confined to the translation unit.

When C++ was standardized, unnamed classes could not have static data members, as there was no way to define/instantiate them. However, this problem has been solved with C++11, as it added the decltype operator:
struct {
static int j;
} a;
// Define the static member of the unnamed struct:
int decltype(a)::j = 42;
int main() {
return a.j == 42 ? 0 : 1; // Use the static member
}
So, in principle, there could be unnamed classes or structs with static data members. But the C++ standard makers deliberately did not allow this syntax, as the compiler doesn't know which name it should give to that decltype(a)::j thing for linking. So most (all?) compilers, including current versions of GCC in normal mode, refuse to compile this.
In -fpermissive mode, GCC-9 and GCC-10 accept this code and compile it fine. However, if the declaration of a is moved to a header file that is included from different source files, they still fail at the linking stage.
So unnamed classes can only be used inside a single translation unit. To avoid polluting the global namespace, just put anything that needs to stay local inside an anonymous namespace. So:
namespace {
struct whatever {
static int j;
} a;
int whatever::j = 42;
}
int main() {
return a.j == 42 ? 0 : 1;
}
compiles fine, doesn't pollute the global namespace, and doesn't even lead to problems if the name whatever clashes with a name from another header file.

Why should types be put in unnamed namespaces?

I understand the use of unnamed namespaces to make functions and variables have internal linkage. Unnamed namespaces are not used in header files; only source files. Types declared in a source file cannot be used outside. So what's the use of putting types in unnamed namespaces?
See these links where it's mentioned that types can be put in unnamed namespaces:
Superiority of unnamed namespace over static?
Unnamed/anonymous namespaces vs. static functions
Why an unnamed namespace is a "superior" alternative to static?

Where do you want to put local types other than the unnamed namespace? Types can't have a linkage specifier like static. If they are not publicly known, e.g., because they are declared in a header, there is a fair chance that names of local types conflict, e.g., when two translation units define types with the same name. In that case you'd end up with an ODR violation. Defining the types inside an unnamed namespace eliminates this possibility.
To be a bit more concrete. Consider you have
// file demo.h
int foo();
double bar();
// file foo.cpp
struct helper { int i; };
int foo() { helper h{}; return h.i; }
// file bar.cpp
struct helper { double d; };
double bar() { helper h{}; return h.d; }
// file main.cpp
#include "demo.h"
int main() {
return foo() + bar();
}
If you link these three translation units, you have mismatching definitions of helper from foo.cpp and bar.cpp. The compiler/linker is not required to detect these, but each type which is used in the program needs to have a consistent definition. Violating this constraint is known as a violation of the "one definition rule" (ODR). Any violation of the ODR results in undefined behavior.
Given the comment it seems a bit more convincing is needed. The relevant section of the standard is 3.2 [basic.def.odr] paragraph 6:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then each definition of D shall consist of the same sequence of tokens; and
[...]
There are plenty of further constraints but "shall consist of the same sequence of tokens" is clearly sufficient to rule out e.g. the definitions in the demo above from being legal.
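A sketch of how foo.cpp from the demo above could look once helper is made local. Wrapping the type in an unnamed namespace gives each translation unit its own distinct helper, so the two definitions no longer have to agree. The self_check() function is a hypothetical addition for illustration.

```cpp
#include <cassert>

// foo.cpp, with its helper confined to this translation unit
namespace {
struct helper { int i; };
}

int foo() {
    helper h{};     // value-initialized, so h.i is 0
    return h.i;
}

// Hypothetical check, not part of the original demo.
void self_check() {
    assert(foo() == 0);
}
```

bar.cpp would wrap its own helper the same way; the function bar() itself stays outside the unnamed namespace so that it remains callable from main.cpp.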

So what's the use of putting types in unnamed namespaces?
You can create short, meaningful classes with names that may be used in more than one file without the problem of name conflicts.
For example, I use two classes often in unnamed namespaces - Initializer and Helper.
namespace
{
struct Initializer
{
Initializer()
{
// Take care of things that need to be initialized at static
// initialization time.
}
};
struct Helper
{
// Provide functions that are useful for the implementation
// but not exposed to the users of the main interface.
};
// Take care of things that need to be initialized at static
// initialization time.
Initializer initializer;
}
I can repeat this pattern of code in as many files as I want without the names Initializer and Helper getting in the way.
Update, in response to comment by OP
file-1.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
int main()
{
Initializer init;
}
file-2.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
Command to build:
g++ file-1.cpp file-2.cpp
I get linker error message about multiple definitions of Initializer::Initializer(). Please note that the standard does not require the linker to produce this error. From section 3.2/4:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.
The linker does not produce an error if the functions are defined inline:
struct Initializer
{
Initializer() {}
};
That's OK for a simple case like this since the implementations are identical. If the inline implementations are different, the program is subject to undefined behavior.

I might be a bit late for answering the question the OP made but since I think the answer is not fully clear, I would like to help future readers.
Let's try a test. Compile the following files:
//main.cpp
#include <iostream>
#include "test.hpp"
class Test {
public:
void talk() {
std::cout<<"I'm test MAIN\n";
}
};
int main()
{
Test t;
t.talk();
testfunc();
}
//test.hpp
void testfunc();
//test.cpp
#include <iostream>
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
void testfunc() {
Test t;
t.talk();
}
Now run the executable.
You would expect to see:
I'm test MAIN
I'm test 2
What you may well see, though, is:
I'm test MAIN
I'm test MAIN
What happened?!?!!
Now try putting an unnamed namespace around the "Test" class in "test.cpp" like so:
#include <iostream>
#include "test.hpp"
namespace{
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
}
void testfunc() {
Test t;
t.talk();
}
Compile it again and run.
The output should be:
I'm test MAIN
I'm test 2
Wow! It works!
As it turns out, it is important to define classes inside unnamed namespaces so that you get the proper functionality out of them when two class names in different translation units are identical.
Now as to why that is the case, I haven't done any research on it (maybe someone could help here?) and so I can't really tell you for sure. I'm answering purely from a practical standpoint.
What I would suspect, though, is that, while it is true that C structs are indeed local to a translation unit, they are a bit different from classes, since classes in C++ usually have behavior assigned to them. Behavior means functions and, as we know, functions are not local to the translation unit.
This is just my assumption.

Why does typedef struct produce a link failure

So I have a piece of code that looks like this.
typedef struct {
int foo;
int bar;
void foobar(int, char*);
} mystruct;
and
void mystruct::foobar(int x, char* y) { return; }
and
mystruct obj;
obj.foobar(17, "X");
This all compiles, links and runs perfectly. Except when it doesn't. On one compiler it works, and on another compiler (Android GCC) it fails with a link error: unsatisfied reference.
If I change it like this, it compiles and links.
struct mystruct {
int foo;
int bar;
void foobar(int, char*);
};
I think I kind of know why, but I can't explain it properly and I can't find it in the standard. Can anyone explain it to me and find the proper reference?
Edit: I thought it was pretty obvious to all concerned that this is C++ code. I tagged it; function in a struct is not valid C; but just to be clear the file has a CPP extension and the compiler treats it as C++.
Edit: an answerer noticed that the argument in the call is a string literal and therefore const, but the parameter is non-const. This is not a critical factor because (a) the compiler passed it and (b) the link failed regardless of argument types.
Edit: My theory was that this is related to anonymous struct types passed to the linker so that declaration and call compiled separately did not match. It seems this may not be correct, in which case it may just be a subtle compiler bug.
Edit: out of curiosity, can anyone reproduce this behaviour? The actual compiler is Android NDK, recent download, and whatever version of GCC comes with that. If other compilers do/do not have this problem, that could be the answer.

I'm thinking it's a compiler bug. The following more extreme example is still OK:
typedef struct { void f(); } f;
void f::f() { }
From GCC buglist. I.e. you're allowed to use a typedef name to define a member function, even when that member function has the same name as the typedef.

C++11, 9.3/5
If the definition of a member function is lexically outside its class definition, the member function name shall be qualified by its class name using the :: operator.
In the first example, mystruct isn't the class name.

Basically, the struct's own name goes on the first line, before the { bracket. The last word is another name for it, introduced by the typedef.
So it needs to be written like this:
typedef struct mystruct{int a;int b;void foobar(int,char*);} othername;
void mystruct::foobar(int,char*){}
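A slightly fuller sketch of that suggestion, assuming the struct from the question. The return type is changed to int and the parameter to const char* purely so the example has something to check; call_through_typedef() is a hypothetical helper. Both the tag name and the typedef name refer to the same class, so either can be used at call sites.

```cpp
#include <cassert>

// Tag name (mystruct) and typedef name (othername) denote the same class.
typedef struct mystruct {
    int foo;
    int bar;
    int foobar(int, const char*);   // return type changed for the demo (assumption)
} othername;

// The member is defined through the tag name, which gives it a linkable name.
int mystruct::foobar(int x, const char* y) {
    return x + (y[0] == 'X' ? 1 : 0);
}

// Hypothetical helper: declare through the typedef, call the member.
int call_through_typedef() {
    othername obj{};
    assert(obj.foo == 0 && obj.bar == 0);
    return obj.foobar(17, "X");
}
```

With a tag name present, the declaration in a header and the definition in a separate .cpp file should resolve to the same symbol on the compilers mentioned in the question.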

The idiom of an anonymous struct in a typedef is one I used to use frequently when I first learnt C. For example the following lines are a valid description of a type in C
typedef struct
{
char * characters;
int length;
} String;
If gcc chokes on this then gcc is wrong.

Seems you are mixing C and C++. First, if you want the code to be C++, then you definitely must have:
void foobar(int, const char*);
because "X" is of type (const char *), not (char *).
Second, If you want your code to be C, then you can't use a function in the structure. Only pointers to function are allowed in C. So,
typedef struct {
void (*foo)(void);
} X;
Your first code is valid in C++ code, but it is probably something different than you really want.
typedef struct { ... } A; in C++ is a no-name structure with a typedef. It's something like:
struct A {
void foo();
};
typedef A B;
void B::foo() { }
I don't think this would be a valid C++ construction. Removing "A" you got your code exactly:
typedef struct {
void foo();
} B;
void B::foo() { }
