Lazy symbol binding failed: symbol not found - c++

I have three header files in my project which describe objects Rational, Complex, and RubyObject. The first two are templates. All can be interconverted using copy constructors, which are defined in the header files — except for those that construct Rational and Complex from const RubyObject&s, which are defined in a source file.
Note: Those definitions are there by necessity. If they all go in the headers, you get circular dependency.
A while back, I ran into some unresolved symbol errors with the two copy constructors defined in the source file. I was able to include in the source file the following function
void nm_init_data() {
    nm::RubyObject obj(INT2FIX(1));
    nm::Rational32 x(obj);
    nm::Rational64 y(obj);
    nm::Rational128 z(obj);
    volatile nm::Complex64 a(obj);
    volatile nm::Complex128 b(obj);
}
and then call nm_init_data() from the library entry point in the main source file. Doing so forced these symbols to be linked properly.
Unfortunately, I recently upgraded GCC and the errors are back. In fact, it seems to happen in a slightly different place with GCC 4.6 (e.g., on Travis-CI).
It isn't simply a version-specific issue, as I had thought before. We see it on Travis CI's Ubuntu-based system, which runs GCC 4.6, but not on an Ubuntu machine with either GCC 4.8.1 or 4.8.2. We do see it on a Mac OS X machine with 4.8.2, though not on the same machine with 4.7.2. Turning off optimization doesn't seem to help either.
If I run nm on my library, the symbol is definitely undefined:
$ nm tmp/x86_64-darwin13.0.0/nmatrix/2.0.0/nmatrix.bundle |grep RationalIsEC1ERKNS
U __ZN2nm8RationalIsEC1ERKNS_10RubyObjectE
00000000004ca460 D __ZZN2nm8RationalIsEC1ERKNS_10RubyObjectEE18rb_intern_id_cache
00000000004ca458 D __ZZN2nm8RationalIsEC1ERKNS_10RubyObjectEE18rb_intern_id_cache_0
I'm not sure why there are two defined entries which are subordinate to the undefined symbol, but I also don't know as much as I'd like about compilers.
It also looks like the copy constructor is an undefined symbol for each version of the Rational template:
__ZN2nm8RationalIiEC1ERKNS_10RubyObjectE
__ZN2nm8RationalIsEC1ERKNS_10RubyObjectE
__ZN2nm8RationalIxEC1ERKNS_10RubyObjectE
"Well, that's strange," I thought. "Complex64 and Complex128 are also called in that nm_init_data function, but they both resolve properly — and aren't listed in the nm -u output." So I tried adding volatile before the Rational copy construction as well, thinking that maybe the compiler was optimizing out something we don't want optimized out. But that didn't fix it either, sadly. This did, with a caveat:
void nm_init_data() {
    volatile VALUE t = INT2FIX(1);
    volatile nm::RubyObject obj(t);
    volatile nm::Rational32 x(const_cast<nm::RubyObject&>(obj));
    volatile nm::Rational64 y(const_cast<nm::RubyObject&>(obj));
    volatile nm::Rational128 z(const_cast<nm::RubyObject&>(obj));
    volatile nm::Complex64 a(const_cast<nm::RubyObject&>(obj));
    volatile nm::Complex128 b(const_cast<nm::RubyObject&>(obj));
}
The caveat is that now I get the exact same error, but for the Complex objects instead. Argh!
dyld: lazy symbol binding failed: Symbol not found: __ZN2nm7ComplexIdEC1ERKNS_10RubyObjectE
Referenced from: /Users/jwoods/Projects/nmatrix/lib/nmatrix.bundle
Expected in: flat namespace
dyld: Symbol not found: __ZN2nm7ComplexIdEC1ERKNS_10RubyObjectE
Referenced from: /Users/jwoods/Projects/nmatrix/lib/nmatrix.bundle
Expected in: flat namespace
This is completely absurd. Here are the definitions for both of these functions, in the same source file as the nm_init_data() function:
namespace nm {
    template <typename Type>
    Complex<Type>::Complex(const RubyObject& other) {
        // do some things
    }
    template <typename Type>
    Rational<Type>::Rational(const RubyObject& other) {
        // do some other things
    }
} // end of namespace nm
Hint: One thing that is worth mentioning is that the error doesn't occur when nm_init_data() gets called (i.e., when the library is loaded). It happens much later, during another call to these troublesome functions.
How do I fix this problem once and for all, and others like it?

You claim the following, which I doubt.
Those definitions are there by necessity. If they all go in the headers, you get circular dependency.
In most cases you can solve such a circular entanglement by separating your code into an additional .hpp file, which is included together with the class definition that contains the template definitions anywhere needed.
If your code has a real circular dependency, it could not compile. Usually, if your dependencies seem to be circular, you have to look closer and go down to method level and check which of them would require both types to compile.
So it could be that your types use each other; then compile them all in one .cpp file (e.g. via three .hpp includes).
Or there are only pointers to another type; then use forward declarations to ensure that all templates are resolved.
Or third, you have some methods that depend forward and some that depend backward; then put one kind in one file and the other kind in another, and you are fine again.
Additionally, it seems that you should use an explicit instantiation for your missing items. I would expect something like the following after the definition of the constructor (the template argument here is taken from the missing symbol in your dyld output, Complex<double>), e.g.:
template nm::Complex<double>::Complex(const nm::RubyObject& other);
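A minimal sketch of that idea, under stated assumptions: RubyObject and Complex below are simplified stand-ins rather than the real nmatrix types, the value member is hypothetical, and the float instantiation is a guess at what Complex64 wraps. It shows a constructor declared in the class, defined out of line as it would be in a source file, and then explicitly instantiated so that the compiler emits the symbols in that translation unit.

```cpp
#include <cassert>

namespace nm {

// Simplified stand-in for the real RubyObject (hypothetical member).
struct RubyObject { long value; };

// Normally in a header: the constructor is declared but not defined.
template <typename Type>
struct Complex {
    Type r, i;
    Complex(const RubyObject& other);
};

// Normally in one source file: the out-of-line definition...
template <typename Type>
Complex<Type>::Complex(const RubyObject& other)
    : r(static_cast<Type>(other.value)), i(0) {}

// ...followed by explicit instantiation definitions, which make the
// compiler emit these constructors in this translation unit.
template Complex<float>::Complex(const RubyObject&);
template Complex<double>::Complex(const RubyObject&);

}  // namespace nm

// Quick check that the instantiated constructors behave as expected.
inline void check_complex_from_ruby_object() {
    nm::RubyObject obj = {3};
    assert(nm::Complex<double>(obj).r == 3.0);
    assert(nm::Complex<float>(obj).i == 0.0f);
}
```

Unlike the nm_init_data() trick, this does not rely on the optimizer keeping a dummy call alive; the instantiation is requested directly.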

Rational, Complex... are templates
copy constructors... are defined in the header files — except for those that construct Rational and Complex from const RubyObject&s, which are defined in a source file.
And therein lies your problem. Since Rational and Complex are templates, all their methods need to be available in your header file.
If they're not, then you might sometimes be able to get away with it depending on the order in which things are called and the order in which things are linked -- but more often you'll get strange errors about undefined symbols, which is exactly what is happening here.
Simply move the definitions of Rational(const RubyObject&) and Complex(const RubyObject&) into the respective headers and everything should just work.
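One way the header-only layout could look without a circular include, sketched in a single file with comments marking where each hypothetical header would begin. The file names and the simplified members are assumptions for illustration, not the real project layout. The point is that declaring a constructor taking const RubyObject& only needs a forward declaration; the definition can sit in a later header that is included once both classes are complete.

```cpp
#include <cassert>

namespace nm {

// --- would live in a forward-declaration header (hypothetical) ---
struct RubyObject;

// --- would live in rational.h ---
template <typename Type>
struct Rational {
    Type n, d;
    Rational(Type n_, Type d_) : n(n_), d(d_) {}
    Rational(const RubyObject& other);  // declaration needs no complete type
};

// --- would live in rubyobject.h ---
struct RubyObject {
    long value;  // simplified stand-in for the wrapped Ruby VALUE
    explicit RubyObject(long v) : value(v) {}
    template <typename Type>
    RubyObject(const Rational<Type>& r) : value(r.n / r.d) {}
};

// --- would live in an implementation header included after both ---
template <typename Type>
Rational<Type>::Rational(const RubyObject& other)
    : n(static_cast<Type>(other.value)), d(1) {}

}  // namespace nm

// Round trip: RubyObject -> Rational<int> -> RubyObject.
inline long round_trip(long v) {
    nm::RubyObject obj(v);
    nm::Rational<int> q(obj);
    assert(q.d == 1);
    return nm::RubyObject(q).value;
}
```

Every translation unit that includes the implementation header can instantiate the constructor itself, so no symbol has to be found in another object file.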

Related

Extern variable only in header unexpectedly working, why?

I'm currently updating a C++ library for Arduino (Specifically 8-bit AVR processors compiled using avr-gcc).
Typically the authors of the default Arduino libraries like to include an extern variable for the class inside the header, which is defined in the class .cpp file also. This I assume is basically to have everything provided ready to go for newbies as built-in objects.
The scenario I have is: The library I have updated no longer requires the .cpp file and I have removed it from the library. It wasn't until I went on a final pass checking for bugs that I realized, no linker error was produced despite the fact a definition wasn't provided for the extern variable in a .cpp file.
This is as simple as I can get it (header file):
struct Foo{
    void method() {}
};
extern Foo foo;
Including this code and using it in one or many source files does not cause any linker error. I have tried it in both versions of GCC which Arduino uses (4.3.7, 4.8.1) and with C++11 enabled/disabled.
In my attempt to cause an error, I found it was only possible when doing something like taking the address of the object or modifying the contents of a dummy variable I added.
After discovering this I find it's important to note:
The class functions only return other objects, as in, nothing like operators returning references to itself, or even a copy.
It only modifies external objects (registers which are effectively volatile uint8_t references in code), and returns temporaries of other classes.
All of the class functions in this header are so basic that they cost less than or equal to the cost of a function call, therefore they are (in my tests) completely in-lined into the caller. A typical statement may create many temporary objects in the call chain, however the compiler sees through these and outputs efficient code modifying registers directly, rather than a set of nested function calls.
I also recall reading in n3797 7.1.1 - 8 that extern can be used on incomplete types, however the class is fully defined whereas the declaration is not (this is probably irrelevant).
I'm led to believe that this may be a result of optimizations at play. I have seen the effect that taking the address has on objects which would otherwise be considered constant and compiled without RAM usage. Adding any layer of indirection to an object, such that the compiler can no longer guarantee its state, causes this RAM-consuming behavior.
So, maybe I've answered my question by simply asking it, however I'm still making assumptions and it bothers me. After quite some time hobby-coding C++, literally the only thing on my list of do-not's is making assumptions.
Really, what I want to know is:
With respect to the working solution I have, is it a simple case of documenting the inability to take the address (cause indirection) of the class?
Is it just an edge case behavior caused by optimizations eliminating the need for something to be linked?
Or is it plain and simple undefined behavior? As in, GCC may have a bug and is permitting code that might fail if optimizations were lowered or disabled?
Or one of you may be lucky enough to be in possession of a decoder ring that can find a suitable paragraph in the standard outlining the specifics.
This is my first question here, so let me know if you would like to know certain details, I can also provide GitHub links to the code if needed.
Edit: As the library needs to be compatible with existing code I need to maintain the ability to use the dot syntax, otherwise I'd simply have a class of static functions.
To remove assumptions for now, I see two options:
Add a .cpp just for the variable definition.
Use a define in the header like #define foo (Foo()) allowing dot syntax via a temporary.
I prefer the method using a define, what does the community think?
Cheers.
Declaring something extern just informs the assembler and the linker that whenever you use that label/symbol, it should refer to an entry in the symbol table instead of a locally allocated symbol.
The role of the linker is to replace symbol table entries with an actual reference to the address space whenever possible.
If you don't use the symbol at all in your C file, it will not show up in the assembly code, and thus will not cause any linker error when your module is linked with others, since there is no undefined reference.
It is either an edge-case behaviour caused by optimization, or you never use the foo variable in your code. I'm not 100% sure that it is formally not undefined behavior, but I'm quite sure it isn't undefined from a practical point of view.
extern variables are implemented in such a way that code compiled with them produces so-called relocations (empty places where the address of the variable should be placed), which are then filled in by the linker. Apparently foo is never used in your code in a way that would need its address, and therefore the linker doesn't even try to find that symbol. If you turn optimization off (-O0) you will probably get a linker error.
Update: If you want to keep "dot notation" but remove the problem with undefined extern, you may replace extern with static (in header file), creating separate "instance" of variable for each TU. As this variable is going to be optimized out anyway, this will not change the real code at all, but will also work for unoptimized build.
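A small sketch of the static-in-header suggestion, assuming a made-up fake_register variable in place of a real memory-mapped AVR register. Each translation unit that includes the header gets its own internal-linkage foo, so every use has a definition regardless of optimization level, and the dot syntax is preserved.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped register (hypothetical; real AVR code
// would refer to an actual I/O register instead).
static volatile std::uint8_t fake_register = 0;

struct Foo {
    // Touches no Foo state, so the call can be inlined away entirely.
    void method() {
        fake_register = static_cast<std::uint8_t>(fake_register | 0x01);
    }
};

// One internal-linkage instance per translation unit: nothing depends
// on the optimizer removing a reference to an undefined symbol.
static Foo foo;

// Usage keeps the dot syntax that existing sketches rely on.
inline void use_foo() {
    foo.method();
    assert((fake_register & 0x01) == 0x01);
}
```

Because Foo has no data members, the per-translation-unit copies are interchangeable; that would not hold for a class carrying state.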

Different C++ Class Declarations

I'm trying to use a third party C++ library that isn't using namespaces and is causing symbol conflicts. The conflicting symbols are for classes my code isn't utilizing, so I was considering creating custom header files for the third party library where the class declarations only include the public members my code is using, leaving out any members that use the conflicting classes. Basically creating an interface.
I have three questions:
If the compilation to .obj files works, will this technique still cause symbol conflicts when I get to linking?
If that isn't a problem, will the varying class declarations cause problems when linking? For example, does the linker verify that the declaration of a class used by each .obj file has the same number of members?
If neither of those are a problem and I'm able to link the .obj files, will it cause problems when invoking methods? I don't know exactly how C++ works under the hood, but if it uses indexes to point to class methods, and those indexes were different from one .obj file to another, I'm guessing this approach would blow up at runtime.
In theory, you need identical declarations for this to work.
In practice, you will definitely need to make sure your declarations contain:
All the methods you use
All the virtual methods, used or not.
All the data members
You need all these in the right order of declaration too.
You might get away with faking the data members, but would need to make sure you put in stubs that had the same size.
If you do not do all this, you will not get the same object layout and even if a link works it will fail badly and quickly at run-time.
If you do this, it still seems risky to me and as a worst case may appear to work but have odd run time failures.
"if it uses indexes ": To some extent exactly how virtual functions work is implementation defined, but typically it does use an index into a virtual function table.
What you might be able to do is to:
Take the original headers
Keep the full declarations for the classes you use
Stub out the classes and declarations you do not use but are referenced by the ones you do.
Remove all the types not referenced at all.
For explanatory purposes, a simplified explanation follows.
C++ allows you to use functions you declare. What you are doing is providing multiple definitions for a single declaration across multiple translation units. If you expose the class declaration in a header file, your compiler sees it in each translation unit that includes the header file.
Therefore your own class functions have to be defined exactly as they have been declared (same function names, same arguments).
If a function is not called, you are allowed not to define it, because the compiler doesn't know whether it might be defined in another translation unit.
Compilation creates a label for each defined function (symbol) in the object code. An unresolved label, on the other hand, is created for each symbol that is referenced (a call site, a variable use).
So if you follow these rules, you should get to the point where your code compiles but fails to link. The linker is the tool that maps defined symbols from each translation unit to symbol references.
If the object files that are linked together contain multiple definitions of the same function, the linker is unable to create an exact match and therefore fails to link.
In practice you most likely want to provide a library and enjoy using your own classes without bothering about what your user might define. Even if the programmer takes extra care to put things into a namespace, two users might still choose the same name for a namespace. This will lead to link failures, because the compiler exposed the symbols and the linker is supposed to link them.
GCC has added an attribute to explicitly mark symbols that should not be exposed to the linker: __attribute__((visibility("hidden"))) (see this SO question).
This makes it possible to have multiple definitions of a class with the same name.
In order for this to work across compilation units, you have to make sure class declarations are not exposed in an interface header, as that could cause multiple mismatched declarations.
I recommend using a wrapper to encapsulate the third party library.
Wrapper.h
#ifndef WRAPPER_H_
#define WRAPPER_H_
#include <memory>
class third_party;
class Wrapper
{
public:
    void wrappedFunction();
    Wrapper();
private:
    // A better choice would be a unique_ptr but g++ and clang++ failed to
    // compile due to "incomplete type" which is the whole point
    std::shared_ptr<third_party> wrapped;
};
#endif
Wrapper.cpp
#include "Wrapper.h"
#include <third_party.h>
void Wrapper::wrappedFunction()
{
    wrapped->command();
}
Wrapper::Wrapper():wrapped{std::make_shared<third_party>()}
{
}
The reason why a unique_ptr doesn't work is explained here: std::unique_ptr with an incomplete type won't compile
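For comparison, a sketch of how the same wrapper can hold a unique_ptr after all: the destructor is declared in the header and defined only where third_party is a complete type. Header and source are shown in one file with comments marking the split; the third_party class here is a stand-in, and the int return value of command() is an assumption added purely so the example has something to check.

```cpp
#include <cassert>
#include <memory>

// --- Wrapper.h: only a forward declaration of the wrapped class ---
class third_party;

class Wrapper {
public:
    Wrapper();
    ~Wrapper();  // declared here, defined where third_party is complete
    int wrappedFunction();
private:
    std::unique_ptr<third_party> wrapped;
};

// --- Wrapper.cpp: the full definition is visible from here on ---
// Stand-in for the contents of <third_party.h>.
class third_party {
public:
    int command() { return 42; }
};

Wrapper::Wrapper() : wrapped(new third_party()) {}
Wrapper::~Wrapper() = default;  // unique_ptr's deleter is instantiated here
int Wrapper::wrappedFunction() { return wrapped->command(); }

// Usage: callers never see the definition of third_party.
inline int call_through_wrapper() {
    Wrapper w;
    int result = w.wrappedFunction();
    assert(result == 42);
    return result;
}
```

If the destructor were left implicit, it would be generated inline in every file that includes Wrapper.h, where third_party is still incomplete, which is the situation that produces the "incomplete type" error.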
You can move the entire library into a namespace by using a clever trick with includes. All the #include directive does is copy the relevant code into the current "translation unit" (the source file being compiled plus everything it includes). You can take advantage of this as follows.
I've borrowed heavily from another answer by user JohnB which was later deleted by him.
// my_thirdparty.h
namespace ThirdParty {
#include "thirdparty.h"
//... Include all the headers here that you need to use for thirdparty.
}
// my_thirdparty.cpp / .cc
namespace ThirdParty {
#include "thirdparty.cpp"
//... Put all .cpp files in here that are currently in your project
}
Finally, remove all the .cpp files in the third party library from your project. Only compile my_thirdparty.cpp.
Warning: If you include many library files from the single my_thirdparty.cpp, this might introduce compiler issues due to interaction between the individual .cpp files. Things such as using namespace directives or badly behaved #define / #include directives can cause this. Either resolve the conflicts or create multiple my_thirdparty.cpp files, splitting the library between them.

C++: Compiler and Linker functionality

I want to understand exactly which part of a program the compiler looks at and which part the linker looks at. So I wrote the following code:
#include <iostream>
using namespace std;
#include <string>
class Test {
private:
    int i;
public:
    Test(int val) {i=val ;}
    void DefinedCorrectFunction(int val);
    void DefinedIncorrectFunction(int val);
    void NonDefinedFunction(int val);
    template <class paramType>
    void FunctionTemplate (paramType val) { i = val }
};
void Test::DefinedCorrectFunction(int val)
{
    i = val;
}
void Test::DefinedIncorrectFunction(int val)
{
    i = val
}
void main()
{
    Test testObject(1);
    //testObject.NonDefinedFunction(2);
    //testObject.FunctionTemplate<int>(2);
}
I have four functions:
DefinedCorrectFunction - This is a normal function declared and defined correctly.
DefinedIncorrectFunction - This function is declared correctly but the implementation is wrong (missing ;)
NonDefinedFunction - Only declaration. No definition.
FunctionTemplate - A function template.
Now if I compile this code I get a compiler error for the missing ';' in DefinedIncorrectFunction.
Suppose I fix this and then uncomment testObject.NonDefinedFunction(2). Now I get a linker error.
Now uncomment testObject.FunctionTemplate<int>(2). Now I get a compiler error for the missing ';'.
For function templates I understand that they are not touched by the compiler unless they are invoked in the code. So the compiler does not complain about the missing ';' until I call testObject.FunctionTemplate<int>(2).
For testObject.NonDefinedFunction(2), the compiler did not complain but the linker did. As I understand it, all the compiler cared about was that a function named NonDefinedFunction is declared. It didn't care about the implementation. Then the linker complained because it could not find the implementation. So far so good.
Where I get confused is when compiler complained about DefinedIncorrectFunction. It didn't look for implementation of NonDefinedFunction but it went through the DefinedIncorrectFunction.
So I'm a little unclear as to what exactly the compiler does and what the linker does. My understanding is that the linker links components with their calls. So when NonDefinedFunction is called, it looked for the compiled implementation of NonDefinedFunction and complained. But the compiler didn't care about the implementation of NonDefinedFunction, while it did for DefinedIncorrectFunction.
I'd really appreciate if someone can explain this or provide some reference.
Thank you.
The function of the compiler is to compile the code that you have written and convert it into object files. So if you have missed a ; or used an undefined variable, the compiler will complain because these are errors it can detect while compiling a single source file.
If the compilation proceeds without any hitch, the object files are produced. The object files have a complex structure but basically contain five things
Headers - The information about the file
Object Code - Code in machine language (This code cannot run by itself in most cases)
Relocation Information - What portions of code will need to have addresses changed when the actual execution occurs
Symbol Table - Symbols referenced by the code. They may be defined in this code, imported from other modules or defined by linker
Debugging Info - Used by debuggers
The compiler compiles the code and fills the symbol table with every symbol it encounters. Symbols refer to both variables and functions. The answer to this question explains the symbol table.
This contains a collection of executable code and data that the linker can process into a working application or shared library. The object file has a data structure called a symbol table in it that maps the different items in the object file to names that the linker can understand.
The point to note
If you call a function from your code, the compiler doesn't put the
final address of the routine in the object file. Instead, it puts a
placeholder value into the code and adds a note that tells the linker
to look up the reference in the various symbol tables from all the
object files it's processing and stick the final location there.
The generated object files are processed by the linker that will fill out the blanks in symbol tables, link one module to the other and finally give the executable code which can be loaded by the loader.
So in your specific case -
DefinedIncorrectFunction() - The compiler gets the definition of the function and begins compiling it to make the object code and insert appropriate reference into Symbol Table. Compilation fails due to syntax error, so Compiler aborts with an error.
NonDefinedFunction() - The compiler gets the declaration but no definition, so it adds an entry to the symbol table and flags the linker to fill in the appropriate value (since the linker will process a bunch of object files, it is possible this definition is present in some other object file). In your case you do not specify any other file, so the linker aborts with an undefined reference to NonDefinedFunction error because it can't find a definition for that symbol table entry.
To understand it further, let's say your code is structured as follows
File- try.h
#include<string>
#include<iostream>
class Test {
private:
    int i;
public:
    Test(int val) {i=val ;}
    void DefinedCorrectFunction(int val);
    void DefinedIncorrectFunction(int val);
    void NonDefinedFunction(int val);
    template <class paramType>
    void FunctionTemplate (paramType val) { i = val; }
};
File try.cpp
#include "try.h"
void Test::DefinedCorrectFunction(int val)
{
    i = val;
}
void Test::DefinedIncorrectFunction(int val)
{
    i = val;
}
int main()
{
    Test testObject(1);
    testObject.NonDefinedFunction(2);
    //testObject.FunctionTemplate<int>(2);
    return 0;
}
Let us first only compile and assemble the code but not link it
$ g++ -c try.cpp -o try.o
$
This step proceeds without any problem. So you have the object code in try.o. Let's try and link it up
$ g++ try.o
try.o: In function `main':
try.cpp:(.text+0x52): undefined reference to `Test::NonDefinedFunction(int)'
collect2: ld returned 1 exit status
You forgot to define Test::NonDefinedFunction. Let's define it in a separate file.
File- try1.cpp
#include "try.h"
void Test::NonDefinedFunction(int val)
{
    i = val;
}
Let us compile it into object code
$ g++ -c try1.cpp -o try1.o
$
Again it is successful. Let us try to link only this file
$ g++ try1.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.5/../../../../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: ld returned 1 exit status
No main, so it won't link!!
Now you have two separate object codes that have all the components you need. Just pass BOTH of them to linker and let it do the rest
$ g++ try.o try1.o
$
No error!! This is because the linker finds definitions of all the functions (even though they are scattered across different object files) and fills the blanks in the object code with the appropriate values.
I believe this is your question:
Where I get confused is when compiler complained about DefinedIncorrectFunction. It didn't look for implementation of NonDefinedFunction but it went through the DefinedIncorrectFunction.
The compiler tried to parse DefinedIncorrectFunction (because you provided a definition in this source file) and there was a syntax error (missing semicolon). On the other hand, the compiler never saw a definition for NonDefinedFunction because there simply was no code in this module. You might have provided a definition of NonDefinedFunction in another source file, but the compiler doesn't know that. The compiler only looks at one source file (and its included header files) at a time.
Say you want to eat some soup, so you go to a restaurant.
You search the menu for soup. If you don't find it in the menu, you leave the restaurant. (kind of like a compiler complaining it couldn't find the function) If you find it, what do you do?
You call the waiter to go get you some soup. However, just because it's in the menu, doesn't mean that they also have it in the kitchen. Could be an outdated menu, it could be that someone forgot to tell the chef that he's supposed to make soup. So again, you leave. (like an error from the linker that it couldn't find the symbol)
The compiler checks that the source code is language conformant and adheres to the semantics of the language. The output from the compiler is object code.
The linker links the different object modules together to form an executable. The definitions of functions are located in this phase, and the appropriate code to call them is added in this phase.
The compiler compiles code in the form of translation units. It will compile all the code that is included in a source .cpp file.
DefinedIncorrectFunction() is defined in your source file, so the compiler checks it for language validity.
NonDefinedFunction() does not have any definition in the source file, so the compiler does not need to compile it. If the definition is present in some other source file, the function will be compiled as part of that translation unit and the linker will link to it. If the linker does not find the definition at the linking stage, it raises a linking error.
What the compiler does, and what the linker does, depends on the
implementation: a legal implementation could just store the tokenized
source in the “compiler”, and do everything in the linker.
Modern implementations do put off more and more to the linker, for
better optimization. And many early implementations of templates didn't
even look at the template code until link time, other than matching braces
enough to know where the template ended. From a user point of view,
you're more interested in whether the error “requires a
diagnostic” (which can be emitted by the compiler or the linker)
or is undefined behavior.
In the case of DefinedIncorrectFunction, you have provided source text
which the implementation is required to parse. That text contains an
error for which a diagnostic is required. In the case of
NonDefinedFunction: if the function is used, failure to provide a
definition (or providing more than one definition) in the complete
program is a violation of the one definition rule, which is undefined
behavior. No diagnostic is required (but I can't imagine an
implementation that didn't provide one for a missing definition of a
function that was used).
In practice, errors which can be easily detected simply by examining the
text input of a single translation unit are defined by the standard to
“require a diagnostic”, and will be detected by the
compiler. Errors which cannot be detected by the examination of a
single translation unit (e.g. a missing definition, which might be
present in a different translation unit) are formally undefined
behavior—in many cases, the errors can be detected by the linker,
and in such cases, implementations will in fact emit an error.
This is somewhat modified in cases like inline functions, where you're
allowed to repeat the definition in each translation unit, and extremely
modified by templates, since many errors cannot be detected until
instantiation. In the case of templates, the standard leaves
implementations a great deal of freedom: at the least, the compiler must
parse the template enough to determine where the template ends. The
standard added things like typename, however, to allow much more
parsing before instantiation. In dependent contexts, however, some
errors cannot possibly be detected before instantiation, which may take
place at compilation time or at link time—early implementations
favored link time instantiation; compile time instantiation dominates
today, and is used by VC++ and g++.
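A short illustration of an error in a dependent context that can only be detected at instantiation (the function names are made up for the example). The template body parses fine, and whether t.size() is valid depends entirely on which T it is eventually instantiated with.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// t.size() depends on T, so the compiler cannot check it until it
// knows which T is being used.
template <class T>
std::size_t length_of(const T& t) {
    return t.size();
}

// Instantiated for std::string, which has size(): this compiles.
inline std::size_t demo_length() {
    std::size_t n = length_of(std::string("abc"));
    assert(n == 3);
    return n;
}

// Uncommenting the next function instantiates length_of<int>, and only
// then can the compiler report that int has no member named size.
// inline std::size_t broken() { return length_of(42); }
```

A missing semicolon inside the template body is a different case: it is not dependent on T, so a compiler is free to reject it before any instantiation, and g++ and clang++ do.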
The missing semi-colon is a syntax error and therefore the code should not compile. This might happen even in a template implementation. Essentially, there is a parsing stage and whilst it is obvious to a human how to "fix and recover" a compiler doesn't have to do that. It can't just "imagine the semi-colon is there because that's what you meant" and continue.
A linker looks for function definitions to call where they are required. It isn't required here so there is no complaint. There is no error in this file as such, as even if it were required, it might not be implemented in this particular compilation unit. The linker is responsible for collecting together different compilation units, i.e. "linking" them.
Ah, but you could have NonDefinedFunction(int) in another compilation unit.
The compiler produces some output for the linker that basically says the following (among other things):
Which symbols (functions/variables/etc) are defined.
Which symbols are referenced but undefined. In this case the linker needs to resolve the references by searching through the other modules being linked. If it can't, you get a linker error.
The linker is there to link in code defined (possibly) in external modules - libraries or object files you will use together with this particular source file to generate the complete executable. So, if you have a declaration but no definition, your code will compile because the compiler knows the linker might find the missing code somewhere else and make it work. Therefore, in this case you will get an error from the linker, not the compiler.
If, on the other hand, there's a syntax error in your code, the compiler can't even compile and you will get an error at this stage. Macros and templates may behave a bit differently, not causing errors if they are not used (templates are about as much as macros with a somewhat nicer interface), but it also depends on the error's gravity. If you mess up so much that the compiler can't figure out where the templated/macro code ends and regular code starts, it won't be able to compile.
With regular code, the compiler must compile even dead code (code not referenced in your source file) because someone might want to use that code from another source file, by linking your .o file to his code. Therefore non-templated/macro code must be syntactically correct even if it is not directly used in the same source file.

Templated C++ Object Files

Let's say I have two .cpp files, file1.cpp and file2.cpp, which use std::vector<int>. Suppose that file1.cpp has an int main(void). I compile both into file1.o and file2.o and link the two object files into an ELF binary which I can execute. I am compiling on a 32-bit Ubuntu Linux machine.
My question regards how the compiler and linker put together the symbols for the std::vector:
When the linker makes my final binary, is there code duplication? Does the linker have one set of "templated" code for the code in file1.o that uses std::vector and another set of std::vector code for the code that comprises file2.o?
I tried this for myself (I used g++ -g) and looked at the disassembly of my final executable. The labels generated for the vector constructor and other methods looked random, although the code from file1.o appeared to call the same constructor as the code from file2.o. I could not be sure, however.
If the linker does prevent the code duplication, how does it do it? Must it "know" what templates are? Does it always prevent code duplication regarding multiple uses of the same templated code across multiple object files?
It knows what the templates are through name mangling. The type of the object is encoded by the compiler in its name, and that allows the linker to filter out the duplicate implementations of the same template.
This is done during linking, and not compilation, because each .o file can be linked with anything and thus cannot be stripped of something that may later be needed. Only the linker can decide which code is unused, which template instantiation is a duplicate, etc. This is done by using "weak symbols" in the object's symbol list: symbols that the linker can discard if they appear multiple times (as opposed to other symbols, like user-defined functions, which cannot be discarded when duplicated and cause a linking error instead).
Your question is stated verbatim in the opening section of this documentation:
http://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html
Technically due to the "one definition rule" there is only one std::vector<int> and therefore the code should be linked together. What may happen is that some code is inlined which would speed up execution time but could produce more code.
If you had one file using std::vector<int> and another using std::vector<unsigned int> then you would have 2 classes and potentially lots of duplicate code.
Of course the writers of vector might use some common code for certain situations, e.g. for POD types, that removes the duplication.

Throw-catch cause linkage errors

I'm getting linkage errors of the following type:
Festival.obj : error LNK2019:
unresolved external symbol "public:
void __thiscall Tree<class Price>::add(class Price &)"
(?add@?$Tree@VPrice@@@@QAEXAAVPrice@@@Z)
referenced in function
__catch$?AddBand@Festival@@QAE?AW4StatusType@@HHH@Z$0
I used to think it had to do with the try-catch mechanism, but I have since been told otherwise. This is an updated version of the question.
I'm using Visual Studio 2008, but I have similar problems in g++.
The relevant code:
In Festival.cpp
#include "Tree.h"
#include <exception>
using namespace std;
class Band{
public:
Band(int bandID, int price, int votes=0): bandID(bandID), price(price), votes(votes){};
...
private:
...
};
class Festival{
public:
Festival(int budget): budget(budget), minPrice(0), maxNeededBudget(0), priceOffset(0), bandCounter(0){};
~Festival();
StatusType AddBand(int bandID, int price, int votes=0);
...
private:
Tree<Band> bandTree;
...
};
StatusType Festival::AddBand(int bandID, int price, int votes){
if ((price<0)||(bandID<0)){
return INVALID_INPUT;
}
Band* newBand=NULL;
try{
newBand=new Band(bandID,price-priceOffset,votes);
}
catch(bad_alloc&){return ALLOCATION_ERROR;}
if (bandTree.find(*newBand)!=NULL){
delete newBand;
return FAILURE;
}
bandTree.add(*newBand);
....
}
In Tree.h:
template<class T>
class Tree{
public:
Tree(T* initialData=NULL, Tree<T>* initialFather=NULL);
void add(T& newData);
....
private:
....
};
Interestingly enough I do not have linkage errors when I try to use Tree functions when type T is a primitive type like an int.
Is there Tree.cpp? If there is, maybe you forgot to link it? Where is the implementation of Tree::add?
In addition I don't see where you call Tree::add. I guess it should be inside the try statement, right after the new?
Just a reminder:
For most compilers (i.e. those that practice separate compilation) the implementation of the member functions of a template class has to be visible during the compilation of the source file that uses the template class. Usually people follow this rule by putting the implementation of the member functions inside the header file.
Maybe Tree::add isn't inside the header? Then a possible solution in the discussed case will be to put Tree::add implementation inside the header file.
The difference between regular classes and template classes exists because a template class is not a "real" class: it is, well, a template. If you had defined your Tree class as a regular class, the compiler could have used your code right away. In the case of a template, the compiler first "writes" the real class for you, substituting the template parameters with the types you supplied. Now, the compiler compiles cpp files one by one. It is not aware of other cpp files and can use nothing from them. Let's say your implementation of Tree::add looks like this:
template<class T>
void Tree<T>::add(T& newData)
{
    newData.destroyEverything();
}
This is totally legitimate as long as your T has a destroyEverything method. When the compiler compiles Class.cpp, it wants to be sure that you don't do anything with T that T doesn't support. For example, Tree<int> won't work because int doesn't have destroyEverything: the compiler will try to write your code with int instead of T and find out that the code doesn't compile. But since the compiler "sees" only the current cpp file and everything it includes, it can't validate the add function if that function is in a separate cpp file.
There won't be any problem with
void Tree::add(int& newData)
{
newData.destroyEverything();
}
implemented in a separate cpp file, because the compiler knows that int is the only acceptable type and can count on finding the error itself when it gets to compile Tree.cpp.
Are you sure the try/catch has anything to do with it? What happens if you simply comment out the try and catch lines, leave the rest of the code as it is, and build that?
It might just be that you're missing the library that defines Tree::add(class Price &) from your link line.
Update: using Tree functions with a primitive type doesn't result in a linking error.
I updated my question in light of some of the things that were said.
As others have stated, you need to show the implementation of Tree::add() and tell us how you are linking it.
On an unrelated point, if you are using constructs like:
Band* newBand=NULL;
try{
newBand=new Band(bandID,price-priceOffset,votes);
}
catch(bad_alloc&){return ALLOCATION_ERROR;}
throughout your code, you are frankly wasting your time. The chances of you getting to a point of memory exhaustion in a modern OS are remote and the chances of you doing anything useful after it has happened are roughly zero. You will be much better off simply saying:
Band * newBand = new Band ( bandID, price - priceOffset, votes );
or possibly:
Band newBand( bandID, price - priceOffset, votes );
and forgetting the exception handling in this case.
You wrote in a comment:
I considered this but the function is part of Tree.h, and I do include it. The function defined is: template<class T> void Tree<T>::add(T& newData); We call it the following way: priceTree.add(*newPriceNode); whereas priceTree is Tree<Price>, both of which are defined in the cpp file in question.
instead of:
priceTree.add(*newPriceNode);
try:
priceTree.add(newPriceNode); //no "*" before "newPriceNode"
add() takes a reference to a node, not a pointer to a node (according to your definition of Tree).
You're getting linkage errors, not compiler errors. This tells us that the compiler knew what sort of function Tree::add() is, but didn't have a definition. In Tree.h, I see a declaration of the add() function, but not a definition. It looks odd to me; does anybody know where Tree.h came from?
Usually a template class comes with member function definitions in the include file, since the functions have to be instantiated somewhere, and the simplest thing is for the compiler to instantiate when used and let the linker sort it out. If the definitions are in Tree.h, I'd expect everything to work as planned.
So, I'm going to go out on a limb and suggest that the definitions are in a separate file, not linked in, and that there are provisions elsewhere for instantiating for basic types like Tree<int>. This is presumably to streamline compilation, as normally these things are compiled in multiple places, and that takes time.
What you need to do in that case is to find where Tree<int> is instantiated, and add an instantiation for your class.
I could be way off base here, but my explanation does fit the facts you've given.
Edit after first comments:
Templates are somewhat trickier than ordinary functions, which usually isn't a real problem. If the definitions for all the calls were in Tree.h, then Festival.cpp would be able to instantiate Tree<Band> and everything would be cool. That's the usual technique, and you're running into this problem because you're not using it.
When you write a function, it gets compiled, and the linker will find it. Any routine calling that function needs to know the function prototype, so it will know how to call it. When you write a template, you're not writing anything that will go directly into the program, but any use of the template counts as writing all the functions.
Therefore, there has to be some use of Tree<Band> somewhere in your program, for there to be a Tree<Band>::add() function compiled. The definition of Tree<T>::add has to be available to the compiler when Tree<Band> is instantiated, because otherwise the compiler has no idea what to compile. In this case, it's generating the function call, confident that you'll make sure the function is compiled elsewhere.
Therefore, you have to instantiate Tree<Band> inside a file that has access to both the definitions for Tree<T> and Band. This probably means a file that is, or includes, Tree.cpp and includes Festival.h.
The linker is already using Tree.cpp, but Tree.cpp doesn't have Tree<Band> defined in it, so it's meaningless to the linker. Templates are only useful for the compiler, and the linker only operates on what the compiler generated from templates.
The quick way to solve this is to take the definitions from Tree.cpp and put them in Tree.h. That will be likely to increase compilation and link times, unfortunately. The other technique is to instantiate all template uses in Tree.cpp, so that they'll be compiled there.