How to get rid of multiple definitions of fully specialized function templates? - c++

I'm having a problem with linking the objects of one of my C++ applications. The source files are all compiled into object files, but many of them rely on the same library, which has a fully specialised function template. The linker complains when trying to link them all together. I understand why this is, but I don't understand how to fix it.
I found this, which explains the problem exactly, but the forum thread never got to a point where the OP asked for a solution. Womp, womp.
How do I compile all the source files, but only get the specialised functions from the library once?

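The problem there is that the specialization is also implemented in the header. A full specialization is no longer a template but an ordinary function, so every source file that includes the header gets its own definition and the linker sees duplicates.
You should either move the implementation to a source file (leaving only a declaration of the specialization in the header), or mark the specialization as inline.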
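A minimal sketch of the inline fix, assuming a hypothetical header that declares a function template named describe (the name and the return values are made up for illustration, not taken from the library in the question):

```cpp
#include <cassert>
#include <string>

// --- contents of a hypothetical header, e.g. describe.hpp ---

// Primary template: instantiated implicitly in every translation unit that
// uses it; the linker is allowed to merge those copies.
template <typename T>
std::string describe(const T&) {
    return "generic";
}

// Full specialization: for linkage purposes this is an ordinary function.
// Marking it `inline` lets every translation unit that includes the header
// carry an identical definition without a "multiple definition" link error.
template <>
inline std::string describe<int>(const int&) {
    return "int";
}
```

The other option is to keep only the declaration `template <> std::string describe<int>(const int&);` in the header and put the body in exactly one .cpp file.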

Related

Elegant solution to implementing c++ templates

Inspired by this 2009 question
Background: I'm currently working on a small c++ project and decided to try my hand at creating my own templated classes. I immediately ran into a dozen linker errors.
As it stands, my understanding is that template specializations aren't generated until they absolutely need to be, and this implies that the implementation of a templated class must be either inlined, or accompanied by an explicit instantiation at the bottom. (Why one implies the other I'm not so sure)
Question: Why is that so? Is there something special about the order of compilation that makes it impossible for a compiler to instantiate the template on-demand if it is implemented in a separate .cpp file? In my mind the header and the implementation were simply appended together.
Additionally, the question I linked above was initially posted more than ten years ago, and some comments note that the c++-faq quote mentioned is out of date, so I was wondering if newer standards support solutions that enable both separate header/implementation files and implicit instantiation.
Why is it so?
Templates are compiled in two phases. In the first phase the compiler mostly checks for syntax errors. If none are found, the template is legal to use, but at this stage the compiler does not generate any code for it. In the second phase the compiler generates code for the class member functions and function templates you actually use.
Templates are instantiated at compile time. So what happens when the compiler compiles them? Suppose you declare a template in templated.hpp and put its implementation in implementation.cpp. The compiler compiles each source file separately into an object file, and then the linker links them together. Because templates are instantiated at compile time, the compiler needs the implementation at the point of use, and it is not available there if it lives in a different implementation file. The compiler can then only emit a reference to the instantiation, and the linker complains that it cannot find the implementation of your template for type T.
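The "only what you use gets generated" part can be seen in a small sketch. Wrapper and the two helper functions are hypothetical names invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template <typename T>
struct Wrapper {
    T value;

    // Works for any copyable T.
    T get() const { return value; }

    // The body only makes sense for types that have a size() member.
    // It is checked against a concrete T only if something calls it.
    std::size_t length() const { return value.size(); }
};

// Wrapper<int> compiles even though int has no size(), because
// length() is never instantiated for int.
inline int use_int() {
    Wrapper<int> w{42};
    return w.get();
}

// For std::string, length() is used and therefore instantiated.
inline std::size_t use_string() {
    Wrapper<std::string> w{"abc"};
    return w.length();
}
```

Calling length() on a Wrapper<int> would turn this into a compile error, which is the second phase rejecting that particular instantiation.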
Up to C++20, and even C++23, templates still have to be instantiated at compile time. C++20 added a new feature, modules, but I am not sure it can be used this way; you can read about it here.

Is it really required to separate C++ constructs into .h and .cpp files?

Well, I'm just getting into the C++ universe and this question came up.
It's tedious to have two files for each unit of meaning I choose to include in my program.
I know I can have multiple classes in the same (pair of) file(s); however, I would like to know whether there is really no way to write just one file instead of a .h and a .cpp.
I found some other answers (like this, that and that other) that are pretty explanatory, but also quite old. So, hoping the language has improved since then, I came to ask:
Is there some compilation option, alternative extension, or anything else that allows me to write just one file?
I appreciate it!
Okay, you need to understand what is going on. This isn't Java.
The .h file is the definition of your class. It's just an include file that can be used other places so those other places understand your class. Now, you CAN actually do the constructor inline, like this:
class Foo {
public:
    Foo() { /* ... your code here ... */ }
};
This is perfectly legal. But the problem with this is simple. Everywhere you use this constructor, the compiler may insert that code inline. This can lead to many copies of the same code wherever you create a new Foo.
If you put the code in your .cpp file, and then compile it, you get a .o file. That .o file includes a single copy of your constructor, and that's what gets called everywhere you create a Foo.
So separating the definition from the code results in smaller programs. That's less important nowadays than it used to be.
This is the nature of C++, and you should accept it.
The .h is an include file, used in other places. The .cpp is the implementation. It's not really onerous, once you grow accustomed.
You have to understand that C++ is a compiled language. When you compile a library, for example, the library contains machine-specific code. If you want to write a program that uses that library, your program has to be able to see the function and class declarations in order to link against that library properly. On the other hand, it is absolutely possible to write your entire program in header files -- indeed, the term header-only library exists to describe libraries that have no pre-compiled machine code. That means the responsibility of compiling it falls to you. You'll likely have longer compile times, and because of this, very large libraries are almost exclusively pre-compiled into binaries that are platform-specific (in the absence of a set of binaries for your machine, you must compile from source and link against the result). In theory, one could rewrite the C++ spec in such a way that only one file was necessary, but then those files would need to be present within any project that incorporates that library. For very large libraries, this can be a pain -- why include the full source of some engine when you could include just the declarations necessary to link against the binaries? This provides the added advantage of hiding the algorithms and implementation details from the client program. C++ is not an interpreted programming language -- it's important to think about it from the compiler's perspective.

Why only declare functions in headers

I know what a header file is, but I still don't understand why a lot of programmers make a header file and a source file with the same name, put only function prototypes in the header file, and say what the function does in the source file.
I never put functions and their prototypes in separate files; I just stuff it all into the header file.
The question is, why make a source file for each header? Does it give any advantages? Is it just to make the code look cleaner? I don't understand.
If you implement a function in a header, and then include the header into two or more different source files, you'll have multiple definitions of the same function. This violates the one definition rule.
It's possible to get around that by declaring the functions inline, but (unless the linker knows how to merge the multiple definitions) that can lead to code bloat.
You usually define (not only declare) inlined functions in headers.
And you declare non-inlined functions (e.g. functions whose body is large enough), and define them (i.e. implement them) in one particular compilation unit.
Then the linker resolves the appropriate function names (usually mangled names) at link-time. And you don't want to have multiply defined functions.
Having a function definition provided only by one compilation unit makes the total build time a bit faster.
With link-time optimizations (e.g. the -flto option to g++ both during compilation and during linking) things become more complicated.
Notice that huge software (some executables are nearly one gigabyte of binary, and simply linking them takes several minutes) brings constraints that a lone programmer doesn't even imagine. Just try to compile a large piece of free software (LibreOffice, Firefox, Qt5, ...) from its source code to get a sense of the issues.
BTW, you could in principle put all the code of some program in a single source file, but for valid and obvious reasons people don't do that.
Putting function definitions into the header causes lengthy compile times when using any non-trivial system. It may work for small projects to make everything inline but it certainly does not work for bigger systems (not to mention large systems with a couple of hundred million lines of code).
The function declarations inside a header file provide symbol references that can be used to link code together. When you compile individual source files, each one gets generated into object code.
If you want to use one source's functions in another source file, you need some way to know where that code is and how to call it. The compiler is able to use the function declarations for just this reason. It knows how to call it and what it returns, but it doesn't yet know where the source for the function is.
Once all the sources are compiled into object code, the linker then assembles all the object files into an executable (or library), and these symbol references are resolved.
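A minimal sketch of that split between declaration and definition, squeezed into one listing. The function names are hypothetical, and in a real project the definition of square would sit in its own source file:

```cpp
#include <cassert>

// Declaration only, as it would appear in a header. This is enough for the
// compiler to type-check a call and emit a symbol reference for it.
int square(int x);

// Compiled knowing only the declaration above; where square() actually
// lives is not known yet at this point.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

// Definition. In a multi-file project this would be in a separate .cpp file,
// and the linker would resolve the references emitted above.
int square(int x) {
    return x * x;
}
```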
If you have all the code in a single, non-shared file it really doesn't matter where it is - in header or in source file.
There are 2 main reasons for splitting the code to headers and source files. They can be summarized like this:
Technical. If you have several source files interacting with each other, you need to include the headers. If you define everything in the header, every source file that includes it gets its own copy of your code - and for non-inline functions that is a multiple-definition error at link time.
Design. Header files define the interface of your software which can be distributed to client software without exposure of the internal implementation.

compile C++ header-only templates to a shared library

I'm working on a code base of template classes. It's header-only (no .cpp files). I would like to hide the implementation and provide a shared library plus a couple of headers containing only declarations. Unfortunately, it sounds like this doesn't make sense: since there is no compiled code, what would be in such a shared library? Trying to remove the definitions from the headers after compiling causes undefined references. Is there a way to force the compiler to ship objects in a DLL or shared library without having to explicitly instantiate the template classes?
No, there isn't, and there won't be a way to do that for the foreseeable future. The only way to provide C++ template code is as header files. Modules might change that, but that is unlikely to happen before your library is finished.
Something you can try is to split off the implementation and explicitly instantiate all possible use cases. The library you ship then won't work with any types other than the instantiated ones, which significantly reduces the benefit templates bring.
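A rough sketch of that explicit-instantiation approach, shown as one listing. Accumulator and the file names in the comments are hypothetical, and the choice of int and double as the supported types is an assumption made for the example:

```cpp
#include <cassert>

// --- would go in the shipped header (hypothetical accumulator.hpp) ---
// Declarations only: clients can see the interface but not the bodies.
template <typename T>
class Accumulator {
public:
    void add(T v);
    T total() const;

private:
    T sum_{};
};

// --- would go in the library's own source file (hypothetical accumulator.cpp) ---
template <typename T>
void Accumulator<T>::add(T v) {
    sum_ += v;
}

template <typename T>
T Accumulator<T>::total() const {
    return sum_;
}

// Explicit instantiations: these force the compiler to emit code for exactly
// these types into the object file, so it can be shipped in a shared library.
template class Accumulator<int>;
template class Accumulator<double>;
```

With the bodies hidden in the library, a client that tries Accumulator<long> would compile but fail to link, which is the limitation described above.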
Template implementations need to be known at compile time. That's why you can't separate implementation from declaration. So if you want to have the advantages of templates, there is no way around passing your header(s).

How to get your head around C++ linking/dependencies?

I'm a Java developer and I never have to worry about including files or messing with BUILD files.
Whenever I need to write C++ code, things get more complicated. I can think of creating *.h files as interfaces in Java, but figuring out how to write the build file and what order classes should be included in gives me a headache.
Is there a simple way to think of this? How do you know when to include something or how to separate things out properly? For example, what is generally a good way to deal with a project with dozens of source files that depend on each other?
Is there some framework to make creating BUILD files or managing all this boilerplate compilation stuff more bearable?
CMake is the best build system I've been able to find so far. You give it a list of your source files, and it will automatically scan dependencies and recompile only changed files. Although its syntax is a bit funny, and documentation is not very accessible, CMake beats GNU autotools in usability and simplicity, and it works on all major platforms.
As to your "mental model" of what's going on, here are some points to keep in mind.
A .cpp file is compiled completely independently of other .cpp files.
The .cpp file is read by the compiler from top to bottom, only once. Hence, things need to be in the proper order.
A #include directive is the same as copy/pasting the header into the .cpp file.
At the point where a function is used, a declaration of that function is needed, but not necessarily a definition.
At the point where a class member is accessed, a definition of the class is needed. Deriving from a class also requires its definition. Taking pointers or references does not require a definition, but does require a declaration. Use this to your advantage in headers: instead of including Foo.hpp, see if you can get away with just a declaration of class Foo;.
When compiling a .cpp file, a .o file is generated that contains the implementation of exactly those functions defined in the .cpp. References to functions not defined therein are left for the linker to resolve.
The linker puts all these definitions together into an executable, but each function definition has to be present exactly once. (Templates and inline functions are an exception.)
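The point about forward declarations can be sketched as follows. Widget and weight_of are hypothetical names, and the three "files" are collapsed into one listing with comments marking where each part would live:

```cpp
#include <cassert>

// --- hypothetical widget_user.hpp ---
// Forward declaration: enough for pointers and references to Widget,
// so this header does not need to include widget.hpp.
class Widget;

int weight_of(const Widget& w);

// --- hypothetical widget.hpp ---
class Widget {
public:
    explicit Widget(int weight) : weight_(weight) {}
    int weight() const { return weight_; }

private:
    int weight_;
};

// --- hypothetical widget_user.cpp ---
// Accessing a member needs the full class definition, so this source file
// would include widget.hpp before this point.
int weight_of(const Widget& w) {
    return w.weight();
}
```

Keeping the include out of the header means files that only pass Widget references around do not get recompiled when the layout of Widget changes.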
I am a big fan of the Stack Overflow podcast, and I have decided that when I need a build system I will use FinalBuilder.
Jeff Atwood and Joel Spolsky had a conversation about that and it is mentioned that it is used in Fog Creek.
The podcast is here
FinalBuilder is here
Feature Tour
I hope it is suitable for the purpose.
It works on the Windows platform.