Elegant solution to implementing c++ templates


Inspired by this 2009 question
Background: I'm currently working on a small c++ project and decided to try my hand at creating my own templated classes. I immediately ran into a dozen linker errors.
As it stands, my understanding is that template specializations aren't generated until they absolutely need to be, and this implies that the implementation of a templated class must be either inlined, or accompanied by an explicit instantiation at the bottom. (Why one implies the other I'm not so sure)
Question: Why is that so? Is there something special about the order of compilation that makes it impossible for a compiler to instantiate the template on-demand if it is implemented in a separate .cpp file? In my mind the header and the implementation were simply appended together.
Additionally, the question I linked above was initially posted more than ten years ago, and some comments note that the c++-faq quote mentioned is out of date, so I was wondering if newer standards support solutions that enable both separate header/implementation files and implicit instantiation.

Why is it so?
Templates are compiled in two phases. In the first phase the compiler parses the template definition and checks mostly for syntax errors. If none are found, the template is legal to use, but at this stage the compiler does not generate any code for it. In the second phase, when you actually use the template with concrete types, the compiler generates code for the class member functions and function templates you used.
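A minimal sketch of the second phase, assuming a hypothetical Wrapper class (not from the question): a member whose body would be invalid for some type is only checked against that type when the member is actually used.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper, used only for illustration.
template <typename T>
struct Wrapper {
    T value;

    // Valid for any copyable T.
    T get() const { return value; }

    // Only valid when T has a size() member. Because value.size() depends on T,
    // it is checked against a concrete type only when length() is instantiated.
    int length() const { return static_cast<int>(value.size()); }
};

// Wrapper<int>: get() is instantiated, length() never is, so the fact that
// int has no size() member is not an error.
int use_int() {
    Wrapper<int> w{42};
    return w.get();
}

// Wrapper<std::string>: length() is instantiated here, and it is valid.
int use_string() {
    Wrapper<std::string> w{"hello"};
    return w.length();
}

void demo() {
    assert(use_int() == 42);
    assert(use_string() == 5);
}
```

Calling demo() from main should pass; adding a call to Wrapper<int>::length() anywhere would turn into a compile error at that point.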
This matters because templates are instantiated at compile time. Suppose you declare a template in templated.hpp and put its implementation in implementation.cpp. The compiler compiles each source file separately into an object file, and the linker then links the object files together. To instantiate the template for a given type, the compiler needs the implementation in the translation unit that uses it, and that implementation is not visible if it lives in a different source file. The compiler therefore emits only a reference to the instantiation, no object file ever defines it, and the linker complains that it cannot find the implementation of your template for type T.
As of C++20, and even C++23, templates still have to be instantiated at compile time. C++20 did add a new feature, modules; I am not sure whether it can be used this way, but you can read about it here.
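The explicit-instantiation workaround mentioned in the question can be sketched in a single file as follows. The Accumulator class is a hypothetical example; the comments mark which parts would go into templated.hpp and implementation.cpp in a real project.

```cpp
#include <cassert>

// --- would live in templated.hpp: declaration only ---
template <typename T>
class Accumulator {
public:
    void add(T x);
    T total() const;
private:
    T sum_{};
};

// --- would live in implementation.cpp: member definitions ---
template <typename T>
void Accumulator<T>::add(T x) { sum_ += x; }

template <typename T>
T Accumulator<T>::total() const { return sum_; }

// Explicit instantiation definitions: they force the compiler to emit code for
// these two types in this translation unit, so other source files that saw
// only the declaration can link against them.
template class Accumulator<int>;
template class Accumulator<double>;

// --- what a client file would do after including only templated.hpp ---
int sum_three() {
    Accumulator<int> acc;
    acc.add(1);
    acc.add(2);
    acc.add(3);
    return acc.total();
}

void demo() {
    assert(sum_three() == 6);
}
```

The trade-off is that only the explicitly listed types work: a client using Accumulator<long> would get the same linker error again.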

Related

Why do Inline Member Functions become accessible from a different cpp file? [duplicate]

I am aware that the keyword inline has useful properties e.g. for keeping template specializations inside a header file.
On the other hand I have often read that inline is almost useless as hint for the compiler to actually inline functions.
Further the keyword cannot be used inside a cpp file since the compiler wants to inspect functions marked with the inline keyword whenever they are called.
Hence I am a little confused about the "automatic" inlining capabilities of modern compilers (namely gcc 4.43). When I define a function inside a cpp, can the compiler inline it anyway if it deems that inlining makes sense for the function, or do I rob it of some optimization capabilities? (That would be fine for the majority of functions, but it is important to know for small ones that are called very often.)

Within a compilation unit the compiler will have no problem inlining functions (even if they are not marked as inline). Across compilation units it is harder, but modern compilers can do it.
Use of the inline keyword has little effect on whether a 'modern' compiler actually inlines a function; the compiler has better heuristics than the human mind. You can specify flags to force it one way or the other, but that is usually a bad idea, as humans are bad at making this decision.

Microsoft Visual C++ has been able to do so at least since Visual Studio 2005. They call it "Whole Program Optimization" or "Link-Time Code Generation". In this implementation, the compiler does not actually produce machine code, but writes an intermediate representation of the code into the object files. The linker then merges all of the code into one huge code unit and performs the actual compilation.
GCC has been able to do this since at least version 4.5, with major improvements coming in GCC 4.7. To my knowledge the feature is still considered somewhat experimental (at least insofar as many Linux distributions do not use it). GCC's implementation works very similarly: it first writes an intermediate representation of the code (in its GIMPLE intermediate language) into the object files, then compiles all of the object files into a single object file, which is then passed to the linker (this allows GCC to continue to work with existing linkers).
Many big C++ projects also do what is now being called "unity builds". Instead of passing hundreds of individual C++ source files into the compiler, one source file is created that includes all the other source files in the project. The original intent behind this is to decrease compilation times (since headers etc. do not have to be parsed over and over), but as a side-effect, it will have the same outcome as the LTO/LTCG techniques mentioned above: giving the compiler perfect visibility into all functions in all compilation units.
I jump between being impressed by my C++ compiler's (MSVC 2010) ingenuity and its stupidity. Some code that did pixel format conversion via templates, which would have resolved into 5-10 assembly instructions when properly inlined, got bloated into kilobytes(!) of nested function calls. At other times, it inlines so aggressively that whole classes disappear even though they contained non-trivial functionality.

This depends on your compilation flags. With -combine and -fwhole-program, gcc will do function inlining across cpp boundaries. I'm not sure how much the linker will do if you compile into multiple object files.

The standard dictates nothing about how a function can be inlined. And compilers can inline functions if they have access to their implementation. If you only have a header with binaries, it would be impossible. If it's in the same module, the compiler can inline the function even if it is in the cpp file.

How to get rid of multiple definitions of fully specialized function templates?

I'm having a problem with linking the objects of one of my C++ applications. The source files are all compiled into object files, but many of them rely on the same library, which has a fully specialised function template. The linker complains when trying to link them all together. I understand why this is, but I don't understand how to fix it.
I found this, which explains the problem exactly, but the forum thread never got to a point where the OP asked for a solution. Womp, womp.
How do I compile all the source files, but only get the specialised functions from the library once?

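The problem there is that the full specialization is also defined in the header. A fully specialized function template is an ordinary function rather than a template, so every source file that includes the header emits its own definition, and the linker sees duplicates.
You should either move the implementation to a source file (leaving only a declaration of the specialization in the header), or mark the function as inline.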
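Both fixes can be sketched as follows, assuming a hypothetical type_label function template (the library in the question is not shown, so the names here are made up).

```cpp
#include <cassert>
#include <string>

// Primary template: because it is a template, it may be defined in a header
// that is included from many source files.
template <typename T>
std::string type_label() { return "unknown"; }

// Fix 1: keep the full specialization in the header but mark it inline, so
// the same definition may appear in every translation unit that includes it.
template <>
inline std::string type_label<int>() { return "int"; }

// Fix 2: leave only a declaration of the specialization in the header...
template <>
std::string type_label<double>();

// ...and put the definition in exactly one source file.
template <>
std::string type_label<double>() { return "double"; }

void demo() {
    assert(type_label<int>() == "int");
    assert(type_label<double>() == "double");
    assert(type_label<char>() == "unknown");
}
```

With fix 2 the declaration in the header matters: without it, a source file that calls type_label<double>() would silently instantiate the primary template instead.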

What are "separately compiled C++ templates"?

Once I saw a statement that "separately compiled C++ templates" is a standard feature that none of the available C++ compilers support.
What are those "separately compiled templates" and why are they ignored?

C++98 introduced the export keyword which allowed you to have the definition of a function template in another translation unit, with only its declaration needed to compile code that uses it. (See here if you are hazy on what's a definition vs. a declaration. Basically, you could have the function templates implementation in another translation unit.) That's just as it is with other functions.
However, only compilers using EDG's compiler front end ever supported it, and not all of them even did officially. In fact, the only compiler I know of that officially supported it was Comeau C++. That's why the keyword, unfortunately, was removed in C++11.
I think it's safe to say that a proper module system is expected to cure C++ of many of the shortcomings that surround the whole compilation model, but, again unfortunately, a module system was not considered something that could be tackled in a reasonable amount of time for C++11. We will have to hope for the next version of the standard.

Separately compiled templates is where you can bring in template definitions from another translation unit instead of having to define them in every TU (normally in the header).
Basically, they're ignored because they're virtually impossible to implement in terms of complexity and bring a number of unfortunate side effects.

How to reconcile the C++ idiom of separating header/source with templates?

I'm wondering a bit about this templating business.
In C and C++ it is very common to put declarations in header files and definitions in source files, and keep the two completely separate. However, this doesn't even seem to be possible (in any great way) when it comes to templates, and as we all know, templates are a great tool.
Also, Boost is mostly headers, so this is a real issue. Is separating headers and source still a good idea in C++, or should I just not rely heavily on templates?

Instantiating a template is costly at compile time but virtually free at runtime. Basically, every time you use a template with a new type, the compiler has to generate the code for that new type. That is why the code is in a header: so that the compiler has access to it later.
Putting all your code in a .cpp lets the compiler compile that code only once, which greatly speeds up compilation. You could in theory write all your code in headers and it would work fine, but very large projects would take forever to compile. Also, as soon as you change one line anywhere, you have to rebuild everything.
Now you might ask, how come the STL and Boost are not so slow? That's where precompiled headers come to the rescue. PCHs let the compiler do the most costly work only once. This works well with code that won't change often, like libraries, but its effect is totally nullified for code that changes a lot, as you will have to recompile the whole set of precompiled headers every time. The compiler also uses a couple of tricks to avoid recompiling all template code in every compilation unit.
Also note that C++0x will introduce explicit mechanisms to better control template instantiation. You will be able to explicitly instantiate templates and, most importantly, prevent instantiation in some compilation units. However, most of that work is already being done by most compilers without our knowledge.
So, the rule of thumb is, put as much code (and include directives) as possible in your .cpp. If you can't, well, you can't.
My advice would be: don't template just for the heck of it. If you have to template, be careful and be aware that you are in fact choosing between speed of compilation and usability the template will bring.
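The instantiation control mentioned above ended up in C++11 as extern template. A minimal single-file sketch, using a hypothetical sum_all function template; in a real project the two marked lines would sit in different files.

```cpp
#include <cassert>
#include <vector>

// Ordinary function template, as it would appear in a header.
template <typename T>
T sum_all(const std::vector<T>& values) {
    T total{};
    for (const T& v : values) total += v;
    return total;
}

// Explicit instantiation declaration (would follow the template in the header):
// tells the compiler not to instantiate sum_all<int> implicitly, because some
// translation unit provides it.
extern template int sum_all<int>(const std::vector<int>&);

// Explicit instantiation definition (would live in exactly one .cpp file):
// this is where the code for sum_all<int> is actually generated.
template int sum_all<int>(const std::vector<int>&);

int demo_sum() {
    return sum_all(std::vector<int>{1, 2, 3, 4});
}

void demo() {
    assert(demo_sum() == 10);
}
```

Types without an extern template declaration, such as sum_all<double>, are still instantiated implicitly wherever they are used.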

My personal favourite is this structure:
Header file (header.h):
#ifndef HEADER_FILE
#define HEADER_FILE
template < typename T >
class X { void method(); };
// Pull in the member definitions so every includer sees the full template.
#include "header_impl.h"
#endif
Implementation file (header_impl.h):
#ifndef HEADER_IMPL_FILE
#define HEADER_IMPL_FILE
#include "header.h"
template < typename T >
void X<T>::method() { }
#endif

I think the really important thing to understand about templates is that, paraphrasing Bjarne Stroustrup, C++ is really like multiple languages rolled into one. The conventions and idioms of templating are different from those of writing "regular" C++, almost like another language.
It is absolutely a good idea to separate header and implementation files in "regular" C++, because the header files tell the compiler what you will supply as an implementation some time later (at link time). It is important because this separation is a very real thing in most common operating systems: link time can happen when the user runs the program. You can compile the implementation into binaries (so's, dll's) and ship the unaltered headers for developers to know how to use your now-opaque implementation.
Now, for templates, you can't do that because the compiler has to fully resolve everything at compile time. That is, when you compile the headers, they have to have the implementation of the template in order for the compiler to be able to make sense of your headers. The compiler essentially "un-templates" your templates when you compile, so there is no option for separating interface and implementation of template C++.
Separation of interface and implementation is good practice, generally, but when you wander into template-land, you are explicitly going to a place where that's just not possible - because that's what a template means: resolve and build the implementation of this interface at compile time, not at runtime.
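To make the "un-templating" concrete, here is a small sketch with hypothetical names: each type the template is used with gets its own compiled function, with its own static state.

```cpp
#include <cassert>

// Each instantiation is a distinct function with its own local static.
template <typename T>
int& call_count() {
    static int count = 0;
    return count;
}

// Doubles a value and records the call against the instantiation for T.
template <typename T>
T twice(T x) {
    ++call_count<T>();
    return x + x;
}

void demo() {
    const int ints_before = call_count<int>();
    const int doubles_before = call_count<double>();

    twice(2);    // instantiates and calls twice<int>
    twice(3);    // reuses twice<int>
    twice(1.5);  // instantiates and calls twice<double>

    assert(call_count<int>() == ints_before + 2);
    assert(call_count<double>() == doubles_before + 1);
    // Separate instantiations, separate statics.
    assert(&call_count<int>() != &call_count<double>());
}
```

The compiler can only generate twice<int> and twice<double> if it sees the body of twice at the point of use, which is why the body normally has to be in the header.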

Here are a couple of techniques I use when writing templated code that move implementations into cpp files:
Move parts of methods that do not depend on the template parameters into separate, non-template helper functions or base classes. The bodies can then go into their own cpp files.
Write declarations of common specializations. The definitions can live in their own cpp files.
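The first technique might look like the following sketch. BoundedBase and BoundedBuffer are hypothetical names; the point is that the base class does not depend on the template parameters, so its member bodies could be moved into a cpp file and compiled once.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Non-template base: nothing here depends on T. The bodies are inline only to
// keep this sketch in one file.
class BoundedBase {
public:
    explicit BoundedBase(std::size_t capacity) : capacity_(capacity) {}
    std::size_t size() const { return size_; }
    bool full() const { return size_ == capacity_; }

protected:
    // Shared bookkeeping: returns the index of the next free slot.
    std::size_t claim_slot() {
        if (full()) throw std::length_error("buffer is full");
        return size_++;
    }

private:
    std::size_t capacity_;
    std::size_t size_ = 0;
};

// Thin template layer: only the type-dependent part stays in the header.
template <typename T, std::size_t N>
class BoundedBuffer : public BoundedBase {
public:
    BoundedBuffer() : BoundedBase(N) {}
    void push(const T& value) { items_[claim_slot()] = value; }
    const T& at(std::size_t i) const { return items_[i]; }

private:
    T items_[N] = {};
};

void demo() {
    BoundedBuffer<int, 2> buf;
    buf.push(7);
    buf.push(9);
    assert(buf.size() == 2);
    assert(buf.full());
    assert(buf.at(1) == 9);
}
```

Besides allowing part of the code to live in a cpp file, this also avoids generating a copy of the bookkeeping code for every combination of T and N.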

Separating header and source is not a C++ idiom, it is more of a C idiom, precisely because C++ favours use of templates and inline functions where possible to reduce the runtime of the program.

Templates spread across multiple files

C++ seems to be rather grouchy when declaring templates across multiple files. More specifically, when working with templated classes, the linker expects all method definitions for the class in a single compiler object file. When you take into account headers, other declarations, inheritance, etc., things get really messy.
Is there any general advice, or are there any workarounds, for organizing or redistributing templated member definitions across multiple files?

Is there any general advice, or are there any workarounds, for organizing or redistributing templated member definitions across multiple files?
Yes; don't.
The C++ spec permits a compiler to require that it can "see" the entire template (declaration and definition) at the point of instantiation, and (due to the complexities of any implementation) most compilers retain this requirement. The upshot is that #inclusion of any template header must also #include any and all source required to instantiate the template.
The easiest way to deal with this is to dump everything into the header, inline where possible, out-of-line where necessary.
If you really regard this as an unacceptable affront, a common option is to split the template into the usual header/implementation pair, and then #include the implementation file at the end of the header.
C++'s "export" feature may or may not provide another workaround. The feature is poorly supported and poorly defined; although it in principle should permit some kind of separate compilation of templates, it doesn't necessarily obviate the demand that the compiler be able to see the entire template body.

Across how many files? If you just want to separate class definitions from implementation then try this article in the C++ faqs. That's about the only way I know of that works at the moment, but some IDEs (Eclipse CDT for example) won't link this method properly and you may get a lot of errors. However, writing your own makefiles or using Visual C++ has always worked for me :-)

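When/if your compiler supports C++0x, the extern keyword can be used on an explicit instantiation (extern template) to suppress implicit instantiation in a translation unit, which lets you keep the instantiated definitions in a single source file.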
See here for a brief explanation.
Also, section 6.3, "The Separation Model," of C++ Templates: The Complete Guide by David Vandevoorde and Nicolai M. Josuttis describes other options.
