Why do we not compile template implementation in a ".cpp.h" file? - c++

While making a template linked list, rather than having the files list.h and list.cpp, we create list.h and list.cpp.h. The .cpp.h file contains the implementation for the linked list class. At the end of list.h, before the #ENDIF, we #include "list.cpp.h".
I understand that by making the .cpp file a header file we are able to avoid compiling it, but how does this work? We have needed to compile implementations that rely on using "template < class T >" in the past, so why do we not have to compile this implementation file?
Edit: My question was marked as a duplicate with a link to Why can templates only be implemented in the header file? , which I had already read. That question does not answer why the template implementation does not need to be compiled like a normal .cpp implementation file does. If we can avoid compiling implementations by making them header files, why do we not do that for every implementation? What are the downsides?
Does instantiating a template compile the code for each instance of that template? If my .cpp ever requires the use of a template in one of its functions, then why wouldn't I change it to a header file if it avoids this initial compilation?
I hope my question makes more sense.

Remember, a #include directive just pulls the contents of the header into the file that's currently being compiled. So the template does get compiled; it gets compiled in every translation unit that #includes the file where the template is defined. And when the template is used (later in that file) it gets instantiated.

In what you are saying, you are including the templates as header file, since your .h file includes the .ccp.h file in the end.
So this is a convention on your project, to seperate the normal .h and the template code in two files.
Hence the template code, in your case, is compiled into the regular .cpp code.
The reason why you would want to avoid that is "bloat" -- every .cpp file will create it's own copy of the template code used in that cpp-unit, so if if A.cpp and B.cpp both includes T.cpp.h, they may both create the same code if they use the same templates, and it takes a smart optimizer to remove all the duplication.

Related

Why use a "tpp" file when implementing templated functions and classes defined in a header?

Please refer to the first answer in this question about implementing templates.
Specifically, take note of this quote
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
I bolded the part that I'm most interested in.
What's the significance of a .tpp file? I tried doing exactly what was suggested in that page and it worked. But then, I changed the file extension to any random gibberish (like .zz or .ypp) and it still worked! Is it supposed to work? Does it matter if it's .tpp or any other extension? And why not use a .cpp?
Here's another thing I'm confused about.
If my implementations were written in a .cpp and the header defined non-templated functions, then I'd only need to compile the .cpp file once, right? at least until I changed something in the .cpp file.
But if I have a header defining templated functions and my implementations are in a file with a random funky extension, how is it compiled? And are the implementations compiled every time I compile whatever source code #includes said header?
Does it matter if it's .tpp or any other extension? And why not use a .cpp?
It does not matter what the extension is, but don't use .cpp because it goes against conventions (it will still work, but don't do it; .cpp files are generally source files). Other than that it's a matter of what your codebase uses. For example I (and the Boost codebase) use .ipp for this purpose.
What's the significance of a .tpp file?
It's used when you don't want the file that contains the interface of a module to contain all the gory implementation details. But you cannot write the implementation in a .cpp file because it's a template. So you do the best you can (not considering explicit instantiations and the like). For example
Something.hpp
#pragma once
namespace space {
template <typename Type>
class Something {
public:
void some_interface();
};
} // namespace space
#include "Something.ipp"
Something.ipp
#pragma once
namespace space {
template <typename Type>
void Something<Type>::some_interface() {
// the implementation
}
} // namespace space
I thought the whole point of writing definitions in headers and the implementations in a separate file is to save compilation time, so that you compile the implementations only once until you make some changes
You can't split up general template code into an implementation file. You need the full code visible in order to use the template, that's why you need to put everything in the header file. For more see Why can templates only be implemented in the header file?
But if the implementation file has some funky looking file extension, how does that work in terms of compiling? Is it as efficient as if the implementations were in a cpp?
You don't compile the .tpp, .ipp, -inl.h, etc files. They are just like header files, except that they are only included by other header files. You only compile source (.cpp, .cc) files.
Files extensions are meaningless to the preprocessor; there's nothing sacred about .h either. It's just convention, so other programmers know and understand what the file contains.
The preprocessor will allow you to include any file into any translation unit (it's a very blunt tool). Extensions like that just help clarify what should be included where.
Does it matter if it's .tpp or any other extension? And why not use a .cpp?
It doesn't matter much which extension is actually used, as long it is different from any of the standard extensions used for C++ translation units.
The reasoning is to have a different file extension as they are usually detected by any C++ build systems for translation units (.cpp, .cc, ...). Because translating these as a source file would fail. They have to be #included by the corresponding header file containing the template declarations.
But if the implementation file has some funky looking file extension, how does that work in terms of compiling?
It needs to be #included to be compiled as mentioned.
Is it as efficient as if the implementations were in a cpp?
Well, not a 100% as efficient regarding compile time like a pure object file generated from a translation unit. It will be compiled again, as soon the header containing the #include statement changes.
And are the implementations compiled every time I compile whatever source code #includes said header?
Yes, they are.
The file extensions of header files do not matter in C++, even though standard source-file extensions such as .cpp should be avoided.
However, there are established conventions. These help human programmers navigate the code. Calling template implementation files .tpp is one of such conventions.
Something nobody has mentioned yet is that some external tools might rely on such conventions.
For instance, I routinely employ a popular grep substitute which allows searching only in files of a given type. This program will recognize .tpp files as C++, but not, say, .zz files

C++ Error message redefinition of functions

I am using two stacks to implement a queue class. My header file looks like:
#ifndef _MyQueue_h
#define _MyQueue_h
using namespace std;
template <typename T>
class MyQueue {
public:
MyQueue();
~MyQueue();
void enqueue(T element);
T peek();
void dequeue();
int size();
bool empty();
private:
int count;
stack<T> stk1;
stack<T> stk2;
};
# include "MyQueue.cpp"
# endif
And my cpp (implementation) file looks like:
#include <stack>
#include "MyQueue.h"
using namespace std;
template <typename T>
MyQueue<T>::MyQueue()
{
count = 0;
}
template <typename T>
MyQueue<T>::~ MyQueue()
{
}
template <typename T>
void MyQueue<T>::enqueue(T element)
{
stk1.push(element);
count ++;
}
(other functions omitted).
However, using Xcode 4.5, it keeps saying that my functions (MyQueue, ~MyQueue, enqueue, peek, etc.) are redefined. Can anyone help me to clarify where have I redefined them?
Thank you
You're trying something which I really don't like. It's a pretence.
Remove #include "MyQueue.cpp", replace it with the content of MyQueue.cpp, delete the file MyQueue.cpp. Now everything will work.
You are trying to pretend the template code can be split into header file and implementation file. But because it can't you have to cheat by including the implementation file in the header file. It's less confusing if you don't cheat or pretend and just have one file, the header file, with everything in it.
The precise reason that you get a redefinition is that you are compiling your cpp file, which includes your header file, which includes your cpp file again. So the content of the cpp file gets compiled twice.
In C and C++, #include behaves like a copy and paste.
Everytime you see
#include "file"
it should be treated as if you literally retyped the whole file in that one spot.
So if you compile MyQueue.cpp, the preprocessor will prepend the contents of MyQueue.h,
which itself tacks on a duplicate of MyQueue.cpp evidenced by
#include "MyQueue.cpp"
and then follows the native content of MyQueue.cpp.
So the result of
#include "MyQueue.cpp"
inside MyQueue.h, is the same as if you had written one big file with the contents
of MyQueue.h, MyQueue.cpp and MyQueue.cpp again. (with the include of stack in there as well of course)
That is why the compiler complained about functions getting redefined.
The Duplicate inserted from the
#include "MyQueue.cpp"
might also contain the line
#include "MyQueue.h"
but I think the include guards (ifndef,endif) protected against a recursive expansion since that did
not seem to be an issue.
I would like to point out that putting all the implementation code and declaration code in the same file for templates is not the only solution, as others suggest.
You just have to remember that templates are generated at compile time and include them wherever they are needed. Like Aaron has pointed out, you can even force generate a template for a specific type or function so it's accessible to all units.
In this way, the definition of a function can be embedded in an arbitrary module and the rest of the modules won't complain that a function isn't defined.
I like to declare small templates and template interfaces in header files
and put large implementations in special files that are just glorified headers. You could put some special extension like .tpp .cppt or whatever to remind yourself that it is code you have to include somewhere (which is what I do).
It is a suitable alternative to storing large implementations in header files that must be pasted around just to refer to the type (or function signature). And it works absolutely fine, for years now.
So for example, when I am ready to compile my big program, I might have a file called structures.cpp that I designate to implement lots of small structures I use, as well as instantiate all the templates for my project.
all the other .cpp files in the project need to include "mylib/template_structs.h" in order to create instances of templates and call functions with them. whereas structures.cpp only needs to include "mylib/template_structs.cppt" which in turn may include template_structs.h
or else structures.cpp would have to include that as well first.
If structures.cpp calls all the functions that any other .cpp files would call for that template then we are done, if not, then you'd need the extra step of something like
template class mynamespace::queue<int> ;
to generate all the other definitions the rest of the project's modules would need.
The problem is that, when compiling the cpp file, the cpp file includes the .h file and then the .h file includes the .cpp file. Then you have two copies of the cpp code in the same 'translation unit' at the same time.
But there are a few different solutions to this, it depends what your ultimate goal is.
The simplest, and most flexible solution is simply to remove all the template stuff from the .cpp file and put it into the .h file instead. You might think this is bad design, you've probably been taught to keep declarations and definitions in separate files, but this is how templates are usually implemented. (Welcome to the weird and wonderful world of C++ templates!)
But, perhaps these are to be 'private' templates, only to be used from one .cpp file. In that case, the best thing to do is simply to move everything from the .h file into the .cpp file.
There is a third approach, which doesn't get enough attention in my opinion. First, remove the #include "MyQueue.cpp" from your .h file, and recompile. It's quite possible that will just work for you. However, if your project has multiple .cpp files, you might get linker errors about undefined reference to MyQueue<string> :: MyQueue(). (where string is replaced with whatever you are putting in your queue. These linker errors can be fixed by placing template MyQueue<string>; at the end of the file that has the definitions of the templates (your MyQueue.cpp). This means you have to do this once for each type that you plan to store in your queue, but you might see this as an advantage as it helps you remember which types are supported by your queue.
when you include something it replaces the included file with the code within so when you call
#include "MyQueue.cpp"
it replaces that with the cpp file, then your cpp file redefines it.
Getting rid of the line will fix it.

Is #include'ing .cpp template implementation files a bad practice?

In a project I'm working on, I have a fairly large templated class that I've implemented like this:
I have my header file
// MyBigClass.h
#ifndef MYBIGCLASS_H
#define MYBIGCLASS_H
template <typename T>
class MyBigClass {
/* -- snip -- */
};
#include "MyBigClass.cpp"
#include "MyBigClass_iterator.cpp"
#include "MyBigClass_complicatedFunctionality_1.cpp"
#include "MyBigClass_complicatedFunctionality_2.cpp"
#endif
And then all of my implementation files look basically like this:
// MyBigClass_foobar.cpp
template <typename T>
void MyBigClass<T>::member_1(){
/* -- snip -- */
}
template <typename T>
int MyBigClass<T>::member_2(int foo, T & bar){
/* -- snip -- */
}
// etc, etc
In main.cpp, I just include MyBigClass.h, and everything works and compiles fine. The reason I've split the implementation into many files is because I prefer working on three or four 200-400 line files, versus one 1200 line file. The files themselves are fairly logically organized, containing for example only the implementation of a nested class, or a group of interrelated member functions.
My question is, is this something that is done? I got a strange reaction when I showed this to someone the other day, so I wanted to know if this is a bad practice, or if there is a better, more usual way to accomplish something like this.
It's convention to generally not include cpp files (there are limited and exotic cases when this is done), that's probably the reason you got the weird looks.
Usually this separation is done by moving the implementation to an .impl or even a .h file instead of a cpp file.
And no, there's nothing wrong with separating the implementation of templates and including the file in the header.
So... include .cpp is a bad practice, but what prevent you from using header files instead of .cpp? boost use .ipp files for example...
Having worked on C/C++ projects with thousand or more files, this is the practice I've generally observed:
Keep the class definitions in the header file. Do not add any implementation in the header file. The exceptions being inline functions and templates. I'll come to that in the end.
Keep all of your function and method implementations in .c or .cpp files.
There are fundamental reasons of doing so.
.h files act as a reference point for anyone to use the classes/functions/ and other data structures and APIs you've implemented in your code. Anyone unaware of the class structure will refer the .h first.
You can distribute a library revealing only the .h file without revealing your actual implementation. Usually they come up with external APIs (again in .h files) which are the only entry points to any code within the library and can be used by 3rd parties if they wish to share the code.
If you include a .c or .cpp file in multiple places - you'll not see any errors during the compilation - but the linker will bail out complaining about duplicate symbols. Since it has more than one copy of all those functions/methods part of the .c/.cpp file you included.
Exceptions
Inline function implementations need to be present in .h files to be effective. There are cases when the compiler might automatically be able to decide that a piece of code can be inlined - but in some cases it requires the presence of the inline keyword. NOTE: I'm only talking about the latter here. The compiler will inline any function only if it sees the inline keyword. In other words if a .cpp has the following piece of code:
class A;
A.my_inline_method();
If my_inline_method() is not visible to the compiler as an inline function when compiling this cpp file, it will not inline this function.
Templates are similar to inline methods when it comes to compilation - when the code for a template needs to be generated by the compiler in the .cpp where it is used - it needs to know the entire implementation of that template. Since template code generation is compile-time and NOT runtime.
I've mentioned the more common philosophies behind this. Feel free to edit this answer to add more if I've missed.
More info on template stuff here: Why can templates only be implemented in the header file?
EDIT: Made changes based on #Forever's comment to avoid ambiguity.
I'm answering my own question to add that a somewhat standard extension for template implementation seems to be .tcc. It's recognized by github's syntax highlighter, and is also mentioned in the gcc man page:
C++ source files conventionally use one of the suffixes .C, .cc, .cpp, .CPP, .c++, .cp, or .cxx; C++ header files often use .hh, .hpp, .H, or (for shared template code) .tcc;
If I'm misunderstanding the intended use of the .tcc extension, please let me know and I will delete this answer!

Inclusion of header files in case of templates.

When we make a class we declare its functions in a header files and define them in a source file... then the header file can be included in the main file to use the class...
But if we declare a template class in header files and define it in the .cpp file and then if we include the header file in the main(containing int main) file then why does a linker error crop up... and the error does not crop up if we included the .cpp file(containing the header file ) in the main file... any answers plz?
Templates don't actually produce any object code at the point where the compiler reads their source code; they're (typically) only "instantiated" when something actually uses the template. So if you define a template function in one source file, and call it from another, the code for the template function doesn't get compiled at all: it's not in the first object file since nothing there needed it, and it's not in the second object file since the compiler didn't have access to the function's definition.
You define template functions in header files so that in each translation unit where something calls the template function, the compiler has access to its code and can compile a copy of it specialized with the appropriate template arguments.
Alternatively, you can use explicit instantiation: you define the template function in a .cpp file, and also tell the compiler exactly which types that it should compile the function for. This is harder to maintain, because you have to keep track of which instantiations are needed by the rest of the program. If something calls foo<float>(), but you've only explicitly instantiated foo<int>() and foo<char>(), you get a missing-symbol error.
You shouldn't #include a .cpp file from another .cpp file. Just put the template function definitions in the header together with their declarations.
A template is neither a class nor a function. Its a pattern that compiler uses to generate classes or functions.
Very nicely explained in HERE

where should "include" be put in C++

I'm reading some c++ code and Notice that there are "#include" both in the header files and .cpp files . I guess if I move all the "#include" in the file, let's say foo.cpp, to its' header file foo.hh and let foo.cpp only include foo.hh the code should work anyway taking no account of issues like drawbacks , efficiency and etc .
I know my "all of sudden" idea must be in some way a bad idea, but what is the exact drawbacks of it? I'm new to c++ so I don't want to read lots of C++ book before I can answer this question by myself. so just drop the question here for your help . thanks in advance.
As a rule, put your includes in the .cpp files when you can, and only in the .h files when that is not possible.
You can use forward declarations to remove the need to include headers from other headers in many cases: this can help reduce compilation time which can become a big issue as your project grows. This is a good habit to get into early on because trying to sort it out at a later date (when its already a problem) can be a complete nightmare.
The exception to this rule is templated classes (or functions): in order to use them you need to see the full definition, which usually means putting them in a header file.
The include files in a header should only be those necessary to support that header. For example, if your header declares a vector, you should include vector, but there's no reason to include string. You should be able to have an empty program that only includes that single header file and will compile.
Within the source code, you need includes for everything you call, of course. If none of your headers required iostream but you needed it for the actual source, it should be included separately.
Include file pollution is, in my opinion, one of the worst forms of code rot.
edit: Heh. Looks like the parser eats the > and < symbols.
You would make all other files including your header file transitively include all the #includes in your header too.
In C++ (as in C) #include is handled by the preprocessor by simply inserting all the text in the #included file in place of the #include statement. So with lots of #includes you can literally boast the size of your compilable file to hundreds of kilobytes - and the compiler needs to parse all this for every single file. Note that the same file included in different places must be reparsed again in every single place where it is #included! This can slow down the compilation to a crawl.
If you need to declare (but not define) things in your header, use forward declaration instead of #includes.
While a header file should include only what it needs, "what it needs" is more fluid than you might think, and is dependent on the purpose to which you put the header. What I mean by this is that some headers are actually interface documents for libraries or other code. In those cases, the headers must include (and probably #include) everything another developer will need in order to correctly use your library.
Including header files from within header files is fine, so is including in c++ files, however, to minimize build times it is generally preferable to avoid including a header file from within another header unless absolutely necessary especially if many c++ files include the same header.
.hh (or .h) files are supposed to be for declarations.
.cpp (or .cc) files are supposed to be for definitions and implementations.
Realize first that an #include statement is literal. #include "foo.h" literally copies the contents of foo.h and pastes it where the include directive is in the other file.
The idea is that some other files bar.cpp and baz.cpp might want to make use of some code that exists in foo.cc. The way to do that, normally, would be for bar.cpp and baz.cpp to #include "foo.h" to get the declarations of the functions or classes that they wanted to use, and then at link time, the linker would hook up these uses in bar.cpp and baz.cpp to the implementations in foo.cpp (that's the whole point of the linker).
If you put everything in foo.h and tried to do this, you would have a problem. Say that foo.h declares a function called doFoo(). If the definition (code for) this function is in foo.cc, that's fine. But if the code for doFoo() is moved into foo.h, and then you include foo.h inside foo.cpp, bar.cpp and baz.cpp, there are now three definitions for a function named doFoo(), and your linker will complain because you are not allowed to have more than one thing with the same name in the same scope.
If you #include the .cpp files, you will probably end up with loads of "multiple definition" errors from the linker. You can in theory #include everything into a single translation unit, but that also means that everything must be re-built every time you make a change to a single file. For real-world projects, that is unacceptable, which is why we have linkers and tools like make.
There's nothing wrong with using #include in a header file. It is a very common practice, you don't want to burden a user a library with also remembering what other obscure headers are needed.
A standard example is #include <vector>. Gets you the vector class. And a raft of internal CRT header files that are needed to compile the vector class properly, stuff you really don't need nor want to know about.
You can avoid multiple definition errors if you use "include guards".
(begin myheader.h)
#ifndef _myheader_h_
#define _myheader_h_
struct blah {};
extern int whatsit;
#endif //_myheader_h_
Now if you #include "myheader.h" in other header files, it'll only get included once (due to _myheader_h_ being defined). I believe MSVC has a "#pragma once" with the equivalent functionality.