I'm reading some c++ code and Notice that there are "#include" both in the header files and .cpp files . I guess if I move all the "#include" in the file, let's say foo.cpp, to its' header file foo.hh and let foo.cpp only include foo.hh the code should work anyway taking no account of issues like drawbacks , efficiency and etc .
I know my "all of sudden" idea must be in some way a bad idea, but what is the exact drawbacks of it? I'm new to c++ so I don't want to read lots of C++ book before I can answer this question by myself. so just drop the question here for your help . thanks in advance.
As a rule, put your includes in the .cpp files when you can, and only in the .h files when that is not possible.
You can use forward declarations to remove the need to include headers from other headers in many cases: this can help reduce compilation time which can become a big issue as your project grows. This is a good habit to get into early on because trying to sort it out at a later date (when its already a problem) can be a complete nightmare.
The exception to this rule is templated classes (or functions): in order to use them you need to see the full definition, which usually means putting them in a header file.
The include files in a header should only be those necessary to support that header. For example, if your header declares a vector, you should include vector, but there's no reason to include string. You should be able to have an empty program that only includes that single header file and will compile.
Within the source code, you need includes for everything you call, of course. If none of your headers required iostream but you needed it for the actual source, it should be included separately.
Include file pollution is, in my opinion, one of the worst forms of code rot.
edit: Heh. Looks like the parser eats the > and < symbols.
You would make all other files including your header file transitively include all the #includes in your header too.
In C++ (as in C) #include is handled by the preprocessor by simply inserting all the text in the #included file in place of the #include statement. So with lots of #includes you can literally boast the size of your compilable file to hundreds of kilobytes - and the compiler needs to parse all this for every single file. Note that the same file included in different places must be reparsed again in every single place where it is #included! This can slow down the compilation to a crawl.
If you need to declare (but not define) things in your header, use forward declaration instead of #includes.
While a header file should include only what it needs, "what it needs" is more fluid than you might think, and is dependent on the purpose to which you put the header. What I mean by this is that some headers are actually interface documents for libraries or other code. In those cases, the headers must include (and probably #include) everything another developer will need in order to correctly use your library.
Including header files from within header files is fine, so is including in c++ files, however, to minimize build times it is generally preferable to avoid including a header file from within another header unless absolutely necessary especially if many c++ files include the same header.
.hh (or .h) files are supposed to be for declarations.
.cpp (or .cc) files are supposed to be for definitions and implementations.
Realize first that an #include statement is literal. #include "foo.h" literally copies the contents of foo.h and pastes it where the include directive is in the other file.
The idea is that some other files bar.cpp and baz.cpp might want to make use of some code that exists in foo.cc. The way to do that, normally, would be for bar.cpp and baz.cpp to #include "foo.h" to get the declarations of the functions or classes that they wanted to use, and then at link time, the linker would hook up these uses in bar.cpp and baz.cpp to the implementations in foo.cpp (that's the whole point of the linker).
If you put everything in foo.h and tried to do this, you would have a problem. Say that foo.h declares a function called doFoo(). If the definition (code for) this function is in foo.cc, that's fine. But if the code for doFoo() is moved into foo.h, and then you include foo.h inside foo.cpp, bar.cpp and baz.cpp, there are now three definitions for a function named doFoo(), and your linker will complain because you are not allowed to have more than one thing with the same name in the same scope.
If you #include the .cpp files, you will probably end up with loads of "multiple definition" errors from the linker. You can in theory #include everything into a single translation unit, but that also means that everything must be re-built every time you make a change to a single file. For real-world projects, that is unacceptable, which is why we have linkers and tools like make.
There's nothing wrong with using #include in a header file. It is a very common practice, you don't want to burden a user a library with also remembering what other obscure headers are needed.
A standard example is #include <vector>. Gets you the vector class. And a raft of internal CRT header files that are needed to compile the vector class properly, stuff you really don't need nor want to know about.
You can avoid multiple definition errors if you use "include guards".
(begin myheader.h)
#ifndef _myheader_h_
#define _myheader_h_
struct blah {};
extern int whatsit;
#endif //_myheader_h_
Now if you #include "myheader.h" in other header files, it'll only get included once (due to _myheader_h_ being defined). I believe MSVC has a "#pragma once" with the equivalent functionality.
Related
Please refer to the first answer in this question about implementing templates.
Specifically, take note of this quote
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
I bolded the part that I'm most interested in.
What's the significance of a .tpp file? I tried doing exactly what was suggested in that page and it worked. But then, I changed the file extension to any random gibberish (like .zz or .ypp) and it still worked! Is it supposed to work? Does it matter if it's .tpp or any other extension? And why not use a .cpp?
Here's another thing I'm confused about.
If my implementations were written in a .cpp and the header defined non-templated functions, then I'd only need to compile the .cpp file once, right? at least until I changed something in the .cpp file.
But if I have a header defining templated functions and my implementations are in a file with a random funky extension, how is it compiled? And are the implementations compiled every time I compile whatever source code #includes said header?
Does it matter if it's .tpp or any other extension? And why not use a .cpp?
It does not matter what the extension is, but don't use .cpp because it goes against conventions (it will still work, but don't do it; .cpp files are generally source files). Other than that it's a matter of what your codebase uses. For example I (and the Boost codebase) use .ipp for this purpose.
What's the significance of a .tpp file?
It's used when you don't want the file that contains the interface of a module to contain all the gory implementation details. But you cannot write the implementation in a .cpp file because it's a template. So you do the best you can (not considering explicit instantiations and the like). For example
Something.hpp
#pragma once
namespace space {
template <typename Type>
class Something {
public:
void some_interface();
};
} // namespace space
#include "Something.ipp"
Something.ipp
#pragma once
namespace space {
template <typename Type>
void Something<Type>::some_interface() {
// the implementation
}
} // namespace space
I thought the whole point of writing definitions in headers and the implementations in a separate file is to save compilation time, so that you compile the implementations only once until you make some changes
You can't split up general template code into an implementation file. You need the full code visible in order to use the template, that's why you need to put everything in the header file. For more see Why can templates only be implemented in the header file?
But if the implementation file has some funky looking file extension, how does that work in terms of compiling? Is it as efficient as if the implementations were in a cpp?
You don't compile the .tpp, .ipp, -inl.h, etc files. They are just like header files, except that they are only included by other header files. You only compile source (.cpp, .cc) files.
Files extensions are meaningless to the preprocessor; there's nothing sacred about .h either. It's just convention, so other programmers know and understand what the file contains.
The preprocessor will allow you to include any file into any translation unit (it's a very blunt tool). Extensions like that just help clarify what should be included where.
Does it matter if it's .tpp or any other extension? And why not use a .cpp?
It doesn't matter much which extension is actually used, as long it is different from any of the standard extensions used for C++ translation units.
The reasoning is to have a different file extension as they are usually detected by any C++ build systems for translation units (.cpp, .cc, ...). Because translating these as a source file would fail. They have to be #included by the corresponding header file containing the template declarations.
But if the implementation file has some funky looking file extension, how does that work in terms of compiling?
It needs to be #included to be compiled as mentioned.
Is it as efficient as if the implementations were in a cpp?
Well, not a 100% as efficient regarding compile time like a pure object file generated from a translation unit. It will be compiled again, as soon the header containing the #include statement changes.
And are the implementations compiled every time I compile whatever source code #includes said header?
Yes, they are.
The file extensions of header files do not matter in C++, even though standard source-file extensions such as .cpp should be avoided.
However, there are established conventions. These help human programmers navigate the code. Calling template implementation files .tpp is one of such conventions.
Something nobody has mentioned yet is that some external tools might rely on such conventions.
For instance, I routinely employ a popular grep substitute which allows searching only in files of a given type. This program will recognize .tpp files as C++, but not, say, .zz files
I was wondering if there is some pro and contra having include statements directly in the include files as opposed to have them in the source file.
Personally I like to have my includes "clean" so, when I include them in some c/cpp file I don't have to hunt down every possible header required because the include file doesn't take care of it itself. On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards, the files have to be parsed first. Is this just a matter of taste, or are there any pros/cons over the other?
What I mean is:
sample.h
#ifdef ...
#include "my_needed_file.h"
#include ...
class myclass
{
}
#endif
sample.c
#include "sample.h"
my code goes here
Versus:
sample.h
#ifdef ...
class myclass
{
}
#endif
sample.c
#include "my_needed_file.h"
#include ...
#include "sample.h"
my code goes here
There's not really any standard best-practice, but for most accounts, you should include what you really need in the header, and forward-declare what you can.
If an implementation file needs something not required by the header explicitly, then that implementation file should include it itself.
The language makes no requirements, but the almost universally
accepted coding rule is that all headers must be self
sufficient; a source file which consists of a single statement
including the include should compile without errors. The usual
way of verifying this is for the implementation file to include
its header before anything else.
And the compiler only has to read each include once. If it
can determine with certainty that it has already read the file,
and on reading it, it detects the include guard pattern, it has
no need to reread the file; it just checks if the controling
preprocessor token is (still) defined. (There are
configurations where it is impossible for the compiler to detect
whether the included file is the same as an earlier included
file. In which case, it does have to read the file again, and
reparse it. Such cases are fairly rare, however.)
A header file is supposed to be treated like an API. Let us say you are writing a library for a client, you will provide them a header file for including in their code, and a compiled binary library for linking.
In such scenario, adding a '#include' directive in your header file will create a lot of problems for your client as well as you, because now you will have to provide unnecessary header files just to get stuff compiling. Forward declaring as much as possible enables cleaner API. It also enables your client to implement their own functions over your header if they want.
If you are sure that your header is never going to be used outside your current project, then either way is not a problem. Compilation time is also not a problem if you are using include guards, which you should have been using anyway.
Having more (unwanted) includes in headers means having more number of (unwanted) symbols visible at the interface level. This may create a hell lot of havocs, might lead to symbol collisions and bloated interface
On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards
If your compiler doesn't remember which files have include guards and avoid re-opening and re-tokenising the file then get a better compiler. Most modern compilers have been doing this for many years, so there's no cost to including the same file multiple times (as long as it has include guards). See e.g. http://gcc.gnu.org/onlinedocs/cpp/Once_002dOnly-Headers.html
Headers should be self-sufficient and include/declare what they need. Expecting users of your header to include its dependencies is bad practice and a great way to make users hate you.
If my_needed_file.h is needed before sample.h (because sample.h requires declarations/definitions from it) then it should be included in sample.h, no question. If it's not needed in sample.h and only needed in sample.c then only include it there, and my preference is to include it after sample.h, that way if sample.h is missing any headers it needs then you'll know about it sooner:
// sample.c
#include "sample.h"
#include "my_needed_file.h"
#include ...
#include <std_header>
// ...
If you use this #include order then it forces you to make sample.h self-sufficient, which ensures you don't cause problems and annoyances for other users of the header.
I think second approach is a better one just because of following reason.
when you have a function template in your header file.
class myclass
{
template<class T>
void method(T& a)
{
...
}
}
And you don't want to use it in the source file for myclass.cxx. But you want to use it in xyz.cxx, if you go with your first approach then you will end up in including all files that are required for myclass.cxx, which is of no use for xyz.cxx.
That is all what I think of now the difference. So I would say one should go with second approach as it makes your code each to maintain in future.
I read that it is really bad to include things in header files (for compilation speed). I often place the definitions of my functions templates in a MyClass.hxx file that I #include "MyClass.hxx" from "MyClass.h". Since MyClass.hxx needs all of my includes, and they are by definition included directly in the .h - that seems very bad. Is there any way to avoid this?
I asked a question relevant to this one a while back: pimpl for a templated class
Your other option is precompiled headers.
Unless you are running into unacceptable compile times I would just use good general practices to reduce compile times like forward declaring in the headers and #including in the cpps (when possible) and only #including what it necessary. The end result doesn't change even if you #include everything under the sun in every file, the unused stuff will get stripped.
I am working on porting some source code to a linux system, and as expected, some stuff is broken. One thing that is throwing an error for me right now, is that someone has a .h and a .cpp file that both use fclose()
The compiler is complaining about fclose() being undeclared in the header file.
here was the function declaration in the header file:
void closeFile() { if (fp) fclose(fp); }
Now, I think this is bad style, but also - how did they get this working before? Did their version of the compiler allow this kind of behavior?
Should I fix this by including stdio in the header, or move the whole thing to the cpp?
It's not bad style, you can put source code in header files, and some times you're forced to, in particular:
When defining a template class/function.
When defining an inline function.
Anyway you shouldn't put a free (defined outside a class scope) non-inline function in a header file, since that would be compiled any time a source file include such header (and this will give you a linking error).
If you're getting an error that states fclose hasn't been declared, it's probably because cstdio (or stdio.h) wasn't declared before that piece of code. Put an #include <cstdio> at the beginning of the header file.
Just to add 2 things about the other answers, remember inline is not an order, is more a guess of what the compiler should do, unless you force inline. Most compilers can decide when to inline a function, even if you did not declare it inline. It's not a must, it is a should.
Sometimes you really need to include such function/structs/classes definitions in the header, and sometimes they are not trivial or that simple. But those definitions are usually some auxiliar functions you need and you just don't want to include them on the main file, just to organize things better.
The best examples I ever seen (ok, and I didn't see those many) about this practice is game source code (that's my main interest :) ).
By the way, I usually do not put any declaration on header files (except structs and enums), I put them on separated files, unless if:
There are only a few declarations/aux functions to be made; or
These declarations are functions that I will use in another contex (of the same project, in the case) and I just think it's better to treat them as a separated 'auxiliar library' (within the same context).
Hope it helps somewhat.
Keep in mind that header files aren't compiled by themselves (usually). They are #included into source files and the definitions available when the header file is parsed are whatever was included in the source file above the header file in question.
Regardless of whether the header file is an appropriate place for the implementation of closeFile(), your header file should #include everything it needs at the top of itself. So, add #include <stdio.h> at the top of the header file.
(Note that if this is code intended to be compiled into the Linux kernel itself, you may need a different header than stdio.h. Often application-level headers aren't suitable to be used in the kernel sources.)
fclose() will be undefined if you haven't included stdio.h either in this header or before it everywhere this header is used. That is why the error occurs.
I personally see nothing wrong with the code being allowed to be in a header file, however, if it is, and you're using C99, you should declare it
inline void closeFile() { if (fp) fclose(fp); }
This means multiple compiled objects won't have closeFile() symbols (because of inline) and the inline hints to the compiler that this shouldn't be left as a function call but substituted inline which is probably want you want for speed.
The headers are included by textual substitutions (that is the whole content of the header is substituted to the #include declaration). So if there is only one .cpp file that include this particular header, this is equivalent with having the function defined in the .cpp file. I think this is the reason why it was working (at link time).
The C standard only define the header that must be included to have a function available but does not forbid the system header to include one another. So it is possible that on some system, the stdio.h header was implicitly included by another header (and thus no error reported by the compiler).
Personally, I would move such code to the .cpp file, as it will be less brittle (the header can be included by multiple .cpp files, the header will not require a previous inclusion of the stdio.h header) and will allow for quicker recompilation if the implementation must be changed (to add logging or proper error handling, as closing a file can fail).
In eclipse, whenever I create a new C++ class, or C header file, I get the following type of structure. Say I create header file example.h, I get this:
/*Comments*/
#ifndef EXAMPLE_H_
#define EXAMPLE_H_
/* Place to put all of my definitions etc. */
#endif
I think ifndef is saying that if EXAMPLE_H_ isn't defined, define it, which may be useful depending on what tool you are using to compile and link your project. However, I have two questions:
Is this fairly common? I don't see it too often. And is it a good idea to use that rubric, or should you just jump right into defining your code.
What is EXAMPLE_H_ exactly? Why not example.h, or just example? Is there anything special about that, or could is just be an artifact of how eclipse prefers to auto-build projects?
This is a common construct. The intent is to include the contents of the header file in the translation unit only once, even if the physical header file is included more than once. This can happen, for example, if you include the header directly in your source file, and it's also indirectly included via another header.
Putting the #ifndef wrapper around the contents means the compiler only parses the header's contents once, and avoids redefinition errors.
Some compilers allow "#pragma once" to do the same thing, but the #ifndef construct works everywhere.
This is just a common way to protect your includes - in this way it prevents the code from being included twice. And the identifier used could be anything, it's just convention to do it the way described.
Is it common? Yes - all C and C++ header files should be structured like this. EXAMPLE_H is a header guard, it prevents the code in the header being included more than once in the same translation unit, which would result in multiple definition errors. The name EXAPMLE_H is chosen to match the name of the header file it is guarding - it needs to be unique in your project and maybe globally as well. To try to ensure this, it's normal to prefix or suffix it with your project name:
#define MYPROJ_EXAMPLE_H
for example, if your project is called "myproj". Don't be tempted into thinking that prefixing with underscores will magically make it unique, by the way - names like _EXAMPLE_H_ and __EXAMPLE_H__ are illegal as they are reserved for the language implementation.
Always do this at the top of a header file. It's typically called a header guard or an include guard.
What it does is make it so that if a header file would be included multiple times, it will only be included once. If you don't do it, then you'll end up with errors about things being defined multiple times and things like that.
The exact define doesn't matter that much, though typically it's some variation on the file name. Basically, you're checking whether the given macro has been defined. If it hasn't, then define it, and continue with including the file. If it has, then you must have included the file previously, and the rest of the file is ignored.
This is an include guard. It guarantees that a header is included no more than once.
For example, if you were to:
#include "example.h"
#include "example.h"
The first time the header is included, EXAMPLE_H_ would not be defined and the if-block would be entered. EXAMPLE_H_ is then defined by the #define directive, and the contents of the header are evaluated.
The second time the header is included, EXAMPLE_H_ is already defined, so the if-block is not re-entered.
This is essential to help ensure that you do not violate the one definition rule. If you define a class in a header that didn't have include guards and included that header twice, you would get compilation errors due to violating the one definition rule (the class would be defined twice).
While the example above is trivial and you can easily see that you include example.h twice, frequently headers include other headers and it's not so obvious.
Consider this
File foo.c:
#include foo.h
#include bar.h
File bar.h
#include <iostream>
#include foo.h
Now, when we compile foo.c, we have foo.h in there twice! We definitely don't want this, because all the functions will throw compile errors the second time around.
To prevent this, we put the INCLUDE GUARD at the top. That way, if it's already been included, we define a preprocessor variable to tell us not to include it again.
It's very common (often mandated), and very frustrating if someone doesn't put one in there. You should be able to simply expect that each .h file has a header guard when you included. Of course, you know what they say when you assume things ("makes an ass of u and me") but that should be something you're expecting to see.
This is called an include guard and is indeed a common idiom for C/C++ header files. This allows the header file to be included multiple times without multiply including its contents.
The name EXAMPLE_H_ is an arbitrary convention but has to obey naming rules for C preprocessor macros, which excludes names like example.h. Since C macros are all defined in a single global namespace, it is important that you do not have different header files that use the same name for their include guard. Therefore, it is usually a good idea to include the name of your project or library in the include guard name:
#ifndef __MYPROJECT_EXAMPLE_H__
...