Related
I started writing a simple interpreter in C++ with a class structure that I will describe below, but I quit and rewrote the thing in Java because headers were giving me a hard time. Here's the basic structure that is apparently not allowed in C++:
main.cpp contains the main function and includes a header for a class we can call printer.h (whose single void method is implemented in printer.cpp). Now imagine two other classes which are identical. Both want to call Printer::write_something();, so I included printer.h in each. So here's my first question: Why can I #include <iostream> a million times, even one after the other, but I can only include my header once? (Well, I think I could probably do the same thing with mine, as long as it's in the same file. But I may be wrong.) I understand the difference between a declaration and an implementation/definition, but that code gives me a class redefinition error. I don't see why. And here's the thing that blows my mind (and probably shows you why I don't understand any of this): I can't just include printer.h at the top of main.cpp and use the class from my other two classes. I know I can include printer.h in one of the two classes (headers) with no trouble, but I don't see why this is any different than just including it before I include the class in main.cpp (as doing so gives me a class not found error).
When I got fed up, I thought about moving to C since the OOP I was using was quite forced anyway, but I would run into the same problem unless I wrote everything in one file. It's frustrating to know C++ but be unable to use it correctly because of compilation issues.
I would really appreciate it if you could clear this up for me. Thanks!
Why can I #include a million times, even one after the other, but I can only include my header once?
It is probably because your header doesn't have an include guard.
// printer.h file
#ifndef PRINTER_H_
#define PRINTER_H_
// printer.h code goes here
#endif
Note that it is best practice to chose longer names for the include guard defines, to minimise the chance that two different headers might have the same one.
Most header files should be wrapped in an include guard:
#ifndef MY_UNIQUE_INCLUDE_NAME_H
#define MY_UNIQUE_INCLUDE_NAME_H
// All content here.
#endif
This way, the compiler will only see the header's contents once per translation unit.
C/C++ compilation is divided in compilation/translation units to generate object files. (.o, .obj)
see here the definition of translation unit
An #include directive in a C/C++ file results in the direct equivalent of a simple recursive copy-paste in the same file. You can try it as an experiment.
So if the same translation unit includes the same header twice the compiler sees that some entities are being defined multiple times, as it would happen if you write them in the same file. The error output would be exactly the same.
There is no built-in protection in the language that prevents you to do multiple inclusions, so you have to resort to write the include guard or specific #pragma boilerplate for each C/C++ header.
Just out of curiosity I wanted to know if is there a way to achieve this.
In C++ we learn that we should avoid using macros. But when we use include guards, we do use at least one macro. So I was wondering if there is a way to write a macro-free program.
It's definitely possible, though it's unimaginably bad practice not to have include guards. It's important to understand what the #include statement actually does: the contents of another file are pasted directly into your source file before it's compiled. An include guard prevents the same code from being pasted again.
Including a file only causes an error if it would be incorrect to type the contents of that file at the position you included it. As an example, you can declare (note: declare, not define) the same function (or class) multiple times in a single compilation unit. If your header file consists only of declarations, you don't need to specify an include guard.
IncludedFile.h
class SomeClassSomewhere;
void SomeExternalFunction(int x, char y);
Main.cpp
#include "IncludedFile.h"
#include "IncludedFile.h"
#include "IncludedFile.h"
int main(int argc, char **argv)
{
return 0;
}
While declaring a function (or class) multiple times is fine, it isn't okay to define the same function (or class) more than once. If there are two or more definitions for a function, the linker doesn't know which one to choose and gives up with a "multiply defined symbols" error.
In C++, it's very common for header files to include class definitions. An include guard prevents the #included file from being pasted into your source file a second time, which means your definitions will only appear once in the compiled code, and the linker won't be confused.
Rather than trying to figure out when you need to use them and when you don't, just always use include guards. Avoiding macros most of the time is a good idea; this is one situation where they aren't evil, and using them here isn't dangerous.
It is definitely doable and I have used some early C++ libraries which followed an already misguided approach from C which essentially required the user of a header to include certain other headers before this. This is based on thoroughly understanding what creates a dependency on what else and to use declarations rather than definitions wherever possible:
Declarations can be repeated multiple times although they are obviously required to be consistent and some entities can't be declared (e.g. enum can only be defined; in C++ 2011 it is possible to also declare enums).
Definitions can't be repeated but are only needed when the definition if really used. For example, using a pointer or a reference to a class doesn't need its definition but only its declaration.
The approach to writing headers would, thus, essentially consist of trying to avoid definitions as much as possible and only use declaration as far as possible: these can be repeated in a header file or corresponding headers can even be included multiple times. The primary need for definitions comes in when you need to derive from a base class: this can't be avoided and essentially means that the user would have to include the header for the base class before using any of the derived classes. The same is true for members defined directly in the class but using the pimpl-idiom the need for member definitions can be pushed to the implementation file.
Although there are a few advantages to this approach it also has a few severe drawbacks. The primary advantage is that it kind of enforces a very thorough separation and dependency management. On the other hand, overly aggressive separation e.g. using the pimpl-idiom for everything also has a negative performance impact. The biggest drawback is that a lot the implementation details are implicitly visible to the user of a header because the respective headers this one depends on need to be included first explicitly. At least, the compiler enforces that you get the order of include files right.
From a usability and dependency point of view I think there is a general consensus that headers are best self-contained and that the use of include guards is the lesser evil.
It is possible to do so if you ensure the same header file is not being included in the same translation unit multiple times.
Also, you could use:
#pragma once
if portability is not your concern.
However, you should avoid using #pragma once over Include Guards because:
It is not standard & hence non portable.
It is less intuitive and not all users might know of it.
It provides no big advantage over the classic and very well known Include Guards.
In short, yes, even without pragmas. Only if you can guarantee that every header file is included only once. However, given how code tends to grow, it becomes increasingly difficult to honour that guarantee as the number of header files increase. This is why not using header guards is considered bad practice.
Pre-processor macros are frowned upon, yes. However, header include guards are a necessary evil because the alternative is so much worse (#pragma once will only work if your compiler supports it, so you lose portability)
With regard to pre-processor macros, use this rule:
If you can come up with an elegant solution that does not involve a macro, then avoid them.
Does the non-portable, non-standard
#pragma once
work sufficiently well for you? Personally, I'd rather use macros for preventing reinclusion, but that's your decision.
Is it reasonable to put custom headers higher in include section than standard headers?
For example include section in someclass.hpp:
#include "someclass.h"
#include "global.h"
#include <iostream>
#include <string>
Is it best practice? What is the profit if it is?
The reason is that if you forget to include a dependent header in someclass.h, then whatever implementation file includes it as the first header, will get a warning/error of undefined or undeclared type, and whatnot. If you include other headers first, then you could be masking that fact - supposing the included headers define the required types, functions, etc. Example:
my_type.h:
// Supressed include guards, etc
typedef float my_type;
someclass.h:
// Supressed include guards, etc
class SomeClass {
public:
my_type value;
};
someclass.cpp:
#include "my_type.h" // Contains definition for my_type.
#include "someclass.h" // Will compile because my_type is defined.
...
This will compile fine. But imagine you want to use use SomeClass in your program. If you don't include my_type.h before including someclass.h, you'll get a compiler error saying my_type is undefined. Example:
#include "someclass.h"
int main() {
SomeClass obj;
obj.value = 1.0;
}
It is fairly common practice to #include "widget.h" as the first thing in widget.cpp. What this does is ensure that widget.h is self-contained, i.e. does not inadvertently depend on other header files.
Beyond that, I think it's essentially a matter of personal preference.
There are two important observations to be made before delving in the specifics:
When you develop a new header/source pair, it is important to check that the header is self-contained. To do so, the easiest way is to include first in a file.
It is best not to include extraneous things before including a header you do not own, as this could create strange issues in case of conflict of macros or overload of functions.
Therefore, the answer depend if you have unit test or not.
A general rule of thumb is to include headers starting with the Standard Library, then 3rd party headers (including Open Source projects), then your own middleware, utilities, etc... and finally the headers local to this library. It more or less follows the order of dependencies to comply with observation 2.
The only exception I have seen was the one header corresponding to the current source file, which would be included first to make sure it is self-contained (observation 1)... but this only holds if you don't have unit tests, for if you do then the unit test source file is a very good place to check this.
While it is just personal choice, I would prefer to include standard headers first. Few reasons:
Any set of #ifdef..#define would be correctly mapped, rather than standard headers misinterpreting them. This goes for conditional compilation as well as values of some macros, while standard headers are being compiled.
Any change/new function in standard header may conflict with your function, and compiler would emit error in header file, which would be be complicated to solve.
All required standard headers should be placed in one header (preferbly some pre-compiled-header), include that header, and then include your custom header. This would reduce compilation time.
Start with the system headers.
If there are no dependencies between the headers both ways work, but since programming is essentially communication, not with the computer but with other humans, it is important to make it logical and easy to understand. And my opinion is that it is better to start with the system headers.
I base this one of my very first programming courses (in 1984, I think), where we programmed in Lisp and were taught to think like this: you start with the normal Lisp language, and then you create a new language that is more useful for your application by adding some functions and data types. If you for example add dates and the ability to manipulate dates, this new language could be called Lisp-with-dates. Then you could use Lisp-with-dates to create a new language with calendar functionality, which could be called Lisp-with-calendars. Like layers in an onion.
Similarly, you can view C as having a "core" language, without any headers, and then you can for example expand this language into a new, bigger language with I/O functionality by #including stdio.h. You add more and more stuff to the core language by #including more headers. (I am aware that the term "C language" in other contexts refers to the entire standard, with all the standard headers, but bear with me here.) Each new #included header creates a new, bigger language, and an additional layer of the onion.
Now, to me it seems that the standard headers obviously should be the inner part of this onion, and therefore before the custom headers. You can create the language C-with-monsters by adding stuff to C-with-I/O, but the people who created C-with-I/O did not start with C-with-monsters.
any place you include c++ compiler treats it as the same
I have a static library that I am building in C++. I have separated it into many header and source files. I am wondering if it's better to include all of the headers that a client of the library might need in one header file that they in turn can include in their source code or just have them include only the headers they need? Will that cause the code to be unecessary bloated? I wasn't sure if the classes or functions that don't get used will still be compiled into their products.
Thanks for any help.
Keep in mind that each source file that you compile involves an independent invocation of the compiler. With each invocation, the compiler has to read in every included header file, parse through it, and build up a symbol table.
When you use one of these "include the world" header files in lots of your source files, it can significantly impact your build time.
There are ways to mitigate this; for example, Microsoft has a precompiled header feature that essentially saves out the symbol table for subsequent compiles to use.
There is another consideration though. If I'm going to use your WhizzoString class, I shouldn't have to have headers installed for SOAP, OpenGL, and what have you. In fact, I'd rather that WhizzoString.h only include headers for the types and symbols that are part of the public interface (i.e., the stuff that I'm going to need as a user of your class).
As much as possible, you should try to shift includes from WhizzoString.h to WhizzoString.cpp:
OK:
// Only include the stuff needed for this class
#include "foo.h" // Foo class
#include "bar.h" // Bar class
public class WhizzoString
{
private Foo m_Foo;
private Bar * m_pBar;
.
.
.
}
BETTER:
// Only include the stuff needed by the users of this class
#include "foo.h" // Foo class
class Bar; // Forward declaration
public class WhizzoString
{
private Foo m_Foo;
private Bar * m_pBar;
.
.
.
}
If users of your class never have to create or use a Bar type, and the class doesn't contain any instances of Bar, then it may be sufficient to provide only a forward declaration of Bar in the header file (WhizzoString.cpp will have #include "bar.h"). This means that anyone including WhizzoString.h could avoid including Bar.h and everything that it includes.
In general, when linking the final executable, only the symbols and functions that are actually used by the program will be incorporated. You pay only for what you use. At least that's how the GCC toolchain appears to work for me. I can't speak for all toolchains.
If the client will always have to include the same set of header files, then it's okay to provide a "convience" header file that includes others. This is common practice in open-source libraries. If you decide to provide a convenience header, make it so that the client can also choose to include specifically what is needed.
To reduce compile times in large projects, it's common practice to include the least amount of headers as possible to make a unit compile.
what about giving both choices:
#include <library.hpp> // include everything
#include <library/module.hpp> // only single module
this way you do not have one huge include file, and for your separate files, they are stacked neatly in one directory
It depends on the library, and how you've structured it. Remember that header files for a library, and which pieces are in which header file, are essentially part of the API of the library. So, if you lead your clients to carefully pick and choose among your headers, then you will need to support that layout for a long time. It is fairly common for libraries to export their whole interface via one file, or just a few files, if some part of the API is truly optional and large.
A consideration should be compilation time: If the client has to include two dozen files to use your library, and those includes have internal includes, it can significantly increase compilation time in a big project, if used often. If you go this route, be sure all your includes have proper include guards around not only the file contents, but the including line as well. Though note: Modern GCC does a very good job of this particular issue and only requires the guards around the header's contents.
As to bloating the final compiled program, it depends on your tool chain, and how you compiled the library, not how the client of the library included header files. (With the caveat that if you declare static data objects in the headers, some systems will end up linking in the objects that define that data, even if the client doesn't use it.)
In summary, unless it is a very big library, or a very old and cranky tool chain, I'd tend to go with the single include. To me, freezing your current implementation's division into headers into the library's API is bigger worry than the others.
The problem with single file headers is explained in detail by Dr. Dobbs, an expert compiler writer. NEVER USE A SINGLE FILE HEADER!!! Each time a header is included in a .cc/.cpp file it has to be recompiled because you can feed the file macros to alter the compiled header. For this reason, a single header file will dramatically increase compile time without providing any benifit. With C++ you should optimize for human time first, and compile time is human time. You should never, because it dramatically increases compile time, include more than you need to compile in any header, each translation unit(TU) should have it's own implementation (.cc/.cpp) file, and each TU named with unique filenames;.
In my decade of C++ SDK development experience, I religiously ALWAYS have three files in EVERY module. I have a config.h that gets included into almost every header file that contains prereqs for the entire module such as platform-config and stdint.h stuff. I also have a global.h file that includes all of the header files in the module; this one is mostly for debugging (hint enumerate your seams in the global.h file for better tested and easier to debug code). The key missing piece here is that ou should really have a public.h file that includes ONLY your public API.
In libraries that are poorly programmed, such as boost and their hideous lower_snake_case class names, they use this half-baked worst practice of using a detail (sometimes named 'impl') folder design pattern to "conceal" their private interface. There is a long background behind why this is a worst practice, but the short story is that it creates an INSANE amount of redundant typing that turns one-liners into multi-liners, and it's not UML compliant and it messes up the UML dependency diagram resulting in overly complicated code and inconsistent design patterns such as children actually being parents and vice versa. You don't want or need a detail folder, you need to use a public.h header with a bunch of sibling modules WITHOUT ADDITIONAL NAMESPACES where your detail is a sibling and not a child that is in reatliy a parent. Namespaces are supposed to be for one thing and one thing only: to interface your code with other people's code, but if it's your code you control it and you should use unique class and funciton names because it's bad practice to use a namesapce when you don't need to because it may cause hash table collision that slow downt he compilation process. UML is the best pratice, so if you can organize your headers so they are UML compliant then your code is by definition more robust and portable. A public.h file is all you need to expose only the public API; thanks.
Should every .C or .cpp file should have a header (.h) file for it?
Suppose there are following C files :
Main.C
Func1.C
Func2.C
Func3.C
where main() is in Main.C file. Should there be four header files
Main.h
Func1.h
Func2.h
Func3.h
Or there should be only one header file for all .C files?
What is a better approach?
For a start, it would be unusual to have a main.h since there's usually nothing that needs to be exposed to the other compilation units at compile time. The main() function itself needs to be exposed for the linker or start-up code but they don't use header files.
You can have either one header file per C file or, more likely in my opinion, a header file for a related group of C files.
One example of that is if you have a BTree implementation and you've put add, delete, search and so on in their own C files to minimise recompilation when the code changes.
It doesn't really make sense in that case to have separate header files for each C file, as the header is the API. In other words, it's the view of the library as seen by the user. People who use your code generally care very little about how you've structured your source code, they just want to be able to write as little code as possible to use it.
Forcing them to include multiple distinct header files just so they can create, insert into, delete from, and search, a tree, is likely to have them questioning your sanity :-)
You would be better off with one btree.h file and a single btree.lib file containing all of the BTree object files that were built from the individual C files.
Another example can be found in the standard C headers.
We don't know for certain whether there are multiple C files for all the stdio.h functions (that's how I'd do it but it's not the only way) but, even if there were, they're treated as a unit in terms of the API.
You don't have to include stdio_printf.h, stdio_fgets.h and so on - there's a single stdio.h for the standard I/O part of the C runtime library.
Header files are not mandatory.
#include simply copy/paste whatever file included (including .c source files)
Commonly used in real life projects are global header files like config.h and constants.h that contains commonly used information such as compile-time flags and project wide constants.
A good design of a library API would be to expose an official interface with one set of header files and use an internal set of header files for implementation with all the details. This adds a nice extra layer of abstraction to a C library without adding unnecessary bloat.
Use common sense. C/C++ is not really for the ones without it.
I used to follow the "it depends" trend until I realized that consistency, uniformity and simplicity are more important than saving the effort to create a file, and that "standards are good even when they are bad".
What I mean is the following: a .cpp/.h pair of files is pretty much what all "modules" end up anyway. Making the existing of both a requirement saves a lot of confusion and bad engineering.
For instance, when I see some interface of something in a header file, I know exactly where to search for / place its implementation. Conversely, if I need to expose the interface of something that was previously hidden in .cpp file (e.g. static function becoming global), I know exactly where to put it.
I've seen too many bad consequences of not following this simple rule. Unnecessary inline functions, breaking any kind of rules about encapsulation, (non)separation of the interface and implementation, misplaced code, to name a few -- all due to the fact that the appropriate sibling header or cpp file was never added.
So: always define both .h and .c files. Make it a standard, follow it, and safely rely on it. Life is much simpler this way, and simplicity is the most important thing in software.
Generally it's best to have a header file for each .c file, containing the declarations for functions etc in the .c file that you want to expose. That way, another .c file can include the .h file for the functions it needs, and won't need to be recompiled if a header file it didn't include got changed.
Generally there will be one .h file for each .c/.cpp file.
Bjarne Stroustrup Explains it beautifully in his book "The C++ Programming Language"....
The single header style of physical partitioning is most useful when the program is small and its parts are not intended for separate use. When namespaces are used, the logical structure of the program can still be explained in a single header file.
For larger Programs, the single header file approach is unworkable in a conventional file-based development environment. A change to the common header forces recompilation of the whole program, and updates of that single header by several programmers are error prone. Unless strong emphasis is placed on programming styles relying heavily on namespaces and classes, the logical structure deteriorates as program grows.
An alternative physical organization lets each logical module have its own header defining the facilities it provides. Each .c file then has a corresponding h. file specifying what it provides(its interface). Each .c module includes its own .h file and usually also other .h files that specifies what it needs from other modules in order to implement the services advertised in its interface. This physical organization corresponds to the logical organization of a module. The multiple header approach makes it easy to determine the dependencies. The single header approach forces us to look at every declarations used by any module and decide if its relevant. The simple fact is that maintenance of a code is invariably done with incomplete information and from a local perspective.
The better localization leads to less information to compile a module and thus faster compilation..
It depends. Usually your reason for having separate .c files will dictate whether you need separate .h files.
Generally cpp/c files are for implementation and h/hpp (hpp are not used often) files are for header files (prototypes and declarations only). Cpp files don't always have to have a header file associated with it but it usually does as the header file acts like a bridge between cpp files so each cpp file can use code from another cpp file.
One thing that should be strongly enforced is the no use of code within a header file! There's been too many times where header files break compiles in any size project because of redefinitions. And that's simply when you include the header file in 2 different cpp files. Header files should always be designed to be included multiple times as well. Cpp files should never be included.
It's all about what code needs to be aware of what other code. You want to reduce the amount other files are aware of to the bare minimum for them to do their jobs.
They need to know that a function exists, what types they need to pass into it, and what types it will return, but not what it's doing internally. Note that it's also important from the programmers point of view to know what those types actually mean. (e.g which int is the row, and which is the column) but the code itself doesn't care. This is why naming the function and parameters sensibly is worthwhile.
As others have said, if there's nothing in a cpp file worth exposing to other parts of the code, as is normally the case with main.c, then there's no need for a header file.
It's occasionally worth putting everything you want to expose in a single header file (e.g, Func1and2and3.h), so that anything that knows about Func1 knows about Func2 as well, but I'm personally not keen on this, as it means that you tend to load a hell of a lot of junk along with the stuff you actually want.
Summary:
Imagine that you trust that someone can write code and that their algorithms, design, etc. are all good. You want to use code they've written. All you need to know is what to give them to get something to happen, what you should give it to, and what you'll get back. That's what needs to go in the header files.
I like putting interfaces into header files and implementation in cpp files. I don't like writing C++ where I need to add member variables and prototypes to the header and then the method again in the C++. I prefer something like:
module.h
struct IModuleInterface : public IUnknown
{
virtual void SomeMethod () = 0;
}
module.cpp
class ModuleImpl : public IModuleInterface,
public CObject // a common object to do the reference
// counting stuff for IUnknown (so we
// can stick this object in a smart
// pointer).
{
ModuleImpl () : m_MemberVariable (0)
{
}
int m_MemberVariable;
void SomeInternalMethod ()
{
// some internal code that doesn't need to be in the interface
}
void SomeMethod ()
{
// implementation for the method in the interface
}
// whatever else we need
};
I find this is a really clean way of separating implementation and interface.
There is no better approach, only common and less common cases.
The more common case is when you have a class/function interface to declare/define. It's better to have only one .cpp/.c with the definitions, and one header for the declarations.
Giving them the same name makes easy to understand that they are directly related.
But that's not a "rule", that's the common way and the most efficient in almost all cases.
Now in some cases( like template classes or some tiny struct definition ) you'll not need any .c/.cpp file, just the header. We often have some virtual class interface definition in only a header file for example, with only virtual pure functions or trivial functions.
And in other rare cases (like an hypothetical main.c/.cpp file) if wouldn't be always required to allow code from external compilation unit to call the function of a given compilation unit. The main function is an example (no header/declaration needed), but there are others, mostly when it's code that "connect all the other parts together" and is not called by other parts of the application. That's very rare but in this case a header make no sense.
If your file exposes an interface - that is, if it has functions which will be called from other files - then it should have a header file. Otherwise, it shouldn't.
As already noted, generally, there will be one header (.h) file for each source (.c or .cpp) file.
However, you should look at the cohesiveness of the files. If the various source files provide separate, individually reusable sets of functions - an ideal organization - then you should certainly have one header per file. If, however, the three source files provide a composite set of functions (that is too big to fit into one file), then you would use a more complex organization. There would be one header for the external services used by the main program - and that would be used by other programs needing the same services. There would also be a second header used by the cooperating source files that provides 'internal' definitions shared by those files.
(Also noted by Pax): The main program does not normally need its own header - no other source code should be using the services it provides; it uses the services provided by other files.
If you want your compiled code to be used from another compilation unit you will need the header files. There are some situations for which you do now need/want to have a headers.
The first such case are main.c/cpp files. This class is not meant to be included and as such there is no need for a header file.
In some cases you can have a header file that defines behavior of a set of different implementations that are loaded through a dll that is loaded at runtime. There will be different set of .c/.cpp files that implement variations of the same header. This can be common in plugin systems.
In general, I don't think there is any explicit relationship between .h and .c files. In many cases (probably most), a unit of code is a library of functionality with a public interface (.h) and an opaque implementation (.c). Sometimes a number of symbols are needed, like enums or macros, and you get a .h with no corresponding .c and in a few circumstances, you will have a lump of code with no public interface and no corresponding .h
in particular, there are a number of times when, for the sake of readability, the headers or implementations (seldom both) are so big and hairy that they end up being broken into many smaller files, for the sake of the programmer's sanity.