including a header file twice in c++ - c++

What happens if I include iostream or any other header file twice in my file?
I know the compiler does not throw error.
Will the code gets added twice or what happens internally?
What actually happens when we include a header file ?

Include guard prevents the content of the file from being actually seen twice by the compiler.
Include guard is basically a set of preprocessor's conditional directives at the beginning and end of a header file:
#ifndef SOME_STRING_H
#define SOME_STRING_H
//...
#endif
Now if you include the file twice then first time round macro SOME_STRING_H is not defined and hence the contents of the file is processed and seen by the compiler. However, since the first thing after #ifdef is #define, SOME_STRING_H is defined and the next time round the header file's content is not seen by the compiler.
To avoid collisions the name of the macro used in the include guard is made dependent on the name of the header file.

Header files are simple beasts. When you #include <header> all that happens is that the contents of header basically get copy-pasted into the file. To stop headers from being included multiple times, include guards are used, which is why in most header files you'll see something akin to
#ifndef SOME_HEADER_FILE_GUARD
#define SOME_HEADER_FILE_GUARD
//Contents of Header
#endif

It simply gets skipped over, due to preprocessor code along the following lines:
#ifndef MY_HEADER_H
#define MY_HEADER_H
<actual header code here>
#endif
So if you include twice, then MY_HEADER_H is already defined and everything between the #ifndef and #endif is skipped by the preprocessor.

It depends. With the exception of <assert>, the standard requires the
second (and later) includes of a standard header to be a no-op. This is
a characteristic of the header, however; the compiler will (at least
conceptually) read and include all of the header text each time it
encounters the include.
The standard practice for avoiding multiple definitions in such cases is
to use include guards: all of the C++ code in the header will be
enclosed in something like:
#ifndef SPECIAL_NAME
#define SPECIAL_NAME
// All of the C++ code here
#endif SPECIAL_NAME
Obviously, each header needs a different name. Within an application,
you can generally establish conventions based on the filename and
location; something like subsystem_filename, with
characters not legal in a C++ symbol (if you're using them in your
filenames) mapped (and very often everything upper cased). For
libraries, the best practice would be to generate a reasonably long
random sequence of characters; far more frequent (although certainly
inferior from a quality of implementation standpoint) is to ensure that
every such symbol begin with a documented prefix.
A system library can, of course, use reserved symbols (e.g. a symbol
starting with an underscore followed by a capital letter) here, to
guarantee that there is no conflict. Or it can use some entirely
different, implementation dependent technique. Microsoft, for example,
uses a compiler extension #pragma once; g++ uses include guards which
always start with _GLIBCXX (which isn't a legal symbol in user code).
These options aren't necessarily available to you.

When you include a header file, all its contents get copied to the line where the "#include" directive was placed. This is done by the preprocessor and is a step in the compilation process.
This process is the same for standard files like iostream or user-made ".h" files stored in local directory. However, the syntax slightly differs.
We use #include <filename> for files like 'iostream' which are stored in the library. Whereas, for header files in the local directory, we use #include "filename.h".
Now, what if we include header files twice:
Ideally speaking the content should be copied twice. But...
Many header files use the modern practice of mentioning #pragma once which instructs the pre-processor to copy contents only once, no matter how many times the header file is included.
Some very old codes use a concept called 'include gaurds'. I won't explain it as the other answers do so very well.
Using pragma once is the easy and the modern approach, however, include guards were used a lot previously and is a relatively complicated fix to this issue.

Related

Need clarification on #ifndef #define

The code I am working has multiple headers and source files for different classes face.cc, face.hh, cell.cc, cell.hh edge.cc edge.hh and the headers contain includes like this,
#ifndef cellINCLUDED
#define cellINCLUDED
#ifndef faceINCLUDED
#define faceINCLUDED
I saw through http://www.cplusplus.com/forum/articles/10627/ and saw the way to write include guard is
#ifndef __MYCLASS_H_INCLUDED__
#define __MYCLASS_H_INCLUDED__
So in above code that I am working on, does compiler automatically understands it is looking for face.hh or cell.hh files?
better question : Is writing __CELL_H_INCLUDED__ same as cellINCLUDED ?
#ifndef __MYCLASS_H_INCLUDED__
#define __MYCLASS_H_INCLUDED__
So in above code that I am working on, does compiler automatically
understands it is looking for face.hh or cell.hh files?
No, the compiler doesn't automatically understand what you mean.
What really happens is that, when compiling a translation unit, the Compiler holds a list of globally defined MACROs. And so, what you are doing is defining the MACRO __MYCLASS_H_INCLUDED__ if it doesn't already exists.
If that macro is defined, that #ifndef until #endif will not be parsed by the actual compiler.
Hence you can test for the existence of that MACRO to determine if the Compiler has parsed that header file to include it once and only once in the translation unit... This is because the compiler compiles each translation unit as one flattened file (after merging all the #includes)
See https://en.wikipedia.org/wiki/Include_guard
Is writing __CELL_H_INCLUDED__ same as cellINCLUDED ?
Yes it is.... The reason some prefer using underscored prefixed and suffixed MACROs for include guards is because they have extremely low probability of ever being used as identifiers... but again, underscore could clash with the compiler...
I prefer something like this: CELL_H_INCLUDED
If you use cellINCLUDED, there are chances that someday, somebody may use it as an identifier in that translation unit
The preprocessor definitions have no special meaning. The only requirement is that they stay unique across the modules, and that's why the file name is typically a part of them.
In particular, the mechanics for preventing double inclusion aren't "baked in" the language and simply use the mechanics of the preprocessor.
That being said, every compiler worth attention nowadays supports #pragma once, and you could probably settle on that.
As the link you have referenced says, "compilers do not have brains of their own" - so to answer your question, no, the compile does not understand which particular files are involved. It would not even understand that '__cellINCLUDED' has anything conceptually to do with a specific file.
Instead, the include guard simply prevents the logic contained between its opening #ifndef and closing #endif from being included multiple times. You, as the programmer, are telling the compiler not to include that code multiple times - the compiler is not doing anything 'intelligent' on its own.
Nope, This is essentially telling the compiler/parser that if this has already been put into the program, don't puthave already been loaded.
This should be at the top (and have an #endif at the bottom) of your .h file.
Lets say you have mainProgram.cpp and Tools.cpp, with each of these files loading fileReader.h.
As the compiler compiles each cpp file it will attempt to load the fileReader.h. unless you tell it not to it will load all of the fileReader file in twice.
ifndef = if not defined
so when you use these (and the #endif AFTER all your code in the .h file)
you are saying:
if not defined: cellINCLUDED
then define: cellINCLUDED with the following code:
[code]
end of code
so this way when it goes to load the code in your .h file a second time it hits the if not defined bit and ignores the code on the second time.
This reduces compile time and also means if you are using a poor/old compiler it isn't trying to shove the code in again.

Best practice for including from include files

I was wondering if there is some pro and contra having include statements directly in the include files as opposed to have them in the source file.
Personally I like to have my includes "clean" so, when I include them in some c/cpp file I don't have to hunt down every possible header required because the include file doesn't take care of it itself. On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards, the files have to be parsed first. Is this just a matter of taste, or are there any pros/cons over the other?
What I mean is:
sample.h
#ifdef ...
#include "my_needed_file.h"
#include ...
class myclass
{
}
#endif
sample.c
#include "sample.h"
my code goes here
Versus:
sample.h
#ifdef ...
class myclass
{
}
#endif
sample.c
#include "my_needed_file.h"
#include ...
#include "sample.h"
my code goes here
There's not really any standard best-practice, but for most accounts, you should include what you really need in the header, and forward-declare what you can.
If an implementation file needs something not required by the header explicitly, then that implementation file should include it itself.
The language makes no requirements, but the almost universally
accepted coding rule is that all headers must be self
sufficient; a source file which consists of a single statement
including the include should compile without errors. The usual
way of verifying this is for the implementation file to include
its header before anything else.
And the compiler only has to read each include once. If it
can determine with certainty that it has already read the file,
and on reading it, it detects the include guard pattern, it has
no need to reread the file; it just checks if the controling
preprocessor token is (still) defined. (There are
configurations where it is impossible for the compiler to detect
whether the included file is the same as an earlier included
file. In which case, it does have to read the file again, and
reparse it. Such cases are fairly rare, however.)
A header file is supposed to be treated like an API. Let us say you are writing a library for a client, you will provide them a header file for including in their code, and a compiled binary library for linking.
In such scenario, adding a '#include' directive in your header file will create a lot of problems for your client as well as you, because now you will have to provide unnecessary header files just to get stuff compiling. Forward declaring as much as possible enables cleaner API. It also enables your client to implement their own functions over your header if they want.
If you are sure that your header is never going to be used outside your current project, then either way is not a problem. Compilation time is also not a problem if you are using include guards, which you should have been using anyway.
Having more (unwanted) includes in headers means having more number of (unwanted) symbols visible at the interface level. This may create a hell lot of havocs, might lead to symbol collisions and bloated interface
On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards
If your compiler doesn't remember which files have include guards and avoid re-opening and re-tokenising the file then get a better compiler. Most modern compilers have been doing this for many years, so there's no cost to including the same file multiple times (as long as it has include guards). See e.g. http://gcc.gnu.org/onlinedocs/cpp/Once_002dOnly-Headers.html
Headers should be self-sufficient and include/declare what they need. Expecting users of your header to include its dependencies is bad practice and a great way to make users hate you.
If my_needed_file.h is needed before sample.h (because sample.h requires declarations/definitions from it) then it should be included in sample.h, no question. If it's not needed in sample.h and only needed in sample.c then only include it there, and my preference is to include it after sample.h, that way if sample.h is missing any headers it needs then you'll know about it sooner:
// sample.c
#include "sample.h"
#include "my_needed_file.h"
#include ...
#include <std_header>
// ...
If you use this #include order then it forces you to make sample.h self-sufficient, which ensures you don't cause problems and annoyances for other users of the header.
I think second approach is a better one just because of following reason.
when you have a function template in your header file.
class myclass
{
template<class T>
void method(T& a)
{
...
}
}
And you don't want to use it in the source file for myclass.cxx. But you want to use it in xyz.cxx, if you go with your first approach then you will end up in including all files that are required for myclass.cxx, which is of no use for xyz.cxx.
That is all what I think of now the difference. So I would say one should go with second approach as it makes your code each to maintain in future.

What MSVC++ 2008 Express Editon compiler does and doesn't

I've been wondering if the msvc++ 2008 compiler takes care of multiple header includes of the same file, considering this example:
main.cpp
#include "header.h"
#include "header.h"
Will the compiler include this file multiple times or just one? (I'm aware I can use the #ifndef "trick" to prevent this from happening)
Also, if I include "header.h" which contains 10 functions, but I only call or use 2, will it still include all 10 or just the 2 I need and all of their needs?
#include is basically a synonym for "copy-and-paste". If you do identical #includes, the contents of that header file will be copy-and-pasted twice, sequentially.
As to your second question, it doesn't really make sense. #includes are executed by the preprocessor, which runs before the compiler and the linker. The preprocessor doesn't know or care what the content of the header file is, it simply copy-and-pastes it in. The linker may be able to eliminate unnecessary functions, but that's completely independent of the preprocessor.
No, the compiler (or, more accurately, the pre-processor) doesn't take care of this "automatically". Not in Visual C++ 2008, or in any other version. And you really wouldn't want it to.
There are two standard ways of going about this. You should choose one of them.
The first is known as include guards. That's the "#ifndef trick" you mentioned in your question. But it's certainly not a "trick". It's the standard idiom for handling this situation when writing C++ code, and any other programmer who looks at your source file will almost certainly expect to see include guards in there somewhere.
The other takes advantage of a VC++ feature (one that's also found its way into several other C++ toolkits) to do essentially the same thing in a way that's somewhat easier to type. By including the line #pragma once at the top of your header file, you instruct the pre-processor to only include the header file once per translation unit. This has some other advantages over include guards, but they're not particularly relevant here.
As for your second question, the linker will take care of "optimizing" out functions that you never call in your code. But this is the last phase of compilation, and has nothing to do with #include, which is handled by the pre-processor, as I mentioned above.
The MSVC 20xx preprocessor (not the compiler -- the compiler never sees preprocessor directives) does not in any sense "take care of" multiple #includes of the same file. If a file is #included twice, the preprocessor obeys the #includes and includes the file two times. (Just imagine the chaos if the preprocessor even thought about trying to correct your source file's "bad" #include behavior.)
Because the preprocessor is so meticulous and careful about following your instructions, each #included file must protect itself from being #included twice. That protection is what we see when we find lines like these at the top of a header file:
#ifndef I_WAS_ALREADY_INCLUDED // if not defined, continue with include
#define I_WAS_ALREADY_INCLUDED // but make sure I'm not included again
[ header-file real contents ]
#endif // I_WAS_ALREADY_INCLUDED
When you write a header file, you must always be sure to protect it in this way.
Why do you care? It doesn't add really much burden on the compiler because the compiler conditionally (with #ifdefs, for example) excludes code it doesn't need to compile.
Preprocessor will include 2 times these headers. Thats why guards in header files are required.
As far as I know the linker is most cases will remove code (functions) that are newer used to reduce executable file size.

Preprocessor directive #ifndef for C/C++ code

In eclipse, whenever I create a new C++ class, or C header file, I get the following type of structure. Say I create header file example.h, I get this:
/*Comments*/
#ifndef EXAMPLE_H_
#define EXAMPLE_H_
/* Place to put all of my definitions etc. */
#endif
I think ifndef is saying that if EXAMPLE_H_ isn't defined, define it, which may be useful depending on what tool you are using to compile and link your project. However, I have two questions:
Is this fairly common? I don't see it too often. And is it a good idea to use that rubric, or should you just jump right into defining your code.
What is EXAMPLE_H_ exactly? Why not example.h, or just example? Is there anything special about that, or could is just be an artifact of how eclipse prefers to auto-build projects?
This is a common construct. The intent is to include the contents of the header file in the translation unit only once, even if the physical header file is included more than once. This can happen, for example, if you include the header directly in your source file, and it's also indirectly included via another header.
Putting the #ifndef wrapper around the contents means the compiler only parses the header's contents once, and avoids redefinition errors.
Some compilers allow "#pragma once" to do the same thing, but the #ifndef construct works everywhere.
This is just a common way to protect your includes - in this way it prevents the code from being included twice. And the identifier used could be anything, it's just convention to do it the way described.
Is it common? Yes - all C and C++ header files should be structured like this. EXAMPLE_H is a header guard, it prevents the code in the header being included more than once in the same translation unit, which would result in multiple definition errors. The name EXAPMLE_H is chosen to match the name of the header file it is guarding - it needs to be unique in your project and maybe globally as well. To try to ensure this, it's normal to prefix or suffix it with your project name:
#define MYPROJ_EXAMPLE_H
for example, if your project is called "myproj". Don't be tempted into thinking that prefixing with underscores will magically make it unique, by the way - names like _EXAMPLE_H_ and __EXAMPLE_H__ are illegal as they are reserved for the language implementation.
Always do this at the top of a header file. It's typically called a header guard or an include guard.
What it does is make it so that if a header file would be included multiple times, it will only be included once. If you don't do it, then you'll end up with errors about things being defined multiple times and things like that.
The exact define doesn't matter that much, though typically it's some variation on the file name. Basically, you're checking whether the given macro has been defined. If it hasn't, then define it, and continue with including the file. If it has, then you must have included the file previously, and the rest of the file is ignored.
This is an include guard. It guarantees that a header is included no more than once.
For example, if you were to:
#include "example.h"
#include "example.h"
The first time the header is included, EXAMPLE_H_ would not be defined and the if-block would be entered. EXAMPLE_H_ is then defined by the #define directive, and the contents of the header are evaluated.
The second time the header is included, EXAMPLE_H_ is already defined, so the if-block is not re-entered.
This is essential to help ensure that you do not violate the one definition rule. If you define a class in a header that didn't have include guards and included that header twice, you would get compilation errors due to violating the one definition rule (the class would be defined twice).
While the example above is trivial and you can easily see that you include example.h twice, frequently headers include other headers and it's not so obvious.
Consider this
File foo.c:
#include foo.h
#include bar.h
File bar.h
#include <iostream>
#include foo.h
Now, when we compile foo.c, we have foo.h in there twice! We definitely don't want this, because all the functions will throw compile errors the second time around.
To prevent this, we put the INCLUDE GUARD at the top. That way, if it's already been included, we define a preprocessor variable to tell us not to include it again.
It's very common (often mandated), and very frustrating if someone doesn't put one in there. You should be able to simply expect that each .h file has a header guard when you included. Of course, you know what they say when you assume things ("makes an ass of u and me") but that should be something you're expecting to see.
This is called an include guard and is indeed a common idiom for C/C++ header files. This allows the header file to be included multiple times without multiply including its contents.
The name EXAMPLE_H_ is an arbitrary convention but has to obey naming rules for C preprocessor macros, which excludes names like example.h. Since C macros are all defined in a single global namespace, it is important that you do not have different header files that use the same name for their include guard. Therefore, it is usually a good idea to include the name of your project or library in the include guard name:
#ifndef __MYPROJECT_EXAMPLE_H__
...

In C and C++, why is each .h file usually surrounded with #ifndef #define #endif directives?

Why does each .h file starts with #ifndef #define #endif? We can certainly compile the program without those directives.
It's a so-called "include guard". The purpose is to prevent the file from having to be parsed multiple times if it is included multiple times.
It prevents multiple inclusions of a single file. The same can be done using
#pragma once
directive, but those #ifndefs are standard thus supported by every compiler.
It is called an include guard. You can write without them until you start writing large programs and find out that you need to include the same .h file more than once, directly or indirectly, from a .c file. Then without include guards you would get multiple definition errors, but with them, the header file contents are parsed only once and skipped all the subsequent times, avoiding those errors. It`s a good practice to always use them.
If I understand correctly, you want to know if, in the absence of include guards, can including a header file multiple times cause an error or dangerous behavior. This is after excluding multiple definitions, etc.
Imagine a malicious programmer, whose header file doesn't have the include guards. His header file defines one macro, SZ, which is a size that you use for your statically allocated arrays. The programmer could write his header file like this:
#ifndef SZ
#define SZ 1024
#else
#if SZ == 1024
#undef SZ
#define SZ 128
#else
#error "You can include me no more than two times!"
#endif
#endif
Now, if you include the header file once, you get SZ equal to 1024. If you include it twice, SZ becomes 128. Of course, most real-world programmers are not malicious, and no one really writes code like above.
Note that the C standard allows assert.h to be #included more than once with different behavior depending upon whether NDEBUG is defined at the time of inclusion of assert.h. So, assert.h can't have include guards. That is a feature, not a bug, though.
If a header file contains definition likeint i; than, being included several times without a guard, will produce a compilation error.ifndef checks that some preprocessor variable is not defined (and it is not, for a first time), then defines it explicitly to avoid being captured again.
In MSVC you can also use #pragma once instead of ifndef's.