What MSVC++ 2008 Express Editon compiler does and doesn't - c++

I've been wondering if the msvc++ 2008 compiler takes care of multiple header includes of the same file, considering this example:
main.cpp
#include "header.h"
#include "header.h"
Will the compiler include this file multiple times or just one? (I'm aware I can use the #ifndef "trick" to prevent this from happening)
Also, if I include "header.h" which contains 10 functions, but I only call or use 2, will it still include all 10 or just the 2 I need and all of their needs?

#include is basically a synonym for "copy-and-paste". If you do identical #includes, the contents of that header file will be copy-and-pasted twice, sequentially.
As to your second question, it doesn't really make sense. #includes are executed by the preprocessor, which runs before the compiler and the linker. The preprocessor doesn't know or care what the content of the header file is, it simply copy-and-pastes it in. The linker may be able to eliminate unnecessary functions, but that's completely independent of the preprocessor.

No, the compiler (or, more accurately, the pre-processor) doesn't take care of this "automatically". Not in Visual C++ 2008, or in any other version. And you really wouldn't want it to.
There are two standard ways of going about this. You should choose one of them.
The first is known as include guards. That's the "#ifndef trick" you mentioned in your question. But it's certainly not a "trick". It's the standard idiom for handling this situation when writing C++ code, and any other programmer who looks at your source file will almost certainly expect to see include guards in there somewhere.
The other takes advantage of a VC++ feature (one that's also found its way into several other C++ toolkits) to do essentially the same thing in a way that's somewhat easier to type. By including the line #pragma once at the top of your header file, you instruct the pre-processor to only include the header file once per translation unit. This has some other advantages over include guards, but they're not particularly relevant here.
As for your second question, the linker will take care of "optimizing" out functions that you never call in your code. But this is the last phase of compilation, and has nothing to do with #include, which is handled by the pre-processor, as I mentioned above.

The MSVC 20xx preprocessor (not the compiler -- the compiler never sees preprocessor directives) does not in any sense "take care of" multiple #includes of the same file. If a file is #included twice, the preprocessor obeys the #includes and includes the file two times. (Just imagine the chaos if the preprocessor even thought about trying to correct your source file's "bad" #include behavior.)
Because the preprocessor is so meticulous and careful about following your instructions, each #included file must protect itself from being #included twice. That protection is what we see when we find lines like these at the top of a header file:
#ifndef I_WAS_ALREADY_INCLUDED // if not defined, continue with include
#define I_WAS_ALREADY_INCLUDED // but make sure I'm not included again
[ header-file real contents ]
#endif // I_WAS_ALREADY_INCLUDED
When you write a header file, you must always be sure to protect it in this way.

Why do you care? It doesn't add really much burden on the compiler because the compiler conditionally (with #ifdefs, for example) excludes code it doesn't need to compile.

Preprocessor will include 2 times these headers. Thats why guards in header files are required.
As far as I know the linker is most cases will remove code (functions) that are newer used to reduce executable file size.

Related

Need clarification on #ifndef #define

The code I am working has multiple headers and source files for different classes face.cc, face.hh, cell.cc, cell.hh edge.cc edge.hh and the headers contain includes like this,
#ifndef cellINCLUDED
#define cellINCLUDED
#ifndef faceINCLUDED
#define faceINCLUDED
I saw through http://www.cplusplus.com/forum/articles/10627/ and saw the way to write include guard is
#ifndef __MYCLASS_H_INCLUDED__
#define __MYCLASS_H_INCLUDED__
So in above code that I am working on, does compiler automatically understands it is looking for face.hh or cell.hh files?
better question : Is writing __CELL_H_INCLUDED__ same as cellINCLUDED ?
#ifndef __MYCLASS_H_INCLUDED__
#define __MYCLASS_H_INCLUDED__
So in above code that I am working on, does compiler automatically
understands it is looking for face.hh or cell.hh files?
No, the compiler doesn't automatically understand what you mean.
What really happens is that, when compiling a translation unit, the Compiler holds a list of globally defined MACROs. And so, what you are doing is defining the MACRO __MYCLASS_H_INCLUDED__ if it doesn't already exists.
If that macro is defined, that #ifndef until #endif will not be parsed by the actual compiler.
Hence you can test for the existence of that MACRO to determine if the Compiler has parsed that header file to include it once and only once in the translation unit... This is because the compiler compiles each translation unit as one flattened file (after merging all the #includes)
See https://en.wikipedia.org/wiki/Include_guard
Is writing __CELL_H_INCLUDED__ same as cellINCLUDED ?
Yes it is.... The reason some prefer using underscored prefixed and suffixed MACROs for include guards is because they have extremely low probability of ever being used as identifiers... but again, underscore could clash with the compiler...
I prefer something like this: CELL_H_INCLUDED
If you use cellINCLUDED, there are chances that someday, somebody may use it as an identifier in that translation unit
The preprocessor definitions have no special meaning. The only requirement is that they stay unique across the modules, and that's why the file name is typically a part of them.
In particular, the mechanics for preventing double inclusion aren't "baked in" the language and simply use the mechanics of the preprocessor.
That being said, every compiler worth attention nowadays supports #pragma once, and you could probably settle on that.
As the link you have referenced says, "compilers do not have brains of their own" - so to answer your question, no, the compile does not understand which particular files are involved. It would not even understand that '__cellINCLUDED' has anything conceptually to do with a specific file.
Instead, the include guard simply prevents the logic contained between its opening #ifndef and closing #endif from being included multiple times. You, as the programmer, are telling the compiler not to include that code multiple times - the compiler is not doing anything 'intelligent' on its own.
Nope, This is essentially telling the compiler/parser that if this has already been put into the program, don't puthave already been loaded.
This should be at the top (and have an #endif at the bottom) of your .h file.
Lets say you have mainProgram.cpp and Tools.cpp, with each of these files loading fileReader.h.
As the compiler compiles each cpp file it will attempt to load the fileReader.h. unless you tell it not to it will load all of the fileReader file in twice.
ifndef = if not defined
so when you use these (and the #endif AFTER all your code in the .h file)
you are saying:
if not defined: cellINCLUDED
then define: cellINCLUDED with the following code:
[code]
end of code
so this way when it goes to load the code in your .h file a second time it hits the if not defined bit and ignores the code on the second time.
This reduces compile time and also means if you are using a poor/old compiler it isn't trying to shove the code in again.

Best practice for including from include files

I was wondering if there is some pro and contra having include statements directly in the include files as opposed to have them in the source file.
Personally I like to have my includes "clean" so, when I include them in some c/cpp file I don't have to hunt down every possible header required because the include file doesn't take care of it itself. On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards, the files have to be parsed first. Is this just a matter of taste, or are there any pros/cons over the other?
What I mean is:
sample.h
#ifdef ...
#include "my_needed_file.h"
#include ...
class myclass
{
}
#endif
sample.c
#include "sample.h"
my code goes here
Versus:
sample.h
#ifdef ...
class myclass
{
}
#endif
sample.c
#include "my_needed_file.h"
#include ...
#include "sample.h"
my code goes here
There's not really any standard best-practice, but for most accounts, you should include what you really need in the header, and forward-declare what you can.
If an implementation file needs something not required by the header explicitly, then that implementation file should include it itself.
The language makes no requirements, but the almost universally
accepted coding rule is that all headers must be self
sufficient; a source file which consists of a single statement
including the include should compile without errors. The usual
way of verifying this is for the implementation file to include
its header before anything else.
And the compiler only has to read each include once. If it
can determine with certainty that it has already read the file,
and on reading it, it detects the include guard pattern, it has
no need to reread the file; it just checks if the controling
preprocessor token is (still) defined. (There are
configurations where it is impossible for the compiler to detect
whether the included file is the same as an earlier included
file. In which case, it does have to read the file again, and
reparse it. Such cases are fairly rare, however.)
A header file is supposed to be treated like an API. Let us say you are writing a library for a client, you will provide them a header file for including in their code, and a compiled binary library for linking.
In such scenario, adding a '#include' directive in your header file will create a lot of problems for your client as well as you, because now you will have to provide unnecessary header files just to get stuff compiling. Forward declaring as much as possible enables cleaner API. It also enables your client to implement their own functions over your header if they want.
If you are sure that your header is never going to be used outside your current project, then either way is not a problem. Compilation time is also not a problem if you are using include guards, which you should have been using anyway.
Having more (unwanted) includes in headers means having more number of (unwanted) symbols visible at the interface level. This may create a hell lot of havocs, might lead to symbol collisions and bloated interface
On the other hand, if I have the includes in the include files compile time might get bigger, because even with the include guards
If your compiler doesn't remember which files have include guards and avoid re-opening and re-tokenising the file then get a better compiler. Most modern compilers have been doing this for many years, so there's no cost to including the same file multiple times (as long as it has include guards). See e.g. http://gcc.gnu.org/onlinedocs/cpp/Once_002dOnly-Headers.html
Headers should be self-sufficient and include/declare what they need. Expecting users of your header to include its dependencies is bad practice and a great way to make users hate you.
If my_needed_file.h is needed before sample.h (because sample.h requires declarations/definitions from it) then it should be included in sample.h, no question. If it's not needed in sample.h and only needed in sample.c then only include it there, and my preference is to include it after sample.h, that way if sample.h is missing any headers it needs then you'll know about it sooner:
// sample.c
#include "sample.h"
#include "my_needed_file.h"
#include ...
#include <std_header>
// ...
If you use this #include order then it forces you to make sample.h self-sufficient, which ensures you don't cause problems and annoyances for other users of the header.
I think second approach is a better one just because of following reason.
when you have a function template in your header file.
class myclass
{
template<class T>
void method(T& a)
{
...
}
}
And you don't want to use it in the source file for myclass.cxx. But you want to use it in xyz.cxx, if you go with your first approach then you will end up in including all files that are required for myclass.cxx, which is of no use for xyz.cxx.
That is all what I think of now the difference. So I would say one should go with second approach as it makes your code each to maintain in future.

including a header file twice in c++

What happens if I include iostream or any other header file twice in my file?
I know the compiler does not throw error.
Will the code gets added twice or what happens internally?
What actually happens when we include a header file ?
Include guard prevents the content of the file from being actually seen twice by the compiler.
Include guard is basically a set of preprocessor's conditional directives at the beginning and end of a header file:
#ifndef SOME_STRING_H
#define SOME_STRING_H
//...
#endif
Now if you include the file twice then first time round macro SOME_STRING_H is not defined and hence the contents of the file is processed and seen by the compiler. However, since the first thing after #ifdef is #define, SOME_STRING_H is defined and the next time round the header file's content is not seen by the compiler.
To avoid collisions the name of the macro used in the include guard is made dependent on the name of the header file.
Header files are simple beasts. When you #include <header> all that happens is that the contents of header basically get copy-pasted into the file. To stop headers from being included multiple times, include guards are used, which is why in most header files you'll see something akin to
#ifndef SOME_HEADER_FILE_GUARD
#define SOME_HEADER_FILE_GUARD
//Contents of Header
#endif
It simply gets skipped over, due to preprocessor code along the following lines:
#ifndef MY_HEADER_H
#define MY_HEADER_H
<actual header code here>
#endif
So if you include twice, then MY_HEADER_H is already defined and everything between the #ifndef and #endif is skipped by the preprocessor.
It depends. With the exception of <assert>, the standard requires the
second (and later) includes of a standard header to be a no-op. This is
a characteristic of the header, however; the compiler will (at least
conceptually) read and include all of the header text each time it
encounters the include.
The standard practice for avoiding multiple definitions in such cases is
to use include guards: all of the C++ code in the header will be
enclosed in something like:
#ifndef SPECIAL_NAME
#define SPECIAL_NAME
// All of the C++ code here
#endif SPECIAL_NAME
Obviously, each header needs a different name. Within an application,
you can generally establish conventions based on the filename and
location; something like subsystem_filename, with
characters not legal in a C++ symbol (if you're using them in your
filenames) mapped (and very often everything upper cased). For
libraries, the best practice would be to generate a reasonably long
random sequence of characters; far more frequent (although certainly
inferior from a quality of implementation standpoint) is to ensure that
every such symbol begin with a documented prefix.
A system library can, of course, use reserved symbols (e.g. a symbol
starting with an underscore followed by a capital letter) here, to
guarantee that there is no conflict. Or it can use some entirely
different, implementation dependent technique. Microsoft, for example,
uses a compiler extension #pragma once; g++ uses include guards which
always start with _GLIBCXX (which isn't a legal symbol in user code).
These options aren't necessarily available to you.
When you include a header file, all its contents get copied to the line where the "#include" directive was placed. This is done by the preprocessor and is a step in the compilation process.
This process is the same for standard files like iostream or user-made ".h" files stored in local directory. However, the syntax slightly differs.
We use #include <filename> for files like 'iostream' which are stored in the library. Whereas, for header files in the local directory, we use #include "filename.h".
Now, what if we include header files twice:
Ideally speaking the content should be copied twice. But...
Many header files use the modern practice of mentioning #pragma once which instructs the pre-processor to copy contents only once, no matter how many times the header file is included.
Some very old codes use a concept called 'include gaurds'. I won't explain it as the other answers do so very well.
Using pragma once is the easy and the modern approach, however, include guards were used a lot previously and is a relatively complicated fix to this issue.

Is there any situation where you wouldn't want include guards?

I know why include guards exist, and that #pragma once is not standard and thus not supported by all compilers etc.
My question is of a different kind:
Is there any sensible reason to ever not have them? I've yet to come across a situation where theoretically, there would be any benefit of not providing include guards in a file that is meant to be included somewhere else. Does anyone have an example where there is an actual benefit of not having them?
The reason I ask - to me they seem pretty redundant, as you always use them, and that the behaviour of #pragma once could as well just be automatically applied to literally everything.
I've seen headers that generate code depending on macros defined before their inclusion. In this case it's sometimes wanted to define those macros to one (set of) value(s), include the header, redefine the macros, and include again.
Everybody who sees such agrees that it's ugly and best avoided, but sometimes (like if the code in said headers is generated by some other means) it's the lesser evil to do that.
Other than that, I can't think of a reason.
#sbi already talked about code generation, so let me give an example.
Say that you have an enumeration of a lot of items, and that you would like to generate a bunch of functions for each of its elements...
One solution is to use this multiple inclusion trick.
// myenumeration.td
MY_ENUMERATION_META_FUNCTION(Item1)
MY_ENUMERATION_META_FUNCTION(Item2)
MY_ENUMERATION_META_FUNCTION(Item3)
MY_ENUMERATION_META_FUNCTION(Item4)
MY_ENUMERATION_META_FUNCTION(Item5)
Then people just use it like so:
#define MY_ENUMERATION_META_FUNCTION(Item_) \
case Item_: return #Item_;
char const* print(MyEnum i)
{
switch(i) {
#include "myenumeration.td"
}
__unreachable__("print");
return 0; // to shut up gcc
}
#undef MY_ENUMERATION_META_FUNCTION
Whether this is nice or hackish is up to you, but clearly it is useful not to have to crawl through all the utilities functions each time a new value is added to the enum.
<cassert>
<assert.h>
"The assert macro is redefined according to the current state of NDEBUG each time that
<assert.h> is included."
It can be a problem if you have two headers in a project which use the same include guard, e.g. if you have two third party libraries, and they both have a header which uses an include guard symbol such as __CONSTANTS_H__, then you won't be able to successfully #include both headers in a given compilation unit. A better solution is #pragma once, but some older compilers do not support this.
Suppose you have a third party library, and you can't modify its code. Now suppose including files from this library generates compiler warnings. You would normally want to compile your own code at high warning levels, but doing so would generate a large set of warnings from using the library. You could write warning disabler/enabler headers that you could then wrap around the third party library, and they should be able to be included multiple times.
Another more sophisticated kind of use is Boost's Preprocessor iteration construct:
http://www.boost.org/doc/libs/1_46_0/libs/preprocessor/doc/index.html
The problem with #pragma once, and the reason it is not part of the standard, is that it just doesn't always work everywhere. How does the compiler know if two files are the same file or not, if included from different paths?
Think about it, what happens if the compiler makes a mistake and fails to include a file that it should have included? What happens if it includes a file twice, that it shouldn't have? How would you fix that?
With include guards, the worst that can happen is that it takes a bit longer to compile.
Edit:
Check out this thread on comp.std.c++ "#pragma once in ISO standard yet?"
http://groups.google.com/group/comp.std.c++/browse_thread/thread/c527240043c8df92

using macro defined in header files

I have a macro definition in header file like this:
// header.h
ARRAY_SZ(a) = ((int) sizeof(a)/sizeof(a[0]));
This is defined in some header file, which includes some more header files.
Now, i need to use this macro in some source file that has no other reason to include header.h or any other header files included in header.h, so should i redefine the macro in my source file or simply include the header file header.h.
Will the latter approach affect the code size/compile time (I think yes), or runtime (i think no)?
Your advice on this!
Include the header file or break it out into a smaller unit and include that in the original header and in your code.
As for code size, unless your headers do something incredibly ill-advised, like declaring variables or defining functions, they should not affect the memory footprint much, if at all. They will affect your compile time to a certain degree as well as polluting your name space.
Including the header in the source file might affect compile time slightly unless you are using pre-compiled headers. It shouldn't affect the code size though. Redefining the macro shouldn't have any effect on compile time or size. It is more of a maintenance and consistency issue though.
should i redefine the macro in my source file or simply include the header file header.h.
Neither. Instead you should clean up the code and break header.h so that one can use ARRAY_SZ() without also getting unrelated stuff.
You ask:
Will the latter approach affect the
code size/compile time (I think yes)
In the case of the specific macro, the answer is "no" to the size, because the sizeof expression can be evaluated at compile time, and therefore "yes" to the time. Neither are likely to be remotely significant.
Unless you're running this on a really limited bit of hardware, or this is called billions and billions of times, you won't notice any difference between the two at either compile time or run time.
Go for whatever seems more readable / maintainable.
Personally, I'd suggest there are better ways of achieving what you're doing there without involving macros (namely inline functions and/or function templates). You have to be careful using your solution because there are a few gotchas you need to keep an eye on.
Including that header and all other headers included into it will increase the compile time. It can affect runtime if there're other definitions that will change how your code compiles - if your code compiles differently because of those defines it will of course run differently. Although the latter is not usual be careful.