Header Files connection - c++

How are header files connected to the cpp files? I have a cpp file, including header files. I understand what include does, but what about the cpp file of the header file?
Let's say:
calculate.cpp:
#include "table.h"
What happens with table.cpp? To fully understand calculate.cpp, do I also need to look at table.cpp?

You have file A.cpp which includes B.h. When you compile A.cpp, the preprocessor copies everything from B.h into the translation unit of A.cpp, and the compiler creates an object file from it.
The compiler doesn't care at this point about the implementation of whatever is in B.cpp. That is dealt with separately when the compiler compiles the translation unit B.cpp. The compiler trusts that later (at link time) a definition will exist for whatever A.cpp uses from B. If not, you will end up with a linker error (undefined symbols, most likely).
Here you have a very good answer on what's happening: How does the compilation/linking process work?
But just to describe it in fewer words:
Preprocessor: reads through your .cpp and included .h files (e.g. A.cpp and B.h) and creates an output which the compiler can then compile. This also happens, independently, for B.cpp and its includes/defines.
Compiler: Takes the output from the preprocessor and creates object files. Object files contain mostly machine code and some linker information.
Linker: Links the object files together so that when you run the program the right functions are called.
So I guess the connection you are looking for happens in the Linking stage. That's where all the pieces come together.
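As a rough sketch of the three roles described above, the snippet below squeezes the question's table.h, calculate.cpp and table.cpp into one file so it can be compiled on its own. The function names table_lookup and calculate are made up for illustration; they do not come from the question.

```cpp
// --- what table.h would contribute: a declaration only (hypothetical name) ---
int table_lookup(int row);

// --- calculate.cpp: compiles with nothing but the declaration in view ---
int calculate(int row) {
    // The compiler emits a call to table_lookup without seeing its body.
    return table_lookup(row) * 2;
}

// --- what table.cpp would contribute: the definition found at link time ---
int table_lookup(int row) {
    return row + 10;
}
```

In a real project the last part lives in its own file and is compiled to its own object file; the call in calculate is only matched to that definition when the linker combines the two.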

Why do includes need further dependencies?

My current understanding is like this. Please correct me if I am wrong. When I include a C++ library (e.g. open source project) to my project I have to include the .h files so that the compiler knows about the interface of the included library. The compiled code of the included library is then linked by the linker.
But now, during compilation, the included header file needs another dependency. If I include the header file of this dependency, won't this turn into some recursive loop until every dependency is included? Why is it needed? Shouldn't this be the concern of the linker? The compiled library contains the dependency.
I stumbled over this project using Xcode 9.4.
A compiler translates code into machine language. The said code is then strung together with other machine code using a linker. Google more on what I wrote, if confused; it is a simplification missing finer details.
When you type #include <cstdint>, for example, the preprocessor, which is conceptually a separate step (and historically a separate program), does a textual substitution, if you will: it replaces that line with the whole contents of the cstdint header. The substitution happens before the translation to machine code even begins.
Usually, these #include <...> files are written carefully so that you do not need to chase other #include. However, that is not a guarantee.
The risk you identify exists. It's not automatic, though. If a.h includes b.h which includes c.h, there is no problem with nested includes.
You could have a problem if a.h includes both b.h and c.h, and b.h also includes c.h indirectly. The risk here isn't so much recursion, but double-definition of the contents of c.h.
The usual solution is that every header starts with
#ifndef A_H_INCLUDED
#define A_H_INCLUDED
// actual contents of "a.h"
and ends with
#endif // A_H_INCLUDED
Now, the second inclusion of c.h is harmless. When this happens, C_H_INCLUDED will be already defined by the first inclusion, so the second inclusion is wholly skipped. Some compilers are smart enough to recognize this pattern and won't even read c.h the second time, saving a few milliseconds of disk I/O.
The linker can't solve this, because the double-definition problem happens before the linker is involved. It happens at the level of individual Translation Units. A Translation Unit is basically a single .cpp file after all its .h files have been included. Each TU is handled individually by the compiler, and it's this compiler which trips over the double definitions. The linker cares a bit less about duplications. Duplicate function definitions are a problem for the linker, class definitions are not.
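A minimal sketch of the double-inclusion case, written as one file so it compiles by itself: the guarded block appears twice, which is what the preprocessor would produce if c.h were reached through two different include paths. The struct Point and the function sum are placeholders, not part of the original answer.

```cpp
// First time c.h is pasted in: C_H_INCLUDED is undefined, so the body is kept.
#ifndef C_H_INCLUDED
#define C_H_INCLUDED
struct Point {
    int x;
    int y;
};
#endif // C_H_INCLUDED

// Second time c.h is pasted in (e.g. via b.h): the guard is already defined,
// so everything up to #endif is skipped and Point is not defined twice.
#ifndef C_H_INCLUDED
#define C_H_INCLUDED
struct Point {
    int x;
    int y;
};
#endif // C_H_INCLUDED

// Code in the translation unit sees exactly one definition of Point.
int sum(const Point& p) {
    return p.x + p.y;
}
```

Removing the guard lines should make the compiler reject the second struct Point as a redefinition, which is the error described above.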

C++ Multiple Include Annoyances

So I'm writing a program which has gotten large enough now that it has several separate source files, and as a result, several separate header files. I keep constantly running into multiple include issues.
The problem is that I compile all of the individual files before I link them. So, A.cpp and B.cpp both include Z.h, because both A.cpp and B.cpp use function declarations and the such which exist inside of Z.h . This is all fine during the compile stage, because everything is in order, but when I go to link A.o and B.o together, the compiler (linker) throws multiple definition errors, because it's included the function definitions from Z.h while it was compiling each of the .o files, and so they exist in both .o files. This can normally be avoided by using include guards, but in this case, they won't work, since each .cpp file is compiled separately, the compiler "forgets" the state of defined preprocessor variables.
So my question is, how is this solved in the real world? I've had a good dig around and have come up dry, but I'm certain that this must have been solved before.
Thanks!
So, A.cpp and B.cpp both include Z.h, because both A.cpp and B.cpp use
function declarations and the such which exist inside of Z.h
This cannot be technically correct, or at least it's an incomplete description. Z.h most likely does not only contain function declarations but also function definitions.
Function declaration:
void f();
Function definition:
void f() { std::cout << "doing something\n"; }
So my question is, how is this solved in the real world?
You solve this problem by keeping the declarations in Z.h and moving the definitions into yet another to-be-created Z.cpp file.
Hard to say exactly what problem you're running into without code, but you are probably defining functions or variables in your headers. That's not what headers are for unless the functions are inline or templates. Including a header is like copy/pasting all the code in it into your cpp file. If you have the same variable in every cpp file, and it's not static or in an anonymous namespace, you'll have multiple definitions when you try to link and the linker will puke.
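To make the exceptions mentioned above concrete, here is a sketch of what a header such as Z.h could contain without causing multiple-definition errors when several .cpp files include it. All names except f are invented for the example.

```cpp
// --- Z.h (hypothetical contents that are safe to include from many .cpp files) ---

// Declaration only; the single definition belongs in Z.cpp.
void f();

// inline function: identical definitions in several translation units are allowed.
inline int twice(int x) {
    return 2 * x;
}

// Function template: the same allowance applies.
template <typename T>
T larger(T a, T b) {
    return a < b ? b : a;
}

// Anonymous namespace: every .cpp file that includes this gets its own
// private copy, so the linker never sees two competing definitions.
namespace {
int local_counter = 0;
}
```

A plain `int global_counter = 0;` or a non-inline function body in the same header would be the kind of definition that ends up in both A.o and B.o and trips the linker.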

How do you compile just a .h file in a makefile?

I have a makefile that creates object files for two classes (and main) and one of those classes is just defined in a .h file. In my makefile I have a line that says
FileName.o: FileName.h
g++ -c FileName.h
but when I try to compile it says it can't find FileName.o
Do I have to create FileName.cpp in order to get this to compile?
You are using your class from FileName.h somewhere, aren't you? So at least one of your .cpp files should contain #include "FileName.h". The header's code is then compiled as part of that .cpp file, and you don't need to compile the header separately.
You don't normally attempt to compile a header (.h) file by itself. Including it into an otherwise empty .cpp file will let you compile it and produce a .o file, but it probably won't do much (if any) real good unless you've put things in the header that don't really belong in a header.
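As an illustration of a class that lives entirely in a header, the sketch below shows hypothetical contents for FileName.h followed by code that would sit in a .cpp file including it. The class name Counter and the function count_three are assumptions made for the example.

```cpp
// --- FileName.h (hypothetical header-only class) ---
#ifndef FILENAME_H
#define FILENAME_H
class Counter {
public:
    // Member functions defined inside the class body are implicitly inline,
    // so no FileName.cpp and no FileName.o are needed.
    void increment() { ++count_; }
    int value() const { return count_; }

private:
    int count_ = 0;
};
#endif // FILENAME_H

// --- a .cpp file that includes FileName.h; the class is compiled as part of it ---
int count_three() {
    Counter c;
    c.increment();
    c.increment();
    c.increment();
    return c.value();
}
```

With this layout the makefile only needs rules for the .cpp files, listing FileName.h as a prerequisite of the object files that include it.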

C++ Header and CPP includes

quick question.
I am trying to get C++ nailed down, and today I spent hours with a double definition linker error("this has already been defined!") and I finally realised it's because I had the layout as such:
main.cpp
#include "Dog.cpp"
Dog.cpp
#include "Dog.h"
Dog.h
// (Dog class and prototype of test function)
I've now cleared that up by including Dog.h instead of Dog.cpp in main.cpp.
By including the .h file, does the .cpp file with the identical prefix get compiled with the program?
I was astounded when the program ran with only the .h included and no references whatsoever to Dog.cpp. I spent ages Googling but no answers really helped me understand what was going on.
Edit: I forgot to add that I prototyped in the .h, and defined the function for the class in the .cpp and that's what gave me the "already defined" error.
By including the .h file, does the .cpp file with the identical prefix get compiled with the program? I was astounded when the program ran with only the .h included and no references whatsoever to Dog.cpp.
No.
Your program is built in phases.
For the compilation phase, only declarations are needed in each translation unit (roughly equivalent to a single .cpp file with #includes resolved). The reason that declarations even exist in the first place is as a kind of "promise" that the full function definition will be found later.
g++ -c Dog.cpp # produces `Dog.o`
g++ -c main.cpp # produces `main.o`
For the linking phase, symbols are resolved between translation units. You must be linking together the result of compiling Dog.cpp and of compiling main.cpp (perhaps your IDE is doing this for you?), and this link process finds all the correct function definitions between them to produce the final executable.
g++ Dog.o main.o -o program # produces executable `program`
(Either that, or you actually haven't got to the link phase yet, and merely have an object file (Dog.o); you can't execute it, partially because it doesn't have all the function definitions in.)
The two phases can be done at the same time, with the "shorthand":
g++ Dog.cpp main.cpp -o program # compiles, links and produces executable
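The source for those commands might look roughly like the sketch below, shown as a single file so it can be compiled alone. The member function speak and the helper greet are assumptions; the question does not show the actual contents of Dog.h.

```cpp
#include <string>

// --- Dog.h (hypothetical contents): class definition, member declared only ---
#ifndef DOG_H
#define DOG_H
class Dog {
public:
    std::string speak() const;  // declaration; no body in the header
};
#endif // DOG_H

// --- main.cpp side: compiles with only the declaration visible ---
std::string greet(const Dog& dog) {
    return "Dog says " + dog.speak();
}

// --- Dog.cpp side: the one definition that the linker resolves calls to ---
std::string Dog::speak() const {
    return "woof";
}
```

If main.cpp also pulled in Dog.cpp via #include, the body of Dog::speak would be compiled into both main.o and Dog.o, which matches the "already defined" error from the question.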
No, the .cpp file does NOT automatically get compiled. You can either do that manually, create a makefile, or use an IDE that has both of them in the same project.
You don't specify how you are compiling it. If you are using an IDE and have added a new .h and .cpp to the project, then it will all be compiled and linked automatically.
There are 2 stages to making an executable to run: compiling and linking. Compiling is where the code gets interpreted and translated into lower level code. Linking is where all of the functions that you used get resolved. This is where you got the duplicate function error.
Inclusion does not automatically cause compilation, no.
In fact, the actual compiler never sees the #include statement at all. It's removed by an earlier step (called the preprocessor).
I'm not sure how it could build if you never compiled the Dog.cpp file. Did you reference any objects with code defined in that file?

Why use #ifndef CLASS_H and #define CLASS_H in .h file but not in .cpp?

I have always seen people write
class.h
#ifndef CLASS_H
#define CLASS_H
//blah blah blah
#endif
The question is, why don't they also do that for the .cpp file that contain definitions for class functions?
Let's say I have main.cpp, and main.cpp includes class.h. The class.h file does not include anything, so how does main.cpp know what is in the class.cpp?
First, to address your first inquiry:
When you see this in .h file:
#ifndef FILE_H
#define FILE_H
/* ... Declarations etc here ... */
#endif
This is a preprocessor technique for preventing a header file from being included multiple times, which can be problematic for various reasons. During compilation of your project, each .cpp file (usually) is compiled. In simple terms, this means the compiler will take your .cpp file, open any files #included by it, concatenate them all into one massive text file, perform syntax analysis, convert the result to some intermediate code, optimize/perform other tasks, and finally generate the assembly output for the target architecture.
Because of this, if a file is #included multiple times under one .cpp file, the compiler will append its contents twice, so if there are definitions within that file, you will get a compiler error telling you that you redefined something (a variable, for example).
When the file is processed by the preprocessor step in the compilation process, the first time its contents are reached the first two lines check whether FILE_H has been defined for the preprocessor. If not, it defines FILE_H and continues processing the code between that point and the #endif directive. The next time that file's contents are seen by the preprocessor, the #ifndef check fails because FILE_H is already defined, so it skips straight to the #endif and continues after it. This prevents redefinition errors.
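One way to watch the guard macro change state is to probe it before and after the guarded block, as in this small sketch. The constants answer, defined_before_first_include and defined_after_first_include are invented for the demonstration and assume nothing else has defined FILE_H.

```cpp
// Before the header's first inclusion, FILE_H has not been defined.
#ifdef FILE_H
constexpr bool defined_before_first_include = true;
#else
constexpr bool defined_before_first_include = false;
#endif

// --- contents of the guarded header, as #include would paste them in ---
#ifndef FILE_H
#define FILE_H
constexpr int answer = 42;
#endif

// After the first inclusion the macro exists, so any repeat of the block
// above would be skipped by its #ifndef.
#ifdef FILE_H
constexpr bool defined_after_first_include = true;
#else
constexpr bool defined_after_first_include = false;
#endif

// Checked at compile time: the guard flips from undefined to defined.
static_assert(!defined_before_first_include, "guard starts out undefined");
static_assert(defined_after_first_include, "guard is defined after first pass");
```

Because each .cpp file starts preprocessing with a clean set of macros, this state is per translation unit; it is not remembered from one .cpp file to the next.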
And to address your second concern:
In C++ programming as a general practice we separate development into two file types. One is with an extension of .h and we call this a "header file." They usually provide a declaration of functions, classes, structs, global variables, typedefs, preprocessing macros and definitions, etc. Basically, they just provide you with information about your code. Then we have the .cpp extension which we call a "code file." This will provide definitions for those functions, class members, any struct members that need definitions, global variables, etc. So the .h file declares code, and the .cpp file implements that declaration. For this reason, we generally during compilation compile each .cpp file into an object and then link those objects (because you almost never see one .cpp file include another .cpp file).
How these externals are resolved is a job for the linker. When your compiler processes main.cpp, it gets declarations for the code in class.cpp by including class.h. It only needs to know what these functions or variables look like (which is what a declaration gives you). So it compiles your main.cpp file into some object file (call it main.obj). Similarly, class.cpp is compiled into a class.obj file. To produce the final executable, a linker is invoked to link those two object files together. For any unresolved external variables or functions, the compiler will place a stub where the access happens. The linker will then take this stub and look for the code or variable in another listed object file, and if it's found, it combines the code from the two object files into an output file and replaces the stub with the final location of the function or variable. This way, your code in main.cpp can call functions and use variables in class.cpp, PROVIDED THEY ARE DECLARED where main.cpp can see them, which is what including class.h achieves.
I hope this was helpful.
The CLASS_H is an include guard; it's used to avoid the same header file being included multiple times (via different routes) within the same CPP file (or, more accurately, the same translation unit), which would lead to multiple-definition errors.
Include guards aren't needed on CPP files because, by definition, the contents of the CPP file are only read once.
You seem to have interpreted the include guards as having the same function as import statements in other languages (such as Java); that's not the case, however. The #include itself is roughly equivalent to the import in other languages.
It doesn't - at least during the compilation phase.
The translation of a c++ program from source code to machine code is performed in three phases:
Preprocessing - The preprocessor scans all source code for lines beginning with # and executes the directives. In your case, the contents of your file class.h are inserted in place of the line #include "class.h". Since you might be including your header file in several places, the #ifndef clauses avoid duplicate-declaration errors, since the guard macro is undefined only the first time the header file is included.
Compilation - The compiler then translates each preprocessed source file into a binary object file.
Linking - The Linker links (hence the name) together the object files. A reference to your class or one of its methods (which should be declared in class.h and defined in class.cpp) is resolved to the respective offset in one of the object files. I write 'one of your object files' since your class does not need to be defined in a file named class.cpp, it might be in a library which is linked to your project.
In summary, the declarations can be shared through a header file, while the mapping of declarations to definitions is done by the linker.
That's the distinction between declaration and definition. Header files typically include just the declaration, and the source file contains the definition.
In order to use something you only need to know its declaration, not its definition. Only the linker needs to know the definition.
So this is why you will include a header file inside one or more source files but you won't include a source file inside another.
Also you mean #include and not import.
That's done for header files so that the contents only appear once in each preprocessed source file, even if it's included more than once (usually because it's included from other header files). The first time it's included, the symbol CLASS_H (known as an include guard) hasn't been defined yet, so all the contents of the file are included. Doing this defines the symbol, so if it's included again, the contents of the file (inside the #ifndef/#endif block) are skipped.
There's no need to do this for the source file itself since (normally) that's not included by any other files.
For your last question, class.h should contain the definition of the class, and declarations of all its members, associated functions, and whatever else, so that any file that includes it has enough information to use the class. The implementations of the functions can go in a separate source file; you only need the declarations to call them.
main.cpp doesn't have to know what is in class.cpp. It just has to know the declarations of the functions/classes that it goes to use, and these declarations are in class.h.
The linker links between the places where the functions/classes declared in class.h are used and their implementations in class.cpp
.cpp files are not included (using #include) into other files. Therefore they don't need include guarding. Main.cpp will know the names and signatures of the class that you have implemented in class.cpp only because you have specified all that in class.h - this is the purpose of a header file. (It is up to you to make sure that class.h accurately describes the code you implement in class.cpp.) The executable code in class.cpp will be made available to the executable code in main.cpp thanks to the efforts of the linker.
It is generally expected that modules of code such as .cpp files are compiled once and linked to in multiple projects, to avoid unnecessary repetitive compilation of logic. For example, g++ -c class.cpp would produce class.o, which you could then link into multiple projects using g++ main.cpp class.o.
We could use #include as our linker, as you seem to be implying, but that would just be silly when we know how to link properly using our compiler with fewer keystrokes and less wasteful repetition of compilation, rather than our code with more keystrokes and more wasteful repetition of compilation...
The header files are still required to be included into each of the multiple projects, however, because this provides the interface for each module. Without these headers the compiler wouldn't know about any of the symbols introduced by the .o files.
It is important to realise that the header files are what introduce the definitions of symbols for those modules; once that is realised then it makes sense that multiple inclusions could cause redefinitions of symbols (which causes errors), so we use include guards to prevent such redefinitions.
It's because header files define what the class contains (members, data structures) and .cpp files implement it.
And of course, the main reason for this is that you could include one .h File multiple times in other .h files, but this would result in multiple definitions of a class, which is invalid.