I was looking at this flow diagram to understand how makefiles really operate but I'm still struggling to 100% understand what's going on.
I have a main.cpp file that calls upon some function that is defined in function.h and function.cpp. Then, I'm given the makefile:
main: main.cpp function.o
    g++ main.cpp function.o -o main

mainAssembly: main.cpp
    g++ -S main.cpp

function.o: function.cpp
    g++ -c function.cpp

clean:
    rm -f *.o *.S main

linkerError: main.cpp function.o
    g++ main.cpp function.o -o main
What's going on here? From what I understand so far, we are compiling function.cpp, which turns into an object file? Why is this necessary?
I don't know what the mainAssembly part is really doing. I tried reading about the g++ flags but I still have trouble understanding what this does. Is this just compiling main.cpp with the headers? Shouldn't we also convert main.cpp into an object file as well?
I guess main is simply linking everything together into an executable called main? And I'm completely lost on what clean and linkerError are trying to do. Can someone help me understand what is going on?
That flowchart confuses more than it explains; it's needlessly complicated. Each step is actually quite simple in isolation, and there's no point in jamming them all into one chart.
Remember, a Makefile simply establishes a dependency chain: an order of operations it tries to follow, where the file on the left depends on the files on the right.
Here's your first part where function.o is the product of function.cpp:
function.o: function.cpp
    g++ -c function.cpp
If function.cpp changes, then the .o file must be rebuilt. This is perhaps incomplete if function.h exists, as function.cpp might #include it, so the correct definition is probably:
function.o: function.cpp function.h
    g++ -c function.cpp
Now if you're wondering why you'd build a single .cpp into a single .o file, consider programs at a much larger scale. You don't want to recompile every source file every time you change anything; you only want to recompile the things directly affected by your change. Editing function.cpp should only impact function.o, not main.o, which is unrelated. However, changing function.h might impact main.o, because main.cpp may reference it. It all depends on how things are referenced with #include.
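To make that concrete, here's roughly how the two object rules would look with the header dependencies spelled out (a sketch, assuming both .cpp files include function.h):

function.o: function.cpp function.h
    g++ -c function.cpp

main.o: main.cpp function.h
    g++ -c main.cpp

Touch function.cpp and only function.o is rebuilt; touch function.h and both objects are.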
This part is a little odd:
mainAssembly: main.cpp
    g++ -S main.cpp
That just dumps out the compiled assembly code for main.cpp; g++ -S writes it to main.s. This is an optional step and isn't necessary for building the final executable. (Note that clean removes *.S, which on a case-sensitive filesystem won't match the lowercase main.s.)
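If you're curious what it produces, you can run the target and inspect the output:

make mainAssembly   # runs: g++ -S main.cpp
less main.s         # human-readable assembly for main.cpp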
This part ham-fistedly assembles the two parts:
main: main.cpp function.o
    g++ main.cpp function.o -o main
I say that because normally you'd compile all .cpp files to .o files, then link the .o files together, along with libstdc++ and whatever other shared libraries you're using, with a tool like ld, the linker. The final step in any typical build is linking to produce a binary executable or library; g++ silently does this for you when invoked like this.
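A more conventional version of that rule would link only object files, reusing the main.o rule sketched earlier:

main: main.o function.o
    g++ main.o function.o -o main

g++ drives the linker under the hood here, and nothing is recompiled unless one of the object files is out of date.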
As for the other two targets: clean just deletes the generated files (*.o, *.S, main) so the next make starts from scratch, and linkerError is the exact same rule as main, presumably included so you can provoke and study a link failure. Beyond that, I think there are much better examples to work from than what you have here. This file is just full of confusion.
A lot of the examples I see regarding makefiles are inconsistent about which files are considered dependencies of main.o, and I was wondering what the safest and most efficient way of writing a makefile is.
An example from https://www.tutorialspoint.com/makefile/makefile_quick_guide.htm:
hello: main.o factorial.o hello.o
    $(CC) main.o factorial.o hello.o -o hello

main.o: main.cpp functions.h
    $(CC) -c main.cpp

factorial.o: factorial.cpp functions.h
    $(CC) -c factorial.cpp

hello.o: hello.cpp functions.h
    $(CC) -c hello.cpp
As you can see, the header file functions.h is a dependency of main.o.
An example from my textbook:
myprog.exe : main.o threeintsfcts.o
    g++ main.o threeintsfcts.o -o myprog.exe

main.o : main.cpp threeintsfcts.cpp threeintsfcts.h
    g++ -Wall -c main.cpp

threeintsfcts.o : threeintsfcts.cpp threeintsfcts.h
    g++ -Wall -c threeintsfcts.cpp

clean :
    rm *.o myprog.exe
As you can see, both the header file .h and its .cpp are dependencies of main.o.
I've also seen another example (from https://www.youtube.com/watch?v=_r7i5X0rXJk) where the only dependency of main.o is main.cpp.
Something like:
myprog.exe : main.o threeintsfcts.o
    g++ main.o threeintsfcts.o -o myprog.exe

main.o : main.cpp
    g++ -Wall -c main.cpp

threeintsfcts.o : threeintsfcts.cpp threeintsfcts.h
    g++ -Wall -c threeintsfcts.cpp

clean :
    rm *.o myprog.exe
When a main.cpp includes a .h file, should both the .h and its respective .cpp be included as dependencies?
One of the thoughts that came into my head was this: why should any .h file be included as a dependency anyways? Wouldn't a change in any .h file register as a change in the respective .cpp file since the contents of the .h are just going to be copy and pasted into the respective .cpp file through #include?
I am also unsure of whether to have the respective .cpp as a dependency.
(ex. main.o : main.cpp threeintsfcts.cpp threeintsfcts.h).
I think doing so would defeat one of the main benefits of makefiles, namely the efficiency of modular compilation. (You would have to recompile main.o whenever threeintsfcts.cpp changes.)
However, it might make sense to do so in case threeintsfcts.cpp changes the name of one of its functions used in main and you forget to change it in main.
Each object file target needs to depend on its source file, obviously, but also on every header file it includes. The make program does not parse source files, so it doesn't know which headers a source file includes. If a header file is missing from the prerequisite list and gets modified, make will not recompile the source file.
Because tracking header dependencies manually is cumbersome and error-prone, there are tools to automate it; the compiler itself can generate the dependency lists for you.
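As a sketch of what that automation looks like with GCC or Clang (the -MMD and -MP flags are real; the file names are just examples), the compiler writes a .d file listing every header each object actually includes, and the Makefile pulls those lists in:

CXXFLAGS = -Wall -MMD -MP

main.o: main.cpp
    g++ $(CXXFLAGS) -c main.cpp

# Pull in the generated dependency lists (silently skipped on the first build).
-include main.d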
Other source files, however, should not be dependencies: one source file should not #include another, so any dependency between them is resolved at the link step of the main executable target, not at compile time.
Any change in one source file that would affect the compilation of another source file has to travel through the former's header file, which the latter includes. Therefore the header dependencies are sufficient.
So I see no justification for the textbook example you posted. The first example is fine, however, as long as the project is small enough to track the dependencies manually. The third example is wrong, because it won't recompile main.cpp if the header file changes (assuming threeintsfcts.h is included in main.cpp, which is the only thing that makes sense).
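Concretely, the fix for the third example is just to add the header to the prerequisite list (assuming main.cpp includes threeintsfcts.h):

main.o : main.cpp threeintsfcts.h
    g++ -Wall -c main.cpp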
The example from your textbook:
main.o : main.cpp threeintsfcts.cpp threeintsfcts.h
    g++ -Wall -c main.cpp
is wrong. The purpose of separating source code into two files is so that they can be compiled independently; if one depends on the other, they have been separated incorrectly.
"...Why should any .h file be included as a dependency anyways?
Wouldn't a change in any .h file register as a change in the
respective .cpp file since the contents of the .h are just going to be
copy and pasted into the respective .cpp file through #include?"
If threeintsfcts.h is the only file that has been changed, then main.cpp has not been changed. Make is not smart enough to parse main.cpp and deduce that threeintsfcts.h ought to be a prerequisite of main.o. (There are ways to get Make to do that, but you must master the basics first.)
...In case threeintsfcts.cpp changes the name of one of its functions used in main and you forget to change it in main.
In that case you will not be able to build the executable; Make can (and will) inform you of the problem, but not fix it, no matter how you arrange the prerequisite lists.
No, you shouldn't make a .cpp file a dependency of main.o; the whole point of a Makefile is separate compilation.
If you listed that .cpp as a dependency of main.o, then every time its implementation changed, both it and main would be recompiled, which is not what we want. Rather, we only want the changed .cpp file to be recompiled while main stays the same.
I think that the example from your textbook is a mistake.
I could be wrong; Makefiles are an old friend of mine, so I want to know what others have to say on the matter.
I'm answering based on what I personally do, and this is also what makes sense to me.
The Code::Blocks IDE generates the following files:
./main.cpp
./include/class.h
./src/class.cpp (this one includes class.h via #include "class.h")
How can I run this set of files, with the three files in three different folders?
First, the program can be run by clicking the IDE's "build and run" button.
But the program needs to take some arguments, like ./a.out arg[1] arg[2], so I cannot pass arguments by clicking "build and run"; I have to compile an executable with g++ first.
But g++ is not as smart as the IDE at finding the three files (I tried g++ -I./include main.cpp; it seems to have no problem with the class.h file, but it cannot find the class.cpp file).
So how can I compile the three files in three different locations?
BTW, how does the class.h file find the class.cpp file in the IDE/g++? Does it scan all the files in the directory to see which one contains the definitions of the class's functions?
It's a bad idea to #include source files. But this will do it:
g++ -I./include -Isrc main.cpp
Normally one would expect the IDE to have a function to just build the application, especially when there's one to build and run. Many IDEs also let you supply command-line arguments for the program, so build-and-run will launch it with those arguments.
You have to supply the source files and the search path for includes; normally one would write:
g++ -o exec-file-name -I./include main.cpp src/class.cpp
but that may depend a bit on how you include the header file. Another note: you normally don't compile the header file separately; it's included when you compile the .cpp files that include it.
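If you find yourself retyping that command, a minimal Makefile for this layout might look like the following sketch (the target name app is my own placeholder):

app: main.o class.o
    g++ main.o class.o -o app

main.o: main.cpp include/class.h
    g++ -I./include -c main.cpp

class.o: src/class.cpp include/class.h
    g++ -I./include -c src/class.cpp

clean:
    rm -f *.o app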
If on the other hand you actually want to do what you wrote (compile the .h file that includes the .cpp file, which is highly unorthodox) you would do:
g++ -c -I./src include/class.h
g++ -c main.cpp
g++ -o exec-file-name main.o class.o
where you need to replace the .o extension if your platform uses another extension. Note that in this case you should probably not include class.h from main.cpp since that could lead to duplicate symbols.
I have a file main.cpp containing an implementation of int main() and a library foo split up between foo.h and foo.cpp.
What is the difference (if any) between
g++ main.cpp foo.cpp -o main
and
g++ -c foo.cpp -o foo.o && g++ main.cpp foo.o
?
Edit: of course there is a third version:
g++ -c foo.cpp -o foo.o && g++ -c main.cpp -o main.o && g++ main.o foo.o -o main
The total work the compiler and linker (and the other tools the compiler uses) have to do is exactly the same, give or take minor details: in the first version the compiler creates temporary object files for both foo.o and main.o and deletes them afterwards, in the second version foo.o remains on disk, and in the third both object files remain.
The main difference comes when you have a larger project and use a Makefile to build the code. The advantage is that, since make only recompiles things that need to be recompiled, you don't have to wait for the compiler to rebuild code that doesn't need it. That assumes, of course, that the makefile uses the g++ -c file.cpp -o file.o variant (the typical way to do it), and not g++ file.cpp main.cpp ... -o main.
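For the two files in this question, such a makefile might look like this sketch (assuming main.cpp includes foo.h):

main: main.o foo.o
    g++ main.o foo.o -o main

main.o: main.cpp foo.h
    g++ -c main.cpp -o main.o

foo.o: foo.cpp foo.h
    g++ -c foo.cpp -o foo.o

Now touching foo.cpp rebuilds only foo.o plus the final link; touching foo.h rebuilds both objects.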
Of course, there are other possible scenarios; for example, in unit testing you may want to build a test around the same object file you used in the main application. Again, this makes more of a difference when the project is large and has half a dozen or more source files.
On a modern machine, compiling doesn't take that long - my compiler project (~5500 lines of C++ code) that links with LLVM takes about 10 seconds to compile the source files, and another 10 seconds to link with all the LLVM files. That's a debug version of the llvm libraries, so it produces a 120+ MB executable.
Once you get to commercial (or comparable open-source) projects, the number of source files and other inputs can be in the hundreds, and the sources can run from 100k to several million lines. Now it starts to matter whether you recompile just foo.cpp or everything, because compiling everything takes an hour of CPU time. Sure, with multicore machines it's still only a few minutes, but it's not ideal to spend minutes when you could spend a few seconds.
If you type something like this:
g++ -o main main.cpp foo.cpp
You are compiling and linking two .cpp files at once and generating an executable file called main (the name is set with -o).
If you type this:
g++ main.cpp foo.cpp
You are compiling and linking two cpp files at once, generating an executable file with the default name a.out.
Finally, if you type this:
g++ -c foo.cpp
You will generate an object file called foo.o which can later be linked with g++ -o executable_name file1.o ... fileN.o
The -c option lets you perform just the compilation step on its own, producing an object file rather than a full executable, and -o names the output file. I have found a link which may provide you with helpful information about this. It talks about gcc (the C compiler), but g++ and gcc work similarly after all:
http://www3.ntu.edu.sg/home/ehchua/programming/cpp/gcc_make.html
Be careful with the syntax of the commands you use. If you work with Linux and have trouble with a command, just open a terminal and type "man name_of_the_command" to read about its syntax, options, return values, and other relevant information about commands, system calls, and library functions.
Hope it helps!
Suppose one has about 50,000 different .cpp files.
Each .cpp file contains just one class with about ~1000 lines of code in it (the code itself is not complicated: in-memory operations on matrices and vectors, i.e. no special libraries are used).
I need to build a project (in a Linux environment) that will have to import & use all of these 50,000 different .cpp files.
A couple of questions come to mind:
How long will it roughly take to compile this? What will be the approx. size of the compiled file?
What would be a better approach: keep 50,000 different .so files (compiled extensions) and have the main program import them one by one, or unite these 50,000 different .cpp files into one large .cpp file and just deal with that? Which method will be faster / more efficient?
Any insights are greatly appreciated.
There is no answer, just advice.
Right back at you: what are you really trying to do? Are you trying to make a code library from different source files, or an executable? Did you actually write that many .cpp files?
50,000 source files is, well... a massively sized project. Are you trying to do something common across all files (e.g. every source file represents a resource, record, image, or something unique)? Or is it just 50K disparate code files?
Most of your compile time will not be driven by the size of each source file. It will be driven by the number of header files (and the headers they include) brought in by each .cpp file. Headers usually contain declarations rather than implementations, but they still have to go through the compiler, and redundant headers across the code base can slow your build down.
Large projects at that kind of scale use precompiled headers. You can include all the commonly used header files in one header file (common.h) and precompile common.h. Then all the other source files just include "common.h". The compiler can be configured to automatically use the precompiled header whenever it sees #include "common.h" in a source file.
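With gcc/g++ that looks roughly like this (a sketch; common.h is the name from above):

# Precompile the shared header once; g++ writes common.h.gch
g++ -x c++-header common.h

# Subsequent compiles find common.h.gch and use it automatically
g++ -c file1.cpp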
(i) There are way too many factors involved in determining this; even an approximation is impossible. Compilation can be memory-, CPU-, or disk-bound. The complexity of the files matters (from your description, your complexity is low).
(ii) The typical way of doing this is to make a library and let the system figure out linking or loading. You can choose static or dynamic linking.
static linking
Assuming you are using gcc, this would look like this:
g++ -c file1.cpp -o file1.o
g++ -c file2.cpp -o file2.o
...
g++ -c filen.cpp -o filen.o
ar -rc libvector.a file1.o file2.o ... filen.o
Then, when you build your own code, your final link looks like this:
g++ myfile.cpp libvector.a -o mytask
dynamic linking
Again, assuming you are using gcc, this would look like this:
g++ -c file1.cpp -fPIC -o file1.o
g++ -c file2.cpp -fPIC -o file2.o
...
g++ -c filen.cpp -fPIC -o filen.o
g++ -shared file1.o file2.o ... filen.o -o libvector.so
Then, when you build your own code, your final link looks like this:
g++ myfile.cpp libvector.so -o mytask
You will need libvector.so to be in the loader's path for your executable to work.
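Two common ways to arrange that (a sketch; the paths are examples):

# Option 1: tell the loader where to look at run time
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH
./mytask

# Option 2: bake a search path into the executable when linking
g++ myfile.cpp libvector.so -Wl,-rpath,'$ORIGIN' -o mytask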
In any case, as long as the 50,000 files don't change, you will only need to do the last command (which will be much faster).
You can reduce recompilation by giving the '.h' files lots (and I MEAN LOTS) of forward declarations, so that when you change a .h file, the rest of the program does not need to be recompiled. Usually a function/method only needs the name of a type for its parameters or its return value; if it needs other details, then yes, the full header needs to be included.
Please get a book by Scott Meyers; it will help you a lot.
Oh, and when trying to eat a big cake, divide it up. The slices are more manageable.
We can't really say how long it will take to compile, but what you should do is compile each .cpp/.h pair into a .o file:
$ g++ -c -o test.o test.cpp ...
Once you have all of these, you compile the main program as so:
$ g++ -c -o main.o main.cpp
$ g++ -o main main.o test.o blah.o otherThings.o foo.o bar.o baz.o etc...
Your idea of using .so files is pretty much asking "how quickly can I crash the program and possibly the OS?". Shared libraries are meant for large libraries in small numbers, not 50,000 .so files linked into one binary (especially if you load them dynamically... that would be BAD).
Recently I tried to compile a program with g++ (on Ubuntu). Usually I use Dev-C++ (on Windows) and it works fine there as long as I make a project and put all the necessary files in it.
The error that occurs when compiling the program is:
$filename.cpp: undefined reference to '[Class]::[Class Member Function]'
The files used are as following:
The source code (.cpp) file with the main function.
The header file with the function prototypes.
The .cpp file with the definitions for each function.
Any help will be appreciated.
You probably tried to compile and link in one step without giving g++ all the source files, or somehow forgot something.
Variation one (everything in one line; recompiles everything all the time):
g++ -o myexecutable first.cpp second.cpp third.cpp [other dependencies, e.g. -lGL, -lSDL, etc.]
Variation two (step by step; if no -o is provided, gcc reuses the input file name and just changes the extension when not linking; this variation is best suited for makefiles, since it lets you skip unchanged parts):
g++ -c first.cpp
g++ -c second.cpp
g++ -c third.cpp
g++ -o myexecutable first.o second.o third.o [other dependencies]
Variation three (some placeholders):
I won't list it, but the commands above can also take placeholders; e.g. g++ -c *.cpp will compile all .cpp files in the current directory to o(bject) files of the same name.
Overall you shouldn't worry too much about it unless you really have to work without any IDE. If you're not that proficient with the command line syntax, stick to IDEs first.
The command line of gcc should look like:
g++ -o myprogram class1.cpp class2.cpp class3.cpp main.cpp
Check which cpp file defines the missing class member function. You may not have given it to gcc.
You can also check for correct #include tags within filename.cpp. Assume that filename.cpp uses code contained in myclass.h present in the same directory as filename.cpp. Assume that the class that g++ says is undefined is contained in myclass.h and defined in myclass.cpp. So, to correctly include myclass.h within filename.cpp, do the following:
In filename.cpp:
#include <iostream>
#include <myclass.h>
//..source code.
In the makefile:
filename.o: filename.cpp myclass.h
    g++ -I./ -c filename.cpp -o filename.o

myclass.o: myclass.cpp myclass.h
    g++ -c myclass.cpp -o myclass.o
In the above, note the use of the -I. option when compiling filename.cpp. -I<directory> asks g++ to add the given directory to its header search path. That way myclass.h is correctly found.
In the absence of more information (the source maybe), it is difficult to say with any accuracy where the problem lies. All attempts will be but stabs in the dark.
I assume that you have declared a member function (usually in a .h or .hpp file) but have omitted the respective definition of the member function (usually in a .cpp file).
In C++, it is possible to declare a class like so:
class foo {
    void x();
    void y();
};
with a .cpp file that goes like so:

void foo::x() {
    do_something();
}
Note, there is no foo::y().
This poses no problem to the compiling/linking process as long as the member function foo::y() is referenced nowhere throughout the compiled code.
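You can watch both outcomes from the shell (assuming the class above lives in foo.h/foo.cpp and main.cpp is the caller):

g++ -c foo.cpp    # fine: a missing definition for foo::y() is not an error here
g++ -c main.cpp   # fine: main.cpp only needs the declaration
g++ main.o foo.o  # links only if nothing calls foo::y(); otherwise the linker
                  # stops with "undefined reference to 'foo::y()'"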