I have about 20 .cpp files and as many .h files. When I compile I do
g++ -std=c++11 main.cpp -o main
In main.cpp I start with some forward declarations and then I include all the .h and .cpp files.
As a result, every compilation recompiles all the files, which is uncomfortably slow. I think I would need to use .so and/or .dll files so that in future compilations only the modified code needs to be recompiled. I don't really know how to do that. Could you give me some advice?
You generally want each translation unit to be compiled separately and you might even compile them in parallel (e.g. with make -j, see below).
(I am guessing and hoping that you are on Linux and using GCC as g++; adapt this answer to your compiler and operating system if not.)
If you have src1.cpp src2.cpp src3.cpp (each containing appropriate #include directives, probably with a common header file) you would compile src1.cpp into an object file src1.o using GCC as:
g++ -Wall -Wextra -g src1.cpp -c -o src1.o
The -Wall -Wextra options ask for all warnings and some extra ones, and you really want them (to improve your code until it gives no warnings). The -g option asks for DWARF debugging information (to be able to use the gdb debugger later, and also valgrind). The -c option requests only the compilation step, without linking. The -o src1.o names the output object file.
Likewise, you'll compile src2.cpp into src2.o
g++ -Wall -Wextra -g src2.cpp -c -o src2.o
and src3.cpp
g++ -Wall -Wextra -g src3.cpp -c -o src3.o
BTW I prefer the shorter .cc suffix to .cpp. For benchmarking purposes, enable optimizations in your compiler, e.g. by adding -O2 -march=native after -g.
Finally you want to link all three object files src1.o, src2.o and src3.o into a myprog executable:
g++ -g src1.o src2.o src3.o -o myprog
You might add additional options here, e.g. to link external libraries.
Beware that C++14 (and also C++11 and even C++17) has no real modules (in contrast to OCaml or Go). So the preprocessor is used a lot, and you practically need to include a lot of code; for example, #include <vector> pulls in more than ten thousand lines of C++ code from standard and internal header files on my Linux/Debian desktop. This explains why C++ compilers are slow, and hence I recommend avoiding too-small C++ files (e.g. a source file of only a hundred C++ lines that includes several header files would in practice pull in tens of thousands of lines from various internal headers). My preference is to define several related functions (and perhaps classes) in each .cc (or .cpp) source file of one or a few thousand lines of C++.
(future C++ standards, perhaps C++20, might add modules to the language; but this could be postponed...)
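You can check that claim about header size yourself by asking the preprocessor to expand a file and counting the lines; a small experiment (exact numbers vary by system and GCC version):

$ echo '#include <vector>' > v.cpp
$ g++ -std=c++11 -E v.cpp | wc -l

The -E option stops after preprocessing, so wc -l counts every line the compiler actually has to parse for that single #include.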
My recommendation is to learn to use GNU make or ninja. Indeed you need a build automation tool.
You certainly should learn how to invoke your compiler on the command line. Read about invoking GCC if you use g++. The order of arguments to g++ matters a great deal.
You could use tools like cmake or meson, which generate configuration files (for make or ninja). But I recommend starting simpler than cmake or meson: code your Makefile by hand and just use make (see the sketch below). You need to understand your build process. In some cases you might generate some (simple) C++ file(s) during your build (e.g. with GNU bison, or Qt moc, or your own script or program emitting C++ code).
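For instance, here is a minimal hand-written Makefile for the three-file example above (a sketch; the file names are the ones used in this answer, and recipe lines must begin with a tab):

CXX = g++
CXXFLAGS = -Wall -Wextra -g
OBJS = src1.o src2.o src3.o

myprog: $(OBJS)
	$(CXX) $(CXXFLAGS) $(OBJS) -o myprog

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@

clean:
	rm -f $(OBJS) myprog

With this, make recompiles only the object files whose .cpp changed, and make -j3 compiles up to three files in parallel.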
I think I would need to use .so
Not necessarily. .so files are shared objects, used for shared libraries. You could (and probably, at first, you want to) simply have several object files, as explained above. You might later consider making your own software libraries, but that is worthwhile only for reusable source code.
To link external libraries, you might even want to use pkg-config (for the packages that support it), which expands to the appropriate build options for g++.
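For example, to compile and link a program against a library that ships pkg-config metadata (libpng here is only an illustration; substitute the package you actually use):

g++ -Wall -g myprog.cpp $(pkg-config --cflags --libs libpng) -o myprog

The shell substitutes the appropriate -I, -L and -l options in place of the $(...) part.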
Look also into existing free software projects for inspiration, and study their source code (including their build process). You'll find many of them in Linux distributions, and on github, sourceforge and elsewhere.
... and then I include all .h and .cpp files.
.cpp files (aka translation units) aren't meant to be included. You should use a script or some other kind of build system that compiles each of the .cpp files separately and links all the produced .o object files together into the executable program.
This makes it possible to avoid recompiling the .cpp files that aren't affected by a change, e.g. those that don't include a modified header file.
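In its simplest form such a build script is just a shell loop over the sources (a sketch; it still recompiles everything on each run, which is exactly what a Makefile's dependency rules then avoid):

for f in *.cpp; do g++ -std=c++11 -c "$f"; done   # one .o per .cpp
g++ *.o -o main                                   # link them all together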
Related
I have a file main.cpp containing an implementation of int main() and a library foo split up between foo.h and foo.cpp.
What is the difference (if any) between
g++ main.cpp foo.cpp -o main
and
g++ -c foo.cpp -o foo.o && g++ main.cpp foo.o
?
Edit: of course there is a third version:
g++ -c foo.cpp -o foo.o && g++ -c main.cpp -o main.o && g++ main.o foo.o -o main
The total work that the compiler & linker (and the other tools the compiler uses) have to do is exactly the same, give or take a few minor things: in the first example the compiler creates and then deletes temporary object files for both sources, in the second foo.o remains on disk (main.cpp still goes through a temporary), and in the third both object files remain.
The main difference comes when you have a larger project and use a Makefile to build the code. Here the advantage is that, since the Makefile only recompiles things that need to be recompiled, you don't have to wait for the compiler to recompile code that hasn't changed. That assumes, of course, that we use the g++ -c file.cpp -o file.o variant in the Makefile (which is the typical way to do it), and not g++ file.cpp main.cpp ... -o main.
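As an illustration, using the file names from the question (recipe lines must begin with a tab):

main: main.o foo.o
	g++ main.o foo.o -o main

main.o: main.cpp foo.h
	g++ -c main.cpp -o main.o

foo.o: foo.cpp foo.h
	g++ -c foo.cpp -o foo.o

Touch foo.cpp and run make: only foo.o is recompiled before the link step.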
Of course, there are other possible scenarios - for example in unit testing, you may want to use the same object file to build a test around, as you were using to build the main application. Again, this makes more of a difference when the project is large and has half a dozen or more source files.
On a modern machine, compiling doesn't take that long - my compiler project (~5500 lines of C++ code) that links with LLVM takes about 10 seconds to compile the source files, and another 10 seconds to link with all the LLVM files. That's a debug version of the llvm libraries, so it produces a 120+ MB executable.
Once you get to commercial-scale software (or comparable open source projects), a project can involve hundreds of source files and other inputs, and the sources can easily run from 100k to several million lines. Now it starts to matter whether you recompile just foo.cpp or everything, because compiling everything takes an hour of CPU time. Sure, with multicore machines it's still only a few minutes, but it's not ideal to spend minutes when you could spend a few seconds.
If you type something like this:
g++ -o main main.cpp foo.cpp
You are compiling and linking two .cpp files at once and generating an executable file called main (the name comes from -o).
If you type this:
g++ main.cpp foo.cpp
You are compiling and linking two .cpp files at once, generating an executable file with the default name a.out.
Finally, if you type this:
g++ -c foo.cpp
You will generate an object file called foo.o which can later be linked with g++ -o executable_name file1.o ... fileN.o
Using the -c and -o options lets you perform separately two of the steps the g++ driver normally runs in one go: -c stops after compilation, giving you object files, and -o names the output file. I have found a link which may provide helpful information about this. It talks about gcc (the C compiler), but g++ and gcc work similarly after all:
http://www3.ntu.edu.sg/home/ehchua/programming/cpp/gcc_make.html
Be careful with the syntax of the commands you are using. If you work with Linux and have problems with a command, just open a terminal and type man name_of_the_command to read about its syntax, options, return values and other relevant information; man also covers system calls, library functions and many other topics.
Hope it helps!
I use the g++ -M -MF options. But this produces output for only one object file in my project. How do I get this output for all the .o files in my project?
I don't know about Qt Creator, but IDEs usually handle header dependencies internally. The GCC options you mention are primarily useful if you write your own Makefile, with its own automatic header-dependency handling.
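If you do write your own Makefile, a common pattern (a sketch, not tied to any IDE; source names here are hypothetical) is to have g++ emit one .d dependency file per object as a side effect of compilation, and include them all:

SRCS = main.cpp foo.cpp
OBJS = $(SRCS:.cpp=.o)

main: $(OBJS)
	g++ $(OBJS) -o main

%.o: %.cpp
	g++ -MMD -MP -c $< -o $@

-include $(OBJS:.o=.d)

-MMD writes a makefile fragment listing the user headers each .o depends on, and -MP adds phony targets so a deleted header doesn't break the build.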
Recently I had to use this command in a makefile I had for an sqlite program I'm working on:
gcc -g -c sqlite3.c -o sqlite3.o
g++ -g -c main.cpp -o main.o
g++ sqlite3.o main.o -o sqliteex
I had to directly compile the sqlite3.c file into my program in order to use the sqlite3.h interface (included in the main.cpp file with #include "SQL/sqlite3.h"). But why did I need to use gcc to do this and create sqlite3.o, and then link both .o files into my executable?
Edit: My guess would be that .o files can be produced by both gcc and g++; if this is the case, is it good practice to just always compile things to .o files first?
But why did I need to use gcc to do this and create sqlite3.o, then compile both files as .o files into my executable?
You did not need to do that. The reason you did do that was to specify that sqlite3.c is C code and not C++ code. You could have done this instead:
g++ main.cpp -x c sqlite3.c -o sqliteex
Additionally, it is possible (but not at all certain) that the sqlite code could have compiled as C++, like this:
g++ main.cpp sqlite3.c -o sqliteex
Quote from Wikipedia:
Single Compilation Unit is a technique of computer programming for the C/C++ languages, which reduces compilation time and aids the compiler to perform program optimization even when the compiler itself is lacking support for whole program optimization or precompiled headers.
http://en.wikipedia.org/wiki/Single_Compilation_Unit
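In practice a single compilation unit is just one .cpp that #includes the others; a sketch with hypothetical file names:

// all.cpp -- the single compilation unit
#include "file1.cpp"
#include "file2.cpp"
#include "file3.cpp"

You then compile only all.cpp (g++ -c all.cpp -o all.o). Note that this trades away incremental rebuilds, so it is the opposite of the separate-compilation advice elsewhere on this page; it pays off mainly when redundant headers dominate compile time.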
Development is mostly an edit -> compile cycle, repeated until success. When you have separately compiled files you can recompile only the file that was modified, which makes rebuilds much faster. The last line is not compilation but linking of the compiled object files into the target executable.
Also, as Mysticial noted, you have a mixture of C and C++.
Suppose one has about 50,000 different .cpp files.
Each .cpp file contains just one class that has about ~1000 lines of code in it (the code itself is not complicated -- involves in-memory operations on matrices & vectors -- i.e, no special libraries are used).
I need to build a project (in a Linux environment) that will have to import & use all of these 50,000 different .cpp files.
A couple of questions come to mind:
How long will it roughly take to compile this? What will be the approx. size of the compiled file?
What would be a better approach: keep 50,000 different .so files (compiled extensions) and have the main program import them one by one, or alternatively unite these 50,000 different .cpp files into one large .cpp file and just deal with that? Which method will be faster / more efficient?
Any insights are greatly appreciated.
There is no answer, just advice.
Right back at you: What are you really trying to do? Are you trying to make a code library from different source files? Or is that an executable? Did you actually code that many .cpp files?
50,000 source files is, well... a massively sized project. Are you trying to do something common across all files (e.g. every source file represents a resource, record, image, or something unique)? Or is it just 50K disparate code files?
Most of your compile time will not depend on the size of each source file. It will depend on the number of header files (and the headers they include) brought in with each .cpp file. Headers, while usually containing only declarations rather than implementations, still have to go through the compiler, and redundant headers across the code base can slow your build down.
Large projects at that kind of scale use precompiled headers. You can include all the commonly used header files in one header file (common.h) and precompile common.h. Then all the other source files just include "common.h", and the compiler can be configured to automatically use the precompiled header whenever it sees #include "common.h" in a source file.
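With GCC this looks roughly as follows (a sketch; common.h is a hypothetical project-wide header, and the flags used to build the precompiled header must match the flags used to compile the sources):

g++ -O2 -c common.h               # produces common.h.gch
g++ -O2 -c file1.cpp -o file1.o   # the .gch is picked up automatically

GCC uses common.h.gch instead of re-parsing common.h whenever #include "common.h" is the first include in a source file.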
(i) There are way too many factors involved to determine this; even an approximation is impossible. Compilation can be memory-, CPU- or hard-drive-bound. The complexity of the files matters (from your description, your complexity is low).
(ii) The typical way of doing this is to make a library and let the system figure out linking or loading. You can choose static or dynamic linking.
static linking
Assuming you are using gcc, this would look like this:
g++ -c file1.cpp -o file1.o
g++ -c file2.cpp -o file2.o
...
g++ -c filen.cpp -o filen.o
ar -rc libvector.a file1.o file2.o ... filen.o
Then, when you build your own code, your final link looks like this:
g++ myfile.cpp libvector.a -o mytask
dynamic linking
Again, assuming you are using gcc, this would look like this:
g++ -c file1.cpp -fPIC -o file1.o
g++ -c file2.cpp -fPIC -o file2.o
...
g++ -c filen.cpp -fPIC -o filen.o
g++ -shared file1.o file2.o ... filen.o -o libvector.so
Then, when you build your own code, your final link looks like this:
g++ myfile.cpp libvector.so -o mytask
You will need libvector.so to be in the loader's path for your executable to work.
In any case, as long as the 50,000 files don't change, you will only need to do the last command (which will be much faster).
You can build each object file from a .cpp while having the .h files contain lots (and I MEAN LOTS) of forward declarations, so that when you change a .h file it does not force a recompile of the rest of the program. Usually a function/method only needs the name of a class for its parameters or return type; if it needs other details, then yes, the full header needs to be included.
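A sketch of the idea (the names are hypothetical): the header gets away with a forward declaration as long as it only uses pointers or references to the type:

// widget.h -- no #include "gadget.h" needed here
class Gadget;                    // forward declaration is enough...
void process(const Gadget &g);   // ...for reference/pointer parameters

// widget.cpp
#include "widget.h"
#include "gadget.h"              // the full definition is only needed here
void process(const Gadget &g) { /* ... uses Gadget's members ... */ }

Now editing gadget.h forces a recompile of widget.cpp, but not of every file that merely includes widget.h.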
Please get a book by Scott Meyers; it will help you a lot.
Oh, and when trying to eat a big cake, divide it up. The slices are more manageable.
We can't really say how long it will take to compile, but what you should do is compile each .cpp/.h pair into a .o file:
$ g++ -c -o test.o test.cpp ...
Once you have all of these, you compile the main program as so:
$ g++ -c -o main.o main.cpp
$ g++ -o main main.o test.o blah.o otherThings.o foo.o bar.o baz.o etc...
Your idea of using .sos is pretty much asking "how quickly can I crash the program and possibly the OS?". Shared libraries are meant for large libraries in small numbers, not for 50,000 .sos linked into one binary (especially if you load them dynamically... that would be BAD).
I have been learning C++ in school to create small command-line programs.
However, I have only built my projects with IDEs, including VS08 and QtCreator.
I understand the process behind building a project: compile source to object code, then link them into an executable that is platform specific (.exe, .app, etc). I also know most projects also use make to streamline the process of compiling and linking multiple source and header files.
The thing is, although IDEs do all this under the hood, making life very easy, I don't actually know what is happening, and I feel that I need to get accustomed to building projects the "old fashioned way": from the command line, using the toolchain explicitly.
I know what Makefiles are, but not how to write them.
I know what gcc does, but not how to use it.
I know what the linker does, but not how to use it.
What I am looking for is either an explanation, or a link to a tutorial, that explains the workflow for a C++ project, from first writing the code up to running the produced executable.
I would really like to know the what, how, and why of building C++.
(If it makes any difference, I am running Mac OS X, with gcc 4.0.1 and make 3.81)
Thanks!
Compiling
Let's say you want to write a simple 'hello world' application. You have 3 files, hello.cpp, hello-writer.cpp and hello-writer.h, the contents being
// hello-writer.h
void WriteHello(void);
// hello-writer.cpp
#include "hello-writer.h"
#include <iostream>
void WriteHello(void){
std::cout<<"Hello World"<<std::endl;
}
// hello.cpp
#include "hello-writer.h"
int main(int argc, char ** argv){
WriteHello();
}
The *.cpp files are converted to object files by g++, using the commands
g++ -c hello.cpp -o hello.o
g++ -c hello-writer.cpp -o hello-writer.o
The -c flag skips the linking for the moment. To link all the modules together requires running
g++ hello.o hello-writer.o -o hello
creating the program hello. If you need to link in any external libraries you add them to this line, e.g. -lm for the math library. The actual library files look something like libm.a or libm.so; you drop the 'lib' prefix and the suffix when writing the linker flag.
Makefile
To automate the build process you use a makefile, which consists of a series of rules, each listing a thing to create and the files needed to create it. For instance, hello.o depends on hello.cpp and hello-writer.h, so its rule is
hello.o:hello.cpp hello-writer.h
g++ -c hello.cpp -o hello.o # This line must begin with a tab.
If you want to read the make manual, it tells you how to use variables and automatic rules to simplify things. You should be able to just write
hello.o:hello.cpp hello-writer.h
and the rule will be created automagically. The full makefile for the hello example is
all:hello
hello:hello.o hello-writer.o
g++ hello.o hello-writer.o -o hello
hello.o:hello.cpp hello-writer.h
g++ -c hello.cpp -o hello.o
hello-writer.o:hello-writer.cpp hello-writer.h
g++ -c hello-writer.cpp -o hello-writer.o
Remember that indented lines must start with tabs. Note that not all rules need an actual file: the all target just says "create hello". It is common for this to be the first rule in the makefile, since the first rule is the one built automatically when you run make with no arguments.
With all this set up you should then be able to go to a command line and run
$ make
$ ./hello
Hello World
More advanced Makefile stuff
There are also some useful variables that you can define in your makefile, which include
CXX: the C++ compiler
CXXFLAGS: additional flags to pass to the C++ compiler (e.g. include directories with -I)
LDFLAGS: additional flags to pass to the linker
LDLIBS: libraries to link
CC: the C compiler (also used to link)
CPPFLAGS: preprocessor flags
Define variables using =, add to variables using +=.
The default rule to convert a .cpp file to a .o file is
$(CXX) $(CXXFLAGS) $(CPPFLAGS) -c $< -o $@
where $< is the first dependency and $@ is the output file. Variables are expanded by enclosing them in $(); this rule will be run with the pattern hello.o:hello.cpp
Similarly the default linker rule is
$(CC) $(LDFLAGS) $^ -o $@ $(LDLIBS)
where $^ is all of the prerequisites. This rule will be run with the pattern hello:hello.o hello-writer.o. Note that this uses the C compiler; if you don't want to override this rule and are using C++, add the library -lstdc++ to LDLIBS with the line
LDLIBS+=-lstdc++
in the makefile.
Finally, if you don't list the dependencies of a .o file, make can work out how to build it from the matching .cpp by itself, so a minimal makefile might be
LDLIBS=-lstdc++
all:hello
hello:hello.o hello-writer.o
Note that this ignores the dependency of the two files on hello-writer.h, so if the header is modified the program won't be rebuilt. If you're interested, check the -MD flag in the gcc docs for how to generate this dependency automatically.
Final makefile
A reasonable final makefile would be
# Makefile
CC=gcc
CXX=g++
CXXFLAGS+=-Wall -Wextra -Werror
CXXFLAGS+=-Ipath/to/headers
LDLIBS+=-lstdc++ # You could instead use CC = $(CXX) for the same effect
# (watch out for c code though!)
all:hello # default target
hello:hello.o hello-writer.o # linker
hello.o:hello.cpp hello-writer.h # compile a module
hello-writer.o:hello-writer.cpp hello-writer.h # compile another module
	$(CXX) $(CXXFLAGS) -c $< -o $@ # command to run (same as the default rule)
# expands to g++ -Wall ... -c hello-writer.cpp -o hello-writer.o
A simple example is often useful to show the basic procedure, so:
Sample gcc usage to compile C++ files:
$ g++ -c file1.cpp # compile object files
[...]
$ g++ -c file2.cpp
[...]
$ g++ -o program file1.o file2.o # link program
[...]
$ ./program # run program
To use make to do this build, the following Makefile could be used:
# main target, with dependencies, followed by build command (indented with <tab>)
program: file1.o file2.o
g++ -o program file1.o file2.o
# rules for object files, with dependencies and build commands
file1.o: file1.cpp file1.h
g++ -c file1.cpp
file2.o: file2.cpp file2.h file1.h
g++ -c file2.cpp
Sample Makefile usage:
$ make # build it
[...]
$ ./program # run it
For all the details you can look at the Gnu make manual and GCC's documentation.
I know what Makefiles are, but not how to write them.
The make syntax is horrible, but the GNU make docs aren't bad. The main syntax is:
<target> : <dependency> <dependency> <dep...>
<tab> <command>
<tab> <command>
Which defines commands to build the target from the given dependencies.
Reading docs and examples is probably how most people learn makefiles, as there are many flavors of make with their own slight differences. Download some projects (pick something known to work on your system, so you can actually try it out), look at the build system, and see how they work.
You should also try writing a simple make yourself (strip out a bunch of the harder features for your first version); I think this is one case where doing so will give you a much better grasp of the situation.
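A toy version in shell (a sketch assuming one .o per .cpp; the -nt timestamp test is a common shell extension) already captures the core idea, rebuild only when the source is newer than its object file:

#!/bin/sh
# naive "make": rebuild a .o only if it is missing or older than its .cpp
for src in *.cpp; do
    obj="${src%.cpp}.o"
    if [ ! -e "$obj" ] || [ "$src" -nt "$obj" ]; then
        echo "compiling $src"
        g++ -c "$src" -o "$obj"
    fi
done
g++ *.o -o program

Real make generalizes exactly this timestamp comparison to arbitrary dependency graphs.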
I know what gcc does, but not how to use it.
Again, man g++, the info pages, and other documentation are useful, but the main use when you call it directly (instead of through a build system) will be:
g++ file.cpp -o name # to compile and link
g++ file.cpp other.cpp -o name # to compile multiple files and link as "name"
You can also write your own shell script (below is my ~/bin/c++ simplified) to incorporate $CXXFLAGS so you won't forget:
#!/bin/sh
g++ $CXXFLAGS "$@"
You can include any other options as well. Now you can set that environment variable ($CXXFLAGS, the conventional variable for C++ flags) in your .bashrc or similar, or redefine it in a particular session, for working without a makefile (which make also handles just fine).
Also use the -v flag to see details on what g++ does, including...
I know what the linker does, but not how to use it.
The linker is what takes the object files and links them, as I'm sure you know, but g++ -v will show you the exact command it uses. Compare gcc -v file.cpp (gcc can work with C++ files) and g++ -v file.cpp to see the difference in linker commands that often causes the first to fail, for example. Make also shows the commands as it runs them by default.
You are better off not using the linker directly, because it is much simpler to use either gcc or g++ and give them specific linker options if required.
Just to throw this out there, the complete gcc documentation can be found here: http://www.delorie.com/gnu/docs/gcc/gcc_toc.html
A compiler takes a .cpp file and turns it into an object file, which contains native code and some information about that native code.
A linker takes the object files and lays out an executable using that extra information: it finds all the references to the same things and links them up, producing an image the operating system knows how to load into memory.
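You can peek at that extra information yourself with the nm tool (here reusing the hello example from another answer on this page):

$ g++ -c hello-writer.cpp
$ nm -C hello-writer.o    # 'T' = symbol defined here, 'U' = needed from elsewhere

The linker's job is essentially to match every U entry in one object file against a T (or a library symbol) defined somewhere else.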
check out object file formats to get a better understanding of what the compiler produces
http://en.wikipedia.org/wiki/Object_file (different compilers use different formats)
also check out (for gcc)
http://pages.cs.wisc.edu/~beechung/ref/gcc-intro.html on what you type at the command line
You might also look into Autoproject, which sets up automake and autoconf files and makes it easier for people to compile your packages on different platforms: http://packages.debian.org/unstable/devel/autoproject
I like this quirky intro to building a hello world program with gcc; it's Linux-based but the command-line material should work fine on OS X. In particular, it walks you through making some common mistakes and seeing the error messages.
Holy Compilers, Robin, the darn thing worked!
This is what helped me learn autoconf, automake, etc.:
http://www.bioinf.uni-freiburg.de/~mmann/HowTo/automake.html
It is a nice tutorial that progresses from a simple hello world to more advanced structures with libraries etc.