I read that #include <file> will copy-paste “file” into our source file via the C++ preprocessor before compilation.
Does this mean that the “file” (iostream) will also be compiled again and again, every time we compile our source file?
Also, after the preprocessor does its job, will the size of the intermediate file be the size of “file” plus the size of the source file?
I read that #include <file> will copy-paste “file” into our source file via the C++ preprocessor before compilation.
Yes. The data that the compiler proper sees will consist of the data in file and the data in your source file. (Actually, real compilers these days tend to merge the C preprocessor and the compiler front end and the compiler back end into a single program - but that is pretty much a detail.)
Does this mean that the “file” (iostream) will also be compiled again and again, every time we compile our source file?
Yes. Some compilers have a feature called "pre-compiled headers" which allow you to compile a bunch of headers once, and then use the output of that multiple times. How you do this varies from compiler to compiler; if you need portability, don't worry about it (it doesn't make that big a difference to compile times).
Also, after the preprocessor does its job, will the size of the intermediate file be the size of “file” plus the size of the source file?
No. The size of the output file is only very weakly related to the size of the source file. For example #include <iostream> defines many, many inline functions. Any particular program will only use a very few of those - so they will be omitted from the compiler output.
Comments (which use space in the source file) don't appear in the output.
On the other hand, if you write a complex template, and then instantiate it for several different types, then the output will contain a different copy of the template for each type, and may be quite a bit larger than the input.
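For instance, a minimal sketch (the template and names here are made up purely for illustration):

#include <iostream>

// A trivial function template; the compiler emits one copy of its code
// per distinct type it is actually instantiated with.
template <typename T>
T triple(T value) {
    return value + value + value;
}

int main() {
    std::cout << triple(7) << '\n';    // instantiates triple<int>
    std::cout << triple(2.5) << '\n';  // instantiates triple<double>
}

Compiling this emits object code for both triple<int> and triple<double>, while a program that never called triple would contain neither.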
Regarding the size of the intermediate file: yes, it will increase. You can check this by writing a simple hello world program and then compiling it as follows (on Linux):
g++ name.cpp -o name --save-temps
This will store intermediate files, specifically:
"name.ii" (Preprocessed code after including <iostream>
"name.s" (Assembly version of the code)
"name.o" (Object code)
Check the difference in file sizes using:
ls -l name.cpp name.ii
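A minimal hello world to try this with (just a sketch; any small program works):

// name.cpp
#include <iostream>

int main() {
    std::cout << "Hello, world!\n";
    return 0;
}

You will see that name.ii is vastly larger than name.cpp (typically hundreds of kilobytes or more, depending on the library version), because all of <iostream> and its dependencies have been pasted in.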
No. Libraries (headers) don't increase the size of your file, because the compiler doesn't add the whole header into your code. It only adds the things your code actually uses.
No, it doesn't increase program size. The file iostream and many other header files contain few compilable statements; they mostly contain definitions required for your program.
If you look at a preprocessed C/C++ file, you will see thousands of definitions are added at the beginning of the file.
You can try it with the cpp tool; run it with options just like normal g++/gcc and look at the output.
cpp -I /usr/include/ -I. file.cpp -o file.preprocessed
It contains just the headers and definitions; they don't increase your final program size.
Edit:
As Martin said, headers do contain inline functions which are compiled each time, but they don't increase your program size unless you use them.
I read that #include <file> will copy-paste “file” into our source file via the C++ preprocessor before compilation.
It's a little more subtle than that. #include "file" does what you describe. #include <file> pulls in a header, which is not required to be a file at all. The idea is that the compiler can have its own internal version of that header, already in binary form, so it doesn't have to read and parse a file. (The fallback actually runs the other way: if #include "file" doesn't find a file, the directive is reprocessed as if it had been #include <file>.)
In practice, I don't know of a compiler that takes advantage of this leeway; all headers are, in fact, files that get compiled over and over again. But, as others have said, that in itself does not mean that the executable file becomes larger. Ideally, if you don't use it, it doesn't get into the executable, although some systems are better about this than others.
I'm writing a programming language that converts its source files to C++ and compiles them.
I want to add a way to work with a large number of files, compiling them to .o files, so that it is possible to use makefiles. A better explanation (thanks to #Beta):
You have a tool that reads a source file (foo.FN) and writes C++ source and header files (foo.cpp and foo.h). Then a compiler (gcc) reads those source and header files (foo.cpp and foo.h) and writes an object file (foo.o). And maybe there are interdependencies (bar.cpp needs foo.h).
The problem is: my interpreter deletes the .cpp and .h files after GCC compiles them. Because of this, it can't use #include, since by the time the next file is compiled, the referenced files don't exist anymore. How can I solve this?
There are two parts to the answer.
First, don't write explicit header files. You know what they should contain; just perform the #include operation yourself.
Secondly, don't write out the .cpp file either. Use gcc -x c++ - to read the code from standard input, and have your tool emit C++ to standard out, so you can run tool foo.FN | gcc -c -o foo.o -x c++ - to produce foo.o.
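For illustration, a hedged sketch of the idea (this stand-in tool.cpp is hypothetical; a real translator would generate its output from foo.FN):

// tool.cpp - emits the generated C++ on standard output instead of writing foo.cpp
#include <iostream>

int main() {
    // In the real translator, this text would be produced from the .FN source.
    std::cout << "#include <iostream>\n"
                 "int main() { std::cout << \"generated\\n\"; }\n";
}

Then ./tool | gcc -c -o foo.o -x c++ - produces foo.o without any .cpp or .h file ever touching the disk.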
I have two questions about the C++ precompiled headers feature.
1. What actually happens when you make a .gch file (using GCC)? What does it contain?
2. Why are those files so huge, while the final executable is so small?
When you precompile a header, it all begins like a usual compilation:
The preprocessor is run, including any dependent headers and performing macro substitution
The resulting source code is handed over to the compiler, which parses it and validates the syntax
Then the compiler produces the AST, the data structure which encodes the semantics of the code
Usually, this is done on .cpp files, and goes on afterwards to actually compile the AST and generate executable code. However, a header precompilation stops there, and the compiler dumps the AST inside the .gch file.
On further uses of this precompiled header, the compiler can then directly load the AST back from file and pick it up from there, skipping the costly processing listed above.
The .gch file is huge, because it contains a lot of information that was implicit in the original header. But it has no relation to the size of the final executable -- compiling with and without precompiled headers should produce the exact same result.
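As a hedged sketch of how you might try this with GCC (the file name pch.h is just an example):

// pch.h - a project-wide header you want to precompile
#include <iostream>
#include <string>
#include <vector>

// Precompile it once:
//   g++ -x c++-header pch.h -o pch.h.gch
// Afterwards, any translation unit that does #include "pch.h" will pick up
// pch.h.gch automatically (GCC looks for the .gch next to the header) and
// skip re-parsing those headers.

Compare the size of pch.h.gch with the final executable of a program that includes it: the .gch is typically many megabytes, while the executable is unaffected.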
Would it be optimizable in a large program if one .cpp file loaded all the headers the application needs, rather than including them directly in the main source file?
Like instead of having
Main.cpp
#include <header.h>
#include <header1.h>
#include <header2.h>
#include <header3.h>
//main code
Can I just have a .cpp file that does this, and then include that .cpp file in main.cpp? Like this:
Loader.cpp
#include <header.h>
#include <header1.h>
#include <header2.h>
#include <header3.h>
Main.cpp
#include "Loader.cpp"
//main code
Preprocessing simply generates the text that gets compiled. Your suggestion leads to the same body of source code, so it will have no effect on optimization.
Including all the headers, all the time (call it a "super-header") may lead to slow compilation.
However, precompiled headers are a popular solution to allow such super-headers to work quickly. You might check your IDE or compiler's documentation to learn its precompiled header facility.
In any case, the super-header is still typically named with .h; since it implements nothing, a .cpp name would not be appropriate.
You can, but you may want to reconsider.
You may have trouble with accidentally trying to compile Loader.cpp by itself, since you've named it as if it were a source file. Since you're using it as a header - and it is a concatenation of multiple headers - it would make sense to name it according to convention and use the .h file name extension.
Would it be optimizable
It would have zero effect on the compiled program, so in that sense optimization is irrelevant.
This will bring problems with compilation speed, however, and those problems are not "optimizable". Copying every header into the source file - needed or not - slows down compilation of that source file (and the slowdown is bound to hard drive speed). A much bigger problem is that it prevents incremental building, because a change in any header file forces the source file to be recompiled, since its content will have changed.
These problems are compounded over multiple source files, assuming you intend to use this "technique" with all the source files that you have. Having to recompile the entire project whenever any header is modified is usually not acceptable for any project that is not trivially small.
It really depends on the compiler.
Visual Studio has a thing called stdafx.h.
What's the use for "stdafx.h" in Visual Studio?
I was reading about Clang and Ch (C++ interpreters), but it's not clear to me: is it possible to run a newly generated .cpp file without any installations? I need to run the final program on any PC...
P.S. If yes, does anyone have a good example where a .cpp file is executed from within C++ code?
This is probably impossible or at least very hard. You would have to include the whole compiler (including linker, assembler, optimizer, preprocessor, ...) inside your program and that would make it extremely big.
One way of doing this is with Clang (as you already noted), there is even a demo project called "Clang interpreter" in the source: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
However, I once tried to compile this "beast" into my program and gave up halfway, because the file size of the resulting binary (or binaries with external libraries) gets into tens of megabytes (maybe even a hundred).
My suggestion is to instead produce a script (e.g. a bash/sh script, which you could execute on any Unix machine) that can be interpreted easily.
As far as I know, it is impossible, because the compilation process of a .cpp file goes like this:
Preprocessing: the preprocessor takes a C++ source code file and deals with the #includes, #defines and other preprocessor directives. The output of this step is a "pure" C++ file without pre-processor directives.
Compilation: the compiler takes the pre-processor's output and produces an object file from it.
Linking: the linker takes the object files produced by the compiler and produces either a library or an executable file.
So, there should be intermediate files and executable files.
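To see those stages yourself with GCC (just an illustrative sketch, using a trivial file):

// hello.cpp
#include <iostream>
int main() { std::cout << "hi\n"; }

// Run each stage separately:
//   g++ -E hello.cpp -o hello.ii   (preprocessing only)
//   g++ -c hello.ii -o hello.o     (compilation to an object file)
//   g++ hello.o -o hello           (linking into an executable)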
More can be found here:
https://stackoverflow.com/a/6264256/7725220
Kind of depends on what you mean by "installations".
Yes you can distribute your program with a full compiler, compile the source code and then execute the final result (all from the original exe).
When I include some function from a header file in a C++ program, does the entire header file's code get copied into the final executable, or is machine code generated only for the specific function I use? For example, if I call std::sort from the <algorithm> header in C++, is machine code generated only for the sort() function, or for the entire <algorithm> header file?
I think that a similar question exists somewhere on Stack Overflow, but I have tried my best to find it (I glanced over it once, but lost the link). If you can point me to that, it would be wonderful.
You're mixing two distinct issues here:
Header files, handled by the preprocessor
Selective linking of code by the C++ linker
Header files
These are simply copied verbatim by the preprocessor into the place that includes them. All the code of algorithm is copied into the .cpp file when you #include <algorithm>.
Selective linking
Most modern linkers won't link in functions that aren't getting called in your application. I.e. write a function foo and never call it - its code won't get into the executable. So if you #include <algorithm> and only use sort here's what happens:
The preprocessor shoves the whole algorithm file into your source file
You call only sort
The linker analyzes this and only adds the code of sort (and any functions it calls) to the executable. The other algorithms' code doesn't get added
That said, C++ templates complicate the matter a bit further. It's a complex issue to explain here, but in a nutshell - templates get expanded by the compiler for all the types that you're actually using. So if you have a vector of int and a vector of string, the compiler will generate two copies of the whole code for the vector class in your program. Since you are using it (otherwise the compiler wouldn't generate it), the linker also places it into the executable.
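A minimal sketch of that last point:

#include <string>
#include <vector>

int main() {
    std::vector<int> a{3, 1, 2};                // instantiates std::vector<int>
    std::vector<std::string> b{"x", "y", "z"};  // instantiates std::vector<std::string>
    return static_cast<int>(a.size() + b.size());
}

The object code will contain two separate copies of the vector machinery, one per instantiated type, while the many other templates declared in those headers are never emitted because they are never used.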
In fact, the entire file is copied into the .cpp file, and it depends on the compiler/linker whether it picks up only the 'needed' functions or all of them.
In general, simplified summary:
a debug configuration typically compiles in all non-template functions,
a release configuration strips all unneeded functions.
Plus it depends on attributes: a function declared for export will never be stripped.
On the other hand, template function variants are 'generated' when used, so only the ones you explicitly use are compiled in.
EDIT: header file code isn't generated, but in most cases hand-written.
If you #include a header file in your source code, it acts as if the text in that header was written in place of the #include preprocessor directive.
Generally headers contain declarations, i.e. information about what's inside a library. This way the compiler allows you to call things for which the code exists outside the current compilation unit (e.g. the .cpp file you are including the header from). When the program is linked into an executable that you can run, the linker decides what to include, usually based on what your program actually uses. Libraries may also be linked dynamically, meaning that the executable file does not actually include the library code but the library is linked at runtime.
It depends on the compiler. Most compilers today do flow analysis to prune out uncalled functions. http://en.wikipedia.org/wiki/Data-flow_analysis