make SCons compile everything in one gcc line? - c++

I have a rather complex SCons script that compiles a big C++ project.
This gcc manual page says:
The compiler performs optimization based on the knowledge it has of the program. Compiling multiple files at once to a single output file mode allows the compiler to use information gained from all of the files when compiling each of them.
So it's better to give all my files to a single g++ invocation and let it drive the compilation however it pleases.
But SCons does not do this. It calls g++ separately for every single C++ file in the project and then links them using ld.
Is there a way to make SCons do this?

The main reason to have a build system with the ability to express dependencies is to support some kind of conditional/incremental build. Otherwise you might as well just use a script with the one command you need.
That being said, the effect of having gcc/g++ optimize as the manual describes is substantial, in particular if you use C++ templates often. It is good for run-time performance but bad for recompile performance.
I suggest you try and make your own builder doing what you need. Here is another question with an inspirational answer: SCons custom builder - build with multiple files and output one file
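One way such a custom builder could work is to generate a single "unity" source file that #includes every C++ file, and then pass only that file to g++. The following is a minimal sketch in plain Python (the language SConstruct files are written in). The helper name unity_source_text is hypothetical and not part of SCons, and wiring it into an actual SCons Builder is left out.

```python
def unity_source_text(sources):
    """Return the text of a C++ file that #includes every file in `sources`.

    Hypothetical helper for illustration only; it is not an SCons API.
    """
    lines = ["// Generated unity file: all sources become one translation unit."]
    for src in sources:
        # Use forward slashes so the #include also works for Windows-style paths.
        lines.append('#include "%s"' % src.replace("\\", "/"))
    return "\n".join(lines) + "\n"
```

A builder action could write this text to a file such as unity.cpp and compile only that file. Note that file-local names (static functions, anonymous namespaces) from different sources end up in the same translation unit and can collide.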

Currently the answer is no.
Logic similar to this was developed for MSVC only.
You can see this in the man page (http://scons.org/doc/production/HTML/scons-man.html) as follows:
MSVC_BATCH When set to any true value, specifies that SCons should
batch compilation of object files when calling the Microsoft Visual
C/C++ compiler. All compilations of source files from the same source
directory that generate target files in a same output directory and
were configured in SCons using the same construction environment will
be built in a single call to the compiler. Only source files that have
changed since their object files were built will be passed to each
compiler invocation (via the $CHANGED_SOURCES construction variable).
Any compilations where the object (target) file base name (minus the
.obj) does not match the source file base name will be compiled
separately.
As always patches are welcome to add this in a more general fashion.

In general this should be left up to the program developer. Compiling everything together as an amalgamation may introduce unintended behaviour into the program, if it even compiles in the first place. Your best bet, if you want this kind of optimisation without editing the source yourself, is to use a compiler with interprocedural optimisation, like icc -ipo.
An example where an amalgamation of two .c files would not compile is when both define a static symbol with the same name but different functionality.

Related

Is there anything like a forwarding C++ preprocessor, that could be used by GCC?

I've been searching around for different custom pre-processor extensions and replacements, but all of them seem to come with 1 of 2 caveats:
Either 1) you generate the code in a separate build system, then manually put the output into your real (CMake) build system, or 2) you end up losing the builtin preprocessor for GCC.
Is there really no tool that can, say, run each file it gets against some configured script, then through cpp, then pass the result to gcc?
I'd love to use something like Cog by just setting an environment variable for gcc, indicating a tool that runs Cog first and then the standard preprocessor.
Alternatively, is there a straightforward way to accomplish that in CMake, itself? I don't want to have to write a custom script for each file, especially if I have to then hard-code the compiler/preprocessor flags in each target.
edit: For clarity, I am aware of several partial/partially-applicable solutions. For example, how to tell GCC to use a different preprocessor. (Or really, to look in a different place for its own preprocessor, cc1. See: Custom gcc preprocessor) However, that leaves a lot of work to do, to modify files, and then correctly invoke the real cc1, with the correct original arguments.
Since that is effectively a constant/generic problem, I'm just surprised there is no drop in program.
Edit 2: After looking over several proposed solutions, I am not convinced there is an answer to this question. For example, if files are going to be generated by CMake, then they can't be included and browsed by the IDE - due to not yet existing.
As ridiculous as it sounds, I don't think there is any way to extend the preprocessor short of forking GCC. Everything recommended so far constitutes an incomplete hack.
GCC (the C++ compiler) is made for compiling C++ programs. As the C++ preprocessor is specified by the C++ standard, there is usually no need for anything like a "plugin" or "extension" there.
Don't listen to the comments that suggest using some exotic extension to CMake or changing the source code of GCC. Running source files through a different program (Cog in your case) before compiling is a well-known task, and all major build systems support it out of the box.
In CMake you can use the add_custom_command function. If you need this for more than one file, you could use a CMake loop like e.g. suggested in this answer.

In C or C++, does the compiler do implicit linking?

How does some std-lib, external-libs or any other pre-compiled src code such as the well-known header file <iostream> with its corresponding object file or static/dll lib get linked into my own application automatically? Does the compiler do it implicitly/under-the-hood or something like a compiler pre-linked list operation?
If so, how can we use that mechanism ourselves? Is there a way to add my own obj, dll, static-lib or src file to that implicit list by writing some special syntax, without moving the files from their original directories and without relying on IDE configuration or outside software? The goal is to drop the explicit linking step at the terminal and do this configuration inside the source code.
Does every std-lib contain some direct/inline special source code that performs this operation? If so, how do we take advantage of it? Or, if it is all handled generically by the compiler, then it could be modified, but the dilemma is that the behaviour is fixed in the compiler and I would hate to forcibly modify or fork it. Is there already a way to do this without tinkering with the compiler, purely from your own source code at write time? This goes back to the first question of this paragraph: "Does every std-lib/external-lib contain some direct/inline special source code that performs this operation?"
// a.cpp
#include<iostream>
// there's no linking on iostream obj, src, dll, static-lib file
// love to have this kind of special features to our own none-std-lib/etc.
>c/cpp-compiler -c a.cpp
>c/cpp-compiler -o a a.o
Note: some of my terminology is based on my own experience, so watch out and be open-minded. As I have grown in the coding community, I have found that standard terminology is a mess, especially when moving between low-level and high-level programming languages.
It depends on what you call "the compiler".
Most modern toolchains - including gcc, clang, Visual C++ - are based on a "compile then link" model, with several components. Those components include a preprocessor (which does text substitution on C or C++ source code, to produce some modified source code), a "compiler" that translates preprocessed source code into object files, utility programs that produce libraries from sets of object files, a linker that produces an executable file from a set of object files and libraries, and - last but by no means least - a driver program that coordinates execution of the other components.
The specifics are different between toolchains - e.g. VC++ does things quite differently than gcc/g++ or clang. The concepts are similar.
In what follows, I'll give a very oversimplified (imprecise, details omitted) discussion of what gcc and g++ (in the GNU Compiler Collection) do.
When you use gcc or g++ at the command line you're actually using a driver program, that orchestrates execution of a bunch of other programs (the preprocessor, the compiler, the linker, etc). Depending on what options you provide, the result produced differs. For example, gcc -E only completes preprocessing of source files, g++ -c means the process stops after compiling source files to produce object files. If used to produce an executable, the driver program will use the linker to (well!) link object files and libraries together to produce an executable.
So, if you think of gcc or g++ (the program you execute directly) as the compiler then you could claim the compiler does implicit linking. When being used to create an executable, both execute the linker - and provide it information needed (e.g. names of libraries). gcc automatically links in libraries needed by C programs (e.g. the C standard library) while g++ automatically links in libraries needed by C++ programs (e.g. parts of the C++ standard library as well as the C standard library).
However, if you take a narrower view of the compiler - it is the program that only translates source files into object files - then there is no implicit linking of libraries by the compiler. It is the driver program that orchestrates compiling and linking, not the compiler that orchestrates linking.
If you read documentation for your favourite toolchain, it will describe various means (extensions of source files, settings, command line options, values of environment variables, etc) to control what it does. There is typically flexibility to do preprocessing only, compilation only, output assembler, linking only, or a complete "compile multiple source files then link them together to produce an executable" process.
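To make the "compile only" and "link only" steps concrete, here is a small sketch in Python that builds the command lines a driver such as g++ would be given for an explicit compile-then-link build. The function name separate_build_commands is made up for this illustration; only the -c and -o options are real driver options.

```python
import os


def separate_build_commands(sources, output, driver="g++"):
    """Return the command lines for an explicit compile-then-link build.

    Hypothetical helper for illustration; it only builds the argument
    lists and does not run anything.
    """
    commands = []
    objects = []
    for src in sources:
        obj = os.path.splitext(src)[0] + ".o"
        # -c makes the driver stop after compilation and emit an object file.
        commands.append([driver, "-c", src, "-o", obj])
        objects.append(obj)
    # Without -c the driver runs the linker and adds its default libraries.
    commands.append([driver, "-o", output] + objects)
    return commands
```

Each returned list could be passed to subprocess.run to actually execute the step. The last command never names the standard library: the driver supplies it to the linker on its own.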
The linker searches a list of directories for libraries, and the standard library directories are on that list by default.
There are some default libraries that get linked automatically, like glibc.
This way you don't need to tell the linker to link with the standard libraries.
GCC even has flags for not linking with some standard libraries (for example -nostdlib and -nodefaultlibs).
https://docs.oracle.com/cd/E19205-01/819-5262/auto29/index.html
Note that, while it is not standard, Microsoft's Visual C++ has a #pragma-based language extension that allows specifying files to link in the source:
#pragma comment(lib, "yourfile.lib") // or yourfile.obj
The comment pragma can also be used to specify a few other linker command line options, for example:
#pragma comment(linker,"\"/manifestdependency:type='win32' name='Microsoft.Windows.Common-Controls' version='6.0.0.0' processorArchitecture='*' publicKeyToken='6595b64144ccf1df' language='*'\"")
Note that the list of linker options that can be specified this way is fairly limited and that while there are a few other legal 'comment' types only lib and linker really have meaning.
In C or C++, does the compiler do implicit linking?
As the "compiler" (understood as the whole group of tools in the chain that generate the final executable) has whole control over creating that final executable, it does everything related to every stage of compilation, including implicit linking.
How does some std-lib, external-libs or any other pre-compiled src code such as the well-known header file with its corresponding object file or static/dll lib get linked into my own application automatically?
The same as any other library is linked - linker searches the library for symbols and uses them.
Does the compiler do it implicitly/under-the-hood or something like a compiler pre-linked list operation?
Yes (for the compilers I worked with).
But it's very specific to the compiler. From the point of C++ language, there is no requirement on compiler command line options. If the compiler -needs-this-option-to-link-with-standard-library, it's fine and specific to that compiler. It's a quality of implementation issue. Surely users would want some things to be done implicitly with sane defaults for that compiler.
how do we use its functionality in our accord, Is there a way to put my own obj/dll/static-lib/src file into that ideal list via writing some special syntax without changing the initial directories of each of i
Because the compiler does it implicitly, you have to modify the compiler. How to do that depends strongly on the compiler and the system, since each has its own very specific configuration and build settings.
For example, on Linux with gcc you can use the method in Enable AddressSanitizer by default in gcc. You can also use the method in Custom gcc preprocessor, but override the collect2 stage instead.

How can I configure cmake to compile a file twice with two different compilers?

I'm adding a SYCL/OpenCL kernel to a parallel C++ program which is built with cmake. Using SYCL means I need to get cmake to compile my C++ source file twice: once with the SYCL compiler, and once with the project's default compiler, which is GCC. Both compilations produce outputs which need to be included when linking.
I'm completely new to cmake. I've added the GCC compile and link steps to the project's CMakeLists.txt, but what's the best way to add the SYCL compile step? I'm currently trying the "add_custom_command" option with "PRE_BUILD", but the command which is run doesn't seem to know about the paths which are provided to the normal compile and link steps: the current working directory, include directories, source directories, etc. I'm having to specify all of these manually, and I'm having to figure some of them out first.
It feels like I'm doing this the hard way. Is there a recommended (or at least better) way to get cmake to compile a file twice with two different compilers?
Also, there used to be a SYCL tag, but it's disappeared. Can someone recreate it, please?
Be aware that PRE_BUILD only works as PRE_BUILD with the Visual Studio generators (Visual Studio 7 or later); for other generators it is treated as PRE_LINK.
If you need to use two compilers on the same source file, just add a dependency from the GCC compile and link target to the custom target you are using, so that GCC is executed after the SYCL compiler.
I can think of a couple of other ways to do it:
1. Generate two build configurations
2. Write a script to call both compilers
The first method is probably the easiest. You might need to maintain two separate CMakeLists.txt files, or possibly just parameterize the compiler and options and pass them as arguments to CMake when you generate (CC=gcc, CXX=g++, CFLAGS/CXXFLAGS, etc...). You might be able to do the same with the underlying build system (e.g. make) and just run it twice.
The second method is a bit more complicated. Write a simple script that accepts both sets of compiler options and compiles each file using the compilers in sequence. The script could then be configured as CC/CXX.
So, the command options would look something like this...
--cc1 sycl --cc2 gcc --cc1opts ... --cc2opts ...
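A rough sketch of the argument handling for such a wrapper script, in Python, might look like the following. The option names come from the command line above. Everything else is an assumption of this sketch: the function name build_command_lines is made up, and so is the convention that everything after a literal "--" (for example the source file) is passed to both compilers.

```python
WRAPPER_KEYS = ("--cc1", "--cc2", "--cc1opts", "--cc2opts")


def build_command_lines(argv):
    """Split the wrapper's arguments into one command line per compiler.

    Hypothetical helper: tokens after a literal "--" are shared by both
    compilers, which is a convention invented for this sketch.
    """
    parsed = {key: [] for key in WRAPPER_KEYS}
    shared = []
    current = None
    for i, token in enumerate(argv):
        if token == "--":
            # Everything after "--" goes to both compilers unchanged.
            shared = list(argv[i + 1:])
            break
        if token in WRAPPER_KEYS:
            current = token
        elif current is None:
            raise ValueError("unexpected argument: %s" % token)
        else:
            parsed[current].append(token)
    if len(parsed["--cc1"]) != 1 or len(parsed["--cc2"]) != 1:
        raise ValueError("--cc1 and --cc2 each need exactly one compiler")
    first = parsed["--cc1"] + parsed["--cc1opts"] + shared
    second = parsed["--cc2"] + parsed["--cc2opts"] + shared
    return first, second
```

A real wrapper would then run the two command lines in sequence, for example with subprocess.run, and would have to give the two compilers distinct output file names.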
I'm not familiar with SYCL though, so I don't know how it's normally used.

C++ Compile on different platforms

I am currently developing a C++ command line utility to be distributed as an open-source utility on Github. However, I want people who download the program to be able to easily compile and run the program on any platform (specifically Mac, Linux, and Windows) in as few steps as possible. Assuming only small changes have to be made to the code to make it compatible with the various platform-independent C++ compilers (g++ and win32), how can I do this? Are makefiles relevant?
My advice is: do not use makefiles. Maintaining them for big enough projects is tedious, and errors sometimes happen which you don't catch immediately (because the *.o file is still there).
See this question here
Makefiles are indeed highly relevant. You may find that you need (at least) two different makefiles to compensate for the fact that you have different compilers.
It's hard to be specific about how you solve this, since it depends on how complex the project is. It may be easiest to write a script/batch file and just document "Use the command build.sh on Linux/Unix, and build.bat on Windows", and then let the respective files deal with, for example, setting up the name of the compiler and flags, etc.
Or you can have an include into the makefile, which is determined by the architecture. Or different makefiles.
If the project is REALLY simple, it may be enough to provide a basic makefile, but that's unlikely: compiling x.cpp on Linux/MacOS produces an object file called x.o, while on Windows the object file is called x.obj. Libraries have different names, DLLs have different names, and on Linux/MacOS the final executable (typically) has no extension, so it's called "myprog", whereas the executable under Windows is called "myprog.exe".
These sorts of differences mean that the makefile needs to be different.
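The naming differences described above can be captured in a few lines of a build script. This is only a sketch in Python: the function artifact_names is a made-up name, and the .obj extension reflects the Visual C++ convention mentioned above (a g++ port on Windows may still produce .o files).

```python
import os
import sys


def artifact_names(source, program, windows=None):
    """Return (object_file, executable) names for the given platform.

    Hypothetical helper for illustration; `windows` defaults to detecting
    the platform the script is running on.
    """
    if windows is None:
        windows = sys.platform.startswith("win")
    stem = os.path.splitext(source)[0]
    if windows:
        return stem + ".obj", program + ".exe"
    return stem + ".o", program
```

A build script or makefile generator could call this once and use the returned names everywhere, instead of hard-coding the extensions per platform.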

Optimization: .cpp or .obj/.o or .lib/.a

I have this chunk of code that could be placed in a separate library, but I'm unsure how that will affect the compiler's ability to optimize.
Option 1: include the code directly in the projects and compile it together with everything else.
Option 2: build the .obj/.o files and simply use them when building the projects.
Option 3: create a static library (.lib or .a) and link with that when building the projects.
Now, my question is: which of these will give the best performance? If you could discuss/explain the consequences of each of the options with regard to compiler optimization that would be super awesome!
Thanks in advance :-)
There should be no difference in performance:
An .a file is simply an archive of .o files. They are treated the same by the linker (except that .a files need to be unpacked first).
Directly compiling all sources together will still result in all compilation units be compiled separately, and subsequently linked together. It’s just that the compiler hides this and calls the linker behind your back. Nevertheless, the work is the same as when first compiling the compilation units separately and then linking them together in an explicit step.
There's no difference in the optimization a compiler can do. In every case, the object can be built with as much or as little optimization as you want.
The only difference you might see is when you build a shared library. Then you have a call overhead, which you do not have when linking the objects or a static library directly into the executable.
If by Option 1 you mean #include the code via header files, then the compiler may be able to optimise slightly better than linking multiple objects together, as in Options 2 and 3. This is because the compiler can see the entire source code, rather than just the object code, and may be able to inline functions.
There is no difference between Options 2 and 3, as an archive file - *.a - is just a collection of object files - *.o.
All this being said, The Architecture of Open Source Applications: LLVM implies that you can build LLVM IR code objects, which when linked can be optimised properly, including inlining of functions. So, if you are using clang++, this may be an option.