In C or C++, does the compiler do implicit linking? - c++

How does some std-lib, external-libs or any other pre-compiled src code such as the well-known header file <iostream> with its corresponding object file or static/dll lib get linked into my own application automatically? Does the compiler do it implicitly/under-the-hood or something like a compiler pre-linked list operation?
If such a case exist how do we use its functionality in our accord, Is there a way to put my own obj, dll, static-lib or src file into that ideal list via writing some special syntax without changing the initial directories of each of it, neither the help of an IDE config and outside-software, the goal is to drop the linking phase explicitly at terminal, want to do this configuration inside of the src-code.
Does every std-lib had a direct/inline special src-code that doing this kind of operation? If there are, then how do we take advantage of it? Or if everything is done by a compiler/handler and if it is generic-type then you could modified it with less problem but the delema is, it is fixed with the compiler and hate to forcefully modified/forked it. If there is alreadly a way to do this without explicitly tinkering it, for such doing it only at your onw src-code/write-time as a said at the first line of this block statement: "Does every std-lib/external-lib had a direct/inline special src-code that doing this kind of operation?".
// a.cpp
#include<iostream>
// there's no linking on iostream obj, src, dll, static-lib file
// love to have this kind of special features to our own none-std-lib/etc.
>c/cpp-compiler -c a.cpp
>c/cpp-compiler -o a a.o
Note: some of my terminologies are base on my own experience so watch out and be open-minded. For as I grow in the coding-community using terminology/standard way of communicating are a mess specially exploring from low to another low and to high to another high level prog-lang.

It depends on what you call "the compiler".
Most modern toolchains - including gcc, clang, Visual C++ - are based on a "compile then link" model, with several components. One of those components is the preprocessor (which does text substitution on C or C++ source code, to produce some modified source code), a "compiler" that translates preprocessed source code into object files, utility programs that produce libraries from sets of source files, a linker that produces an executable file from a set of object files and libraries, and - last but by no means least - a driver program that coordinates execution of other components.
The specifics are different between toolchains - e.g. VC++ does things quite differently than gcc/g++ or clang. The concepts are similar.
In what follows, I'll give a very over-simplistic (imprecise, details omitted) discussion of what gcc and g++ (in the gnu compiler collection) do.
When you use gcc or g++ at the command line you're actually using a driver program, that orchestrates execution of a bunch of other programs (the preprocessor, the compiler, the linker, etc). Depending on what options you provide, the result produced differs. For example, gcc -E only completes preprocessing of source files, g++ -c means the process stops after compiling source files to produce object files. If used to produce an executable, the driver program will use the linker to (well!) link object files and libraries together to produce an executable.
So, if you think of gcc or g++ (the program you execute directly) as the compiler then you could claim the compiler does implicit linking. When being used to create an executable, both execute the linker - and provide it information needed (e.g. names of libraries). gcc automatically links in libraries needed by C programs (e.g. the C standard library) while g++ automatically links in libraries needed by C++ programs (e.g. parts of the C++ standard library as well as the C standard library).
However, if you take a narrower view of the compiler - it is the program that only translates source files into object files - then there is no implicit linking of libraries by the compiler. It is the driver program that orchestrates compiling and linking, not the compiler that orchestrates linking.
If you read documentation for your favourite toolchain, it will describe various means (extensions of source files, settings, command line options, values of environment variables, etc) to control what it does. There is typically flexibility to do preprocessing only, compilation only, output assembler, linking only, or a complete "compile multiple source files then link them together to produce an executable" process.

The linker search libraries in some oreder in which the standard libs folder is searched first.
There are somed default libraries that gets loaded by default like glibc.
this way you dont need to specify to the linker to link with standard libs.
Gcc even have flags for not linking with some standard libs
https://docs.oracle.com/cd/E19205-01/819-5262/auto29/index.html

Note that while it is not standard Microsoft's Visual C++ has a #pragma based language extension that allows specifying files to link in the source:
#pragma comment(lib, "yourfile.lib") // or yourfile.obj
The comment pragma can also be used to specify a few other linker command line options, for example:
#pragma comment(linker,"\"/manifestdependency:type='win32' name='Microsoft.Windows.Common-Controls' version='6.0.0.0' processorArchitecture='' publicKeyToken='6595b64144ccf1df' language=''"")
Note that the list of linker options that can be specified this way is fairly limited and that while there are a few other legal 'comment' types only lib and linker really have meaning.

In C or C++, does the compiler do implicit linking?
As the "compiler" (understood as the whole group of tools in the chain that generate the final executable) has whole control over creating that final executable, it does everything related to every stage of compilation, including implicit linking.
How does some std-lib, external-libs or any other pre-compiled src code such as the well-known header file with its corresponding object file or static/dll lib get linked into my own application automatically?
The same as any other library is linked - linker searches the library for symbols and uses them.
Does the compiler do it implicitly/under-the-hood or something like a compiler pre-linked list operation?
Yes (for the compilers I worked with).
But it's very specific to the compiler. From the point of C++ language, there is no requirement on compiler command line options. If the compiler -needs-this-option-to-link-with-standard-library, it's fine and specific to that compiler. It's a quality of implementation issue. Surely users would want some things to be done implicitly with sane defaults for that compiler.
how do we use its functionality in our accord, Is there a way to put my own obj/dll/static-lib/src file into that ideal list via writing some special syntax without changing the initial directories of each of i
Because the compiler does it implicitly, you have to modify the compiler. That strongly depends on the compiler, and specific system and specific compiler own very specific configuration and build settings.
For example on Linux with gcc you can use the method in Enable AddressSanitizer by default in gcc . You can also use the method in Custom gcc preprocessor but overwrite collect2 stage.

Related

How does gcc decide which libraries to implicitly include?

With reference to this question:
On an embedded project for a small micro I found my compiled code size was much larger than expected. It turned out it was because I had included code that used assert(). The use of assert was appropriate in the included code but caused my compiled code size to almost double.
The question is not around if/when assert should be used but how the compiler/linker decides to include all the necessary overhead for assert.
My original question from the other post:
It would be helpful if someone could explain to me how gcc decides to include library functions when assert is called? I see that assert.h declares an external function __assert_func. How does the linker know to reference it from a library rather than just say "undefined reference to __asert_func"?
When configuring a toolchain, the authors decide which libraries that should be linked to by default.
Often this is includes runtime startup/initializing code and a library named libc which includes an implementation of the C standard, and any other code the authors deem relevant (e.g. libc might also implement Posix, any custom board specific functions etc.) and for embedded targets it's not unusual to also link to a library implementing an RTOS for the target.
You can use the -nodefaultlibs flag to gcc to omit these default libraries at the linking stage.
In the case of assert(), it is a standard C macro/function , which is normally implemented in libc. assert() might print to stdout if it fails, so using assert() could pull in the entire stdio facility that implements FILE* handling/buffering, printf etc., all which is implemented in libc.
You can see the libraries that gcc links to by default if you run gcc -v for the linking stage.
The gcc (or g++) command is simply a driver. It runs other programs, including the compiler proper (cc1 for C code, cc1plus for C++ code) and the assembler and the linker.
What programs are run are determined by the spec file (and there is an implicit one, see the -dumpspecs developer option). BTW, running gcc with the -v option displays the actual programs involved.
The assert macro is defined (see file /usr/include/assert.h) to do some check in <assert.h> only if NDEBUG is not a defined preprocessor symbol. On my Linux/Glibc system it can call a __assert_failed internal function from the C standard library. Citing assert(3) documentation:
If the macro NDEBUG is defined at the moment <assert.h> was last
included, the macro assert() generates no code, and hence does
nothing at all.
Some projects are compiling with -DNDEBUG their code in production mode.
You should read the Invoking GCC chapter of the documentation.
Perhaps you want to compile with -ffreestanding to avoid any extra libraries, even the standard ones?
On your embedded system linking is static. Static linking works as follows.
A static library is an archive of object files. The linker considers each object separately.
A referenced function or variable that is found in a static library is included in the resulting executable, together with the entire object file that contains the referenced symbol. Object files that don't contain referenced symbols are not pulled in.
This isn't by any way specific to gcc. Linkers work this way since the dawn of time.

make SCons compile everything in one gcc line?

I have a rather complex SCons script that compiles a big C++ project.
This gcc manual page says:
The compiler performs optimization based on the knowledge it has of the program. Compiling multiple files at once to a single output file mode allows the compiler to use information gained from all of the files when compiling each of them.
So it's better to give all my files to a single g++ invocation and let it drive the compilation however it pleases.
But SCons does not do this. it calls g++ separately for every single C++ file in the project and then links them using ld
Is there a way to make SCons do this?
The main reason to have a build system with the ability to express dependencies is to support some kind of conditional/incremental build. Otherwise you might as well just use a script with the one command you need.
That being said, the result of having gcc/g++ optimize as the manual describe is substantial. In particular if you have C++ templates you use often. Good for run-time performance, bad for recompile performance.
I suggest you try and make your own builder doing what you need. Here is another question with an inspirational answer: SCons custom builder - build with multiple files and output one file
Currently the answer is no.
Logic similar to this was developed for MSVC only.
You can see this in the man page (http://scons.org/doc/production/HTML/scons-man.html) as follows:
MSVC_BATCH When set to any true value, specifies that SCons should
batch compilation of object files when calling the Microsoft Visual
C/C++ compiler. All compilations of source files from the same source
directory that generate target files in a same output directory and
were configured in SCons using the same construction environment will
be built in a single call to the compiler. Only source files that have
changed since their object files were built will be passed to each
compiler invocation (via the $CHANGED_SOURCES construction variable).
Any compilations where the object (target) file base name (minus the
.obj) does not match the source file base name will be compiled
separately.
As always patches are welcome to add this in a more general fashion.
In general this should be left up to the program developer. Trying to compile all together in an amalgamation may introduce unintended behaviour to the program if it even compiles in the first place. Your best bet if you want this kind of optimisation without editing the source yourself is to use a compiler with inter-process optimisation like icc -ipo.
Example where an amalgamation of two .c files would not compile is for example if they use two identical static symbols with different functionality.

Optimization: .cpp or .obj/.o or .lib/.a

I have this chuck of code that could be placed in a separate library but I'm unsure how that will affect the compiler's ability to optimize.
Option 1: include the code directly in the projects and compile it together with everything else.
Option 2: build the .obj/.o files and simply use them when building the projects.
Option 3: create a static library (.lib or .a) and link with that when building the projects.
Now, my question is: which of these will give the best performance? If you could discuss/explain the consequences of each of the options with regard to compiler optimization that would be super awesome!
Thanks in advance :-)
There should be no difference in performance:
An .a file is simply an archive of .o files. They are treated the same by the linker (except that .a files need to be unpacked first).
Directly compiling all sources together will still result in all compilation units be compiled separately, and subsequently linked together. It’s just that the compiler hides this and calls the linker behind your back. Nevertheless, the work is the same as when first compiling the compilation units separately and then linking them together in an explicit step.
There's no difference in the optimization a compiler can do. In every case, the object can be built with as much or as less optimization you want.
The only difference you might see, is when you build a shared library. Then you have a call overhead, which you have not, when linking the objects or a static library directly into the executable.
If by Option 1 you mean #include the code via header files, then the compiler may be able to optimise slightly better than linking multiple objects together, as in Options 2 and 3. This is because the compiler can see the entire source code, rather than just the object code, and may be able to inline functions.
There is no difference between Options 2 and 3, as an archive file - *.a - is just a collection of object files - *.o.
All this being said, The Architecture of Open Source Applications: LLVM implies that you can build LLVM IR code objects, which when linked can be optimised properly, including inlining of functions. So, if you are using clang++, this may be an option.

CURL is in /usr/include but still won't automatically be found by g++

Every time I want to use CURL in one of my C++ programs, I have to add the flag -lcurl as a flag onto g++. This can be especially annoying when working with Eclipse. If /usr/include/curl/curl.h exists, what do I need to do to have CURL always be within the include path for g++?
tl;dr: you have to add the flag.
The linker needs libcurl, not the compiler. The compiler needs the header; the linker needs the lib.
To simplify things quite a bit, the header file tells the compiler that the declarations will be defined later. libcurl is what actually defines them.
The linker does not guess-and-check what to link against (doing so would be a horrible idea). You must explicitly tell it what to link against (except for the default libs). In particular, the linker has to know to use libcurl to find the declarations that curl.h laid out. Without libcurl, the linker is missing functions and thus cannot produce a complete binary.
I'm not familiar with Eclipse, but I'm nearly positive that it has an option where you can specify additional libraries. Yes, you'll have to do that once per project, but that shouldn't be a major overhead.
Try adding curl path in
Properties -> C/C++ General -> Paths and Symbols:
This is just how linking works in C and C++.
When you compile the program, you include the header file /usr/include/curl/curl.h. The compiler does this part. The header file contains all of the definitions for the library interface.
When you link the program, you link in the library /usr/lib/libcurl.so, or whatever it happens to be named. The linker does this part. The library contains the implementation in either a loadable (for dynamic libraries) or linkable (for static libraries) format.
The C and C++ languages have no way of specifying which libraries should be linked in, so you have to pass -lcurl to the linker. This is just the way it is.
There are some extensions to C and C++ that allow you to encode library dependencies in your source code, e.g., #pragma comment with MSC, but they're not supported by your typical ELF toolchain, as far as I know.
Note: Actually, the -lcurl flag is not for g++, but it is for the linker, ld. When you pass -lcurl to g++, g++ passes it through to the linker.

Is g++ both a c++ compiler and a linker?

I was looking at the output from my build in Eclipse. I'm cross compiling for a ColdFire processor. The compilation line looks like this:
m68k-elf-g++ -O2 -falign-functions=4 -IC:\nburn\include -IC:\nburn\MOD52...
followed by more include file, obvious "compiler" flags and finally the one source file I changed. The next line invokes the same tool again:
m68k-elf-g++ src\main.o src\TouchPanelMediator.o src\Startup.o....
followed by more .o files some .ld files and some .a files. This appears to be linking all the various types of object files together.
In the Gnu family is g++ some uber application that can determine based on arguments whether it needs to compile or link? Does it have both capabilities built-in or is it just dispatching compiling to gcc and linking to ld and my log just doesn't show that?
g++ and gcc are drivers. Usually, they run the preprocessor (cpp), compiler proper (cc1plus for C++ and cc1 for C) and the linker (gold or GNU ld) and all other things necessary. The difference between gcc and g++ is that the latter includes one additional library to link against (libstdc++).
Depending on what type of file they are invoked on, they may omit some steps or do things differently. For .o files, it doesn't need to run the compiler proper or the preprocessor, for example.
If you pass -### to them, you can see it print the tools it invokes in each step of its execution.
Taken from this little GCC guide:
Based on the file extension that you gave your program, it selects the appropriate commands it needs to run to turn the source you gave it into the output file you specified.
With a nice little flowchart of what GCC exactly does, depending on the file extensions:
input extensions runs if output
It dispatches linking to ld.
Also see here:
How to get GCC linker command?