Why can CImg achieve this kind of effect? - c++

The compilation is done on the fly: only CImg functionalities really used by your program are compiled and appear in the compiled executable program. This leads to very compact code, without any unused stuffs.
Does anyone know the principle behind this?

CImg is a header-only library, and they use templates liberally, which is what they're referring to.
If they used a precompiled library of some kind (.dll/.lib/.a/.so) the library file would have to contain the entire CImg library, regardless of which bits of it you actually use.
In the case of a statically linked library (.lib or .a), the linker can then strip out unused symbols, but that may depend on optimization settings.
When the entire library is included in one or two headers, it is only actually compiled when you #include it, and so it is part of the same compilation process as the rest of your program, and the compiler can easily determine which parts of the library are used, and which ones aren't.
And because the CImg API uses templates, no code is generated for functions that are never called.
They are overselling it a bit though, because as the other answers point out, unused symbols will usually be stripped out anyway.
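A minimal sketch of the template point, using a hypothetical Image class template (a made-up stand-in, not CImg's actual API). Member functions of a class template are only instantiated when they are called, so length() below never becomes code for Image<int>, and would not even compile for it:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a header-only, template-based class.
template <typename T>
struct Image {
    T value;

    // Instantiated only for the types on which it is actually called.
    T doubled() const { return value + value; }

    // Requires T to have size(). Image<int> still compiles as long as
    // nobody calls length() on it, because the body is never instantiated.
    std::size_t length() const { return value.size(); }
};

// Usage: each instantiation only pays for the members it uses.
inline void usageExample() {
    Image<int> pixel{21};
    assert(pixel.doubled() == 42);   // only doubled() is instantiated for int

    Image<std::string> text{"abc"};
    assert(text.length() == 3);      // only length() is instantiated for std::string
}
```

Calling pixel.length() would turn into a compile error, which shows that the unused member was never compiled for that type in the first place.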

Sounds like fairly standard behaviour to me - a C++ linker will usually throw away any unused library references rather than including uncallable code. Similarly, an optimized build will not include uncallable code.

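This sounds like MSVC's /OPT:REF linker option, which eliminates functions and data that are never referenced. GCC should have something similar too (with GNU ld, --gc-sections combined with the -ffunction-sections and -fdata-sections compiler flags).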

Related

Confused about how the compiler includes standard libraries in c++ program

I'm new to C++ programming and I'm a little confused about how the compiler includes standard libraries in a C++ program. Say, for example, I want to use the sqrt() function. I know that I have to include the math.h header file in my source code, but the math library contains many functions other than sqrt(). So my question is: is the source code of all these functions added to the program, which is unnecessary, or just the function that I need?
I hope my question was clear and thanks in advance.
Functions that are NOT templates (and not so trivial that they are just one or two lines) are compiled separately, and then stored in a "library" (which is not the header file; the header just contains a declaration such as double sqrt(double); or some such).
The compiler will (given the right compile-time flags) link to the C library that contains those functions. The linker [called upon by the compiler] will then introduce the code that was compiled when the library was built. So, typically, the source is not compiled when you build your program - it was done some other time.
The linker understands what functions are needed by the code you are building, so will only add those functions to your program, not ALL of the functions [but it may pull in some other functions than the precise one that you asked for, for example there may be some helper functions and perhaps some generic error handling functions that are needed by sqrt].
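As a small sketch of the above, assuming a hypothetical helper named hypotenuse: the header only supplies the declaration of std::sqrt, and the call is resolved against the already compiled math library when the program is linked.

```cpp
#include <cassert>
#include <cmath>   // declares std::sqrt; the definition lives in the compiled library

// Only std::sqrt is referenced here, so only sqrt (plus whatever helpers it
// needs internally) has to end up in a statically linked program. The other
// functions declared in <cmath> cost nothing in the executable.
double hypotenuse(double a, double b) {
    assert(a >= 0.0 && b >= 0.0);   // side lengths are expected to be non-negative
    return std::sqrt(a * a + b * b);
}
```

Calling hypotenuse(3.0, 4.0) gives 5.0. Note that an optimizing compiler may replace the library call with a single machine instruction, in which case nothing is pulled from the library at all.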
No, linking means that the linker figures out which symbols (functions and data objects) from your library are necessary to build your program, and then only includes those that are.
In fact, with dynamic linking, it wouldn't even include the function itself, but just the reference to the function and how to load the library containing it.
Generally, libraries that are linked with your executables aren't source code, but binary objects, which already have been translated to machine language ("compiled").
You have a confusion between libraries and header files. Libraries are the implementations. Header files contain the declarations.
You #include a library's header file so that the compiler can learn the syntax and semantics of the functions you use.
All the declarations (unless blocked by preprocessor directives), are parsed by the compiler and stored in a dictionary. The only issue about you not using a declaration is that it takes up room in the compiler's dictionary. Usually this is not an issue (modern compilers have large capacity dictionaries).
As far as adding functions to your program, that is handled during the linking phase (usually by a linker application). This is compiler dependent. Fundamentally, only functions that are used by your program are pulled from the library (static libraries only) and placed into your executable. Some compilers may speed up the build process by including groups of popular functions that you may not use. This speeds up the build process but makes your executables larger.
Some library functions may use other library functions. This means that a library function may add a lot more code into your executable. One example is printf. The printf function requires a lot of support, more than puts, because of all the formatting specifiers. So the printf may include other (internal) library functions.

What does it mean to link against something?

I commonly hear the term "to link against a library".
I'm new to compilers and thus linking, so I would like to understand this a bit more.
What does it mean to link against a library and when would not doing so cause a problem?
A library is an "archive" that contains already compiled code. Typically, you want to use a ready-made library to use some functionality that you don't want to implement on your own (e.g. decoding JPEGs, parsing XML, providing you GUI widgets, you name it).
Typically in C and C++ using a library goes like this: you #include some headers of the library that contain the function/class declarations - i.e. they tell the compiler that the symbols you need do exist somewhere, without actually providing their code. Whenever you use them, the compiler will place in the object file a placeholder, which says that that function call is to be resolved at link time, when the rest of the object modules will be available.
Then, at the moment of linking, you have to specify the actual library where the compiled code for the functions of the library is to be found; the linker then will link this compiled code with yours and produce the final executable (or, in the case of dynamic libraries, it will add the relevant information for the loader to perform the dynamic linking at runtime).
If you don't specify that the library is to be linked against, the linker will have unresolved references - i.e. it will see that some functions were declared, you used them in your code, but their implementation is nowhere to be found; this is the cause of the infamous "undefined reference errors".
Notice that all this process is identical to what normally happens when you compile a project that is made of multiple .cpp files: each .cpp is compiled independently (knowing of the functions defined in the others only via prototypes, typically written in .h files), and at the end everything is linked together to produce the final executable.
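The split between declaration and definition can be sketched as follows. Everything is collapsed into one file so that it builds as-is; in a real project the three parts would live in the library's header, your own .cpp file, and the library's compiled code. The names add and sumThree are made up for illustration.

```cpp
#include <cassert>

// --- what the library's header would provide: a declaration only ---
int add(int a, int b);

// --- your code: compiles against the declaration alone ---
// In a separate translation unit, the object file would record an
// unresolved reference to add() here, to be fixed up at link time.
int sumThree(int a, int b, int c) {
    return add(add(a, b), c);
}

// --- what the library's compiled code would provide: the definition ---
// Remove this definition and the file still compiles, but linking fails
// with an "undefined reference" to add(int, int).
int add(int a, int b) {
    return a + b;
}

inline void usageExample() {
    assert(sumThree(1, 2, 3) == 6);
}
```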

How to avoid including the same code when using C++ libraries?

EDIT: I know about include guards, but include files are not the issue here. I'm talking about actual compiled and already linked code that gets baked into the static library.
I'm creating a general-purpose utility library for myself in C++.
One of the functions I'm creating, printFile, requires string, cout and other such members of the standard library.
I'm worried that when the library is compiled, and then linked to another project that also uses string and cout, the code for string and cout will be duplicated: it will both be prelinked in the library binary the program is being linked with, and it will be again linked with the project that uses them itself.
The library is structured like this:
There is one libname.hpp file the programmer who uses the library is supposed to #include in his projects.
For every function fname declared in libname.hpp, there is a file fname.cpp implementing it.
All fname.cpp files also #include "libname.hpp".
The library itself compiles into libname.a which is copied to /usr/lib/.
Will this even happen?
If yes, is it a problem at all?
If yes, then how can I avoid this?
I'm worried that when the library is compiled, and then linked to another project that also uses string and cout, the code for string and cout will be duplicated
Don't worry: no modern compilation system will do that. The code for template functions is emitted into object files, but the linker discards duplicate entries.
The library definitions of the standard C++ library won't show up in your own static library unless you explicitly include them there (i.e., you extract object files from the standard C++ library and include them into your library). Static libraries are not linked at all and will just have undefined references to other libraries. A static library is merely a collection of object files defining the symbols provided by the library. The definitions which come from the headers, e.g., inline functions and template instantiations, will be defined in such a way that multiple definitions in multiple translation units won't conflict. Where the code isn't actually inlined, it will define "weak" symbols which result in duplicates being ignored or removed at link time.
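A minimal sketch of header-defined code that is safe to include from several translation units, with the hypothetical names nextId and twice. A single file cannot demonstrate the merging itself, so the comments describe what would happen if this were a header shared by the library and the application.

```cpp
#include <cassert>

// Imagine this lives in a header included by both the static library and
// the application. Each translation unit may emit its own copy of the code,
// but `inline` lets the linker keep a single definition, so the whole
// program shares one function and one counter.
inline int nextId() {
    static int counter = 0;
    return ++counter;
}

// Templates get the same treatment: identical instantiations emitted in
// several object files are merged at link time.
template <typename T>
T twice(T x) {
    return x + x;
}

inline void usageExample() {
    int first = nextId();
    int second = nextId();
    assert(second == first + 1);   // one shared counter
    assert(twice(21) == 42);
}
```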
The only real concern is that the libraries linked into an executable need to use compatible library definitions. With a substantial amount of code residing in header files, there are relatively frequent changes to the C++ header files, including standard C++ library headers (relative to the C library headers, which contain a lot less code).
Yes, the code for standard library things will be duplicated. It can be a problem if for example you return a std::string or take one as a parameter in one of your methods. It may have a different layout in your standard library implementation than in the user's.
This is rarely a problem in practice.
For static functions and inline templated functions defined in header files, there's nothing to worry about: every compilation unit gets its own copy (e.g. within the .a library there may already be many anonymous copies). This is okay because these definitions aren't exported, so the linker doesn't need to worry about them.
For functions that are declared with non-static linkage, whether you have an issue depends on how you link the .a library.
When you build the library, you typically will not link in the standard C++ library. The created library will contain undefined references to the standard C++ library. These must be resolved before building the final executable binary. This is normally done automatically when linking that final binary in the default way (depending on the compiler).
There are times when people do link in the standard C++ library into a static library. If you're linking against multiple static libraries that each embed another library (like the standard C++ library), then expect trouble if there are any differences in those embedded libraries. Fortunately, this is a rare problem, at least with the gcc toolchain. It's a more frequent problem with Microsoft's tools.
In some cases, a workaround is to make one or more conflicting static libraries into a dynamic library. This way each of these dynamic libraries can statically link its own copy of the problematic library. As long as the dynamic library doesn't export the symbols from the problematic library and there are no memory layout incompatibilities, there generally isn't any trouble.

Unnecessary linked libraries in linker

I have a project from which I can exclude some of the libraries passed to the linker, and it still builds.
Is it any better to exclude them, in terms of the performance and memory use of the final product?
A good C++ linker would not include any code from libraries that is not used by the program (so-called "dead-code stripping").
So, I would say it depends what kind of C++ linker you are using to emit the final release. Maybe you should refer to your linker documentation to get information on dead-code stripping. If it is not doing that, then it would definitely help reducing the final memory footprint of the program.
Cheers, and hope that info helps!
Excluding some unused libraries from the final executable might make startup a bit faster and save a tiny amount of memory - chances are only the header and library startup code will actually end up being loaded, and these can be paged out after startup.
However, don't do it manually. If you were told to add the library there's probably a reason for it - perhaps some function call you're not using yet requires it, and later on if you use that function call you may have forgotten about it.
Most linkers have an option to exclude unused libraries automatically, so you may want to just enable that option to have it take care of things for you.
Note: In some rare cases, the library's startup code might have some important effect, in which case you should not exclude it. This is something that is best determined by checking the library's documentation; things like this should (hopefully!) be clearly documented.
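One common form of such startup code is a global object that registers something in its constructor. The sketch below uses a hypothetical plugin registry (registry, Registrar, and pngPlugin are made-up names) to show why dropping a seemingly unused library can change behaviour:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plugin registry; the names are made up for illustration.
inline std::vector<std::string>& registry() {
    static std::vector<std::string> names;   // constructed on first use
    return names;
}

struct Registrar {
    explicit Registrar(const char* name) { registry().push_back(name); }
};

// "Startup code": runs before main() purely as a side effect of this object
// being part of the program. Nothing refers to pngPlugin by name, so if the
// object file containing it is left out of the link, the registration
// silently disappears.
static Registrar pngPlugin{"png"};

inline bool isRegistered(const std::string& name) {
    for (const auto& n : registry())
        if (n == name) return true;
    return false;
}

inline void usageExample() {
    assert(isRegistered("png"));   // registered by the static initializer above
}
```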
It should not make any difference.
Any linker of any worth will not include anything from libraries that are not (directly or indirectly) referenced by the application, even if those libraries are specified on the command line.
The only reasons to include (a part of) a library are:
- The application uses a function or global object from the library
- A part of a library that was included to resolve some references has a reference to a function or global object of this library.
A linker does not just blindly put all the things you provide together in an application, but it makes a distinction between object files (for the application) and libraries.
The linker first collects all the object files and resolves as many references that are made between the files.
After that, the linker goes through the specified libraries and takes from each library those parts that are needed to resolve the (known) unresolved references. This may create new unresolved references due to dependencies between the libraries. Most linkers will make only a single pass over the libraries, but some may perform multiple passes to resolve all the references.
Parts of libraries that are not needed to resolve a reference are not included in the executable.
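The selection process described above can be modelled with a toy simulation. This is only a sketch of the idea (the multi-pass variant), not how any particular linker is implemented; ArchiveMember, selectMembers, and the symbol names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One object file inside a static library.
struct ArchiveMember {
    std::string name;
    std::set<std::string> defines;   // symbols this member provides
    std::set<std::string> needs;     // symbols this member references
};

// Returns the members that get pulled in, given the symbols left undefined
// after the application's own object files were combined.
std::vector<std::string> selectMembers(std::set<std::string> undefined,
                                       const std::vector<ArchiveMember>& library) {
    std::set<std::string> defined;
    std::vector<std::string> included;
    std::vector<bool> taken(library.size(), false);
    bool progress = true;
    while (progress) {               // repeat until a pass adds nothing new
        progress = false;
        for (std::size_t i = 0; i < library.size(); ++i) {
            if (taken[i]) continue;
            bool wanted = false;
            for (const auto& sym : library[i].defines)
                if (undefined.count(sym)) { wanted = true; break; }
            if (!wanted) continue;   // resolves nothing: stays out of the executable
            taken[i] = true;
            progress = true;
            included.push_back(library[i].name);
            for (const auto& sym : library[i].defines) {
                defined.insert(sym);
                undefined.erase(sym);
            }
            for (const auto& sym : library[i].needs)   // new dependencies
                if (!defined.count(sym)) undefined.insert(sym);
        }
    }
    return included;
}

// A made-up library with four members.
inline std::vector<ArchiveMember> exampleLibrary() {
    return {
        {"errno.o", {"set_errno"}, {}},
        {"sqrt.o", {"sqrt"}, {"set_errno"}},
        {"printf.o", {"printf"}, {"vfprintf"}},
        {"vfprintf.o", {"vfprintf"}, {}},
    };
}

inline void usageExample() {
    // The application only calls sqrt: sqrt.o and its helper errno.o are
    // selected, while printf.o and vfprintf.o are left out.
    std::vector<std::string> picked = selectMembers({"sqrt"}, exampleLibrary());
    assert(picked.size() == 2);
}
```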
Yes, it's always better to exclude the unnecessary libraries.

Building C++ source code as a library - where to start?

Over the months I've written some nice generic enough functionality that I want to build as a library and link dynamically against rather than importing 50-odd header/source files.
The project is maintained in Xcode and Dev-C++ (I do understand that I might have to go command line to do what I want) and has to link against OpenGL and SDL (dynamically in SDL's case). Target platforms are Windows and OS X.
What am I looking at at all?
- What will be the entry point of my library if it needs one?
- What do I have to change in my code? (calling conventions?)
- How do I release it? My understanding is that headers and the compiled library (.dll, .dylib(, .framework), whatever it'll be) need to be available for the project - especially as template functionality can not be included in the library by nature.
- What else do I need to be aware of?
I'd recommend building as a static library rather than a DLL. A lot of the issues of exporting C++ functions and classes go away if you do this, provided you only intend to link with code produced by the same compiler you built the library with.
Building a static library is very easy as it is just a collection of .o/.obj files - a bit like a ZIP file but without compression. There is no need to export anything - just include the library in the list of files that your application links with. To access specific functions or classes, just include the relevant header file. Note you can't get rid of header files - the C++ compilation model, particularly for templates, depends on them.
It can be problematic to export a C++ class library from a dynamic library, but it is possible.
You need to mark each function to be exported from the DLL (the syntax depends on the compiler). I'm poking around to see if I can find how to do this from Xcode. In VC it's __declspec(dllexport) and in CodeWarrior it's #pragma export on/#pragma export off.
This is perfectly reasonable if you are only using your binary in-house. However, one issue is that C++ methods are named differently by different compilers. This means that nobody who uses a different compiler will be able to use your DLL, unless you are only exporting C functions.
Also, you need to make sure the calling conventions match in the DLL and the DLL's client. This either means you should have the same default calling convention flag passed to the compiler for both the DLL and the client, or better, explicitly set the calling convention on each exported function in the DLL, so that it won't matter what the default is for the client.
This article explains the naming issue:
http://en.wikipedia.org/wiki/Name_decoration
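A hedged sketch of what marking an exported function can look like. The macros MYLIB_API and MYLIB_CALL and the function mylib_add are hypothetical names; the non-Windows branch assumes a GCC- or Clang-compatible compiler.

```cpp
#include <cassert>

#if defined(_WIN32)
  #define MYLIB_API  __declspec(dllexport)   // mark the symbol for export from the DLL
  #define MYLIB_CALL __cdecl                 // spell out the calling convention
#else
  // Assumption: GCC or Clang, where symbol visibility controls what is exported.
  #define MYLIB_API  __attribute__((visibility("default")))
  #define MYLIB_CALL
#endif

// extern "C" turns off C++ name mangling for this function, so the exported
// name does not depend on which compiler built the library or the client.
extern "C" MYLIB_API int MYLIB_CALL mylib_add(int a, int b) {
    return a + b;
}

inline void usageExample() {
    assert(mylib_add(2, 3) == 5);
}
```

On Windows, the header given to clients would normally declare the function with __declspec(dllimport) instead, usually by switching the macro on a build-time define.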
The C++ standard doesn't define a standard ABI, and that's bad news for people trying to build C++ libraries. This means that you get different behavior from your compiled code depending on which flags were used to compile it, and that can lead to mysterious bugs in code that compiles and links just fine.
This extends beyond just different calling conventions - C++ code can be compiled to support or not support RTTI, exception handling, and with various optimizations that can affect the memory layout of class instances, which C++ code relies on.
So, what can you do? I would build C++ libraries inside my source tree, and make sure that they're built as part of my project's build, and that all the libraries and the code that links to them use the same compiler flags.
Note that name mangling, which was supposed to at least prevent you from linking object files that were compiled with different compilers or compiler flags, only mostly works. There are certain things you can do, especially with GCC, that will result in code that links just fine and fails at runtime.
You have to be extra careful with vendor-supplied dynamic C++ libraries (Qt on most Linux distributions, for example). I've seen instances of vendor-supplied libraries that were compiled in ways that prevented certain things from working properly. For example, some Red Hat Linux releases (maybe all of them) disabled exceptions in Qt, which made it impossible to catch exceptions in main() if the exceptions were thrown in a Qt callback. Fun.