Write a C or C++ library with "template"

(1). When using C++ template, is it correct that the compiler (e.g. g++) will not compile the template definition (which can only be in header file not source file) directly, but generate the code based on template definition for each of its instantiations and then compile the generated code for its instantiations?
(2). If I want to write a C++ library which provides template classes and template functions, is it impossible to compile the library into a library file (.so, .a), because the instantiations will not be anywhere in the code of the library but only appear in the user's program? If so, does it mean that template libraries are just source code files, not precompiled files?
How is C++ standard template library (STL) implemented? Is its source code precompiled or compiled together with user's program?
(3). In C, how can I write a library that provides functions acting like template functions in C++? Is overloading a good solution?
If I have to write a procedure into a different function for different types of arguments, is there a good way for code reusing? Is this a good way to do it http://www.vlfeat.org/api/imop_8c_source.html? Any other ways?
Thanks and regards!

When using C++ template, is it correct that the compiler (e.g. g++)
will not compile the template
definition.
Yes. It's a correct assumption.
A template definition is incomplete code. You need to fill in the template parameters before compiling it.
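As a small illustration of this (a sketch only; the name largest is made up for the example): the template below produces no code by itself, and each distinct type it is used with makes the compiler generate and compile a separate function.

```cpp
#include <cassert>
#include <string>

// A template definition: on its own this produces no machine code.
template <typename T>
T largest(T a, T b) {
    return (a < b) ? b : a;
}

// Each use with a new type triggers a separate instantiation:
// largest<int>, largest<double> and largest<std::string>.
void largest_example() {
    assert(largest(3, 7) == 7);
    assert(largest(2.5, 1.5) == 2.5);
    assert(largest(std::string("abc"), std::string("abd")) == "abd");
}
```

If largest were never called, nothing would be emitted for it at all.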
If I want to write a C++ library which provide template classes and
template functions, is it impossible
to compile the library into shared
file (.so, .a)
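Correct, that is not possible for the templates themselves. You can only compile individual instantiations of a template, so a library file can contain only the instantiations that the library's own source code creates.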
How is C++ standard template library
(STL) implemented? Is its source code
precompiled or compiled together with
user's program?
A large part of the STL code resides in header files and gets compiled together with your application.
In C, how to write a library that
provide functions acting like template
functions in C++? Is this a good way
to do it
http://www.vlfeat.org/api/imop_8c_source.html?
Any other ways?
Including the same file multiple times after redefining a macro (as demonstrated in the link you provided) is a good way to do this.
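The linked file gets its effect by re-including a body after redefining macros. A single-file approximation of the same idea (my own sketch, not the VLFeat code; DEFINE_CLAMP, clamp_int and clamp_float are hypothetical names) is a macro that stamps out one function per type. It is written so that it should compile as either C or C++.

```cpp
#include <assert.h>

/* "Template": expands to one clamp function for the given type.
   NAME and T are supplied at each "instantiation". */
#define DEFINE_CLAMP(NAME, T)            \
    static T NAME(T x, T lo, T hi) {     \
        if (x < lo) return lo;           \
        if (x > hi) return hi;           \
        return x;                        \
    }

/* Two "instantiations", one per type. */
DEFINE_CLAMP(clamp_int, int)
DEFINE_CLAMP(clamp_float, float)

void clamp_example(void) {
    assert(clamp_int(15, 0, 10) == 10);
    assert(clamp_float(0.5f, 0.0f, 1.0f) == 0.5f);
}
```

The multiple-include variant has the advantage that the function body stays ordinary code rather than a backslash-continued macro, which is easier to read and debug.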

(3). In C, how to write a library that provide functions acting like template functions in C++? Is overloading a good solution?
If I have to write a procedure into a different function for different types of arguments, is there a good way for code reusing? Is this a good way to do it http://www.vlfeat.org/api/imop_8c_source.html? Any other ways?
When I need to write general-purpose code I use void * as the basic data type. This is convenient because the same slot can hold either a generic pointer or, by converting through an integer type, a small "primitive" value (like an int). Be aware that conversions between integers and pointers are implementation-defined. I recently had to compile some code using this pattern in a 64-bit environment, where int and void * have different sizes, and this taught me the importance of the stdint.h data types (such as intptr_t)!
As for emulating templates in C, I don't think it is a good idea. This is just my opinion, of course, but I think that the strong point of C is its simplicity, which is the reason why C is less error prone than C++.
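A minimal sketch of the void * pattern, with hypothetical helper names (generic_swap, box_int, unbox_int). The integer round trip through a pointer is implementation-defined; it works on the common platforms I know of, but treat it as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Generic swap in the C style: untyped pointers plus the element size.
// Assumes the two objects do not overlap and size <= 64.
void generic_swap(void *a, void *b, std::size_t size) {
    unsigned char tmp[64];
    assert(size <= sizeof tmp);
    std::memcpy(tmp, a, size);
    std::memcpy(a, b, size);
    std::memcpy(b, tmp, size);
}

// Storing a small integer in a pointer-sized slot: go through intptr_t
// rather than casting int <-> void * directly (the sizes differ on 64-bit).
void *box_int(int v) {
    return reinterpret_cast<void *>(static_cast<std::intptr_t>(v));
}

int unbox_int(void *p) {
    return static_cast<int>(reinterpret_cast<std::intptr_t>(p));
}
```

The price of this style is that the compiler can no longer check element types for you; passing the wrong size to generic_swap compiles without complaint.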

Related

What is the need for a library in C++?

Header files contain only the declaration of the function and the actual implementation of the function is in the library. If they don't want to share source code they can share the obj file.
Why do we use a Library when the implementation of a function can also be done in another C++ file?
Usually, a library is a collection of several translation units. A library archive is simply a convenient way to bundle the separate object files into one blob.
Besides that, shared libraries add the ability of dynamic loading and sharing of commonly used libraries between multiple dependents which isn't possible with a plain object file.
Libraries are generally collections of functions implemented by other developers for general use. They help you keep your own code small, and when the source is available you can also take a function from a library and adapt its implementation as you need.
To add to what others have explained: a library also gives you reusability across programs written in different languages, as long as they follow the same ABI. For example, a library written in C can easily be used from Rust by declaring the same function in Rust syntax.
For example, a function in C has this signature.
int return_num(int a);
Declared like this in Rust, it can be called the same way as from C.
extern "C" {
fn return_num(x: i32) -> i32;
}

Header-only linking

Many C++ projects (e.g. many Boost libraries) are "header-only linked".
Is this possible also in plain C? How to put the source code into headers? Are there any sites about it?
Executive summary: You can, but you shouldn't.
C and C++ code is preprocessed before it's compiled: all headers are "pasted" into the source files that include them, recursively. If you define an ordinary (non-static, non-inline) function in a header and it is included by two C files, each object file ends up with its own copy, and the linker reports a multiple-definition error (in C++ terms, a One Definition Rule violation).
You can create "header-only" C libraries if all your functions are marked as static, that is, not visible outside the translation unit. But it also means you will get a copy of all the static functions in each translation unit that includes the header file.
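A minimal sketch of such a header, shown as a single file for brevity (vec2.h and its functions are hypothetical names; the code should compile as either C or C++):

```cpp
#include <assert.h>

/* ---- contents of a hypothetical header, vec2.h ---- */
#ifndef VEC2_H
#define VEC2_H

typedef struct { double x, y; } vec2;

/* 'static' gives every including translation unit its own private copy,
   so the linker never sees two external definitions of the same name. */
static vec2 vec2_add(vec2 a, vec2 b) {
    vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

static double vec2_dot(vec2 a, vec2 b) {
    return a.x * b.x + a.y * b.y;
}

#endif /* VEC2_H */

/* ---- a user's source file would just #include "vec2.h" ---- */
void vec2_example(void) {
    vec2 a = { 1.0, 2.0 }, b = { 3.0, 4.0 };
    assert(vec2_dot(a, b) == 11.0);
    assert(vec2_add(a, b).x == 4.0);
}
```

Compilers may warn about unused static functions in translation units that include the header but call only some of them.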
It is a bit different in C++: inline functions are not static, symbols emitted by the compiler are still visible by the linker, but the linker can discard duplicates, rather than giving up ("weak" symbols).
It's not idiomatic to write C code in the headers, unless it's based on macros (e.g. queue(3)). In C++, the main reason to keep code in the headers is templates, which may result in code instantiation for different template parameters, which is not applicable to C.
You do not link headers.
In C++ it's slightly easier to write code that's already better-off in headers than in separately-compiled modules because templates require it. 1
But you can also use the inline keyword for functions, which exists in C as well as C++. 2
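// Won't cause redefinition link errors, because of 6.7.4/5.
// In C99, exactly one .c file must also declare 'extern inline void foo(void);'
// to emit the external definition; alternatively, use 'static inline'.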
inline void foo(void) {
// ...
}
[c99: 6.7.4/5:] A function declared with an inline function
specifier is an inline function. The function specifier may appear
more than once; the behavior is the same as if it appeared only once.
Making a function an inline function suggests that calls to the
function be as fast as possible. The extent to which such
suggestions are effective is implementation-defined.
You're a bit stuck when it comes to data objects, though.
1 - Sort of.
2 - C99 for sure. C89/C90 I'd have to check.
Boost makes heavy use of templates and template meta-programming, which you cannot emulate (all that easily) in C.
But you can of course cheat by having declarations and code in C headers which you #include but that is not the same thing. I'd say "When in Rome..." and program C as per C conventions with libraries.
Yes, it is quite possible. Define all functions in headers and either mark them all as static or just use a single compilation unit (i.e. only a single .c file) in your project.
As a personal anecdote, I know quite a number of physicists who insist that this technique is the only true way to program C. It is beneficial because it's the poor man's version of -fwhole-program, i.e. makes optimizations based on the knowledge of function behaviour possible. It is practical because you don't need to learn about using the linker flags. It is a bad idea because your whole program must be compiled as a whole and recompiled with each minor change.
Personally, I'd recommend to let it be or at least go with static for only a few functions.

Compile header-only template library into a shared library?

We are in the process of designing a new C++ library and decided to go with a template-based approach along with some specific partial template specialisations for corner cases. In particular, this will be a header-only template library.
Now, there is some concern that this will lead to a lot of code duplication in the binaries, since this template 'library' will be compiled into any other shared library or executable that uses it (arguably only those parts that are used). I still think that this is not a problem (in particular, the compiler might even inline things which it could not across shared library boundaries).
However, since we know the finite set of types this is going to be used for, is there a way to compile this header into a library, and provide a different header with only the declarations and nothing else? Note that the library must contain not only the generic implementations but also the partial specialisations..
Yes. What you can do is explicitly instantiate the templates in CPP files using the compiler's explicit template instantiation syntax. Here is how to use explicit instantiation in VC++: http://msdn.microsoft.com/en-us/library/by56e477(v=VS.100).aspx. G++ has a similar feature: http://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html#Template-Instantiation.
Note that explicit instantiation itself has been standard since C++98; C++11 added the extern form (an explicit instantiation declaration). The syntax is described in [14.7.2] Explicit instantiation of the FDIS:
The syntax for explicit instantiation is:
explicit-instantiation:
extern(opt) template declaration
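A minimal sketch of how this could look for a known, finite set of types, shown as one file with the header and source parts marked (Box, box.h and box.cpp are hypothetical names):

```cpp
#include <cassert>

// ---- box.h (what users include): declarations only ----
template <typename T>
class Box {
public:
    explicit Box(T v);
    T get() const;
private:
    T value_;
};

// Explicit instantiation declarations: user code must not instantiate
// these implicitly, because the library provides them.
extern template class Box<int>;
extern template class Box<double>;

// ---- box.cpp (compiled into the library) ----
template <typename T>
Box<T>::Box(T v) : value_(v) {}

template <typename T>
T Box<T>::get() const { return value_; }

// Explicit instantiation definitions: the code for these two types
// is emitted here, and only here.
template class Box<int>;
template class Box<double>;

// ---- user code ----
void box_example() {
    Box<int> b(3);
    assert(b.get() == 3);
}
```

With this layout, using Box with any other type (say Box<char>) compiles but fails at link time, since the member definitions are not visible to user code.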
C++ Shared Library with Templates: Undefined symbols error
Some answers there cover this topic. In short: it is possible if you explicitly instantiate the templates in the shared library's code. This requires an explicit instantiation for every type used with every template on the shared-library side, though.
If it is really templates-only, then there is no shared library. See various Boost projects for concrete examples. Only when you have non-template code will you have a library. A concrete example is eg Boost Date_Time and date formatting and parsing; you can use the library with or without that feature and hence with or without linking.
Not having a shared library is nice in the sense of having fewer dependencies. The downside is that your binaries may get a little bigger and that you have somewhat higher compile-time costs. But storage is fairly cheap (unless you work in embedded systems or other special circumstances) and compiling is usually a fixed one-time cost.
Although there isn't a standard way to do it, it is usually possible with implementation specific techniques. I did it a long time ago with Borland's C++ Builder. The idea is to declare your templates to be exported from the shared library where they need to reside and import them where they are used. The way I did it was along these lines:
// A.h
#ifdef GENERATE
# define DECL __declspec(dllexport)
#else
# define DECL __declspec(dllimport)
#endif
template <typename T> class DECL A {
};
// A.cpp
#define GENERATE
#include "A.h"
template class DECL A<int>;
Beware that I don't have access to the original code, so it may contain mistakes. This blog entry describes a very similar approach.
From your wording I suspect you're not on Windows, so you'll have to find out if and how this approach can be adopted with your compiler. I hope this is enough to put you in the right direction.

What are the advantages and disadvantages of separating declaration and definition as in C++?

In C++, declaration and definition of functions, variables and constants can be separated like so:
void someFunc();   // declaration
void someFunc()    // definition
{
//Implementation.
}
In fact, in the definition of classes, this is often the case. A class is usually declared with its members in a .h file, and these are then defined in a corresponding .C file.
What are the advantages & disadvantages of this approach?
Historically this was to help the compiler. You had to give it the list of names before it used them - whether this was the actual usage, or a forward declaration (C's default function prototype aside).
Modern compilers for modern languages show that this is no longer a necessity, so the syntax of C & C++ (as well as Objective-C, and probably others) here is historical baggage. In fact, this is one of the big problems with C++ that even the addition of a proper module system will not solve.
Disadvantages are: lots of heavily nested include files (I've traced include trees before, they are surprisingly huge) and redundancy between declaration and definition - all leading to longer coding times and longer compile times (ever compared the compile times between comparable C++ and C# projects? This is one of the reasons for the difference). Header files must be provided for users of any components you provide. Chances of ODR violations. Reliance on the pre-processor (many modern languages do not need a pre-processor step), which makes your code more fragile and harder for tools to parse.
Advantages: not much. You could argue that you get a list of function names grouped together in one place for documentation purposes - but most IDEs have some sort of code folding ability these days, and projects of any size should be using doc generators (such as doxygen) anyway. With a cleaner, pre-processor-less, module based syntax it is easier for tools to follow your code and provide this and more, so I think this "advantage" is just about moot.
It's an artefact of how C/C++ compilers work.
As a source file gets compiled, the preprocessor substitutes each #include-statement with the contents of the included file. Only afterwards does the compiler try to interpret the result of this concatenation.
The compiler then goes over that result from beginning to end, trying to validate each statement. If a line of code invokes a function that hasn't been declared previously, it'll give up.
There's a problem with that, though, when it comes to mutually recursive function calls:
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, foo won't compile as bar is unknown. If you switch the two functions around, bar won't compile as foo is unknown.
If you separate declaration and definition, though, you can order the functions as you wish:
void foo();
void bar();
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, when the compiler processes foo it already knows the signature of a function called bar, and is happy.
Of course compilers could work in a different way, but that's how they work in C, C++ and to some degree Objective-C.
Disadvantages:
None directly. If you're using C/C++ anyway, it's the best way to do things. If you've got a choice of language/compiler, then maybe you can pick one where this is not an issue. The only thing to consider with splitting declarations into header files is to avoid mutually recursive #include-statements - but that's what include guards are for.
Advantages:
Compilation speed: As all included files are concatenated and then parsed, reducing the amount and complexity of code in included files will improve compilation time.
Avoid code duplication/inlining: If you fully define a function in a header file, each object file that includes this header and references this function will contain its own version of that function. As a side-note, if you want inlining, you need to put the full definition into the header file (on most compilers).
Encapsulation/clarity: A well defined class/set of functions plus some documentation should be enough for other developers to use your code. There is (ideally) no need for them to understand how the code works - so why require them to sift through it? (The counter-argument that it may be useful for them to access the implementation when required still stands, of course).
And of course, if you're not interested in exposing a function at all, you can usually still choose to define it fully in the implementation file rather than the header.
The standard requires that when using a function, a declaration must be in scope. This means, that the compiler should be able to verify against a prototype (the declaration in a header file) what you are passing to it. Except of course, for functions that are variadic - such functions do not validate arguments.
Think of C, when this was not required. At that time, compilers treated a missing return type specification as int. Now, assume you had a function foo() which returned a pointer to void. However, since you did not have a declaration, the compiler will think that it has to return an integer. On some Motorola systems, for example, integers and pointers would be returned in different registers. Now, the compiler will no longer use the correct register and instead return your pointer cast to an integer in the other register. The moment you try to work with this pointer -- all hell breaks loose.
Declaring functions within the header is fine. But remember: if you also define them in the header, make sure they are inline. One way to achieve this is to put the definition inside the class definition. Otherwise, prepend the inline keyword. If you don't, you will run into an ODR violation when the header is included in multiple implementation files.
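The two ways of making header definitions inline can be sketched like this (counter.h, Counter and twice are hypothetical names):

```cpp
#include <cassert>

// ---- contents of a hypothetical header, counter.h ----
#ifndef COUNTER_H
#define COUNTER_H

class Counter {
public:
    // Defined inside the class body: implicitly inline.
    void increment() { ++count_; }
    int value() const;
private:
    int count_ = 0;
};

// Defined in the header but outside the class: needs an explicit 'inline'.
inline int Counter::value() const { return count_; }

// A free function defined in a header also needs 'inline'.
inline int twice(int x) { return 2 * x; }

#endif // COUNTER_H

// ---- any number of .cpp files may now include counter.h ----
void counter_example() {
    Counter c;
    c.increment();
    assert(c.value() == 1);
    assert(twice(21) == 42);
}
```

Dropping either inline keyword still compiles with a single source file; the multiple-definition error only shows up once a second source file includes the header.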
There are two main advantages to separating declaration and definition into C++ header and source files. The first is that you avoid problems with the One Definition Rule when your class/functions/whatever are #included in more than one place. Secondly, by doing things this way, you separate interface and implementation. Users of your class or library need only to see your header file in order to write code that uses it. You can also take this one step farther with the Pimpl Idiom and make it so that user code doesn't have to recompile every time the library implementation changes.
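For reference, a minimal sketch of the Pimpl idiom mentioned above, shown as one file with the header and source parts marked (Widget and its members are hypothetical names):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (public header) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                          // defined where Impl is complete
    const std::string& name() const;
private:
    struct Impl;                        // declared only; layout hidden from users
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp (implementation file) ----
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
const std::string& Widget::name() const { return impl_->name; }

// ---- user code sees only widget.h ----
void widget_example() {
    Widget w("gear");
    assert(w.name() == "gear");
}
```

Adding or removing members of Impl changes only widget.cpp, so code that includes widget.h does not need to be recompiled, only relinked.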
You've already mentioned the disadvantage of code repetition between the .h and .cpp files. Maybe I've written C++ code for too long, but I don't think it's that bad. You have to change all user code every time you change a function signature anyway, so what's one more file? It's only annoying when you're first writing a class and you have to copy-and-paste from the header to the new source file.
The other disadvantage in practice is that in order to write (and debug!) good code that uses a third-party library, you usually have to see inside it. That means access to the source code even if you can't change it. If all you have is a header file and a compiled object file, it can be very difficult to decide if the bug is your fault or theirs. Also, looking at the source gives you insight into how to properly use and extend a library that the documentation might not cover. Not everyone ships an MSDN with their library. And great software engineers have a nasty habit of doing things with your code that you never dreamed possible. ;-)
Advantage
Classes can be referenced from other files by just including the declaration. Definitions can then be linked later on in the compilation process.
You basically have 2 views on the class/function/whatever:
The declaration, where you declare the name, the parameters and the members (in the case of a struct/class), and the definition where you define what the functions does.
Among the disadvantages is repetition. One big advantage, though, is that you can declare your function as int foo(float f) and leave the details to the implementation (= definition). Anyone who wants to use foo just includes your header file and links to your library/object file, so library users as well as compilers only have to care about the declared interface, which helps in understanding the interfaces and speeds up compile times.
One advantage that I haven't seen yet: API
Any library or 3rd party code that is NOT open source (i.e. proprietary) will not have their implementation along with the distribution. Most companies are just plain not comfortable with giving away source code. The easy solution, just distribute the class declarations and function signatures that allow use of the DLL.
Disclaimer: I'm not saying whether it's right, wrong, or justified, I'm just saying I've seen it a lot.
One big advantage of forward declarations is that when used carefully you can cut down the compile time dependencies between modules.
If ClassA.h needs to refer to a data element in ClassB.h, you can often use just a forward declaration in ClassA.h and include ClassB.h in ClassA.cc rather than in ClassA.h, thus cutting down a compile time dependency.
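A sketch of that layout, shown as one file with the three files marked (the members are made up for the example):

```cpp
#include <cassert>

// ---- ClassA.h: a forward declaration is enough here ----
class ClassB;                    // no #include "ClassB.h"

class ClassA {
public:
    explicit ClassA(ClassB* b) : b_(b) {}
    int b_value() const;         // defined in ClassA.cc
private:
    ClassB* b_;                  // pointer to an incomplete type is fine
};

// ---- ClassB.h ----
class ClassB {
public:
    explicit ClassB(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

// ---- ClassA.cc: includes ClassB.h, so ClassB is complete here ----
int ClassA::b_value() const { return b_->value(); }

void forward_decl_example() {
    ClassB b(5);
    ClassA a(&b);
    assert(a.b_value() == 5);
}
```

This works only while ClassA.h uses ClassB through pointers or references; holding a ClassB by value, or calling its members in the header, requires the full definition.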
For big systems this can be a huge time saver on a build.
Disadvantage
This leads to a lot of repetition. Most of the function signature needs to be put in two or more (as Paulious noted) places.
Separation gives clean, uncluttered view of program elements.
Possibility to create and link to binary modules/libraries without disclosing sources.
Link binaries without recompiling sources.
When done correctly, this separation reduces compile times when only the implementation has changed.

Multiple definitions of a function template

Suppose a header file defines a function template. Now suppose two implementation files #include this header, and each of them has a call to the function template. In both implementation files the function template is instantiated with the same type.
// header.hh
template <typename T>
void f(const T& o)
{
// ...
}
// impl1.cc
#include "header.hh"
void fimpl1()
{
f(42);
}
// impl2.cc
#include "header.hh"
void fimpl2()
{
f(24);
}
One may expect the linker to complain about multiple definitions of f(). Specifically, if f() weren't a template, that would indeed be the case.
How come the linker doesn't complain about multiple definitions of f()?
Is it specified in the standard that the linker must handle this situation gracefully? In other words, can I always count on programs similar to the above to compile and link?
If the linker can be clever enough to disambiguate a set of function template instantiations, why can't it do the same for regular functions, given they are identical as is the case for instantiated function templates?
The Gnu C++ compiler's manual has a good discussion of this. An excerpt:
C++ templates are the first language
feature to require more intelligence
from the environment than one usually
finds on a UNIX system. Somehow the
compiler and linker have to make sure
that each template instance occurs
exactly once in the executable if it
is needed, and not at all otherwise.
There are two basic approaches to this
problem, which are referred to as the
Borland model and the Cfront model.
Borland model
Borland C++ solved the template
instantiation problem by adding the
code equivalent of common blocks to
their linker; the compiler emits
template instances in each translation
unit that uses them, and the linker
collapses them together. The advantage
of this model is that the linker only
has to consider the object files
themselves; there is no external
complexity to worry about. This
disadvantage is that compilation time
is increased because the template code
is being compiled repeatedly. Code
written for this model tends to
include definitions of all templates
in the header file, since they must be
seen to be instantiated.
Cfront model
The AT&T C++ translator, Cfront,
solved the template instantiation
problem by creating the notion of a
template repository, an automatically
maintained place where template
instances are stored. A more modern
version of the repository works as
follows: As individual object files
are built, the compiler places any
template definitions and
instantiations encountered in the
repository. At link time, the link
wrapper adds in the objects in the
repository and compiles any needed
instances that were not previously
emitted. The advantages of this model
are more optimal compilation speed and
the ability to use the system linker;
to implement the Borland model a
compiler vendor also needs to replace
the linker. The disadvantages are
vastly increased complexity, and thus
potential for error; for some code
this can be just as transparent, but
in practice it can be very difficult
to build multiple programs in one
directory and one program in multiple
directories. Code written for this
model tends to separate definitions of
non-inline member templates into a
separate file, which should be
compiled separately.
When used with GNU ld version 2.8 or
later on an ELF system such as
GNU/Linux or Solaris 2, or on
Microsoft Windows, G++ supports the
Borland model. On other systems, G++
implements neither automatic model.
In order to support C++, the linker is smart enough to recognize that they are all the same function and throws out all but one.
EDIT: clarification:
The linker doesn't compare function contents and determine that they are the same.
Templated functions are marked as such and the linker recognizes that they have the same signatures.
This is more or less a special case just for templates.
The compiler only generates the template instantiations that are actually used. Since it has no control over what code will be generated from other source files, it has to generate the template code once for each file, to make sure that the method gets generated at all.
Since this is difficult to solve (the C++98 standard had an export keyword for templates, but g++ never implemented it, and C++11 removed it), the linker simply accepts the multiple definitions.