How do I partially-expose object contents in an object library? - c++

I'm compiling some C++ code into a library. Suppose my source files are mylib.cpp and util.cpp. The code in util.cpp is used in the library implementation, but is not part of the library in the sense that code using the library cannot call it (it's not in the public headers) and should not be aware of its existence; but mylib.cpp includes util.hpp and does rely on the compiled object code from util.cpp.
Now, if I compile mylib.o and util.o, then perform:
ar qc libmylib.a mylib.o util.o
my library works just fine; but the utility code is exposed as symbols. Thus, if I link this library with some other code, there might be duplicate-definition clashes. Or that other code might inappropriately rely on those symbols being available (e.g. by supplying its own header for them).
How can I ensure that only the object code in mylib.o (and in util.o) "sees" the symbols from util.o, while outside code does not?
Note: I believe this question stands also for C and perhaps other compiled languages.

Transferring comments into an answer.
If your C++ library has its own namespace, then using that or a sub-namespace is nominally the correct way to control access to the internal utilities. It sounds as if your code is not providing template classes — the constraints for those have to be thought through separately.
If privacy is a major concern, I'd probably consider including util.cpp (as well as util.hpp) into the source for mylib.cpp (meaning #include "util.cpp") with appropriate namespace controls so that the code from util.cpp is available inside mylib.cpp but not outside (using an anonymous namespace, or namespace mylib::Private or some such scheme). This is not very conventional, but it is probably effective (once you've worked out the necessary tweaks). The chances are that the combination TU (translation unit) is not so big as to cause your compiler major problems. This doesn't rely on compiler extensions.
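A minimal sketch of that combined-TU approach, using an anonymous namespace (the file and function names here are illustrative, and any standard headers util.cpp needs should be included before the namespace opens):

// util.hpp -- internal header, never installed
int util_helper(int x);

// util.cpp
#include "util.hpp"
int util_helper(int x) { return 2 * x; }

// mylib.cpp -- the only file compiled into the library
#include "mylib.hpp"

namespace {          // anonymous namespace: everything inside gets internal linkage
#include "util.cpp"  // the utility definitions are now private to this translation unit
}

int mylib_entry(int x)
{
    return util_helper(x);  // usable here, but not exported from libmylib.a
}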

Here is my fallback "solution", which is actually a workaround:
I keep the public visibility, but I burden all of the code in util.cpp with some element of naming which makes it effectively unique. For example, I may enclose those functions in a namespace mylib. Now the (demangled) symbols are all mylib::foo() (or mylib::util::foo()). They remain visible to the linker, but it is reasonable to assume they won't clash with anything outside of the mylib code.
In addition to the hassle, this has the drawback of still allowing external code to depend on this internal utility code, if it does so intentionally.
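A rough sketch of the workaround (names are illustrative):

// util.hpp
namespace mylib { namespace util {
int helper(int x);   // still an exported symbol, but mangled as mylib::util::helper(int)
} }

// util.cpp
#include "util.hpp"
int mylib::util::helper(int x) { return 2 * x; }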

Related

Is it normal to list all the cpp/cc files when compiling with g++?

I'm doing the "Hello World" in the GTKMM tutorial, the "app" uses three files, the main.cc, helloworld.h and helloworld.cc.
At the beginning I thought that compiling main.cc:
g++ -o HW main.cc $(pkg-config ... )
would be enough, but it gives an error (undefined reference to Helloworld::Helloworld, etc.).
In other words, it compiles main.cc and the header, but not the Helloworld class, and this makes sense because the header is included in main.cc but not helloworld.cc. The thing is, I'm kinda scared of including it because I read in another question that "including everything is bad practice".
That being said, when I compile using all the files in the same command:
g++ -o HW main.cc helloworld.cc $(pkg-config ... )
the "app" works without errors.
So, since using the last command works, is compiling in this way a good practice?
What happens if my app uses a big ton of classes?
Must I manually write them all down in the command?
If not, must I use #include?
Is it good practice to use #include for all used .cc files?
Is it normal to list all the cpp/cc files when compiling with g++?
Yes, completely.
How else will it know what source code you want it to compile?
The thing is I'm kinda scared of including it because I read in another question that including everything is bad practice.
#includeing excess headers is bad practice.
Passing your complete source code to the compiler is not.
Is it good practice to use #include for all used .cc files?
Absolutely not.
What happens if my app uses a big ton of classes? Must I manually write them all down in the command?
No. You should be using a build system that handles this for you. That could be an IDE which takes all the files in your project and passes them to the compiler in turn, or it could be a CMakeLists.txt/Makefile with a *.cpp wildcard in it (although I actually recommend listing source files explicitly, one-by-one; it's not hard).
Invoking g++ manually on the command-line is fine for a quick test, but for real usage you don't want to be clowning around with such machinery.
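For example, a minimal CMakeLists.txt that lists the sources explicitly might look like this (names are placeholders; the gtkmm flags from pkg-config would be wired in separately, depending on the gtkmm version in use):

cmake_minimum_required(VERSION 3.10)
project(HW CXX)

# Every translation unit is listed explicitly; the build system tracks them from here on.
add_executable(HW
    main.cc
    helloworld.cc)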
Is it good practice to use #include for all used .cc files?
It's not only bad practice; never do it.
In order to create an executable you actually have to do two things:
Compile all the source code files to object files or libraries.
Link all the object files and needed libraries into an executable.
You seem to be missing the point that the link phase is where symbols defined in separate source files are resolved or linked.
Must I manually write them all down in the command?
For the compiler to know about the DEFINITION of the symbols DECLARED in your headers, you must compile all the source files and link the results. Exceptions to this rule can be (but are not limited to) headers containing template metaprogramming (TMP) code, which usually lives entirely in header files.
What happens if my app uses a big ton of classes?
Most large C++ projects use build configuration tools such as CMake to generate makefiles for them.

Library design: allow user to decide between "header-only" and dynamically linked?

I have created several C++ libraries that currently are header-only. Both the interface and the implementation of my classes are written in the same .hpp file.
I've recently started thinking that this kind of design is not very good:
If the user wants to compile the library and link it dynamically, he/she can't.
Changing a single line of code requires full recompilation of existing projects that depend on the library.
I really enjoy the aspects of header-only libraries though: all functions get potentially inlined and they're very very easy to include in your projects - no need to compile/link anything, just a simple #include directive.
Is it possible to get the best of both worlds? I mean - allowing the user to choose how he/she wants to use the library. It would also speed up development, as I'd work on the library in "dynamically-linking mode" to avoid absurd compilation times, and release my finished products in "header-only mode" to maximize performance.
The first logical step is dividing the interface and the implementation into .hpp and .inl files.
I'm not sure how to go forward, though. I've seen many libraries prepend LIBRARY_API macros to their function/class declarations - maybe something similar would be needed to allow the user to choose?
All of my library functions are prefixed with the inline keyword, to avoid "multiple definition of..." errors. I assume the keyword would be replaced by a LIBRARY_INLINE macro in the .inl files? The macro would resolve to inline for "header-only mode", and to nothing for the "dynamically-linking mode".
Preliminary note: I am assuming a Windows environment, but this should be easily transferable to other environments.
Your library has to be prepared for four situations:
Used as header-only library
Used as static library
Used as dynamic library (functions are imported)
Built as dynamic library (functions are exported)
So let's make up four preprocessor defines for those cases: INLINE_LIBRARY, STATIC_LIBRARY, IMPORT_LIBRARY, and EXPORT_LIBRARY (it is just an example; you may want to use some sophisticated naming scheme).
The user has to define one of them, depending on what he/she wants.
Then you can write your headers like this:
// foo.hpp
#if defined(INLINE_LIBRARY)
#define LIBRARY_API inline
#elif defined(STATIC_LIBRARY)
#define LIBRARY_API
#elif defined(EXPORT_LIBRARY)
#define LIBRARY_API __declspec(dllexport)
#elif defined(IMPORT_LIBRARY)
#define LIBRARY_API __declspec(dllimport)
#endif
LIBRARY_API void foo();
#ifdef INLINE_LIBRARY
#include "foo.cpp"
#endif
Your implementation file looks just like usual:
// foo.cpp
#include "foo.hpp"
#include <iostream>
void foo()
{
    std::cout << "foo";
}
If INLINE_LIBRARY is defined, the functions are declared inline and the implementation gets included like a .inl file.
If STATIC_LIBRARY is defined, the functions are declared without any specifier, and the user has to include the .cpp file into his/her build process.
If IMPORT_LIBRARY is defined, the functions are imported, and there isn't a need for any implementation.
If EXPORT_LIBRARY is defined, the functions are exported and the user has to compile those .cpp files.
Switching between static / import / export is a really common thing, but I'm not sure if adding header-only to the equation is a good idea. Normally, there are good reasons either to define something inline or not to.
Personally, I like to put everything into .cpp files unless it really has to be inlined (like templates) or it makes sense performance-wise (very small functions, usually one-liners). This reduces both compile time and - way more important - dependencies.
But if I choose to define something inline, I always put it in separate .inl files, just to keep the header files clean and easy to understand.
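For instance, the .inl split described above might look like this (illustrative names):

// bar.hpp
#ifndef BAR_HPP
#define BAR_HPP

int bar(int x);     // clean interface; no implementation details in the header body

#include "bar.inl"  // inline definitions kept in their own file
#endif

// bar.inl
inline int bar(int x)
{
    return x + 1;
}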
It is operating system and compiler specific. On Linux with a very recent GCC compiler (version 4.9) you might produce a static library using interprocedural link-time optimization (LTO).
This means that you build your library with g++ -O2 -flto both at compile and at library link time, and that you use your library with g++ -O2 -flto both at compile and link time of the invoking program.
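Concretely, that might look like the following (library and file names are placeholders):

# build the static library with LTO information in the object files
g++ -O2 -flto -c mylib.cpp -o mylib.o
gcc-ar rcs libmylib.a mylib.o      # gcc-ar keeps the LTO bytecode usable from the archive

# compile and link the client with LTO as well, so cross-TU optimization happens at link time
g++ -O2 -flto main.cpp -L. -lmylib -o main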
This is to complement #Horstling's answer.
You can either create a static or a dynamic library. When you create a statically-linked library, the compiled code for all functions/objects is saved to a file (with a .lib extension on Windows). At the main project's link time (the main project being the one that uses the library), this code is linked into your final executable together with the main project's code, so the final executable has no runtime dependency on the library.
Dynamically-linked libraries are loaded into the main program at run time (not at link time). When you compile the library you get a .dll file (which contains the actual compiled code) and a .lib file (which contains enough data for the compiler/linker to find functions/objects in the .dll file). At link time, the executable is configured to load the .dll and use the compiled code from that .dll as needed. You will need to distribute the .dll file with your executable to be able to run it.
There is no need to choose between static and dynamic linking (or header-only) when designing your library: create multiple projects/makefiles, one to build a static .lib, another to build a .lib/.dll pair, and distribute both versions for the user to choose between. (You'll need to use preprocessor macros like the ones #Horstling suggested.)
You cannot put any templates in a pre-compiled library, unless you use a technique called Explicit Instantiation, which limits template parameters.
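As a quick illustration of explicit instantiation (names and types chosen arbitrarily):

// stack.hpp
template <typename T>
class Stack {
public:
    void push(const T& value);
    // ...
};

// stack.cpp -- compiled into the library
#include "stack.hpp"

template <typename T>
void Stack<T>::push(const T& value) { /* ... */ }

// Only these instantiations end up in the compiled library,
// so users are limited to these template parameters.
template class Stack<int>;
template class Stack<double>;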
Also note that modern compilers/linkers usually do not treat the inline keyword as binding. They may inline a function even if it's not marked inline, or emit an ordinary call to one that is, as they see fit. (Regardless, I'd advise explicitly writing inline where applicable for maximum compatibility.) So there won't be any runtime performance penalty if you use a statically linked library instead of a header-only library (and enable compiler/linker optimizations, of course). As others have suggested, for really small functions that are sure to benefit from inlining, it is best practice to put them in header files, so dynamically linked libraries will also not suffer any significant performance loss. (In any case, inlining only affects performance for functions that are called very often, e.g. inside loops that run thousands/millions of times.)
Instead of putting inline functions in header files (with an #include "foo.cpp" in your header), you can change makefile/project settings and add foo.cpp to the list of source files to be compiled. This way, if you change any function implementation there will be no need to re-compile the whole project and only foo.cpp will be re-compiled. As I mentioned earlier, your small functions will still be inlined by the optimizing compiler, and you don't need to worry about that.
If you use/design a pre-compiled library, you should consider the case where the library is compiled with a different compiler version than the main project. Each different compiler version (and even different configurations, like Debug or Release) uses a different C runtime (things like memcpy, printf, fopen, ...) and C++ standard library runtime (things like std::vector<>, std::string, ...). These different library implementations may complicate linking, or even cause runtime errors.
As a general rule, always avoid sharing compiler runtime objects (data structures that are not defined by standards, like FILE*) across libraries, because incompatible data structures will lead to runtime errors.
When linking your project, C/C++ runtime functions must be linked into your library .lib or .lib/.dll, or your executable .exe. C/C++ runtime itself can be linked as static or dynamic library (you can set this in makefile/project settings).
You will find that dynamically linking to C/C++ runtime in both the library and the main project (even when you compile the library itself as a static library) avoids most linking problems (with duplicate function implementations in multiple runtime versions). Of course you would need to distribute runtime DLLs for all used versions with your executable and library.
There are scenarios in which statically linking to the C/C++ runtime is needed; the best approach in those cases is to compile the library with the same compiler settings as the main project to avoid linking problems.
Rationale
Put as little as necessary in header files and as much as possible in library modules, because of the very reasons that you mentioned: compile-time dependency and long compilation time. The only good reasons for header-only modules are:
generic templates for user-defined template parameters;
very short convenience functions, where inlining gives a significant performance benefit.
In case 1, it is often possible to hide some functionality that does not depend on the user-defined type in a .cpp file.
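A sketch of that technique (names are made up): the template in the header does only the type-dependent part and forwards to a non-template function that can live in a .cpp file.

// trace.hpp
#include <sstream>
#include <string>

void trace_raw(const std::string& text);   // non-template part, defined in trace.cpp

template <typename T>
void trace(const T& value)
{
    std::ostringstream os;
    os << value;             // only the T-dependent formatting stays in the header
    trace_raw(os.str());     // the bulk of the work is hidden in the library
}

// trace.cpp
#include "trace.hpp"
#include <iostream>

void trace_raw(const std::string& text)
{
    std::cout << text << '\n';
}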
Conclusion
If you stick to this rationale, then there is no choice: templated functionality that must allow user-defined types cannot be pre-compiled, but requires a header-only implementation. Other functionality should be hidden from the user in a library, to avoid exposing implementation details.
Rather than a dynamic library, you could have a precompiled static library and a thin header file. In an interactive quick build, you get the benefit of not having to recompile the world if implementation details change. But a fully optimized release build can do global optimization and still figure out that it can inline functions. Basically, with "link-time code generation" the toolset does the trick you were thinking about.
I'm familiar with Microsoft's compiler, which I know for sure does this as of Visual Studio 2010 (if not earlier).
Templated code will necessarily be header-only: for instantiating this code, the type parameters must be known at compilation time. There is no way to embed template code in shared libraries. Only .NET and Java support JIT instantiation from byte-code.
Re: non-template code, for short one-liners I suggest keeping it header-only. Inline functions give the compiler a lot more opportunities to optimize the final code.
To avoid "insane compilation time", Microsoft Visual C++ has a "precompiled headers" feature. I do not think GCC has a similar feature.
Long functions should not be inlined in any case.
I had one project which had header-only bits, compiled library bits, and some bits I could not decide where they belonged. I ended up having .inc files, conditionally included in either .hpp or .cxx depending on #ifdef. Truth be told, the project was always compiled in "max inline" mode, so after a while I got rid of the .inc files and simply moved the contents to .hpp files.
Is it possible to get the best of both worlds?
To a degree; the limitations arise only because the tools aren't smart enough. This answer gives the current best effort that is still portable enough to be used effectively.
I've recently started thinking that this kind of design is not very good.
It ought to be. Header-only libraries are ideal because they simplify deployment: they make the language's reuse mechanism similar to that of almost every other language, which is just the sane thing to do. But this is C++. Current C++ tools still rely on half-a-century-old linking models that remove important degrees of flexibility, such as choosing which entry points to import or export on an individual basis without being forced to change the library's original source code. Also, C++ lacks a proper module system and still relies on glorified copy-paste operations to work (although this is just a side factor to the problem in question).
In fact, MSVC is a little better in this regard. It is the only major implementation trying to achieve some degree of modularity in C++ (by attempting e.g. C++ modules). And it is the only compiler that actually allows e.g. the following:
//// Module.c++
#pragma once
inline void Func() { /* ... */ }
//// Program1.c++
#include <Module.c++>
// Inlines or "vague" links Func(), whatever is better.
int main() { Func(); }
//// Program2.c++
// This forces Func() to be imported.
// The declaration must come *BEFORE* the definition.
__declspec(dllimport) __declspec(noinline) void Func();
#include <Module.c++>
int main() { Func(); }
//// Program3.c++
// This forces Func() to be exported.
__declspec(dllexport) __declspec(noinline) void Func();
#include <Module.c++>
Note that this can be used to selectively import and export individual symbols from the library, although still cumbersomely.
GCC also accepts this (but the order of the declarations must be changed) and Clang does not have any way to achieve the same effect without changing the library's source.

How to make a library in c++ like stl

I have made my own implementations of many of the STL features like Vectors, Lists, BST, Queue, Stack, and given them all the functions that the corresponding STL components have....
Now I want to use this library by
#include "myLibName.h"
What I Did :
g++ -c myLib.cpp -o myLib.o
From this I got the object file...
But when I compile programs I have to link the object file myself...
Is there any way I can do this without linking, the way iostream and the other standard libraries are linked automatically?
I know that a SHARED OBJECT file (e.g. libc.so for C) is where all the implementations are held....
If that's the solution, then how do I make one and use it like the other standard libraries in C++, without linking the object file every time?
PS: After a lot of effort I have created these libraries myself... Now I'm stuck at the final step... Please help...
You can't unless you're going to write your own toolchain. GCC links in its runtime and standard library because it's GCC and knows that it should; it won't magically do the same with your library.
Conventionally, either make your library header-only or ship a .a/.so/.dll for devs to link against at link time. In the latter two cases you'll also need to ship the .so/.dll for users to load at runtime.
To make your build process cleaner for large projects in which you need to link multiple projects, you can use Makefiles.
After that you just need to type make at the terminal to compile and build the whole project.
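A minimal Makefile for such a project could look roughly like this (file names are placeholders; recipe lines must be indented with a real tab):

CXX      = g++
CXXFLAGS = -Wall -O2

app: main.o myLib.o
	$(CXX) $(CXXFLAGS) -o app main.o myLib.o

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@

clean:
	rm -f app main.o myLib.o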
Another solution is the following, although many people don't recommend it,
header.h
class Foo
{
    // some variable and method declarations.
};
header.h is your header file which will contain your declarations.
implement.cpp // this is the implementation file
#include "header.h"
// Now implement various methods you declared in your "header.h" file.
implement.cpp is your implementation file which contains the implementation and the definition of static members.
main.cpp
#include "header.cpp"
// use your methods.
Now you don't need to link your object files; just do g++ -Wall main.cpp
First of all, you should probably differentiate between STL and the Standard C++ Library.
Each compiler comes with its own implementation of the Standard Library, some of them being (at least mostly) compatible (see clang++ and g++). So basically your way to go would be to modify the compiler you are using.
If you are writing header-only implementations, then no library is needed to be built and you can use it without linking. But in that case your work has to be distributed as source and not as library + header.
If you want to simply distribute your library and do not mind linking against the shared or static library you distributed, you should build a shared or static library, depending on the case. But it will have to be linked when it is used.
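For reference, building the two pre-compiled forms with GCC could look roughly like this (the library name is assumed from the question):

# static library
g++ -c myLib.cpp -o myLib.o
ar rcs libmyLib.a myLib.o
# clients link with:  g++ main.cpp -L. -lmyLib

# shared library
g++ -fPIC -c myLib.cpp -o myLib.o
g++ -shared -o libmyLib.so myLib.o
# clients link with:  g++ main.cpp -L. -lmyLib   (libmyLib.so must also be found at run time)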

How to structure a "library" of C++ source?

I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option where, for each class I want to include, I symlink that class's headers and source files (e.g. ln -s <path_to_class_folder>/myclass.cpp). This works well except that I can't move the project folder location (it breaks all the symlinks) and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.
A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.
If I understand correctly, you do not want users to have to include the .cpp files in their build, but instead just want them to use either (i) the headers directly, or (ii) a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me like you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included; otherwise, the .cpp file is included from the header file.
// A.h
#ifndef _A_H
#define _A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
Again, this is a very strange way of doing things and I would actually advise against such practice. A better way (but one which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source-form in the headers, this is more tricky. Since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.
I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
class A
classA.h
classA.test.h
class B
classB.h
classB.test.h
class C
...
With this structure I can distribute the library as source, and all the user has to do is include -I/path/to/library in their makefile, and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and it turns out that inline code in the headers allows the compiler to make the final binary roughly 5% smaller.

Compiling & linking multiple files in C++

One of my "non-programmer" friends recently decided to make a C++ program to solve a complicated mechanical problem.
He wrote each function in a separate .cpp file, then included them all in the main source file, something like this:
main.cpp:
#include "function1.cpp"
#include "function2.cpp"
...
int main()
{
...
}
He then compiled the code, with a single gcc line:
g++ main.cpp // took about 2 seconds
Now, I know that this should work, but I'm not sure whether including .cpp files directly into the main program is a good idea. I have seen the following scheme several times, where all the function prototypes go into a header file with the extern keyword, like this:
funcs.h:
extern void function1(..);
extern void function2(..);
...
main.cpp:
...
#include "funcs.h"
...
& compiling with:
g++ -c function1.cpp
g++ -c function2.cpp
...
g++ -c main.cpp
g++ -o final main.o function1.o function2.o ...
I think that this scheme is better (with a makefile, of course). What reasons can I give my friend to convince him of this?
The main reason people compile object by object is to save time. High-level localised code changes often only require compilation of one object and a relink, which can be faster. (Compiling too many objects that draw in heaps of headers, or redundantly instantiate the same templates, may actually be slower when a change in common code triggers a fuller recompilation).
If the project is so small that it can be compiled in 2 seconds, then there's not much actual benefit to the traditional approach, though doing what's expected can save developer time - like yours and ours on here :-). Balancing that, maintaining a makefile takes time too, though you may well end up doing that anyway in order to conveniently capture include directories, libraries, compiler switches etc.
Actual implications to written/generated code:
cpp files normally first include their own headers, which provides a sanity check that the header content can be used independently by other client code: put everything together and the namespace is already "contaminated" with includes from earlier headers/implementation files
the compiler may optimise better when everything is in one translation unit (+1 for leppie's comment along these lines...)
static non-member variables and anonymous namespaces are private to the translation unit, so including multiple cpps means sharing these around, for better or worse (+1 for Alexander :-))
say a cpp file defines a function or variable which is not mentioned in its header and might even be in an anonymous namespace or static: code later in the combined translation unit could call it freely without needing to hack up its own forward declaration (this is bad - if the function was intended to be called outside its own cpp, then it should have been declared in the header and exposed as an external symbol in its translation unit's object)
BTW - in C++ your headers can declare functions without explicitly using the extern keyword, and it's normal to do so.
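That is, the header works the same way without the keyword (parameter lists left elided as in the question):

// funcs.h
#ifndef FUNCS_H
#define FUNCS_H

void function1(/* ... */);   // implicitly extern
void function2(/* ... */);

#endif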
The reason for the second style is that each .cpp file can be treated separately, with its own classes, global variables, etc., without risk of conflict.
It is also easier in IDEs that automatically link all the .cpp files (like MSVC).