Why should we set symbolic link when we build a library? - c++

I have a question about building a C++ library for multiple platforms. I notice that many libraries expect a "symbolic link". With CMake, the symbolic link is created by the following code:
set_target_properties(${library_name} PROPERTIES VERSION ${library_string_version} SOVERSION ${library_string_shortversion})
I cannot understand why a symbolic link is necessary for a library. Moreover, it seems that the symbolic link is always related to the version of the library. Is there a relationship between them? Thanks!

The advantage of using a symbolic link is that you can easily update the library to a new version while maintaining a consistent name, and at the same time keep the version accessible in the library's file name. Applications can always link against the same name even if you update the library. Only when they need a specific version do they link to that instead.
Also it makes it easier to move it around if need be, because the application doesn't need to know where it comes from.
I often wish I had symbolic links in MS Windows as well, as it makes life much easier.

It allows for side-by-side versioning of the library.
libfoo.so -> libfoo.2.so
libfoo.1.so -> libfoo.1.23.so
libfoo.1.23.so
libfoo.2.so -> libfoo.2.1.so
libfoo.2.1.so
This way, libfoo.so always points to the latest version. If you know (for compatibility reasons) that you need version 1 and not version 2, you can link against libfoo.1.so and always get the latest v1 release.
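To make the chain of links concrete, here is a minimal C++17 sketch that recreates the listing above in a scratch directory and follows the links until it reaches a real file. The function names make_libfoo_layout and resolve_library are made up for illustration, and the "libraries" are just placeholder text files, so this only demonstrates the naming scheme, not actual loading.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: create the example layout inside `dir`,
// i.e. two real files plus the three symbolic links that point at them.
inline void make_libfoo_layout(const fs::path& dir) {
    fs::create_directories(dir);
    for (const char* real : {"libfoo.1.23.so", "libfoo.2.1.so"}) {
        std::ofstream(dir / real) << "placeholder, not a real shared object\n";
    }
    // Link targets are relative, so the directory can be moved as a whole.
    fs::create_symlink("libfoo.1.23.so", dir / "libfoo.1.so");
    fs::create_symlink("libfoo.2.1.so", dir / "libfoo.2.so");
    fs::create_symlink("libfoo.2.so", dir / "libfoo.so");
}

// Hypothetical helper: follow symbolic links starting at dir/name until a
// regular file is reached, and return that file's name.
inline std::string resolve_library(const fs::path& dir, const std::string& name) {
    fs::path current = dir / name;
    while (fs::is_symlink(current)) {
        current = dir / fs::read_symlink(current);
    }
    assert(fs::is_regular_file(current));
    return current.filename().string();
}
```

Asking for libfoo.so ends at the newest file, while asking for libfoo.1.so stays within the v1 series. Upgrading to a new release only means adding the new file and repointing one link.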

Related

What is a .so.2 file?

I compiled Intel TBB from source using GCC. It generates a libtbb.so and a libtbb.so.2. It looks like the .so.2 file is the real shared library; libtbb.so just contains a line of text:
INPUT (libtbb.so.2)
What is the purpose of generating these two files instead of one? And what is the syntax of INPUT (libtbb.so.2)? I want to know more about it.
Usually when you build shared objects (.so) then you also take care of versions by adding suffixes such as mylib.so.2.3.1. To make sure your programs can load this lib or other later versions you create links with names
mylib.so -> mylib.so.2.3.1
mylib.so.2 -> mylib.so.2.3.1
mylib.so.2.3 -> mylib.so.2.3.1
So, everything after .so represents version.sub-version.build (or similar)
Also, it is possible for more than one version of the same lib to coexist with this scheme, and all that is necessary to switch programs to using a particular version is to have the appropriate links in place.
A dynamically linked ELF binary (whether another library or an executable) uses the shared-object name, or soname, to identify the library that it should be linked against upon execution.
When a library is created as an ELF shared library, the compile-time link editor records the library's SONAME in a DT_SONAME field in the library itself. DT_SONAME is defined in the ELF standard as:
This element holds the string table offset of a null-terminated
string, giving the name of the shared object. The offset is an index
into the table recorded in the DT_STRTAB entry. See ‘‘Shared Object
Dependencies’’ below for more information about these names.
So now, when an executable is created, the SONAME is embedded into it. When the executable is run, the dynamic linker uses that SONAME to look for the library among the files in the predefined locations for dynamic libraries. The predefined location in Windows would be wherever DLLs reside. In Linux and Mac OS X and other System V compatible systems they would be /lib and /usr/lib and possibly other spots; it depends on the linker used, and can be defined in the linker's own configuration.
In all events, the linker looks to see whether the library named in the soname entry is present in any of those locations; if it is, the linker will use it.
Note that the standard says only that the soname is a STRING. The versioning conventions became a de facto standard after the fact, and go something like this:
Make the soname libmyname.so.A and make the library filename libmyname.so.A.B or libmyname.so.A.B.C (under Mac OS X it's libmyname.A.B.dylib). Create a symlink named libmyname.so.A that points to libmyname.so.A.B[.C].
A is kept the same while the library's ABI stays the same.
B (or B.C) becomes the minor version.
Under Linux it's really common that the library version would be the same as the package version number. This has its pros and cons.
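As a small illustration of that naming convention, the sketch below splits a file name of the form libmyname.so.A.B[.C] into the soname (libmyname.so.A) and the minor part (B or B.C). The struct and function names are invented for this example, and the code only looks at the text of the file name; it does not read the DT_SONAME entry from a real ELF file.

```cpp
#include <optional>
#include <string>

// Hypothetical result type: the soname and whatever follows the major number.
struct VersionedName {
    std::string soname;  // e.g. "libmyname.so.1"
    std::string minor;   // e.g. "2.3", or empty if the name stops at A
};

// Split "libmyname.so.A.B[.C]" by the convention described above.
// Returns std::nullopt when the name has no ".so.<major>" part.
inline std::optional<VersionedName> split_versioned_name(const std::string& filename) {
    const std::string marker = ".so.";
    const std::size_t pos = filename.find(marker);
    if (pos == std::string::npos) {
        return std::nullopt;  // plain "libmyname.so" or not a shared object name
    }
    const std::size_t major_start = pos + marker.size();
    const std::size_t dot = filename.find('.', major_start);
    if (dot == major_start || major_start >= filename.size()) {
        return std::nullopt;  // empty major version
    }
    if (dot == std::string::npos) {
        return VersionedName{filename, ""};  // the file name is already the soname
    }
    return VersionedName{filename.substr(0, dot), filename.substr(dot + 1)};
}
```

With this convention, every file in the A series shares one soname, which is why a program linked against libmyname.so.1 keeps working when libmyname.so.1.2.3 is replaced by libmyname.so.1.2.4.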
libtool formalization
GNU libtool is widely used to build dynamic libraries, and it has a more formal versioning system with solid logic behind it. The libtool versioning system for sonames works very well and has been adopted by complex libraries to keep things straight.
Under libtool, the versioning is as follows:
libmylib-current.release.age
Under libtool the idea is that as libraries evolve they will add and remove functionality.
Let's say you are developing a library. Start with version 0.0.0.
Now let's say you fix a few bugs; you would only increase the release number.
So the new name would become libmylib.0.1.0 or libmylib.0.2.0, etc., for every release that just fixes bugs but doesn't change any of the ABI.
Along the way you say, "Ugh! I could've done this subfunctionality better." So you add a new set of functions to do something better, but because others are still using your library, you leave the old (deprecated) functionality in there.
The rules are as follows:
1. Start with version information of ‘0:0:0’ for each libtool library.
2. Update the version information only immediately before a public release of your software. More frequent updates are unnecessary, and only guarantee that the current interface number gets larger faster.
3. If the library source code has changed at all since the last update, then increment revision (‘c:r:a’ becomes ‘c:r+1:a’).
4. If any interfaces have been added, removed, or changed since the last update, increment current, and set revision to 0.
5. If any interfaces have been added since the last public release, then increment age.
6. If any interfaces have been removed or changed since the last public release, then set age to 0.
You can read more about it in the libtool documentation
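Because the update rules quoted above are an ordered procedure, they translate almost line for line into code. The following is only a sketch of that procedure; the type and function names (VersionInfo, Changes, next_version) are invented here and are not part of libtool.

```cpp
#include <cassert>

// current:revision:age as used by libtool's -version-info flag.
struct VersionInfo {
    unsigned current = 0;
    unsigned revision = 0;
    unsigned age = 0;
};

// Hypothetical description of what changed since the last public release.
struct Changes {
    bool source_changed;
    bool interfaces_added;
    bool interfaces_removed_or_changed;
};

// Apply the quoted update rules in order, once per public release.
inline VersionInfo next_version(VersionInfo v, const Changes& c) {
    if (c.source_changed) {
        ++v.revision;                       // c:r:a becomes c:r+1:a
    }
    if (c.interfaces_added || c.interfaces_removed_or_changed) {
        ++v.current;                        // any interface change bumps current
        v.revision = 0;
    }
    if (c.interfaces_added) {
        ++v.age;                            // older interfaces are still supported
    }
    if (c.interfaces_removed_or_changed) {
        v.age = 0;                          // compatibility with older interfaces is gone
    }
    assert(v.age <= v.current);             // age counts back from current
    return v;
}
```

Starting from 0:0:0, a bug-fix release gives 0:1:0, a release that adds functions gives 1:0:1, and a release that removes a function gives 2:0:0.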
Update ...
The following comment claimed that my explanation has an error. It does not, but explaining why requires more detail than fits into an answer comment, so see below.
Original objection
There is an error here: on linux, the version is of the form
libmylib.(current-age).release.age, where the parentheses indicate an
expression to be evaluated. For example GLPK 4.54 with
current:revision:age = 37:1:1 on linux installs the library file
libglpk.so.36.1.1. For more info, see, e.g.,
<autotools.io/libtool/version.html>.
Rebuttal
TL;DR: autotools.io is not an authoritative source.
Explanation
Whilst Flameeyes is an amazing developer and one of the Gentoo maintainers, it was he who made the mistake, creating a loose "rule of thumb" interpretation of the libtool spec. This is not going to break systems 99% of the time, but consider what happens if we follow his ad-hoc way of updating current:
The rules of thumb, when dealing with these values are:
Always increase the revision value.
Increase the current value whenever an interface has been added,
removed or changed.
Increase the age value only if the changes made to the ABI are
backward compatible.
He then goes on to say that, when maintaining multiple versions of a library such as Gtk, it would be best to just append the library version to the library NAME and simply dump the version number (as they do in GTK+):
In this situation, the best option is to append part of the library's
version information to the library's name, which is exemplified by
Glib's libglib-2.0.so.0 soname. To do so, the declaration in the
Makefile.am has to be like this:
lib_LTLIBRARIES = libtest-1.0.la
libtest_1_0_la_LDFLAGS = -version-info 0:0:0
Well, that's just a crackpot approach that renders the power of dynamic linking and symbol-resolution versioning completely moot! He's saying just turn it off. Horse boogers! No wonder even experienced developers have a hard time building and maintaining open source projects, and we constantly run into binaries dying every time new versions of libraries are installed (because they clobber each other).
The libtool versioning approach is VERY WELL THOUGHT OUT. It is an algorithm: its steps are ordered instructions, 1 to 6, to be followed every time there is an update to the code of a dynamically linked library.
For new and current developers, please read them carefully and visualize what will happen to the library version number throughout the life of your amazing software. If you do you will notice that every piece of previously linked software will always use the most current and accurate version of your amazing library correctly, and none of them will ever clobber or stomp on each other, AND you never have to add a blooming number in the name of your library (unless it's for pleasure or esthetics).

How can I learn to include and link to libraries?

I'm trying to teach myself C++ programming. The C++ is the easy part. Some patience and good reference material go a long way. Including and linking against libraries is the hard part. The instructions provided usually assume some knowledge which I don't have and don't know how to acquire without painfully slow trial and error.
The latest concrete example is http://cpp-netlib.org/
I've spent the whole afternoon trying to get it to work and I still don't even have an idea why it's not working.
How can I learn this skill from the ground up?
Is it normal to have such enormous difficulties learning how to do this?
Well, the principle is pretty much always the same for any C++ compiler (the option flags mentioned are quite standard but might differ for particular compilers):
Install a library you want to use in your system (this may include a step to compile this library with your particular compiler toolchain).
Set up the include paths to be used for this library using the -I option.
Use the headers of the library API in your code (#include <libheader.h>).
Set up the library paths to be used for this library using the -L option, and tell the linker which libraries to link using -l<extra>, where extra should refer to some file named lib<extra>.a or lib<extra>.lib.
Things to note:
Third party libraries might depend on further libraries you'd also need to install (compile with the same toolchain as your target uses)
On Windows using the MS Visual Studio (Express) toolchain you'll need to take care choosing the right library versions that are compliant with the 'threading model' and in general 'debug' / 'non-debug' library versions.
An (appropriate and useful) IDE will usually let you choose the toolchain (MinGW GCC, MS VS compiler, LLVM, etc.) on project setup, and offer some properties dialog to set these options.
What you need to set up depends on the toolchain, third-party libraries, IDE and OS you're using; each has its own learning curve, so it depends on what you want to use in particular.

How do I statically link against two versions of xerces-c (or any library for that matter)?

I know this is not a very clean thing to do but how do I do it nonetheless?
Basically, I am statically linking a third party library that uses xerces-c 2.7 and I want to use xerces-c 3.1 (for some of the newer latest and greatest features not really available in 2.x)
The modules that use 2.7 (used internally by the third party library and never exposed to my code) have nothing in common with the modules using 3.1 (in my code).
Any way how to do this? I know it's not a good thing but I shudder to think of the lead time between submitting an upgrade request for the library and actually getting it done. Probably months at least and I don't want to go down that unholy path.
A generic compiler independent solution would ofc be much better.
Another solution, aside from the one mentioned in "Linking libraries with incompatible dependecies", is to isolate the different versions by building them into different dynamic libraries. The simplest approach may be to move the code that uses xerces 3.1 into a new dynamic library, and create an interface to it. Since you're statically linking against Xerces, this will keep the references internal to the dynamic library. You may need to change the gcc visibility settings to ensure that only selected function names are exported from the dynamic library.
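A rough sketch of what such a wrapper library's source could look like with GCC or Clang is shown below. Everything here is hypothetical: the macro names, the function names, and the trivial element counter that stands in for real calls into the statically linked Xerces 3.1. The point is only that one entry point is marked visible while the implementation stays hidden.

```cpp
#include <cstddef>
#include <string>

// GCC/Clang visibility attributes; on other compilers the macros expand to
// nothing, so the sketch still compiles but controls nothing.
#if defined(__GNUC__)
#define WRAPPER_EXPORT __attribute__((visibility("default")))
#define WRAPPER_LOCAL __attribute__((visibility("hidden")))
#else
#define WRAPPER_EXPORT
#define WRAPPER_LOCAL
#endif

namespace detail {
// Stand-in for code that would use the statically linked XML parser.
// It counts opening tags by looking for '<' not followed by '/', '?' or '!'.
WRAPPER_LOCAL std::size_t count_elements_impl(const std::string& xml) {
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < xml.size(); ++i) {
        const char next = xml[i + 1];
        if (xml[i] == '<' && next != '/' && next != '?' && next != '!') {
            ++count;
        }
    }
    return count;
}
}  // namespace detail

// The only symbol meant to be exported. A plain C signature keeps parser
// types out of the interface, so callers never see the parser version.
extern "C" WRAPPER_EXPORT std::size_t wrapper_count_elements(const char* xml) {
    return detail::count_elements_impl(xml != nullptr ? xml : "");
}
```

Built as a shared library with something like g++ -shared -fPIC -fvisibility=hidden, only wrapper_count_elements should be visible to the rest of the program; keeping the statically linked parser's own symbols out of the export table may need further linker options, which this sketch does not cover.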

Linking Statically with glibc and libstdc++

I'm writing a cross-platform application which is not GNU GPL compatible. The major problem I'm currently facing is that the application is linked dynamically with glibc and libstdc++, and almost every new major update to these libraries is not backwards compatible. Hence, random crashes are seen in my application.
As a workaround, I distribute binaries of my application compiled on several different systems (with different C/C++ runtime versions). But I want to do without this. So my question is, keeping licensing and everything in mind, can I link against glibc and libstdc++ statically? Also, will this cause issues with rtld?
You don't need to.
Copy the original libraries you linked against to a directory (../lib in this example) in your application folder.
Like:
my_app_install_path
.bin
lib
documentation
Rename your app to something like app.bin. Replace your app with a little shell script that sets the environment variable LD_LIBRARY_PATH to the library path (and appends the previous LD_LIBRARY_PATH contents, if any) before starting app.bin. Now the dynamic loader should be able to find the dynamic libraries you linked against, and you don't need to link them statically into your executable.
Remember to comply with the LGPL by adding the required attribution to the libraries and pointing out in the documentation where the source can be downloaded.
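The answer suggests a shell script for this; purely as an illustration, the same logic can be written as a tiny C++ launcher on a POSIX system. The install layout (lib and bin/app.bin under the install directory) and both function names are assumptions made for this sketch.

```cpp
#include <cassert>
#include <stdlib.h>  // setenv, getenv (POSIX)
#include <string>
#include <unistd.h>  // execv (POSIX)

// Put our private library directory first and keep whatever was set before.
// An empty previous value is skipped, because an empty entry in
// LD_LIBRARY_PATH would be treated as the current directory.
inline std::string prepend_library_path(const std::string& lib_dir, const char* previous) {
    assert(!lib_dir.empty());
    if (previous == nullptr || *previous == '\0') {
        return lib_dir;
    }
    return lib_dir + ":" + previous;
}

// Hypothetical launcher body: set the variable, then replace this process
// with the real application. Returns only if something failed.
inline int launch_with_private_libs(const std::string& install_dir, char* const argv[]) {
    const std::string lib_dir = install_dir + "/lib";           // assumed layout
    const std::string real_app = install_dir + "/bin/app.bin";  // assumed layout
    const std::string value = prepend_library_path(lib_dir, getenv("LD_LIBRARY_PATH"));
    if (setenv("LD_LIBRARY_PATH", value.c_str(), 1) != 0) {
        return 1;
    }
    execv(real_app.c_str(), argv);
    return 1;  // execv returns only on error
}
```

A main function would call launch_with_private_libs with the install directory and its own argv. Note that the launcher itself must not depend on the bundled libraries in a way that fails before it runs, which is one reason a shell script is the simpler choice.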
glibc is under the LGPL. Under section 6. of LGPL 2.1, you can distribute your program linked to the library provided you comply with one of five options. The first is to provide the source code of the library, along with the object code (source is optional, not required) of your own program, so it can be relinked with the library. You can alternatively provide a written offer of the same. Your own code does not have to be under the LGPL, and you don't have to release source.
libstdc++ is under the GPL, but with a major exception. You can basically just distribute under the license of your choice without providing source for either your own code or libstdc++. The only condition is that you compile normally, without e.g. proprietary modifications or plugins to GCC.
IANAL, and you should consider consulting one if you need real legal advice.
Specifying the option -static-libgcc makes GCC link against the static version of libgcc (the compiler's runtime support library, not the C library) on systems that provide libgcc as a shared library. If no shared libgcc was built, the option has no effect.
I must question what the heck you are doing with the poor library functions?
I have some cross platform software as well. It runs fine on Linux systems of all sorts. Build with the oldest version of software that you want to support. The glibc and libstdc++ libraries are really very backward compatible.
I have built on CentOS 4 and run it on RHEL 6 beta. No problems.
I can build on stable Debian and run it on testing.
Now, I do sometimes have trouble with some libraries if I try to build on, say old Debian and try to run it on CentOS 5.4. That is usually due to distribution configuration choices that are different, like choosing threading or non-threading.

Can multiple versions of a same (Boost) DLL co-exist in same process?

My (C++, cross-platform) app is heavily using Boost libraries (say version 1.x), and I want to also link against a 3rd-party (vendor)'s SDK (no source), itself using Boost (but version 1.y).
So, we both link dynamically against our own version of the Boost DLLs, the CRT being identical. Consequently, at run-time my app would have to load the DLLs of both Boost 1.x and 1.y.
What are the potential issues & gotchas associated?
I can't change vendor's SDK, but I can change my app. Maybe I should try to link statically against my Boost 1.x?
PS: The names of Boost's DLLs include their version, so there is no name collision and both are identifiable. Not the usual DLL hell.
As far as using the DLLs for different versions there should be no problem. At least not on Windows.
This is true if the SDK uses boost only internally. If the SDK uses boost constructs in its interface (for example, it has a function that returns a boost::optional), then having multiple versions can cause problems. It could still work fine, depending on the changes between the versions, but that will definitely be a risk. I don't know of any good solution in that case. The same applies if you include an SDK header file that includes a boost header file.
This is a big problem.
Do a search on DLL hell.
Basically, the DLLs (or shared libs in Linux) are loaded, but not all the names are resolved at load time. What happens is lazy evaluation, so the names are resolved on first use. The problem is that if two DLLs export the same name, then where that name resolves to depends on the order in which the DLLs are searched (which depends on load order).
If you link statically, then you will not have problems with method calls, as yours will all be resolved at link time and the third party's will be resolved at runtime from the DLL. But what about structures that are created by version-1 boost? What if you pass these to the third-party library, which then passes them to its version-x boost? Are the structures laid out in the same way?
This is a very tricky area, and when problems occur they are very hard to debug.
So try and use the same version.
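To illustrate the layout concern, here is a small sketch with two made-up versions of the same struct, placed in the hypothetical namespaces lib_v1 and lib_v2. These are not real Boost types; they only show how a field added between releases moves the members that follow it.

```cpp
#include <cstddef>

// "Version 1" of a library's public struct.
namespace lib_v1 {
struct Record {
    int id;
    double value;
};
}  // namespace lib_v1

// "Version 2" adds a field in the middle, which shifts `value`.
namespace lib_v2 {
struct Record {
    int id;
    double scale;
    double value;
};
}  // namespace lib_v2

// Code compiled against v1 reads `value` at the v1 offset, no matter which
// version actually created the object it was handed.
constexpr bool layouts_match() {
    return sizeof(lib_v1::Record) == sizeof(lib_v2::Record) &&
           offsetof(lib_v1::Record, value) == offsetof(lib_v2::Record, value);
}
```

Here layouts_match() is false, so passing a lib_v2::Record to code built against lib_v1 would read the wrong bytes without any compiler or linker diagnostic. That is why mixing versions is safe only as long as such types never cross the boundary.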
If you write a function foo, and export it from F.dll, and another function foo exported from G.dll, would you expect problems?
When AF.exe is linked, the linker is told: put some code in there that loads the address of function foo from F.dll. Now BG.dll is linked to retrieve the foo address from G.dll. I still see no problem.
Now replace AF.exe with your app, BG.dll with your vendor's app, F.dll with your boost version, G.dll with the vendor's boost version.
Concluding: I see no problems if the dll names are different.