Same symbols in different libraries and linking order - C++

I have two libraries, test.1 and test.2. Both contain a single global function, extern "C" void f();, with different implementations (just a cout for the test).
I did the following test:
Test 1 Dynamic linking:
If I add libtest.1.so and then libtest.2.so in the makefile of the executable and then call f(); in main, libtest.1.so->f() is called.
If I change the order in the makefile, libtest.2.so->f() is called.
Test 2 Static linking:
Exactly the same happens with static libraries.
Test 3 Dynamic loading
As the library is manually loaded, everything works as expected.
I expected an error for multiple definitions, which obviously didn't happen.
Also, this does not break the one-definition-rule, as the situation is different.
It's also not dependency hell (not that it's related to this at all), nor any linking fiasco.
So what is this, then? Undefined behavior? Unspecified behavior? Or does it really depend on the linking order?
And is there a way to easily detect such situations?
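For concreteness, here is a minimal sketch of the setup described above (file names and build commands are illustrative, not taken from the original makefiles):

// f1.cpp -> libtest.1.so (or libtest.1.a)
#include <iostream>
extern "C" void f() { std::cout << "test.1 f()\n"; }

// f2.cpp -> libtest.2.so (or libtest.2.a)
#include <iostream>
extern "C" void f() { std::cout << "test.2 f()\n"; }

// main.cpp
extern "C" void f();
int main() { f(); }

// Build sketch:
//   g++ -fPIC -shared f1.cpp -o libtest.1.so
//   g++ -fPIC -shared f2.cpp -o libtest.2.so
//   g++ main.cpp -L. -ltest.1 -ltest.2   ->  test.1's f() is chosen
//   g++ main.cpp -L. -ltest.2 -ltest.1   ->  test.2's f() is chosen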
Related questions:
dlopen vs linking overhead
What is the difference between dynamic linking and dynamic loading
Is there a downside to using -Bsymbolic-functions?
Why does the order in which libraries are linked sometimes cause errors in GCC?
linking two shared libraries with some of the same symbols
EDIT I did two more tests, which confirm this UB:
I added a second function void g() in test.1 and NOT in test.2.
Using dynamic linking and .so libs, the same happens - f is called in the same manner, and g is also callable (as expected).
But using static linking now changes things: if test.1 comes before test.2, there are no errors, and both functions from test.1 are called.
But when the order is changed, a "multiple definitions" error occurs.
It's clear that "no diagnostic required" applies (see Mark B's answer), but it's "strange" that sometimes the error occurs and sometimes it doesn't.
Anyway, the answer is pretty clear and explains everything above - UB.

A library is a collection of object files. The linker extracts objects from libraries as necessary to satisfy unresolved symbols. Importantly, the linker inspects libraries in the order they appear on the command line, looks into each library just once (unless the command line mentions the library more than once), and takes only the objects that satisfy some reference.
In your first set of tests, everything is clear: the linker satisfies a reference to f() from the first available library, and that's pretty much it.
Now the second set of tests. In the success case, test.1 satisfies both the f and g references, so test.2 is irrelevant. In the failure case, test.2 satisfies the f reference, but g remains undefined. To satisfy g, the linker must pull some object from test.1, which also happens to supply f. Obviously this is a multiple definition.
Notice that in order to get an error, you must have f and g in the same object. If test.1 is composed of two objects (one defining f and another defining g), the error disappears.
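For illustration, here is how test.1 would avoid the error if split into two objects (a sketch; the file names are mine):

// test1_f.cpp -> test1_f.o : defines only f
#include <iostream>
extern "C" void f() { std::cout << "test.1 f()\n"; }

// test1_g.cpp -> test1_g.o : defines only g
#include <iostream>
extern "C" void g() { std::cout << "test.1 g()\n"; }

// ar rcs libtest.1.a test1_f.o test1_g.o
//
// Linking "main.o libtest.2.a libtest.1.a" now succeeds: f comes from
// test.2, and only test1_g.o (which does not define f) is pulled from
// test.1 to satisfy g.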

This absolutely violates the one definition rule in cases 1 and 2. In case 3, since you explicitly specify which version of the function to execute, it may or may not violate it. Violating the ODR is undefined behavior, no diagnostic required.
3.2/3:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.

Related

Any way to tell linker to "respect" __attribute__((__used__))

I am trying to work around the fact that the linker drops the registration in my code.
See this answer for details.
The problem I have with that answer is that the --whole-archive option seems like overkill for just one function call. I would like to avoid the huge code bloat that I assume it causes.
I found __attribute__((used)), but that works at the compile level, not the link level.
So I wonder if there is a specific way to tell the linker to not drop specific function call, instead of changing link options for entire program.
For clarification, this is my code:
bool dummy = (Register(), false); // Register is never called because the linker dropped the entire static library
So I wonder if there is a specific way to tell the linker to not drop specific function call, instead of changing link options for entire program.
Your objective is actually to tell the linker not to drop the definition of an unreferenced variable (dummy) whose initialiser contains a function call that you wish to ensure is executed by your program.
__attribute__((used)) is an attribute of functions, but not of variables, and its effect is to force the compiler to compile the function definition, even if the function is static and appears unreferenced in the translation unit. In your case:
bool dummy = (Register(), false);
it cannot appear to the compiler that Register is unreferenced - it is called - so __attribute__((used)) would be redundant even if the definition of Register() were in the same translation unit and static. But whether or not the definition of Register() is compiled in this translation unit or some other, this call to Register() will not be linked or executed in the program if this definition of dummy is not linked.
I assume you do not want to write a custom linker script, or to modify the source code so that dummy is referenced. In that case you need to instruct the linker to postulate an undefined reference to dummy by passing --undefined=dummy in its options. This will force it to search libraries for a definition of dummy, and to link archive members (and/or shared libraries) exactly as if there actually were an undefined reference to dummy in the first file that is linked. No redundant code will be linked, as is probable with --whole-archive.
You can pass --undefined=<symbol> to the linker for as many values of <symbol> as you like. To pass it through gcc/g++, use -Wl,--undefined=<symbol>.
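A minimal sketch of that approach (Register and the library name are placeholders for the code in question):

// registry.cpp -- archived into libregistry.a; nothing else references it
void Register();                    // defined elsewhere in the project
bool dummy = (Register(), false);   // calls Register() during static init

// Link sketch (GNU ld): the artificial undefined reference to "dummy"
// forces the archive member containing it into the link:
//   g++ main.o -L. -lregistry -Wl,--undefined=dummy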
Put it in its own section and, in the linker script, KEEP that section:
KEEP(*(.sectionname))
Edit: that line of code might be reduced to just zeroing one register or variable.
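Roughly, assuming GCC and a hand-maintained linker script (the section name .registrations is illustrative):

// Place dummy in a dedicated section so the linker script can KEEP it:
void Register();
__attribute__((section(".registrations")))
bool dummy = (Register(), false);

// In the linker script, inside the output section that gathers data,
// keep every input section with that name even if unreferenced:
//   KEEP(*(.registrations))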

Why can I link two libraries exporting the same C-Function in VC?

I have the situation that two C++ libraries export the very same C-function symbols from shared code. When I now compile an executable which links both libraries, I do not get any linker error or warning from VC12. Why is this? It silently just chooses one of the two symbols, and I have no idea which one is chosen.
extern "C" { __declspec(dllexport) int function(void* argument);}
There is a flag named /FORCE which would be able to convince VC to link even if there are multiply defined symbols, but this flag is not set.
I do not find any official information from Microsoft why this links at all. I was expecting to get a LNK4006 warning, but I don't.
I just want to know if this is the expected or undefined behavior, which only did not explode by coincidence. I read things about the One Definition Rule not being applied generally to C-Code, but I cannot find any reliable statement for the VC compiler.
Can I assume that, given the functions do not use any singletons, use the very same code and compiler flags, it does not matter which one is chosen?
You are violating the one definition rule.
The behavior of your program is undefined.
See section "3.2 One definition rule [basic.def.odr]" in the C++ standard.
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. ...
Paragraph 3.2/6 describes when there can be more than one definition of a class type, inline function with external linkage, etc. in a program.
I just want to know if this is the expected or undefined behavior, which only did not explode by coincidence. I read things about the One Definition Rule not being applied generally to C-Code, but I cannot find any reliable statement for the VC compiler.
It is undefined behavior.
The C++ standard is the master, not the VC compiler.
Can I assume that, given the functions do not use any singletons, use the very same code and compiler flags, it does not matter which one is chosen?
It is still undefined behavior - though the program might appear to behave as expected.
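For illustration, a sketch of the situation (MSVC; names and build layout are illustrative). Which DLL ends up supplying the symbol typically follows the order of the import libraries on the link line:

// a.cpp -> A.dll + import library A.lib
extern "C" __declspec(dllexport) int function(void* argument) { return 1; }

// b.cpp -> B.dll + import library B.lib
extern "C" __declspec(dllexport) int function(void* argument) { return 2; }

// main.cpp, linked against both A.lib and B.lib:
extern "C" __declspec(dllimport) int function(void* argument);
int main() { return function(nullptr); }
// The linker silently resolves "function" from one import library
// (typically the first one listed) - an ODR violation either way.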

Dynamic link vs. static link efficiency

I have an argument with another developer that I'd like to settle here, over dynamic linking vs. static linking.
In Theory:
Say you have a library with 100 functions, each with a significant amount of code inside it:
int A();
int B();
int C();
// ...and so on...
And your application only calls or depends on one of them.
You have two methods at your disposal.
Build the library as a dynamically linked library
Build the library as a statically linked library
My colleague claims that when linking the static library into our application, the compiler/linker will not add the code of the 99 unused functions to our executable. I claim it will. I claim that in this scenario the only advantage is having a single executable and not having to distribute the library with our application, but that there will be no significant size difference compared with the dynamically linked approach.
Who is correct?
It can depend on a combination of how the code is organized, and what compiler flags you use.
Following the classic, simple model of things, the linker links in whatever object files from the library are needed to satisfy the symbol references. So if your A(), B() and C() were each defined in different object files, only the object file that contains the symbol you actually use would be linked into the program (unless it, in turn, depends upon one or more of the others, in which case the linker finds object files to satisfy those references as well, recursively, until it has either satisfied them all or found one it can't satisfy, at which point you get the standard "unresolved external XXX" error message).
More recently, most compilers can "package" functions into separate "modules" without your having to put them into separate source files to create separate object files. Details vary, but this can reduce (or eliminate) the need to keep each source file as tiny as possible just to minimize what ends up in the final executable.
So, bottom line: at least for the most part, he's right and you're wrong.
It depends :-)
If you put each function in its own source file, or use the /Gy compile option, each function will be packaged in a separate section of the static library.
The linker will then be able to pick them up as needed, and only include the functions that are actually called.
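The GCC equivalent is -ffunction-sections together with --gc-sections; a sketch (library contents are illustrative):

// lib.cpp -- imagine 100 functions like these, all in one source file
int A() { return 1; }
int B() { return 2; }   // never called
// ...

// main.cpp -- uses only A()
int A();
int main() { return A(); }

// Build sketch: each function gets its own section, and the linker
// discards the sections that nothing references:
//   g++ -c -ffunction-sections lib.cpp
//   ar rcs libfuncs.a lib.o
//   g++ main.cpp -L. -lfuncs -Wl,--gc-sections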

C++ Global class variables are not created

I am looking at a piece of code that creates global class variables. The constructors of these classes call a symbol table singleton and add their this pointers to it.
In a Keywords.cpp file
class A : public KeyWord
{
public:
    A() { add(); }
} ADef;
and similarly for keywords B, C, etc.
void KeyWord::add()
{
    CSymbolCtrl& c = CSymbolCtrl::GetInstance();
    c.addToTable(this);
}
These translation units are compiled to form a library. When I run dumpbin on the library, I see the dynamic initializers for ADef, BDef, etc.
Now in the exe, when I call the CSymbolCtrl instance, I don't find ADef, BDef, etc. stored in its map. When I set a breakpoint in add(), it's not getting hit. Is it possible that the linker is ignoring ADef, BDef because they are not referenced anywhere?
From Standard docs 1.9 Program execution,
4) This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard
as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance,
an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the
observable behavior of the program are produced.
So, it might, yes.
The short answer is yes. A pretty common way to force registration is to do something like:
static bool foo = register_type();
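Spelled out, the idiom looks roughly like this (register_type here is a stand-in for the question's KeyWord::add() machinery):

// keywords.cpp -- the registration runs as a side effect of initializing
// a dummy variable before main(), *provided* this object file is linked:
#include <iostream>

static bool register_type()
{
    std::cout << "keyword registered\n";  // stand-in for addToTable(this)
    return true;
}

static bool foo = register_type();

Note that this forces the compiler to emit the initializer; it does not by itself force the linker to pull the object file out of a library, which is the point of the next answer.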
It's not too clear from your question, but are you actually including the compiled object files in your link or not? Just putting a file in a library doesn't cause it to be included in the final program. By definition, a file from a library will only be included in the executable if it resolves an unresolved external symbol. If you want an object file to be part of the final executable, and it doesn't contain any globals which would resolve an undefined external, then you have several choices:
-- Link the object file directly, rather than putting it in a library. (This is the "standard" or "canonical" way of doing it.)
-- Use a DLL. Despite the name, DLLs are not libraries, but object files, and are linked in an all or nothing way.
-- Create a dummy global symbol, and reference it somewhere. (This can often be automated, and might be the preferred solution if you're delivering a library as a third-party supplier.)

How does function-level linking deal with variables declared at file level?

As I understand it, function-level linking builds (explicitly or not) a graph of all possible calls and includes only the reachable functions' code in the produced binary. But how does it deal with variables declared at file level?
Say I have
MyClass GlobalVariable;
static MyClass StaticGlobalVariable;
in some file that contains only these two variables and a set of functions not actually called from any of the remaining code.
Will the code for these variables' allocation/initialization be included in the output?
From experience (rather than quoting the standard):
If the initialization has visible side effects, like calls into external libraries or file I/O, the initialization will always happen.
boost::singleton_default provides an interesting solution that enforces the initialization to be done only when the object is referenced elsewhere, i.e. when all other references to the object are removed by the linker, the initialization is removed, too.
Edit: Yes. g++ optimization flags try to figure out function calls and prune away .o files, which can result in linker errors. I'm not sure if this happens only with certain optimization flags, but it does happen.
A bad habit in our company is the presence of a lot of 'extern g_GlobalFunction()' declarations in different files. As their calls depended on conditional code, the .o files were often thrown away, resulting in link errors.
We fixed that with g_InitModule() and g_InitFileName() calls that are invoked hierarchically starting from main(). Mostly, these are empty functions meant only to dissuade g++ from discarding the .o files.
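A sketch of that workaround (function names follow the answer; the bodies are illustrative):

// keywords.cpp -- the file whose global initializers keep vanishing
void g_InitFileName() {}   // empty anchor: referencing it drags this .o
                           // (and its dynamic initializers) into the link

// module.cpp -- one call per file in the module
void g_InitFileName();
void g_InitModule() { g_InitFileName(); }

// main.cpp
void g_InitModule();
int main() { g_InitModule(); }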