Avoid multiple definition linker error when not using the redefined symbols - c++

I try to build an executable that links to various shared and static libraries. It turns out that two of the static libraries both define the same symbol, which results in a multiple definition linker error. My executable doesn't use this symbol so it's not really a concern.
I can avoid the error by adding the --allow-multiple-definitions flag but that seems like a nuclear option. I would like the linker to complain if I try to use a multiple-time defined symbol.
Is there a way to tell the linker "complain for multiple definitions only if the symbol is used"? Or alternatively tell it, "from lib ABC ignore symbol XYZ". I am developing with g++ on linux.

You may have a one variant of the problem or a different variant,
depending on facts whose relevance you haven't yet considered. Or possibly you have a mixture of both, so I'll walk through a solution to each variant.
You should be familiar with the nature of static libraries and how they are consumed in linkage,
as summarised here
The Superflous Globals Symbols Variant
Here are a couple of source files and a header file:
one.cpp
#include <onetwo.h>
int clash = 1;
int get_one()
{
return clash;
}
two.cpp
#include <onetwo.h>
int get_two()
{
return 2;
}
onetwo.h
#pragma once
extern int get_one();
extern int get_two();
These have been built into a static library libonetwo.a
$ g++ -Wall -Wextra -pedantic -I. -c one.cpp two.cpp
$ ar rcs libonetwo.a one.o two.o
whose intended API is defined in onetwo.h
Simarily, some other source files and a header have been
built into a static libary libfourfive.a whose intended API
is defined in fourfive.h
four.cpp
#include <fourfive.h>
int clash = 4;
int get_four()
{
return clash;
}
five.cpp
#include <fourfive.h>
int get_five()
{
return 5;
}
fourfive.h
#pragma once
extern int get_four();
extern int get_five();
And here's the source of a program that depends on both libraries:
prog.cpp
#include <onetwo.h>
#include <fourfive.h>
int main()
{
return get_one() + get_four();
}
which we try to build like so:
$ g++ -Wall -Wextra -pedantic -I. -c prog.cpp
$ g++ -o prog prog.o -L. -lonetwo -lfourfive
/usr/bin/ld: ./libfourfive.a(four.o):(.data+0x0): multiple definition of `clash'; ./libonetwo.a(one.o):(.data+0x0): first defined here
collect2: error: ld returned 1 exit status
encountering a name-collision for the symbol clash, because it is globally defined in two of the
object files that the linkage requires, one.o and four.o:
$ readelf -s libonetwo.a libfourfive.a | egrep '(File|Symbol|OBJECT|FUNC)'
File: libonetwo.a(one.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: libonetwo.a(two.o)
Symbol table '.symtab' contains 10 entries:
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 _Z7get_twov
File: libfourfive.a(four.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z8get_fourv
File: libfourfive.a(five.o)
Symbol table '.symtab' contains 10 entries:
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 _Z8get_fivev
The problem symbol clash is not referenced in our own code, prog.(cpp|o). You wondered:
Is there a way to tell the linker "complain for multiple definitions only if the symbol is used"?
No there isn't, but that's immaterial. one.o would not have been extracted from libonetwo.a
and linked into the program if the linker didn't need it to resolve some symbol. It needed it
to resolve get_one. Likewise it only linked four.o because it's needed to resolve get_four.
So the colliding definitions of clash are in the linkage. And although prog.o doesn't use clash,
it does use get_one, which uses clash and which intends to use the defintion of clash in one.o.
Likewise prog.o uses get_four, which uses clash and intends to use the different definition in four.o.
Even if clash was unused by each libary as well as the program, the fact that it is defined
in multiple object files that must be linked into the program means that the program will contain
multiple definitions of it, and only --allow-multiple-definitions will allow that.
In that light you'll also see that:
Or alternatively [is there a way to] tell it, "from lib ABC ignore symbol XYZ".
in general won't fly. If we could tell the linker to ignore (say) the definition of clash
in four.o and resolve the symbol everywhere to the definition in one.o (the only other candidate) then get_four() would return 1 instead of 4 in our program. That
is in fact the effect of --allow-multiple-definitions, since it causes the first definition
in the linkage to be used.
By inspection of the source code of libonetwo.a (or libfourfive.a) we can fairly confidently spot the root
cause of the problem. The symbol clash has been left with external linkage where
it only needed internal linkage, since it isn't declared in the associated
header file and is referenced nowhere in the libary except
in the file where it's defined. The offending source files should have written:
one_good.cpp
#include <onetwo.h>
namespace {
int clash = 1;
}
int get_one()
{
return clash;
}
four_good.cpp
#include <fourfive.h>
namespace {
int clash = 4;
}
int get_four()
{
return clash;
}
and all would be good:
$ g++ -Wall -Wextra -pedantic -I. -c one_good.cpp four_good.cpp
$ readelf -s one_good.o four_good.o | egrep '(File|Symbol|OBJECT|FUNC)'
File: one_good.o
Symbol table '.symtab' contains 11 entries:
5: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 _ZN12_GLOBAL__N_15clashE
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: four_good.o
Symbol table '.symtab' contains 11 entries:
5: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 _ZN12_GLOBAL__N_15clashE
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z8get_fourv
$ g++ -o prog prog.o one_good.o four_good.o
$./prog; echo $?
5
Since re-writing the source code like that is not a option, we have to modify the object files to the
same effect. The tool for this is objcopy.
$ objcopy --localize-symbol=clash libonetwo.a libonetwo_good.a
This command has the same effect as running:
$ objcopy --localize-symbol=clash orig.o fixed.o
on each of the object files libonetwo(orig.o) to output a fixed object file fixed.o,
and archiving all the fixed.o files in a new static library libonetwo_good.a. And the
effect of --localize-symbol=clash, on each object file, is to change the linkage of
the symbol clash, if defined, from external (GLOBAL) to internal (LOCAL):
$ readelf -s libonetwo_good.a | egrep '(File|Symbol|OBJECT|FUNC)'
File: libonetwo_good.a(one.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: libonetwo_good.a(two.o)
Symbol table '.symtab' contains 10 entries:
Now the linker cannot see the LOCAL definition of clash in libonetwo_good.a(one.o).
That's sufficient to head off the multiple definition error, but since libfourfive.a has
the same defect, we'll fix it too:
$ objcopy --localize-symbol=clash libfourfive.a libfourfive_good.a
And then we can relink prog successfully, using the fixed libraries.
$ g++ -o prog prog.o -L. -lonetwo_good -lfourfive_good
$ ./prog; echo $?
5
The Global Symbols Deadlock Variant
In this scenario, the sources and headers for libonetwo.a are:
one.cpp
#include <onetwo.h>
#include "priv_onetwo.h"
int inc_one()
{
return inc(clash);
}
two.cpp
#include <onetwo.h>
#include "priv_onetwo.h"
int inc_two()
{
return inc(clash + 1);
}
priv_onetwo.cpp
#include "priv_onetwo.h"
int clash = 1;
int inc(int i)
{
return i + 1;
}
priv_onetwo.h
#pragma once
extern int clash;
extern int inc(int);
onetwo.h
#pragma once
extern int inc_one();
extern int inc_two();
And for libfourfive.a they are:
four.cpp
#include <fourfive.h>
#include "priv_fourfive.h"
int dec_four()
{
return dec(clash);
}
five.cpp
#include <fourfive.h>
#include "priv_fourfive.h"
int dec_five()
{
return dec(clash + 1);
}
priv_fourfive.cpp
#include "priv_fourfive.h"
int clash = 4;
int dec(int i)
{
return i - 1;
}
priv_fourfive.h
#pragma once
extern int clash;
extern int dec(int);
fourfive.h
#pragma once
extern int dec_four();
extern int dec_five();
Each of these libraries is built with some common internals defined
in a source file - (priv_onetwo.cpp|priv_fourfive.cpp)
- and these internals are globally declared for building the library through a private header
- (priv_onetwo.h|priv_fourfive.h) - that is not distributed with the library.
They are undocumented symbols but nevertheless exposed to the linker.
Now there are two files in each library make that undefined (UND) references
to the global symbol clash, which is defined in another file:
$ readelf -s libonetwo.a libfourfive.a | egrep '(File|Symbol|OBJECT|FUNC|clash)'
File: libonetwo.a(one.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 23 FUNC GLOBAL DEFAULT 1 _Z7inc_onev
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libonetwo.a(two.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z7inc_twov
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libonetwo.a(priv_onetwo.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 _Z3inci
File: libfourfive.a(four.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 23 FUNC GLOBAL DEFAULT 1 _Z8dec_fourv
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libfourfive.a(five.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z8dec_fivev
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libfourfive.a(priv_fourfive.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 _Z3deci
Our program source this time is:
prog.cpp
#include <onetwo.h>
#include <fourfive.h>
int main()
{
return inc_one() + dec_four();
}
and:
$ g++ -Wall -Wextra -pedantic -I. -c prog.cpp
$ g++ -o prog prog.o -L. -lonetwo -lfourfive
/usr/bin/ld: ./libfourfive.a(priv_fourfive.o):(.data+0x0): multiple definition of `clash'; ./libonetwo.a(priv_onetwo.o):(.data+0x0): first defined here
collect2: error: ld returned 1 exit status
once again clash is multiply defined. To resolve inc_one in main, the
linker needed one.o, which obliged it to resolve inc, which made it need
priv_onetwo.o, which contains the first definition of clash. To resolve dec_four in main, the
linker needed four.o, which obliged it to resolve dec, which made it need
priv_fourfive.o, which contains a rival definition of clash.
In this scenario, it isn't a coding error in either library that clash has external linkage.
It needs to have external linkage. Localizing the definition of clash with objcopy in either of
libonetwo.a(priv_onetwo.o) or libfourfive.a(priv_fourfive.o) will not work. If we do
that the linkage will succeed but output a bugged program, because the linker will resolve
clash to the one surviving GLOBAL definition from the other object file: then dec_four()
will return 0 instead of 3 in the program, dec_five() will return 1 not 4 ; or else inc_one() will return 5 and inc_two() will return 6.
And if we localize both definitions then no definition of clash will be found in the
linkage of prog to satisfy the references in one.o or four.o, and it will fail for undefined reference to clash
This time objcopy comes to the rescue again, but with a different option1:
$ objcopy --redefine-sym clash=clash_onetwo libonetwo.a libonetwo_good.a
The effect of this command is to create a new static library libonetwo_good.a,
containing new object files that are pairwise the same as those in libonetwo.a,
except that the symbol clash has been everywhere replaced with clash_onetwo:
$ readelf -s libonetwo_good.a | egrep '(File|Symbol|clash)'
File: libonetwo_good.a(one.o)
Symbol table '.symtab' contains 13 entries:
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash_onetwo
File: libonetwo_good.a(two.o)
Symbol table '.symtab' contains 13 entries:
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash_onetwo
File: libonetwo_good.a(priv_onetwo.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash_onetwo
We'll do the corresponding thing with libfourfive.a:
$ objcopy --redefine-sym clash=clash_fourfive libfourfive.a libfourfive_good.a
Now we're good to go once more:
$ g++ -o prog prog.o -L. -lonetwo_good -lfourfive_good
$ ./prog; echo $?
5
Of the two solutions, use the fix for The Superflous Globals Symbols Variant if
superflous globals is what you've got, although the fix for the The Global Symbols Deadlock Variant
would also work. It is never desirable to tamper with object files
between compilation and linkage; it can only be unavoidable or the lesser of evils. But
if you're going to tamper with them, localizing a global symbol that should never have been global
is a more transparent tampering than changing the name of a symbol to one that
has no origin in source code.
[1] Don't forget that if you want to use objcopy with any option argument that
is a symbol in a C++ object file, you have to use the mangled name of the
C++ identifier than maps to the symbol. In this demo code it happens that the mangled name of the
C++ identifier clash is also clash. But if, e.g. the fully qualified identfier had
been onetwo::clash, its mangled name would be _ZN6onetwo5clashE, as reported by
nm or readelf. Conversely of course if you wished to use objcopy to change _ZN6onetwo5clashE in
an object file to a symbol that will demangle as onetwo::klash, then that symbol will
be _ZN6onetwo5klashE.

Related

How linker allow multiple definitions of a function template in different object files but only allow one-definition of ordinary functions

I know how to use inline keyword to avoid 'multiple definition' while using C++ template. However, what I am curious is that how linker is distinguishing which specialization is full specialization and violating ODR and reporting error, while another specialization is implicit and correctly handle it?
From the nm output, we can see duplicated definitions in main.o and other.o for both int-version max() and char-version max(), but C++ linker only reports 'multiple definition error for char-version max()' but let 'char-version max() go a successful link? How linker differentiate them and does this?
// tmplhdr.hpp
#include <iostream>
// this function is instantiated in main.o and other.o
// but leads no 'multiple definition' error by linker
template<typename T>
T max(T a, T b)
{
std::cout << "match generic\n";
return (b<a)?a:b;
}
// 'multiple definition' link error if without inline
template<>
inline char max(char a, char b)
{
std::cout << "match full specialization\n";
return (b<a)?a:b;
}
// main.cpp
#include "tmplhdr.hpp"
extern int mymax(int, int);
int main()
{
std::cout << max(1,2) << std::endl;
std::cout << mymax(10,20) << std::endl;
std::cout << max('a','b') << std::endl;
return 0;
}
// other.cpp
#include "tmplhdr.hpp"
int mymax(int a, int b)
{
return max(a, b);
}
Test output on Ubuntu is reasonable; but output on Cygwin is rather strange and confusing...
==== Test on Cygwin ====
g++ linker only reported 'char max(char, char)' is duplicated.
$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0):
multiple definition of `char max<char>(char, char)';
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
I dumped my .o object file and found no many clues (maybe I am not quite familiar with object format spec.).
$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <<-- implicit specialization
U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)
==== Test on Ubuntu ====
This is what I have got on my Ubuntu with g++-9 after having remove inline from tmplhdr.hpp
tony#Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
'char-version max()' is marked with T which is not allowed to have multiple definitions; but 'in-version max()' is marked as W which allows multiple definitions. However, I start to be curious why nm gives different marks on Cygwin than on Ubuntu?? and Why linker on Cgywin can handle two T definitions correctly?
tony#Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
U mymax(int, int)
tony#Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)
However, I start to be curious why nm gives different marks on Cygwin than on Ubuntu?? and Why linker on Cgywin can handle two T definitions correctly?
You need to understand that the nm output does not give you the full picture.
nm is part of binutils, and uses libbfd. The way this works is that various object file formats are parsed into libbfd-internal representation, and then tools like nm print that internal representation in human-readable format.
Some things get "lost in translation". This is the reason you should ~never use e.g. objdump to look at ELF files (at least not at the symbol table of the ELF files).
As you correctly deduced, the reason multiple max<int>() symbols are allowed on Linux is that the compiler emits them as a W (weakly defined) symbol.
The same is true for Windows, except Windows uses older COFF format, which doesn't have weak symbols. Instead, the symbol is emitted into a special .linkonce.$name section, and the linker knows that it can select any such section into the link, but should only do that once (i.e. it knows to discard all other duplicates of that section in any other object file).

How to export template instantiation as non weak?

C++ template functions are exported as weak symbols to work around the one definition rule (related question). In a situation where the function is explicitly instantiated for every use case, is there a way to export the symbol as non-weak?
Example use case:
// foo.hpp
template<typename T>
void foo();
// All allowed instantiations are explicitly listed.
extern template void foo<int>();
extern template void foo<short>();
extern template void foo<char>();
// foo.cpp
template<typename T>
void foo()
{
// actual implementation
}
// All explicit instantiations.
template void foo<int>();
template void foo<short>();
template void foo<char>();
When I compile the code above with GCC or ICC, they are tagged as weak:
$ nm foo.o
U __gxx_personality_v0
0000000000000000 W _Z3fooIcEvv
0000000000000000 W _Z3fooIiEvv
0000000000000000 W _Z3fooIsEvv
Is there a way to prevent that? Since they are actually definitive, I would want them to not be candidate for replacement.
objcopy supports the --weaken option, but you want the opposite.
It also supports the --globalize-symbol, but that appears to have no effect on weak symbols:
gcc -c t.cc
readelf -Ws t.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
objcopy -w --globalize-symbol _Z3fooI* t.o t1.o &&
readelf -Ws t1.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
Not to be deterred, we can first localize the symbols, then globalize them:
objcopy -w -L _Z3fooI* t.o t1.o &&
objcopy -w --globalize-symbol _Z3fooI* t1.o t2.o &&
readelf -Ws t2.o | grep _Z3fooI
14: 0000000000000000 7 FUNC GLOBAL DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC GLOBAL DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC GLOBAL DEFAULT 9 _Z3fooIcEvv
VoilĂ : the symbols are now strongly defined.
The problem I am trying to solve is that the link time is too slow and I want to reduce the work of the linker to the minimum.
If this makes the linker do less work (which I doubt), I'd consider that a bug in the linker -- if the symbol is defined once, it shouldn't matter to the linker whether that definition is strong or weak.

ld undefined reference, despite library found and symbols exported

Been fighting with this on and off for 48 hours now; I'm still getting undefined reference errors when attempting to link a dynamic library with its dependency - despite all exports existing, and the library being found successfully.
Scenario:
libmemory (C++) - exports functions with extern "C"
libstring (C) - exports functions, imports from libmemory
libmemory builds successfully:
$ g++ -shared -fPIC -o ./builds/libmemory.so ...$(OBJECTS)...
libstring compiles successfully, but fails to link:
$ gcc -shared -fPIC -o ./builds/libstring.so ...$(OBJECTS)... -L./builds -lmemory
./temp/libstring/string.o: In function `STR_duplicate':
string.c:(.text+0x1cb): undefined reference to `MEM_priv_alloc'
./temp/libstring/string.o: In function `STR_duplicate_replace':
string.c:(.text+0x2a0): undefined reference to `MEM_priv_free'
string.c:(.text+0x2bf): undefined reference to `MEM_priv_alloc'
/usr/bin/ld: ./builds/libstring.so: hidden symbol `MEM_priv_free' isn't defined
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
Verifying libmemory exports its symbols, and the library itself is found by using -v to gcc:
...
attempt to open ./builds/libmemory.so succeeded
-lmemory (./builds/libmemory.so)
...
$ nm -gC ./builds/libmemory.so | grep MEM_
0000000000009178 T MEM_exit
0000000000009343 T MEM_init
00000000000093e9 T MEM_print_leaks
00000000000095be T MEM_priv_alloc
000000000000971d T MEM_priv_free
00000000000099c1 T MEM_priv_realloc
0000000000009d26 T MEM_set_callback_leak
0000000000009d3f T MEM_set_callback_noleak
$ objdump -T ./builds/libmemory.so | grep MEM_
0000000000009d3f g DF .text 0000000000000019 Base MEM_set_callback_noleak
00000000000093e9 g DF .text 00000000000001d5 Base MEM_print_leaks
0000000000009d26 g DF .text 0000000000000019 Base MEM_set_callback_leak
00000000000099c1 g DF .text 0000000000000365 Base MEM_priv_realloc
0000000000009343 g DF .text 00000000000000a6 Base MEM_init
00000000000095be g DF .text 000000000000015f Base MEM_priv_alloc
000000000000971d g DF .text 00000000000002a4 Base MEM_priv_free
0000000000009178 g DF .text 00000000000000a7 Base MEM_exit
$ readelf -Ws ./builds/libmemory.so | grep MEM_
49: 0000000000009d3f 25 FUNC GLOBAL DEFAULT 11 MEM_set_callback_noleak
95: 00000000000093e9 469 FUNC GLOBAL DEFAULT 11 MEM_print_leaks
99: 0000000000009d26 25 FUNC GLOBAL DEFAULT 11 MEM_set_callback_leak
118: 00000000000099c1 869 FUNC GLOBAL DEFAULT 11 MEM_priv_realloc
126: 0000000000009343 166 FUNC GLOBAL DEFAULT 11 MEM_init
145: 00000000000095be 351 FUNC GLOBAL DEFAULT 11 MEM_priv_alloc
192: 000000000000971d 676 FUNC GLOBAL DEFAULT 11 MEM_priv_free
272: 0000000000009178 167 FUNC GLOBAL DEFAULT 11 MEM_exit
103: 0000000000009343 166 FUNC GLOBAL DEFAULT 11 MEM_init
108: 0000000000009178 167 FUNC GLOBAL DEFAULT 11 MEM_exit
148: 0000000000009d3f 25 FUNC GLOBAL DEFAULT 11 MEM_set_callback_noleak
202: 00000000000095be 351 FUNC GLOBAL DEFAULT 11 MEM_priv_alloc
267: 000000000000971d 676 FUNC GLOBAL DEFAULT 11 MEM_priv_free
342: 0000000000009d26 25 FUNC GLOBAL DEFAULT 11 MEM_set_callback_leak
346: 00000000000099c1 869 FUNC GLOBAL DEFAULT 11 MEM_priv_realloc
366: 00000000000093e9 469 FUNC GLOBAL DEFAULT 11 MEM_print_leaks
Is there something horribly simple I'm missing? All the other related questions to this have simple answers such as link library order, and the paths used - but I've already verified they're in place and working as expected.
Tinkering with -fvisibility led to no changes either.
The same result exists whether using clang or gcc.
Linux 3.16.0-38-generic
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
You should mark MEM_priv_alloc() function as extern "C" or wrap function body into extern "C" { /* function implementation */ } (already done as I can see from description).
And use combination of __cplusplus and extern "C" in header
#ifdef __cplusplus
#define EXTERNC extern "C"
#else
#define EXTERNC
#endif
EXTERNC int MEM_priv_alloc (void);
Please also double check how MEM_priv_alloc prototype is done. For example, MEM_priv_alloc should not be inline (unfortunately I do not know phisics of this, however inline example failed for me)
Below is simplified example that works for me:
files:
$ ls
main.c Makefile mem.cpp mem.h strings.c strings.h
mem.cpp
#include <stdio.h>
#include "mem.h"
extern "C" int MEM_priv_alloc (void)
{
return 0;
}
mem.h
#ifdef __cplusplus
#define EXTERNC extern "C"
#else
#define EXTERNC
#endif
EXTERNC int MEM_priv_alloc (void);
strings.c:
#include <stdio.h>
#include "mem.h"
int STR_duplicate_replace (void)
{
MEM_priv_alloc();
}
strings.h:
int STR_duplicate_replace (void);
main.c
#include <stdio.h>
#include "string.h"
int main (void)
{
STR_duplicate_replace ();
return 0;
}
Makefile:
prog: libmemory.so libstrings.so
gcc -o $# -L. -lmemory -lstrings main.c
libmemory.so: mem.cpp
g++ -shared -fPIC -o $# $^
libstrings.so: strings.c
gcc -shared -fPIC -o $# $^
and the build log:
g++ -shared -fPIC -o libmemory.so mem.cpp
gcc -shared -fPIC -o libstrings.so strings.c
gcc -o prog -L. -lmemory -lstrings main.c
please also see more details in
Using C++ library in C code and
wiki How to mix C and C++
So, I was stripping out the final parts of the amalgamation, and uncovered the issue.
My import/export is modelled off of this: https://gcc.gnu.org/wiki/Visibility
My equivalent implementation ends up looking like this:
# if GCC_IS_V4_OR_LATER
# define DLLEXPORT __attribute__((visibility("default")))
# define DLLIMPORT __attribute__((visibility("hidden")))
# else
# define DLLEXPORT
# define DLLIMPORT
# endif
The DLLIMPORT (visibility hidden) is causing the problem; I replace it with a blank definition and it's all ok. Yes, I did also have the equivalent for clang, which is why that also failed in the same way.
My takeaway from this is that the C code only ever saw these would-be-symbols as hidden, and therefore couldn't import them no matter how hard it tried, and however much they actually existed!

Am I guaranteed to not be bitten by this ODR violation?

I have a header file that declares a template with a static variable and also defines it:
/* my_header.hpp */
#ifndef MY_HEADER_HPP_
#define MY_HEADER_HPP_
#include <cstdio>
template<int n>
struct foo {
static int bar;
static void dump() { printf("%d\n", bar); }
};
template<int n>
int foo<n>::bar;
#endif // MY_HEADER_HPP_
This header is included both by main.cpp and a shared library mylib. In particular, mylib_baz.hpp just includes this template and declares a function that modifies a specialization of the template.
/* mylib_baz.hpp */
#ifndef MYLIB_BAZ_HPP_
#define MYLIB_BAZ_HPP_
#include "my_header.hpp"
typedef foo<123> mylib_foo;
void init_mylib_foo();
#endif // MYLIB_BAZ_HPP_
and
/* mylib_baz.cpp */
#include "mylib_baz.hpp"
void init_mylib_foo() {
mylib_foo::bar = 123;
mylib_foo::dump();
};
When I make mylib.so (containing mylib_baz.o), the symbol for foo<123>::bar is present and declared weak. However, the symbol for foo<123>::bar is declared weak also in my main.o:
/* main.cpp */
#include "my_header.hpp"
#include "mylib_baz.hpp"
int main() {
foo<123>::bar = 456;
foo<123>::dump(); /* outputs 456 */
init_mylib_foo(); /* outputs 123 */
foo<123>::dump(); /* outputs 123 -- is this guaranteed? */
}
It appears that I am violating one definition rule (foo<123>::bar defined both in my_header.cpp and main.cpp). However, both with g++ and clang the symbols are declared weak (or unique), so I am not getting bitten by this -- all accesses to foo<123>::bar modify the same object.
Question 1: Is this a coincidence (maybe ODR works differently for static members of templates?) or am I in fact guaranteed this behavior by the standard?
Question 2: How could I have predicted the behavior I'm observing? That is, what exactly makes the compiler declare symbol weak?
There is no ODR violation. You have one definition of bar. It is here:
template<int n>
int foo<n>::bar; // <==
As bar is static, that indicates that there is one definition across all translation units. Even though bar will show up once in all of your object files (they need a symbol for it, after all), the linker will merge them together to be the one true definition of bar. You can see that:
$ g++ -std=c++11 -c mylib_baz.cpp -o mylib_baz.o
$ g++ -std=c++11 -c main.cpp -o main.o
$ g++ main.o mylib_baz.o -o a.out
Produces:
$ nm mylib_baz.o | c++filt | grep bar
0000000000000000 u foo<123>::bar
$ nm main.o | c++filt | grep bar
0000000000000000 u foo<123>::bar
$ nm a.out | c++filt | grep bar
0000000000601038 u foo<123>::bar
Where u indicates:
"u"
The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use.

ld of data file makes size of data an *ABS* and not an integer

I have a c++ program which includes an external dependency on an empty xlsx file. To remove this dependency I converted this file to a binary object in view of linking it in directly, using:
ld -r -b binary -o template.o template.xlsx
followed by
objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents template.o template.o
Using objdump, I can see three variables declared :
$ objdump -x template.o
template.o: file format elf64-x86-64
template.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .rodata 00000fd1 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000fd1 g *ABS* 0000000000000000 _binary_template_xlsx_size
0000000000000000 g .rodata 0000000000000000 _binary_template_xlsx_start
0000000000000fd1 g .rodata 0000000000000000 _binary_template_xlsx_end
I then tell my program about this data :
template.h:
#ifndef TEMPLATE_H
#define TEMPLATE_H
#include <cstddef>
extern "C" {
extern const char _binary_template_xlsx_start[];
extern const char _binary_template_xlsx_end[];
extern const int _binary_template_xlsx_size;
}
#endif
This compiles and links fine,(although I am having some trouble automating it with cmake, see here : compile and add object file from binary with cmake)
However, when I use _binary_template_xlsx_size in my code, it is interpreted as a pointer to an address that doesn't exist. So to get the size of my data, I have to pass (int)&_binary_template_xlsx_size (or (int)(_binary_template_xlsx_end - _binary_template_xlsx_start))
Some research tells me that the *ABS* in the objdump above means "absolute value" but I don't get why. How can I get my c++ (or c) program to see the variable as an int and not as a pointer?
An *ABS* symbol is an absolute address; it's more often created by passing --defsym foo=0x1234 to ld.
--defsym symbol=expression
Create a global symbol in the output file, containing the absolute
address given by expression. [...]
Because an absolute symbol is a constant, it's not possible to link it into a C source file as a variable; all C object variables have an address, but a constant doesn't.
To make sure you don't dereference the address (i.e. read the variable) by accident, it's best to define it as const char [] as you have with the other symbols:
extern const char _binary_template_xlsx_size[];
If you want to make sure you're using it as an int, you could use a macro:
extern const char _abs_binary_template_xlsx_size[] asm("_binary_template_xlsx_size");
#define _binary_template_xlsx_size ((int) (intptr_t) _abs_binary_template_xlsx_size)