Why does g++ generate two constructors with different name manglings? [duplicate] - c++

Test case as follows:
// test.cpp
class X {
public:
X();
};
X::X() { }
void foo() {
X x;
}
Compile it and read the symbols in the object file like this:
[root@localhost tmp]# g++ -c test.cpp
[root@localhost tmp]# readelf -s -W test.o
Symbol table '.symtab' contains 12 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS test.cpp
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 6
6: 0000000000000000 0 SECTION LOCAL DEFAULT 7
7: 0000000000000000 0 SECTION LOCAL DEFAULT 5
8: 0000000000000000 10 FUNC GLOBAL DEFAULT 1 _ZN1XC2Ev => X::X()
9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __gxx_personality_v0
10: 0000000000000000 10 FUNC GLOBAL DEFAULT 1 _ZN1XC1Ev => X::X()
11: 000000000000000a 22 FUNC GLOBAL DEFAULT 1 _Z3foov
[root@localhost tmp]# c++filt _ZN1XC1Ev
X::X()
[root@localhost tmp]# c++filt _ZN1XC2Ev
X::X()
Why does g++ generate two constructors with different name manglings (_ZN1XC1Ev and _ZN1XC2Ev)?

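It is listed among the known g++ bugs (refer Known g++ bugs):
G++ emits two copies of constructors and destructors.
Strictly speaking, this is a consequence of the C++ ABI that g++ follows rather than a malfunction.
In general there are three types of constructors (and destructors):
1. The complete object constructor/destructor (mangled with C1/D1).
2. The base object constructor/destructor (mangled with C2/D2).
3. The allocating constructor/deleting destructor (mangled with C3/D0).
The first two differ only when virtual base classes are involved. Class X has none, so _ZN1XC1Ev (complete object) and _ZN1XC2Ev (base object) do exactly the same thing, but both symbols are still emitted because other object files may reference either one.
If you want to know more about these three types of ctors and dtors, refer link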

Related

How does the compiler know the name of captured variables in an std::function?

Something surprised me, and that was that if I assign a capturing lambda (syntactic sugar for a constructed object with an overloaded function call operator) to an std::function during debug at runtime the compiler tells me the names of the members of that functor/class:
#include <functional>
#include <vector>
std::function<void()> callable1;
std::function<void()> callable2;
std::function<void()> callable3;
int main()
{
int aa = 1;
int bb = 1;
callable1 = [dog = aa, cat = bb]() {
return 5;
};
callable2 = [kitchen = aa, lounge = bb]() {
return 5;
};
callable3 = [hammer = aa, sickle = bb]() {
return 5;
};
std::vector < std::function<void()>> callables;
callables.push_back(callable1);
callables.push_back(callable2);
callables.push_back(callable3);
}
And what I see in the debugger is:
[screenshot not preserved]
That's very impressive, and I'm wondering how it's done. Whatever technique is being used, I'm guessing it is a kind of reflection, which I would love to learn about because C++ doesn't have reflection features. The funny thing is that the size of a std::function is 64 bytes on my machine whether in debug or release mode, so I don't know how they're storing the strings that name the members.
How does the compiler know the name of captured variables in an std::function?
On Linux, gcc -g will add additional sections to the ELF executable.
$ readelf -S ./a.out | grep debug
[29] .debug_aranges PROGBITS 0000000000000000 0000561c
[30] .debug_info PROGBITS 0000000000000000 00005aec
[31] .debug_abbrev PROGBITS 0000000000000000 0000cf7e
[32] .debug_line PROGBITS 0000000000000000 0000da6c
[33] .debug_str PROGBITS 0000000000000000 0000ed66
[34] .debug_ranges PROGBITS 0000000000000000 0001499c
These sections contain debugging information in DWARF format about the program. For example in .debug_str there are strings referenced by debugging information in .debug_info:
$ readelf -x .debug_str ./a.out | grep kitchen
0x00005ae0 656e005f 5f6b6974 6368656e 005f5f73 en.__kitchen.__s
$ readelf -x .debug_str ./a.out | grep dog
0x00004170 5f646f67 005f5a4e 53743676 6563746f _dog._ZNSt6vecto
$ readelf -x .debug_str ./a.out | grep sickle
0x000042a0 64005f5f 7369636b 6c650077 6373746f d.__sickle.wcsto
The debugger inspects the std::function and finds a function pointer. It then looks up the entries in .debug_info that describe the code behind that pointer, parses them, and your IDE displays the parsed results.
How can I get this information? I mean the string names and offsets, etc?
Most commonly, you access that information with the debugger. It is the tool for accessing that information. You can use libdwarf and elfutils to access that information from your own programs.
For example, you can link an ELF library into your program, open(argv[0]), and then parse the debugging sections of your own ELF executable and display information about it. Typically, inspecting the ELF itself is done to display nice stack traces on C++ exceptions.

How to export template instantiation as non weak?

C++ template functions are exported as weak symbols to work around the one definition rule (related question). In a situation where the function is explicitly instantiated for every use case, is there a way to export the symbol as non-weak?
Example use case:
// foo.hpp
template<typename T>
void foo();
// All allowed instantiations are explicitly listed.
extern template void foo<int>();
extern template void foo<short>();
extern template void foo<char>();
// foo.cpp
template<typename T>
void foo()
{
// actual implementation
}
// All explicit instantiations.
template void foo<int>();
template void foo<short>();
template void foo<char>();
When I compile the code above with GCC or ICC, they are tagged as weak:
$ nm foo.o
U __gxx_personality_v0
0000000000000000 W _Z3fooIcEvv
0000000000000000 W _Z3fooIiEvv
0000000000000000 W _Z3fooIsEvv
Is there a way to prevent that? Since they are actually definitive, I would want them not to be candidates for replacement.
objcopy supports the --weaken option, but you want the opposite.
It also supports the --globalize-symbol, but that appears to have no effect on weak symbols:
gcc -c t.cc
readelf -Ws t.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
objcopy -w --globalize-symbol _Z3fooI* t.o t1.o &&
readelf -Ws t1.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
Not to be deterred, we can first localize the symbols, then globalize them:
objcopy -w -L _Z3fooI* t.o t1.o &&
objcopy -w --globalize-symbol _Z3fooI* t1.o t2.o &&
readelf -Ws t2.o | grep _Z3fooI
14: 0000000000000000 7 FUNC GLOBAL DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC GLOBAL DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC GLOBAL DEFAULT 9 _Z3fooIcEvv
Voilà: the symbols are now strongly defined.
The problem I am trying to solve is that the link time is too slow and I want to reduce the work of the linker to the minimum.
If this makes the linker do less work (which I doubt), I'd consider that a bug in the linker -- if the symbol is defined once, it shouldn't matter to the linker whether that definition is strong or weak.

Avoid multiple definition linker error when not using the redefined symbols

I try to build an executable that links to various shared and static libraries. It turns out that two of the static libraries both define the same symbol, which results in a multiple definition linker error. My executable doesn't use this symbol so it's not really a concern.
I can avoid the error by adding the --allow-multiple-definition flag but that seems like a nuclear option. I would like the linker to complain if I try to use a multiply defined symbol.
Is there a way to tell the linker "complain for multiple definitions only if the symbol is used"? Or alternatively tell it, "from lib ABC ignore symbol XYZ". I am developing with g++ on linux.
You may have one variant of the problem or a different variant,
depending on facts whose relevance you haven't yet considered. Or possibly you have a mixture of both, so I'll walk through a solution to each variant.
You should be familiar with the nature of static libraries and how they are consumed in linkage,
as summarised here
The Superfluous Global Symbols Variant
Here are a couple of source files and a header file:
one.cpp
#include <onetwo.h>
int clash = 1;
int get_one()
{
return clash;
}
two.cpp
#include <onetwo.h>
int get_two()
{
return 2;
}
onetwo.h
#pragma once
extern int get_one();
extern int get_two();
These have been built into a static library libonetwo.a
$ g++ -Wall -Wextra -pedantic -I. -c one.cpp two.cpp
$ ar rcs libonetwo.a one.o two.o
whose intended API is defined in onetwo.h
Similarly, some other source files and a header have been
built into a static library libfourfive.a whose intended API
is defined in fourfive.h
four.cpp
#include <fourfive.h>
int clash = 4;
int get_four()
{
return clash;
}
five.cpp
#include <fourfive.h>
int get_five()
{
return 5;
}
fourfive.h
#pragma once
extern int get_four();
extern int get_five();
And here's the source of a program that depends on both libraries:
prog.cpp
#include <onetwo.h>
#include <fourfive.h>
int main()
{
return get_one() + get_four();
}
which we try to build like so:
$ g++ -Wall -Wextra -pedantic -I. -c prog.cpp
$ g++ -o prog prog.o -L. -lonetwo -lfourfive
/usr/bin/ld: ./libfourfive.a(four.o):(.data+0x0): multiple definition of `clash'; ./libonetwo.a(one.o):(.data+0x0): first defined here
collect2: error: ld returned 1 exit status
encountering a name-collision for the symbol clash, because it is globally defined in two of the
object files that the linkage requires, one.o and four.o:
$ readelf -s libonetwo.a libfourfive.a | egrep '(File|Symbol|OBJECT|FUNC)'
File: libonetwo.a(one.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: libonetwo.a(two.o)
Symbol table '.symtab' contains 10 entries:
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 _Z7get_twov
File: libfourfive.a(four.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z8get_fourv
File: libfourfive.a(five.o)
Symbol table '.symtab' contains 10 entries:
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 _Z8get_fivev
The problem symbol clash is not referenced in our own code, prog.(cpp|o). You wondered:
Is there a way to tell the linker "complain for multiple definitions only if the symbol is used"?
No there isn't, but that's immaterial. one.o would not have been extracted from libonetwo.a
and linked into the program if the linker didn't need it to resolve some symbol. It needed it
to resolve get_one. Likewise it only linked four.o because it's needed to resolve get_four.
So the colliding definitions of clash are in the linkage. And although prog.o doesn't use clash,
it does use get_one, which uses clash and which intends to use the definition of clash in one.o.
Likewise prog.o uses get_four, which uses clash and intends to use the different definition in four.o.
Even if clash were unused by each library as well as the program, the fact that it is defined
in multiple object files that must be linked into the program means that the program will contain
multiple definitions of it, and only --allow-multiple-definition will allow that.
In that light you'll also see that:
Or alternatively [is there a way to] tell it, "from lib ABC ignore symbol XYZ".
in general won't fly. If we could tell the linker to ignore (say) the definition of clash
in four.o and resolve the symbol everywhere to the definition in one.o (the only other candidate) then get_four() would return 1 instead of 4 in our program. That
is in fact the effect of --allow-multiple-definition, since it causes the first definition
in the linkage to be used.
By inspection of the source code of libonetwo.a (or libfourfive.a) we can fairly confidently spot the root
cause of the problem. The symbol clash has been left with external linkage where
it only needed internal linkage, since it isn't declared in the associated
header file and is referenced nowhere in the library except
in the file where it's defined. The offending source files should have written:
one_good.cpp
#include <onetwo.h>
namespace {
int clash = 1;
}
int get_one()
{
return clash;
}
four_good.cpp
#include <fourfive.h>
namespace {
int clash = 4;
}
int get_four()
{
return clash;
}
and all would be good:
$ g++ -Wall -Wextra -pedantic -I. -c one_good.cpp four_good.cpp
$ readelf -s one_good.o four_good.o | egrep '(File|Symbol|OBJECT|FUNC)'
File: one_good.o
Symbol table '.symtab' contains 11 entries:
5: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 _ZN12_GLOBAL__N_15clashE
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: four_good.o
Symbol table '.symtab' contains 11 entries:
5: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 _ZN12_GLOBAL__N_15clashE
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z8get_fourv
$ g++ -o prog prog.o one_good.o four_good.o
$ ./prog; echo $?
5
Since re-writing the source code like that is not an option, we have to modify the object files to the
same effect. The tool for this is objcopy.
$ objcopy --localize-symbol=clash libonetwo.a libonetwo_good.a
This command has the same effect as running:
$ objcopy --localize-symbol=clash orig.o fixed.o
on each of the object files libonetwo.a(orig.o) to output a fixed object file fixed.o,
and archiving all the fixed.o files in a new static library libonetwo_good.a. And the
effect of --localize-symbol=clash, on each object file, is to change the linkage of
the symbol clash, if defined, from external (GLOBAL) to internal (LOCAL):
$ readelf -s libonetwo_good.a | egrep '(File|Symbol|OBJECT|FUNC)'
File: libonetwo_good.a(one.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT LOCAL DEFAULT 3 clash
10: 0000000000000000 16 FUNC GLOBAL DEFAULT 1 _Z7get_onev
File: libonetwo_good.a(two.o)
Symbol table '.symtab' contains 10 entries:
Now the linker cannot see the LOCAL definition of clash in libonetwo_good.a(one.o).
That's sufficient to head off the multiple definition error, but since libfourfive.a has
the same defect, we'll fix it too:
$ objcopy --localize-symbol=clash libfourfive.a libfourfive_good.a
And then we can relink prog successfully, using the fixed libraries.
$ g++ -o prog prog.o -L. -lonetwo_good -lfourfive_good
$ ./prog; echo $?
5
The Global Symbols Deadlock Variant
In this scenario, the sources and headers for libonetwo.a are:
one.cpp
#include <onetwo.h>
#include "priv_onetwo.h"
int inc_one()
{
return inc(clash);
}
two.cpp
#include <onetwo.h>
#include "priv_onetwo.h"
int inc_two()
{
return inc(clash + 1);
}
priv_onetwo.cpp
#include "priv_onetwo.h"
int clash = 1;
int inc(int i)
{
return i + 1;
}
priv_onetwo.h
#pragma once
extern int clash;
extern int inc(int);
onetwo.h
#pragma once
extern int inc_one();
extern int inc_two();
And for libfourfive.a they are:
four.cpp
#include <fourfive.h>
#include "priv_fourfive.h"
int dec_four()
{
return dec(clash);
}
five.cpp
#include <fourfive.h>
#include "priv_fourfive.h"
int dec_five()
{
return dec(clash + 1);
}
priv_fourfive.cpp
#include "priv_fourfive.h"
int clash = 4;
int dec(int i)
{
return i - 1;
}
priv_fourfive.h
#pragma once
extern int clash;
extern int dec(int);
fourfive.h
#pragma once
extern int dec_four();
extern int dec_five();
Each of these libraries is built with some common internals defined
in a source file - (priv_onetwo.cpp|priv_fourfive.cpp)
- and these internals are globally declared for building the library through a private header
- (priv_onetwo.h|priv_fourfive.h) - that is not distributed with the library.
They are undocumented symbols but nevertheless exposed to the linker.
Now there are two files in each library that make undefined (UND) references
to the global symbol clash, which is defined in another file:
$ readelf -s libonetwo.a libfourfive.a | egrep '(File|Symbol|OBJECT|FUNC|clash)'
File: libonetwo.a(one.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 23 FUNC GLOBAL DEFAULT 1 _Z7inc_onev
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libonetwo.a(two.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z7inc_twov
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libonetwo.a(priv_onetwo.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 _Z3inci
File: libfourfive.a(four.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 23 FUNC GLOBAL DEFAULT 1 _Z8dec_fourv
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libfourfive.a(five.o)
Symbol table '.symtab' contains 13 entries:
9: 0000000000000000 26 FUNC GLOBAL DEFAULT 1 _Z8dec_fivev
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash
File: libfourfive.a(priv_fourfive.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 _Z3deci
Our program source this time is:
prog.cpp
#include <onetwo.h>
#include <fourfive.h>
int main()
{
return inc_one() + dec_four();
}
and:
$ g++ -Wall -Wextra -pedantic -I. -c prog.cpp
$ g++ -o prog prog.o -L. -lonetwo -lfourfive
/usr/bin/ld: ./libfourfive.a(priv_fourfive.o):(.data+0x0): multiple definition of `clash'; ./libonetwo.a(priv_onetwo.o):(.data+0x0): first defined here
collect2: error: ld returned 1 exit status
once again clash is multiply defined. To resolve inc_one in main, the
linker needed one.o, which obliged it to resolve inc, which made it need
priv_onetwo.o, which contains the first definition of clash. To resolve dec_four in main, the
linker needed four.o, which obliged it to resolve dec, which made it need
priv_fourfive.o, which contains a rival definition of clash.
In this scenario, it isn't a coding error in either library that clash has external linkage.
It needs to have external linkage. Localizing the definition of clash with objcopy in either of
libonetwo.a(priv_onetwo.o) or libfourfive.a(priv_fourfive.o) will not work. If we do
that the linkage will succeed but output a bugged program, because the linker will resolve
clash to the one surviving GLOBAL definition from the other object file: then dec_four()
will return 0 instead of 3 in the program, and dec_five() will return 1, not 4; or else inc_one() will return 5 and inc_two() will return 6.
And if we localize both definitions then no definition of clash will be found in the
linkage of prog to satisfy the references in one.o or four.o, and it will fail with an undefined reference to clash.
This time objcopy comes to the rescue again, but with a different option [1]:
$ objcopy --redefine-sym clash=clash_onetwo libonetwo.a libonetwo_good.a
The effect of this command is to create a new static library libonetwo_good.a,
containing new object files that are pairwise the same as those in libonetwo.a,
except that the symbol clash has been everywhere replaced with clash_onetwo:
$ readelf -s libonetwo_good.a | egrep '(File|Symbol|clash)'
File: libonetwo_good.a(one.o)
Symbol table '.symtab' contains 13 entries:
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash_onetwo
File: libonetwo_good.a(two.o)
Symbol table '.symtab' contains 13 entries:
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND clash_onetwo
File: libonetwo_good.a(priv_onetwo.o)
Symbol table '.symtab' contains 11 entries:
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 2 clash_onetwo
We'll do the corresponding thing with libfourfive.a:
$ objcopy --redefine-sym clash=clash_fourfive libfourfive.a libfourfive_good.a
Now we're good to go once more:
$ g++ -o prog prog.o -L. -lonetwo_good -lfourfive_good
$ ./prog; echo $?
5
Of the two solutions, use the fix for The Superfluous Global Symbols Variant if
superfluous globals are what you've got, although the fix for The Global Symbols Deadlock Variant
would also work. It is never desirable to tamper with object files
between compilation and linkage; it can only be unavoidable or the lesser of evils. But
if you're going to tamper with them, localizing a global symbol that should never have been global
is a more transparent tampering than changing the name of a symbol to one that
has no origin in source code.
[1] Don't forget that if you want to use objcopy with any option argument that
is a symbol in a C++ object file, you have to use the mangled name of the
C++ identifier that maps to the symbol. In this demo code it happens that the mangled name of the
C++ identifier clash is also clash. But if, e.g., the fully qualified identifier had
been onetwo::clash, its mangled name would be _ZN6onetwo5clashE, as reported by
nm or readelf. Conversely, of course, if you wished to use objcopy to change _ZN6onetwo5clashE in
an object file to a symbol that will demangle as onetwo::klash, then that symbol will
be _ZN6onetwo5klashE.

Clang++ 3.5.0 -rdynamic

I'm compiling c++ code and I'm trying to add in the -rdynamic option so I can print out a meaningful stack trace for debugging my c++ program, but clang throws back a warning saying "argument unused during compilation: '-rdynamic'".
As a test, on my system I've tried writing a simple c++ program and compiling it with -rdynamic and it worked no problem, but with this project it doesn't seem to go.
Any advice is much appreciated.
You are likely using the -rdynamic flag when you are just compiling the source code, not linking it.
It's a flag for the linker, so you only need it when linking.
Some versions of clang might not recognize it, in which case you can just instruct clang to pass the proper option to the linker, which commonly is:
-Wl,--export-dynamic
So, e.g.
clang++ -rdynamic test.cpp
or
clang++ -Wl,--export-dynamic test.cpp
But if you are compiling and linking separately, only use it at the linking stage:
clang++ -c test.cpp
clang++ -Wl,--export-dynamic test.o
(or as the last step: clang++ -rdynamic test.o)
nos's answer is right, and it helped me a lot.
One small note: the option is spelled -Wl,--export-dynamic, with a single leading dash, not --Wl,--export-dynamic.
There are also some ways to make sure -rdynamic worked.
Use readelf -s to see the ELF symbols:
e.g.
$ cat t.c
#include <stdio.h>
void bar() {}
void baz() {}
void foo() {}
int main() { foo(); printf("test"); return 0; }
$ clang -O0 -o test t.c
$ readelf -s test >test.elf
Symbol table '.dynsym' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.17 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND abort@GLIBC_2.17 (2)
5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.17 (2)
$ clang -rdynamic -O0 -o test1 t.c
$ readelf -s test1 >test1.elf
Symbol table '.dynsym' contains 24 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.17 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND abort@GLIBC_2.17 (2)
5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.17 (2)
7: 0000000000420038 0 NOTYPE GLOBAL DEFAULT 25 _bss_end__
8: 00000000004009a0 68 FUNC GLOBAL DEFAULT 14 main
9: 0000000000420030 0 NOTYPE GLOBAL DEFAULT 25 __bss_start__
10: 0000000000420030 0 NOTYPE GLOBAL DEFAULT 25 __bss_start
11: 0000000000400994 4 FUNC GLOBAL DEFAULT 14 bar
12: 0000000000400a7c 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used
13: 0000000000420038 0 NOTYPE GLOBAL DEFAULT 25 _end
14: 0000000000420038 0 NOTYPE GLOBAL DEFAULT 25 __end__
15: 0000000000420020 0 NOTYPE GLOBAL DEFAULT 24 __data_start
16: 0000000000420030 0 NOTYPE GLOBAL DEFAULT 24 _edata
17: 0000000000400a68 4 FUNC GLOBAL DEFAULT 14 __libc_csu_fini
18: 000000000040099c 4 FUNC GLOBAL DEFAULT 14 foo
19: 00000000004009e8 128 FUNC GLOBAL DEFAULT 14 __libc_csu_init
20: 00000000004008a0 0 FUNC GLOBAL DEFAULT 14 _start
21: 0000000000420020 0 NOTYPE WEAK DEFAULT 24 data_start
22: 0000000000400998 4 FUNC GLOBAL DEFAULT 14 baz
23: 0000000000420038 0 NOTYPE GLOBAL DEFAULT 25 __bss_end__
You will see all global symbols in .dynsym, not only the used ones.
There are also some interesting tests of how strip interacts with the -rdynamic flag in:
gcc debug symbols (-g flag) vs linker's -rdynamic option

Symbol not found when using template defined in a library

I'm trying to use the Adobe XMP library in an iOS application, but I'm getting link errors. I double-checked that the appropriate headers and libraries are on my path. I looked for the mangled names of the methods, but they aren't in the library (I checked using the nm command). What am I doing wrong?
Library Header:
#if defined ( TXMP_STRING_TYPE )
#include "TXMPMeta.hpp"
#include "TXMPIterator.hpp"
#include "TXMPUtils.hpp"
typedef class TXMPMeta <TXMP_STRING_TYPE> SXMPMeta; // For client convenience.
typedef class TXMPIterator <TXMP_STRING_TYPE> SXMPIterator;
typedef class TXMPUtils <TXMP_STRING_TYPE> SXMPUtils;
.mm file:
#include <string>
using namespace std;
#define IOS_ENV
#define TXMP_STRING_TYPE string
#import "XMP.hpp"
void DoStuff()
{
SXMPMeta meta;
string returnValue;
meta.SetProperty ( kXMP_NS_PDF, "test", "{ formId: {guid} }" );
meta.DumpObject(DumpToString, &returnValue);
}
Link Errors:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::DumpObject(int (*)(void*, char const*, unsigned int), void*) const", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::TXMPMeta()", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::SetProperty(char const*, char const*, char const*, unsigned int)", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::~TXMPMeta()", referenced from:
(null): Linker command failed with exit code 1 (use -v to see invocation)
Basically, what's happened is that you only have the declarations in the headers.
If I write
template<class T> T something(T); somewhere, that tells the compiler "trust me bro, it exists, leave it to the linker",
and the compiler adds a reference to the symbol to the object file as if it did exist. Because it can see the prototype, it knows the argument and return types and how much stack space is needed, so it sets things up so that the linker can come along later and fill in the address of the function.
BUT in your case there is no address. You MUST have the template definition (not just the declaration) visible in the same translation unit so that the compiler can instantiate it (with weak linkage). Here the instantiations are assumed to exist, but nowhere is this class actually stamped out from the template, so the linker doesn't find it, hence the error.
I'll flesh out my answer now; hope this helps.
Addendum 1:
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
This will compile but NOT link.
Output:
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
build/main.o: In function `main':
(my home)/src/main.cpp:13: undefined reference to `void output<int>(int&)'
collect2: error: ld returned 1 exit status
make: *** [a.out] Error 1
(I hijacked an open project for this, hence the names.)
As you can see, the compile command works fine (the one that ends in -o build/main.o) because we tell it "look, this function exists".
So in the object file it says to the linker (in a name-mangled form, to keep the template arguments) "put the location in memory of void output<int>(int&) here", and the linker can't find it.
Compiles and links
#include <iostream>
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
template<class T> void output(T& what) {
std::cout<<what<<"\n";
std::cout.flush();
}
Notice line 2: we tell the compiler "there exists a function template called output, parameterised on T, that returns nothing and takes a T reference". That means it can be used in the main function (remember that when the compiler is parsing main it hasn't seen the definition of output yet; it has just been told that it exists). Because the definition appears later in the same file, the compiler can still instantiate output<int> in this translation unit, and the linker then resolves the call. Modern compilers are much smarter than this simple model (because we have more RAM :) ) and restructure your code aggressively, and link-time optimisation does so even more, but this is how it used to work, and it is still a reasonable way to think about it.
Output:
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
As you can see it compiled fine and linked fine.
Multiple files without include as proof of this
main.cpp
#include <iostream>
int TrustMeCompilerIExist();
int main(int,char**) {
std::cout<<TrustMeCompilerIExist();
std::cout.flush();
return 0;
}
proof.cpp
int TrustMeCompilerIExist() {
return 5;
}
Compile and link
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/proof.cpp >> build/proof.o.d ; then rm build/proof.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/proof.cpp -o build/proof.o
g++ build/main.o build/proof.o -o a.out
(Outputs 5)
Remember #include LITERALLY dumps a file where it says "#include" (plus some directives that adjust line numbers); the result is called a translation unit. Rather than using a header file to contain "int TrustMeCompilerIExist();", which declares that the function exists (again, the compiler doesn't know where it is or what code is inside it, just that it exists), I repeated myself.
Let's look at proof.o
command
objdump proof.o -t
output
proof.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 proof.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text 0000000000000006 _Z21TrustMeCompilerIExistv
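Right at the bottom there is a function (the F flag) with global binding (the g flag) in the .text section; the 0000000000000006 is its size in bytes. Its name starts with _Z (this is why names beginning with an underscore followed by a capital letter are reserved for the implementation): _Z is the prefix of every mangled C++ name, 21 is the length of the identifier that follows, and the v after the name encodes the parameter list, here void (no parameters). The return type of an ordinary function is not part of the mangled name.
The zeros at the start, by the way, are the symbol's value, i.e. its offset within its section; the column is 16 hex digits wide because binaries can be HUGE.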
Disassembly
running:
objdump proof.o -S gives
proof.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z21TrustMeCompilerIExistv>:
int TrustMeCompilerIExist() {
return 5;
}
0: b8 05 00 00 00 mov $0x5,%eax
5: c3 retq
Because I compiled with -g, objdump -S can interleave the source code with the assembly it produced; that source wouldn't normally be there. It makes more sense with bigger functions, where each piece of source is followed by the instructions generated for it, up to the next piece of source.
main.o
Here's the symbol table, obtained the same way as the above:
objdump main.o -t
main.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 main.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.startup 0000000000000000 .text.startup
0000000000000030 l F .text.startup 0000000000000026 _GLOBAL__sub_I_main
0000000000000000 l O .bss 0000000000000001 _ZStL8__ioinit
0000000000000000 l d .init_array 0000000000000000 .init_array
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_loc 0000000000000000 .debug_loc
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_ranges 0000000000000000 .debug_ranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text.startup 0000000000000026 main
0000000000000000 *UND* 0000000000000000 _Z21TrustMeCompilerIExistv
0000000000000000 *UND* 0000000000000000 _ZSt4cout
0000000000000000 *UND* 0000000000000000 _ZNSolsEi
0000000000000000 *UND* 0000000000000000 _ZNSo5flushEv
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitC1Ev
0000000000000000 *UND* 0000000000000000 .hidden __dso_handle
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitD1Ev
0000000000000000 *UND* 0000000000000000 __cxa_atexit
See how it says *UND* (undefined)? That's because main.o doesn't know where the function is, only that it exists. The same goes for the standard library symbols, which the linker will find by itself.
In closing
USE HEADER GUARDS, and with templates put #include "file.cpp" at the bottom of the header, BEFORE the closing header guard. That way you can include header files as usual :)
The answer to your question is present in every sample that comes with the XMP SDK Toolkit. Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
For your ready reference, I am pasting below a more detailed explanation from the section "Template classes and accessing the API" of XMPProgrammersGuide.pdf, which comes with the XMP SDK Toolkit:
Template classes and accessing the API
The full client API is defined and documented in the TXMP*.hpp header files. The TXMP* classes are C++ template classes that must be instantiated with a string class such as std::string, which is used to return text strings for property values, serialized XMP, and so on. To allow your code to access the entire XMP API you must:
Provide a string class such as std::string to instantiate the template classes.
Provide access to XMPCore and XMPFiles by including the necessary defines and headers. To do this, add the necessary define and includes directives to your source code so that all necessary code is incorporated into the build:
#include <string>
#define XMP_INCLUDE_XMPFILES 1 //if using XMPFiles
#define TXMP_STRING_TYPE std::string
#include "XMP.hpp"
The SDK provides complete reference documentation for the template classes, but the templates must be instantiated for use. You can read the header files (TXMPMeta.hpp and so on) for information, but do not include them directly in your code. There is one overall header file, XMP.hpp, which is the only one that C++ clients should include using the #include directive. Read the instructions in this file for instantiating the template classes. When you have done this, the API is available through the concrete classes named SXMP*; that is, SXMPMeta, SXMPUtils, SXMPIterator, and SXMPFiles. This document refers to the SXMP* classes, which you can instantiate and which provide static functions.
Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
Read XMP_Const.h for detailed information about types and constants for namespace URIs and option flags.