I have searched online a lot, but I couldn't find an example that works with g++; all the examples I found work only with gcc.
The error I keep getting is:
wrap_malloc.o: In function `__wrap_malloc(unsigned int)':
wrap_malloc.cc:(.text+0x20): undefined reference to `__real_malloc(unsigned int)'
wrap_malloc.o: In function `main':
wrap_malloc.cc:(.text+0x37): undefined reference to `__wrap_malloc'
collect2: ld returned 1 exit status
The code that creates this error is the following (this code works if I compile it with GCC and change the headers from cstdio to stdio.h):
#include <cstdio>
#include <cstdlib>
void *__real_malloc(size_t);
void *__wrap_malloc(size_t c) {
printf("My malloc called with %d\n", c);
return __real_malloc(c);
}
int main(void) {
void *ptr = malloc(12);
free(ptr);
return 0;
}
This is how I compile it:
wrap_malloc.o: wrap_malloc.cc
g++ -c wrap_malloc.cc -o wrap_malloc.o
wrap_malloc: wrap_malloc.o
g++ wrap_malloc.o -o wrap_malloc -Wl,--wrap,malloc
When you use a C++ compiler, function names are mangled unless they are declared with C linkage. What this means becomes clear when you run nm wrap_malloc.o, which should give you something like this:
00000000 b .bss
00000000 d .data
00000000 r .rdata
00000000 t .text
U __Z13__real_mallocj
00000000 T __Z13__wrap_mallocj
U _printf
This means that you use (U) a symbol called __Z13__real_mallocj and that you define a symbol in the text segment (T) called __Z13__wrap_mallocj. But you probably want a symbol called __real_malloc. To achieve this you have to tell the compiler that __real_malloc is a C-style function, like this:
extern "C" void *__real_malloc(size_t);
extern "C" void *__wrap_malloc(size_t c) {
    printf("My malloc called with %zu\n", c);  // %zu is the format for size_t
    return __real_malloc(c);
}
Now the output of nm is:
00000000 b .bss
00000000 d .data
00000000 r .rdata
00000000 t .text
U ___real_malloc
00000000 T ___wrap_malloc
U _printf
You can see that the name _printf hasn't changed. This is because in the header files, many functions are declared as extern "C" already.
Note: I did all of the above on Windows in the Cygwin environment. That is why the external symbols have an additional leading underscore.
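As a hedged illustration of the point about header files: C headers usually wrap their declarations in the guard shown below, which is why a name such as printf comes out unmangled even under a C++ compiler. The function demo_add is a made-up name for this sketch, not part of any real header.

```cpp
// Header part: the usual guard that makes a C header usable from C++.
#ifdef __cplusplus
extern "C" {
#endif

/* demo_add is a hypothetical function used only for this illustration. */
int demo_add(int a, int b);

#ifdef __cplusplus
}
#endif

// Source part: the definition repeats extern "C" so its symbol stays unmangled.
extern "C" int demo_add(int a, int b) {
    return a + b;
}
```

Running nm on the resulting object file should list demo_add under its plain name (possibly with a leading underscore, depending on the platform) rather than as a _Z... name.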
If this is the complete code, the problem you are having is that you haven't implemented __real_malloc()!
And by the way, identifiers with double-underscores are reserved by the language. You might want to think about picking different names.
I know how to use the inline keyword to avoid 'multiple definition' errors when using C++ templates. What I am curious about is how the linker tells the two cases apart: it reports a full specialization as an ODR violation, yet it handles an implicit specialization correctly.
From the nm output, we can see duplicated definitions in main.o and other.o for both the int version and the char version of max(), but the linker only reports a 'multiple definition' error for the char version and lets the int version link successfully. How does the linker differentiate them?
// tmplhdr.hpp
#include <iostream>
// this function is instantiated in main.o and other.o
// but leads no 'multiple definition' error by linker
template<typename T>
T max(T a, T b)
{
std::cout << "match generic\n";
return (b<a)?a:b;
}
// 'multiple definition' link error if without inline
template<>
inline char max(char a, char b)
{
std::cout << "match full specialization\n";
return (b<a)?a:b;
}
// main.cpp
#include "tmplhdr.hpp"
extern int mymax(int, int);
int main()
{
std::cout << max(1,2) << std::endl;
std::cout << mymax(10,20) << std::endl;
std::cout << max('a','b') << std::endl;
return 0;
}
// other.cpp
#include "tmplhdr.hpp"
int mymax(int a, int b)
{
return max(a, b);
}
Test output on Ubuntu is reasonable; but output on Cygwin is rather strange and confusing...
==== Test on Cygwin ====
g++ linker only reported 'char max(char, char)' is duplicated.
$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0):
multiple definition of `char max<char>(char, char)';
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
I dumped my .o object files but found few clues (maybe because I am not very familiar with the object format specification).
$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <<-- implicit specialization
U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)
==== Test on Ubuntu ====
This is what I got on Ubuntu with g++-9 after removing inline from tmplhdr.hpp:
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
'char-version max()' is marked with T, which is not allowed to have multiple definitions, but 'int-version max()' is marked with W, which allows multiple definitions. However, I am now curious why nm gives different marks on Cygwin than on Ubuntu, and why the linker on Cygwin can handle two T definitions correctly.
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
U mymax(int, int)
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)
However, I am now curious why nm gives different marks on Cygwin than on Ubuntu, and why the linker on Cygwin can handle two T definitions correctly.
You need to understand that the nm output does not give you the full picture.
nm is part of binutils, and uses libbfd. The way this works is that various object file formats are parsed into libbfd-internal representation, and then tools like nm print that internal representation in human-readable format.
Some things get "lost in translation". This is the reason you should almost never use e.g. objdump to look at ELF files (at least not at the symbol table of ELF files).
As you correctly deduced, the reason multiple max<int>() symbols are allowed on Linux is that the compiler emits them as a W (weakly defined) symbol.
The same is true for Windows, except that Windows uses the older COFF format, which doesn't have weak symbols. Instead, the symbol is emitted into a special .linkonce.$name section, and the linker knows that it can select any such section into the link, but should do so only once (i.e. it knows to discard all other duplicates of that section in any other object file).
I was going through the article - http://www.geeksforgeeks.org/extern-c-in-c/
There are two examples given:
int printf(const char *format,...);
int main()
{
printf("GeeksforGeeks");
return 0;
}
It says this won't compile because the compiler won't be able to find the mangled version of the 'printf' function. However, the code below produces output.
extern "C"
{
int printf(const char *format,...);
}
int main()
{
printf("GeeksforGeeks");
return 0;
}
This is because the extern "C" block prevents the name from being mangled. However, the code runs and gives output. Where does it get the definition of 'printf' from? I read a post which says 'stdio.h' is included by default. If this is true, the code below must run. However, it gives an error that printf is not defined.
int main()
{
printf("GeeksforGeeks");
return 0;
}
Can somebody explain this?
Your compiler is being helpful by treating printf specially as a built-in.
Sample code "tst.cpp":
int printf(char const *format,...);
int foo(int a, char const *b);
int main() {
printf("Hello, World!");
foo(42, static_cast<char const *>("Hello, World!"));
return 0;
}
When compiling with Microsoft's cl compiler command "cl /c tst.cpp" we can inspect the resulting .obj and find:
00000000 r $SG2552
00000010 r $SG2554
00000000 N .debug$S
00000000 i .drectve
00000000 r .rdata
00000000 t .text$mn
U ?foo@@YAHHPBD@Z
U ?printf@@YAHPBDZZ
00e1520d a @comp.id
80000191 a @feat.00
00000000 T _main
Note that both foo() and printf() are mangled.
But when we compile with /usr/lib/gcc/i686-pc-cygwin/3.4.4/cc1plus.exe via cygwin "g++ -c tst.cpp", we get:
00000000 b .bss
00000000 d .data
00000000 r .rdata
00000000 t .text
U __Z3fooiPKc
U ___main
U __alloca
00000000 T _main
U _printf
Here foo() is mangled and printf() is not, because the cygwin compiler is being helpful. Most would consider this a compiler defect. If the cygwin compiler is invoked with "g++ -fno-builtin -c tst.cpp" then the problem goes away and both symbols are mangled as they should be.
A more up-to-date g++ gets it right. Compiling with /usr/libexec/gcc/i686-redhat-linux/4.8.3/cc1plus via "g++ -c tst.cpp" we get:
00000000 T main
U _Z3fooiPKc
U _Z6printfPKcz
Both foo() and printf() are mangled.
But if we declare printf such that cygwin g++ does not recognize it:
char const * printf(char const *format,...);
int foo(int a, char const *b);
int main() {
printf("Hello, World!");
foo(42, static_cast<char const *>("Hello, World!"));
return 0;
}
Then both foo() and printf() are mangled:
00000000 b .bss
00000000 d .data
00000000 r .rdata
00000000 t .text
U __Z3fooiPKc
U __Z6printfPKcz
U ___main
U __alloca
00000000 T _main
Let's take a look at the relevant standard quotes:
17.6.2.3 Linkage [using.linkage]
2 Whether a name from the C standard library declared with external linkage has extern "C" or extern "C++" linkage is implementation-defined. It is recommended that an implementation use extern "C++" linkage for this purpose.
17.6.4.3 Reserved names [reserved.names]
2 If a program declares or defines a name in a context where it is reserved, other than as explicitly allowed by this Clause, its behavior is undefined.
17.6.4.3.3 External linkage [extern.names]
1 Each name declared as an object with external linkage in a header is reserved to the implementation to designate that library object with external linkage, both in namespace std and in the global namespace.
2 Each global function signature declared with external linkage in a header is reserved to the implementation to designate that function signature with external linkage.
3 Each name from the Standard C library declared with external linkage is reserved to the implementation for use as a name with extern "C" linkage, both in namespace std and in the global namespace.
4 Each function signature from the Standard C library declared with external linkage is reserved to the implementation for use as a function signature with both extern "C" and extern "C++" linkage, or as a name of namespace scope in the global namespace.
What we get from this is that the compiler may assume that printf in any of the given instances always refers to the standard-library-function printf, and thus can have any amount of info about them baked in. And if you get the declaration wrong, or indeed simply provide your own, it is free to do whatever it wants, including but not limited to magically correcting it.
Anyway, you cannot know which language-linkage it expects.
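A sketch of the practical consequence, assuming nothing beyond the standard headers: include <cstdio> and let the implementation supply the declaration, so the question of which language linkage it expects never comes up. The helper name greet is invented for this example, and std::snprintf is used instead of printf only so that the result can be inspected.

```cpp
#include <cstdio>
#include <string>

// greet is a hypothetical helper. The declaration of std::snprintf comes
// from <cstdio>, so no hand-written declaration (and no guess about
// extern "C" versus extern "C++") is needed.
std::string greet(const char* who) {
    char buffer[64];
    int written = std::snprintf(buffer, sizeof buffer, "Hello, %s!", who);
    if (written < 0) {
        return std::string();  // formatting error
    }
    return std::string(buffer);  // output longer than 63 characters is truncated
}
```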
Assume a simple file bla.cpp:
struct MyClass {
virtual int foo(int x);
virtual ~MyClass();
};
int MyClass::foo(int x) { return x + 23; }
MyClass::~MyClass() {}
Building it into a shared library with
g++ -c -fPIC bla.cpp
g++ -shared -o bla.so bla.o
usually produces a library that contains some type_info symbols, because RTTI is enabled by default in gcc. However, if I build with
g++ -c -fPIC -fno-rtti bla.cpp
the type_info will be missing.
Is there a simple, reliable way (on gcc or clang) to check whether a library has been built with -fno-rtti or -frtti? I ask because today I stared at the infamous undefined reference to type_info, and it took me a moment to understand that this was caused by a library I was linking against having been built with -fno-rtti.
If a class has virtual functions, it should have type info. Do nm -C libname.so and watch for "vtable for", "typeinfo for", and "typeinfo name for". Example:
00000000 b .bss
00000000 d .data
00000000 r .eh_frame
00000000 r .rdata$_ZTI3Foo
00000000 r .rdata$_ZTS3Foo
00000000 r .rdata$_ZTV3Foo
00000000 r .rdata$zzz
00000000 t .text
00000000 T Foo::foo()
00000000 R typeinfo for Foo
00000000 R typeinfo name for Foo
00000000 R vtable for Foo
U vtable for __cxxabiv1::__class_type_info
If you have vtable but not typeinfo, this is compiled with -fno-rtti. Example:
00000000 b .bss
00000000 d .data
00000000 r .eh_frame
00000000 r .rdata$_ZTV3Foo
00000000 r .rdata$zzz
00000000 t .text
00000000 T Foo::foo()
00000000 R vtable for Foo
If you don't have any virtual functions, you cannot tell (and should not care).
If you need this for configuration, do as GNU autoconf does: Write a minimal proggie that does the checking and build that one. Whether the build (or perhaps a run) fails or not tells you what you need to know.
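A minimal sketch of such a probe, under the assumption that MyClass is the class exported by the library. The definitions of MyClass are repeated here only so that the sketch links on its own; in a real probe they would be left out and the file linked against bla.so, so that a library built with -fno-rtti makes the link fail with the undefined reference to typeinfo. Probe and probe_has_typeinfo are names made up for this example.

```cpp
#include <typeinfo>

// Stand-in for the library's class. In a real probe only this declaration
// would be present (normally via the library's header).
struct MyClass {
    virtual int foo(int x);
    virtual ~MyClass();
};

// Stand-in definitions so the sketch links by itself; in a real probe these
// live inside the library being tested.
int MyClass::foo(int x) { return x + 23; }
MyClass::~MyClass() {}

// The typeinfo emitted for Probe refers to "typeinfo for MyClass", which is
// only emitted alongside MyClass::foo, i.e. in the library.
struct Probe : MyClass {};

bool probe_has_typeinfo() {
    Probe probe;
    MyClass& base = probe;
    return typeid(base) == typeid(Probe);
}
```

If the probe compiles and links against the library, the library carries type info for MyClass; if the link step fails on typeinfo for MyClass, it was most likely built with -fno-rtti.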
I have a C++ program that has an external dependency on an empty xlsx file. To remove this dependency, I converted the file to a binary object so that I can link it in directly, using:
ld -r -b binary -o template.o template.xlsx
followed by
objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents template.o template.o
Using objdump, I can see three variables declared :
$ objdump -x template.o
template.o: file format elf64-x86-64
template.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .rodata 00000fd1 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000fd1 g *ABS* 0000000000000000 _binary_template_xlsx_size
0000000000000000 g .rodata 0000000000000000 _binary_template_xlsx_start
0000000000000fd1 g .rodata 0000000000000000 _binary_template_xlsx_end
I then tell my program about this data :
template.h:
#ifndef TEMPLATE_H
#define TEMPLATE_H
#include <cstddef>
extern "C" {
extern const char _binary_template_xlsx_start[];
extern const char _binary_template_xlsx_end[];
extern const int _binary_template_xlsx_size;
}
#endif
This compiles and links fine (although I am having some trouble automating it with cmake; see "compile and add object file from binary with cmake").
However, when I use _binary_template_xlsx_size in my code, it is interpreted as a pointer to an address that doesn't exist. So to get the size of my data, I have to pass (int)&_binary_template_xlsx_size (or (int)(_binary_template_xlsx_end - _binary_template_xlsx_start)).
Some research tells me that the *ABS* in the objdump output above means "absolute value", but I don't understand why. How can I get my C++ (or C) program to see the variable as an int and not as a pointer?
An *ABS* symbol is an absolute address; it's more often created by passing --defsym foo=0x1234 to ld.
--defsym symbol=expression
Create a global symbol in the output file, containing the absolute
address given by expression. [...]
Because an absolute symbol is a constant, it's not possible to link it into a C source file as a variable; all C object variables have an address, but a constant doesn't.
To make sure you don't dereference the address (i.e. read the variable) by accident, it's best to define it as const char [] as you have with the other symbols:
extern const char _binary_template_xlsx_size[];
If you want to make sure you're using it as an int, you could use a macro:
#include <stdint.h> /* for intptr_t */
extern const char _abs_binary_template_xlsx_size[] asm("_binary_template_xlsx_size");
#define _binary_template_xlsx_size ((int) (intptr_t) _abs_binary_template_xlsx_size)
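For comparison, here is a sketch of the portable alternative mentioned in the question, which computes the size from the start and end symbols and never touches the absolute symbol. The helpers embedded_size and embedded_copy and the stand-in array kDemoBlob are invented for this example; kDemoBlob exists only so the sketch builds without template.o. In the real program the arguments would be _binary_template_xlsx_start and _binary_template_xlsx_end.

```cpp
#include <cstddef>
#include <string>

// Size of the embedded data, taken as the distance between the two
// linker-provided symbols (hypothetical helper).
inline std::size_t embedded_size(const char* start, const char* end) {
    return static_cast<std::size_t>(end - start);
}

// Copy of the embedded bytes; std::string may hold embedded NUL bytes
// (hypothetical helper).
inline std::string embedded_copy(const char* start, const char* end) {
    return std::string(start, end);
}

// Stand-in data for the sketch only; not the real template.xlsx contents.
static const char kDemoBlob[] = {'P', 'K', '\x03', '\x04'};
```

Because both symbols are ordinary addresses inside .rodata, this form needs no casts from pointer to integer.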
I am using g++ and getting linker errors. I have a simple program split into two modules: main.cpp, and Dice.h with Dice.cpp.
main.cpp:
#include <iostream>
#include "Dice.h"
int main(int argc, char **argv) {
int dieRoll = Dice::roll(6);
std::cout<<dieRoll<<std::endl;
std::cin.get();
return 0;
}
Dice.h:
#ifndef DieH
#define DieH
namespace Dice
{
int roll(unsigned int dieSize);
}
#endif
Dice.cpp:
#include <ctime>
#include <cstdlib>
#include "Dice.h"
namespace Dice
{
int roll(unsigned int dieSize)
{
if (dieSize == 0)
{
return 0;
}
srand((unsigned)time(0));
int random_int = 0;
random_int = rand()%dieSize+1;
return random_int;
}
}
I compile and link these files using g++ as follows:
g++ -o program main.cpp Dice.cpp
And I get the following linker error:
Undefined symbols:
"Dice::roll(int)", referenced from:
_main in ccYArhzP.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
I'm completely flummoxed. Any help would be greatly appreciated.
Your code is well-formed.
Ensure that you don't have conflicting file names, and that the files exist and contain what you think they do. For example, perhaps you have a Dice.cpp that's empty, and you're editing a newly created one somewhere else.
Minimize possible discrepancy by removing unnecessary files; only have main.cpp, dice.h, and dice.cpp.
Your errors do not match your code: "Dice::roll(int)". Observe that this is looking for an int, but your functions take an unsigned int. Make sure your header matches.
Try the following:
g++ main.cpp -c
This will generate main.o, the compiled but not-linked code for main. Do the same with dice.cpp:
g++ dice.cpp -c
You now have two object files that need to be linked together. Do so with:
g++ main.o dice.o
And see if that works. If not, do the following:
nm main.o dice.o
This will list all the available symbols in an object, and should give you something like this:
main.o:
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 r .eh_frame
00000000 t .text
00000098 t __GLOBAL__I_main
00000069 t __Z41__static_initialization_and_destruction_0ii
U __ZN4Dice4rollEj
U __ZNSi3getEv
U __ZNSolsEPFRSoS_E
U __ZNSolsEi
U __ZNSt8ios_base4InitC1Ev
U __ZNSt8ios_base4InitD1Ev
U __ZSt3cin
U __ZSt4cout
U __ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
00000000 b __ZStL8__ioinit
U ___gxx_personality_v0
U ___main
00000055 t ___tcf_0
U _atexit
00000000 T _main
dice.o:
00000000 b .bss
00000000 d .data
00000000 t .text
00000000 T __ZN4Dice4rollEj
U _rand
U _srand
U _time
C++ mangles function names, which is why everything looks so weird. (Note, there is no standard way of mangling names, this is how GCC 4.4 does it).
Observe that dice.o and main.o refer to the same symbol: __ZN4Dice4rollEj. If these do not match, that's your problem. For example, if I change part of dice.cpp to be this:
// Note, it is an int, not unsigned int
int roll(int dieSize)
Then nm main.o dice.o produces the following:
main.o:
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 r .eh_frame
00000000 t .text
00000098 t __GLOBAL__I_main
00000069 t __Z41__static_initialization_and_destruction_0ii
U __ZN4Dice4rollEj
U __ZNSi3getEv
U __ZNSolsEPFRSoS_E
U __ZNSolsEi
U __ZNSt8ios_base4InitC1Ev
U __ZNSt8ios_base4InitD1Ev
U __ZSt3cin
U __ZSt4cout
U __ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
00000000 b __ZStL8__ioinit
U ___gxx_personality_v0
U ___main
00000055 t ___tcf_0
U _atexit
00000000 T _main
dice.o:
00000000 b .bss
00000000 d .data
00000000 t .text
00000000 T __ZN4Dice4rollEi
U _rand
U _srand
U _time
Note, this gives two different symbols. main.o looking for this: __ZN4Dice4rollEj and dice.o containing this __ZN4Dice4rollEi. (The last letter differs).
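If reading mangled names by eye is awkward, they can be decoded programmatically. The following is a sketch that assumes GCC or Clang with the Itanium C++ ABI, where abi::__cxa_demangle from <cxxabi.h> is available; demangle and strip_extra_underscore are helper names invented here. The extra leading underscore seen in the nm listings above is stripped first, since it is a platform prefix and not part of the mangled name.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Turns "__Z..." (as printed by nm on some platforms) into "_Z...".
std::string strip_extra_underscore(const std::string& symbol) {
    if (symbol.compare(0, 3, "__Z") == 0) {
        return symbol.substr(1);
    }
    return symbol;
}

// Returns the readable form of an Itanium-mangled name, or the input
// unchanged if it is not a mangled name or cannot be decoded.
std::string demangle(const std::string& symbol) {
    if (symbol.compare(0, 2, "_Z") != 0) {
        return symbol;
    }
    int status = 0;
    char* text = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);  // the buffer is allocated with malloc
    return result;
}
```

Applied to the two symbols above, this makes the difference visible as unsigned int versus int; the c++filt tool does the same job on the command line.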
When trying to link these mismatched symbols (with g++ main.o dice.o), I get:
undefined reference to `Dice::roll(unsigned int)'
Your problem is that you're calling Dice::roll(int) and you wrote a Dice::roll(unsigned int) function. The function it's looking for doesn't actually exist (internally, argument types of a function matter just as much as its name; this is how overloading works). Try passing it an unsigned int (6u for example) and see if that works. Alternately, your header file may not match in both files, and main.cpp thinks your function takes a (signed) int.
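To make the remark about argument types concrete, here is a small sketch with invented names (DemoOverloads, DemoSingle, roll_kind). It shows that int and unsigned int select different overloads when both exist, and also that a call with a plain 6 converts implicitly when only the unsigned int version is declared, which suggests that a mismatched declaration is the more likely cause of an error like the one in the question.

```cpp
#include <string>

namespace DemoOverloads {
// Two distinct functions: the parameter type is part of the signature.
std::string roll_kind(int) { return "int"; }
std::string roll_kind(unsigned int) { return "unsigned int"; }
}  // namespace DemoOverloads

namespace DemoSingle {
// Only one candidate: an int argument is converted to unsigned int.
std::string roll_kind(unsigned int) { return "unsigned int"; }
}  // namespace DemoSingle
```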
I know you don't want to hear anything like it, but "It works for me!".
Your errors look suspiciously old. What version of GCC are you using? This looks like GCC 2.x!
Also, try declaring dieRoll in main.cpp as unsigned; that might resolve the namespace problem on this old compiler.