How to see mangled name of C++ function in Mac Terminal - c++

I want to see the mangled names of the functions in this code. How should I do it?
I tried compiling the code with g++ and running ./a.out, but nothing is printed. I have read about dumpbin.exe on Windows, but I have no idea what the equivalent is on Mac.
nameMangling.cpp
// This demonstrates name mangling of functions based on their signatures.
int square(int x){
return x*x;
}
double square(double y){
return y*y;
}
void nothing1(int a, float b, char c, int &d){
}
void nothing2(char a, int b, float &c, double &d){
}
int main(){
return 0; // Indicate successful termination
}
The expected result is
__Z6squarei
__Z6squared
__Z8nothing1ifcRi
__Z8nothing2ciRfRd
_main
Any light on my problem will be appreciated. Thank you.

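Running ./a.out prints nothing because mangled names live in the binary's symbol table, not in the program's output; you need a tool that lists symbols.
If you don't have binutils installed, install that package.
This is probably a good place to start: Install binutils on Mac OSX
Then nm a.out should show you the mangled names and nm -C a.out should show you the demangled names.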
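If you would rather decode a mangled name inside a program than pipe it through nm -C, GCC and Clang expose the demangler through abi::__cxa_demangle in <cxxabi.h>. The sketch below is only an illustration: the helper name demangle is hypothetical, and it assumes an Itanium-ABI toolchain (GCC or Clang), so it will not build with MSVC.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if the demangler rejects it.
std::string demangle(const std::string& symbol) {
    // macOS prefixes every symbol with an extra underscore ("__Z6squarei"),
    // while __cxa_demangle expects the bare "_Z..." form.
    std::string name = symbol;
    if (name.compare(0, 3, "__Z") == 0) {
        name.erase(0, 1);
    }

    int status = 0;
    char* out = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
    if (status != 0) {
        return symbol;
    }
    assert(out != nullptr);  // a zero status means a buffer was returned

    std::string result(out);
    std::free(out);  // the buffer is allocated with malloc by __cxa_demangle
    return result;
}
```

Called with "__Z6squarei" from the expected output in the question, this should give back square(int).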

Related

dynamic library issue: dlsym() failing to find symbol

I've been following Apple's Dynamic Library Programming Topics
docs to create and use a runtime-loaded library using dlopen() / dlsym().
It seems I'm getting a failure to find the desired symbol on my Mid 2012 MacBook Air, running macOS Mojave.
Library Source Code
// adder.h
int add(int x);
and
// adder.cpp
#include "adder.h"
int add(int x) {
return (x + 1);
}
Compiled with clang -dynamiclib adder.cpp -o libAdd.A.dylib
Main Source
// main.cpp
#include <stdio.h>
#include <dlfcn.h>
#include <stdlib.h>
#include "adder.h"
int main() {
void* adder_handle = dlopen("libAdd.A.dylib", RTLD_LOCAL|RTLD_LAZY);
if (!adder_handle) {
printf("[%s] Unable to load library: %s\n\n", __FILE__, dlerror());
exit(EXIT_FAILURE);
}
while(true) {
void* voidptr = dlsym(adder_handle, "add");
int (*add)(int) = (int (*)(int))voidptr;
if (!add) {
printf("[%s] Unable to get symbol: %s\n\n", __FILE__, dlerror());
exit(EXIT_FAILURE);
}
printf("%d\n", add(0));
}
dlclose(adder_handle);
return 0;
}
Compiled with clang main.cpp -o main
I've also set the DYLD_LIBRARY_PATH environment variable to ensure the library can be found. Everything compiles ok.
Nevertheless, when I run the main executable, I get the error:
[main.cpp] Unable to get symbol: dlsym(0x7fb180500000, add): symbol not found
Running nm -gC libAdd.A.dylib outputs:
0000000000000fa0 T add(int)
U dyld_stub_binder
Any ideas on what could be wrong, or what I need to do to debug this issue?
Thanks!
C++ mangles the function name, which results in a different symbol name.
You can see these mangled symbol names using nm -g <yourlib.dylib>
You can change this behavior by wrapping the declaration in
extern "C" {
int add(int x);
}
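A common way to apply this is to guard the extern "C" block with __cplusplus so the same header also works for C callers. The following is a minimal sketch that puts the header and the source file in one listing for brevity; the check_add function is a hypothetical addition and not part of the original library.

```cpp
#include <cassert>

// adder.h (sketch): the guard lets the same header serve C and C++ callers.
#ifdef __cplusplus
extern "C" {
#endif

int add(int x);

#ifdef __cplusplus
}
#endif

// adder.cpp: the definition keeps the C linkage of the declaration above,
// so the library exports an unmangled symbol that dlsym(handle, "add") can find.
int add(int x) {
    return x + 1;
}

// Hypothetical caller-side sanity check, mirroring add(0) in main.cpp.
void check_add() {
    assert(add(0) == 1);
}
```

After rebuilding the library with a header like this, nm -g should list the plain C symbol for add instead of a mangled C++ name.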

Is the return type of a function part of the mangled name?

Suppose I have two functions with the same parameter types and name (not in the same program):
std::string foo(int x) {
return "hello";
}
int foo(int x) {
return x;
}
Will they have the same mangled name once compiled?
Is the return type part of the mangled name in C++?
As mangling schemes aren't standardised, there's no single answer to this question; the closest thing to an actual answer would be to look at mangled names generated by the most common mangling schemes. To my knowledge, those are the GCC and MSVC schemes, in alphabetical order, so...
GCC:
To test this, we can use a simple program.
#include <string>
#include <cstdlib>
std::string foo(int x) { return "hello"; }
//int foo(int x) { return x; }
int main() {
// Assuming executable file named "a.out".
system("nm a.out");
}
Compile and run with GCC or Clang, and it'll list the symbols it contains. Depending on which of the functions is uncommented, the results will be:
// GCC:
// ----
std::string foo(int x) { return "hello"; } // _Z3fooB5cxx11i
// foo[abi:cxx11](int)
int foo(int x) { return x; } // _Z3fooi
// foo(int)
// Clang:
// ------
std::string foo(int x) { return "hello"; } // _Z3fooi
// foo(int)
int foo(int x) { return x; } // _Z3fooi
// foo(int)
The GCC scheme contains relatively little information, not including return types:
Symbol type: _Z for "function".
Name: 3foo for ::foo.
Parameters: i for int.
Despite this, however, they are different when compiled with GCC (but not with Clang), because GCC indicates that the std::string version uses the cxx11 ABI.
Note that it does still keep track of the return type, and make sure signatures match; it just doesn't use the function's mangled name to do so.
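To make the three pieces listed above concrete, here is a small sketch that splits a simple GCC-style name into its identifier and raw parameter codes. The SimpleMangled struct and split_simple_mangled function are hypothetical names for this illustration, and the parser handles only the _Z<length><identifier><codes> shape; nested names, templates, and ABI tags such as B5cxx11 are out of scope.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct SimpleMangled {
    std::string name;    // e.g. "foo"
    std::string params;  // raw parameter codes, e.g. "i" for int
    bool ok = false;     // false when the input does not match the simple shape
};

// Splits names of the form _Z<length><identifier><param-codes>.
SimpleMangled split_simple_mangled(const std::string& symbol) {
    SimpleMangled result;
    if (symbol.compare(0, 2, "_Z") != 0) {
        return result;
    }

    // Read the decimal length that precedes the identifier.
    std::size_t pos = 2;
    std::size_t length = 0;
    while (pos < symbol.size() &&
           std::isdigit(static_cast<unsigned char>(symbol[pos]))) {
        length = length * 10 + static_cast<std::size_t>(symbol[pos] - '0');
        ++pos;
    }
    if (length == 0 || pos + length > symbol.size()) {
        return result;
    }

    result.name = symbol.substr(pos, length);
    assert(result.name.size() == length);
    result.params = symbol.substr(pos + length);  // no return type appears here
    result.ok = true;
    return result;
}
```

For _Z3fooi this yields the name foo and the parameter codes i, and nothing in the string records whether the function returns int or something else.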
MSVC:
To test this, we can use a simple program, as above.
#include <string>
#include <cstdlib>
std::string foo(int x) { return "hello"; }
//int foo(int x) { return x; }
int main() {
// Assuming object file named "a.obj".
// Pipe to file, because there are a lot of symbols when <string> is included.
system("dumpbin/symbols a.obj > a.txt");
}
Compile and run with Visual Studio, and a.txt will list the symbols it contains. Depending on which of the functions is uncommented, the results will be:
std::string foo(int x) { return "hello"; }
// ?foo@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H@Z
// class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl foo(int)
int foo(int x) { return x; }
// ?foo@@YAHH@Z
// int __cdecl foo(int)
The MSVC scheme contains the entire declaration, including things that weren't explicitly specified:
Name: foo@ for ::foo, followed by @ to terminate.
Symbol type: Everything after the name-terminating @.
Type and member status: Y for "non-member function".
Calling convention: A for __cdecl.
Return type:
H for int.
?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@ (followed by @ to terminate) for std::basic_string<char, std::char_traits<char>, std::allocator<char>> (std::string for short).
Parameter list: H for int (followed by @ to terminate).
Exception specifier: Z for throw(...); this one is omitted from demangled names unless it's something else, probably because MSVC just ignores it anyway.
This allows it to whine at you if declarations aren't identical across every compilation unit.
Generally, most compilers will use one of those schemes (or sometimes a variation thereof) when targeting *nix or Windows, respectively, but this isn't guaranteed. For example...
Clang, to my knowledge, will use the GCC scheme for *nix, or the MSVC scheme for Windows.
Intel C++ uses the GCC scheme for Linux and Mac, and the MSVC scheme (with a few minor variations) for Windows.
The Borland and Watcom compilers have their own schemes.
The Symantec and Digital Mars compilers generally use the MSVC scheme, with a few small changes.
Older versions of GCC, and a lot of UNIX tools, use a modified version of cfront's mangling scheme.
And so on...
Schemes used by other compilers are thanks to Agner Fog's PDF.
Note:
Examining the generated symbols, it becomes apparent that GCC's mangling scheme doesn't provide the same level of protection against Machiavelli as MSVC's. Consider the following:
// foo.cpp
#include <string>
// Simple wrapper class, to avoid encoding `cxx11 ABI` into the GCC name.
class MyString {
std::string data;
public:
MyString(const char* const d) : data(d) {}
operator std::string() { return data; }
};
// Evil.
MyString foo(int i) { return "hello"; }
// -----
// main.cpp
#include <iostream>
// Evil.
int foo(int);
int main() {
std::cout << foo(3) << '\n';
}
If we compile each source file separately, then attempt to link the object files together...
GCC: MyString, due to not being part of the cxx11 ABI, causes MyString foo(int) to be mangled as _Z3fooi, just like int foo(int). This allows the object files to be linked, and an executable is produced. Attempting to run it causes a segfault.
MSVC: The linker will look for ?foo@@YAHH@Z; as we instead supplied ?foo@@YA?AVMyString@@H@Z, linking will fail.
Considering this, a mangling scheme that includes the return type is safer, even though functions can't be overloaded solely on differences in return type.
No, and I expect that their mangled name will be the same with all modern compilers. More importantly, using them in the same program results in undefined behavior. Functions in C++ cannot differ only in their return type.

Linking on different version of shared libraries

I have two versions of a shared library:
library version 2:
simple.h
#pragma once
int first(int x);
simple.c
#include "simple.h"
#include <stdio.h>
__asm__(".symver first_1_0,first@LIBSIMPLE_1.0");
int first_1_0(int x)
{
printf("lib: %s\n", __FUNCTION__);
return x + 1;
}
__asm__(".symver first_2_0,first@@LIBSIMPLE_2.0");
int first_2_0(int x)
{
int y;
printf("lib: %d\n", y);
printf("lib: %s\n", __FUNCTION__);
return (x + 1) * 1000;
}
linker version script file:
LIBSIMPLE_1.0{
global:
first;
local:
*;
};
LIBSIMPLE_2.0{
global:
first;
local:
*;
};
gcc -Wall -g -O0 -fPIC -c simple.c
gcc -shared simple.o -Wl,--version-script,script -o libsimple.so.2.0.0
And library version 3:
simple.h
#pragma once
#ifdef SIMPLELIB_VERSION_3_0
int first(int x, int normfactor);
#else
int first(int x);
#endif //SIMPLELIB_VERSION_3_0
simple.c
#include "simple.h"
#include <stdio.h>
__asm__(".symver first_1_0,first@LIBSIMPLE_1.0");
int first_1_0(int x)
{
printf("lib: %s\n", __FUNCTION__);
return x + 1;
}
__asm__(".symver first_2_0,first@LIBSIMPLE_2.0");
int first_2_0(int x)
{
printf("lib: %s\n", __FUNCTION__);
return (x + 1) * 1000;
}
__asm__(".symver first_3_0,first@@LIBSIMPLE_3.0");
int first_3_0(int x, int normfactor)
{
printf("lib: %s\n", __FUNCTION__);
return (x + 1) * normfactor;
}
Linker version script file:
LIBSIMPLE_1.0{
global:
first; second;
local:
*;
};
LIBSIMPLE_2.0{
global:
first;
local:
*;
};
LIBSIMPLE_3.0{
global:
first;
local:
*;
};
gcc -Wall -g -O0 -fPIC -c simple.c
gcc -shared simple.o -Wl,--version-script,script -o libsimple.so.3.0.0
So I end up with two different libraries. Next I create a simple application that I eventually want to link to library version 3, so in it I use the function first() that takes two arguments:
main.c
#include <stdio.h>
#include "simple.h"
int main(int argc, char* argv[])
{
int nFirst = first(1, 10);
printf("First(1) = %d\n", nFirst);
}
I compile the app with the following command:
gcc -g -Wall -DSIMPLELIB_VERSION_3_0 -c main.c
And then, by accident, instead of linking to library version 3, I linked against library version 2. I expected linking to fail, but it went through, and the application worked.
gcc main.o -Wl,-L. -lsimple.2.0.0 -Wl,-R. -o demo
So my questions are:
Is it because the library exports a symbol named 'function', and the application tries to link to the same symbol name, that the linker didn't complain and just linked against library version 2?
I thought that since C++ mangles symbol names, this wouldn't happen, and the linker wouldn't link against library version 2. So I tried the same thing with g++ instead of gcc. Everything went well until I tried to link the application to the library, and I got an unresolved-symbols error. I cannot figure out why.
p.s. Sorry for a big amount of code. I was trying to make it clear.
Thanks
Is it because the library exports a symbol named 'function', and the application tries to link to the same symbol name, that the linker didn't complain and just linked against library version 2?
Yes. Since plain C does not have function overloading, there is no need for mangling, and as a consequence only the function name is used as the symbol for linking. In the end your application code wants to link with function, and your library code exports function, and this is enough to keep the linker happy (even though it is not valid from a binary interface perspective).
I thought that since C++ mangles symbol names, this wouldn't happen, and the linker wouldn't link against library version 2. So I tried the same thing with g++ instead of gcc. Everything went well until I tried to link the application to the library, and I got an unresolved-symbols error. I cannot figure out why.
Yes, this problem should not occur in C++ because of name mangling. However, this is true only if both your application code and library code are in C++, or if you bridge your C and C++ code the right way.
It is hard to say (without a full listing) what happened in your case when you used g++, but from the looks of it you ended up with application code in C++ and library code still in C. If that is the case, your application code will now want to link with the mangled function while your library code still exports the unmangled function.
To verify this you can inspect your object file with something like:
nm main.o
... and see exactly what kind of symbol it wants. If you get something like this:
...
U _Z8functionii
...
... instead of:
...
U function
...
... then that is the case.
To "fix" this and make your C++ application code link with unmangled function from library code you'll need to declare your function prototype as extern "C".

C++ name mangling in C

The C language does not use name mangling like C++ does. This can lead to subtle bugs when a function prototype is declared differently in different files. A simple example:
/* file1.c */
int test(int x, int y)
{
return y;
}
/* file2.c */
#include <stdio.h>
extern int test(int x);
int main()
{
int n = test(2);
printf("n = %d\n", n);
return 0;
}
When compiling such code with a C compiler (in my case gcc), no errors are reported. After switching to a C++ compiler, linking fails with the error "undefined reference to 'test(int)'". Unfortunately, in practice this is not so easy: there are cases where code is accepted by a C compiler (possibly with warnings) but fails to compile with a C++ compiler.
This is of course bad coding practice: all function prototypes should go in a .h file, which is then included wherever the function is implemented or used. Unfortunately, my app has many cases like this, and fixing all of them is not possible in the short term. Switching to g++ is also not an option; I hit compilation errors quite fast.
One possible solution would be to use C++ name mangling when compiling C code. Unfortunately, gcc does not seem to allow this; I did not find a command-line option for it. Do you know if this is possible (maybe with another compiler)? I also wonder whether some static analysis tools are able to catch this.
Using splint catches these kinds of errors.
foo.c:
int test(int x);
int main() {
test(0);
}
bar.c:
int test(int x, int y) {
return y;
}
Running splint:
$ splint -weak foo.c bar.c
Splint 3.1.2 --- 20 Feb 2009
bar.c:1:5: Function test redeclared with 2 args, previously declared with 1
Types are incompatible. (Use -type to inhibit warning)
foo.c:4:5: Previous declaration of test
Finished checking --- 1 code warning
~/dev/temp$ cat > a.c
int f(int x, int y) { return x + y; }
~/dev/temp$ cat > b.c
extern int f(int x); int g(int x) { return f(x + x); }
~/dev/temp$ splint *.c
Splint 3.1.2 --- 03 May 2009
b.c:1:12: Function f redeclared with 1 arg, previously declared with 2
Types are incompatible. (Use -type to inhibit warning)
a.c:1:5: Previous declaration of f
Finished checking --- 1 code warning
~/dev/temp$

G++ Undefined symbols for architecture x86_64

I am learning C++ and have been given an assignment to create a Vector3D class. When I try to compile main.cpp using G++ on OSX I get the following error message. Why would this be?
g++ main.cpp
Undefined symbols for architecture x86_64:
"Vector3DStack::Vector3DStack(double, double, double)", referenced from:
_main in cc9dsPbh.o
ld: symbol(s) not found for architecture x86_64
main.cpp
#include <iostream>
#include "Vector3DStack.h"
using namespace std;
int main() {
double x, y, z;
x = 1.0, y = 2.0, z = 3.0;
Vector3DStack v (x, y, z);
return 0;
}
Vector3DStack.h
class Vector3DStack {
public:
Vector3DStack (double, double, double);
double getX ();
double getY ();
double getZ ();
double getMagnitude();
protected:
double x, y, z;
};
Vector3DStack.cpp
#include <math.h>
#include "Vector3DStack.h"
Vector3DStack::Vector3DStack (double a, double b, double c) {
x = a;
y = b;
z = c;
}
double Vector3DStack::getX () {
return x;
}
double Vector3DStack::getY () {
return y;
}
double Vector3DStack::getZ () {
return z;
}
double Vector3DStack::getMagnitude () {
return sqrt (pow (x, 2) + pow (y, 2) + pow (z, 2));
}
You have to compile and link your Vector3DStack.cpp as well. Try:
g++ main.cpp Vector3DStack.cpp -o vectortest
This should create an executable called vectortest.
Pass the implementation of Vector3DStack to the compiler:
g++ main.cpp Vector3DStack.cpp
This will produce an executable called a.out on Linux and Unix systems. To change the executable name, use the -o option:
g++ -o my_program main.cpp Vector3DStack.cpp
This is the simplest possible way of building your program. You should learn a bit more: read about the make program, or even cmake.
I ran into a similar issue when writing my own implementation of a hash table with templates.
In your main.cpp, just include "Vector3DStack.cpp", which includes Vector3DStack.h, instead of just including Vector3DStack.h.
In my case, since templates are instantiated at compile time, the definitions of templated methods (including fully specialized ones) that live in the .cpp file need to be visible to the compiler wherever they are used. It's one of those C++ gotchas: so much to remember, easy to forget the little things.
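For the template case described here, a common alternative to including a .cpp file is to keep the template's member definitions in the header itself. The sketch below assumes a hypothetical SimpleStack class that is not from the question. Vector3DStack is not a template, so compiling both .cpp files, as the earlier answers suggest, remains the usual fix there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical header-only template (e.g. "SimpleStack.h"). The member
// definitions live with the class, so every translation unit that
// instantiates SimpleStack<T> can see them.
template <typename T>
class SimpleStack {
public:
    void push(const T& value) { items_.push_back(value); }

    T pop() {
        assert(!items_.empty());  // popping an empty stack is a caller error
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};
```

Any file that includes this header can create a SimpleStack<int> or SimpleStack<double> without a separate template .cpp file on the command line.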
Most likely you've already got your solution, thanks to the answers posted earlier, but here's my $0.02 anyway.
Happy C++ Programming!