I hope to call LoadLibrary on an unmanaged C++ DLL from managed code, and then call GetProcAddress on exported functions whose names have been mangled. My question is: are the mangled names you get from a C++ compiler deterministic? That is, will a name always be converted to the same mangled name if the original's signature hasn't changed?
It isn't specified by the standard, and has certainly changed between versions of the same compiler in my experience, though it has to be deterministic over some fixed set of circumstances, because otherwise there would be no way to link two separately compiled modules.
If you're using GetProcAddress, it would be far cleaner to export the functions as extern "C" so their names are not mangled.
It's compiler specific, as others have said. However, you can find details in a document by Agner Fog...
http://www.agner.org/optimize/#manuals
See item 5 on that page.
Also, these days, there are libraries which can handle mangling and demangling for common compilers for you. For Visual C++, the starting point would be the dbghelp and imagehlp libraries.
http://msdn.microsoft.com/en-us/library/ms679292%28v=VS.85%29.aspx
http://msdn.microsoft.com/en-us/library/ms680321%28v=VS.85%29.aspx
Name mangling is handled differently by every compiler; there is no standard for it. If you use pure C functions in your C++ code, you can declare them extern "C" to suppress name mangling for those functions, so the linker is able to find them.
Related
Is there any way to explicitly do name mangling (also called name decoration) in a library written in C (or C++)? I want all the symbols of my shared library to have their names mangled.
Consider this question:
Two library of different versions in an application
In that situation, if I can explicitly have all their names mangled, I think I can resolve that issue. Maybe there is some option in the gcc compiler itself to do this.
Your question is:
Is there any way to explicitly do name mangling (also called name decoration) in a library written in C (or C++)? I want all the symbols of my shared library to have their names mangled.
However, I suspect that you're using the term name mangling inappropriately. Name mangling has nothing to do with library release version. If you mean to version each object exported in your library, then there are plenty of questions to answer that. Personally, I would use a versioned namespace -- but only because I haven't (yet) been bitten by it. Here's a quick example:
namespace mylibrary {
namespace v1 {
class foo {};
}
using foo = v1::foo;
}
mylibrary::foo f; // mylibrary::v1::foo
...then on a later release...
namespace mylibrary {
namespace v1 {
class foo {};
}
namespace v2 {
class foo {};
}
using foo = v2::foo;
}
mylibrary::foo newer_f; // mylibrary::v2::foo
mylibrary::v1::foo older_f;
There are of course many permutations you could have. And there are a lot of caveats, especially if you have templated code or make use of ADL. If you release version 1 of the library with one definition of class foo but then version 2 has a different definition, then the two libraries will not be compatible! That's rather the whole point though.
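As a variation on the example above, C++11 inline namespaces give a similar effect without the explicit using declarations. This is only a sketch: api_version is a hypothetical function introduced for illustration, and the claim that the namespace name ends up in the mangled symbol holds for the common compilers I know of rather than being guaranteed by the standard.

```cpp
#include <cassert>

namespace mylibrary {
namespace v1 {
// Older API, kept so that code built against release 1 can still name it.
inline int api_version() { return 1; }
}  // namespace v1

inline namespace v2 {
// Current API. Members of an inline namespace are also found by
// qualified lookup in the enclosing namespace, i.e. mylibrary::api_version.
inline int api_version() { return 2; }
}  // namespace v2
}  // namespace mylibrary

// Usage: plain mylibrary:: lookups pick v2, older callers name v1 explicitly.
inline void check_versioned_namespace() {
    assert(mylibrary::api_version() == 2);
    assert(mylibrary::v1::api_version() == 1);
    assert(mylibrary::v2::api_version() == 2);
}
```

Because v1 and v2 are distinct namespaces, the two functions are distinct entities with distinct symbols, which is what lets both versions coexist in one program.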
If however I am incorrect and you truly do want to enforce C++ name mangling in your C++ library (which is odd, because it should be done by default), then the answer is twofold. First, take a look at some related questions:
Why would you use 'extern "C++"'?
"Undefined reference to" error while linking object files
"Undefined reference to" error when linking static C library with C++ code
The reading is related but not causal. The related questions are answering your question in reverse.
Many operating systems are written in C and that is typically why you'd see extern "C" when including system headers. It's also why you sometimes see the linker complaining about missing functions when you try to use things declared in a header whose library was compiled with C instead of C++.
So to go in the other direction (in your direction): in your header file, you can declare your exports to be extern "C++". That tells the compiler to specifically use mangled names when importing or exporting the object.
Using extern "C++" won't by itself be your magic trick. There are some GCC options which control some of the more specific functionality about name mangling. So, secondly, take a look at those. The (external link) to the GCC manual page is here: https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html
Any option which mentions ABI, such as -fabi-version, might impact you. The "-fabi" family of flags relates to the "Application Binary Interface". You might want to learn more about these terms too. What is an application binary interface has some excellent answers describing what an ABI is and how you can start to reason about them. "-Wabi" will tell GCC to emit warnings when it detects potential ABI conflicts. But, like all things C++, it's not foolproof. I would not be surprised if there are name mangling issues which might not be detected by it. That's particularly true if you ever mix heterogeneous compiler vendors or versions.
Importantly: mixing ABIs is likely going to be a big headache. I'd be very concerned about ABI incompatibilities being forced together and causing very difficult-to-debug undefined behaviors!
Is it possible to call a function in a C++ DLL from C code?
The function is not declared extern "C".
An ugly platform dependent hack that only works with Visual Studio is fine.
Calling conventions should not be a major issue, but how do I deal with name mangling?
For instance with Visual Studio, a C++ function with signature void f() has the mangled name ?f@@YAXXZ and that is not a legal C identifier.
(You don't need to tell me that I should declare the C++ function as extern "C".
I already know that. But I'm in a situation where I cannot change the C++ code.)
Wrap the offending function in another C++ function, and declare the wrapper extern "C". There is no need to create a special DLL for it; just include one C++ file in your project.
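A minimal sketch of such a wrapper file might look like the following. The names f_wrapper and call_count are hypothetical, and f is defined here only as a stand-in so that the example is self-contained; in a real project f would come from the DLL's header and import library.

```cpp
#include <cassert>

// Stand-in for the function exported by the third-party C++ DLL.
// (Assumption: in real code only its declaration would appear here.)
static int call_count = 0;
void f() { ++call_count; }

// Thin wrapper with C linkage: its name is not C++-mangled, so C code can
// declare it as `void f_wrapper(void);` and call it directly.
extern "C" void f_wrapper() { f(); }

// Usage: calling the wrapper forwards to the C++ function exactly once.
inline void check_wrapper() {
    const int before = call_count;
    f_wrapper();
    assert(call_count == before + 1);
}
```

If the wrapped function can throw, it is probably wise to catch exceptions inside the wrapper, since C callers have no way to handle them.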
Getting your toolchain to link statically against a function with a different exported name may be tricky. But you can always load the DLL with LoadLibrary and then use GetProcAddress.
You could investigate
HMODULE dll = LoadLibrary("path to dll");
to load the DLL and
GetProcAddress(dll, "?f@@YAXXZ");
to grab a function pointer to the externally declared function.
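Put together, the two calls could be wrapped as in the sketch below. It is written in C++, but the same Win32 calls are available from C. The helper name load_decorated_f and the VoidFn alias are assumptions made for illustration, and the non-Windows branch exists only so the file compiles elsewhere.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

using VoidFn = void (*)();

// Returns a pointer to the C++ function `void f()` exported from the DLL at
// dll_path under its MSVC-decorated name, or nullptr if anything fails.
VoidFn load_decorated_f(const char* dll_path) {
#ifdef _WIN32
    HMODULE module = LoadLibraryA(dll_path);
    if (module == nullptr) return nullptr;
    // GetProcAddress takes the module handle first, then the exported name.
    FARPROC proc = GetProcAddress(module, "?f@@YAXXZ");
    return reinterpret_cast<VoidFn>(proc);
#else
    (void)dll_path;  // Win32-only API; report failure on other platforms.
    return nullptr;
#endif
}

// Usage: a DLL that cannot be loaded yields nullptr rather than a crash.
inline void check_missing_dll() {
    assert(load_decorated_f("no_such_library.dll") == nullptr);
}
```

The decorated name is only valid for the compiler that produced the DLL, so the string has to be read from the DLL's export table rather than guessed.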
I do not see any clean solution besides creating an additional dll written in C++ and exposing all interfaces via extern "C".
You could compile your C code using the same C++ compiler they used, then your C functions will be mangled using the same mechanism and everything will link seamlessly, and no-one will notice any difference.
If you must use a different compiler, then you'll have to manually load the dll using LoadLibrary and each function using GetProcAddress.
I know there are differences in the source code between C and C++ programs - this is not what I'm asking about.
I also know this will vary from CPU to CPU and OS to OS, depending on compiler.
I'm teaching myself C++ and I've seen numerous references to libraries that can be used by both languages. This has started me thinking - are there significant differences between the binary executables of the two languages?
For libraries to be easily used by both, I would think they'd have to be similar on an executable level.
Are there many situations where a person could examine a executable file and tell whether it was created by C or C++ source code? Or would the binaries be pretty similar?
In most cases, yes, it's pretty easy. Here are just a few clues that I've seen often enough to remember them easily:
C++ program will typically end up with at least a few visible symbols that have been mangled.
A C++ program will typically have at least a few calls to virtual functions, which look quite distinct from the code you'll usually see in C.
Many C++ compilers implement a calling convention for C++ that gives special consideration to passing the this pointer into C++ member functions. Again, since the this pointer simply doesn't exist in C, you'll rarely see a direct analog (though in some cases, they will use the same convention to pass some other pointer, so you need to be careful about this one).
An executable is an executable is an executable, no matter what language it's written in. If it's built for the target architecture, it'll run on the architecture.
The (arguably) most important difference between C and C++-compiled code, and the one relevant to libraries that can be linked against both C and C++ executables, is that of name mangling. Basically: when a library is compiled, it exports a set of symbols (function names, exported variables, etc.) that executables linked against the library can use. How these symbols are named is fairly compiler/linker-specific, and if the subsequent executable is linked by a linker that uses an incompatible convention, the symbols won't resolve correctly. In addition, C and C++ have slightly different conventions. The Wikipedia article linked above has more of the details; suffice to say, when declaring exported symbols in a header file, you'll usually see a construction like:
#ifdef __cplusplus
extern "C" {
#endif
/* exported declarations here */
#ifdef __cplusplus
}
#endif
__cplusplus is a preprocessor macro that is only defined when compiling C++ code. The idea here is that, when the header is used from C++, the compiler is instructed to use the C way of naming exported symbols (inside the extern "C" { /* foo */ } block), so the library can be linked correctly from both C and C++.
I think I could tell if something is C++ or C from reading the disassembled binary code [for processor architectures that I'm familiar with, x86, x86_64 and ARM]. But in reality, there isn't much difference, you'd have to look pretty hard to know for sure.
Signs to look for are "indirect calls" (function pointer calls via a table) and this-pointers. Although C can have pointer to struct arguments and will often use function pointers, it's not usually set up in the way that C++ does it. Also, you'll notice, sometimes, that the compiler takes a pointer to a struct and adds a small offset - that's removing the outer layer of an inherited class. This CAN happen in C as well, but it won't be as common/distinctive.
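The "small offset" can be observed from source code as well. The sketch below uses hypothetical types (Shape, Logger, Widget) and relies on the layout that common compilers choose for multiple inheritance; the standard does not mandate the exact offsets.

```cpp
#include <cassert>

struct Shape  { int id = 0; };
struct Logger { int level = 0; };
struct Widget : Shape, Logger { int extra = 0; };

// Converting a Widget* to its second base makes the compiler add the offset
// of the Logger subobject, so the two addresses differ on common ABIs.
inline bool second_base_is_offset(Widget& w) {
    Logger* as_logger = &w;  // implicit derived-to-base conversion
    return static_cast<void*>(as_logger) != static_cast<void*>(&w);
}

// Usage: the adjustment is applied going down and undone coming back up.
inline void check_base_adjustment() {
    Widget w;
    assert(second_base_is_offset(w));
    Logger* as_logger = &w;
    assert(static_cast<Widget*>(as_logger) == &w);
}
```

In a disassembly, each such conversion typically shows up as a small constant being added to or subtracted from a pointer register.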
Looking just at the binary (unless you can "do disassembly in your head") would be a lot harder, especially if it's been stripped of symbols. That's like the guy who could tell you which piece of classical music was on an old vinyl record from looking at the tracks (with the label hidden): not something most people can do, even if they are "good".
In practice, a C program (or a C++ program) is rarely only pure standard C (or C++); for instance, the C99 standard has no means to scan a directory. So programs use additional libraries.
On Linux, most binaries are dynamically linked. Use the ldd command to find out.
If the binary is linked to the libstdc++ library, the source code is likely C++.
If only the libc.so library is linked, the source code is probably only C (but libstdc++.a could have been linked statically).
You can also use tools working on binary files (e.g. objdump, readelf, strings, nm on Linux ....) to find more about them.
The code generated by C and C++ compilers is generally the same code. There are two important differences:
Name mangling: Each function and global variable becomes a symbol at compile time. In C these symbols' names are the same as their names in your source code. In C++ they are mangled a bit to allow for features such as function overloading.
Calling conventions: If you call a method in C++ the this-pointer is passed as a hidden first parameter. Other conventions might also be different such as call by reference which does not exist in C
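To make the hidden parameter concrete, here is a sketch that puts a member function next to a hand-written C-style equivalent. Counter, CCounter and ccounter_add are hypothetical names, and whether the two compile to identical machine code depends on the platform's calling convention.

```cpp
#include <cassert>

// C++ style: `this` is passed implicitly when add() is called.
struct Counter {
    int total = 0;
    int add(int n) { total += n; return total; }
};

// C style: the object pointer is an ordinary, explicit first parameter.
struct CCounter { int total; };
int ccounter_add(CCounter* self, int n) {
    self->total += n;
    return self->total;
}

// Usage: both versions perform the same work on their respective objects.
inline void check_counters() {
    Counter a;
    CCounter b{0};
    assert(a.add(2) == ccounter_add(&b, 2));
    assert(a.add(3) == ccounter_add(&b, 3));
}
```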
You can use a block such as this to let the C++ compiler generate code compatible with C:
extern "C" {
/* code */
}
I would like to be able to use C/C++ functions from python using ctypes python module.
I have a function int doit() in the .c / .cpp file. When I try to load the shared library:
Frr=CDLL("/path/FoCpy2/libFrr.so")
Frr.doit(c_int(5))
I find it works really well when the .c variant is used. When the C++ variant is used, the way to call this function is as follows (found using nm libFrr.so; nm -gC libFrr.so produces just plain doit()):
Frr._Z4doitv(c_int(5))
I have read on Stackexchange that there is no standard way to call C++ from python, and there is "nonstandard name mangling" issue. Is the "Z4" a part of that issue? I guess the nonstandard name mangling would appear for more advanced language features such as class methods, templates, but also for such basic functions? Is it possible to force using simple C function names in simple cases for the C++ code?
You can use extern "C" to make the functions "look like" C functions to the outside world (i.e., disable name mangling). And yes, you are correct: name mangling is needed mostly for the more complicated features and types of functions that C++ has. The name-mangling scheme has never been standardized (nor has binary compatibility), so it varies from compiler to compiler and between versions. Most mainstream compilers have settled on something permanent by now, but the schemes still differ between compiler vendors. The reason that mangling is also required for plain old free functions is that C++ supports overloading (same function name but different parameters), and thus the compilers encode the parameter specification (e.g., types) into the mangled names. Of course, if you use extern "C" you lose all features for which name mangling is needed, so it more or less boils down to C functions only.
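With GCC or Clang you can check how parameter types are encoded by demangling names in code. The sketch below uses abi::__cxa_demangle from <cxxabi.h>, which covers the Itanium C++ ABI scheme used on Linux (not Visual C++ names). The helper name demangle is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI mangled name, or the input
// unchanged if the demangler reports a failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return mangled;
    std::string result(text);
    std::free(text);  // the buffer was allocated with malloc
    return result;
}

// Usage: overloads of doit get different symbols because the parameter
// list is part of the mangled name.
inline void check_demangle() {
    assert(demangle("_Z4doitv") == "doit()");
    assert(demangle("_Z4doiti") == "doit(int)");
}
```

Here the 4 is the length of the name doit, and the trailing v or i encodes a void or int parameter list.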
You can use extern "C" either on a per-function basis, like so:
extern "C" int doit();
Or for the overall header:
extern "C" {
// all the function declarations here ...
}
However, for Python specifically, I highly recommend that you use a library that allows you to construct Python classes and functions that are a reflection of your C++ classes and functions, that makes life a lot easier and hides away all this extern "C" business. I recommend using Boost.Python, see this getting started page, it makes exporting functions and classes to Python a breeze. I guess others would also recommend SWIG, but I have never used it.
Calling c++ library functions is always a mess, actually even if you're using C++ you have to use the same compiler, etc. to make sure it works.
The only general solution is to define your c++ functions as extern "C" and make sure you follow the involved limitations - see here for an explanation of it.
Is it possible to export a function from a C++Builder DLL with a specific mangled name?
I am trying to build a C++Builder DLL to replace an existing VC++ DLL. The problem is that the application that uses the DLL expects one of the functions to have a specific mangled name.
That is, it expects the function to be called:
"?_FUNCTIONNAME_@@YAHPAU_PARAM1_@@PAU_PARAM2_@@@Z"
Of course I can export function from my DLL with a mangled name but CodeGear insists on using its own name mangling scheme.
Is it possible to force C++Builder to:
Use a specific mangled name for a function?
or
Use VC++ name mangling for a specific function?
Note that I only want to change the mangling for a specific function, not all the functions in the DLL.
Name mangling is just one of many aspects of the ABI, and in fact it would be the easiest one to standardize. Compilers purposefully use different name mangling schemes when they use different ABIs, in order to prevent linking incompatible objects.
One thing you may try for a given function is to mark it extern "C", then it will use the calling convention of C and the same mangling (commonly no mangling at all or just an initial _). Obviously that won't take care of other issues (handling of exceptions, precise content of vtbls, which registers are used for passing parameters, the precise definition of the standard library, ...)
Use a .def file to specify your own naming for the exported function.