How to force compiler to mangle C names to C++ names - c++

I have .obj with function that has everything it needs to be linked as C++ member function. Problem is, it's in C and thus the class using it expects something uglier than it's normal name. So I figure this can be done in only 2 ways: either mangle the name of the C function and/or add additional symbol to the symbol table that would have the mangled name anyway, but I could still use it's original name too.. so mangle the name basically. Any ideas how to do this or have some completely other way of solving this? Please share but do consider usefulness of saying extern "C" in this particular case :)) thx

Your (C-based) object file has a symbol and you cannot redefine that symbol to have a different name -- that would be a task for the compiler generating that object file. The C compiler doesn't know about C++ and it cannot be made to emit a symbol with C++ linkage and name mangling. So the only way to use that symbol (your C function) is to call it by the symbol it is know for.
You can, of course, use that function to implement a C++ (member) function (the additional level of indirection is optimised away if the call is inline) as in
extern "C" { int my_C_func(void*, int); } // could be in an included header
struct A {
// implement the followind member using the C function
int operator()(int i) { return my_C_func(this,i); }
};
If the C++ class is already declared and its declaration cannot be touched, then you can still implement the member function in the same way in a separate source file. However, this cannot be inline and hence comes at the cost of an additional function call:
in file.cpp:
extern "C" { int my_C_func(void*, int); } // could be in an included header
int A::operator()(int i) { return my_C_func(this,i); }
From your reply to me comment, I conclude that you actually have control of the implementation of the C function. So, why do you need to implement this in C? Why can't you simply implement it in C++? Then you will get the correct linkage and name mangling and you can directly implement the desired member function.

Related

symbol name in shared object differs from function in .cpp file

In a project environment, I wanted to change a source file for a shared object from c to cpp. I made sure to change its entry in the CMakeLists.txt, too:
add_library(*name* SHARED *mysource*.cpp)
target_link_libraries(*name as target* *item*)
The build process runs fine. Unfortunately, when I try to use it I get an error that the functions inside the .so can not be found.
After checking the dynamic symbol table inside the shared object with objdump -T, I found out that the names of the symbols differ from the ones in the source file.
e.g.
int sr_plugin_init_cb(sr_session_ctx_t *session, void **private_ctx);
becomes
_Z17sr_plugin_init_cbP16sr_session_ctx_sPPv
Inside my visual studio code it says that it can build the object and link the shared library correctly, and it also changed from C to CXX in the output and gives me no errors even though some code is c++ only.
Why do the symbol names change?
Why do the symbol names change?
C++ has a feature called function overload. Basically what happens is that you declare two function that are named the same, but slightly differ:
int sr_plugin_init_cb(sr_session_ctx_t *session, void **private_ctx);
int sr_plugin_init_cb(sr_session_ctx_t *session, void **private_ctx, int some_arg);
or a little worse case:
struct A {
# each of these functions can be different depending on the object
void func();
void func() const;
void func() volatile;
void func() volatile const;
};
Functions are named the same. Linker doesn't see the C++ source, but it still has to differentiate between the two functions to link with them. So C++ compiler "mangles" the function names, so that linker can differentiate between them. For big simplicity it could look like:
sr_plugin_init_cb_that_doesnt_take_int_arg
sr_plugin_init_cb_that_takes_int_arg
A_func
A_func_but_object_is_const
A_func_but_object_is_volatile
A_func_but_object_is_volatile_and_const
The rules of name mangling are complicated, to make the names as short as possible. They have to take into account any number of templates, arguments, objects, names, qualifers, lambdas, overloads, operators etc. and generate an unique name and they have to use only characters that are compatible with the linker on a specific architecture. For example here is a reference for name mangling used by gnu g++ compiler.
The symbol name _Z17sr_plugin_init_cbP16sr_session_ctx_sPPv is the mangled by your compiler name of your function.
Thank You very much for the detailled answer. I now understand the issue.
After a quick search I found a solution for my problem. Encapsulating the function prototypes like this avoids the name mangling.
extern "C" {
// Function prototypes
};

NASM call for external C++ function

I am trying to call external C++ function from NASM. As I was searching on google I did not find any related solution.
C++
void kernel_main()
{
char* vidmem = (char*)0xb8000;
/* And so on... */
}
NASM
;Some calls before
section .text
;nothing special here
global start
extern kernel_main ;our problem
After running compiling these two files I am getting this error: kernel.asm(.text+0xe): undefined reference to kernel_main'
What is wrong here? Thanks.
There is no standardized method of calling C++ functions from assembly, as of now. This is due to a feature called name-mangling. The C++ compiler toolchain does not emit symbols with the names exactly written in the code. Therefore, you don't know what the name will be for the symbol representing the function coded with the name kernel_main or kernelMain, whatever.
Why is name-mangling required?
You can declare multiple entities (classes, functions, methods, namespaces, etc.) with the same name in C++, but under different parent namespaces. This causes symbol conflicts if two entities with the name local name (e.g. local name of class SomeContainer in namespace SymbolDomain is SomeContainer but global name is SymbolDomain::SomeContainer, atleast to talk in this answer, okay) have the same symbol name.
Conflicts also occur with method overloading, therefore, the types of each argument are also emitted (in some form) for methods of classes. To cope with this, the C++ toolchain will somehow mangle the actual names in the ELF binary object.
So, can't I use the C++ mangled name in assembly?
Yes, this is one solution. You can use readelf -s fileName with the object-file for kernel_main. You'll have to search for a symbol having some similarity with kernel_main. Once you think you got it, then confirm that with echo _ZnSymbolName | c++filt which should output kernel_main.
You use this name in assembly instead of kernel_main.
The problem with this solution is that, if for some reason, you change the arguments, return value, or anything else (we don't know what affects name-mangling), your assembly code may break. Therefore, you have to be careful about this. On the other hand, this is not a good practice, as your going into non-standard stuff.
Note that name-mangling is not standardized, and varies from toolchain to toolchain. By depending on it, your sticking to the same compiler too.
Can't I do something standardized?
Yep. You could use a C function in C++ by declaring the function extern "C" like this
extern "C" void kernelMain(void);
This is the best solution in your case, as your kernel_main is already a C-style function with no parent class and namespace. Note that, the C function is written in C++ and still uses C++ features (internally).
Other solutions include using a macro indirection, where a C function calls the C++ function, if you really need to. Something like this -
///
/// Simple class containing a method to illustrate the concept of
/// indirection.
///
class SomeContainer
{
public:
int execute(int y)
{
}
}
#define _SepArg_ , // Comma macro, to pass into args, comma not used directly
///
/// Indirection for methods having return values and arguments (other than
/// this). For methods returning void or having no arguments, make something
/// similar).
///
#define _Generate_Indirection_RetEArgs(ret, name, ThisType, thisArg, eargs) \
extern "C" ret name ( ThisType thisArg, eargs ) \
{ \
return thisArg -> name ( eargs ); \
} \
_Generate_Indirection_RetEArgs(int, execute, SomeContainer, x, int y);

Why I need C++ linkage for a template?

Sometimes I try to follow the logic of some rules, sometimes the logic of why things are happening the way they do defeats any law that I know of.
Typically a template it's described as something that lives only during the compilation phase and it's exactly equivalent to hand-writing some function foo for any given type T .
So why this code doesn't compile ( I'm using C++11 with gcc and clang at the moment but I don't think it's that relevant in this case ) ?
#include <iostream>
#include <cstdint>
#include <cstdlib>
extern "C" {
template <typename T>
T foo(T t)
{
return t;
}
}
int main()
{
uint32_t a = 42;
std::cout << foo(a) << '\n';
return EXIT_SUCCESS;
}
And the thing that defeats all the logic is that the complain is about the linkage, and the implicit message is that this code doesn't generate a function, it generates something else that after compilation it's not suitable for a C style linkage.
What is the technical reason why this code doesn't compile ?
Let's look at this from a simple perspective. At the very least, using extern "C" will remove the C++ name mangling. So, we then have your template, and we'll instantiate it twice.
int foo(int val);
float foo(float val);
Under C's naming rules, these are required to have the same name foo from the perspective of the linker. If they have the same name though, we can't distinguish between them, and we'll have an error.
Under C++, the rules for how names are mangled is implementation defined. So C++ compilers will apply a name mangling to these two functions to differentiate them. Perhaps we'll call them foo_int and foo_float.
Because C++ can do this, we have no issues. But extern "C" requires the compiler to apply the C naming rules.
"linkage" is a slightly misleading term. The main thing that extern "C" changes is name mangling. That is, it produces symbol names in the object files that are compatible with the sort of symbols that equivalent C code would produce. That way it can link with C object code.... but it's a different sort of thing than specifying static or extern linkage.
But templates don't have a C equivalent, and name mangling is used to make sure that different instantiations of a given templated function result in different symbol names (so that the linker knows which one to use in a given place).
So there's no way to give templates C linkage; you're asking the compiler to do two fundamentally incompatible things.
Well other answers have explained why it doesn't work from the C++ side.
From the C side there are work-rounds but they are not portable.
You can simply not use the extern "C" keyword, create name-mangled functions and then in the C code link to the actual mangled names.
To make that easier you could also use GCC's abi::__cxa_demangle() function combined with a look-up table so you don't need to know what the mangled function names are (just their demangled signature).
But it all a bit of a bodge really.
Of course if you only call the template functions from C code, they'll never get instantiated to begin with. So you would need to make sure they get called in the C++ code to make sure they're present in the object file.

Duplicate symbol and functions with static keyword

I want to create a Commons.h file where I can put some shared info, constants, macros and helper functions.
This file has to be included in many part of my application.
If I create function with this syntax I get a Duplicate Symbol error:
int myFunction(int a){
//do something..
}
While if I add the static keyword I get no errors.
static int myFunction(int a){
//do something..
}
1) Is this a valid/right way to add helper functions to a project?
2) What happen exactly adding the static keyword at that definition?
Not really. You're creating a separate instance of the function
in every translation unit. What you should do is only declare
the function in the header:
extern int myFunction( int a );
and define it in a source file somewhere. (Note that the
extern above is optional, since it is implicit for all
function declarations, and it is usual to omit it. I add it
here only to stress the fact that you are declaring, and not
defining.)
If you use the keyword static before a function declaration, then you only can use this function inside the actual translation unit (.cpp, .c or .m), where it was defined.
So it is the opposite of the keyword extern, extern is the default storage class specifier for functions.
The use for a helper function is then wrong, because it doesn't even compile.
Instead you should declare the function in the helper file as extern. And use it without implementing it again. You can implement it once in the .c/.cpp/.m of the helper .h.
If you use a function as a helper function for other files, then it is good practice to use the extern keyword, even though it is not needed. It is a hint for programmers, this function is used somewhere else.

When is "extern C " necessary in c++ in windows?

As we know that we can use c functions directly in c++, when is extern "C" necessary then?
If your function is implemented in a .c file, the .cpp files will need the extern "C" reference, or else they'd reference a mangled C++-style function name, and the link would fail.
It's also handy for exporting functions from DLLs so that they are exported with a non-mangled name.
It's necessary when a C++ function must be called by C code rather than C++ code.
Basically, when you want your C++ library to be backwards compatible.
There are two rather different uses for extern "C". One is to define a function in C++ that you should be able to call from C. I.e., you're writing code in C++, but it needs to interface with C code. In this case, you define the function as extern "C":
extern "C" {
int c_callable_func1() {}
int c_callable_func2() {}
}
When you do this, the interface of those functions must follow pretty much the same rules as they would in C as well (e.g., you can't overload the functions or use default values for any parameters).
The other (considerably more common) situation is that you have code written in C that you want to be able to call from C++. In this case, the function definitions remain exactly as before, but the functions need to be declared/prototyped as extern "C". In a typical case, you want to use a single header that can be #included in either a C or C++ file, so the structure looks something like this:
// myheader.h
#ifndef MY_HEADER_H_INCLUDED_
#define MY_HEADER_H_INCLUDED_
#ifdef __cplusplus
extern "C" {
#endif
int func1(void);
void func2(int);
#ifdef __cplusplus
}
#endif
#endif
So, a C++ compiler will see function declarations (and typedefs, etc.) surrounded by an extern "C" block, while a C compiler will see prototypes, not surrounded by something it doesn't recognize.
In the first case (C++ functions callable from C), you'll normally structure the header roughly the same way, so you can also call those functions from C++ if necessary (but at the interface, you still lose all extra features of C++ like function overloading).
As you know that c++ support function overloading, which define the same function or method many times with different parameters. To do this, the compiler has to add some part of symbols for each one ... for example, the compiler will change the function name foo in the following declaration
from
void foo(int f,char c);
to
foo#i&c
Unfortunately, C doesn't support this. All function names remain the same after compiling it. So, to call a c++ function from c, you have to know the exact name after the modification and I think it's hard and different from a compiler to another.
to work around this and be able call c++ function from c and stop the compiler from changing the names you have to use this keyword like
extern "C" {
void foo(int f,char c);
}
that's it !!!
Because the function signatures generated by C and C++ compilers differ -- this sets up the C convention for C function even when using C++.