How to manually mangle names in Visual C++? - c++

If I have a function in a .c like
void foo(int c, char v);
...in my .obj, this becomes a symbol named
_foo
...as per C name mangling rules. If I have a similar function in a .cpp file, this becomes something else entirely, as per the compiler-specific name mangling rules. msvc 12 will give us this:
?foo@@YAXHD@Z
If I have that function foo in the .cpp file and I want it to use C name mangling rules (assuming I can do without overloading), we can declare it as
extern "C" void foo(int c, char v);
...in which case, we're back to good old
_foo
...in the .obj symbol table.
My question is: is it possible to go the other way around? If I wanted to simulate C++ name mangling with a C function, this would be easy with gcc, because gcc's name mangling rules only use identifier-friendly characters. The mangled name of foo becomes _Z3fooic, and we could easily write
void Z3fooic(int c, char v);
Back in Microsoft-compiler-land, I obviously can't create a function whose name is a completely invalid identifier called
void ?foo@@YAXHD@Z(int c, char v);
...but I'd still like that function to show up with that symbol name in the .obj symbol table.
Any ideas? I've looked through Visual C++'s supported pragmas, and I don't see anything useful.

You can do this using __identifier:
#include <stdio.h>
#pragma warning(suppress: 4483)
extern "C" void __cdecl __identifier("?foo@@YAXHD@Z")(int c, char v)
{
printf("%d %c\n", c, v);
}
void __cdecl foo(int, char);
int main()
{
foo(10, 'x');
}

You're right. That's not (directly) possible (note: never trust VSC++). However, there exists a nifty workaround if you really need this. First of all, in the C++ file...
extern "C" int proxy(int i, char c);
int foo(int i, char c)
{
return proxy(i, c);
}
Then, in the C file...
int proxy(int i, char c)
{
// Do whatever you wanna do here
}
Without having to type any mangled name at all, you are now able to call the foo function, which is actually just a wrapper around the C function proxy. This gives you the same effect as if proxy was actually foo, from C++'s point of view. The single penalty here is of course a quick 'n' dirty function call. If the ABI allows it, and the compiler is smart enough, this can be replaced with a single JMP x86 instruction.
Another way would be to write a function foo in C, and then use MinGW's objcopy in order to rename the symbol...
$ objcopy --redefine-sym "foo=?foo@@YAXHD@Z" foobar.obj
I'm not sure if that's possible just with VSC++ tools. It would be very unstable, unportable, and hacky anyways.

You might get it to work using a .DEF file.
Define your function in your foo.cpp:
void foo(int c, char v) { ... }
Then pass a def file to the linker, that looks like this:
LIBRARY mylib
EXPORTS
?foo@@YAXHD@Z=_foo
Disclaimer: untested, I might be missing some details.

Related

C++ program using a C library headers is recognizing "this" as a keyword. Extern "C" error?

My C++ program needs to use an external C library.
Therefore, I'm using the
extern "C"
{
#include <library_header.h>
}
syntax for every module I need to use.
It worked fine until now.
A module is using the this name for some variables in one of its header file.
The C library itself is compiling fine because, from what I know, this has never been a keyword in C.
But despite my usage of the extern "C" syntax,
I'm getting errors from my C++ program when I include that header file.
If I rename every this in that C library header file with something like _this,
everything seems to work fine.
The question is:
Shouldn't the extern "C" syntax be enough for backward compatibility,
at least at syntax level, for an header file?
Is this an issue with the compiler?
Shouldn't the extern "C" syntax be enough for backward compatibility, at least at syntax level, for an header file? Is this an issue with the compiler?
No. Extern "C" is for linking - specifically the policy used for generated symbol names ("name mangling") and the calling convention (what assembly will be generated to call an API and stack parameter values) - not compilation.
The problem you have is not limited to the this keyword. In our current code base, we are porting some code to C++ and we have constructs like these:
struct Something {
char *value;
char class[20]; // <-- bad bad code!
};
This works fine in C code, but (like you) we are forced to rename to be able to compile as C++.
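To make the linkage-versus-language point concrete, here is a minimal sketch. The function name sum_values is made up for illustration. It gets C linkage (an unmangled symbol), yet its body is still compiled as ordinary C++, which is exactly why C++ keywords stay reserved inside an extern "C" block.

```cpp
#include <numeric>
#include <vector>

// Hypothetical example function: extern "C" only changes how the symbol is
// named for the linker, so a C caller could link against "sum_values".
extern "C" int sum_values(const int* values, int count) {
    // The body is still C++: standard containers and templates work here,
    // and identifiers such as `this` or `class` are still keywords.
    std::vector<int> copy(values, values + count);
    return std::accumulate(copy.begin(), copy.end(), 0);
}
```

Calling sum_values on an array holding 1, 2 and 3 returns 6, whether the caller is C or C++.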
Strangely enough, many compilers don't forcibly disallow keyword redefinition through the preprocessor:
#include <iostream>
// temporary redefinition to compile code abusing the "this" keyword
#define cppThis this
#define this thisFunction
int this() {
return 1020;
}
int that() {
return this();
}
// put the C++ definition back so you can use it
#undef this
#define this cppThis
struct DumpThat {
void dump() {
std::cout << that();
}
DumpThat() {
this->dump();
}
};
int main ()
{
DumpThat dt;
}
So if you're up against a wall, that could let you compile a file written to C assumptions that you cannot change.
It will not--however--allow you to get a linker name of "this". There might be linkers that let you do some kind of remapping of names to help avoid collisions. A side-effect of that might be they allow you to say thisFunction -> this, and not have a problem with the right hand side of the mapping being a keyword.
In any case...the better answer if you can change it is...change it!
If extern "C" allowed you to use C++ keywords as symbols, the compiler would have to resolve them somehow outside of the extern "C" sections. For example:
extern "C" {
int * this; //global variable
typedef int class;
}
int MyClass::MyFunction() { return *this; } //what does this mean?
//MyClass could have a cast operator
class MyOtherClass; //forward declaration or a typedef'ed int?
Could you be more explicit about "using the this name for some variables in one of its header files"?
Is it really a variable or is it a parameter in a function prototype?
If it is the latter, you don't have a real problem because C (and C++) prototypes identify parameters by position (and type) and the names are optional. You could have a different version of the prototype, eg:
#ifdef __cplusplus
extern "C" {
void aFunc(int);
}
#else
void aFunc(int this);
#endif
Remember there is nothing magic about header files - they just provide code which is lexically included in at the point of #include - as if you copied and pasted them in.
So you can have your own copy of a library header which does tricks like the above, just becoming a maintenance issue to ensure you track what happens in the original header. If this was likely to become an issue, add a script as a build step which runs a diff against the original and ensures the only point of difference is your workaround code.

Standard c++ library linking

I'm trying to understand when the standard library gets linked to my own binary. I've written the following:
#include <stdio.h>
double atof(const char*);
int main(){
const char * v="22";
printf("Cast result is %f", atof(v));
}
It compiles successfully with g++ -c main.cpp, but when I link the newly created object file I get an error. The error description is:
/tmp/ccWOPOS0.o: In function `main':
main.cpp:(.text+0x19): undefined reference to `atof(char const*)'
collect2: error: ld returned 1 exit status
But I don't understand what causes this error. I thought the standard C++ library was linked to my binary automatically by the ld linker. What is the difference between including header files and just explicitly declaring the functions I need to use?
As a general rule in C++, it is a bad idea to manually declare library functions such as atof().
It used to be common in old C programs, but C doesn't have function overloading so it is more forgiving about "almost" correct declarations. (Well some of the old compilers were, I can't really speak for the newest ones). That is why we describe C as a "weakly typed" language, while C++ is a more "strongly typed" language.
An additional complication is that the compilers perform "name mangling": the name they pass to the linker is a modified version of the source name. The C compiler may perform quite different name mangling from the C++ compiler. The standard lib version of atof() is a C function. To declare it in a C++ source file you need to declare it as
extern "C"
{
double atof(const char *);
}
or possibly
extern "C" double atof(const char *);
There are many additional complexities, but that is enough to go on with.
Safest idea is to just include the appropriate headers.
#include <iostream>
#include <cstdlib>
int main()
{
const char v[]= "22";
std::cout << "Cast result is " << std::atof(v) << std::endl;
return 0;
}
Extra background in response to comment by @DmitryFucintv
1 Calling conventions
When calling a function, a calling convention is an agreement on how parameters and return values are passed between the calling function and the called function. On x86 architecture, the two most common are __cdecl and __stdcall, but a number of others exist.
Consider the following:
/* -- f.c --*/
int __stdcall f(int a, double b, char *c)
{
// do stuff
return something;
}
/* main.c */
#include <iostream>
extern int __cdecl f(int a, double b, char *c);
int main()
{
std::cout << f(1, 2.3, "45678") << std::endl;
return 0;
}
In a C program, this will probably compile and link OK. The function f() is expecting its args in __stdcall format, but we pass them in __cdecl format. The result is indeterminate, but could easily lead to stack corruption.
Because the C++ linker is a bit fussier, it will probably generate an error like the one you saw. Most would agree that is a better outcome.
2 Name Mangling
Name Mangling (or name decoration) is a scheme where the compiler adds some extra characters to the object name to give some hints to the linker. An object might be a function or a variable. Languages that permit function overloading (like C++ and Java) must do something like this so that the linker can tell the difference between different functions with the same name.
e.g.
int f(int a);
int f(double a);
int f(const char *a, ...);
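As a small sketch of why the linker needs distinct names here, the three overloads can be given bodies that report which one was chosen. The std::string return type is my own substitution for illustration, and the symbol names in the comments are what an Itanium-ABI compiler such as g++ would be expected to emit. Other compilers use different schemes.

```cpp
#include <string>

// Three functions share the source-level name "f". Each one needs its own
// symbol, so the compiler encodes the parameter types into the name.
std::string f(int) { return "f(int)"; }                            // e.g. _Z1fi
std::string f(double) { return "f(double)"; }                      // e.g. _Z1fd
std::string f(const char*, ...) { return "f(const char*, ...)"; }  // e.g. _Z1fPKcz
```

A call such as f(1) resolves to the first overload, f(2.5) to the second, and f("text") to the third. The choice is made at compile time, and the linker only ever sees the three decorated names.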
It's because atof has C linkage, and you're compiling this as C++ - change:
double atof(const char*);
to:
extern "C" double atof(const char*);
and it should work.
Obviously you should not normally do this and you should just use the correct header:
#include <cstdlib>
This has nothing to do with the standard library.
The problem you have is that atof is not being defined, so the linker doesn't find it. You need to define the function; otherwise it is impossible to know what the code is supposed to do.
And it looks like atof is a C function declared in the header stdlib.h. This code should work, although it is not using any C++-exclusive functions.
#include <stdio.h>
#include <stdlib.h>
int main(){
const char * v="22";
printf("Cast result is %f", atof(v));
}
When you declare atof, you're declaring a subtly different function to the standard one. The function you're declaring is not defined in the standard library.
Don't re-declare standard functions, because you're liable to get it wrong, as here. Include the header, and the header correctly declares the functions for you.

Calling C++ function pointers from C libraries

I have a class that only has static members.
I would like to register one of its member functions (VerifyClean in the code below) to be called at exit, using the "atexit" library function.
The C++ FQA says that i must specify extern "C" for the function i want to register this way, like in the following example.
class Example
{
public:
static void Initialize();
static void DoDirtyStuff() {++dirtLevel;}
static void CleanUpStuff() {--dirtLevel;}
private:
static void VerifyClean();
// DOESN'T COMPILE: extern "C" static void VerifyClean();
static int dirtLevel;
};
int Example::dirtLevel;
extern "C" void Example::VerifyClean() // DO I NEED extern "C" HERE?
{
assert(dirtLevel == 0);
}
void Example::Initialize()
{
dirtLevel = 0;
atexit(&VerifyClean);
}
Do i really have to use extern "C"?
Does the answer change if i replace "atexit" with a non-library function (implemented in plain C)?
If the function VerifyClean were public and i decided to call it directly from C++ code, would i get link errors or runtime crashes? I ask this because the declaration doesn't mention extern "C" at all, so regular C++ code might handle the function call incorrectly. This works OK on my MS Visual Studio 2005 system.
It is possible for a compiler to use different calling conventions for C and C++ code; however, in practice, this almost never occurs.
If you just want it to work and don't care about supporting obscure compilers, don't bother with extern "C". It's not necessary in any widely-used compiler.
If you want to be absolutely pedantic, or need to support a pedantic compiler, write a wrapper:
extern "C" void ExampleVerifyClean()
{
Example::VerifyClean();
}
void Example::Initialize()
{
dirtLevel = 0;
atexit(&ExampleVerifyClean);
}
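For reference, here is a self-contained sketch of the wrapper approach that should compile as-is. It makes a few assumptions of its own: VerifyClean is public so the free wrapper can reach it, Initialize returns whether registration succeeded, and the check prints to stderr instead of asserting.

```cpp
#include <cstdio>
#include <cstdlib>

class Example {
public:
    static bool Initialize();
    static void DoDirtyStuff() { ++dirtLevel; }
    static void CleanUpStuff() { --dirtLevel; }
    static void VerifyClean();  // public here so the wrapper below can call it
private:
    static int dirtLevel;
};

int Example::dirtLevel = 0;

void Example::VerifyClean() {
    if (dirtLevel != 0) {
        std::fprintf(stderr, "dirtLevel is %d at exit\n", dirtLevel);
    }
}

// Free function with C linkage: this is what gets handed to atexit.
extern "C" void ExampleVerifyClean() {
    Example::VerifyClean();
}

bool Example::Initialize() {
    dirtLevel = 0;
    // std::atexit returns 0 when the handler was registered successfully.
    return std::atexit(&ExampleVerifyClean) == 0;
}
```

Calling Example::Initialize() once at startup registers the check. Balanced DoDirtyStuff/CleanUpStuff calls then leave nothing to report when the program exits.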
link errors.
C++ performs what is called name mangling, which generates a link-time function name with type information.
extern "C" turns that off, leaving a simpler identifier.
edit:
If everything is being compiled by a C++ compiler, it won't be an issue. But if you have an object file compiled by a C compiler and one being compiled by a C++ compiler, you are going to have some issues.
I seem to recall DLLs requiring an extern "C" specification, but that memory is maybe 10 years old at this point
Okay.
I whipped up a test case with a function that had a signature
int foo(float, float)
And compiled it under 3 different gcc invocations -
gcc test_c.c -S
g++ test.cpp -S
These two invocations produced different identifiers in the assembly. The C++ had mangled the name in its usual type-modifying approach. (of course compilers may do this differently)
Then I wrapped foo in extern "C" and invoked g++ again...
g++ test.cpp -S
Which then removed the mangled C++ name, leaving a plain C unmangled name.
While there are other subtleties involved here, e.g., the order of arguments pushed onto the stack, I rest my case on this point, based on data.
Without extern "C", your function name will get mangled by the compiler, so the function name might end up different from what you expect. You need to look up the function using its mangled name (for example with GetProcAddress on Windows) or you'll get a linker error. Different compilers mangle names in different ways, so it's best to keep using the extern keyword.
You might use this:
class yourname
{
public:
...
static void _cdecl AtExitCall ();
};
int main()
{
atexit( yourname::AtExitCall );
}
It is a mistake to use extern "C" in this case, for two reasons:
You use extern "C" when you want to compile part of your program with g++ (a C++ compiler) and link it with code compiled by gcc (a C compiler). It tells g++ to turn off function name mangling. Clearly that is not your case here, or your whole file would be in extern "C".
When using extern "C" you should use C-like function names, such as Example_VerifyClean.
Example::VerifyClean is not a valid C function name and cannot be stored without mangling.

Function declaration in C and C++

I have two C++ files, say file1.cpp and file2.cpp as
//file1.cpp
#include<cstdio>
void fun(int i)
{
printf("%d\n",i);
}
//file2.cpp
void fun(double);
int main()
{
fun(5);
}
When I compile them and link them as c++ files, I get an error "undefined reference to fun(double)".
But when I do this as C files, I don't get error and 0 is printed instead of 5.
Please explain the reason.
Moreover I want to ask whether we need to declare a function before defining it because
I haven't declared it in file1.cpp but no error comes in compilation.
This is most likely because of function overloading. When compiling with C, the call to fun(double) is translated into a call to the assembly function _fun, which will be linked in at a later stage. The actual definition also has the assembly name _fun, even though it takes an int instead of a double, and the linker will merrily use this when fun(double) is called.
C++ on the other hand mangles the assembly names, so you'll get something like _fun@int for fun(int) and _fun@double for fun(double), in order for overloading to work. The linker will see these have different names and spurt out an error that it can't find the definition for fun(double).
For your second question it is always a good idea to declare function prototypes, generally done in a header, especially if the function is used in multiple files. There should be a warning option for missing prototypes in your compiler, gcc uses -Wmissing-prototypes. Your code would be better if set up like
// file1.hpp
#ifndef FILE1_HPP
#define FILE1_HPP
void fun(int);
#endif
// file1.c
#include "file1.hpp"
...
// file2.c
#include "file1.hpp"
int main()
{
fun(5);
}
You'd then not have multiple conflicting prototypes in your program.
This is because C++ allows you to overload functions and C does not. It is valid to have this in C++:
double fun(int i);
double fun(double i);
...
double fun(int i) { return 1;}
double fun(double i) { return 2.1; }
but not in C.
The reason you can't build it with your C++ compiler is that the compiler sees the declaration taking a double and tries to find a definition for it. With the C compiler you should be getting an error for this as well; I suspect you didn't enter the code exactly as shown when testing with the C compiler.
The main point: C++ has function overloading, C does not.
C++ (a sadistic beast, you will agree) likes to mangle the names of the functions. Thus, in your header file for the C part:
at the top:
#ifdef __cplusplus
extern "C" {
#endif
at the bottom:
#ifdef __cplusplus
}
#endif
This will persuade it not to mangle some of the names.
OR, in your cpp you could say
extern "C" void fun( double );
A holdover of the C language is that it allows functions to be called without actually requiring the declaration visible within the translation -- it just assumes that the arguments of such functions are all int.
In your example, C++ allows for overloading, and does not support implicit function declarations - the compiler uses the visible function fun(double), and the linker fails because the function fun(double) is never implemented. fun(int) has a different signature (in C++), and exists as a unique symbol, whereas a C compiler (or linker, depending on visibility) would produce an error when you declare both fun(int) and fun(double) as C symbols.
That's just how languages evolved over the years (or not). Your compiler probably has a warning for this problem (implicit function declarations).
You'll see different results when you declare the functions as C functions (they're declared as C++ functions in your example when compiled as C++ source files).
C++ requires the function to be declared before it is used, C does not (unless you tell your compiler to warn you about the issue).
When compiled as C++ you are allowed to have two functions with the same name (as long as they have different parameters). In C++ name mangling is used so the linker can distinguish the two:
fun_int(int x);
fun_double(double x);
When compiled in C there is only one function with a specific name.
When you compile file1.c, it generates a function that reads an integer from the stack and prints it.
When you compile file2.c, the compiler sees that fun() takes a double. So it converts the argument to a double, pushes it onto the stack, and inserts a call to fun() into the code. As the function is in a different compilation unit, the actual address is not resolved here but only when the linker is invoked. When the linker runs, it sees that a call to fun needs to be resolved and inserts the correct address, but it has no type information to validate the call with.
At runtime, 5 is converted into a double and pushed onto the stack. Then fun() is invoked. fun() reads an integer from the stack and prints it. Because a double has a different layout from an int, what gets printed depends on how both double and int are laid out in memory (formally, the behavior is undefined).
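To see why 0 is a plausible result, here is a small sketch that inspects the bit pattern of 5.0. It assumes an IEEE 754 64-bit double. The helper names are mine, and which half the callee actually reads depends on the platform's calling convention. On a little-endian target that passes arguments on the stack, an int read at the start of the double would see the low word.

```cpp
#include <cstdint>
#include <cstring>

// Copy the object representation of a double into a 64-bit integer.
std::uint64_t double_bits(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Low 32 bits of the representation.
std::uint32_t low_word(double value) {
    return static_cast<std::uint32_t>(double_bits(value) & 0xFFFFFFFFu);
}

// High 32 bits of the representation (sign, exponent, top of the mantissa).
std::uint32_t high_word(double value) {
    return static_cast<std::uint32_t>(double_bits(value) >> 32);
}
```

For 5.0 the full pattern is 0x4014000000000000, so the low word is 0 and the high word is 0x40140000. A callee that reinterprets the low half as an int would print 0, which matches the behaviour described in the question.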
#include <stdio.h>
int Sum(int j, int f)
{
int k;
k = j + f;
return k;
}
int main(void)
{
int i = 3;
int j = 6;
int k = Sum(i, j);
printf("%d", k);
return 0;
}
Output is 9

How does an extern "C" declaration work?

I'm taking a programming languages course and we're talking about the extern "C" declaration.
How does this declaration work at a deeper level other than "it interfaces C and C++"? How does this affect the bindings that take place in the program as well?
extern "C" is used to ensure that the symbols following are not mangled (decorated).
Example:
Let's say we have the following code in a file called test.cpp:
extern "C" {
int foo() {
return 1;
}
}
int bar() {
return 1;
}
If you run gcc -c test.cpp -o test.o
Take a look at the symbols names:
00000010 T _Z3barv
00000000 T foo
foo() keeps its name.
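The _Z3barv name above follows a simple pattern for the most basic case, which a toy encoder can reproduce. This is only a sketch under narrow assumptions: a free function in the global namespace whose parameters are builtin types, passed in as their one-letter Itanium codes (i for int, c for char, d for double). The helper name mangle_simple is made up, and real mangling also has to handle namespaces, templates, pointers, and substitutions.

```cpp
#include <string>

// Toy encoder for the simplest Itanium-ABI case:
//   _Z <length of name> <name> <parameter codes, or "v" for an empty list>
std::string mangle_simple(const std::string& name,
                          const std::string& param_codes) {
    std::string out = "_Z" + std::to_string(name.size()) + name;
    out += param_codes.empty() ? std::string("v") : param_codes;
    return out;
}
```

With this, mangle_simple("bar", "") yields _Z3barv as in the listing, and mangle_simple("foo", "ic") yields _Z3fooic for a function taking an int and a char.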
Let's look at a typical function that can compile in both C and C++:
int Add (int a, int b)
{
return a+b;
}
Now in C the function is called "_Add" internally. Whereas the C++ function is called something completely different internally using a system called name-mangling. It's basically a way to name a function so that the same function with different parameters has a different internal name.
So if Add() is defined in add.c, and you have the prototype in add.h, you will get a problem if you try to include add.h in a C++ file. Because the C++ code is looking for a function with a name different from the one in add.c, you will get a linker error. To get around that problem you must include add.h by this method:
extern "C"
{
#include "add.h"
}
Now the C++ code will link with _Add instead of the C++ name mangled version.
That's one of the uses of the expression. Bottom line, if you need to compile code that is strictly C in a C++ program (via an include statement or some other means) you need to wrap it with a extern "C" { ... } declaration.
When you flag a block of code with extern "C", you're telling the system to use C style linkage.
This mainly affects the way names are mangled. Instead of C++-style name mangling (which is more complex in order to support overloads), the compiler emits the standard C-style names for the linker.
It should be noted that extern "C" also modifies the types of functions. It does not only modify things on lower levels:
extern "C" typedef void (*function_ptr_t)();
void foo();
int main() { function_ptr_t fptr = &foo; } // error!
The type of &foo does not equal the type that the typedef designates (although the code is accepted by some, but not all compilers).
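A variant that stays within the rules gives the callee C language linkage as well, so the pointer type and the function type agree. The names c_callback_t, twice and apply below are invented for this sketch.

```cpp
extern "C" {
// Pointer to a function with C language linkage.
typedef int (*c_callback_t)(int);

// The callee also has C language linkage, so &twice matches c_callback_t.
int twice(int x) { return 2 * x; }
}

// An ordinary C++ function that accepts the C-linkage callback.
int apply(c_callback_t callback, int value) {
    return callback(value);
}
```

Here apply(&twice, 21) returns 42, and the code should be accepted even by a compiler that treats C and C++ function types as distinct.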
extern "C" affects name mangling by the C++ compiler. It's a way of getting the C++ compiler to not mangle names, or rather to mangle them in the same way that a C compiler would. This is the way it interfaces C and C++.
As an example:
extern "C" void foo(int i);
will allow the function to be implemented in a C module, but allow it to be called from a C++ module.
The trouble comes when trying to get a C module to call a C++ function (obviously C can't use C++ classes) defined in a C++ module. The C compiler doesn't like extern "C".
So you need to use this:
#ifdef __cplusplus
extern "C" {
#endif
void foo(int i);
#ifdef __cplusplus
}
#endif
Now when this appears in a header file, both the C and C++ compilers will be happy with the declaration and it could now be defined in either a C or C++ module, and can be called by both C and C++ code.
In C++ the name/symbol of the functions are actually renamed to something else such that different classes/namespaces can have functions of same signatures. In C, the functions are all globally defined and no such customized renaming process is needed.
To make C++ and C talk with each other, extern "C" instructs the compiler to use the C naming convention instead of the C++ one.
extern "C" denotes that the enclosed code uses C-style linking and name mangling. C++ uses a more complex name mangling format. Here's an example:
http://en.wikipedia.org/wiki/Name_mangling
int example(int alpha, char beta);
in C: _example
in C++: __Z7exampleic
Update: As GManNickG notes in the comments, the pattern of name mangling is compiler dependent.
extern "C" is a linkage specification that declares a function with C bindings. It is needed because C and C++ compilers translate source names into different forms in the object file.
For example, a code snippet is as follows:
int _cdecl func1(void) {return 0;}
int _stdcall func2(int) {return 0;}
int _fastcall func3(void) {return 1;}
32-bit C compilers will translate the code in the form as follows:
_func1
_func2@4
@func3@0
With cdecl, func1 is translated as '_name'.
With stdcall, func2 is translated as '_name@X'.
With fastcall, func3 is translated as '@name@X'.
'X' is the number of bytes taken by the parameters in the parameter list.
64-bit convention on Windows has no leading underscore
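The three 32-bit patterns (written with @, as the Microsoft toolchain does) are mechanical enough to sketch as a helper. This is purely illustrative: CallConv and decorate_c_name are hypothetical names, not part of any compiler API, and the caller has to supply the parameter byte count.

```cpp
#include <string>

enum class CallConv { Cdecl, Stdcall, Fastcall };

// Build the 32-bit C decoration for a function name.
// param_bytes is the total size of the parameters (ignored for cdecl).
std::string decorate_c_name(CallConv conv, const std::string& name,
                            unsigned param_bytes) {
    switch (conv) {
        case CallConv::Cdecl:
            return "_" + name;
        case CallConv::Stdcall:
            return "_" + name + "@" + std::to_string(param_bytes);
        case CallConv::Fastcall:
            return "@" + name + "@" + std::to_string(param_bytes);
    }
    return name;  // unreachable for valid enum values
}
```

A stdcall function taking one int (4 bytes) comes out as _func2@4, while a fastcall function with no parameters comes out as @func3@0.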
In C++, classes, templates, namespaces and operator overloading are introduced. Since two functions may now share the same name, the C++ compiler encodes type information in the symbol name,
for example, a code snippet is as follows:
int func(void) {return 1;}
int func(int) {return 0;}
int func_call(void) {int m=func(), n=func(0);}
C++ compiler will translate the code as follows:
int _func_v(void) {return 1;}
int _func_i(int) {return 0;}
int func_call(void) {int m=_func_v(), n=_func_i(0);}
'_v' and '_i' are type information of 'void' and 'int'
Here is a quote from msdn
"The extern keyword declares a variable or function and specifies that it has external linkage (its name is visible from files other than the one in which it's defined). When modifying a variable, extern specifies that the variable has static duration (it is allocated when the program begins and deallocated when the program ends). The variable or function may be defined in another source file, or later in the same file. Declarations of variables and functions at file scope are external by default."
http://msdn.microsoft.com/en-us/library/0603949d%28VS.80%29.aspx