Combining C++ and C - how does #ifdef __cplusplus work? - c++

I'm working on a project that has a lot of legacy C code. We've started writing in C++, with the intent to eventually convert the legacy code, as well. I'm a little confused about how the C and C++ interact. I understand that by wrapping the C code with extern "C" the C++ compiler will not mangle the C code's names, but I'm not entirely sure how to implement this.
So, at the top of each C header file (after the include guards), we have
#ifdef __cplusplus
extern "C" {
#endif
and at the bottom, we write
#ifdef __cplusplus
}
#endif
In between the two, we have all of our includes, typedefs, and function prototypes. I have a few questions, to see if I'm understanding this correctly:
If I have a C++ file A.hh which
includes a C header file B.h,
includes another C header file C.h,
how does this work? I think that
when the compiler steps into B.h,
__cplusplus will be defined, so it
will wrap the code with extern "C"
(and __cplusplus will not be
defined inside this block). So,
when it steps into C.h,
__cplusplus will not be defined
and the code will not be wrapped in
extern "C". Is this correct?
Is there anything wrong with
wrapping a piece of code with
extern "C" { extern "C" { .. } }?
What will the second extern "C"
do?
We don't put this wrapper around the .c files, just the .h files. So, what happens if a function doesn't have a prototype? Does the compiler think that it's a C++ function?
We are also using some third-party
code which is written in C, and does
not have this sort of wrapper around
it. Any time I include a header
from that library, I've been putting
an extern "C" around the #include.
Is this the right way to deal with
that?
Finally, is this set up a good idea?
Is there anything else we should do?
We're going to be mixing C and C++
for the foreseeable future, and I
want to make sure we're covering all
our bases.

extern "C" doesn't really change the way that the compiler reads the code. If your code is in a .c file, it will be compiled as C, if it is in a .cpp file, it will be compiled as C++ (unless you do something strange to your configuration).
What extern "C" does is affect linkage. C++ functions, when compiled, have their names mangled -- this is what makes overloading possible. The function name gets modified based on the types and number of parameters, so that two functions with the same name will have different symbol names.
Code inside an extern "C" is still C++ code. There are limitations on what you can do in an extern "C" block, but they're all about linkage. You can't define any new symbols that can't be built with C linkage. That means no classes or templates, for example.
extern "C" blocks nest nicely. There's also extern "C++" if you find yourself hopelessly trapped inside of extern "C" regions, but it isn't such a good idea from a cleanliness perspective.
Now, specifically regarding your numbered questions:
Regarding #1: __cplusplus will stay defined inside of extern "C" blocks. This doesn't matter, though, since the blocks should nest neatly.
Regarding #2: __cplusplus will be defined for any compilation unit that is being run through the C++ compiler. Generally, that means .cpp files and any files being included by that .cpp file. The same .h (or .hh or .hpp or what-have-you) could be interpreted as C or C++ at different times, if different compilation units include them. If you want the prototypes in the .h file to refer to C symbol names, then they must have extern "C" when being interpreted as C++, and they should not have extern "C" when being interpreted as C -- hence the #ifdef __cplusplus checking.
To answer your question #3: functions without prototypes will have C++ linkage if they are in .cpp files and not inside of an extern "C" block. This is fine, though, because if it has no prototype, it can only be called by other functions in the same file, and then you don't generally care what the linkage looks like, because you aren't planning on having that function be called by anything outside the same compilation unit anyway.
For #4, you've got it exactly. If you are including a header for code that has C linkage (such as code that was compiled by a C compiler), then you must extern "C" the header -- that way you will be able to link with the library. (Otherwise, your linker would be looking for functions with names like _Z1hic when you were looking for void h(int, char)
5: This sort of mixing is a common reason to use extern "C", and I don't see anything wrong with doing it this way -- just make sure you understand what you are doing.

extern "C" doesn't change the presence or absence of the __cplusplus macro. It just changes the linkage and name-mangling of the wrapped declarations.
You can nest extern "C" blocks quite happily.
If you compile your .c files as C++ then anything not in an extern "C" block, and without an extern "C" prototype will be treated as a C++ function. If you compile them as C then of course everything will be a C function.
Yes
You can safely mix C and C++ in this way.

A couple of gotchas that are colloraries to Andrew Shelansky's excellent answer and to disagree a little with doesn't really change the way that the compiler reads the code
Because your function prototypes are compiled as C, you can't have overloading of the same function names with different parameters - that's one of the key features of the name mangling of the compiler. It is described as a linkage issue but that is not quite true - you will get errors from both the compiler and the linker.
The compiler errors will be if you try to use C++ features of prototype declaration such as overloading.
The linker errors will occur later because your function will appear to not be found, if you do not have the extern "C" wrapper around declarations and the header is included in a mixture of C and C++ source.
One reason to discourage people from using the compile C as C++ setting is because this means their source code is no longer portable. That setting is a project setting and so if a .c file is dropped into another project, it will not be compiled as c++. I would rather people take the time to rename file suffixes to .cpp.

It's about the ABI, in order to let both C and C++ application use C interfaces without any issue.
Since C language is very easy, code generation was stable for many years for different compilers, such as GCC, Borland C\C++, MSVC etc.
While C++ becomes more and more popular, a lot things must be added into the new C++ domain (for example finally the Cfront was abandoned at AT&T because C could not cover all the features it needs). Such as template feature, and compilation-time code generation, from the past, the different compiler vendors actually did the actual implementation of C++ compiler and linker separately, the actual ABIs are not compatible at all to the C++ program at different platforms.
People might still like to implement the actual program in C++ but still keep the old C interface and ABI as usual, the header file has to declare extern "C" {}, it tells the compiler generate compatible/old/simple/easy C ABI for the interface functions if the compiler is C compiler not C++ compiler.

Related

Is it possible for a C method not be able to be included in a C++ project?

I know I can include C methods inside a C++ project using the extern "C" thing. But lets us now suppose that I'm thinking in creating a C++ project that would use quite a lot of C methods coming from both libraries made by me as well as libraries made by other people/companies whose developing details and compilation specifications I'm simply not aware of.
Is it possible that some of this methods of C libraries, with unknown compilation and configuration details, could not be included in my C++ project with extern "C"? Or are all C methods necessarily 100% compatible with C++ code insofar extern "C" is used?
It's possible that some of the functions exported by C use names that collide with C++ keywords. You wouldn't be able to declare those using extern "C".
Functions exported by assembler could even use names that conflict with C keywords.
Those and functions declared static can still be called via function pointer, as long as the library gives you a way to get one.
Headers might not be parse-able in C++ mode for the same reasons -- things like the restrict keyword.
Other than naming issues, C++ has full support for the C calling convention. That's what extern "C" is all about.
C has constructs for interfaces that are not compatible with C++, in particular variable length arrays. In modern C you would write
void matMult(size_t n, size_t k, size_t m, double A[n][k], double B[k][m], double C[n][m]);
this interface can not be included as such in C++ compilation units.
Although rather unlikely, one possible issue that might arise with extern "C" in-place is when a function pointer declared extern "C" points to a C++ function that is not declared extern "C". See the last part of this page for more details.

How does "extern C++" work?

I jumped into winnt.h and I found out the code as following:
extern "C++" // templates cannot be declared to have 'C' linkage
template <typename T, size_t N>
char (*RtlpNumberOf( UNALIGNED T (&)[N] ))[N];
I'd like to ask questions as following:
how does extern "C++" work?
is this portable among GCC, and Clang?
can all templates be exported with this syntax?
With question 3, I mean that can I separate declearation and definition of the templates, and then generate a dynamic link for the template without actually give the implementation by using this trick?
Well, extern "C++" won't work in C, of course (though some compilers might support it as an extension). So it only makes sense to use it in C++.
That's because in the case of multiple nested extern linkage specifiers, the innermost one takes effect. So if you have a header file surrounded with extern "C", you can use extern "C++" to temporarily break out of it and declare something with C++ linkage.
It makes the most sense when you want to provide a generally C interface for a C++ library, but you also want to provide C++ helper bits for people actually using it in C++. So you'd put #ifdef __cplusplus \ extern "C" { \ #endif around the header as a whole, and then you ifdef-in those bits with extern "C++" to revert to C++ linkage.
It works by forcing the compiler to use C++ linkage when the surrounding code uses C linkage by default (e.g., you include winnt.h in a C program).
Yes, it should be portable.
Yes they can.
There is not much use for "extern "C++"" in C++ programs because the linkage is "C++" anyway. It makes sense to use "extern "C++"" only if there is a good chance that your C++ code will be included into a C code.

extern "C"---when *exactly* to use? [duplicate]

This question already has answers here:
When to use extern "C" in simple words? [duplicate]
(7 answers)
Closed 9 years ago.
If you're tempted to flag this question as a duplicate, please note that I've read the questions on this subject, yet something still is unclear to me. I'm under the impression that this construct is used when including C headers and linking with C code (please do correct me if I'm wrong). Does it mean that I never have to use "extern C" when not dealing with object files? If I'm wrong about that, why can't the old C code just be compiled as C++, as most likely it's legal c++ code anyway?
I'm a bit iffy about it because I swear I've had situations when working with old C source code in C++ where a linker error is solved only with "extern C", and library headers do have
#ifdef __cplusplus
#extern "C"{
#endif
//......
#ifdef _cplusplus
}
#endif
around them.
EDIT: sorry for being unclear, but what I meant to ask is that whether it's true that "extern C" is only needed when including C headers and linking with pre-existing C object files? If it's true, (and it seems to be judging from comments below), why do library headers have "extern C" clauses around them, why can't they just be included and compiled as C++?
Name mangling rules are different for C. And C can have a different ABI than C++. These reasons alone require you to use extern "C" when embedding C code in C++ code. Even if a compiler can compile both C and C++ code, it might use different name mangling rules or ABIs for the two languages.
Also, your assertion that "[C code is] most likely ... legal c++ code" is not quite true, as C and C++ have diverged more and more as the years have gone on. They have a lot of similarities, but they also have a good number of differences.
The library itself is a C object file, therefore in order to use it your application has to expect a C-ABI for calling the functions in the library and you need to provide the appropriate hint to the compiler when you prototype the functions.
extern void libraryFunc();
If the library is actually compiled as C, which is the only way it can support C and C++, then you need to include annotation for C++ compilers that this MUST be linked as C.
#ifdef __cplusplus // only true when compiling with a C++ compiler
extern "C" {
#endif
extern void libraryFunc();
#ifdef __cplusplus
}
#endif
To a C compiler, this reads
extern void libraryFunc();
To a C++ compiler, this reads
extern "C" {
extern void libraryFunc();
...
}
which is equivalent to
extern "C" void libraryFunc();
If the duplication of extern bothers you, consider:
#if defined __cplusplus
# define C_EXTERN extern "C"
#else
# define C_EXTERN extern
#endif
EXTERN_C {
void foo();
}
The compiler and linker now know to use the C ABI when trying to call/link that function.
Note that the C++ ABI is a superset of the C ABI (application binary interface) so if you want to share code, C is the LCD and needs to be your common interface. C is completely unaware of C++ "name mangling" etc.
This is essentially answered here: When to use extern "C" in simple words?
But the pertinent point is that, when compiling in C++, the names of functions are "mangled" to encode certain information about the function (like argument types). Since C++ always mangles a function name the same way, everything is consistent between the mangled name call and what is put in the object file.
If you compile a file in C called "foo", and in your C++ file you have extern foo(int c);, the C++ compile will have mangled "foo" into something different, for example, foo__Ic (I just made it up, the actual mangling will look different).
Meanwhile, in the plain C code compiled with the C compiler, the object code has defined symbol that is simply foo.
However, when the whole thing is linked, the C++ code has an external symbol foo__Ic that it's trying to resolve, which does not match the defined symbol foo in the C object file.
Hope that helps.
There's another case not mentioned yet. extern "C" specifies the C linkage, but C is not the only other language. C linkage is also used by other languages which are too obscure to have their own widely accepted linkage. E.g. Java uses C linkage, too. Obviously you can't compile that Java code with C++, so you do need extern "C" on the C++ side.

When to use extern "C"?

I know how to use extern "C" but what are the conditions when you have to use it?
extern "C" tells the C++ compiler not to perform any name-mangling on
the code within the braces. This allows you to call C functions from
within C++.
For example:
#include <string.h>
int main()
{
char s[] = "Hello";
char d[6];
strcpy_s(d, s);
}
While this compiles fine on VC++. But sometimes this is written as:
extern "C" {
#include <string.h>
}
I don't see the point. Can you give a real example where extern "C" is necessary?
You use extern "C" to prevent name mangling inside header files and your C++ object files for libraries or objects that have already been compiled without mangling.
For example, say you have a widget library which was compiled with a C compiler so that its published interface is non-mangled.
If you include the header file as is into your code, it will assume the names are mangled and those mangled versions are what you'll tell the linker to look for.
However, since you'll be asking for something like function#intarray_float_charptr and the widget library will have only published function, you're going to run into problems.
However, if you include it with:
extern "C" {
#include "widget.h"
}
your compiler will know that it should try to use function, the non-mangled version.
That's why, in header files for C stuff meant to be included in C _or C++ programs, you'll see things like:
#ifdef __cplusplus
extern "C" {
#endif
// Everything here works for both C and C++ compilers.
#ifdef __cplusplus
}
#endif
If you use a C compiler to include this, the #ifdef lines will cause the extern "C" stuff to disappear. For a C++ compiler (where __cplusplus is defined), everything will be non-mangled.
One very common use of extern "C" when you are exporting a function from a library. If you don't disable C++ name mangling you can otherwise make it very hard for clients of your library to name your function. And likewise, when going in the other direction, when you are importing a function that has been exported with C linkage.
Here is a concrete example of where things break and need extern "C" to get fixed.
module.h:
int f(int arg);
module.c:
int f(int arg) {
return arg + 1;
}
main.cpp:
#include "module.h"
int main() {
f(42);
}
Since I am mixing C and C++, this won't link (of the two object files, only one will know f under its C++ mangled name).
Perhaps the cleanest way to fix this is by making the header file compatible with both C and C++:
module.h:
#ifdef __cplusplus
extern "C" {
#endif
int f(int arg);
#ifdef __cplusplus
}
#endif
When you link with libraries that are written in C extern tells the compiler not to decorate the names so that the linker can find the functions. In C++ function names et al have information for the linker e.g. argument types and sizes contained in the name.
If you are producing a binary library A that exposes a function that you would like to call from binary B.
Imagine A is A.dll and B is B.exe and you are on a Windows system.
C++ does not describe a binary layout such that B knows how to call A. Typically this problem is worked around by using the same compiler to produce A and B. If you want a more generic solution you use the extern keyword. This exposes the function in a C manner. C does describe a binary format so that different binaries from different compilers can talk to each other.
See:
http://en.wikipedia.org/wiki/Application_binary_interface
http://en.wikipedia.org/wiki/Name_mangling#Name_mangling_in_C.2B.2B
If, inside your C++ code, you #include a header for an external library (coded in C) and if the functions there are not declared extern "C" they wont work (you'll get undefined reference at link time).
But these days, people coding C libraries and providing header files for you tend to know that, and often put the extern "C" in their header file (suitably protected with #ifdef __cplusplus)
Perhaps a better way to understand is to use (assuming you have a Linux system) the nm utility to show you the (unmangled) names used in a library or an executable.

When to use extern "C" in simple words? [duplicate]

This question already has answers here:
What is the effect of extern "C" in C++?
(17 answers)
Closed 8 years ago.
Maybe I'm not understanding the differences between C and C++, but when and why do we need to use
extern "C" {
? Apparently its a "linkage convention".
I read about it briefly and noticed that all the .h header files included with MSVS surround their code with it. What type of code exactly is "C code" and NOT "C++ code"? I thought C++ included all C code?
I'm guessing that this is not the case and that C++ is different and that standard features/functions exist in one or the other but not both (ie: printf is C and cout is C++), but that C++ is backwards compatible though the extern "C" declaration. Is this correct?
My next question depends on the answer to the first, but I'll ask it here anyway: Since MSVS header files that are written in C are surrounded by extern "C" { ... }, when would you ever need to use this yourself in your own code? If your code is C code and you are trying to compile it in a C++ compiler, shouldn't it work without problem because all the standard h files you include will already have the extern "C" thing in them with the C++ compiler?
Do you have to use this when compiling in C++ but linking to already built C libraries or something?
You need to use extern "C" in C++ when declaring a function that was implemented/compiled in C. The use of extern "C" tells the compiler/linker to use the C naming and calling conventions, instead of the C++ name mangling and C++ calling conventions that would be used otherwise. For functions provided by other libraries, you will almost never need to use extern "C", as well-written libraries will already have this in there for the public APIs that it exports to both C and C++. If, however, you write a library that you want to make available both in C and in C++, then you will have to conditionally put that in your headers.
As for whether all C code is C++ code... no, that is not correct. It is a popular myth that C++ is a "superset of C". While C++ certainly strives to be as compatible with C as possible, there are some incompatibilities. For example, bool is valid C++ but not valid C, while _Bool exists in C99, but is not available in C++.
As to whether you will ever need to use extern "C" with the system's ".h" files.... any well-designed implementation will have those in there for you, so that you do not need to use them. However, to be certain that they are provided, you should include the equivalent header file that begins with "c" and omits ".h". For example, if you include <ctype.h>, almost any reasonable system will have the extern "C" added; however, to be assured a C++-compatible header, you should instead include the header <cctype>.
You may also be interested in Mixing C and C++ from the C++ FAQ Lite.
The other answers are correct, but a complete "boilerplate" example will probably help. The canonical method for including C code in C and/or C++ projects is as follows:
//
// C_library.h
//
#ifdef __cplusplus
extern "C" {
#endif
//
// ... prototypes for C_library go here ...
//
#ifdef __cplusplus
}
#endif
-
//
// C_library.c
//
#include "C_library.h"
//
// ... implementations for C_library go here ...
//
-
//
// C++_code.cpp
//
#include "C_library.h"
#include "C++_code.h"
//
// ... C++_code implementation here may call C functions in C_library.c ...
//
Note: the above also applies to calling C code from Objective-C++.
C++ compilers mangle the names in their symbol table differently than C compilers. You need to use the extern "C" declaration to tell the C++ compiler to use the C mangling convention instead when building the symbol table.
I use 'extern c' so that C# can read my C++ code without having to figure out the extra name mangling done when exporting a C++ dll function. Otherwise, there are extra nonsensical (or really, non-English) characters that I have to add at the end of a function entry point on the C# side in order to properly access a C++ function in a dll.
extern "C" {} blocks tell a C++ compiler to use the C naming and calling conventions. If you don't use this you will get linker errors if trying to include a C library with your C++ project because C++ will mangle the names. I tend to use this on all my C headers just in case they are ever used in a C++ project:
#ifdef __cplusplus
extern "C" {
#endif
/* My library header */
#ifdef __cplusplus
} // extern
#endif
You need to use extern "C" when you want to use the C calling convention in code compiled by a C++ compiler. There are two reasons for this:
You have a function implemented in C and want to call it from C++.
You have a function implemented in C++ and want to call it from C. Note that in this case you can only use the C part of C++ in the function interface (no classes, ...).
Apart from C this also applies when you want to interoperate between C++ and other languages which use the same calling and naming conventions as C.
Typically the declarations in a C header file are surrounded with
#ifdef __cplusplus
extern "C" {
#endif
[... C declarations ...]
#ifdef __cplusplus
}
#endif
to make it usable from C++.
C++ functions are subject to name mangling. This makes them impossible to call directly from C code unless extern "C" is used.