The following compiles, links and runs just fine (on Xcode 5.1 / clang):
#include <iostream>
class C { int foo(); };
int main(int argc, const char * argv[])
{
C c;
std::cout << "Hello world!";
}
However, C::foo() is not defined anywhere, only declared. I don't get any compiler or linker warnings / errors, apparently because C::foo() is never referenced anywhere.
Is there any way I can emit a warning that in the whole program no definition for C::foo() exists even though it is declared? An error would actually be better.
Thanks!
There are good reasons why it is not easily feasible. A set of header files could declare many functions, some of which are provided by additional libraries. You may want to #include such headers without using all of these functions (for instance, if you only want to use some #define-d constant).
Alternatively, it is legitimate to have some header and to implement (in your library) only a subset of the API defined by the header files.
And a C++ or C header file could also define the interface of code defined by potential plugins, for programs which usually run without plugins. Many programs accepting plugins are declaring the plugin interface in their header file.
If you really wanted to have such a check, you might perhaps consider customizing GCC with MELT; however, such a check is non trivial to implement currently (and you'll need link time optimization too).
Perhaps try calling all functions in your implementation map and adding a try catch that spits out some warning if they segfault.
You don't. Or rather, this is not the job of the compiler.
I guess I'm just repeating what others said in the comments, but:
It's actually a feature (see unimplemented private constructor)
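For example, the classic pre-C++11 idiom for making a class noncopyable relies on exactly this: the copy operations are declared but deliberately never defined (a sketch, not code from the question):
class NonCopyable {
public:
    NonCopyable() {}
private:
    // Declared but intentionally never defined: copying fails at compile
    // time from outside the class, or at link time if a member or friend
    // ever tries it.
    NonCopyable(const NonCopyable&);
    NonCopyable& operator=(const NonCopyable&);
};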
It wouldn't really help, as the "bug" here is that no proper code cleanup was done prior to committing the code, and an unimplemented, unused function is really your smallest problem then. What about all the other stuff that hasn't been cleaned up? The likelihood of having an implemented but unused function seems just as high to me, and it's more or less the same mess.
Rather than worry about this specific case, I would check if this was just a one time glitch, or if your development team could improve some procedures that would prevent such things in the future.
As far as languages such as C++ are concerned, detecting and reporting undefined functions would defeat one of the important features they provide. Virtual and pure virtual functions are an important mechanism by which C++ implements run-time polymorphism.
If you are developing a library, you might have declared-but-not-defined virtual functions whose definitions are deliberately left to the library's clients. In such cases it would not be beneficial for the compiler to report those undefined functions.
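For instance, a library header may look roughly like this (a hedged sketch with made-up names), with the definitions deliberately left to client code:
// plugin.h -- shipped with the library; the library defines none of these
class Plugin {
public:
    virtual ~Plugin() {}
    virtual void start() = 0;   // pure virtual: definition supplied by clients
    virtual void stop() = 0;
};
// client code, outside the library
class MyPlugin : public Plugin {
public:
    void start() {}             // the only definitions live here
    void stop() {}
};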
Related
Suppose you have the following definition of a C++ class:
class A {
// Methods
#ifdef X
// Hidden methods in some translation units
#endif
};
Is this a violation of the One Definition Rule for the class? What are the associated hazards?
I suspect if member function pointers or virtual functions are used this will most likely break. Otherwise is it safe to use?
I am considering it in the context of Objective-C++. The header file is included in both pure C++ and Objective-C++ translation units. My idea is to guard methods that use Objective-C types with an OBJC macro. Otherwise, I have to use void pointers for all Objective-C types in the header, but that way I lose strong typing and ugly static casts must be added all over the code.
Yes, it may create an ODR-violation hazard if separate translation units are allowed to see different states of the macro X. X should be defined (or not defined) consistently across the whole program (and its shared objects) before every inclusion of that class definition for the program to be compliant. As far as the C++ compiler (as opposed to the preprocessor) is concerned, the two versions are different, incompatible, unrelated class types.
Imagine a situation where, in translation unit A.cpp, X was defined before class A, while in unit B.cpp it was not. You would not get any compiler errors if nothing within B.cpp uses the members that were "removed"; both units may be considered well-formed on their own. Now, if B.cpp contained a new expression, it would create an object of an incompatible type, smaller than the one defined in A.cpp. Any method of class A, including the constructor, may then cause undefined behavior by accessing memory outside the object's storage when called on an object created in B.cpp, because those methods use the larger definition.
There is a variation of this folly: copies of a header file with the same name, containing a POD struct type, placed into two or more different folders of the build tree, one of them reachable via #include <filename>. Units using #include "filename" were intended to pick up the alternatives, but they won't reliably do so, because the order of header lookup for #include "filename" is implementation-defined, so the programmer isn't entirely in control of which header gets included in which unit on every platform. As soon as one definition is changed, even just by re-ordering members, the ODR is breached.
To be safe, such things should be handled entirely in the compiler's domain, using templates, PIMPL, etc. For inter-language communication, some middle ground should be arranged using wrappers or adapters; C++ and Objective-C++ may have incompatible memory layouts for non-POD objects.
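As an illustration of the PIMPL route (a sketch with hypothetical names, not the asker's actual code), the Objective-C members live only behind an opaque pointer, so every translation unit sees the same class layout:
// widget.h -- identical for C++ and Objective-C++ translation units
class Widget {
public:
    Widget();
    ~Widget();
    void show();
private:
    struct Impl;   // defined only in the Objective-C++ source file (.mm)
    Impl* impl_;   // the layout of Widget is the same everywhere
};
The .mm file then defines struct Widget::Impl with whatever Objective-C members it needs, so no #ifdef ever changes the class definition itself.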
This blows up horribly. Do not do this. Example with gcc:
Header file:
// a.h
class Foo
{
public:
Foo() { ; }
#ifdef A
virtual void IsCalled();
#endif
virtual void NotCalled();
};
First C++ File:
// a1.cpp
#include <iostream>
#include "a.h"
void Foo::NotCalled()
{
std::cout << "This function is never called" << std::endl;
}
extern Foo* getFoo();
extern void IsCalled(Foo *f);
int main()
{
Foo* f = getFoo();
IsCalled(f);
}
Second C++ file:
// a2.cpp
#define A
#include "a.h"
#include <iostream>
void Foo::IsCalled(void)
{
std::cout << "We call this function, but ...?!" << std::endl;
}
void IsCalled(Foo *f)
{
f->IsCalled();
}
Foo* getFoo()
{
return new Foo();
}
Result:
This function is never called
Oops! The code called virtual function IsCalled and we dispatched to NotCalled because the two translation units disagreed on which entry was where in the class virtual function table.
What went wrong here? We violated the ODR. Now two translation units disagree about what is supposed to be where in the virtual function table, so if we create an object in one translation unit and call a virtual function on it from another translation unit, we may call the wrong virtual function. Oopsie whoopsie!
Please do not deliberately do things that the relevant standards say are not allowed and will not work. You will never be able to think of every possible way it can go wrong. This kind of reasoning has caused many disasters over my decades of programming, and I really wish people would stop deliberately and intentionally creating potential disasters.
Is it safe to use #ifdef guards on C++ class member functions?
In practice (look at the generated assembler code using GCC as g++ -O2 -fverbose-asm -S) what you propose to do is safe. In theory it should not be.
However, there is another practical approach (used in Qt and FLTK). Use some naming convention for your "hidden" methods (e.g. document that all of them should have dontuse in their name, like int dontuseme(void)), and write a GCC plugin to warn against them at compile time. Or just use some clever grep(1) in your build process (e.g. in your Makefile).
Alternatively, your GCC plugin may implement new #pragma-s or function attributes, and could warn against misuse of such functions.
Of course, you can also use (cleverly) private: and most importantly, generate C++ code (with a generator like SWIG) in your build procedure.
So practically speaking, your #ifdef guards may be useless. And I am not sure they make the C++ code more readable.
If performance matters (with GCC), use the -flto -O2 flags at both compile and link time.
See also GNU autoconf -which uses similar preprocessor based approaches.
Or use some other preprocessor or C++ code generator (GNU m4, GPP, your own one made with ANTLR or GNU bison) to generate some C++ code. Like Qt does with its moc.
So my opinion is that what you want to do is useless. Your unstated goals can be achieved in many other ways. For example, generating "random" looking C++ identifiers (or C identifiers, or ObjectiveC++ names, etc....) like _5yQcFbU0s (this is done in RefPerSys) - the accidental collision of names is then very improbable.
In a comment you state:
Otherwise, I have to use void* for all Objective-C types in the header, but this way I am losing strong typing
No, you can generate some inline C++ functions (that would use reinterpret_cast) to regain that strong typing. Qt does so! FLTK or FOX or GTKmm also generate C++ code (since GUI code is easy to generate).
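A minimal sketch of such generated inline wrappers (hypothetical names; real generated code such as Qt's is far more elaborate): clients only ever see an opaque handle type, and thin inline casts restore the strongly typed view on the implementation side.
// public header: clients see only an opaque handle
struct WindowHandle;              // never defined for clients
// implementation-side header: regain strong typing via inline casts
class WindowImpl;                 // the real C++ type
inline WindowImpl* asImpl(WindowHandle* h)
{
    return reinterpret_cast<WindowImpl*>(h);
}
inline WindowHandle* asHandle(WindowImpl* w)
{
    return reinterpret_cast<WindowHandle*>(w);
}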
My idea was to guard methods with Objective-C types with OBJC macro
This makes perfect sense if you generate some C++, C, or Objective-C code with these macros.
I suspect if member function pointers or virtual functions are used this will most likely break.
In practice, it won't break if you generate random-looking C++ identifiers. Or just document naming conventions (as GNU bison or ANTLR do) in the generated C++ code (or in generated Objective-C++, or generated C, ... code).
Please notice that compilers like GCC today (in 2021) internally use several C++ code generators. So generating C++ code is a common practice. In practice, the risks of name collisions are small if you take care to generate "random" identifiers (you could store them in some sqlite database at build time).
also ugly static casts must be added all over the code
These casts don't matter if the ugly code is generated.
As examples, RPCGEN and SWIG -or Bisoncpp- generate ugly C and C++ code which works very well (and perhaps also some proprietary ASN.1 or JSON or HTTP or SMTP or XML related in-house code generators).
The header file is included in both pure C++ and Objective C++ translation units.
An alternative approach is to generate two different header files...
one for C++, and another for Objective C++. The SWIG tool could be inspirational. Of course your (C or C++ or Objective C) code generators would emit random looking identifiers.... Like I do in both Bismon (generating random looking C names like moduleinit_9oXtCgAbkqv_4y1xhhF5Nhz_BM) and RefPerSys (generating random looking C++ names like rpsapply_61pgHb5KRq600RLnKD ...); in both systems accidental name collision is very improbable.
Of course, in principle, using #ifdef guards is not safe, as explained in this answer.
PS. A few years ago I did work on GCC MELT which generated millions of lines of C++ code for some old versions of the GCC compiler. Today -in 2021- you practically could use asmjit or libgccjit to generate machine code more directly. Partial evaluation is then a good conceptual framework.
I have a project in pure C (the ST USB library) and I need to migrate it to C++ and change some structures into classes. I removed all the C++ "protections" like:
#ifdef __cplusplus
extern "C" {
#endif
#ifdef __cplusplus
}
#endif
I changed all files extensions from .c to .cpp (except HAL library).
I realized that the C++ .hex is 7 kB smaller than the C .hex. When I looked into the .map file I saw that many functions are missing. I thought that static functions caused that, but removing the static keyword didn't help. Does anyone have an idea why some functions weren't compiled? When the extensions are .c, everything is fine.
I can think of two main reasons:
Inlining. The compiler can decide there is no need to emit the function as a standalone function if all usages can be inlined.
Unused code. The compiler can see that a function isn't used anywhere in your code and decide to eliminate it from the final result.
If the result is to be used as a sort of library, where your environment calls specific functions without there being an explicit call in your own code, I think the best method is to compile and link your code as a library (probably a dynamic lib) and export those functions as the library interface (visibility attributes in GCC, dllexport in MSVC); this will force the compiler/linker to include them in the produced binary even if it doesn't see why they are needed right now.
(Of course, this is a wild guess about your target environment.)
Another option is to turn off the specific compiler/linker optimizations that remove dead code and to force emitting a standalone instance of each inlined function, but I think that's a very indirect approach; it has a wider effect and can complicate maintenance later.
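A hedged, GCC-oriented sketch of the export idea (the attribute spellings are GCC-specific and the function name is made up): marking a function as used and exported keeps it in the binary even when nothing in your own code calls it.
// Keep this function in the binary even though nothing in this project
// references it directly (e.g. the HAL or host environment calls it).
extern "C" __attribute__((used, visibility("default")))
void usb_event_hook(void)
{
    // ... handler body ...
}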
C++ functions are given mangled symbol names that differ from C symbol names. Since you lost functionality and the code is much smaller, it's likely that a function that requires C linkage is being compiled as C++ and the symbol-name mismatch is preventing proper linking.
A likely place for this to happen is in the interrupt vector table. If the handler function is compiled with C++ linkage, the handler address won't make it into a table that's compiled with C.
Double check the interrupt vectors and make sure that they reference the correct functions. If they are correct, check any other code compiled with C that might reference an external symbol compiled with C++.
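For example (a sketch assuming an STM32-style handler name; check your startup file for the exact symbol), giving the handler C linkage keeps the unmangled name that the C vector table refers to:
// In a .cpp file: C linkage so the symbol matches the one referenced
// by the (C) startup code / interrupt vector table.
extern "C" void USB_LP_CAN1_RX0_IRQHandler(void)
{
    // ... forward to your C++ handler code here ...
}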
I'm currently updating a C++ library for Arduino (Specifically 8-bit AVR processors compiled using avr-gcc).
Typically the authors of the default Arduino libraries like to include an extern variable for the class in the header, which is then defined in the class's .cpp file. This, I assume, is basically to have everything ready to go for newbies as built-in objects.
The scenario I have is this: the library I have updated no longer requires the .cpp file, and I have removed it from the library. It wasn't until I went on a final pass checking for bugs that I realized no linker error was produced, despite the fact that a definition wasn't provided for the extern variable in any .cpp file.
This is as simple as I can get it (header file):
struct Foo{
void method() {}
};
extern Foo foo;
Including this code and using it in one or many source files does not cause any linker error. I have tried it in both versions of GCC which Arduino uses (4.3.7, 4.8.1) and with C++11 enabled/disabled.
In my attempt to cause an error, I found it was only possible when doing something like taking the address of the object or modifying the contents of a dummy variable I added.
After discovering this I find it's important to note:
The class functions only return other objects; there is nothing like operators returning references to the object itself, or even a copy.
It only modifies external objects (registers which are effectively volatile uint8_t references in code), and returns temporaries of other classes.
All of the class functions in this header are so basic that they cost less than or equal to the cost of a function call, therefore they are (in my tests) completely in-lined into the caller. A typical statement may create many temporary objects in the call chain, however the compiler sees through these and outputs efficient code modifying registers directly, rather than a set of nested function calls.
I also recall reading in n3797 7.1.1 - 8 that extern can be used on incomplete types, however the class is fully defined whereas the declaration is not (this is probably irrelevant).
I'm led to believe that this may be a result of optimizations at play. I have seen the effect that taking the address has on objects which would otherwise be considered constant and compiled without RAM usage. Adding any layer of indirection to an object whose state the compiler cannot guarantee will cause this RAM-consuming behavior.
So, maybe I've answered my question by simply asking it, however I'm still making assumptions and it bothers me. After quite some time hobby-coding C++, literally the only thing on my list of do-not's is making assumptions.
Really, what I want to know is:
With respect to the working solution I have, is it a simple case of documenting the inability to take the address (cause indirection) of the class?
Is it just an edge case behavior caused by optimizations eliminating the need for something to be linked?
Or is it plain and simple undefined behavior? As in, GCC may have a bug and is permitting code that might fail if optimizations were lowered or disabled?
Or one of you may be lucky enough to be in possession of a decoder ring that can find a suitable paragraph in the standard outlining the specifics.
This is my first question here, so let me know if you would like to know certain details, I can also provide GitHub links to the code if needed.
Edit: As the library needs to be compatible with existing code I need to maintain the ability to use the dot syntax, otherwise I'd simply have a class of static functions.
To remove assumptions for now, I see two options:
Add a .cpp just for the variable definition.
Use a define in the header like #define foo (Foo()) allowing dot syntax via a temporary.
I prefer the method using a define, what does the community think?
Cheers.
Declaring something extern just informs the assembler and the linker that whenever you use that label/symbol, it should refer to an entry in the symbol table, instead of a locally allocated symbol.
The role of the linker is to replace symbol table entries with an actual reference to the address space whenever possible.
If you don't use the symbol at all in your C file, it will not show up in the assembly code, and thus will not cause any linker error when your module is linked with others, since there is no undefined reference.
It is either an edge-case behaviour caused by optimization, or you never use the foo variable in your code. I'm not 100% sure it is formally not undefined behavior, but I'm quite sure it isn't undefined from a practical point of view.
extern variables are implemented in such a way that code compiled with them produces so-called relocations - empty places where the address of the variable should be placed - which are then filled in by the linker. Apparently foo is never used in your code in a way that would require taking its address, and therefore the linker doesn't even try to find that symbol. If you turn optimization off (-O0) you will probably get a linker error.
Update: If you want to keep the "dot notation" but remove the problem with the undefined extern, you may replace extern with static (in the header file), creating a separate "instance" of the variable for each TU. As this variable is going to be optimized out anyway, this will not change the generated code at all, but it will also work for an unoptimized build.
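Applied to the header from the question, that suggestion looks like this:
// header file
struct Foo{
    void method() {}
};
static Foo foo;   // internal linkage: each TU gets its own copy, which the optimizer removes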
When reading through some answers to this question, I started wondering why the compiler actually does need to know about a function when it first encounters it. Wouldn't it be simple to just add an extra pass when parsing a compilation unit that collects all symbols declared within, so that the order in which they are declared and used does not matter anymore?
One could argue, that declaring functions before they are used certainly is good style, but I am wondering, is there are any other reason why this is mandatory in C++?
Edit - An example to illustrate: Suppose you have two functions that are defined inline in a header file. These two functions call each other (maybe a recursive tree traversal, where odd and even layers of the tree are handled differently). The only way to resolve this would be to make a forward declaration of one of the functions before the other.
A more common example (though with classes, not functions) is the case of classes with private constructors and factories. The factory needs to know the class in order to create instances of it, and the class needs to know the factory for the friend declaration.
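That case looks roughly like this (a sketch with made-up names), where one of the two names has to be introduced before the other can refer to it:
class Factory;                    // forward declaration of the factory
class Widget {
    Widget() {}                   // private constructor
    friend class Factory;         // only the factory may construct Widgets
};
class Factory {
public:
    Widget* create() { return new Widget(); }
};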
If this requirement is from the olden days, why was it not removed at some point? It would not break existing code, would it?
How do you propose to resolve undeclared identifiers that are defined in a different translation unit?
C++ has no module concept, but has separate translation as an inheritance from C. A C++ compiler will compile each translation unit by itself, not knowing anything about other translation units at all. (Except that export broke this, which is probably why it, sadly, never took off.)
Header files, which are where you usually put declarations of identifiers that are defined in other translation units, are really just a very clumsy way of slipping the same declarations into different translation units. They will not make the compiler aware of there being other translation units with identifiers defined in them.
Edit re your additional examples:
With all the textual inclusion instead of a proper module concept, compilation already takes agonizingly long for C++, so requiring another compilation pass (where compilation already is split into several passes, not all of which can be optimized and merged, IIRC) would worsen an already bad problem. And changing this would probably alter overload resolution in some scenarios and thus break existing code.
Note that C++ does require an additional pass for parsing class definitions, since member functions defined inline in the class definition are parsed as if they were defined right behind the class definition. However, this was decided when C with Classes was thought up, so there was no existing code base to break.
Historically C89 let you do this. The first time the compiler saw a use of a function and it didn't have a predefined prototype, it "created" a prototype that matched the use of the function.
When C++ decided to add strict type checking to the compiler, it was decided that prototypes were now required. Also, C++ inherited single-pass compilation from C, so it couldn't add a second pass to resolve all symbols.
Because C and C++ are old languages. Early compilers didn't have a lot of memory, so these languages were designed so a compiler can just read the file from top to bottom, without having to consider the file as a whole.
I think of two reasons:
It makes the parsing easy. No extra pass needed.
It also defines scope: symbols/names are available only after their declaration. That means if I declare a global variable int g_count;, the code after that line can use it, but not the code before it! The same argument applies to global functions.
As an example, consider this code:
void g(double)
{
cout << "void g(double)" << endl;
}
void f()
{
g(int());//this calls g(double) - because that is what is visible here
}
void g(int)
{
cout << "void g(int)" << endl;
}
int main()
{
f();
g(int());//calls g(int) - because that is what is the best match!
}
Output:
void g(double)
void g(int)
See the output at ideone : http://www.ideone.com/EsK4A
The main reason is to make the compilation process as efficient as possible. If you add an extra pass you're adding both time and storage. Remember that C++ was developed back before the time of Quad Core Processors :)
The C programming language was designed so that the compiler could be implemented as a one-pass compiler. In such a compiler, each compilation phase is only executed once, and you cannot refer to an entity that is defined later in the source file.
Moreover, in C, the compiler only interprets a single compilation unit (generally a .c file and all the included .h files) at a time. So you needed a mechanism to refer to a function defined in another compilation unit.
The decision to allow a one-pass compiler and to be able to split a project into small compilation units was made because, at the time, the available memory and processing power were really tight. Allowing forward declarations could easily solve the issue with a single feature.
The C++ language was derived from C and inherited the feature from it (as it wanted to be as compatible with C as possible to ease the transition).
I guess because C is quite old, and at the time C was designed, efficient compilation was a problem because CPUs were much slower.
Since C++ is a statically typed language, the compiler needs to check whether each value's type is compatible with the type expected by the function's parameters. Of course, if you don't know the function signature, you can't do this kind of check, thus defeating the purpose of a static compiler. But since you have a silver badge in C++, I think you already know this.
The C++ language specs were written this way because the designers didn't want to force a multi-pass compiler back when hardware was not as fast as what is available today. In the end, I think that if C++ were designed today, this requirement would go away; but then, we would have another language :-).
One of the biggest reasons why this was made mandatory even in C99 (compared to C89, where you could have implicitly-declared functions) is that implicit declarations are very error-prone. Consider the following code:
First file:
#include <stdio.h>
void doSomething(double x, double y)
{
printf("%g %g\n",x,y);
}
Second file:
int main()
{
doSomething(12345,67890);
return 0;
}
This program is a syntactically valid* C89 program. You can compile it with GCC using this command (assuming the source files are named test.c and test0.c):
gcc -std=c89 -pedantic-errors test.c test0.c -o test
Why does it print something strange (at least on linux-x86 and linux-amd64)? Can you spot the problem in the code at a glance? Now try replacing c89 with c99 in the command line — and you'll be immediately notified about your mistake by the compiler.
The same goes for C++. But in C++ there are actually other important reasons why function declarations are needed; they are discussed in other answers.
* But has undefined behavior
Still, you can have a use of a function before it is declared sometimes (to be strict in the wording: "before" is about the order in which the program source is read) -- inside a class!:
class A {
public:
static void foo(void) {
bar();
}
private:
static void bar(void) {
return;
}
};
int main() {
A::foo();
return 0;
}
(Changing the class to a namespace doesn't work, per my tests.)
That's probably because the compiler actually puts the member-function definitions from inside the class right after the class definition, as someone has pointed out here in the answers.
The same approach could be applied to the whole source file: first, take in only the declarations, then handle everything that was postponed. (Either a two-pass compiler, or enough memory to hold the postponed source code.)
Haha! So, they thought a whole source file would be too large to hold in the memory, but a single class with function definitions wouldn't: they can allow for a whole class to sit in the memory and wait until the declaration is filtered out (or do a 2nd pass for the source code of classes)!
I remember that with Unix and Linux, you have Global and Local. Within your own environment, Local works for functions, but it does not work for Global (system). You must declare the function Global.
In C++, declaration and definition of functions, variables and constants can be separated like so:
void someFunc();
void someFunc()
{
//Implementation.
}
In fact, in the definition of classes, this is often the case. A class is usually declared with its members in a .h file, and these are then defined in a corresponding .cpp file.
What are the advantages & disadvantages of this approach?
Historically this was to help the compiler. You had to give it the list of names before it used them - whether this was the actual usage, or a forward declaration (C's default function prototypes aside).
Modern compilers for modern languages show that this is no longer a necessity, so C & C++'s (as well as Objective-C's, and probably others') syntax here is historical baggage. In fact this is one of the big problems with C++ that even the addition of a proper module system will not solve.
Disadvantages are: lots of heavily nested include files (I've traced include trees before, they are surprisingly huge) and redundancy between declaration and definition - all leading to longer coding times and longer compile times (ever compared the compile times between comparable C++ and C# projects? This is one of the reasons for the difference). Header files must be provided for users of any components you provide. Chances of ODR violations. Reliance on the pre-processor (many modern languages do not need a pre-processor step), which makes your code more fragile and harder for tools to parse.
Advantages: not much. You could argue that you get a list of function names grouped together in one place for documentation purposes - but most IDEs have some sort of code-folding ability these days, and projects of any size should be using doc generators (such as doxygen) anyway. With a cleaner, pre-processor-less, module-based syntax it is easier for tools to follow your code and provide this and more, so I think this "advantage" is just about moot.
It's an artefact of how C/C++ compilers work.
As a source file gets compiled, the preprocessor substitutes each #include-statement with the contents of the included file. Only afterwards does the compiler try to interpret the result of this concatenation.
The compiler then goes over that result from beginning to end, trying to validate each statement. If a line of code invokes a function that hasn't been declared previously, it'll give up.
There's a problem with that, though, when it comes to mutually recursive function calls:
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, foo won't compile as bar is unknown. If you switch the two functions around, bar won't compile as foo is unknown.
If you separate declaration and definition, though, you can order the functions as you wish:
void foo();
void bar();
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, when the compiler processes foo it already knows the signature of a function called bar, and is happy.
Of course compilers could work in a different way, but that's how they work in C, C++ and to some degree Objective-C.
Disadvantages:
None directly. If you're using C/C++ anyway, it's the best way to do things. If you've got a choice of language/compiler, then maybe you can pick one where this is not an issue. The only thing to consider with splitting declarations into header files is to avoid mutually recursive #include-statements - but that's what include guards are for.
Advantages:
Compilation speed: As all included files are concatenated and then parsed, reducing the amount and complexity of code in included files will improve compilation time.
Avoid code duplication/inlining: If you fully define a function in a header file, each object file that includes this header and references this function will contain its own version of that function. As a side-note, if you want inlining, you need to put the full definition into the header file (on most compilers).
Encapsulation/clarity: A well defined class/set of functions plus some documentation should be enough for other developers to use your code. There is (ideally) no need for them to understand how the code works - so why require them to sift through it? (The counter-argument that it may be useful for them to access the implementation when required still stands, of course).
And of course, if you're not interested in exposing a function at all, you can usually still choose to define it fully in the implementation file rather than the header.
The standard requires that when using a function, a declaration must be in scope. This means, that the compiler should be able to verify against a prototype (the declaration in a header file) what you are passing to it. Except of course, for functions that are variadic - such functions do not validate arguments.
Think of C, back when this was not required. At that time, compilers treated a missing return type specification as defaulting to int. Now, assume you had a function foo() which returned a pointer to void. However, since you did not have a declaration, the compiler would think that it has to return an integer. On some Motorola systems, for example, integers and pointers were returned in different registers. The compiler would then no longer use the correct register and would instead return your pointer cast to an integer in the other register. The moment you try to work with this pointer -- all hell breaks loose.
Declaring functions within the header is fine. But remember, if you declare and define a function in the header, make sure it is inline. One way to achieve this is to put the definition inside the class definition; otherwise, prepend the inline keyword. You will otherwise run into ODR violations when the header is included in multiple implementation files.
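For example, a minimal sketch of a header that is safe to include from several implementation files:
// util.h
inline int twice(int x) { return 2 * x; }   // explicit inline keyword
struct Counter {
    int value;
    Counter() : value(0) {}
    void bump() { ++value; }   // defined inside the class definition: implicitly inline
};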
There are two main advantages to separating declaration and definition into C++ header and source files. The first is that you avoid problems with the One Definition Rule when your class/functions/whatever are #included in more than one place. Secondly, by doing things this way, you separate interface and implementation. Users of your class or library need only to see your header file in order to write code that uses it. You can also take this one step farther with the Pimpl Idiom and make it so that user code doesn't have to recompile every time the library implementation changes.
You've already mentioned the disadvantage of code repetition between the .h and .cpp files. Maybe I've written C++ code for too long, but I don't think it's that bad. You have to change all user code every time you change a function signature anyway, so what's one more file? It's only annoying when you're first writing a class and you have to copy-and-paste from the header to the new source file.
The other disadvantage in practice is that in order to write (and debug!) good code that uses a third-party library, you usually have to see inside it. That means access to the source code even if you can't change it. If all you have is a header file and a compiled object file, it can be very difficult to decide if the bug is your fault or theirs. Also, looking at the source gives you insight into how to properly use and extend a library that the documentation might not cover. Not everyone ships an MSDN with their library. And great software engineers have a nasty habit of doing things with your code that you never dreamed possible. ;-)
Advantage
Classes can be referenced from other files by just including the declaration. Definitions can then be linked later on in the compilation process.
You basically have 2 views on the class/function/whatever:
The declaration, where you declare the name, the parameters and the members (in the case of a struct/class), and the definition, where you define what the function does.
Amongst the disadvantages is repetition, yet one big advantage is that you can declare your function as int foo(float f) and leave the details in the implementation (the definition). Anyone who wants to use your function foo just includes your header file and links to your library/object file, so library users as well as compilers only have to care about the defined interface, which helps in understanding the interfaces and speeds up compile times.
One advantage that I haven't seen yet: API
Any library or 3rd-party code that is NOT open source (i.e. proprietary) will not ship its implementation along with the distribution. Most companies are just plain not comfortable with giving away source code. The easy solution: just distribute the class declarations and function signatures that allow use of the DLL.
Disclaimer: I'm not saying whether it's right, wrong, or justified, I'm just saying I've seen it a lot.
One big advantage of forward declarations is that when used carefully you can cut down the compile time dependencies between modules.
If ClassA.h needs to refer to a data element in ClassB.h, you can often use just a forward reference in ClassA.h and include ClassB.h in ClassA.cc rather than in ClassA.h, thus cutting down a compile-time dependency.
For big systems this can be a huge time saver on a build.
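A sketch of that pattern with the file names above:
// ClassA.h -- no #include "ClassB.h" required here
class ClassB;                  // forward declaration
class ClassA {
public:
    void useB(ClassB& b);      // declared only
private:
    ClassB* b_;                // pointers/references to incomplete types are fine
};
// ClassA.cc
#include "ClassA.h"
#include "ClassB.h"            // the full definition is only needed here
void ClassA::useB(ClassB& b) { /* ... */ }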
Disadvantage
This leads to a lot of repetition. Most of the function signatures need to be written in two or more places (as Paulious noted).
Separation gives a clean, uncluttered view of program elements.
Possibility to create and link to binary modules/libraries without disclosing sources.
Link binaries without recompiling sources.
When done correctly, this separation reduces compile times when only the implementation has changed.