Why do functions need to be declared before they are used? - c++

When reading through some answers to this question, I started wondering why the compiler actually needs to know about a function when it first encounters it. Wouldn't it be simpler to just add an extra pass when parsing a compilation unit that collects all symbols declared within it, so that the order in which they are declared and used no longer matters?
One could argue that declaring functions before they are used certainly is good style, but I am wondering: is there any other reason why this is mandatory in C++?
Edit - An example to illustrate: Suppose you have two functions that are defined inline in a header file. These two functions call each other (maybe a recursive tree traversal, where odd and even layers of the tree are handled differently). The only way to resolve this would be to make a forward declaration of one of the functions before the other.
A more common example (though with classes, not functions) is the case of classes with private constructors and factories. The factory needs to know the class in order to create instances of it, and the class needs to know the factory for the friend declaration.
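A minimal sketch of that factory case (names hypothetical): a forward declaration lets the class name the factory as a friend before the factory itself is defined.
class Factory; // forward declaration of the factory
class Product {
friend class Factory; // the factory may call the private constructor
private:
Product() {}
};
class Factory {
public:
static Product create() { return Product(); }
};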
If this requirement is from the olden days, why was it not removed at some point? It would not break existing code, would it?

How do you propose to resolve undeclared identifiers that are defined in a different translation unit?
C++ has no module concept, but has separate translation as an inheritance from C. A C++ compiler will compile each translation unit by itself, not knowing anything about other translation units at all. (Except that export broke this, which is probably why it, sadly, never took off.)
Header files, which is where you usually put declarations of identifiers which are defined in other translation units, actually are just a very clumsy way of slipping the same declarations into different translation units. They will not make the compiler aware of there being other translation units with identifiers being defined in them.
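For illustration, a minimal sketch of that mechanism (file names hypothetical): the header only repeats the same declaration into each translation unit, and the linker later connects them.
// add.h - the declaration every translation unit gets to see
int add(int, int);

// add.cpp - the one definition, compiled without knowledge of its users
int add(int a, int b) { return a + b; }

// main.cpp - compiled knowing only the declaration from the header
#include "add.h"
int main() { return add(1, 2); }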
Edit re your additional examples:
With all the textual inclusion instead of a proper module concept, compilation already takes agonizingly long for C++, so requiring another compilation pass (where compilation already is split into several passes, not all of which can be optimized and merged, IIRC) would worsen an already bad problem. And changing this would probably alter overload resolution in some scenarios and thus break existing code.
Note that C++ does require an additional pass for parsing class definitions, since member functions defined inline in the class definition are parsed as if they were defined right behind the class definition. However, this was decided when C with Classes was thought up, so there was no existing code base to break.

Historically C89 let you do this. The first time the compiler saw a use of a function and it didn't have a predefined prototype, it "created" a prototype that matched the use of the function.
When C++ decided to add strict typechecking to the compiler, it was decided that prototypes were now required. Also, C++ inherited single-pass compilation from C, so it couldn't add a second pass to resolve all symbols.

Because C and C++ are old languages. Early compilers didn't have a lot of memory, so these languages were designed so a compiler can just read the file from top to bottom, without having to consider the file as a whole.

I can think of two reasons:
It makes parsing easy. No extra pass needed.
It also defines scope; symbols/names are available only after their declaration. This means that if I declare a global variable int g_count;, the code after that line can use it, but not the code before it! The same argument applies to global functions.
As an example, consider this code:
#include <iostream>
using namespace std;

void g(double)
{
cout << "void g(double)" << endl;
}
void f()
{
g(int());//this calls g(double) - because that is what is visible here
}
void g(int)
{
cout << "void g(int)" << endl;
}
int main()
{
f();
g(int());//calls g(int) - because that is what is the best match!
}
Output:
void g(double)
void g(int)
See the output at ideone: http://www.ideone.com/EsK4A

The main reason will be to make the compilation process as efficient as possible. If you add an extra pass you're adding both time and storage. Remember that C++ was developed back before the time of Quad Core Processors :)

The C programming language was designed so that the compiler could be implemented as a one-pass compiler. In such a compiler, each compilation phase is only executed once, and you cannot refer to an entity that is defined later in the source file.
Moreover, in C, the compiler only interprets a single compilation unit (generally a .c file and all the included .h files) at a time, so you needed a mechanism to refer to a function defined in another compilation unit.
The decision to allow a one-pass compiler and to be able to split a project into small compilation units was taken because at the time the available memory and processing power were really tight. And allowing forward declarations could easily solve both issues with a single feature.
The C++ language was derived from C and inherited the feature from it (as it wanted to be as compatible with C as possible to ease the transition).

I guess because C is quite old, and at the time C was designed, efficient compilation was a problem because CPUs were much slower.

Since C++ is a statically typed language, the compiler needs to check whether each value's type is compatible with the type expected in the function's parameters. Of course, if you don't know the function signature, you can't do this kind of check, thus defeating the purpose of a static compiler. But, since you have a silver badge in C++, I think you already know this.
The C++ language specs were made that way because the designer didn't want to force a multi-pass compiler at a time when hardware was not as fast as what is available today. In the end, I think that if C++ were designed today, this imposition would go away, but then we would have another language :-).

One of the biggest reasons why this was made mandatory even in C99 (compared to C89, where you could have implicitly-declared functions) is that implicit declarations are very error-prone. Consider the following code:
First file:
#include <stdio.h>
void doSomething(double x, double y)
{
printf("%g %g\n",x,y);
}
Second file:
int main()
{
doSomething(12345,67890);
return 0;
}
This program is a syntactically valid* C89 program. You can compile it with GCC using this command (assuming the source files are named test.c and test0.c):
gcc -std=c89 -pedantic-errors test.c test0.c -o test
Why does it print something strange (at least on linux-x86 and linux-amd64)? Can you spot the problem in the code at a glance? Now try replacing c89 with c99 in the command line — and you'll be immediately notified about your mistake by the compiler.
Same with C++. But in C++ there are actually other important reasons why function declarations are needed; they are discussed in other answers.
* But has undefined behavior

Still, you can have a use of a function before it is declared sometimes (to be strict in the wording: "before" is about the order in which the program source is read) -- inside a class!:
class A {
public:
static void foo(void) {
bar();
}
private:
static void bar(void) {
return;
}
};
int main() {
A::foo();
return 0;
}
(Changing the class to a namespace doesn't work, per my tests.)
That's probably because the compiler actually puts the member-function definitions from inside the class right after the class definition, as someone has pointed out here in the answers.
The same approach could be applied to the whole source file: first, drop everything but declaration, then handle everything postponed. (Either a two-pass compiler, or large enough memory to hold the postponed source code.)
Haha! So, they thought a whole source file would be too large to hold in memory, but a single class with function definitions wouldn't be: they can allow a whole class to sit in memory and wait until the declaration part has been filtered out (or do a second pass over the source code of classes)!

I remember that with Unix and Linux you have Global and Local. Within your own environment, local works for functions, but it does not work for Global (system). You must declare the function Global.

Related

Is it safe to use #ifdef guards on C++ class member functions?

Suppose you have the following definition of a C++ class:
class A {
// Methods
#ifdef X
// Hidden methods in some translation units
#endif
};
Is this a violation of the One Definition Rule for the class? What are the associated hazards?
I suspect if member function pointers or virtual functions are used this will most likely break. Otherwise is it safe to use?
I am considering it in the context of Objective C++. The header file is included in both pure C++ and Objective C++ translation units. My idea is to guard methods with Objective-C types with an OBJC macro. Otherwise, I have to use void pointers for all Objective-C types in the header, but this way I am losing strong typing and also ugly static casts must be added all over the code.
Yes, it MAY create a hazard of an ODR breach if separate compilation units are allowed to have a different state of the macro definition X. For the program to be compliant, X should be defined (or not defined) globally across the program (and shared objects) before every inclusion of that class definition. As far as the C++ compiler (not the preprocessor) is concerned, those are two different, incompatible, unrelated class types.
Imagine a situation where in compilation unit A.cpp X was defined before class A, and in unit B.cpp X was not defined. You would not get any compiler errors if nothing within B.cpp uses those members which were "removed"; both units may be considered well-formed on their own. Now if B.cpp contained a new expression, it would create an object of incompatible type, smaller than the one defined in A.cpp. Any method of class A, including the constructor, may then cause UB by accessing memory outside of the object's storage when called on an object created in B.cpp, because those methods use the larger definition.
There is a variation of this folly: including copies of a header file with the same file name and a POD struct type into two or more different folders of the build tree, one of those folders accessible via #include <filename>, with units using #include "filename" designed to pick up the alternatives. But they wouldn't, because the order of header file lookup in this case is implementation-defined, so the programmer isn't entirely in control of which header gets included in which unit on every platform. As soon as one definition is changed, even just by re-ordering members, the ODR is breached.
To be particularly safe, such things should be done only within the compiler's domain, by using templates, PIMPL, etc. For inter-language communication some middle ground should be arranged, using wrappers or adapters; C++ and Objective-C++ may have incompatible memory layouts for non-POD objects.
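As a hedged sketch of the PIMPL route (names hypothetical): the header exposes one fixed layout to every translation unit, while the Objective-C++ details live only in the implementation file.
// Widget.h - identical class layout in every translation unit
#include <memory>
class Widget {
public:
Widget();
~Widget(); // defined out of line, where Impl is a complete type
void draw();
private:
struct Impl; // defined only in Widget.mm (or Widget.cpp)
std::unique_ptr<Impl> pimpl; // same size everywhere
};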
This blows up horribly. Do not do this. Example with gcc:
Header file:
// a.h
class Foo
{
public:
Foo() { ; }
#ifdef A
virtual void IsCalled();
#endif
virtual void NotCalled();
};
First C++ File:
// a1.cpp
#include <iostream>
#include "a.h"
void Foo::NotCalled()
{
std::cout << "This function is never called" << std::endl;
}
extern Foo* getFoo();
extern void IsCalled(Foo *f);
int main()
{
Foo* f = getFoo();
IsCalled(f);
}
Second C++ file:
// a2.cpp
#define A
#include "a.h"
#include <iostream>
void Foo::IsCalled(void)
{
std::cout << "We call this function, but ...?!" << std::endl;
}
void IsCalled(Foo *f)
{
f->IsCalled();
}
Foo* getFoo()
{
return new Foo();
}
Result:
This function is never called
Oops! The code called the virtual function IsCalled and we dispatched to NotCalled, because the two translation units disagreed on which entry was where in the class's virtual function table.
What went wrong here? We violated the ODR. The two translation units disagree about what is supposed to be where in the virtual function table. So if we create an object in one translation unit and call a virtual function on it from another translation unit, we may call the wrong virtual function. Oopsie whoopsie!
Please do not deliberately do things that the relevant standards say are not allowed and will not work. You will never be able to think of every possible way it can go wrong. This kind of reasoning has caused many disasters over my decades of programming, and I really wish people would stop deliberately and intentionally creating potential disasters.
Is it safe to use #ifdef guards on C++ class member functions?
In practice (look at the generated assembler code using GCC as g++ -O2 -fverbose-asm -S) what you propose to do is safe. In theory it should not be.
However, there is another practical approach (used in Qt and FLTK). Use some naming conventions in your "hidden" methods (e.g. document that all of them should have dontuse in their name like int dontuseme(void)), and write your GCC plugin to warn against them at compile time. Or just use some clever grep(1) in your build process (e.g. in your Makefile)
Alternatively, your GCC plugin may implement new #pragma-s or function attributes, and could warn against misuse of such functions.
Of course, you can also use (cleverly) private: and most importantly, generate C++ code (with a generator like SWIG) in your build procedure.
So practically speaking, your #ifdef guards may be useless. And I am not sure they make the C++ code more readable.
If performance matters (with GCC), use the -flto -O2 flags at both compile and link time.
See also GNU autoconf -which uses similar preprocessor based approaches.
Or use some other preprocessor or C++ code generator (GNU m4, GPP, your own one made with ANTLR or GNU bison) to generate some C++ code. Like Qt does with its moc.
So my opinion is that what you want to do is useless. Your unstated goals can be achieved in many other ways. For example, generating "random" looking C++ identifiers (or C identifiers, or ObjectiveC++ names, etc....) like _5yQcFbU0s (this is done in RefPerSys) - the accidental collision of names is then very improbable.
In a comment you state:
Otherwise, I have to use void* for all Objective-C types in the header, but this way I am losing strong typing
No, you can generate some inline C++ functions (that would use reinterpret_cast) to regain that strong typing. Qt does so! FLTK or FOX or GTKmm also generate C++ code (since GUI code is easy to generate).
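A hedged sketch of what such a generated wrapper could look like (names hypothetical): the pure C++ side sees only an opaque type, and thin inline functions regain the typing over the void* boundary.
// objc_bridge.h - usable from pure C++ translation units
struct OpaqueLabel; // opaque stand-in for an Objective-C type

inline OpaqueLabel* to_label(void* raw)
{
return reinterpret_cast<OpaqueLabel*>(raw); // typed view of the raw pointer
}
inline void* to_raw(OpaqueLabel* label)
{
return reinterpret_cast<void*>(label); // back to the untyped boundary
}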
My idea was to guard methods with Objective-C types with OBJC macro
This makes perfect sense if you generate some C++ or C or Objective-C code with these macros.
I suspect if member function pointers or virtual functions are used this will most likely break.
In practice, it won't break if you generate random-looking C++ identifiers. Or simply if you document naming conventions (like GNU bison or ANTLR do) in the generated C++ code (or in generated Objective-C++, or generated C, ... code).
Please notice that compilers like GCC today (in 2021) internally use several C++ code generators. So generating C++ code is a common practice. In practice, the risks of name collisions are small if you take care to generate "random" identifiers (you could store them in some sqlite database at build time).
also ugly static casts must be added all over the code
These casts don't matter if the ugly code is generated.
As examples, RPCGEN and SWIG -or Bisoncpp- generate ugly C and C++ code which works very well (and perhaps also some proprietary ASN.1 or JSON or HTTP or SMTP or XML related in-house code generators).
The header file is included in both pure C++ and Objective C++ translation units.
An alternative approach is to generate two different header files...
one for C++, and another for Objective C++. The SWIG tool could be inspirational. Of course your (C or C++ or Objective C) code generators would emit random looking identifiers.... Like I do in both Bismon (generating random looking C names like moduleinit_9oXtCgAbkqv_4y1xhhF5Nhz_BM) and RefPerSys (generating random looking C++ names like rpsapply_61pgHb5KRq600RLnKD ...); in both systems accidental name collision is very improbable.
Of course, in principle, using #ifdef guards is not safe, as explained in this answer.
PS. A few years ago I did work on GCC MELT which generated millions of lines of C++ code for some old versions of the GCC compiler. Today -in 2021- you practically could use asmjit or libgccjit to generate machine code more directly. Partial evaluation is then a good conceptual framework.

Is it practical to use Header files without a partner Class/Cpp file in C++

I've recently picked up C++ as part of my course, and I'm trying to understand in more depth the partnership between headers and classes. From every example or tutorial I've looked up on header files, they all use a class file with a constructor and then follow up with methods if they were included. However I'm wondering if it's fine just using header files to hold a group of related functions without the need to make an object of the class every time you want to use them.
//main file
#include <iostream>
#include "Example.h"
#include "Example2.h"
int main()
{
//Example 1
Example a; //I have to create an object of the class first
a.square(4); //Then I can call the function
//Example 2
square(4); //I can call the function without the need of a constructor
std::cin.get();
}
In the first example I create an object and then call the function; I use the two files 'Example.h' and 'Example.cpp'.
//Example1 cpp
#include <iostream>
#include "Example.h"
void Example::square(int i)
{
i *= i;
std::cout << i << std::endl;
}
//Example1 header
class Example
{
public:
void square(int i);
};
In example2 I call the function directly from file 'Example2.h' below
//Example2 header
#include <iostream>

void square(int i)
{
i *= i;
std::cout << i;
}
Ultimately I guess what I'm asking is whether it's practical to use just the header file to hold a group of related functions without creating a related class file. And if the answer is no, what's the reason behind that? Either way I'm sure I've overlooked something, but as ever I appreciate any kind of insight from you guys on this!
Of course, it's just fine to have only headers (as long as you consider the One Definition Rule as already mentioned).
You can as well write C++ sources without any header files.
Strictly speaking, headers are nothing else than filed pieces of source code which might be #included (i.e. pasted) into multiple C++ source files (i.e. translation units). Remembering this basic fact was sometimes quite helpful for me.
I made the following contrived counter-example:
main.cc:
#include <iostream>
// define float
float aFloat = 123.0;
// make it extern
extern float aFloat;
/* This should be included from a header
* but instead I prevent the pre-processor usage
* and simply do it by myself.
*/
extern void printADouble();
int main()
{
std::cout << "printADouble(): ";
printADouble();
std::cout << "\n"
"Surprised? :-)\n";
return 0;
}
printADouble.cc:
#include <iostream>

/* This should be included from a header
* but instead I prevent the pre-processor usage
* and simply do it by myself.
*
* This is intentionally of wrong type
* (to show how it can be done wrong).
*/
// use extern aFloat
extern double aFloat;
// make it extern
extern void printADouble();
void printADouble()
{
std::cout << aFloat;
}
Hopefully, you have noticed that I declared
extern float aFloat in main.cc
extern double aFloat in printADouble.cc
which is a disaster.
Problem when compiling main.cc? No. The translation unit is consistent syntactically and semantically (for the compiler).
Problem when compiling printADouble.cc? No. The translation unit is consistent syntactically and semantically (for the compiler).
Problem when linking this mess together? No. The linker can resolve every needed symbol.
Output:
printADouble(): 5.55042e-315
Surprised? :-)
as expected (assuming that, like me, you expected nothing sensible).
Live Demo on wandbox
printADouble() accessed the defined float variable (4 bytes) as double variable (8 bytes). This is undefined behavior and goes wrong on multiple levels.
So, using headers doesn't enforce but merely enables (some kind of) modular programming in C++. (I didn't recognize the difference until I once had to use a C compiler which did not (yet) have a pre-processor. So the issue sketched above hit me very hard but was really enlightening for me, also.)
IMHO, header files are a pragmatic replacement for an essential feature of modular programming (i.e. the explicit definition of interfaces and the separation of interfaces and implementations as a language feature). This seems to have annoyed other people as well. Have a look at A Few Words on C++ Modules to see what I mean.
C++ has a One Definition Rule (ODR). This rule states that functions and objects should be defined only once. Here's the problem: headers are often included more than once. Your square(int) function might therefore be defined twice.
The ODR is not an absolute rule. If you declare square as
//Example2 header
inline void square(int i)
// ^^^
{
i *= i;
std::cout << i;
}
then the compiler will inform the linker that there are multiple definitions possible. It's your job then to make sure all inline definitions are identical, so don't redefine square(int) elsewhere.
Templates and class definitions are exempt; they can appear in headers.
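For instance, a function template may sit fully defined in a header without inline (a minimal sketch):
// in a header: templates are exempt from this ODR trap
template <typename T>
T square(T x) { return x * x; }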
C++ is a multi-paradigm programming language; it can be (at least):
procedural (driven by condition and loops)
functional (driven by recursion and specialization)
object oriented
declarative (providing compile-time arithmetic)
See a few more details in this quora answer.
Object oriented paradigm (classes) is only one of the many that you can leverage programming in C++.
You can mix them all, or just stick to one or a few, depending on what's the best approach for the problem you have to solve with your software.
So, to answer your question:
yes, you can group a bunch of (preferably inter-related) functions in the same header file. This is more common in the "old" C programming language, or in more strictly procedural languages.
That said, as in MSalters' answer, just be conscious of the C++ One Definition Rule (ODR). Use the inline keyword if you put the definition of the function (its body) in the header and not only its declaration (templates exempted).
See this SO answer for description of what "declaration" and "definition" are.
Additional note
To enforce the answer, and extend it to also other programming paradigms in C++,
in the last few years there has been a trend of putting a whole library (functions and/or classes) in a single header file.
This can commonly and openly be seen in open source projects; just go to github or gitlab and search for "header-only".
The common way is and always has been to put code in .cpp files (or whatever extension you like) and declarations in headers.
There is occasionally some merit to putting code in the header: it can allow more clever inlining by the compiler. But at the same time, it can destroy your compile times, since all the code has to be processed every time it is included by the compiler.
Finally, it is often annoying to have circular object relationships (sometimes desired) when all the code is in headers.
One exception is templates. Many newer "modern" libraries such as Boost make heavy use of templates and often are "header only". However, this should only be done when dealing with templates, as it is the only way to do it with them.
Some downsides of writing header-only code
If you search around, you will see quite a lot of people trying to find a way to reduce compile times when dealing with Boost. For example: How to reduce compilation times with Boost Asio, which describes a 14s compile of a single 1K file with Boost included. 14s may not seem to be "exploding", but it is certainly a lot longer than typical and can add up quite quickly when dealing with a large project. Header-only libraries do affect compile times in a quite measurable way. We just tolerate it because Boost is so useful.
Additionally, there are many things which cannot be done in headers only (even Boost has libraries you need to link to for certain parts such as threads, filesystem, etc). A primary example is that you cannot have simple global objects in header-only libs (unless you resort to the abomination that is a singleton), as you will run into multiple-definition errors. NOTE: C++17's inline variables will make this particular example doable in the future.
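A sketch of that C++17 case (names hypothetical): an inline variable may live in a header without triggering multiple-definition errors.
// config.h - C++17: one shared global across all translation units
#include <string>
inline std::string g_app_name = "demo"; // every TU sees the same object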
To be more specific about Boost: Boost is a library, not user-level code, so it doesn't change that often. In user code, if you put everything in headers, every little change will cause you to have to recompile the entire project. That's a monumental waste of time (and is not the case for libraries that don't change from compile to compile). When you split things between header/source and, better yet, use forward declarations to reduce includes, you can save hours of recompiling when added up across a day.

C++ - What is the use of inline keyword if we should define method in header?

Based on the answer here, it is only needed to define the method inside the header file in order to make it inline. So my question is: why is there an inline keyword?
For example, if you want to define a free function inside a header (not a member function of a class), you will need to declare it inline, otherwise including it in multiple translation units will cause an ODR violation.
The usual approach is to declare the function in the header and define it in a separately compiled .cpp file. However, defining it in the header as an inline function allows every translation unit that includes it to have its definition available, which makes it easier for that function to actually be inlined; this is important if the function is heavily used.
The historical reason for the inline keyword is that older C compilers weren't as capable of optimising code as good quality modern C and C++ compilers are. It was therefore, originally, introduced to allow the programmer to indicate a preference to inline a function (effectively insert the function body directly into the caller, which avoids overheads associated with function calls). Since there was no guarantee that a compiler could inline the function (e.g. some compilers could only inline certain types of functions) the inline keyword was made a hint rather than a directive.
The other typical use of inline is not as discretionary - if a function is defined in multiple compilation units (e.g. because the containing header is #included more than once) then leaving off inline causes a violation of the one definition rule (and therefore undefined behaviour). inline is an instruction to the compiler (which in turn probably emits instructions to the linker) to resolve such incidental violations of the one definition rule and produce a working program (e.g. without causing the linker to complain about multiple definitions).
As to why it is still needed, functionally, the preprocessor only does text substitution ... such as replacing an #include directive with contents of the included header.
Let's consider two scenarios. The first has two source files (compilation units) that do;
#include "some_header"
//some other functions needed by the program, not pertinent here
and
#include "some_header"
int main()
{
foo();
}
where some_header contains the definition
void foo() { /* whatever */ }
The second scenario, however, omits the header file (yes, people do that) and has the first source file as
void foo() { /* whatever */ }
//some other functions needed by the program, not pertinent here
and the second as;
void foo() { /* whatever */ }
int main()
{
foo();
}
In both scenarios, assume the two compilation units are compiled and linked together. The result is a violation of the one definition rule (and, practically, typically results in a linker error due to multiply defined symbols).
Under the current rules of the language (specifically, the preprocessor only doing text substitution), the two scenarios are required to be EXACTLY functionally equivalent. If one scenario were to print the value 42, then so should the other. If one has undefined behaviour (as is the case here), so does the other.
But, let's say for sake of discussion, that there was a requirement that a function be magically inlined if it is defined in a header. The code in the two scenarios would no longer be equivalent. The header file version would have defined behaviour (no violation of the one definition rule) and the second would have undefined behaviour.
Oops. We've just broken equivalence of the two scenarios. That may not seem like much, but in practice programmers would have trouble understanding why one version links and the other doesn't. And they would have no way of fixing that other than moving code into a header file.
That means we need some way to make the two scenarios equivalent, so there needs to be something in the code which makes them equivalent. Enter the inline keyword, prefixed to the definitions of foo().
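Concretely, the fix is the same in both scenarios (a sketch):
// in some_header (scenario one), and likewise in both source files (scenario two):
inline void foo() { /* whatever */ } // repeated identical definitions are now permitted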
Now, okay, one might argue that the preprocessor should do something a bit more intelligent i.e. do more than simple text substitution. But, now you are on a slippery slope. The C and C++ standards do (at length) specify that the preprocessor does that. And changing that would introduce a cascade of other changes. Whether that is a good idea or not (and, certainly, there is some advocacy for eliminating the preprocessor from C++ entirely), that is a much bigger change, with numerous ramifications on the language, and on programmers (who, whether it is good or bad, can rely on the preprocessor behaving as it does).
Short answer. It's not required. Compilers generally ignore the inline keyword.
More comprehensive answers are already given in other questions, not to mention the second answer in the question you linked.

inline in C++ and compiler

!! Specifically about frequently used methods like getters & setters !!
I have no idea when the keyword inline should be used. Of course I know what it does, but I still have no idea when to use it.
In an interview, Bjarne Stroustrup said:
My own rule of thumb is to use inlining (explicitly or implicitly) only for simple one- or two-line functions that I know to be frequently used and unlikely to change much over the years. Things like the size() function for a vector. The best uses of inlining is for function where the body is less code than the function call and return mechanism, so that the inlined function is not only faster than a non-inlined version, but also more compact in the object core: smaller and faster.
But I often read that the compiler automatically inline short functions like getter, setter methods (in this case getting the size() of a vector).
Can anyone help?
Edit:
Coming back to this after years and more experience in high-performance C++ programming: inline can indeed help. Working in the games industry, even forceinline sometimes makes a difference, since not all compilers work the same; some might inline automatically, some don't.
My advice is: if you work on frameworks, libraries or any heavily used code, consider the use of inline, but this is just general advice anyway, since you want such code to be fully optimized for any compiler. Always using inline might not be the best, because you'll also need the class definition for this part of the code. Sometimes this can increase compilation times if you can't use forward declarations anymore.
another hint: you can use C++14 auto return type deduction even while separating the function definition:
MyClass.h
class MyClass
{
int myint;
public:
auto GetInt() const;
};
inline auto MyClass::GetInt() const { return myint; }
all in one .h file.
Actually, the inline keyword is not for the compiler anymore, but for the linker.
That is, while inline in a function declaration still serves as a hint for most compilers, at high optimization settings they will inline things without inline and won't inline things with inline, if they deem it better for the resulting code.
Where it is still necessary is for marking function symbols as weak and thus circumventing the One Definition Rule, which says that in a given set of object files you want to make into a binary, each symbol (such as a function) shall be present only once.
Bjarne's quote is old. Modern compilers are pretty smart at it.
That said, if you don't use Link Time Code Generation, the compiler must see the code to inline it. For functions used in multiple .cpp files, that means you need to define them in a header. And to circumvent the One Definition Rule in that case, you must define those functions as inline.
Class members defined inside the class are inline by default, though.
The below speaks specifically to C++:
The inline keyword has nothing to do with inlining.
The inline keyword allows the same function to be defined multiple times in the same program:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.
§3.2 [basic.def.odr]
Attaching meaning beyond this to the inline keyword is erroneous. The compiler is free to inline (or not) anything according to the "as-if rule":
A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.
§1.9 [intro.execution]
Considering what compiler optimizations can do, the only use of inline I have today is for non-template functions whose bodies are defined inside header files, outside class bodies.
Everything that is defined (note: defined != declared) inside class bodies is inline by default, just as templates are.
The meaning of inline in fact is: "defined in a header, potentially included in multiple sources; keep just one copy of it", addressed to the linker.
Maybe in C++35 someone will finally decide to replace that keyword with another, more meaningful one.
C++ standard Section 7.1.2 Point 2:
(...) The inline specifier indicates to the implementation that
inline substitution of the function body at the point of call is to be
preferred to the usual function call mechanism. An implementation
is not required to perform this inline substitution at the point of
call (...)
In other words, instead of having a single copy of the code for your function that is called several times, the compiler may duplicate your code in the various places the function is called. This avoids the little overhead related to the function call, at the cost of bigger executables.
Be aware that the inline keyword may also be used with namespaces, but with a very different meaning. Members of an inline namespace can be used in most respects as though they were members of the enclosing namespace (see the Standard, section 7.3.1 point 8).
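A small sketch of that namespace meaning (names hypothetical):
namespace lib {
inline namespace v2 {
int f() { return 2; } // reachable as lib::v2::f and as lib::f
}
namespace v1 {
int f() { return 1; } // still selectable explicitly as lib::v1::f
}
}
int current = lib::f(); // picks the inline namespace: yields 2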
Edit:
The Google style guide recommends inlining only when a function is ten lines or fewer.
I'm answering the question myself! Solution: after a few performance tests, the rule of thumb from Stroustrup is right! Inlining short functions like .size() from vector can improve performance (.size() calls are used frequently). But the impact is only noticeable for FREQUENTLY used functions. If a getter/setter method is used a lot, inlining it might increase the performance.
Stroustrup:
Don’t make statements about “efficiency” of code without first doing time measurements. Guesses about performance are most unreliable.

What are the advantages and disadvantages of separating declaration and definition as in C++?

In C++, declaration and definition of functions, variables and constants can be separated like so:
void someFunc();
void someFunc()
{
//Implementation.
}
In fact, in the definition of classes, this is often the case. A class is usually declared with its members in a .h file, and these are then defined in a corresponding .C file.
What are the advantages & disadvantages of this approach?
Historically this was to help the compiler. You had to give it the list of names before it used them - whether this was the actual usage, or a forward declaration (C's default function prototype aside).
Modern compilers for modern languages show that this is no longer a necessity, so C and C++'s (as well as Objective-C's, and probably others') syntax here is historical baggage. In fact this is one of the big problems with C++ that even the addition of a proper module system will not solve.
Disadvantages are: lots of heavily nested include files (I've traced include trees before, they are surprisingly huge) and redundancy between declaration and definition - all leading to longer coding times and longer compile times (ever compared the compile times between comparable C++ and C# projects? This is one of the reasons for the difference). Header files must be provided for users of any components you provide. Chances of ODR violations. Reliance on the pre-processor (many modern languages do not need a pre-processor step), which makes your code more fragile and harder for tools to parse.
Advantages: not much. You could argue that you get a list of function names grouped together in one place for documentation purposes - but most IDEs have some sort of code folding ability these days, and projects of any size should be using doc generators (such as doxygen) anyway. With a cleaner, pre-processor-less, module based syntax it is easier for tools to follow your code and provide this and more, so I think this "advantage" is just about moot.
It's an artefact of how C/C++ compilers work.
As a source file gets compiled, the preprocessor substitutes each #include-statement with the contents of the included file. Only afterwards does the compiler try to interpret the result of this concatenation.
The compiler then goes over that result from beginning to end, trying to validate each statement. If a line of code invokes a function that hasn't been defined previously, it'll give up.
There's a problem with that, though, when it comes to mutually recursive function calls:
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, foo won't compile as bar is unknown. If you switch the two functions around, bar won't compile as foo is unknown.
If you separate declaration and definition, though, you can order the functions as you wish:
void foo();
void bar();
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, when the compiler processes foo it already knows the signature of a function called bar, and is happy.
Of course compilers could work in a different way, but that's how they work in C, C++ and to some degree Objective-C.
Disadvantages:
None directly. If you're using C/C++ anyway, it's the best way to do things. If you've got a choice of language/compiler, then maybe you can pick one where this is not an issue. The only thing to consider with splitting declarations into header files is to avoid mutually recursive #include-statements - but that's what include guards are for (see the sketch below).
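For reference, a classic include guard looks like this (header name hypothetical):
// ClassA.h
#ifndef CLASSA_H
#define CLASSA_H
// ... declarations ...
#endif // CLASSA_H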
Advantages:
Compilation speed: As all included files are concatenated and then parsed, reducing the amount and complexity of code in included files will improve compilation time.
Avoid code duplication/inlining: If you fully define a function in a header file, each object file that includes this header and references this function will contain its own version of that function. As a side note, if you want inlining, you need to put the full definition into the header file (on most compilers).
Encapsulation/clarity: A well defined class/set of functions plus some documentation should be enough for other developers to use your code. There is (ideally) no need for them to understand how the code works - so why require them to sift through it? (The counter-argument that it may be useful for them to access the implementation when required still stands, of course.)
And of course, if you're not interested in exposing a function at all, you can usually still choose to define it fully in the implementation file rather than the header.
The standard requires that when using a function, a declaration must be in scope. This means that the compiler should be able to verify against a prototype (the declaration in a header file) what you are passing to it. Except, of course, for functions that are variadic - such functions do not validate arguments.
Think of C, when this was not required. At that time, compilers treated a missing return type specification as defaulting to int. Now, assume you had a function foo() which returned a pointer to void. However, since you did not have a declaration, the compiler would think that it had to return an integer. On some Motorola systems, for example, integers and pointers would be returned in different registers. The compiler would then use the wrong register and return your pointer cast to an integer in the other register. The moment you try to work with this pointer - all hell breaks loose.
Declaring functions within the header is fine. But remember, if you both declare and define in the header, make sure they are inline. One way to achieve this is to put the definition inside the class definition. Otherwise prepend the inline keyword. You will otherwise run into ODR violations when the header is included in multiple implementation files.
There are two main advantages to separating declaration and definition into C++ header and source files. The first is that you avoid problems with the One Definition Rule when your class/functions/whatever are #included in more than one place. Secondly, by doing things this way, you separate interface and implementation. Users of your class or library need only to see your header file in order to write code that uses it. You can also take this one step farther with the Pimpl Idiom and make it so that user code doesn't have to recompile every time the library implementation changes.
You've already mentioned the disadvantage of code repetition between the .h and .cpp files. Maybe I've written C++ code for too long, but I don't think it's that bad. You have to change all user code every time you change a function signature anyway, so what's one more file? It's only annoying when you're first writing a class and you have to copy-and-paste from the header to the new source file.
The other disadvantage in practice is that in order to write (and debug!) good code that uses a third-party library, you usually have to see inside it. That means access to the source code even if you can't change it. If all you have is a header file and a compiled object file, it can be very difficult to decide if the bug is your fault or theirs. Also, looking at the source gives you insight into how to properly use and extend a library that the documentation might not cover. Not everyone ships an MSDN with their library. And great software engineers have a nasty habit of doing things with your code that you never dreamed possible. ;-)
Advantage
Classes can be referenced from other files by just including the declaration. Definitions can then be linked later on in the compilation process.
You basically have 2 views on the class/function/whatever:
The declaration, where you declare the name, the parameters and the members (in the case of a struct/class), and the definition where you define what the functions does.
Amongst the disadvantages is repetition, yet one big advantage is that you can declare your function as int foo(float f) and leave the details in the implementation (= definition), so anyone who wants to use your function foo just includes your header file and links to your library/object file. Library users as well as compilers then only have to care about the defined interface, which helps in understanding the interfaces and speeds up compile times.
One advantage that I haven't seen yet: API
Any library or 3rd party code that is NOT open source (i.e. proprietary) will not have its implementation shipped with the distribution. Most companies are just plain not comfortable with giving away source code. The easy solution: just distribute the class declarations and function signatures that allow use of the DLL.
Disclaimer: I'm not saying whether it's right, wrong, or justified, I'm just saying I've seen it a lot.
One big advantage of forward declarations is that when used carefully you can cut down the compile time dependencies between modules.
If ClassA.h needs to refer to a data element in ClassB.h, you can often use just a forward reference in ClassA.h and include ClassB.h in ClassA.cc rather than in ClassA.h, thus cutting down a compile-time dependency.
For big systems this can be a huge time saver on a build.
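A sketch of that pattern (class names taken from the answer above):
// ClassA.h - forward declaration instead of #include "ClassB.h"
class ClassB; // enough for declaring pointers and references
class ClassA {
public:
void useB();
private:
ClassB* b; // a pointer's size doesn't require B's full definition
};

// ClassA.cc - the full definition is needed only here
#include "ClassA.h"
#include "ClassB.h"
void ClassA::useB() { /* ... uses b ... */ }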
Disadvantage
This leads to a lot of repetition. Most of the function signature needs to be put in two or more places (as Paulious noted).
Separation gives a clean, uncluttered view of program elements.
Possibility to create and link to binary modules/libraries without disclosing sources.
Link binaries without recompiling sources.
When done correctly, this separation reduces compile times when only the implementation has changed.