Benefits of splitting interface and implementation in C++

Benefits of splitting interface and implementation in C++ - c++

I'm using C++ and I'm considering putting my function implementation into .h. I know that .h file is for definitions and .cpp is for implementations but how splitting all files into headers and sources will benefit me. Well if my aim would be to create static or dynamic library than of course that would make a difference but I am creating this code for myself and not planning to make a library out of it. So is there any other benefit from splitting source from definition?

The obvious goal is to reduce coupling : as soon as you change a header file, anything that includes it must be recompiled. This can rapidly have a strong impact on compilation times (even in a small project).

You can put almost all code into .h file, it will be header-only library. But if you want more faster partial recompilation, or if you want put some code to shared library - you should create .cpp files.

Depending on the size of your project it will save you compile time and make it possible to know all ressources etc. (unless you put everything into one single file).
The better your header files are organized the less work your compiler has to do to apply changes. Also looking in a small header file to look up some forgotten parameter information is a lot easier than scrolling through a hole cpp file.

One other obvious improvement is in avoiding re-compiling the code for your function in each file that uses it, instead compiling it once and using it where needed.
Another is that it follows convention (and the standard's one definition rule), so others will find it much easier to deal with and understand.

It depends on the size of the project. Up to about 500 LOC,
I tend to put everything in a single file, with the function
definitions in the class. Except that up to about 500 LOC,
I generally use a simpler language than C++; something like AWK.
As soon as the code gets big enough to warrent several source
files, it's definitely an advantage to put as little as possible
in the header, and that means putting all of the function
definitions in the source files. And as soon as the classes
become non-trivial, you probably don't want the function
definitions in the class itself, for readability reasons.
--
James Kanze

Related

How to properly declare functions in C++? [duplicate]

I know this maybe quite subjective, but are there any general rules for situations when it is not necessary for code to be split into two files?
For example is the class is extremely small, or if the file simply holds some global definitions or static functions? Also, in these cases, should the single file be a .cpp file or a .h file?

On the technical side, whenever you need to obey the one definition rule you have to separate declarations from definitions, since you will need to include the declarations many times in multiple translation units, but you must only provide one single definition.
In aesthetic terms, the answer could be something like "always", or "systematically". In any case, you should always have a header for every logical unit of code (e.g. one class or one collection of functions); and the source file is one that is possibly optional, depending on whether or not you have everything defined inline (exempting you from ODR), or if you have a template library.
As a meta strategy, you should seek to decouple the compilation units as much as possible, so that you can include only what's needed in a fine-grained way. This allows your project to grow without having compilation times become unbearable, and it makes it much easier to reuse code in other projects.

I favor putting code in .hpp files but am very often compelled to put the implementation in the .cpp for any of the following reasons:
Reducing build time. This is the #1 reason to use .cpp files... and the reason most code you find in .hpp files is small and simple. When the implementation changes, you don't want to have to rebuild every .cpp that includes it.
When the function's linkage is important. For example, if the function is exported as a library (e.g. DLL) function, it's important that it select a single compilation unit to live in. Or for static / global instances. This is also important for distributing an import header for a DLL.
When you wish to hide implementation details when distributing a library
The definition and declaration are not identical. This can be the case with respect to constness of arguments.
You want a "clean" overview of the interface in the .hpp file. I find that with modern code tools and increasing familiarity with single-code-file languages like javascript / C# / inline C++, this is not a strong argument.
You explicitly do not want the function to be declared inline. Although, it won't matter; the optimizing compiler will likely inline if it wants to.
There are logical motivations for keeping code inline in a .hpp file:
Why have two files when you can have one?
Duplicating the declaration / function headers is unnecessary maintenance and feels redundant. We have code analysis tools to show interfaces.
The concept that inline / template code must be put in a header and other code is put in a .cpp is arbitrary. If we are forced to put inline code in a .hpp file, and this is OK, then why not put all code in a .hpp file?
I am pretty sure though that the tradition of separate .cpp and .hpp files is far stronger than the reasoning however.

I know this maybe quite subjective, but are there any general rules
for situations when it is not necessary for code to be split into two
files?
Split the code into header and source whenever you can.
Generally, this shouldn't be done in next cases :
the code consists of template classes and functions (unless you
explicitly instantiate templates in the source file)
the header consists only of inline functions
should the single file be a .cpp file or a .h file?
Should be the header file (.h).

The rule I use is the following:
Whenever you can put code into a cpp file, do it.
The reasons are multiple:
Header files serve as rudimentary documentation. It is better not to clutter them with code.
You can also use pimpls at some places if you feel like, for the reason above.
It reduces compilation times:
whenever you change a .cpp, only this file will be recompiled.
whenever a header is included, it only contains the minimal amount of code needed
It allows you to assess more easily which part of your code depends on which other part, just by looking at which headers are included. This point mandates another rule of mine:
Whenever you can forward declare a class instead of including a header, do it.
This way, the .cpp files carry the dependency information between parts of your source. It also lowers build times.

I know this maybe quite subjective, but are there any general rules for situations when it is not necessary for code to be split into two files?
It's not always subjective; you will very good reasons to separate them in large projects. It's good to get into the practice of separating them, and learning when it is and is not appropriate to separate definition from declaration. It's hard to answer your question without knowing how complex your codebase will become.
For example is the class is extremely small
It's still not necessarily a bad idea to separate them, in general.
or if the file simply holds some global definitions
The header should not contain global definitions which require static construction, unless necessary.
or static functions?
These do not belong anywhere in C++. Use inline, or anonymous namespace. If you mean within a class' body, "it depends on the instruction count, if you are hoping it will be inlined".
Also, in these cases, should the single file be a .cpp file or a .h file?
The single file should be a header. Rationale: You should not #include cpp files.
And don't forget that intermodule (aka link-time) optimizations are getting better and better.
C++ compilation times are long, and it's very very very time consuming to fix this after the fact. I recommend that you get into the practice of using cpp files before your build times and dependencies explode.

How does including the header automatically define the class in to the driver (main)? [duplicate]

So I finished my first C++ programming assignment and received my grade. But according to the grading, I lost marks for including cpp files instead of compiling and linking them. I'm not too clear on what that means.
Taking a look back at my code, I chose not to create header files for my classes, but did everything in the cpp files (it seemed to work fine without header files...). I'm guessing that the grader meant that I wrote '#include "mycppfile.cpp";' in some of my files.
My reasoning for #include'ing the cpp files was:
- Everything that was supposed to go into the header file was in my cpp file, so I pretended it was like a header file
- In monkey-see-monkey do fashion, I saw that other header files were #include'd in the files, so I did the same for my cpp file.
So what exactly did I do wrong, and why is it bad?

To the best of my knowledge, the C++ standard knows no difference between header files and source files. As far as the language is concerned, any text file with legal code is the same as any other. However, although not illegal, including source files into your program will pretty much eliminate any advantages you would've got from separating your source files in the first place.
Essentially, what #include does is tell the preprocessor to take the entire file you've specified, and copy it into your active file before the compiler gets its hands on it. So when you include all the source files in your project together, there is fundamentally no difference between what you've done, and just making one huge source file without any separation at all.
"Oh, that's no big deal. If it runs, it's fine," I hear you cry. And in a sense, you'd be correct. But right now you're dealing with a tiny tiny little program, and a nice and relatively unencumbered CPU to compile it for you. You won't always be so lucky.
If you ever delve into the realms of serious computer programming, you'll be seeing projects with line counts that can reach millions, rather than dozens. That's a lot of lines. And if you try to compile one of these on a modern desktop computer, it can take a matter of hours instead of seconds.
"Oh no! That sounds horrible! However can I prevent this dire fate?!" Unfortunately, there's not much you can do about that. If it takes hours to compile, it takes hours to compile. But that only really matters the first time -- once you've compiled it once, there's no reason to compile it again.
Unless you change something.
Now, if you had two million lines of code merged together into one giant behemoth, and need to do a simple bug fix such as, say, x = y + 1, that means you have to compile all two million lines again in order to test this. And if you find out that you meant to do a x = y - 1 instead, then again, two million lines of compile are waiting for you. That's many hours of time wasted that could be better spent doing anything else.
"But I hate being unproductive! If only there was some way to compile distinct parts of my codebase individually, and somehow link them together afterwards!" An excellent idea, in theory. But what if your program needs to know what's going on in a different file? It's impossible to completely separate your codebase unless you want to run a bunch of tiny tiny .exe files instead.
"But surely it must be possible! Programming sounds like pure torture otherwise! What if I found some way to separate interface from implementation? Say by taking just enough information from these distinct code segments to identify them to the rest of the program, and putting them in some sort of header file instead? And that way, I can use the #include preprocessor directive to bring in only the information necessary to compile!"
Hmm. You might be on to something there. Let me know how that works out for you.

This is probably a more detailed answer than you wanted, but I think a decent explanation is justified.
In C and C++, one source file is defined as one translation unit. By convention, header files hold function declarations, type definitions and class definitions. The actual function implementations reside in translation units, i.e .cpp files.
The idea behind this is that functions and class/struct member functions are compiled and assembled once, then other functions can call that code from one place without making duplicates. Your functions are declared as "extern" implicitly.
/* Function declaration, usually found in headers. */
/* Implicitly 'extern', i.e the symbol is visible everywhere, not just locally.*/
int add(int, int);
/* function body, or function definition. */
int add(int a, int b)
{
return a + b;
}
If you want a function to be local for a translation unit, you define it as 'static'. What does this mean? It means that if you include source files with extern functions, you will get redefinition errors, because the compiler comes across the same implementation more than once. So, you want all your translation units to see the function declaration but not the function body.
So how does it all get mashed together at the end? That is the linker's job. A linker reads all the object files which is generated by the assembler stage and resolves symbols. As I said earlier, a symbol is just a name. For example, the name of a variable or a function. When translation units which call functions or declare types do not know the implementation for those functions or types, those symbols are said to be unresolved. The linker resolves the unresolved symbol by connecting the translation unit which holds the undefined symbol together with the one which contains the implementation. Phew. This is true for all externally visible symbols, whether they are implemented in your code, or provided by an additional library. A library is really just an archive with reusable code.
There are two notable exceptions. First, if you have a small function, you can make it inline. This means that the generated machine code does not generate an extern function call, but is literally concatenated in-place. Since they usually are small, the size overhead does not matter. You can imagine them to be static in the way they work. So it is safe to implement inline functions in headers. Function implementations inside a class or struct definition are also often inlined automatically by the compiler.
The other exception is templates. Since the compiler needs to see the whole template type definition when instantiating them, it is not possible to decouple the implementation from the definition as with standalone functions or normal classes. Well, perhaps this is possible now, but getting widespread compiler support for the "export" keyword took a long, long time. So without support for 'export', translation units get their own local copies of instantiated templated types and functions, similar to how inline functions work. With support for 'export', this is not the case.
For the two exceptions, some people find it "nicer" to put the implementations of inline functions, templated functions and templated types in .cpp files, and then #include the .cpp file. Whether this is a header or a source file doesn't really matter; the preprocessor does not care and is just a convention.
A quick summary of the whole process from C++ code (several files) and to a final executable:
The preprocessor is run, which parses all the directives which starts with a '#'. The #include directive concatenates the included file with inferior, for example. It also does macro-replacement and token-pasting.
The actual compiler runs on the intermediate text file after the preprocessor stage, and emits assembler code.
The assembler runs on the assembly file and emits machine code, this is usually called an object file and follows the binary executable format of the operative system in question. For example, Windows uses the PE (portable executable format), while Linux uses the Unix System V ELF format, with GNU extensions. At this stage, symbols are still marked as undefined.
Finally, the linker is run. All the previous stages were run on each translation unit in order. However, the linker stage works on all the generated object files which were generated by the assembler. The linker resolves symbols and does a lot of magic like creating sections and segments, which is dependent on the target platform and binary format. Programmers aren't required to know this in general, but it surely helps in some cases.
Again, this was definetely more than you asked for, but I hope the nitty-gritty details helps you to see the bigger picture.

The typical solution is to use .h files for declarations only and .cpp files for implementation. If you need to reuse the implementation you include the corresponding .h file into the .cpp file where the necessary class/function/whatever is used and link against an already compiled .cpp file (either an .obj file - usually used within one project - or .lib file - usually used for reusing from multiple projects). This way you don't need to recompile everything if only the implementation changes.

Think of cpp files as a black box and the .h files as the guides on how to use those black boxes.
The cpp files can be compiled ahead of time. This doesn't work in you #include them, as it needs to actual "include" the code into your program each time it compiles it. If you just include the header, it can just use the header file to determine how to use the precompiled cpp file.
Although this won't make much of a difference for your first project, if you start writing large cpp programs, people are going to hate you because compile times are going to explode.
Also have a read of this: Header File Include Patterns

Header files usually contain declarations of functions / classes, while .cpp files contain the actual implementations. At compile time, each .cpp file gets compiled into an object file (usually extension .o), and the linker combines the various object files into the final executable. The linking process is generally much faster than the compilation.
Benefits of this separation: If you are recompiling one of the .cpp files in your project, you don't have to recompile all the others. You just create the new object file for that particular .cpp file. The compiler doesn't have to look at the other .cpp files. However, if you want to call functions in your current .cpp file that were implemented in the other .cpp files, you have to tell the compiler what arguments they take; that is the purpose of including the header files.
Disadvantages: When compiling a given .cpp file, the compiler cannot 'see' what is inside the other .cpp files. So it doesn't know how the functions there are implemented, and as a result cannot optimize as aggressively. But I think you don't need to concern yourself with that just yet (:

The basic idea that headers are only included and cpp files are only compiled. This will become more useful once you have many cpp files, and recompiling the whole application when you modify only one of them will be too slow. Or when the functions in the files will start depending on each other. So, you should separate class declarations into your header files, leave implementation in cpp files and write a Makefile (or something else, depending on what tools are you using) to compile the cpp files and link the resulting object files into a program.

If you #include a cpp file in several other files in your program, the compiler will try to compile the cpp file multiple times, and will generate an error as there will be multiple implementations of the same methods.
Compilation will take longer (which becomes a problem on large projects), if you make edits in #included cpp files, which then force recompilation of any files #including them.
Just put your declarations into header files and include those (as they don't actually generate code per se), and the linker will hook up the declarations with the corresponding cpp code (which then only gets compiled once).

re-usability, architecture and data encapsulation
here's an example:
say you create a cpp file which contains a simple form of string routines all in a class mystring, you place the class decl for this in a mystring.h compiling mystring.cpp to a .obj file
now in your main program (e.g. main.cpp) you include header and link with the mystring.obj.
to use mystring in your program you don't care about the details how mystring is implemented since the header says what it can do
now if a buddy wants to use your mystring class you give him mystring.h and the mystring.obj, he also doesn't necessarily need to know how it works as long as it works.
later if you have more such .obj files you can combine them into a .lib file and link to that instead.
you can also decide to change the mystring.cpp file and implement it more effectively, this will not affect your main.cpp or your buddies program.

While it is certainly possible to do as you did, the standard practice is to put shared declarations into header files (.h), and definitions of functions and variables - implementation - into source files (.cpp).
As a convention, this helps make it clear where everything is, and makes a clear distinction between interface and implementation of your modules. It also means that you never have to check to see if a .cpp file is included in another, before adding something to it that could break if it was defined in several different units.

If it works for you then there is nothing wrong with it -- except that it will ruffle the feathers of people who think that there is only one way to do things.
Many of the answers given here address optimizations for large-scale software projects. These are good things to know about, but there is no point in optimizing a small project as if it were a large project -- that is what is known as "premature optimization". Depending on your development environment, there may be significant extra complexity involved in setting up a build configuration to support multiple source files per program.
If, over time, your project evolves and you find that the build process is taking too long, then you can refactor your code to use multiple source files for faster incremental builds.
Several of the answers discuss separating interface from implementation. However, this is not an inherent feature of include files, and it is quite common to #include "header" files that directly incorporate their implementation (even the C++ Standard Library does this to a significant degree).
The only thing truly "unconventional" about what you have done was naming your included files ".cpp" instead of ".h" or ".hpp".

When you compile and link a program the compiler first compiles the individual cpp files and then they link (connect) them. The headers will never get compiled, unless included in a cpp file first.
Typically headers are declarations and cpp are implementation files. In the headers you define an interface for a class or function but you leave out how you actually implement the details. This way you don't have to recompile every cpp file if you make a change in one.

I will suggest you to go through Large Scale C++ Software Design by John Lakos. In the college, we usually write small projects where we do not come across such problems. The book highlights the importance of separating interfaces and the implementations.
Header files usually have interfaces which are supposed not to be changed so frequently.
Similarly a look into patterns like Virtual Constructor idiom will help you grasp the concept further.
I am still learning like you :)

It's like writing a book, you want to print out finished chapters only once
Say you are writing a book. If you put the chapters in separate files then you only need to print out a chapter if you have changed it. Working on one chapter doesn't change any of the others.
But including the cpp files is, from the compiler's point of view, like editing all of the chapters of the book in one file. Then if you change it you have to print all the pages of the entire book in order to get your revised chapter printed. There is no "print selected pages" option in object code generation.
Back to software: I have Linux and Ruby src lying around. A rough measure of lines of code...
Linux Ruby
100,000 100,000 core functionality (just kernel/*, ruby top level dir)
10,000,000 200,000 everything
Any one of those four categories has a lot of code, hence the need for modularity. This kind of code base is surprisingly typical of real-world systems.

There are times when non conventional programming techniques are actually quite useful and solve otherwise difficult (if not impossible problems).
If C source is generated by third party applications such as lexx and yacc they can obviously be compiled and linked separately and this is the usual approach.
However there are times when these sources can cause linkage problems with other unrelated sources. You have some options if this occurs. Rewrite the conflicting components to accommodate the lexx and yacc sources. Modify the lexx & yacc componets to accommodate your sources. '#Include' the lexx and yacc sources where they are required.
Re-writing the the components is fine if the changes are small and the components are understood to begin with (i.e: you not porting someone else's code).
Modifying the lexx and yacc source is fine as long as the build process doesn't keep regenerating the source from the lexx and yacc scripts.
You can always revert to one of the other two methods if you feel it is required.
Adding a single #include and modifying the makefile to remove the build of the lexx/yacc components to overcome all your problems is attractive fast and provides you the opportunity to prove the code works at all without spending time rewriting code and questing whether the code would have ever worked in the first place when it isn't working now.
When two C files are included together they are basically one file and there are no external references required to be resolved at link time!

Header and cpp or just cpp files - best practise?

Looking around at different code bases I see a variety of styles:
Class "interfaces" defined in header file and the actual impl in a cpp file. In this approach the headers look well defined and easy to read but the cpp files look confusing as it's just a list of methods.
The second approach i see is just to put everything in a single class cpp file. These class files contain the definition and actual method impls in the body of the class definition. This approach looks better to me (more like Java and c#).
Which style should I be using?

For all but the simplest programs, style #2 is simply impossible. If you #include a .cpp file with function definitions from multiple other .cpp files, the definitions get added to multiple object files (.o / .obj) and the linker will complain about clashing symbols.
Use style #1 and learn to live with the confusion.

The former - interfaces in header files and class bodies in implementation files. You'll find this causes you fewer problems when working on large systems.
In C++ why have header files and cpp files?

C++ doesn't use "interfaces" they use classes - base/derived classes. I use one file to define class/and its implementation methods if the project is small and separate files if the project is large.
In java, I pack them up into one package then import it once in need.

Since you tagged with c++, go for first style. I don't find it confusing, for a Java programmer, it may seem different, but in C++, you are always going to use this approach.
In fact in my favorite IDE (MSVS), I open header file, and cpp file side by side. Makes looking up prototypes, and class declaration easy.
And when you have a dozen classes; a dozen .h files, and another dozen .cpp file, will make your work simpler. Because, when you want just to see, what a class does, you just open relevant .h file, and take a look at class members, and maybe short comments. You don't need to wade through several lines deep code.
Conclusion : The style options you gave, are option only for a small code, typically single file, with very few methods etc. Otherwise, it is not even a option. (#Thomas has given the reason why #2 is not even a option)

Header (HPP):
The header includes the declarations of your code, particularly function declarations. Technically speaking classes are defined in header-files, but again, the member functions are just declared.
Code in other files will include just this header and retain all necessary information from there.
Implementation (CPP):
The implementation includes the definition of functions, member-functions and variables.
Rationale:
Header-files gives a developer (a external user of your code) a plain overview and just offers the external available code (i.e. easy to read, only the information necessary for users).
Header-files allow the compiler to check the implementation for correctness
Header-files allow the compiler to check external code for correctness
Header-files allow for seperate-compilation. You need to keep in mind. that in former times, computers doesn't have enough resources to keep everything in main-memory during a compilation process. Header files are small, while implementation files are big.
Use #style 1, even for simple programs. So you can learn easily to work with. That maybe look outated today, especially in background of modern Multi-Pass-Compilers. But seperate header-files are even today beneficial. Rumours about the next C++-Standard appeared, as far as I know something like symbol export ( Java or C#) will be possible. But don't nail me down on this!
Notes:
- member-functions which are defined inside a class are by default inline, normally you don't want this
- use always defined guards

If you are developing large project, you'll find the first approach helps you a lot. The second approach may help you in small project. As your project becomes larger, management of complexity is a big issue of software development, and the first approach turns out to be a better choice.

What I do is:
write .cpp files, with the method names prefixed with the class name
in the .h file, create an empty class, with the appropriate name, then use a cogapp generator script, cog_addheaders.py, to insert the declarations, eg:
.cpp file: WeightsPersister.cpp
.h file: WeightsPersister.h
This way I get:
fast compilation (just needs to recompile the .cpp file, unless I change the class interface)
few issues with circular declarations
acceptably low tedious mindless manual work :-)

When should I not split my code into header and source files?

I know this maybe quite subjective, but are there any general rules for situations when it is not necessary for code to be split into two files?
For example is the class is extremely small, or if the file simply holds some global definitions or static functions? Also, in these cases, should the single file be a .cpp file or a .h file?

On the technical side, whenever you need to obey the one definition rule you have to separate declarations from definitions, since you will need to include the declarations many times in multiple translation units, but you must only provide one single definition.
In aesthetic terms, the answer could be something like "always", or "systematically". In any case, you should always have a header for every logical unit of code (e.g. one class or one collection of functions); and the source file is one that is possibly optional, depending on whether or not you have everything defined inline (exempting you from ODR), or if you have a template library.
As a meta strategy, you should seek to decouple the compilation units as much as possible, so that you can include only what's needed in a fine-grained way. This allows your project to grow without having compilation times become unbearable, and it makes it much easier to reuse code in other projects.

I favor putting code in .hpp files but am very often compelled to put the implementation in the .cpp for any of the following reasons:
Reducing build time. This is the #1 reason to use .cpp files... and the reason most code you find in .hpp files is small and simple. When the implementation changes, you don't want to have to rebuild every .cpp that includes it.
When the function's linkage is important. For example, if the function is exported as a library (e.g. DLL) function, it's important that it select a single compilation unit to live in. Or for static / global instances. This is also important for distributing an import header for a DLL.
When you wish to hide implementation details when distributing a library
The definition and declaration are not identical. This can be the case with respect to constness of arguments.
You want a "clean" overview of the interface in the .hpp file. I find that with modern code tools and increasing familiarity with single-code-file languages like javascript / C# / inline C++, this is not a strong argument.
You explicitly do not want the function to be declared inline. Although, it won't matter; the optimizing compiler will likely inline if it wants to.
There are logical motivations for keeping code inline in a .hpp file:
Why have two files when you can have one?
Duplicating the declaration / function headers is unnecessary maintenance and feels redundant. We have code analysis tools to show interfaces.
The concept that inline / template code must be put in a header and other code is put in a .cpp is arbitrary. If we are forced to put inline code in a .hpp file, and this is OK, then why not put all code in a .hpp file?
I am pretty sure though that the tradition of separate .cpp and .hpp files is far stronger than the reasoning however.

I know this maybe quite subjective, but are there any general rules
for situations when it is not necessary for code to be split into two
files?
Split the code into header and source whenever you can.
Generally, this shouldn't be done in next cases :
the code consists of template classes and functions (unless you
explicitly instantiate templates in the source file)
the header consists only of inline functions
should the single file be a .cpp file or a .h file?
Should be the header file (.h).

The rule I use is the following:
Whenever you can put code into a cpp file, do it.
The reasons are multiple:
Header files serve as rudimentary documentation. It is better not to clutter them with code.
You can also use pimpls at some places if you feel like, for the reason above.
It reduces compilation times:
whenever you change a .cpp, only this file will be recompiled.
whenever a header is included, it only contains the minimal amount of code needed
It allows you to assess more easily which part of your code depends on which other part, just by looking at which headers are included. This point mandates another rule of mine:
Whenever you can forward declare a class instead of including a header, do it.
This way, the .cpp files carry the dependency information between parts of your source. It also lowers build times.

I know this maybe quite subjective, but are there any general rules for situations when it is not necessary for code to be split into two files?
It's not always subjective; you will very good reasons to separate them in large projects. It's good to get into the practice of separating them, and learning when it is and is not appropriate to separate definition from declaration. It's hard to answer your question without knowing how complex your codebase will become.
For example is the class is extremely small
It's still not necessarily a bad idea to separate them, in general.
or if the file simply holds some global definitions
The header should not contain global definitions which require static construction, unless necessary.
or static functions?
These do not belong anywhere in C++. Use inline, or anonymous namespace. If you mean within a class' body, "it depends on the instruction count, if you are hoping it will be inlined".
Also, in these cases, should the single file be a .cpp file or a .h file?
The single file should be a header. Rationale: You should not #include cpp files.
And don't forget that intermodule (aka link-time) optimizations are getting better and better.
C++ compilation times are long, and it's very very very time consuming to fix this after the fact. I recommend that you get into the practice of using cpp files before your build times and dependencies explode.

C++: When is it acceptable to have code in the header file?

I've been taught to keep class definitions and code separate.
However, I've seen situations where people would often include some bits of code in the header, e.g. simple access methods which returns a reference of a variable.
Where do you draw the line?

Generally speaking, things you want the compiler to inline, or templated code. In either case, the code must be available to the compiler everywhere it's used, so you have no choice.
However, note that the more code you put in a header file, the longer it will take to compile - and the more often you'll end up touching header files, thus causing a chain reaction of slow builds :)

One reason to minimize the amount of code in headers is to minimize the amount of code to be recompiled when the implementation changes. If you don't care about that you can have any amount of code in headers.
Sometimes having the code in headers only is done intentionally to expose the code - ATL does that for eaxmple.

When developing a large C++ project, you need to be vigilant to make each .CPP file couple with as few header files as reasonably possible.
So I have a simple rule for "drawing the line":
If, by inlining the implementation, your header file now needs to include an additional header file, you should move the implementation out of the header and into the .CPP file.
Of course, that isn't the only reason not to inline, but this is a definite example of a line that shouldn't be crossed.

Inline functions
One/two line methods
Templates
But when the header grows in size, it will take more and more time to compile, so it is useful to use precompiled headers.

I would stick with the code in .c and .cpp files and headers stuff in .h, .hpp files. Reason is that when I download a .tar.gz or zip of source code, I'm always looking for the .h files for the header stuff and the .cpp files for the source. Many people are used to that so I think you should keep that style.

When you have code in the header file for simple getter/setter methods, it's more or less just fine because you either want to save some working time because of code maintenance (typing same method in header and implementation file), or because you actually want the method to be inline due to function call overhead in time critical places.
You will not only be subjected to slower builds, the linking time might be huge if you have a lot of classes visible more or less everywhere.
Another down side with mixing code between header and implementation files is readability. If you consistenly declare in header file and consistenly keep the code in the implementation file, it's easier to keep the same order between the methods. There are no missing gaps if you know what I mean.
Cheers !

Also note that you can wind up with some strange side-effects if you put code in a header file and aren't careful with your Makefile. I once had a bug due to code I changed in a header that was recompiled into some files but not others because the dependencies weren't correct in the Makefile. This isn't a big deal if you generate your Makefile automatically.

You will want code in the header if you wish to 'inline' the code. Otherwise, there are more advantages to sticking the implementation in the body.

It's up to you. Boost puts almost all of it's code in the header... and they're a well respected library.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js