Alternatives for gmake? - c++

I have a c++ program file with two functions in it. If I change the first function alone, why should both of them have to be recompiled?
Is there any build system which recompiles the first one alone and puts it back in the same object file?
Is this possible? The instructions of one function shouldn't depend on the other, right?
Since gmake recompiles the whole file, it takes a lot of time. Can't this be avoided? Putting the second function in a separate file is not a good idea, as it creates extra files that should not be necessary.

If the second function is quite long or requires more time to compile, place it in a separate file. That is why people separate source files. From what I know, it has to compile the whole file, as a small change in the source will result in a major change in the output file, as the functions would not link to each other.

I doubt that compiling only part of a source file is ever possible, using any programming language. Compilations are done on a per-file basis.

The analysis to decide which semantic parts of a given source file have changed and thus need recompiling would likely outweigh the cost of the compilation itself in most cases.
Build systems get big wins by analyzing the dependencies between source files because the cost of file I/O (particularly for include files) is a large part of the overall compilation cost. Once you've decided to recompile a given source file, you would likely only achieve a tiny speedup by ignoring unchanged parts of the file, even if there were zero cost to computing which parts those were.

All build systems for C++ that I know of work at the translation unit (file) level, not at the function level. Although it should be possible in theory, it gets complicated once you consider the preprocessor, e.g.
#define ANSWER 42
void foo()
{
#undef ANSWER
#define ANSWER 41
}
int bar()
{
return ANSWER;
}
Although this is terrible code, any standard-compliant compiler/build system has to support it. And as you can see, changing foo (which redefines ANSWER) can affect bar.
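To make the effect observable, here is a minimal sketch of the same snippet with one assumption added: bar is marked constexpr purely so that a static_assert can check its value at compile time.

```cpp
#define ANSWER 42

void foo()
{
// Macros ignore function scope: this redefinition stays active
// for the rest of the translation unit, including bar() below.
#undef ANSWER
#define ANSWER 41
}

// constexpr is an addition for illustration, so the value can be checked at compile time.
constexpr int bar()
{
    return ANSWER;
}

static_assert(bar() == 41, "bar() sees the macro as redefined inside foo()");
```

A tool that wanted to recompile only foo would first have to prove that the edit did not change what the preprocessor hands to bar.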

Putting the second function in a separate file is a good idea, and is necessary if you want to avoid this "problem". If your functions are so large that the time spent recompiling one file is noticeable, then the file is probably too big and should be broken up anyway.

The issue isn't gmake, it's the compiler. If you change one function, you may have no choice but to recompile others. For instance:
If function a calls function b, and you change function b, you need to ensure that a still calls b correctly, in case b's signature changed.
If function b is between a and c in memory, and b grows so that it no longer fits, you may have to move either a or c, which also involves recompiling to generate correct offsets.
If b is no longer in the same place, you need to recompile its caller, a, so that it points to the right function.
There are probably more and better cases where this is necessary.

Related

Disadvantages of condensing .cpp files?

When compiling a 'static library' project in MSVC++, I often get .lib files that are several MB in size. If I use conditional macros and include directives to "condense" all my .cpp files in one .cpp file at compile time, the .lib file size decreases considerably.
Are there any disadvantages with this practice?
The main problem of Unity Builds as they are called is that they break the way C++ works.
In C++, a source file, with its includes preprocessed, is called a Translation Unit. Some symbols are "private" to this translation unit:
symbols declared static at namespace level
anything declared in anonymous namespace
If you merge several C++ files, then the compiler will share those private symbols among all the files that are merged together since from its point of view this has become a single Translation Unit.
You will get an error if two local classes suddenly have the same name, and idem for constants. Annoying as hell, but at least you are notified.
For functions, however, it may break silently because of overload resolution. Where the compiler previously picked static void launch(short u); for your call to launch(1), it will suddenly switch to static void launch(int i, Target t = "Irak");. Oops?
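A hedged sketch of that silent switch follows. The two namespaces stand in for "a.cpp compiled alone" and "a.cpp and b.cpp merged"; the Target struct, the return values and the default argument are hypothetical, and constexpr is used only so the outcome can be checked at compile time.

```cpp
// Hypothetical stand-in for the Target type mentioned above.
struct Target {
    const char* name;
    constexpr Target(const char* n) : name(n) {}
};

namespace separate_build {
    // Only the overload from a.cpp is visible.
    static constexpr int launch(short) { return 1; }
    constexpr int call() { return launch(1); }   // int converted to short
}

namespace unity_build {
    // After merging, the file-local overload from b.cpp is visible too.
    static constexpr int launch(short) { return 1; }
    static constexpr int launch(int, Target = "somewhere") { return 2; }
    constexpr int call() { return launch(1); }   // exact match on int wins
}

static_assert(separate_build::call() == 1, "alone, launch(short) is the only candidate");
static_assert(unity_build::call() == 2, "merged, launch(int, Target) is a better match");
```

The call site is identical in both namespaces, and no diagnostic is required in either case.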
Unity Builds are dangerous. What you are looking for is called WPO (Whole Program Optimization) or LTO (Link Time Optimization), look into the innards of your compiler manual to know how to activate it.
A disadvantage would be if you change a single line in the cpp you have to compile the whole code.
Your file might get more complex and you'll have to recompile everything even if you change just one source file. Other than that, there's no real disadvantage, unless the files define local functions or variables with the same names, which might break when merging everything (e.g. due to multiple definitions within one translation unit).
The size decrease you notice is due to advanced optimizations that become available that way (e.g. reusing more code). Depending on your code you might get similar results by enabling all optimizations for size as well as link time optimizations, which might result in some acceptable solution between both approaches.
It's usually a confusing practice to include a .cpp file in another .cpp file (at the least, you should leave an explanatory comment about why you did this).

Why is including a header file such an evil thing?

I have seen many explanations on when to use forward declarations over including header files, but few of them go into why it is important to do so. Some of the reasons I have seen include the following:
compilation speed
reducing complexity of header file management
removing cyclic dependencies
Coming from a .net background I find header management frustrating. I have this feeling I need to master forward declarations, but I have been scraping by on includes so far.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
"to master forward declarations" is not a requirement, it's a useful guideline where possible.
When a header is included, and it pulls in more headers, and yet more, the compiler has to do a lot of work processing a single translation unit.
You can see how much, for example, with gcc -E:
A single #include <iostream> gives my g++ 4.5.2 additional 18,560 lines of code to process.
A #include <boost/asio.hpp> adds another 74,906 lines.
A #include <boost/spirit/include/qi.hpp> adds 154,024 lines, that's over 5 MB of code.
This adds up, especially if carelessly included in some file that's included in every file of your project.
Sometimes going over old code and pruning unnecessary includes improves compilation time dramatically just because of that. Replacing includes with forward declarations in the translation units where only references or pointers to some class are used improves this even further.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
It cannot because, unlike some other languages, C++ has an ambiguous grammar:
int f(X);
Is it a function declaration or a variable definition? To answer this question the compiler must know what X means, so X must be declared before that line.
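Both readings can be shown side by side. In this sketch the namespace names are hypothetical and exist only so the two meanings of X can live in one file; the static_asserts confirm how the compiler parsed each line.

```cpp
#include <type_traits>

namespace x_is_a_type {
    struct X {};
    int f(X);   // function declaration: f takes an X and returns int
}

namespace x_is_a_value {
    constexpr int X = 7;
    int f(X);   // variable definition: equivalent to  int f = 7;
}

static_assert(std::is_function<decltype(x_is_a_type::f)>::value,
              "with X a type, f is a function");
static_assert(std::is_same<decltype(x_is_a_value::f), int>::value,
              "with X a value, f is an int variable");
```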
Because when you're doing something like this :
bar.h :
class Foo; // forward declaration: no #include of Foo's header needed
class Bar {
    int foo(Foo &);
};
Then the compiler does not need to know how the Foo struct/class is defined, so including the header that defines Foo is useless. Moreover, including that header may in turn require the header of some other class that Foo uses, which may require yet another header, and so on: turtles all the way down.
In the end, the file that the compiler works on is almost like the result of copy-pasting all the headers, so it gets big for no good reason. And when someone makes a typo in a header file that you don't need, compiling your class starts to take far too much time (or fails for no obvious reason).
So it's a good thing to give as little info as needed to the compiler.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
1) reduced disk i/o (fewer files to open, fewer times)
2) reduced memory/cpu usage
most translation units need only a name. if you use/allocate the object, you'll need its full definition.
this is probably where it will click for you: each file you compile compiles what is visible in its translation.
a poorly maintained system will end up including a ton of stuff it does not need - then this gets compiled for every file it sees. by using forwards where possible, you can bypass that, and significantly reduce the number of times a public interface (and all of its included dependencies) must be compiled.
that is to say: the content of the header won't be compiled once. it will be compiled over and over. everything in this translation must be parsed, checked that it's a valid program, checked for warnings, optimized, etc. many, many times.
including lazily only adds significant disk/cpu/memory increase, which turns into intolerable build times for you, while introducing significant dependencies (in non-trivial projects).
I can buy the argument for reduced complexity, but what would a practical example of this be?
unnecessary includes introduce dependencies as side effects. when you edit an include (necessary or not), then every file which includes it must be recompiled (not trivial when hundreds of thousands of files must be unnecessarily opened and compiled).
Lakos wrote a good book which covers this in detail:
http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1?ie=UTF8&s=books&qid=1304529571&sr=8-1
Header file inclusion rules specified in this article will help reduce the effort in managing header files.
I used forward declarations simply to reduce the amount of navigation between source files. E.g. if module X calls some glue or interface function F in module Y, then using a forward declaration means that writing the function and the call can be done by visiting only two places, X.c and Y.c. This is not so much of an issue when a good IDE helps you navigate, but I tend to prefer coding bottom-up, creating working code and then figuring out how to wrap it, rather than top-down interface specification. As the interfaces themselves evolve, it's handy not to have to write them out in full.
In C (or c++ minus classes) it's possible to truly keep structure details Private by only defining them in the source files that use them, and only exposing forward declarations to the outside world - a level of black boxing that requires performance-destroying virtuals in the c++/classes way of doing things. It's also possible to avoid needing to prototype things (visiting the header) by listing 'bottom-up' within the source files (good old static keyword).
The pain of managing headers can sometimes expose how modular your program is or isn't. If it's truly modular, the number of headers you have to visit and the amount of code and data structures declared within them should be minimal.
Working on a big project with 'everything included everywhere' through precompiled headers won't encourage this real modularity.
Module dependencies can correlate with data flow, and thus with performance issues, i.e. both i-cache and d-cache issues. If a program involves many modules that call each other and modify data at many random places, it's likely to have poor cache locality. The process of optimizing such a program will often involve breaking up passes and adding intermediate data, often playing havoc with many 'class diagrams'/'frameworks' (or at least requiring the creation of many intermediate data structures). Heavy template use often means complex pointer-chasing, cache-destroying data structures. In its optimized state, dependencies and pointer chasing will be reduced.
I believe forward declarations speed up compilation because the header file is ONLY included where it is actually used. This avoids opening and reading the file in translation units that don't need it. You are correct that at some point the object referenced will need to be compiled, but if I am only using a pointer to that object in my other .h file, why actually include it? If I tell the compiler I am using a pointer to a class, that's all it needs (as long as I am not calling any methods on that class.)
This is not the end of it. Those .h files include other .h files... So, for a large project, opening, reading, and closing all the .h files which are included repetitively can become a significant overhead. Even with include guards (#ifndef checks), you still have to open and close them a lot.
We practice this at my place of employment. My boss explained this in a similar way, but I'm sure his explanation was clearer.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
Because include is a preprocessor thing, which means it is done via brute force when parsing the file. Your object will be compiled once (compiler) then linked (linker) as appropriate later.
In C/C++, when you compile, you've got to remember there is a whole chain of tools involved (preprocessor, compiler, linker plus build management tools like make or Visual Studio, etc...)
Good and evil. The battle continues, but now on the battle field of header files. Header files are a necessity and a feature of the language, but they can create a lot of unnecessary overhead if used in a non optimal way, e.g. not using forward declarations etc.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
Forward declarations are bad ass. My experience is that a lot of c++ programmers are not aware of the fact that you don't have to include any header file, unless you actually want to use some type, e.g. you need to have the type defined so the compiler understands what you want to do. It's important to try and refrain from including header files in other header files.
Just passing around a pointer from one function to another, only requires a forward declaration:
// someFile.h
class CSomeClass;
void SomeFunctionUsingSomeClass(CSomeClass* foo);
Including someFile.h does not require you to include the header file of CSomeClass, since you are merely passing a pointer to it, not using the class. This means that the compiler only needs to parse one line (class CSomeClass;) instead of an entire header file (that might be chained to other header files etc etc).
This reduces both compile time and link time, and we are talking big optimizations here if you have many headers and many classes.
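As a rough single-file sketch of that split (the file boundaries are marked in comments; Forward, the calls member and Demo are hypothetical names added for illustration), note that the code between the forward declaration and the class definition compiles while CSomeClass is still an incomplete type:

```cpp
#include <cassert>

// ---- someFile.h ----
class CSomeClass;                                  // forward declaration: name only
void SomeFunctionUsingSomeClass(CSomeClass* foo);  // OK: pointer to an incomplete type

// ---- forwarder.cpp (hypothetical): includes only someFile.h ----
// Handing the pointer on does not require the definition of CSomeClass.
void Forward(CSomeClass* foo) { SomeFunctionUsingSomeClass(foo); }

// ---- SomeClass.h (hypothetical): needed only where members are touched ----
class CSomeClass {
public:
    int calls = 0;
};

// ---- someFile.cpp: includes SomeClass.h because it uses a member ----
void SomeFunctionUsingSomeClass(CSomeClass* foo) { ++foo->calls; }

// ---- usage ----
void Demo() {
    CSomeClass object;
    Forward(&object);
    assert(object.calls == 1);
}
```

Only the file that actually touches a member has to pay for parsing the full class definition.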

What are the advantages and disadvantages of separating declaration and definition as in C++?

In C++, declaration and definition of functions, variables and constants can be separated like so:
void someFunc(); // declaration
void someFunc() // definition
{
//Implementation.
}
In fact, in the definition of classes, this is often the case. A class is usually declared with its members in a .h file, and these are then defined in a corresponding .C file.
What are the advantages & disadvantages of this approach?
Historically this was to help the compiler. You had to give it the list of names before it used them, whether this was the actual usage or a forward declaration (C's default function prototype aside).
Modern compilers for modern languages show that this is no longer a necessity, so the syntax of C and C++ (as well as Objective-C, and probably others) is historical baggage here. In fact this is one of the big problems with C++ that even the addition of a proper module system will not solve.
Disadvantages are: lots of heavily nested include files (I've traced include trees before, they are surprisingly huge) and redundancy between declaration and definition - all leading to longer coding times and longer compile times (ever compared the compile times between comparable C++ and C# projects? This is one of the reasons for the difference). Header files must be provided for users of any components you provide. Chances of ODR violations. Reliance on the pre-processor (many modern languages do not need a pre-processor step), which makes your code more fragile and harder for tools to parse.
Advantages: not much. You could argue that you get a list of function names grouped together in one place for documentation purposes - but most IDEs have some sort of code folding ability these days, and projects of any size should be using doc generators (such as doxygen) anyway. With a cleaner, pre-processor-less, module based syntax it is easier for tools to follow your code and provide this and more, so I think this "advantage" is just about moot.
It's an artefact of how C/C++ compilers work.
As a source file gets compiled, the preprocessor substitutes each #include-statement with the contents of the included file. Only afterwards does the compiler try to interpret the result of this concatenation.
The compiler then goes over that result from beginning to end, trying to validate each statement. If a line of code calls a function that hasn't been declared previously, it'll give up.
There's a problem with that, though, when it comes to mutually recursive function calls:
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, foo won't compile as bar is unknown. If you switch the two functions around, bar won't compile as foo is unknown.
If you separate declaration and definition, though, you can order the functions as you wish:
void foo();
void bar();
void foo()
{
bar();
}
void bar()
{
foo();
}
Here, when the compiler processes foo it already knows the signature of a function called bar, and is happy.
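The foo/bar pair above would recurse forever if actually called, so here is a hedged variant that terminates. The names is_even and is_odd are hypothetical, and constexpr is used only so that the result can be checked with static_assert.

```cpp
// Declaration only: enough for is_even to call is_odd before it is defined.
constexpr bool is_odd(unsigned n);

constexpr bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

constexpr bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}

static_assert(is_even(10), "10 is even");
static_assert(is_odd(7), "7 is odd");
```

Removing the first line makes the call inside is_even fail to compile, which is exactly the ordering problem described above.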
Of course compilers could work in a different way, but that's how they work in C, C++ and to some degree Objective-C.
Disadvantages:
None directly. If you're using C/C++ anyway, it's the best way to do things. If you've got a choice of language/compiler, then maybe you can pick one where this is not an issue. The only thing to consider with splitting declarations into header files is to avoid mutually recursive #include-statements - but that's what include guards are for.
Advantages:
Compilation speed: As all included files are concatenated and then parsed, reducing the amount and complexity of code in included files will improve compilation time.
Avoid code duplication/inlining: If you fully define a function in a header file, each object file that includes this header and references this function will contain its own version of that function. As a side-note, if you want inlining, you need to put the full definition into the header file (on most compilers).
Encapsulation/clarity: A well defined class/set of functions plus some documentation should be enough for other developers to use your code. There is (ideally) no need for them to understand how the code works - so why require them to sift through it? (The counter-argument that it may be useful for them to access the implementation when required still stands, of course).
And of course, if you're not interested in exposing a function at all, you can usually still choose to define it fully in the implementation file rather than the header.
The standard requires that when using a function, a declaration must be in scope. This means, that the compiler should be able to verify against a prototype (the declaration in a header file) what you are passing to it. Except of course, for functions that are variadic - such functions do not validate arguments.
Think of C, when this was not required. At that time, compilers treated a missing return type specification as defaulting to int. Now, assume you had a function foo() which returned a pointer to void. However, since you did not have a declaration, the compiler will think that it returns an integer. On some Motorola systems, for example, integers and pointers would be returned in different registers. Now, the compiler will no longer use the correct register and will instead read your pointer as an integer from the other register. The moment you try to work with this pointer -- all hell breaks loose.
Declaring functions within the header is fine. But remember if you declare and define in the header make sure they are inline. One way to achieve this is to put the definition inside the class definition. Otherwise prepend the inline keyword. You will run into ODR violation otherwise when the header is included in multiple implementation files.
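A minimal sketch of both options follows, assuming a hypothetical header math_util.h that several .cpp files include; the names twice, Counter and check_math_util are made up for illustration.

```cpp
#include <cassert>

// ---- math_util.h (hypothetical header, included by several .cpp files) ----
#ifndef MATH_UTIL_H
#define MATH_UTIL_H

// 'inline' allows the same definition to appear in every translation unit.
// Without it, two .cpp files including this header would each define
// twice() and the link would fail with a multiple-definition error.
inline int twice(int x) { return 2 * x; }

class Counter {
public:
    // Defined inside the class definition, so implicitly inline.
    int next() { return ++count_; }
private:
    int count_ = 0;
};

#endif // MATH_UTIL_H

// ---- any .cpp file that includes the header ----
void check_math_util() {
    Counter c;
    assert(twice(21) == 42);
    assert(c.next() == 1);
}
```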
There are two main advantages to separating declaration and definition into C++ header and source files. The first is that you avoid problems with the One Definition Rule when your class/functions/whatever are #included in more than one place. Secondly, by doing things this way, you separate interface and implementation. Users of your class or library need only to see your header file in order to write code that uses it. You can also take this one step farther with the Pimpl Idiom and make it so that user code doesn't have to recompile every time the library implementation changes.
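For reference, the Pimpl idiom mentioned here can be sketched roughly as follows. Widget, Impl, size and demo_widget are hypothetical names, and the file boundaries are shown as comments in a single listing.

```cpp
#include <cassert>
#include <memory>

// ---- widget.h (hypothetical): all that client code ever sees ----
class Widget {
public:
    Widget();
    ~Widget();              // declared here, defined where Impl is complete
    int size() const;
private:
    struct Impl;            // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp: can change without touching widget.h ----
struct Widget::Impl {
    int size = 3;           // private data can be added or removed freely
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;
int Widget::size() const { return impl_->size; }

// ---- client.cpp: includes only widget.h ----
void demo_widget() {
    Widget w;
    assert(w.size() == 3);
}
```

Because the layout of Impl never appears in the header, editing it changes only widget.cpp, so client files need relinking but not recompiling. The price is one extra heap allocation and one pointer indirection per object.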
You've already mentioned the disadvantage of code repetition between the .h and .cpp files. Maybe I've written C++ code for too long, but I don't think it's that bad. You have to change all user code every time you change a function signature anyway, so what's one more file? It's only annoying when you're first writing a class and you have to copy-and-paste from the header to the new source file.
The other disadvantage in practice is that in order to write (and debug!) good code that uses a third-party library, you usually have to see inside it. That means access to the source code even if you can't change it. If all you have is a header file and a compiled object file, it can be very difficult to decide if the bug is your fault or theirs. Also, looking at the source gives you insight into how to properly use and extend a library that the documentation might not cover. Not everyone ships an MSDN with their library. And great software engineers have a nasty habit of doing things with your code that you never dreamed possible. ;-)
Advantage
Classes can be referenced from other files by just including the declaration. Definitions can then be linked later on in the compilation process.
You basically have 2 views on the class/function/whatever:
The declaration, where you declare the name, the parameters and the members (in the case of a struct/class), and the definition, where you define what the function does.
Amongst the disadvantages is repetition. Yet one big advantage is that you can declare your function as int foo(float f) and leave the details to the implementation (= definition). Anyone who wants to use your function foo just includes your header file and links to your library/object file, so library users as well as compilers only have to care about the declared interface, which helps in understanding the interfaces and speeds up compile times.
One advantage that I haven't seen yet: API
Any library or 3rd party code that is NOT open source (i.e. proprietary) will not have their implementation along with the distribution. Most companies are just plain not comfortable with giving away source code. The easy solution, just distribute the class declarations and function signatures that allow use of the DLL.
Disclaimer: I'm not saying whether it's right, wrong, or justified, I'm just saying I've seen it a lot.
One big advantage of forward declarations is that when used carefully you can cut down the compile time dependencies between modules.
If ClassA.h needs to refer to a data element in ClassB.h, you can often use just a forward reference in ClassA.h and include ClassB.h in ClassA.cc rather than in ClassA.h, thus cutting down a compile time dependency.
For big systems this can be a huge time saver on a build.
Disadvantage
This leads to a lot of repetition. Most of the function signature needs to be put in two or more (as Paulious noted) places.
Separation gives clean, uncluttered view of program elements.
Possibility to create and link to binary modules/libraries without disclosing sources.
Link binaries without recompiling sources.
When done correctly, this separation reduces compile times when only the implementation has changed.

Writing function definition in header files in C++

I have a class which has many small functions. By small functions, I mean functions that don't do any processing but just return a literal value. Something like:
string Foo::method() const{
return "A";
}
I have created a header file "Foo.h" and source file "Foo.cpp". But since the function is very small, I am thinking about putting it in the header file itself. I have the following questions:
Is there any performance or other issues if I put these function definition in header file? I will have many functions like this.
My understanding is when the compilation is done, compiler will expand the header file and place it where it is included. Is that correct?
If the function is small (the chance you would change it often is low), and if the function can be put into the header without including myriads of other headers (because your function depends on them), it is perfectly valid to do so. If you declare them extern inline, then the compiler is required to give it the same address for every compilation unit:
headera.h:
inline string method() {
return something;
}
Member functions are implicitly inline provided they are defined inside their class. The same is true for them: if they can be put into the header without hassle, you can indeed do so.
Because the code of the function is put into the header and visible, the compiler is able to inline calls to it, that is, to put the code of the function directly at the call site (not so much because you put inline before it, but more because the compiler decides that way; putting inline there is only a hint to the compiler in that regard). That can result in a performance improvement, because the compiler now sees where arguments match variables local to the function, and where arguments don't alias each other. And last but not least, function frame allocation isn't needed anymore.
My understanding is when the compilation is done, compiler will expand the header file and place it where it is included. Is that correct?
Yes, that is correct. The function will be defined in every place where you include its header. The compiler will care about putting only one instance of it into the resulting program, by eliminating the others.
Depending on your compiler and its settings it may do any of the following:
It may ignore the inline keyword (it is just a hint to the compiler, not a command) and generate stand-alone functions. It may do this if your functions exceed a compiler-dependent complexity threshold, e.g. too many nested loops.
It may decide that your stand-alone function is a good candidate for inline expansion.
In many cases, the compiler is in a much better position to determine if a function should be inlined than you are, so there is no point in second-guessing it. I like to use implicit inlining when a class has many small functions only because it's convenient to have the implementation right there in the class. This doesn't work so well for larger functions.
The other thing to keep in mind is that if you are exporting a class in a DLL/shared library (not a good idea IMHO, but people do it anyway) you need to be really careful with inline functions. If the compiler that built the DLL decides a function should be inlined you have a couple of potential problems:
The compiler building the program using the DLL might decide to not inline the function so it will generate a symbol reference to a function that doesn't exist and the DLL will not load.
If you update the DLL and change the inlined function, the client program will still be using the old version of that function since the function got inlined into the client code.
There can be an increase in performance, because functions defined in the header (inside the class body, or marked inline) are visible to the compiler at every call site and can be inlined. Since your functions are small, inlining will likely be beneficial for you IMHO.
What you say about the compiler is also true. For the compiler there is no difference, other than inlining, between code in a header file and code in a .cpp file.
If your functions are that simple, make them inline, and you'll have to stick them in the header file anyway. Other than that, any conventions are just that - conventions.
Yes, the compiler does expand the header file where it encounters the #include statements.
It depends on the coding standards that apply in your case but:
Small functions without loops or anything else complicated should be inlined for better performance (at the cost of slightly larger code, which matters for some constrained or embedded applications).
If you have the body of a member function inside the class definition in the header, it is inline by default (which is a good thing when it comes to speed).
Before the object file is created, the preprocessor is run (the -E option of gcc shows its output) and the result is passed to the compiler proper, which creates the object file from that code.
So the shorter answer is:
-- Defining functions in the header is good for speed (but not for space) --
C++ won’t complain if you do, but generally speaking, you shouldn’t.
When you #include a file, the entire content of the included file is inserted at the point of inclusion. This means that any definitions you put in your header get copied into every file that includes that header.
For small projects, this isn’t likely to be much of an issue. But for larger projects, this can make things take much longer to compile (as the same code gets recompiled each time it is encountered) and could significantly bloat the size of your executable. If you make a change to a definition in a code file, only that .cpp file needs to be recompiled. If you make a change to a definition in a header file, every code file that includes the header needs to be recompiled. One small change can cause you to have to recompile your entire project!
Sometimes exceptions are made for trivial functions that are unlikely to change (e.g. where the function definition is one line).
Source: http://archive.li/ACYlo (previous version of Chapter 1.9 on learncpp.com)

Should I put many functions into one file? Or, more or less, one function per file?

I love to organize my code, so ideally I want one class per file or, when I have non-member functions, one function per file.
The reasons are:
1) When I read the code I will always know in what file I should find a certain function or class.
2) If it's one class or one non-member function per header file, then I won't include a whole mess when I include a header file.
3) If I make a small change in a function then only that function will have to be recompiled.
However, splitting everything up into many header and many implementation files can considerably slow down compilation. In my project, most functions access a certain number of templated other library functions. So that code will be compiled over and over, once for each implementation file. Compiling my whole project currently takes 45 minutes or so on one machine. There are about 50 object files, and each one uses the same expensive-to-compile headers.
Maybe, is it acceptable to have one class (or non-member function) per header file, but putting the implementations of many or all of these functions into one implementation file, like in the following example?
// foo.h
void foo(int n);
// bar.h
void bar(double d);
// foobar.cpp
#include <vector>
void foo(int n) { std::vector<int> v; ... }
void bar(double d) { std::vector<int> w; ... }
Again, the advantage would be that I can include just the foo function or just the bar function, and compilation of the whole project will be faster because foobar.cpp is one file, so the std::vector<int> (which is just an example here for some other expensive-to-compile templated construction) has to be compiled in only once, as opposed to twice if I compiled a foo.cpp and bar.cpp separately. Of course, my reason (3) above is not valid for this scenario: After just changing foo(){...} I have to recompile the whole, potentially big, file foobar.cpp.
I'm curious what your opinions are!
IMHO, you should combine items into logical groupings and create your files based on that.
When I'm writing functions, there are often half a dozen or so that are tightly related to each other. I tend to put them together in a single header and implementation file.
When I write classes, I usually limit myself to one heavyweight class per header and implementation file. I might add in some convenience functions or tiny helper classes.
If I find that an implementation file is thousands of lines long, that's usually a sign that there's too much there and I need to break it up.
One function per file could get messy in my opinion. Imagine if POSIX and ANSI C headers were made the same way.
#include <strlen.h>
#include <strcpy.h>
#include <strncpy.h>
#include <strchr.h>
#include <strstr.h>
#include <malloc.h>
#include <calloc.h>
#include <free.h>
#include <printf.h>
#include <fprintf.h>
#include <vprintf.h>
#include <snprintf.h>
One class per file is a good idea though.
We use the principle of one external function per file. However, within this file there may be several other "helper" functions in unnamed namespaces that are used to implement that function.
In our experience, contrary to some other comments, this has had two main benefits. The first is that build times are faster, as modules only need to be rebuilt when their specific APIs are modified. The second is that, by using a common naming scheme, it is never necessary to spend time searching for the header that contains the function you wish to call:
// getShapeColor.h
Color getShapeColor(Shape);
// getTextColor.h
Color getTextColor(Text);
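As a rough illustration of "one external function per file, with helpers in an unnamed namespace", the sketch below shows what getShapeColor.h and getShapeColor.cpp might contain, merged into one listing. The Color and Shape definitions, the isRound helper and the colour rule are all invented for the example; the answer does not show them.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real Color and Shape types.
enum class Color { Red, Blue };
struct Shape { std::string kind; };

// --- getShapeColor.h: the single external function ---
Color getShapeColor(Shape shape);

// --- getShapeColor.cpp ---
namespace {
// Internal linkage: not visible to other translation units, so this helper
// can change without touching getShapeColor.h or forcing clients to rebuild.
bool isRound(const Shape& shape) {
    return shape.kind == "circle" || shape.kind == "ellipse";
}
}  // namespace

Color getShapeColor(Shape shape) {
    assert(!shape.kind.empty());  // hypothetical precondition
    return isRound(shape) ? Color::Red : Color::Blue;
}
```

Callers include only getShapeColor.h, so only a change to that one declaration triggers a rebuild of their code.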
I disagree that the standard library is a good example for not using one (external) function per file. Standard libraries never change and have well defined interfaces and so neither of the points above apply to them.
That being said, even in the case of the standard library there are some potential benefits in splitting out the individual functions. The first is that compilers could generate a helpful warning when unsafe versions of functions are used, e.g. strcpy vs. strncpy, in a similar way to how g++ used to warn for inclusion of <iostream.h> vs. <iostream>.
Another advantage is that I would no longer be caught out by including <memory> when I want to use memmove!
One function per file has a technical advantage if you're making a static library (which I guess is one of the reasons why projects like musl libc follow this pattern).
Static libraries are linked with object-file granularity and so if you have a static library libfoobar.a composed of*:
foo.o
    foo1
    foo2
bar.o
    bar
then if you link the lib for the bar function, the bar.o archive member will get linked but not the foo.o member. If you link for foo1, then the foo.o member will get linked, bringing in the possibly unnecessary foo2 function.
There are possibly other ways of preventing unneeded functions from being linked in (-ffunction-sections -fdata-sections and --gc-sections) but one function per file is probably most reliable.
There's also the middle ground of putting a small number of related functions/data objects in a file. That way the compiler can optimize intersymbol references better than with -ffunction-sections/-fdata-sections, and you still get at least some granularity for static libs.
* I'm ignoring C++ name mangling here for the sake of simplicity.
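The libfoobar.a example could be reproduced roughly as follows. The function bodies are placeholders, main.cpp is a hypothetical client, and the commands in the comments are the usual GNU toolchain steps, shown for orientation rather than as a tested recipe.

```cpp
#include <cassert>

// --- foo.cpp -> foo.o: two functions share one archive member ---
int foo1() { return 1; }
int foo2() { return 2; }

// --- bar.cpp -> bar.o: one function, one archive member ---
int bar(int x) {
    assert(x >= 0);  // hypothetical precondition
    return 2 * x;
}

// Building and inspecting the library (main.cpp is assumed to call only bar):
//   g++ -c foo.cpp bar.cpp           produces foo.o and bar.o
//   ar rcs libfoobar.a foo.o bar.o   bundles both members into the archive
//   g++ main.cpp libfoobar.a         pulls in bar.o; foo.o is left out
//   nm -C a.out                      foo1 and foo2 should not be listed
//
// If main.cpp called foo1 instead, all of foo.o would be linked, foo2 included.
// The section-based alternative compiles with
//   -ffunction-sections -fdata-sections
// and links with
//   -Wl,--gc-sections
```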
I can see some advantages to your approach, but there are several disadvantages:
Including a package is a nightmare. You can end up with 10-20 includes to get the functions you need. For example, imagine if stdio or stdlib were implemented this way.
Browsing the code will be a bit of a pain, since in general it is easier to scroll through a file than to switch files. Obviously too big a file is hard to work with, but even there modern IDEs make it pretty easy to collapse the file down to what you need, and a lot of them have function shortcut lists.
Makefile maintenance is a pain.
I am a huge fan of small functions and refactoring. When you add overhead (making a new file, adding it to source control,...) it encourages people to write longer functions where instead of breaking one function into three parts, you just make one big one.
You can redeclare some of your functions as being static methods of one or more classes: this gives you an opportunity (and a good excuse) to group several of them into a single source file.
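A small sketch of that idea, assuming a hypothetical MathUtil class whose two static functions previously lived in separate files:

```cpp
#include <cassert>

// --- MathUtil.h (hypothetical): related free functions grouped as statics ---
struct MathUtil {
    static int clamp(int value, int lo, int hi);
    static int wrap(int value, int size);
};

// --- MathUtil.cpp: one source file for the whole group ---

// Restrict value to the closed range [lo, hi].
int MathUtil::clamp(int value, int lo, int hi) {
    assert(lo <= hi);
    return value < lo ? lo : (value > hi ? hi : value);
}

// Map any integer onto [0, size), also for negative input.
int MathUtil::wrap(int value, int size) {
    assert(size > 0);
    const int r = value % size;  // remainder has the sign of value
    return r < 0 ? r + size : r;
}
```

Callers write MathUtil::clamp(x, 0, 3), which also documents where the function is declared. A namespace would give the same grouping without a class.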
One good reason for having or not having several functions in one source file applies when source files are one-to-one with object files and the linker links entire object files: if an executable might want one function but not another, then put them in separate source files (so that the linker can link one without the other).
An old programming professor of mine suggested breaking up modules every several hundred lines of code for maintainability. I don't develop in C++ anymore, but in C# I restrict myself to one class per file, and the size of the file doesn't matter as long as there's nothing unrelated to my object. You can use #pragma region to collapse code in the editor; I'm not sure whether the C++ compiler has it, but if it does, definitely make use of it.
If I were still programming in C++ I would group functions by usage using multiple functions per file. So I may have a file called 'Service.cpp' with a few functions that define that "service". Having one function per file will in turn cause regret to find its way back into your project somehow, someway.
Having several thousand lines of code per file isn't necessary some of the time though. Functions themselves should never be much more than a few hundred lines of code at most. Always remember that a function should only do one thing and be kept minimal. If a function does more than one thing, it should be refactored into helper methods.
It never hurts to have multiple source files that define a single entity either, e.g. 'ServiceConnection.cpp', 'ServiceSettings.cpp', and so on.
Sometimes if I make a single object, and it owns other objects I will combine multiple classes into a single file. For example a button control that contains 'ButtonLink' objects, I might combine that into the Button class. Sometimes I don't, but that's a "preference of the moment" decision.
Do what works best for you. Experimenting a little with different styles on smaller projects can help. Hope this helps you out a bit.
I also tried splitting my code into one function per file, but it had some drawbacks. Sometimes functions tend to get larger than they need to (you don't want to add a new .c file every time) unless you are diligent about refactoring your code (I am not).
Currently I put one to three functions in a .c file and group all the .c files for a functionality in a directory. For header files I have Functionality.h and Subfunctionality.h, so that I can include all the functions at once when needed, or just a small utility function if the whole package is not needed.
For the header part, you should combine items into logical groupings and create your header files based on that. This seems and is very logical IMHO.
For the source part, you should put each function implementation in a separate source file (static functions are exceptions in this case). This may not seem logical at first, but remember, a compiler knows about the functions, whereas a linker knows only about the .o and .obj files and their exported symbols. This may change the size of the output file considerably, which is a very important issue for embedded systems.
Check out the glibc or Visual C++ CRT source trees...