We have these set of "utility" constants defined in a series of file. The problem arises from the fact that TOO MANY files include these global constant files, that, if we add a constant to one of those files and try to build, it builds the whole entire library, which takes up more than an hour.
Could anyone suggest a better way for this approach? That would be greatly appreciated.
First, if you are defining them directly in the header, I'd suggest instead delcaring them extern const, and then defining them in a cpp file:
//in .hpp:
extern const std::string foo;
//in .cpp:
const std::string foo = "FOO";
That way, at least definitions can be changed without a rebuild.
Second, examine where they are being included. If the constant file is being included in a low level header, can the include be moved to the cpp instead? Removing it might lower the coupling so it doesn't have to rebuild as much.
Third, break up that file. I'd suggest mapping out a structure you'd eventually want, start adding new constants to the new structure instead of the old file. Eventually (when you are sure you've got the structure you want), refactor the old file into the new structure, and make the old file include the entire structure. Finally, go through and remove all includes of the old file, pointing them at the appropriate new sections. That'll break up the refactoring so you don't have to do it all at once.
And fourth, you might be able to trick your compiler into not rebuilding if the header file changes. You'd have to check your compiler's documentation, and it might be unsafe, so you'd occasionally want to add full builds as well.
Do you really need every global define to be included in every file? You should probably split the constants up into categories and then split them into different files.
Every .h that is included is simply copied at that point in the file that is including it. If you change something in a file (either directly or via changing something that is included) then it absolutely needs to be recompile.
Another solution would be to have a .h file that has an accessor to a map of string name/values. Then in the .cpp file of that map/accessor you can insert the new values. Every new value you put would only need 1 file to be recompiled.
Another solution is not to include the header file anywhere. Simply extern in the variables you need in each .cpp file.
Perhaps it's time to do some refactoring to improve the cohesion and reduce the coupling in your software design. Splitting the global constant files would allow modules to be more selective about which constants need to be included, which will eliminate some of the
unnecessary coupling. In the extreme case, you could break it all the way down to one constant per file, and ensure that each module only includes the constants it needs to use.
But that could result in poor cohesion, in the sense that the constants might naturally fall into related groups, such that a module that requires one constant will generally also
require many others from that group. So the trick is to find a better grouping of constants in the various global files, then ensure that each module only includes what it needs.
(Edit: I didn't think of external constants. Indeed, my idea is kinda stupid.)
(Edit: My "stupid" macro idea actually saves build time for when constants are added. Thanks for pointing that out, Brian!)
Use parallel building :-)
Seriously, I think one solution would be to create another header called utility_ex.hpp or something where you add new constants that you occasionally merge into utility.hpp (or whatever your current utility constants header is called).
Another (less efficient) solution would be to have a macro like this:
#define constant(name) get_constant(#name)
// # means turn name into a string literal
int get_constant(const char *name);
Now suppose you want MAX_CUSTOMERS to be defined as 100. You can say:
constant(MAX_CUSTOMERS)
in the code. In get_constant's code, you might have:
int get_constant(const char *name) {
if (!strcmp(name, "MAX_CUSTOMERS"))
return 100;
//shouldn't happen
return -1;
}
Related
I am working on a game,
I have a huge list of const global variables .h file which many .cpp file relies on.
(Enemies would like to know the max hp of players etc)
However, this is a compilation nightmare when any variable is changed in the file for game balancing.
I would like to prevent splitting up the files into multiple headers for the ease of the guy in charge of game balancing.
However, recategorizing is kind of a pain (clarity vs compile time, Player's maximum hp is used in many .cpp but it should belong to the PlayerVariables.h, including it in every file that uses it kind of destroys the purpose of splitting them up).
Is there a way to save some time other than the method above?
Just declare the global variables in the header file. And define them in another source file. That way if any of the values are changed only the globals.cpp needs to be recompiled since everything includes globals from the hpp.
In globals.hpp:
extern const unsigned PLAYER_MAX_HEALTH;
In globals.cpp:
#include "globals.hpp"
const unsigned PLAYER_MAX_HEALTH = 100;
You could try adding a configure-step with the preprocessor:
Only have preprocessor #defines in the config.h
Split it with the preprocessor into multiple headers only pulling in those definitions they need to declare actual const variables / enum-values / whatever.
Only copy those files which are different from the previous run over the previous version.
Optionally also have a glabals.cpp defining those symbols which might also need a definition.
Make sure to (re-)run make after the configuration was updated.
The advantage over just declaring external constant variables and defining and initializing them in a different translation-unit is constant-propagation and having proper compile-time-constants.
Many other questions deal with how to allocate a variable by declaring it in a header file and defining it (allocating) in a .cpp file.
What I want to do is not use any .cpp files for my class, and to define all functions as inline (in the header file). The problem that I run into is how to define static member variables so that even when the .h file is included in multiple compilation units I don't get the "first defined here" linker error.
I'm open to preprocessor hacks, etc. if it gets the job done. I just want to avoid any .cpp files.
If it matters I'm using GCC.
You can abuse the singleton pattern if you really must avoid any .cpp files:
class Foo {
public:
static Bar& getMyStatic() {
static Bar bar;
return bar;
};
};
This works because now the variable is a static variable inside a function, and static has a different meaning within a function context than within a class context. And for functions, the linker does recognize multiple identical definitions and throws away the copies.
But, of course, I would strongly advise against avoiding .cpp files: It means that you get into a situation where you have to build the entire program, or at least large parts of it, in one big piece. Every change you do will necessitate a complete rebuilt which slows down your change-compile-test cycle significantly. For very small projects that might not be a problem, but it is for medium to large ones.
With static variables you have to put in a .cpp file to avoid the possibility of multiple static variables when the intention is to have just the one. Besides it is not a good idea to have large inline methods as it is only a hint to the compiler but also makes compilation take longer (you change some of those functions in development and then lots of dependent files will need to get compiled!)
However if you do not want lots of .cpp files with just a few statics in it why not have just one file to store them in.
As long as you only include that header file once in your whole project, you'll be OK. However, that's a pretty strong requirement, and can be difficult to make others adhere to.
You could have a static variable, but that means you have more than one for the entire program, which may or may not matter (bear in mind that you can't change it in the future, so you may have what's known as a "latent bug" - you change some other code, and all of a sudden you have created a new bug, because the variable isn't ONE variable).
I have seen many explanations on when to use forward declarations over including header files, but few of them go into why it is important to do so. Some of the reasons I have seen include the following:
compilation speed
reducing complexity of header file management
removing cyclic dependencies
Coming from a .net background I find header management frustrating. I have this feeling I need to master forward declarations, but I have been scrapping by on includes so far.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
I can buy the argument for reduced complexity, but what would a practical example of this be?
"to master forward declarations" is not a requirement, it's a useful guideline where possible.
When a header is included, and it pulls in more headers, and yet more, the compiler has to do a lot of work processing a single translation module.
You can see how much, for example, with gcc -E:
A single #include <iostream> gives my g++ 4.5.2 additional 18,560 lines of code to process.
A #include <boost/asio.hpp> adds another 74,906 lines.
A #include <boost/spirit/include/qi.hpp> adds 154,024 lines, that's over 5 MB of code.
This adds up, especially if carelessly included in some file that's included in every file of your project.
Sometimes going over old code and pruning unnecessary includes improves the compilation dramatically just because of that. Replacing includes with forward declarations in the translation modules where only references or pointers to some class are used, improves this even further.
Why cannot the compiler work for me and figure out my dependencies using one mechanism (includes)?
It cannot because, unlike some other languages, C++ has an ambiguous grammar:
int f(X);
Is it a function declaration or a variable definition? To answer this question the compiler must know what does X mean, so X must be declared before that line.
Because when you're doing something like this :
bar.h :
class Bar {
int foo(Foo &);
}
Then the compiler does not need to know how the Foo struct / class is defined ; so importing the header that defines Foo is useless. Moreover, importing the header that defines Foo might also need importing the header that defines some other class that Foo uses ; and this might mean importing the header that defines some other class, etc.... turtles all the way.
In the end, the file that the compiler is working against is almost like the result of copy pasting all the headers ; so it will get big for no good reason, and when someone makes a typo in a header file that you don't need (or import , or something like that), then compiling your class starts to take waaay too much time (or fail for no obvious reason).
So it's a good thing to give as little info as needed to the compiler.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
1) reduced disk i/o (fewer files to open, fewer times)
2) reduced memory/cpu usage
most translations need only a name. if you use/allocate the object, you'll need its declaration.
this is probably where it will click for you: each file you compile compiles what is visible in its translation.
a poorly maintained system will end up including a ton of stuff it does not need - then this gets compiled for every file it sees. by using forwards where possible, you can bypass that, and significantly reduce the number of times a public interface (and all of its included dependencies) must be compiled.
that is to say: the content of the header won't be compiled once. it will be compiled over and over. everything in this translation must be parsed, checked that it's a valid program, checked for warnings, optimized, etc. many, many times.
including lazily only adds significant disk/cpu/memory increase, which turns into intolerable build times for you, while introducing significant dependencies (in non-trivial projects).
I can buy the argument for reduced complexity, but what would a practical example of this be?
unnecessary includes introduce dependencies as side effects. when you edit an include (necessary or not), then every file which includes it must be recompiled (not trivial when hundreds of thousands of files must be unnecessarily opened and compiled).
Lakos wrote a good book which covers this in detail:
http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1?ie=UTF8&s=books&qid=1304529571&sr=8-1
Header file inclusion rules specified in this article will help reduce the effort in managing header files.
I used forward declarations simply to reduce the amount of navigation between source files done. e.g. if module X calls some glue or interface function F in module Y, then using a forward declaration means the writing the function and the call can be done by only visiting 2 places, X.c and Y.c not so much of an issue when a good IDE helps you navigate, but I tend to prefer coding bottom-up creating working code then figuring out how to wrap it rather than through top down interface specification.. as the interfaces themselves evolve it's handy to not have to write them out in full.
In C (or c++ minus classes) it's possible to truly keep structure details Private by only defining them in the source files that use them, and only exposing forward declarations to the outside world - a level of black boxing that requires performance-destroying virtuals in the c++/classes way of doing things. It's also possible to avoid needing to prototype things (visiting the header) by listing 'bottom-up' within the source files (good old static keyword).
The pain of managing headers can sometimes expose how modular your program is or isn't - if its' truly modular, the number of headers you have to visit and the amount of code & datastructures declared within them should be minimized.
Working on a big project with 'everything included everywhere' through precompiled headers won't encourage this real modularity.
module dependancies can correlate with data-flow relating to performance issues, i.e. both i-cache & d-cache issues. If a program involves many modules that call each other & modify data at many random places, it's likely to have poor cache-coherency - the process of optimizing such a program will often involve breaking up passes and adding intermediate data.. often playing havoc with many'class diagrams'/'frameworks' (or at least requiring the creation of many intermediates datastructures). Heavy template use often means complex pointer-chasing cache-destroying data structures. In its optimized state, dependancies & pointer chasing will be reduced.
I believe forward declarations speed up compilation because the header file is ONLY included where it is actually used. This reduces the need to open and close the file once. You are correct that at some point the object referenced will need to be compiled, but if I am only using a pointer to that object in my other .h file, why actually include it? If I tell the compiler I am using a pointer to a class, that's all it needs (as long as I am not calling any methods on that class.)
This is not the end of it. Those .h files include other .h files... So, for a large project, opening, reading, and closing, all the .h files which are included repetitively can become a significant overhead. Even with #IF checks, you still have to open and close them a lot.
We practice this at my source of employment. My boss explained this in a similar way, but I'm sure his explanation was more clear.
How do forward declarations speed up compilations since at some point the object referenced will need to be compiled?
Because include is a preprocessor thing, which means it is done via brute force when parsing the file. Your object will be compiled once (compiler) then linked (linker) as appropriate later.
In C/C++, when you compile, you've got to remember there is a whole chain of tools involved (preprocessor, compiler, linker plus build management tools like make or Visual Studio, etc...)
Good and evil. The battle continues, but now on the battle field of header files. Header files are a necessity and a feature of the language, but they can create a lot of unnecessary overhead if used in a non optimal way, e.g. not using forward declarations etc.
How do forward declarations speed up
compilations since at some point the
object referenced will need to be
compiled?
I can buy the argument for reduced
complexity, but what would a practical
example of this be?
Forward declarations are bad ass. My experience is that a lot of c++ programmers are not aware of the fact that you don't have to include any header file, unless you actually want to use some type, e.g. you need to have the type defined so the compiler understands what you want to do. It's important to try and refrain from including header files in other header files.
Just passing around a pointer from one function to another, only requires a forward declaration:
// someFile.h
class CSomeClass;
void SomeFunctionUsingSomeClass(CSomeClass* foo);
Including someFile.h does not require you to include the header file of CSomeClass, since you are merely passing a pointer to it, not using the class. This means that the compiler only needs to parse one line (class CSomeClass;) instead of an entire header file (that might be chained to other header files etc etc).
This reduces both compile time and link time, and we are talking big optimizations here if you have many headers and many classes.
I have a macro definition in header file like this:
// header.h
ARRAY_SZ(a) = ((int) sizeof(a)/sizeof(a[0]));
This is defined in some header file, which includes some more header files.
Now, i need to use this macro in some source file that has no other reason to include header.h or any other header files included in header.h, so should i redefine the macro in my source file or simply include the header file header.h.
Will the latter approach affect the code size/compile time (I think yes), or runtime (i think no)?
Your advice on this!
Include the header file or break it out into a smaller unit and include that in the original header and in your code.
As for code size, unless your headers do something incredibly ill-advised, like declaring variables or defining functions, they should not affect the memory footprint much, if at all. They will affect your compile time to a certain degree as well as polluting your name space.
Including the header in the source file might affect compile time slightly unless you are using pre-compiled headers. It shouldn't affect the code size though. Redefining the macro shouldn't have any effect on compile time or size. It is more of a maintenance and consistency issue though.
should i redefine the macro in my source file or simply include the header file header.h.
Neither. Instead you should clean up the code and break header.h so that one can use ARRAY_SZ() without also getting unrelated stuff.
You ask:
Will the latter approach affect the
code size/compile time (I think yes)
In the case of the specific macro, the answer is "no" to the size, because the sizeof expression can be evaluated at compile time, and therefore "yes" to the time. Neither are likely to be remotely significant.
Unless you're running this on a really limited bit of hardware, or this is called billions and billions of times, you won't notice any difference between the two at either compile time or run time.
Go for whatever seems more readable / maintainable.
Personally, I'd suggest there are better ways of achieving what you're doing there without involving macros (namely inline functions and/or function templates). You have to be careful using your solution because there are a few gotchas you need to keep an eye on.
Including that header and all other headers included into it will increase the compile time. It can affect runtime if there're other definitions that will change how your code compiles - if your code compiles differently because of those defines it will of course run differently. Although the latter is not usual be careful.
Should you declare the getters/setters of the class inside the .h file and then define them in .cpp Or do both in .h file. Which style do you prefer and why? I personally like the latter wherein all of them are in .h and only methods which have logic associated with it other than setters/getters in .cpp.
For me it depends on who's going to be using the .h file. If it's a file largely internal to a module, then I tend to put the tiny methods in the header. If it's a more external header file that presents a more fixed API, then I'll put everything in the .cpp files. In this case, I'll often use the PIMPL Idiom for a full compilation firewall.
The trade-offs I see with putting them in the headers are:
Less typing
Easier inlining for the compiler (although compilers can sometimes do inlining between multiple translation units now anyway.)
More compilation dependencies
I would say that header files should be about interface, not implementation. I'd put them in the .cpp.
For me this depends on what I'm doing with the code. For code that I want maintained and to last over time, I put everything in the .cc file for the following reasons:
The .h file can remain sparse as documentation for people who want to look for function and method definitions.
My group's coding guidelines state that we put everything in the .cpp file and like to follow those, even if the function definition only takes one line. This eliminates guessing games about where things actually live, because you know which file you should examine.
If you're doing frequent recompiles of a big project, keeping the function definition in the .cpp file saves you some time compared to keeping function definitions in header files. This was relevant very recently for us, as we recently went through the code and added a lot of runtime assert statements to validate input data for our classes, and that required a lot of modification to getters and setters. If these method declarations had lived in .cpp files, this would have turned into a clean recompile for us, which can take ~30min on my laptop.
That's not to say that I don't play fast-and-dirty with the rules occasionally and put things in .h files when implementing something really fast, but for code I'm serious about I all code (regardless of length) in the .cpp file. For big projects (some of) the rules are there for a reason, and following them can be a virtue.
Speaking of which, I just thought of yet another Perl script I can hack together to find violations of the coding guidelines. It's good to be popular. :)
I put put all single-liners in the header as long as they do not require too much additional headers included (because calling methods of other classes).
Also I do not try to put all code in one line so I can put most of the methods in the header :-)
But Josh mentioned a good reason to put them in the .cpp anyway: if the header is for external use.
I prefer to keep the .h file as clean as possible. Therefore, small functions that are as simple as get/set I often use to put in a separate file as inline-defined functions, and then include that file (where I use the extension .inl) into the .h header file:
// foo.h
class foo
{
public:
int bar() const;
private:
int m_bar;
};
#include "foo.inl"
// foo.inl
inline
int foo::bar() const
{
return m_bar;
}
I think that this gives the best of two worlds, at the same time hiding most of the implementation from the header, and still keep the advantage of inlining simple code (as a rule of thumb I keep it within at most 3 statements).
I pretty much always follow the division of declaring them in the header, and defining in the source. Every time I don't, I end up having to go back and do it any way later.
I prefer to put them into the .cpp file, for the sake of fast compile/link times. Even tiny one-liners (empty virtual destructors!) can blow up your compile times, if they are instantiated a lot. In one project, I could cut the compile time by a few seconds by moving all virtual destructors into the .cpp files.
Since then, I'm sold on this, and I would only put them into the header again if a profiler tells me that I can profit from inlining. Only downside is you need more typing, but if you create the .cpp file while you write the header, you can often just copy&paste the declarations and fill them out in the .cpp file, so it's not that bad. Worse of course if you later find out you want to move stuff into a .cpp file.
A nice side effect is that reading stuff is simpler when you have only documentation and declarations in your header, especially if new developers join the project.
I use next rule: header for declaration, code file for realization. It becomes for actual when your header would be use outside of project - than more lightweight your header is, then it's more comfort in use