I've been writing a few classes lately, and I was wondering whether it's bad practice, bad for performance, a breach of encapsulation, or otherwise inherently wrong to define some of the smaller member functions inside a header (I did try Google!). Here's an example of a header I've written with a lot of this:
class Scheduler {
public:
typedef std::list<BSubsystem*> SubsystemList;
// Make sure the pointer to entityManager is zero on init
// so that we can check if one has been attached in Tick()
Scheduler() : entityManager(0) { }
// Attaches a manager to the scheduler - used by Tick()
void AttachEntityManager( EntityManager &em )
{ entityManager = &em; }
// Detaches the entityManager from a scheduler.
void DetachEntityManager()
{ entityManager = 0; }
// Adds a subsystem to the scheduler; executed on Tick()
void AddSubsystem( BSubsystem* s )
{ subsystemList.push_back(s); }
// Removes the subsystem of a type given
void RemoveSubsystem( const SubsystemTypeID& );
// Executes all subsystems
void Tick();
// Destroys subsystems that are in subsystemList
virtual ~Scheduler();
private:
// Holds a list of all subsystems
SubsystemList subsystemList;
// Holds the entity manager (if attached)
EntityManager *entityManager;
};
So, is there anything that's really wrong with inlining functions like this, or is it acceptable?
(Also, I'm not sure if this'd be more suited towards the 'code review' site)
Inlining increases coupling, and increases "noise" in the class
definition, making the class harder to read and understand. As a
general rule, inlining should be considered as an optimization measure,
and only used when the profiler says it's necessary.
There are a few exceptions: I'll always inline the virtual destructor of
an abstract base class if all of the other functions are pure virtual;
it seems silly to have a separate source file just for an empty
destructor, and if all of the other functions are pure virtual, and
there are no data members, the destructor isn't going to change without
something else changing. And I'll occasionally provide inlined
constructors for "structures"—classes in which all data members
are public, and there are no other functions. I'm also less rigorous
about avoiding inline in classes which are defined in a source file,
rather than a header—the coupling issues obviously don't apply in
that case.
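As a rough sketch of the first exception above: an abstract base class whose only non-pure member is an empty virtual destructor defined in the header. The names `Drawable`, `Square`, and `totalArea` are hypothetical and not taken from the question.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical interface: every function except the destructor is pure
// virtual and there are no data members, so the in-header destructor
// adds no real coupling.
class Drawable {
public:
    virtual ~Drawable() {}            // defined in the header; no .cpp needed
    virtual double area() const = 0;
};

// A concrete class would normally live in its own header/source pair.
class Square : public Drawable {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// Works purely through the interface; the unique_ptr elements are destroyed
// through the base class, which is why the destructor must be virtual.
inline double totalArea(const std::vector<std::unique_ptr<Drawable>>& items) {
    double sum = 0.0;
    for (const auto& item : items) sum += item->area();
    return sum;
}
```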
All of your member functions are one-liners, so in my opinion that's acceptable. Note that inline functions may actually decrease code size (!!), because optimizing compilers pad (non-inline) functions so that they start on aligned block boundaries.
In order to make your code more readable I would suggest to use inline definitions as follows:
class Scheduler
{
...
void DetachEntityManager(); // no Scheduler:: qualification inside the class
...
};
inline void Scheduler::DetachEntityManager()
{
entityManager = 0;
}
In my opinion that's more readable.
I think inlining (if I understood you right, you mean the habit of writing trivial code right into the header file, and not the compiler behaviour) aids readability by two factors:
It distinguishes trivial methods from non-trivial ones.
It makes the effect of trivial methods available at a glance, being self-documenting code.
From a design POV, it doesn't really matter. You are not going to change your inlined method without changing the subsystemList member, and a recompile is necessary in both cases. Inlining does not affect encapsulation, since the method is still a method with a public interface.
So, if the method is a dumb one-liner without a need for lengthy documentation or a conceivable need of change that does not encompass an interface change, I'd advise to go for inlining.
It will increase executable size and in some occasions this will lead to worse performance.
Keep in mind that an inline method requires its source code to be visible to whoever uses it (i.e. the code is in the header). This means that a small change in the implementation of your inlined methods will cause a recompilation of everything that uses the header where the inline method is defined.
On the other hand, it is a small performance increase, it's good for short methods that are called really frequently, since it will save you the typical overhead of calling to methods.
Inline methods are fine if you know where to use them and don't spam them.
Edit:
Regarding style and encapsulation, using inline methods prevents you from using things like pointer to implementation (pimpl), forward declarations, etc., since your code is in the header.
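To illustrate the pimpl point, here is a minimal sketch (the class `Widget`, its members, and the file names in the comments are hypothetical). The accessors cannot be defined in the header because `Impl` is only forward-declared there.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- what would go in the header (hypothetical widget.h) ---
class Widget {
public:
    Widget();
    ~Widget();                          // must be defined where Impl is complete
    void setName(const std::string& n);
    const std::string& name() const;   // cannot be inline here: Impl is incomplete
private:
    struct Impl;                        // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// --- what would go in the source file (hypothetical widget.cpp) ---
struct Widget::Impl {
    std::string name;
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;

void Widget::setName(const std::string& n) {
    assert(!n.empty());                 // hypothetical precondition for this sketch
    impl_->name = n;
}

const std::string& Widget::name() const { return impl_->name; }
```

Changing `Widget::Impl` then only forces a recompile of the one source file, at the price of an extra indirection on every access.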
Inlining has three "drawbacks" at least:
inline functions are at odds with the virtual keyword (I mean conceptually, IMO, either you want a piece of code to be substituted for the function call, or you want the function call to be virtual, i.e. polymorphic; anyway, see also this for more details as to when it could make sense practically);
your binary code will be larger;
if you include the inline method in the class definition, you reveal implementation detail.
Apart from that it is plainly ok to inline methods, although it is also true that modern compilers are already sufficiently smart to inline methods on their own when it makes sense for performance. So, in a sense I think it is better to leave it to the compiler altogether...
Member functions defined inside the class body are implicitly inline. Also, inline is a suggestion and not a command. Compilers are generally smart enough to judge whether to inline a function or not.
You can refer to this similar question.
In fact you can write all your functions in the header file, if the function is too large the compiler will automatically not inline the function. Just write the function body where you think it fits best, let the compiler decide. The inline keyword is ignored often as well, if you really insist on inlining the function use __forceinline or something similar (I think that is MS specific).
Related
I'm working on a project and to clean the code up in a big function, there is a segment of it that I think should be a separate function. But that separate function will only be used once, inside that bigger function. How should I treat it? Should it just be a normal void or is there a keyword I can throw before it? Could it be an inline function? I've heard of those but don't totally understand what they do. Thanks!
Stick it in the same source (.cpp) file. Place it within a namespace {} -- an anonymous namespace. This guarantees it cannot be used/referred to outside of that source file, which communicates somewhat useful information to both developers and compilers.
inline all by itself is a bad idea due to potential ODR violations (if another independent function with the same signature and name exists, bad things would happen). For "famous" functions in header files the risk is mitigated somewhat. inline once you put it in an anonymous namespace is innocuous, and may give a compiler a hint that may be useful. It probably does not matter.
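A minimal sketch of that advice, assuming a single source file; `sumOfSquares` and `meanSquare` are made-up names standing in for the helper and the bigger function.

```cpp
#include <cassert>
#include <vector>

namespace {

// Internal linkage: invisible to other translation units, so it cannot
// clash with a same-named function defined elsewhere.
int sumOfSquares(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v * v;
    return total;
}

} // namespace

// The "big" function, which calls the helper exactly once.
double meanSquare(const std::vector<int>& values) {
    assert(!values.empty());
    return static_cast<double>(sumOfSquares(values)) / values.size();
}
```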
inline is used for small and simple functions where you want to avoid the overhead of calling the function. It basically copies the code of the inline function inside your bigger function.
http://www.cplusplus.com/articles/2LywvCM9/
Assuming the run-once code needs to live inside a namespace, I instantiate a RunOnce object and pass it a lambda:
namespace mystuff {
int somevars = 5;
RunOnce initialize([&](){
// do one-time initialization here
somevars = 0;
});
}
Where,
class RunOnce {
public:
RunOnce(std::function<void(void)> init) {
init();
}
};
It is normally recommended to place code with a single, clearly defined responsibility into a dedicated function, even if that function is only called once. At least this is what I have seen in all books on the topic.
This makes the code more readable and maintainable, and you can now write a unit test for the extracted function. If the function is unlikely to require dedicated testing (because it is well covered by wider unit tests, etc.), you can still define it separately in the same file with the static keyword, so it does not conflict with anything outside.
I have a long and confusing static method in one of my classes. It is full of error checking code and as a consequence is turning into unreadable spaghetti! It looks something like this:
void myMethod(int foo, int bar)
{
int y = functionCall(foo);
if (!y)
{
int x = functionCall(bar);
if (!x)
{
// lots of code with further nested ifs for error checking
// it all starts to get a bit confusing
}
else
{
// error handling
}
}
else
{
// error handling
}
}
So that the code is readable and more modular (allowing me to more easily test/ maintain etc) I would like to break it down into some smaller functions. This is not about code reuse as the functions will only ever be called from this one place - it is purely about readability and making 100's of lines of complicated code more understandable for humans.
So my question is this.
If I am to do this will I lose efficiency as I am making unnecessary calls and so extra work for the processor?
If I make these smaller functions should I declare them inline to help the linker realise that they are only used by this one function and should be blown up in place?
Will the linker be able to manage this kind of optimization itself?
Finally if I am to declare it inline what is the correct way to do this?
Should I put the inline function declaration in the header file and the code body in the .cpp file?
i.e.
in MyClass.hpp :
inline static int myMethodPart1();
in MyClass.cpp
int MyClass::myMethodPart1()
{ /* body */ }
Or should I perhaps not declare it in the header or ..... ?
With regards to how to organize and divide the code, I would say you need to use good judgment. If you feel it is a problem big enough to post here, then addressing it is probably worth the effort. That said, I will try to address the components of your post individually.
The cost of function calls. Function calls are very cheap. At the machine level a call is essentially a jump to another address plus a little bookkeeping for the return address and arguments. Similar jumps already happen in loops, conditionals, and other forms of branching. Declaring:
while (x != 0)
{
doStuff();
}
compiles, at a low level, to a jump back to the start of the body on every iteration, which is not far from what calling “doStuff()” as a separate function repeatedly would cost. As such, the cost of splitting your function into multiple functions is low and possibly, if done cleanly and with a smart compiler, non-existent.
Regarding inlining. As I explained in the comments, the inline keyword does not mean (quite) what you think it means and what it suggests it means. Compilers have a tendency to ignore inline with regards to actually inlining the function, and at best take it as a suggestion. What inline does do is prevent multiple definitions of the function from becoming an error. This is important behavior if you define a function within a header, because that function definition will be compiled into every cpp's object file. If not declared inline, linking these objects into an executable can generate a multiple definition error. Some compilers implicitly inline functions defined in such a way, but you should never depend upon compiler-specific behavior.
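As a sketch of what that means in practice, the following shows the contents of a hypothetical header that several .cpp files include. The names `percentOf` and `Progress` are invented for illustration.

```cpp
#include <cassert>

// Contents of a hypothetical header, e.g. "progress.h".

// A free function defined in a header needs `inline`; without it, every
// .cpp that includes the header emits its own external definition and the
// linker can report a multiple-definition error.
inline int percentOf(int part, int whole) {
    assert(whole != 0);   // precondition
    return part * 100 / whole;
}

// Member functions defined inside the class body are implicitly inline,
// so no keyword is needed on percent().
struct Progress {
    int done;
    int total;
    int percent() const { return percentOf(done, total); }
};
```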
Actually getting a function inlined is to an extent up to the good graces of the compiler. I have seen it stated, although I cannot now find where, that defining a function within the class declaration (in the header) is a fairly strong nod to the compiler to inline.
That said, as I noted before, inlining is not a particularly important matter. The cost of calling a function is insanely low, and really the only area one should be concerned about it is in functions called often - like getter and setter functions.
How to use inline. Having established that inline usually doesn't inline, your question about using it is mostly addressed above. If you define a function in a header outside the class definition, use the inline keyword to avoid possible linker errors (member functions defined inside the class definition are implicitly inline). Otherwise, it's largely a meaningless keyword to most modern compilers so far as I am aware.
Where to put functions formed from splitting a single function. This is very much an opinion based question but there are two options I think that seem best:
First, you can make it a protected member of the class. If you do so, you should probably include a symbolic nod that this is not a general-purpose function - a leading underscore in the name is typically the symbol for "do not touch."
Alternatively, you can define the extra functions in the .cpp file, not within the class itself. For example [MyClass.cpp]:
void functionA()
{
stuff;
}
void functionB()
{
stuff;
}
void MyClass::myFunction()
{
functionA();
functionB();
}
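Provided the helpers are also declared static or placed in an anonymous namespace, this completely prevents them from being called outside this cpp file; as plain free functions they would still have external linkage. It also prevents calls from child classes, which may or may not be desirable behavior. Use your discretion in choosing where to put them.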
A final note. Be careful about how you divide up complicated functions, or you could end up with something worse than a single function. Moving things elsewhere might only serve to hide the fact the actual logic is messy. I personally find it much simpler to follow a single branching function than one that calls other functions. It is more difficult to read, especially for someone not familiar with the code, if calls are being made outside the function for potentially non-obvious reasons.
It might be beneficial to think how you could reorganize the code to be simpler, if possible, and keep it in one function - or divide it in such a way it would be reusable.
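One possible reorganization, offered only as a sketch, is to turn the nested checks into early returns so the main logic stays at one indentation level inside a single function. The stub `functionCall`, the `Status` enum, and the changed return type are assumptions made for this example; the question's method returns void.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's functionCall(): returns 0 on
// success and a non-zero error code otherwise.
inline int functionCall(int value) { return value < 0 ? 1 : 0; }

enum class Status { Ok, BadFoo, BadBar };

// Same checks as the nested version, but each failure returns immediately.
Status myMethod(int foo, int bar) {
    if (functionCall(foo) != 0) return Status::BadFoo;   // error handling
    if (functionCall(bar) != 0) return Status::BadBar;   // error handling

    // With this particular stub, passing both guards implies both inputs
    // are non-negative.
    assert(foo >= 0 && bar >= 0);

    // ... the "lots of code" would go here, un-nested ...
    return Status::Ok;
}
```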
This is the declaration in the header file:
class PrimeSieve
{
void populate(int lim);
vector<int> sieve;
long long limit_; // a data member cannot share its name with limit()
public:
unsigned int limit();
};
Should I define the accessor method in the .cpp file or in the .h, inline?
I'm new to C++, but I'd like to follow best practices. I've seen this around in some of the books. Is this considered standard?
unsigned int limit() { return limit_; }
Definitely write the accessor inline in the header file. It makes better optimizations possible, and doesn't reduce encapsulation (since changes to the format of private data require recompiling all units that include the header anyway).
In the case of a complicated algorithm, you might want to hide the definition in an implementation file. Or when the implementation requires some types/header files not otherwise required by the class definition. Neither of those cases applies to simple accessors.
For one-liners, put it inside the class definition. Slightly longer member functions should still be in the header file, but might be declared explicitly inline, following the class definition.
Most newer compilers are smart enough to inline what is necessary and leave everything else alone. So let the compiler do what it's good at and don't try to second-guess it.
Put all your code in the .cpp and the code declarations in the .h.
A good rule of thumb is to put all your code in the .cpp file, so this would argue against an inline function in the .h file.
For simple data types in classes fully visible to clients of the class, there is no real difference as you need to recompile the client whenever the class definition changes.
The main reason to make an accessor rather than use the member directly is to allow the implementation to remove the data member later on and still keep the interface compatible; if the interface containing the accessor is unchanged, the result is typically binary compatible, otherwise, it's source compatible. Having the accessor inline means defining it as part of the interface that you are changing, so you can only ever be source compatible.
The other reason to have an accessor is a DLL boundary: If your accessor needs to call into another function, and you allow it to be inlined, then this function's symbol needs to be exported to the client as well.
Depending on the complexity of the project, it can be beneficial to define an interface for your code as an abstract class, which allows you to change the implementation to your heart's content without the client ever seeing the change; in this case, accessors are defined as abstract in the interface class and clients cannot inline them, ever.
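A minimal sketch of that arrangement, with the header and implementation parts shown in one listing. `ISieve`, `SieveImpl`, and `makeSieve` are hypothetical names, not taken from the question.

```cpp
#include <cassert>
#include <memory>

// --- public header (hypothetical): clients see only this ---
class ISieve {
public:
    virtual ~ISieve() {}
    virtual unsigned int limit() const = 0;   // abstract, so clients cannot inline it
};

std::unique_ptr<ISieve> makeSieve(unsigned int limit);

// --- implementation file (hypothetical): can change without touching clients ---
namespace {

class SieveImpl : public ISieve {
public:
    explicit SieveImpl(unsigned int limit) : limit_(limit) {}
    unsigned int limit() const override { return limit_; }
private:
    unsigned int limit_;
};

} // namespace

std::unique_ptr<ISieve> makeSieve(unsigned int limit) {
    assert(limit > 0);   // hypothetical precondition for this sketch
    return std::unique_ptr<ISieve>(new SieveImpl(limit));
}
```

The trade-off is that every accessor call now goes through virtual dispatch.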
The argument for declaring the accessor inline is that this eliminates the call over-head, and can enable some further optimisations.
My experience of measured performance is that the gain from doing this is usually rather modest. I consequently no longer do it by default.
Rather than being global programming standards, these conventions vary from organization to organization. Of course, getLimit() would still be better than mere limit().
I have read several threads about how inline and virtual interact when applied to the same function. In most cases, compilers won't treat such a function as inline. However, does the same principle apply when a non-virtual inline member function calls a virtual function? Say:
class ABC{
public:
void callVirtual(){IAmVirtual();}
protected:
virtual void IAmVirtual();
};
What principle? I would expect the compiler to generate a call to the virtual function. The call (in effect a jump-to-function-pointer) may be inlined but the IAmVirtual function is not.
The virtual function itself is not inline, and it is not called with qualification needed to inline it even if it were, so it can't be inlined.
The whole point of virtual functions is that the compiler generally doesn't know which of the derived class implementations will be needed at run-time, or even if extra derived classes will be dynamically loaded from shared libraries. So, in general, it's impossible to inline. The one case that the compiler can inline is when it happens to know for sure which type it's dealing with because it can see the concrete type in the code and soon afterwards - with no chance of the type having changed - see the call to the virtual function. Even then, it's not required to try to optimise or inline, it's just the only case where it's even possible.
You shouldn't try to fight this unless the profiler's proven the virtual calls are killing you. Then, first try to group a bunch of operations so one virtual call can do more work for you. If virtual dispatch is still just too slow, consider maintaining some kind of discriminated union: it's a lot less flexible and cleanly extensible, but can avoid the virtual function call overheads and allow inlining.
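As one hedged illustration of the discriminated-union idea, here is a hand-rolled tagged struct; `Figure` and `area` are hypothetical names, and `std::variant` would be an alternative in C++17.

```cpp
#include <cassert>

// A tag plus plain data replaces a virtual area() call.
struct Figure {
    enum class Kind { Circle, Rect } kind;
    double a;   // radius for Circle, width for Rect
    double b;   // unused for Circle, height for Rect
};

// The switch on the tag is fully visible to the optimizer and can be
// inlined, at the cost of editing this function for every new kind.
inline double area(const Figure& f) {
    switch (f.kind) {
        case Figure::Kind::Circle: return 3.141592653589793 * f.a * f.a;
        case Figure::Kind::Rect:   return f.a * f.b;
    }
    assert(false && "unknown Figure::Kind");
    return 0.0;
}
```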
All that assumes you really need dynamic dispatch: some programmers and systems over-use virtual functions just because OO was the in thing 20 years ago, or they've used an OO-only language like Java. C++ has a rich selection of compile-time polymorphic mechanisms, including templates.
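A small sketch of compile-time polymorphism with a function template; `Doubler`, `Negator`, and `applyRepeatedly` are invented for this example.

```cpp
#include <cassert>

// Two unrelated hypothetical types sharing a member name: no base class,
// no virtual table.
struct Doubler { int apply(int x) const { return 2 * x; } };
struct Negator { int apply(int x) const { return -x; } };

// The concrete type is known at each instantiation, so apply() is an
// ordinary call that the compiler is free to inline.
template <typename Op>
int applyRepeatedly(const Op& op, int value, int times) {
    assert(times >= 0);
    for (int i = 0; i < times; ++i) value = op.apply(value);
    return value;
}
```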
In your case callVirtual() will be inlined. Any non-virtual function can be a good candidate for inlining (obviously the final decision is up to the compiler).
Virtual functions have to be looked up in the Virtual Method Table, and as a result the compiler cannot simply move them to be inline. This is generally a runtime look up. An inline function however may call a virtual one and the compiler can put that call (the code to look up the call in the VMT) inline.
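The arrangement described here can be sketched as follows. The class names, the int return values, and the postcondition check are assumptions added so the behavior can be observed; the question's functions return void.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() {}
    // Non-virtual and defined in-class: the wrapper itself can be inlined
    // into the caller; what remains is the virtual dispatch inside it.
    int callVirtual() {
        int result = iAmVirtual();
        assert(result > 0);   // hypothetical postcondition enforced in one place
        return result;
    }
protected:
    virtual int iAmVirtual() { return 1; }
};

class Derived : public Base {
protected:
    int iAmVirtual() override { return 2; }
};

// Called through a Base reference, the override is still chosen at run time.
inline int invoke(Base& b) { return b.callVirtual(); }
```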
I was wondering if the use of accessors can significantly affect performance of an application. Let's say we have a class Point and there are two private fields. We can get access to these fields by calling public functions such as GetX().
class Point
{
public:
Point(void);
double GetX();
double GetY();
void SetX(double x);
void SetY(double y);
~Point(void);
private:
double x,y;
};
However if we need to get the value of field x a lot of time (e.g if we process images) wouldn't this construction affect the performance of application? Maybe it would be faster just to make fields x and y public?
First and foremost, this is probably premature optimization, and in the general case accessors are not the source of application-level bottlenecks. However, they're not magic pixie dust. It's generally not the case that accessors will hurt performance. There are a few things to consider:
If the implementation is inline or if you have a toolchain that supports link-time optimization, it's likely that there will be 0 impact. Here's an example that lets you get absolutely the same performance on a compiler that doesn't suck.
class Point {
public: double GetX() const;
private: double x;
};
inline double Point::GetX() const { return x; }
If the implementation is out-of-line, then you have the added cost of a function call. If, as you say, the function is being called many times, then at least the code is more or less guaranteed to be in the cache, but the relative % of overhead may be high: the work to perform the function call is higher than the work of moving a double around, and there's a pointer indirection because the function actually uses this as a parameter.
If the implementation is both out-of-line and part of a relocatable library (Linux *.so or Windows *.dll), there's an additional indirection that occurs in order to manage the relocation.
Both of the latter costs are reduced on x86-64 hardware relative to x86 32-bit; so much so that you should just not worry about it. I can't speak about other architectures.
Penultimately, if you have many trivial objects with trivial getters and setters, and if you have no profile-guided optimization or link-time optimization, there may be caching effects due to large numbers of tiny functions. It's likely that each function requires a minimum of one cache line, and the functions are not going to be naturally organized in a way that groups commonly-used sections together. This cost is something you should probably ignore unless you're writing a very large-scale C++ project or core component, such as the KDE base system.
Ultimately, don't worry about it.
Such methods should always be inlined by the compiler and the performance of that will be identical to making them public. You can use the inline keyword to help the compiler along, but that's just a hint. If it's really critical that you avoid function call overhead, read the generated assembly. If they're getting inlined you're ok. Otherwise you might want to consider loosening their visibility.
In a typical case, no, there will not be a difference in performance (unless you've fairly specifically told the compiler not to inline any functions). If you allow it to inline functions, however, chances are that it'll generate identical assembly language for both.
That should not, however, be seen as an excuse for ruining your design by including these abominations. First of all, a class should generally provide high level operations, so (for example) you could have a move_relative and move_absolute, so instead of something like this:
Point whatever;
whatever.SetX(whatever.GetX()+3);
whatever.SetY(whatever.GetY()+4);
...you'd do something like this:
Point whatever;
whatever.move_relative(3, 4);
There are times, however, that exposing something as data really does make sense and work well. If/when you are going to do that, C++ already provides a good way to encapsulate access to the data: a class. It also provides a predefined name for SetXXX and GetXXX -- they're operator= and operator T respectively. The right way to do this is something like this:
template <class T>
class encapsulate {
T value;
public:
encapsulate(T const &t) : value(t) {}
encapsulate &operator=(encapsulate const &t) { value = t.value; return *this; }
operator T() const { return value; }
};
Using this, your Point class looks like:
struct Point {
encapsulate<double> x, y;
};
With this, the data you want to be public looks and acts as if it is. At the same time, you retain full control over getting/setting the values by changing the encapsulate to something that does whatever you need done.
Point whatever;
whatever.x = whatever.x + 3;
whatever.y = whatever.y + 4;
Though I haven't bothered to in the demo template above, it's fairly easy to support the normal compound assignment operators (+=, -=, *=, /=, etc.) as well. Depending on the situation, it's often useful to eliminate many of these though. Just for example, adding/subtracting to an X/Y coordinate often makes sense -- but multiplication and division frequently won't, so you can just add += and -=, and if somebody accidentally types in /= or |= (for just a couple of examples), their code simply won't compile.
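A sketch of how the selective compound operators might look, extending the demo template with only += and -=. This is one possible shape, not necessarily what the answer's author had in mind.

```cpp
template <class T>
class encapsulate {
    T value;
public:
    encapsulate(T const& t) : value(t) {}
    encapsulate& operator=(encapsulate const& t) { value = t.value; return *this; }
    operator T() const { return value; }

    // Only the compound operators that make sense for a coordinate are
    // provided; `x /= 2` or `x |= 1` on an encapsulate fails to compile.
    encapsulate& operator+=(T const& rhs) { value += rhs; return *this; }
    encapsulate& operator-=(T const& rhs) { value -= rhs; return *this; }
};

struct Point {
    encapsulate<double> x, y;
};
```

With this, `p.x += 3.0` works on a `Point p`, while the omitted operators are rejected at compile time because the conversion to `T` yields a temporary rather than something assignable.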
This also provides better enforcement of whatever constraints you need on the data. With private data and an accessor/mutator, other code in the class can (and almost inevitably will) modify the data in ways you didn't want. With a class dedicated to nothing but enforcing the correct constraints, that issue is virtually eliminated. Instead, code both inside and outside the class does a simple assignment (or uses the value, as the case may be) and it's routed through the operator=/operator T automatically -- code inside the class can't bypass whatever checking is needed.
Since you're (apparently) concerned with efficiency, I'll add that this won't normally have any run-time cost either. In fact, being a template gives it a slight advantage in that regard. Where code in a normal function could (even if only by accident) be rewritten in a way that prevented inline expansion, using a template eliminates that -- if you try to rewrite it in a way that otherwise wouldn't generate inline code, with a template it won't compile at all.
As long as you define the functions in the header so the compiler can inline them there should be no difference at all. But even if they aren't inlined you still shouldn't make them public unless profiling indicates that it's a significant bottleneck and that making the variables public improves the problem. Making variables public decreases encapsulation and maintainability. For a bit more on public variables, see my answer on What good are public variables then?
The short answer is yes, this will affect the performance. Whether you will notice the difference or not is another matter that depends on how much code you have in the accessors, among other things.
The more important question, though, is whether you need what you gain from using accessors. If you make the fields public, then you lose control over their values. Do you want to allow x or y to be NaN? Or +-infinity? Making them public would make such cases possible.
If you decide later that a double is not acceptable for your point class (maybe you need more precision or the precision isn't necessary), then accessing the fields directly would cause trouble. While this change might also require changes in the accessors, the setters should be fine with overloaded methods. And you may still be fine with a public representation of a double whereas the internal representation is not a double (although this is not so likely with a Point class, I imagine).
There are other cases where you might want to have side effects on accessors and setters as well that making the fields public would circumvent. Maybe you want to create events for when your point changes, but if the fields are public, then your class won't know when the values change.
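To make the validation and side-effect points concrete, here is a sketch of setters that reject non-finite values and record each change. The bool return type, the `assign` helper, and `changeCount` are assumptions for illustration; the question's setters return void.

```cpp
#include <cmath>

class Point {
public:
    Point() : x_(0.0), y_(0.0), changes_(0) {}

    double GetX() const { return x_; }
    double GetY() const { return y_; }

    // Assumption for this sketch: setters report failure through a bool.
    bool SetX(double x) { return assign(x_, x); }
    bool SetY(double y) { return assign(y_, y); }

    int changeCount() const { return changes_; }

private:
    // Rejects NaN and infinities; counting successful changes stands in
    // for raising a "point changed" event.
    bool assign(double& field, double v) {
        if (!std::isfinite(v)) return false;
        field = v;
        ++changes_;
        return true;
    }

    double x_, y_;
    int changes_;
};
```

Neither the check nor the change tracking would be possible if x and y were public fields.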
ADDED
Ok, so my glossing over with my "yes" so that I could get to the non-performance issues that I felt more important wasn't appreciated.
In many cases, the yes is probably as correct as it will be imperceptible. True, using inline and a kick-ass compiler may very well end up with the same code (assuming an accessor like double GetX() { return x; }), but there are a lot of ifs there. Compilers will only inline things that end up in the same object file (often created from a single code file). So you also need a kick-ass linker to optimize the references in other object files (by the time you get to the linker, the inline hint may not even still remain in the code). So some, but not necessarily all, of the code may end up being identical, but that would be something you can confirm only after the fact and isn't useful.
If you're concerned about image processing then it might be worth allowing for friend classes so that an image class that you code can have access directly to the fields, but again I don't think that even in that case the accessor will be adding a lot to your runtime.