I'm making a Gomoku artificial intelligence and I was wondering what the most efficient way is to make my table available in all my functions. This table takes the form of a char map[MAPSIZE][MAPSIZE]. My algorithm does a lot of read accesses to this table.
Is it faster to access this map if it's passed as:
An argument in all my functions.
A member of my algorithm class.
A global variable.
An argument to my functions, but as a pointer.
In the near future I will have to make a lot of copies of this table to implement a search tree.
Thanks for your time,
If it makes sense for it to be a class member - make it a class member. This is a design decision that shouldn't be made out of optimization considerations (at least not yet; later, after measuring, you can trade off design for performance if you think it's worth it).
The alternative is passing it by reference (or pointer, but reference is more C++-ish).
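For illustration, here is a minimal sketch of those two options (the Board alias, the GomokuAI name, the board size and the empty-cell convention are my own assumptions, not taken from the question):

#include <cstddef>

constexpr std::size_t MAPSIZE = 19;            // assumed board size
using Board = char[MAPSIZE][MAPSIZE];

// Option 1: pass by reference - no copy is made, reads go straight to the caller's array.
int countStones(const Board& map) {
    int stones = 0;
    for (std::size_t y = 0; y < MAPSIZE; ++y)
        for (std::size_t x = 0; x < MAPSIZE; ++x)
            if (map[y][x] != 0) ++stones;      // assumes 0 means "empty"
    return stones;
}

// Option 2: make it a class member - every member function sees the same board.
class GomokuAI {
public:
    char at(std::size_t x, std::size_t y) const { return map_[y][x]; }
private:
    Board map_{};                              // value-initialized to all zeros
};

Either way the reads stay as cheap as direct array indexing; what you avoid is copying the whole array at every call.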
Is it faster to access this map if it's passed as: ...
Different methods do incur different runtime costs. However, the difference is almost certainly irrelevant unless you do something grossly inefficient (e.g. unnecessarily copy the entire table in each method).
I suggest you design this with correctness and clarity in mind, and worry about optimizations later.
Not sure if this has already been asked before. While answering this very simple question, I asked myself the following instead. Consider this:
void foo()
{
    int i{};
    const ReallyAnyType data[] = { item1, item2, item3,
        /* many items that may be potentially heavy to recreate, e.g. of class type */ };
    /* function code here... */
}
Now in theory, local variables are recreated every time control reaches the function, right? I.e. look at int i above - it's going to be recreated on the stack for sure. What about the array above? Can a compiler be smart enough to optimize its creation to occur only once, or do I need a static modifier here anyway? What about if the array is not const? (OK, if it's not const, there's probably not much sense in creating it only once, since re-initialization to the default state may be required between calls due to modifications made during function execution.)
Might sound like a basic question, but for some reason I still ponder it. Also, ignore the "why would you want to do this" - this is just a language question, not applied to a certain programming problem or design. I mean both C and C++ here. Should there be differences between the two regarding this question, please outline them.
There are two questions here, I think:
Can a compiler optimize a non-static const object to be effectively static so that it is only created once; and
Is it a reasonable expectation that a given compiler will do so.
I think the answer to the second question is "No", because I don't see the point of doing a huge amount of control flow analysis to save the programmer the trouble of typing the word static. However, I've often been surprised by what optimizations people spend their time writing (as opposed to the optimizations which I think they should be working on :-) ). All the same, I would strongly recommend using the word static if that's what you wanted.
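For example, a minimal sketch of the static version (Widget here is a hypothetical stand-in for ReallyAnyType, not from the question):

struct Widget {
    int id;
};

void foo() {
    // Constructed exactly once, on the first call, and reused on every later call.
    static const Widget data[] = { {1}, {2}, {3} };
    /* function code using data... */
    (void)data;
}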
For the first question, there are circumstances under which the compiler could perform the optimization based on the "as-if" rule, but in very few cases would it work out.
First of all, if any object or subobject in the initializer has a non-trivial constructor/destructor, then the construction/destruction is visible, and this is not an example of copy elision. (This paragraph is C++ only, of course.)
The same would be true if any computation in the initializer list has visible side-effects.
And it should go without saying that if any subobject's value is not constant, the computation of that subobject would need to be done on each construction.
If the object and all subobjects are trivially copyable, all the initializer-list computations are constant, and the only construction cost is that of copying from a template into the object, then the compiler still couldn't perform the optimization if there is any chance that the addresses of more than one live instance of the object might be simultaneously visible. For example, if the function were recursive, and the object's address was used somewhere (hard to avoid for an array), then there would be the possibility that the addresses of two of these objects from different recursive invocations of the function might be compared. And they would have to compare unequal, since they are in fact separate objects. (And, now that I think of it, the function would not even need to be recursive in a multi-threaded environment.)
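To make the address-comparison point concrete, here is a hypothetical illustration (not taken from the question):

#include <iostream>

void recurse(const int* outer, int depth) {
    const int table[] = { 1, 2, 3 };
    if (outer != nullptr)
        // Two live instances from different invocations must compare unequal,
        // so the compiler cannot silently fold them into one static object.
        std::cout << std::boolalpha << (outer == table) << '\n';   // prints false
    if (depth > 0)
        recurse(table, depth - 1);
}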
So the burden of proof for a compiler wishing to optimize that object into a single static instance is quite high. As I said, it may well be that a given compiler actually attempts to perform that task, but I definitely wouldn't expect it to.
The compiler would almost certainly do whatever is deemed most optimal, but most likely it will have it in read-only memory and turn your local variable into a pointer that points to the array in read-only memory. This assumes your array is equivalent to a POD type (or a class composed of POD types; if your class does something non-trivial and/or modifies other things, there is no way the compiler can fairly do this optimization).
I have seen many instances where people have advised against using std::function<> because it is a heavyweight mechanism. Could someone please explain why that is so?
std::function is a type erasure class.
It takes whatever it is constructed from, and erases everything except:
Invoke with the signature in question (with possible implicit casting)
Destroy
Copy
Cast back to exact original type
and possibly
Move
This involves some overhead. A typical decent-quality std::function will have small object optimization (like small string optimization), avoiding a heap allocation when the amount of memory used is small.
A function pointer will fit in there.
However, there is still overhead. If you initialize a std::function with a compatible function pointer, instead of directly calling the function pointer in question, you do a virtual function table lookup, or invoke some other function, which then invokes the function pointer.
With a vtable implementation, that is a possible cache miss, an instruction cache miss, then another instruction cache miss. With a function pointer, the pointer is probably stored locally, and it is called directly, resulting in one possible instruction cache miss.
On top of this, in practice compilers understand function pointers better than std::functions: a number of compilers can figure out that the pointer is a constant value during inlining or whole program optimization. I have never seen one that pulls that off with std::function.
For larger objects (say larger than sizeof(std::string) in one implementation), a heap allocation is also done by the std::function. This is another cost. For function pointers and reference wrappers, the standard encourages implementations to avoid the heap allocation (i.e. to use SOO), though it does not strictly guarantee it.
Directly storing the lambda without storing it in a std::function is even better than a function pointer: in that case, the code being run is implicit in the type of the lambda. This makes it trivial for code to work out what is going to happen when it is called, and inlining easy for the compiler.
Only do type erasure when you need to.
Under the hood, std::function typically uses type erasure (one simplified explanation for how it may be implemented is here). The cost of storing your function object inside the std::function object may involve a heap allocation. The cost of invoking your function object is typically an indirection through a pointer plus a virtual function call. Also, while compilers are getting better at this, the virtual function call usually inhibits inlining of your function.
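Since the linked explanation may not be available, here is one highly simplified sketch of what such a type-erasing wrapper could look like (an illustrative assumption only, not how any particular standard library actually does it; real implementations add small-object optimization, copying and more):

#include <memory>
#include <utility>

template <class Sig> class Function;                  // hypothetical std::function-like class

template <class R, class... Args>
class Function<R(Args...)> {
    struct Concept {                                   // the erased interface
        virtual R invoke(Args... args) = 0;
        virtual ~Concept() = default;
    };
    template <class F>
    struct Model final : Concept {                     // holds the concrete callable
        F f;
        explicit Model(F fn) : f(std::move(fn)) {}
        R invoke(Args... args) override { return f(std::forward<Args>(args)...); }
    };
    std::unique_ptr<Concept> impl_;                    // heap allocation here; real ones add SOO
public:
    template <class F>
    Function(F f) : impl_(std::make_unique<Model<F>>(std::move(f))) {}
    R operator()(Args... args) { return impl_->invoke(std::forward<Args>(args)...); }
};

Calling through Function goes through the impl_ pointer and a virtual invoke, which is exactly the indirection and inlining barrier described above.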
That being said, I recommend using std::function unless you know via measurements that the cost is too high (typically when you cannot afford heap allocations, your function will be called many times in a place that requires very low latency, etc.), as it is better to write straightforward code than to prematurely optimize.
Depending on the implementation, std::function will add some overhead due to the use of type erasure. There have been some other implementations, such as Don Clugston's fast delegate, with a C++11 implementation here. Please note that it uses UB to make the fastest possible delegate, but is still extremely portable.
If you want type erasure, it's the right tool for the job, it's almost certainly not your bottleneck, and it's not something you could write faster anyway.
However, sometimes it can be all too tempting to use type erasure when it really isn't required. That's where to draw the line. For example, if all you want to do is keep hold of a lambda locally, then it's probably not the right tool and you should just use:
auto l = [](){};
Likewise for function pointers you don't plan to type erase - just use a function pointer type.
You also don't need type erasure for templates from <algorithm> or your own equivalents, because there's simply no need for heterogeneous functor types to coexist.
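A small sketch of that last point (my own example): the predicate's type is deduced by the template, so wrapping it in a std::function buys nothing.

#include <algorithm>
#include <functional>
#include <vector>

bool anyNegative(const std::vector<int>& v) {
    // Fine: std::find_if deduces the lambda's exact type.
    return std::find_if(v.begin(), v.end(), [](int x) { return x < 0; }) != v.end();
}

bool anyNegativeErased(const std::vector<int>& v) {
    // Works, but the std::function wrapper adds erasure for no benefit here.
    std::function<bool(int)> pred = [](int x) { return x < 0; };
    return std::find_if(v.begin(), v.end(), pred) != v.end();
}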
It's not so.
To put it simply, it's not too heavyweight unless you profiled your program and showed that it is too heavyweight. Since evidently you did not (otherwise you would know the answer to this question), we can safely conclude that it is in fact not too heavyweight at all.
You should always profile, before concluding that it's too slow.
Considering the fact that a pointer to a function returning another pointer to another function is the mechanism used in C to introduce some runtime polymorphism/callbacks, what is the equivalent way to implement this in C++ while improving locality and lowering the cost of pointers and indirections?
For example, this syntactic sugar can help, but I'm not really interested in that; although it's a nice way to do things in a C++ way instead of a more C-ish typedef, I'm more interested in improving locality while trying to reduce the use of explicit pointers at runtime.
The real reason why people use function pointers in C to emulate polymorphism is not performance but the fact that C supports neither real polymorphism nor templates. These are two alternatives you have in C++. All three approaches are compared in this thread.
Note that even though calling a function pointer does not require the additional vtable lookup that virtual function calls do, calling virtual functions and function pointers both suffer from the same major performance problem: Branch prediction in both cases is not as reliable and you tend to end up with more pipeline flushes.
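As a sketch of the template alternative next to the C-style callback (the example and its names are mine, not from the question):

#include <vector>

// C-style: runtime indirection through a function pointer.
int sumWithCallback(const std::vector<int>& v, int (*transform)(int)) {
    int total = 0;
    for (int x : v) total += transform(x);
    return total;
}

// C++ template: the callable is resolved at compile time, so it can be inlined
// and there is no pointer to chase at the call site.
template <class F>
int sumWithFunctor(const std::vector<int>& v, F transform) {
    int total = 0;
    for (int x : v) total += transform(x);
    return total;
}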
I think you can use virtual functions to meet part of the requirement.
Is there any efficiency disadvantage associated with deep inheritance trees (in C++), i.e., a large set of classes A, B, C, and so on, such that B extends A, C extends B, and so on? One efficiency implication that I can think of is that when we instantiate the bottom-most class, say C, then the constructors of B and A are also called, which will have performance implications.
Let's enumerate the operations we should consider:
Construction/destruction
Each constructor/destructor will call its base class equivalents. However, as James McNellis pointed out, you were obviously going to do that work anyway. You didn't derive from A just because it was there. So the work is going to get done one way or another.
Yes, it will involve a few more function calls. But function call overhead will be nothing compared to the actual work any significantly deep class hierarchy will have to actually do. If you're at the point where function call overhead is actually important for performance, I would strongly suggest that calling constructors at all is probably not what you want to be doing in that code.
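For concreteness, a tiny hypothetical example of the constructor chain:

#include <iostream>

struct A { A() { std::cout << "A\n"; } };
struct B : A { B() { std::cout << "B\n"; } };
struct C : B { C() { std::cout << "C\n"; } };

int main() {
    C c;    // prints A, B, C - three calls, but each one does work you wanted anyway
}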
Object Size
In general, the overhead for a derived class is nothing. The overhead for virtual members is a pointer (the vtable pointer), with a bit more for virtual inheritance.
Member Function Calls, Static
By this, I mean calling non-virtual member functions, or calling virtual member functions with class names (ClassName::FunctionName syntax). Both of these allow the compiler to know at compile time which function to call.
The performance of this is invariant with the size of the hierarchy, since it's compile-time determined.
Member Function Calls, Dynamic
This is calling virtual functions with the full and complete expectation of runtime calls.
Under most sane C++ implementations, this is invariant with the size of the object hierarchy. Most implementations use a v-table for each class. Each object has a v-table pointer as a member. For any particular dynamic call, the compiler accesses the v-table pointer, picks out the method, and calls it. Since the v-table is the same for each class, it won't be any slower for a class that has a deep hierarchy than one with a shallow one.
Virtual inheritance plays a bit with this.
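A sketch of such a dynamic call (hypothetical classes): the cost is one indirect call through the vtable pointer, however deep the hierarchy behind the reference is.

struct Base {
    virtual int value() const { return 0; }
    virtual ~Base() = default;
};
struct Deep1 : Base { int value() const override { return 1; } };
struct Deep2 : Deep1 { int value() const override { return 2; } };

int callThroughBase(const Base& b) {
    return b.value();   // one vtable lookup, whatever b's dynamic type is
}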
Pointer Casts, Static
This refers to static_cast or any equivalent operation. This means the implicit cast from a derived class to a base class, the explicit use of static_cast or C-style casts, etc.
Note that this technically includes reference casting.
The performance of static casts between classes (up or down) is invariant with the size of the hierarchy. Any pointer offsets will be compile-time generated. This should be true for virtual inheritance as well as non-virtual inheritance, but I'm not 100% certain of that.
Pointer Casts, Dynamic
This obviously refers to the explicit use of dynamic_cast. This is typically used when casting from a base class to a derived one.
The performance of dynamic_cast will likely change for a large hierarchy. But sane implementations should only check the classes between the current class and the requested one. So it's simply linear in the number of classes between the two, not linear in the number of classes in the hierarchy.
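A small hypothetical sketch of such a downcast:

struct Animal { virtual ~Animal() = default; };
struct Dog : Animal { void bark() {} };

void maybeBark(Animal& a) {
    // Checked at runtime; the cost depends on the distance between Animal and Dog,
    // not on the total number of classes in the program.
    if (Dog* d = dynamic_cast<Dog*>(&a))
        d->bark();
}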
Typeid
This means the use of the typeid operator to fetch the std::type_info object associated with an object.
The performance of this will be invariant with the size of the hierarchy. If the class is a virtual one (has virtual functions or virtual base classes), then it will simply pull it out of the vtable. If it's not virtual, then it's compile-time defined.
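A minimal hypothetical example of the runtime lookup:

#include <typeinfo>
#include <iostream>

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};

void printDynamicType(const Shape& s) {
    // Looked up through the vtable at runtime; independent of hierarchy depth.
    std::cout << typeid(s).name() << '\n';
}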
Conclusion
In short, most operations are invariant with the size of the hierarchy. But even in the cases where it has an impact, it's not a problem.
I'd be more concerned with some design ethic where you felt the need to build such a hierarchy. In my experience, hierarchies like this come from two lines of design.
The Java/C# ideal of having everything derived from a common base class. This is a horrible idea in C++ and should never be used. Each object should derive from what it needs to, and only that. C++ was built on the "pay for what you use" principle, and deriving from a common base works against that. In general, anything you could do with such a common base class is either something you shouldn't be doing period, or something that could be done with function overloading (using operator<< to convert to strings, for example).
Misuse of inheritance. Using inheritance when you should be using containment. Inheritance creates an "is a" relationship between objects. More often than not, "has a" relationships (one object having another as a member) are far more useful and flexible. They make it easier to hide data, and you don't allow the user to pretend one class is another.
Make sure that your design does not fall afoul of one of these principles.
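As a tiny illustration of the containment point above (hypothetical classes):

#include <string>

struct Engine { int horsepower = 0; };

// Containment: a Car has an Engine; users cannot pretend a Car is an Engine,
// and the Engine stays an implementation detail that can be hidden or swapped.
class Car {
public:
    int power() const { return engine_.horsepower; }
private:
    Engine engine_;
    std::string name_;
};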
There will be, but not as bad as the programmer-performance implications.
As @Nicol points out, it may be doing a number of things.
If those are things that you require to be done, regardless of design, because they are all precisely necessary steps in getting the program from the call to main to exit within the fewest possible cycles, then your design is simply a matter of coding clarity (or maybe lack of it :).
In my experience performance tuning, as in this example, what I often see as a huge source of wasted time is over-design of data (i.e. class) structures.
Weirdly enough, the justification for the data structures is often (guess what?) - performance!
In my experience, the thing to do with a data structure is to keep it as simple as possible and as normalized as possible. If it is completely normalized, then any single change to it can't make it inconsistent. You can't always achieve complete normalization, in which case you have to deal with the possibility that the data can be temporarily inconsistent.
This is why people write notification handlers, and this is encouraged in OOP.
The idea is, if you change something in one place, that can trigger notifications that "automatically" propagate the change to other places, trying to maintain consistency.
The problem with notifications is they can run away. Simply changing some boolean property from true to false can cause a fire-storm of notifications ripping through the data structure in ways no one programmer understands, updating databases, painting windows, zipping files, etc. etc. I often find this is where most clock cycles go.
I think it is simpler and far more efficient to temporarily tolerate inconsistency, and periodically repair it with some kind of sweeping process.
Another way data structures go along with huge inefficiency is if the data is effectively being interpreted by some process to produce some output.
This is very common in graphics.
If the data changes at a very slow rate, it may make sense to "compile" it rather than "interpret" it.
In other words, translate it into a simpler instruction set, or source code which is compiled "on the fly", which can then execute far more quickly to produce the desired output.
I'm learning C++ (coming from Java) and recently discovered that you can pass functions around. This is really cool and I think immensely useful. Now I was thinking about how I could use this, and one of the ideas that popped into my head was a completely customizable class.
The best example of my train of thought for completely customizable classes (code) would be, say, a Person class. Person would have all functions pertaining to P. Later Person may pick up a sword (S), so now Person has access to all functions pertaining to both P and S.
Are there limits or performance issues with this? Is this sloppy and just plain frowned upon?
Any insight is educational, thanks.
~Aedon
When passing around functions - i.e. pointers to functions really - calls are always indirect and therefore possibly slower than a direct call (and definitely slower than an inlined call altogether).
The STL is modeled with functors. That is: light function objects that have an operator() member which gets called. This has the advantage of being a very likely candidate for inlining, especially if the functor and operator() are very simple (as e.g. std::less<T>).
See also: http://www.sgi.com/tech/stl/functors.html
There is a slight performance hit since a pointer or reference must be dereferenced before calling the function.
This is a very advantageous feature. Many design patterns and polymorphism depend on pointers to functions. Check out the "Visitor Design Pattern".
Another usage is for a table of functions. For example, you could write a generic menu engine that displays different menus by using different functions.
Also research "Factory design pattern."
There's absolutely nothing wrong with passing around a function, but it's kind of primitive and limiting. Often you have some data that you want to associate with the function, in addition to the parameters you're passing to it. Also you might want to group related functions together and pass them as one. Congratulations, you've just described a C++ class!
If you want to see how C++ can really blur the line, consider a functor. This is a class that has an operator() method, so that you can call it just as you would a function. It has two immediate advantages over a plain function: it can hold state between calls, and it can be inlined by the compiler for superior performance. It's not uncommon for std::sort to outperform the older C qsort for example, because qsort uses a function pointer while std::sort uses a functor.
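A small sketch of that comparison (my own example, not from the answer above):

#include <algorithm>
#include <cstdlib>
#include <vector>

int compareInts(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sortBothWays(std::vector<int>& v) {
    // qsort calls compareInts through a function pointer on every comparison.
    std::qsort(v.data(), v.size(), sizeof(int), compareInts);

    // std::sort receives the lambda's concrete type, so the comparison can be inlined.
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}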