Member hooks versus base hooks in intrusive data structures (C++)

I'm coding an intrusive data structure and wondering whether to use base or member hooks. As the code will be called many times, my question concerns performance: to what extent are compilers able to inline such code?
Base hooks are based on inheritance while member hooks use pointers-to-members via template parameters.
My design choice would be to use member hooks, but my experience says pointers are much harder to optimize than static code. On the other hand, all those pointers are known at compile time and perhaps the compiler can do some magic to analyze what's happening.
Does anyone have experience with this? Any data, hints, or references are welcome.

As with most "X vs Y, which is faster?" questions, there is only one proper answer to this one:
Ask your profiler.
Experience is vague. Human guesswork cannot take into account all the nitty-gritty details and pitfalls of compiler optimizations. Compilers differ in what they can optimize and how well they do it, sometimes even between different versions of the same compiler. The only thing that can reliably tell you how well your implementation is optimized by your specific compiler(s) on your specific platform(s) is a proper measurement of performance with typical problem sizes.
Even if there is someone who tells you he knows what is faster and gives you some pretty graphs: can you trust him enough not to make those measurements? Does he know what your specific environment looks like? Do he and his graphs take into account the special corner cases of your problems? Most probably not.
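None of this is from the original answer, but as a minimal sketch of what "measure it" can look like, here is a bare-bones timing harness. For real decisions, prefer a proper profiler or a benchmarking library (e.g., Google Benchmark); a toy loop like this only illustrates the mechanics:

    #include <chrono>
    #include <cstdio>

    // Time a callable in microseconds; steady_clock is immune to wall-clock jumps.
    template <typename F>
    long long time_us(F&& f) {
        const auto t0 = std::chrono::steady_clock::now();
        f();
        const auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
    }

    int main() {
        volatile long sink = 0;  // volatile keeps the loop from being optimized away
        const long long us = time_us([&] {
            for (long i = 0; i < 10000000; ++i) sink = sink + i;
        });
        std::printf("loop took %lld us\n", us);
    }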

Since data and hooks are in a "has-a" relationship, I'd also prefer member hooks from a design point of view. I also don't think there is a difference in optimization between putting the hooks in a base class and putting them into the class directly.
There is also some discussion of these different approaches in the Boost.Intrusive documentation.
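For concreteness, here is a minimal sketch of both styles as Boost.Intrusive spells them (not from the original answer). Note that the pointer-to-member &B::hook_ passed as a template argument is exactly the kind of compile-time constant the question asks about:

    #include <boost/intrusive/list.hpp>
    #include <cassert>

    namespace bi = boost::intrusive;

    // Base hook: the node pointers come in through inheritance.
    struct A : bi::list_base_hook<> {
        int value = 0;
    };

    // Member hook: the node pointers are an ordinary data member,
    // named in the container type via a pointer-to-member.
    struct B {
        bi::list_member_hook<> hook_;
        int value = 0;
    };

    using AList = bi::list<A>;  // finds the base hook automatically
    using BList = bi::list<B,
        bi::member_hook<B, bi::list_member_hook<>, &B::hook_>>;

    int main() {
        A a; B b;
        AList alist; BList blist;
        alist.push_back(a);  // intrusive: no allocation; 'a' must outlive the list
        blist.push_back(b);
        assert(&alist.front() == &a && &blist.front() == &b);
        alist.clear();       // unlink before the hooks are destroyed
        blist.clear();
    }

In both cases the hook sits at a compile-time-constant offset inside the object, so the container recovers the element address with constant pointer arithmetic; neither style inherently defeats inlining.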

Related

Cost of instance members vs. static members vs. non-class methods

This is quite a generic question, but I haven't found clear and specific answers for C++.
I have a class with numerous methods, and there may be a large number of instances alive simultaneously. A few (<5) of the methods are time-critical (e.g., for real-time simulation); the other methods are not.
Would it be more time-efficient to declare the non-critical methods as static members, or even as non-class functions (where relevant/possible)?
And what about the memory cost of instance methods vs. static methods or non-class functions? Is there a risk of memory shortage with a very large number of instances in either option?
If I missed a reference answering precisely this question, please forgive me or just give me a hint on how to proceed. Thanks in advance!
The reason you won't find specific answers is that it is difficult to predict what your optimizer is going to do.
I recently dug way deeper than I normally do into the code when trying to get an answer to this question, where a supposedly totally harmless change had a small but measurable effect on performance.
If you read the outcome of the deep-dive I did on the code, optimization is a bit like the butterfly effect. A tiny, tiny change can have ripple effects that cause a much larger impact than you'd expect at first glance. Sure, it was only around 3% in that case, but in time-critical code, that can matter. I know, I've been there. ;)
Hence @Cheers and hth. - Alf is 100% correct. The ONLY way you can know for absolutely sure is to try all the relevant combinations and benchmark them.
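One part of the memory question does have a firm general answer, though: non-virtual member functions, static member functions, and free functions all occupy no storage in the object itself, so a large number of instances costs the same either way; only virtual functions add per-instance cost, typically one vtable pointer. A sketch:

    #include <cstdio>

    struct Plain      { int x; void f() {} };          // non-virtual member function
    struct WithStatic { int x; static void f() {} };   // static member function
    struct WithVirt   { int x; virtual void f() {} };  // virtual member function

    int main() {
        // Typically sizeof(Plain) == sizeof(WithStatic) == sizeof(int),
        // while sizeof(WithVirt) is larger by one vtable pointer (plus padding).
        std::printf("%zu %zu %zu\n",
                    sizeof(Plain), sizeof(WithStatic), sizeof(WithVirt));
    }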

How can I better learn to "not pay for what you don't use"?

I've just gotten answers to this question which, bottom line, tell me: "Doing X doesn't make sense since it would make you pay for things you might not use."
I find this maxim difficult to follow; my instincts lean towards what I consider clear semantics, with things defined "in their place". More generally, it's not immediately obvious to me what the hidden costs and secret tariffs of a particular design choice would be.
Is this covered by (non-reference) books on C++? Is there someplace relevant online to better enlighten myself on following this principle?
In the case you are presenting, it is not as general a statement as it seems.
"Doing X doesn't make sense since it would make you pay for things you might not use."
This is merely a statement that, if you can, you should avoid virtual functions. They add overhead to the function call.
Virtual-function designs can often be replaced with templates and regular function calls. One standard-library example is std::vector. In Java, for instance, a Vector implements interfaces so that it can be used by algorithms, which is accomplished through virtual function calls; std::vector instead uses iterators, which are resolved at compile time.
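To make that contrast concrete, here is a sketch (the types are made up): the interface version pays an indirect call per element whether or not polymorphism is ever needed, while the iterator version is resolved entirely at compile time:

    #include <cstddef>
    #include <numeric>
    #include <vector>

    // Java-style: an interface, used through virtual dispatch.
    struct IntSequence {
        virtual ~IntSequence() = default;
        virtual std::size_t size() const = 0;
        virtual int at(std::size_t i) const = 0;
    };

    int sum_virtual(const IntSequence& s) {
        int total = 0;
        for (std::size_t i = 0; i < s.size(); ++i)
            total += s.at(i);                    // virtual call, hard to inline
        return total;
    }

    // STL-style: a template over iterators, resolved at compile time.
    template <typename It>
    int sum_range(It first, It last) {
        return std::accumulate(first, last, 0); // trivially inlined
    }

    int main() {
        std::vector<int> v{1, 2, 3};
        return sum_range(v.begin(), v.end()) == 6 ? 0 : 1;
    }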
Despite the question being overly broad and asking for off-site material, I think it is interesting enough to deserve an answer. Remember that C++ was originally just "C with classes", and it is still possible today to write what is basically C code without using any of the nice abstractions that C++ gives you. For example, if you don't want the cost of exceptions then don't use them, if you don't want the cost of RTTI or virtual functions then don't use them, if you don't want the overhead of using templates... etc.
As for resources, I'm going to break the rules and recommend Game Programming Patterns which despite the name is a good general purpose guide to writing performant C++.
The key to "not paying for what you don't use" is abstractions. When you clearly understand the purpose of a class or a function, you add the data and arguments that are absolutely necessary for the class and the function to work correctly, with as little overhead as possible.
You have to be very vigilant about adding member variables and member functions (virtual as well as non-virtual) to a class. Every member variable adds to the memory requirements of the class. Every member function requires maintenance. The presence of virtual member functions adds to the memory requirements of the class and incurs a small penalty at run time.
You have to be very vigilant about the arguments to a function. You don't want the user to be burdened with supplying arguments that don't make sense. You also don't want to leave out any arguments by making hidden assumptions.

Heavy use of templates for mobile platforms

I've been flicking through the book Modern C++ Design by Andrei Alexandrescu and it seems like interesting stuff. However, it makes very extensive use of templates, and I would like to find out whether this should be avoided when using C++ for mobile platform development (Brew MP, WebOS, iOS, etc.) due to size considerations.
In Symbian OS C++, standard use of templates is discouraged. Symbian OS itself uses them, but via an idiom known as thin templates, where the underlying implementation is done in C style using void* pointers, with a thin template layered on top to achieve type safety.
The reason they use this idiom, as opposed to regular use of templates, is specifically to avoid code bloat; a sketch of the idiom follows.
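This is not Symbian's actual code, but a minimal sketch of the thin-template idiom, with std::vector<void*> standing in for the hand-rolled C-style core:

    #include <cstddef>
    #include <vector>

    // Type-unsafe core: compiled once and shared by every instantiation.
    class PtrVectorBase {
    public:
        void push_back(void* p)       { data_.push_back(p); }
        void* at(std::size_t i) const { return data_[i]; }
        std::size_t size() const      { return data_.size(); }
    private:
        std::vector<void*> data_;     // stand-in for a C-style buffer
    };

    // Thin template layer: adds only type safety. Every function is a
    // trivial inline forwarder, so each instantiation costs almost no code.
    template <typename T>
    class PtrVector : private PtrVectorBase {
    public:
        using PtrVectorBase::size;
        void push_back(T* p)       { PtrVectorBase::push_back(p); }
        T* at(std::size_t i) const { return static_cast<T*>(PtrVectorBase::at(i)); }
    };

    int main() {
        int x = 42;
        PtrVector<int> v;
        v.push_back(&x);
        return *v.at(0) == 42 ? 0 : 1;
    }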
So what are the opinions (or facts) on the use of templates when developing applications for mobile platforms?
Go ahead and use templates wherever they make your code easier to understand and to maintain. Avoidance of templates on mobile platforms can be categorized as "premature optimization".
If you run into executable-size issues, then redesign if necessary, but don't start with the assumption that templates will cause problems before you see any actual problems.
A lot of the stuff in "Modern C++ Design" and similar books is not going to lead to bloated code, because so much of it is really designed to ensure type safety and do compile-time metaprogramming magic, rather than to generate code.
Templates can be used to do a lot of different things. They can generate more code than you expect, but that's not a reason to ban their use. It wasn't so long ago that various authorities recommended avoiding exceptions, virtual functions, floating-point math, and even classes due to concerns about code size and performance, but people did those things, and somehow everything worked out fine.
Templates don't necessarily lead to code bloat. If you write a function or class template and instantiate it for a dozen different types, then yes, you get a lot of duplicate code generated (probably, anyway; some compilers can merge identical instantiations back together).
But if a template is instantiated for one type only, then there is zero cost in code size. If you instantiate it a couple of times, you pay a certain cost, but you'd also end up paying if you used any of the other ways to achieve the same thing. Dynamic polymorphism (virtual functions and inheritance) isn't free either. You pay for that in terms of vtables, code generated to facilitate all the type casts and conversions necessary, and simply because of code that can't be inlined or optimized away.
Taking std::vector as an example: yes, if you use both vector<int> and vector<float>, you get two copies of some of the code. But with templates, only the code that is actually used gets compiled. The member functions that you never call won't generate any code, and even in the functions that are compiled, the compiler may be able to eliminate a lot of code. For example, for certain types, exception-handling code may be unnecessary, so the compiler can eliminate it, yielding smaller code than if you'd used dynamic polymorphism, where the compiler would have been unable to make any assumptions about the type being stored. So in this made-up example, you'd get some code generated for both vector<int> and vector<float>, but each of them is going to be a lot smaller than a polymorphic vector such as you might find in Java.
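For instance, a member function of a template that is never called is never instantiated at all. This (made-up) Holder compiles fine even though one of its members would be ill-formed for T = int:

    template <typename T>
    struct Holder {
        T value;
        T get() const { return value; }
        // Ill-formed for T = int, but never instantiated because
        // nothing calls it, so no code is generated for it.
        void never_called() { value.nonexistent_member(); }
    };

    int main() {
        Holder<int> h{7};
        return h.get() == 7 ? 0 : 1;  // only get() is instantiated
    }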
The main problem with using templates is that they require a compiler that supports them. On a PC, that's no problem. On any other platform that has a mature C++ compiler available, it's no problem either.
But not all platforms have a modern heavy-duty C++ compiler available. Some don't support a number of advanced features, and some are just not good enough at the optimizations required to make template code work well (templates tend to require a lot of inlining, for example). So on some platforms, it may be best to avoid templates. Not because of any concern about code size, but because the compiler may be unable to handle them.
In my personal experience, using (and even abusing) templates very rarely results in large code bloat, and compiling with -Os will help a lot.
It's not that common to see huge template classes duplicated (instantiated) many times, both because classes are rarely huge and because in most cases you only instantiate templates with a few different arguments, not hundreds. Besides, it's easy to factor common code out of your biggest template classes/functions, and the compiler will help you do this.
Usually the size of the data (graphics, audio, ...) is orders of magnitude bigger than that of the code, so I wouldn't worry.
Of course there could be exceptions to what I said, but I guess they'll mostly involve advanced (special/weird/complicated) stuff, not the most common everyday classes.
Summarizing my suggestion: use templates as much as you want; if something goes wrong, you'll find out by profiling, and you will easily be able to optimize the size.
Whatever you do, do NOT try to settle this by writing some toy code, compiling it, and comparing executable size or code duplication; such artificial measurements won't tell you much about a real application.
I would say that this advice generally holds, and not just for mobile development. Some of the techniques described in Modern C++ Design can lead to long build times and code bloat.
This is especially true when dealing with "obscure" devices like cell phones. Many template techniques rely on the compiler and linker doing a perfect job of eliminating unused/duplicate code. If they don't, you risk having hundreds of duplicate std::vector instantiations scattered all over your code. And trust me, I have seen this happen.
This is not to say Modern C++ Design is a bad book, or that templates are bad. But especially on embedded projects it's best to watch out, because it can bite.

Any performance reason to put attributes protected/private?

I "learned" C++ at school, but there are several things I don't know, like where or what a compiler can optimize; I do already know that inline and const can give a small boost...
If performance is an important thing (game programming, for example), does making class attributes non-public (private or protected) allow the compiler to generate more optimized code?
All my previous teachers said was that it's more "secure" or that it "prevents unwanted or unauthorized class access/behavior", but in the end I wonder whether making attributes non-public can limit their scope and thus speed things up.
I don't mean to criticize my teachers (or should I?), but the programming class I was in wasn't very advanced...
The teachers were right to tell you to use private and protected to hide implementation and to teach you about information hiding, instead of proposing questionable performance optimizations. Think of an appropriate design first and of performance second; in 99% of the cases this will be the better choice (even in performance-critical scenarios). Performance bottlenecks can appear in many unpredicted places and are much easier to address if your design is sound.
To directly answer your question, however: any reduction in scope may help the compiler do certain optimizations, but off the top of my head I cannot think of any right now with regard to making members private.
No. Making members private or protected is not going to provide any performance benefits; of course, the benefits to your design (information hiding) are huge.
There's no such thing as public, private and protected once your code is compiled, so it cannot affect performance.
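A small illustration of that point (the types are made up): access control changes what code is allowed to compile, not the bytes that are emitted.

    struct PublicPoint { int x, y; };              // everything public

    class PrivatePoint {                           // same data, hidden
    public:
        PrivatePoint(int x, int y) : x_(x), y_(y) {}
        int x() const { return x_; }               // trivially inlined accessor
    private:
        int x_, y_;
    };

    // Access specifiers exist only at compile time: the object layout is
    // the same here, and the accessor compiles down to a plain load.
    static_assert(sizeof(PublicPoint) == sizeof(PrivatePoint),
                  "access specifiers do not change the data layout here");

    int main() { return PrivatePoint(1, 2).x() - 1; }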
There's also no such thing as const in machine code (except perhaps ROM), but the compiler can make some logical optimisations to your program by knowing whether a value can change (in some situations).
inline rarely has any effect. It is merely a suggestion to the compiler, which the compiler is free to ignore (and often does). The compiler will inline functions as it sees fit.

Design and Readability

I am working on a project written in C++ which involves modification of existing code. The code uses object-oriented principles (design patterns) heavily, and also complicated stuff like smart pointers.
While trying to understand the code using gdb, I had to be very careful to track which polymorphic functions were being called by which subclasses.
Everyone knows that the intent of using design patterns and other complicated stuff in your code is to make it more reusable, i.e., maintainable, but I personally feel that procedure-oriented code is much easier to understand and debug, since you know definitively which function will actually be called.
Any insights or tips for handling such situations are greatly appreciated.
P.S: I am relatively less experienced with OOP and large projects.
gdb is not a tool for understanding code; it is a low-level debugging tool. Especially when using C++ as a higher-level language on a larger project, it's not going to be easy to get the big picture from stepping through code in a debugger.
If you consider smart pointers and design patterns to be 'complicated stuff' then I respectfully suggest that you study their use until they don't seem complicated. They should be used to make things simpler, not more complex.
While procedural code may be simple to understand in the small, using object-oriented design principles can provide the abstractions required to build a very large project without it turning into unmaintainable spaghetti.
For large projects, reading code is a much more important skill than operating a debugger. If a function is operating on a polymorphic base class then you need to read the code and understand what abstract operations it is performing. If there is an issue with a derived class' behaviour, then you need to look at the overrides to see if these are consistent with the base class contract.
If and only if you have a specific question about a specific circumstance that the debugger can answer should you step through code in a debugger. Questions might be something like "Is this overridden function being called?". This can be answered by putting a breakpoint in the overriding function and running past the call you believe should reach it, to see if the breakpoint is hit.
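For instance, in gdb (the class and function names are hypothetical):

    (gdb) break Derived::process    # breakpoint inside the suspected override
    (gdb) run                       # if it hits, the override is being called
    (gdb) backtrace                 # shows the call chain that dispatched here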
As a first step, run the code through Doxygen.
Modifying comments should have no effect on the code.
Doxygen will allow you to get an overview of the structure of the program.
Over time, as you figure out more about the program, you add comments that get picked up by Doxygen. The quality of the documentation grows over time and will be helpful to the next poor SOB that gets stuck with the program.
There is an excellent book called Object-Oriented Reengineering Patterns whose first part provides patterns for understanding legacy code (e.g., "refactor to understand").
A pdf version of the book is available for free at http://scg.unibe.ch/download/oorp/
Diagrams. Does your IDE have a tool that can reverse-engineer class diagrams from the code? That may help you understand the relationships between classes. Also, have the other developers actually written documentation on what they are doing and why? Is there a decisions document explaining why they designed and built it the way they did? (OK, sometimes this is not necessary, but if it exists, it would also help.)
Also, do you know WHAT design patterns were used? Do they have names? Can you look them up and find other simpler examples of them? Maybe try writing a small app that also implements the design pattern, just to try it for yourself. That can also improve understanding.
I generally do the following:
Draw a simplified class diagram
Write some pseudocode
Ask a developer who is likely to be familiar with the code layout