Will C++ optimize out empty/non-virtual/void method calls?

Example code:
class DummyLock {
public:
    void lock() {}
    void unlock() {}
};
...
template <class T>
class List {
    T _lock;
    ...
public:
    void append(void* smth) {
        _lock.lock();
        ...
        _lock.unlock();
    }
};
...
List<DummyLock> l;
l.append(...);
So, will it optimize out these method calls if the lock type is a template parameter? If not, what is the best approach to making a template list that takes policies as template arguments (as in Andrei Alexandrescu's C++ book)?

Assuming inlining is enabled (so "some optimisation turned on"), then yes, any decent compiler should turn this sort of thing into zero instructions. Particularly in a template, as templates require [in nearly all current compilers, at least] the compiler to "see" the source of the object. In a non-templated situation, it's possible to come up with a scenario where the empty lock code is defined "out of line", and the compiler can't know that the function is empty.
(It looks scary with void *smth in your append, though - I hope you intend to make that a second template type in your real implementation.)
As always when it comes to "does the compiler do this?": if it's really important, you need to check that YOUR compiler does what you expect in this particular case. clang++ -S or g++ -S would, for example, show whether any calls are made within your append function.
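To make the policy approach concrete, here is a minimal sketch, assuming a hypothetical MutexLock policy, a second template parameter for the element type, and a std::vector as backing storage (all three are illustrative additions, not part of the original code):

#include <mutex>
#include <vector>

class DummyLock {
public:
    void lock() {}      // intentionally empty: compiles away under optimization
    void unlock() {}
};

class MutexLock {
public:
    void lock() { _m.lock(); }
    void unlock() { _m.unlock(); }
private:
    std::mutex _m;
};

template <class T, class LockPolicy>
class List {
public:
    void append(const T& value) {
        _lock.lock();              // zero instructions when LockPolicy = DummyLock
        _items.push_back(value);
        _lock.unlock();
    }
private:
    std::vector<T> _items;
    LockPolicy _lock;
};

int main() {
    List<int, DummyLock> single_threaded;
    List<int, MutexLock> shared;
    single_threaded.append(1);     // lock()/unlock() compile to nothing
    shared.append(2);              // real mutex operations remain
}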

Yes, any real-world C++ compiler (e.g. GCC, Clang, VC++) will output no code for empty inline functions when optimization is turned on.

Related

How do I force the compiler to evaluate a switch at compile time?

I'm working on an embedded project (only C++14 compiler available) and I would like to optimize the speed of execution.
Here is an example of what I am doing.
enum gpio_type {
    TYPE_1,
    TYPE_2
};

void do_something();        // defined elsewhere
void do_something_else();   // defined elsewhere

template <gpio_type T>
class test {
public:
    test() {}
    void set_gpio(bool output)
    {
        switch (T)
        {
        case TYPE_1:
            do_something();
            break;
        case TYPE_2:
            do_something_else();
            break;
        }
    }
};
Will the compiler automatically remove the dead code at compile time? If it does, is that a standard feature or compiler-dependent? If it does not, is it possible to write the code in a way that forces the optimization?
You could specialize set_gpio for the different enum values - eg. :
template <gpio_type T>
class test {
public:
    test() {}
    void set_gpio(bool output);
};

template<> void test<TYPE_1>::set_gpio(bool output) {
    do_something();
}
template<> void test<TYPE_2>::set_gpio(bool output) {
    do_something_else();
}
As other answers have indicated, you might not need to do this if your compiler is anywhere close to decent at optimizing. But the above might be more readable nevertheless.
Constant propagation and dead-code elimination are among the simplest compiler optimizations. And since T is a compile-time constant, I would be extremely surprised if any compiler failed to optimize this code.
I have tested 15 compilers and platforms on godbolt, from the venerable x86 to ARM, AVR, RISC-V, Raspberry Pi and Arduino (and more). All of them compile this to the equivalent of a tail-call jump. No tests, no conditional jumps. Go check it out for yourself.
At this point I can say with reasonable confidence that there is no performance reason to modify your code.
That might depend on whether you turn optimization on and how intelligent your compiler is. I would expect current compilers to optimize this case, at least if they inline the function.
But if you want to be 100% sure:
specialize the template for the different enum values, or
keep your switch and look at the assembler output of your compiler to check that it optimized the way you want, or
use C++17 and if constexpr (sketched below)
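For completeness, a minimal sketch of the if constexpr variant mentioned above; it requires C++17, which the question rules out, and the stub function bodies are placeholders:

enum gpio_type { TYPE_1, TYPE_2 };

void do_something() {}        // placeholder bodies for the sketch
void do_something_else() {}

template <gpio_type T>
class test {
public:
    void set_gpio(bool output)
    {
        (void)output;         // unused in this sketch
        // The discarded branch is not even instantiated, so no runtime
        // test remains regardless of optimization level.
        if constexpr (T == TYPE_1)
            do_something();
        else
            do_something_else();
    }
};

int main() {
    test<TYPE_1> t;
    t.set_gpio(true);         // calls do_something() directly
}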
I would pass the functor (do_something or do_something_else) as a template argument.
That way the body of set_gpio becomes clearer, and you can be sure the choice of function is made at compile time.
In this post you can see how this is done:
Function passed as template argument
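A rough sketch of that approach, reusing the names from the question; the function-pointer non-type template parameter is just one way to do it:

void do_something() {}        // placeholder bodies for the sketch
void do_something_else() {}

template <void (*Action)()>
class test {
public:
    void set_gpio(bool output)
    {
        (void)output;
        Action();             // which function to call is fixed at compile time
    }
};

int main() {
    test<do_something> t1;
    test<do_something_else> t2;
    t1.set_gpio(true);
    t2.set_gpio(false);
}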

What is the practical difference between `inline` and `template<class = void>`?

There are two ways to declare a function in a header-only library: inline and template<class = void>. In Boost source code I can see both variants. Example follows:
inline void my_header_only_function(void)
{
    // Do something...
    return;
}

template<class = void> void my_header_only_function(void)
{
    // Do something...
    return;
}
I know what the difference is according to the C++ standard. However, any C++ compiler is much more than just the standard, and the standard is often unclear.
In a situation where the template argument is never used and is not related to recursive variadic templates, is there (and what is) a practical difference between the two variants for mainstream compilers?
I think this can be used as a weird way to allow library extension (or mocking) from outside library code by providing specialization for void or a non-template version of the function in the same namespace:
#include <iostream>

template<class = void>
int
foo(int data)
{
    ::std::cout << "template" << std::endl;
    return data;
}

// somewhere else
int
foo(int data)
{
    ::std::cout << "non-template" << std::endl;
    return data;
}

int main()
{
    foo(1); // non template overload is selected
    return 0;
}
online compiler
One difference is that binary code for the function may become part of the generated object file even if the function is never used in that file, but there will never be any code for the template if it's not used.
I'm the author of Beast. Hopefully I will be able to shed some light on why you see one versus the other. It really is very simple: the template seems less likely to be inlined into calling functions, bloating the code needlessly. I know that "inline" is really only supposed to mean "remove duplicate definitions", but sometimes compiler implementors get overzealous. The template approach is also a little bit harder on the compiler (Travis craps out sometimes at only 2GB RAM). So I decided to try writing some new stuff using the "inline" keyword. I still don't know how I feel about it.
The short answer is that I was doing it one way for a long time and then I briefly did it the other way for no particularly strong reason. Sorry if that is not as exciting as the other theories! (which were very interesting in fact)

Template specialization vs. Compiler optimization

I have a class with a boolean template argument.
template<bool b>
class Foo {
    void doFoo();
};
I want doFoo to do different things based on the value of b.
Naively I could write
Option 1
template<bool b> void Foo<b>::doFoo() {
    if (b) cout << "hello";
    else cout << "goodbye";
}
This seems inefficient to me because I have to execute an if every time the function is called, even though the correct branch should be known at compile time. I could fix this with template specialization:
Option 2
template<> void Foo<true>::doFoo() { cout << "hello"; }
template<> void Foo<false>::doFoo() { cout << "goodbye"; }
This way no conditionals are executed at runtime. This solution is a bit more complicated (especially since in my real code the class has several template arguments, and since you can't partially specialize member functions I would need to wrap the function in a class).
My question is, is the compiler smart enough to know not to execute the conditional in option 1 since it always executes the same way or do I need to write the specializations? If the compiler is smart enough I would be happy to know if this is compiler dependent or a language feature that I can rely on?
The compiler will probably optimize the branch away since it is known at compile time what b is. This is not guaranteed though and the only way to know for sure is to check the assembly.
If you can use C++17 you can use if constexpr and that guarantees only one branch will exist.
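A minimal sketch of what that looks like for the Foo example (C++17 required):

#include <iostream>

template <bool b>
class Foo {
public:
    void doFoo() {
        // Only the selected branch is instantiated; the other is discarded
        // at compile time, so no runtime conditional can survive.
        if constexpr (b)
            std::cout << "hello";
        else
            std::cout << "goodbye";
    }
};

int main() {
    Foo<true>{}.doFoo();     // prints "hello"
    Foo<false>{}.doFoo();    // prints "goodbye"
}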
This seems inefficient to me because I have to execute an if every time the function is called
The compiler will probably optimize this away - but it's not guaranteed by the standard. To be certain, you should look at the output of the compiler you care about (with the compile options you plan to use): e.g. clang doesn't have a branch in the linked example (the un-optimized version has lots of function-call boilerplate but no branch).
In C++17 you can use if constexpr, and the branch not taken will be discarded at compile time.

Can instantiation of a template lead to a binary code duplication, does compiler prevent it?

Suppose, we declare the template:
template <class functor, int index>
class MyClass
{
public:
    MyClass() { someFunction(index); }
private:
    void someFunction(int index)
    {
        while (index--)
            functor();
    }
    int commonFunction(void)
    {
        return M_PI;
    }
};
Note that the method commonFunction doesn't depend on the template parameters.
Client uses this template:
MyClass<func1,100> t1;
MyClass<func2,100> t2;
// ...
MyClass<funci,100> ti;
// where i, for example in 1 .. 1000
Will instantiation of the template lead to the duplication of commonFunction in the binary code?
Can a compiler prevent that duplication?
Does the C++ standard define that this duplication must be prevented, so that every compiler should provide this optimization?
Of course this can easily be solved by implementing the common functionality for all templates in a base class and moving the differences into the templated class, like this:
class baseMyClass
{
protected: // so the derived template can actually call it
    int commonFunction(void)
    {
        return M_PI;
    }
};

template <class functor, int index>
class MyClass : private baseMyClass
{
public:
    MyClass() { someFunction(index); }
private:
    void someFunction(int index)
    {
        while (index--)
            functor();
    }
};
But the purpose of my question is to find out:
Does the standard define that, in cases like the one I gave, this optimization should be performed, so we can simply use templates and rely on the compiler?
Does the standard define that, in cases like the one I gave, this optimization should be performed, so we can simply use templates and rely on the compiler?
No, the Standard in no way requires conforming compilers to perform this kind of optimization. Code bloat is a well-known drawback of templates.
That said, since your function does nothing other than return a constant, it will probably be inlined; and even if it is not inlined, the linker may recognize that several identical instantiations of that function have been generated and merge them all.
However, this behavior is not mandated by the Standard.
The standard does not mandate optimisation in any case, so the answer to your last question is no for any case you can think of. That said, the standard does not prevent the optimisation either, and I would guess many compilers are smart enough to do it in this simple case.

Function hooking in C++?

With "hooking" I mean the ability to non-intrusively override the behavior of a function. Some examples:
Print a log message before and/or after the function body.
Wrap the function body in a try catch body.
Measure duration of a function
etc...
I have seen different implementations in various programming languages and libraries:
Aspect Oriented Programming
JavaScript's first class functions
OOP decorator pattern
WinAPI subclassing
Ruby's method_missing
SWIG's %exception keyword which is meant to wrap all functions in a try/catch block can be (ab)used for the purpose of hooking
My questions are:
IMO this is such an incredibly useful feature that I wonder why it has never been implemented as a C++ language feature. Are there any reasons that prevent this from being made possible?
What are some recommended techniques or libraries to implement this in a C++ program?
If you're talking about causing a new method to be called before/after a function body, without changing the function body, you can base it on this, which uses a custom shared_ptr deleter to trigger the after-body function. It cannot be used for try/catch, since the before and after need to be separate functions using this technique.
Also, the version below uses shared_ptr, but with C++11 you should be able to use unique_ptr to get the same effect without the cost of creating and destroying a shared pointer every time you use it.
#include <iostream>
#include <unistd.h>   // for sleep()
#include <boost/chrono/chrono.hpp>
#include <boost/chrono/system_clocks.hpp>
#include <boost/shared_ptr.hpp>

template <typename T, typename Derived>
class base_wrapper
{
protected:
    typedef T wrapped_type;

    Derived* self() {
        return static_cast<Derived*>(this);
    }

    wrapped_type* p;

    // Deleter that runs the derived class's suffix() when the temporary
    // shared_ptr returned by operator-> is destroyed.
    struct suffix_wrapper
    {
        Derived* d;
        suffix_wrapper(Derived* d): d(d) {};
        void operator()(wrapped_type* p)
        {
            d->suffix(p);
        }
    };
public:
    explicit base_wrapper(wrapped_type* p) : p(p) {};

    void prefix(wrapped_type* p) {
        // Default does nothing
    };
    void suffix(wrapped_type* p) {
        // Default does nothing
    }

    boost::shared_ptr<wrapped_type> operator->()
    {
        self()->prefix(p);
        return boost::shared_ptr<wrapped_type>(p, suffix_wrapper(self()));
    }
};

template<typename T>
class timing_wrapper : public base_wrapper< T, timing_wrapper<T> >
{
    typedef base_wrapper< T, timing_wrapper<T> > base;
    typedef boost::chrono::time_point<boost::chrono::system_clock, boost::chrono::duration<double> > time_point;
    time_point begin;
public:
    timing_wrapper(T* p): base(p) {}

    void prefix(T* p)
    {
        begin = boost::chrono::system_clock::now();
    }
    void suffix(T* p)
    {
        time_point end = boost::chrono::system_clock::now();
        std::cout << "Time: " << (end-begin).count() << std::endl;
    }
};

template <typename T>
class logging_wrapper : public base_wrapper< T, logging_wrapper<T> >
{
    typedef base_wrapper< T, logging_wrapper<T> > base;
public:
    logging_wrapper(T* p): base(p) {}

    void prefix(T* p)
    {
        std::cout << "entering" << std::endl;
    }
    void suffix(T* p)
    {
        std::cout << "exiting" << std::endl;
    }
};

template <template <typename> class wrapper, typename T>
wrapper<T> make_wrapper(T* p)
{
    return wrapper<T>(p);
}

class X
{
public:
    void f() const
    {
        sleep(1);
    }
    void g() const
    {
        std::cout << __PRETTY_FUNCTION__ << std::endl;
    }
};

int main () {
    X x1;
    make_wrapper<timing_wrapper>(&x1)->f();
    make_wrapper<logging_wrapper>(&x1)->g();
    return 0;
}
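For reference, here is a self-contained sketch of the C++11 unique_ptr idea mentioned above, stripped down to a single wrapper; the names (logged, suffix) are illustrative, not from the original answer:

#include <iostream>
#include <memory>

struct X {
    void f() const { std::cout << "in f()\n"; }
};

template <typename T>
class logged {
    T* p;
    // The deleter does not delete anything: it just runs the "after" hook
    // when the temporary unique_ptr dies at the end of the full expression.
    struct suffix {
        void operator()(T*) const { std::cout << "exiting\n"; }
    };
public:
    explicit logged(T* p) : p(p) {}
    std::unique_ptr<T, suffix> operator->() {
        std::cout << "entering\n";                       // "before" hook
        return std::unique_ptr<T, suffix>(p, suffix{});  // "after" hook on destruction
    }
};

int main() {
    X x;
    logged<X>(&x)->f();   // prints: entering / in f() / exiting
}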
There are compiler-specific features you can leverage, such as GCC's -finstrument-functions. Other compilers will likely have similar features. See this SO question for additional details.
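A rough sketch of the GCC route (compiler-specific and not portable; the two hook functions are the ones GCC documents for -finstrument-functions):

// Compile with: g++ -finstrument-functions hooks.cpp
#include <cstdio>

extern "C" {

// The hooks themselves must not be instrumented, or they would recurse.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site)
{
    std::printf("enter %p (called from %p)\n", this_fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site)
{
    std::printf("exit  %p (called from %p)\n", this_fn, call_site);
}

} // extern "C"

void work() {}   // every non-excluded function gets enter/exit calls injected

int main() {
    work();
}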
Another approach is to use something like Bjarne Stroustrup's function wrapping technique.
To answer your first question:
Most dynamic languages have their method_missing constructs, PHP has magic methods (__call and __callStatic) and Python has __getattr__. I think the reason this isn't available in C++ is that it goes against the typed nature of C++. Implementing this on a class means that any typo will end up calling this function (at runtime!), which prevents catching these problems at compile time. Mixing C++ with duck typing doesn't seem to be a good idea.
C++ tries to be as fast as possible, so first-class functions are out of the question.
AOP. Now this is more interesting: technically there's nothing that prevents this from being added to the C++ standard (apart from the fact that adding another layer of complexity to an already extremely complex standard might not be a good idea). In fact there are compilers which are able to weave code; AspectC++ is one of them. A year ago or so it wasn't stable, but it looks like they have since managed to release 1.0 with a pretty decent test suite, so it might do the job now.
There are a couple of techniques, here's a related question:
Emulating CLOS :before, :after, and :around in C++.
IMO this is an incredibly useful feature, so why is it not a C++ language feature? Are there any reasons that prevent this from being made possible?
C++ the language does not provide any means to do so directly. However, it also does not pose any direct constraint against this (AFAIK). This type of feature is easier to implement in an interpreter than in native code, because the interpreter is a piece of software, not a CPU streaming machine instructions. You could well provide a C++ interpreter with support for hooks if you wanted to.
The problem is why people use C++. A lot of people are using C++ because they want sheer execution speed. To achieve that goal, compilers output native code in the operating system's preferred format and try to hard-code as much as possible into the compiled executable. The last part often means computing addresses at compile/link time. If you fix a function's address at that time (or, even worse, inline the function body) then there is no more support for hooks.
That being said, there are ways to make hooking cheap, but it requires compiler extensions and is totally not portable. Raymond Chen blogged about how hot patching is implemented in the Windows API. He also recommends against its use in regular code.
This is not a C++ thing, but to accomplish some of things you mention, I have used the LD_PRELOAD environment variable in *nix systems. A good example of this technique in action is the faketime library that hooks into the time functions.
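A rough sketch of the LD_PRELOAD technique on Linux/glibc, hooking time() in the spirit of faketime; the file and library names are made up for the example:

// Build and run roughly as:
//   g++ -std=c++17 -shared -fPIC -o libhook.so hook.cpp -ldl
//   LD_PRELOAD=./libhook.so ./your_program
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           // for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <cstdio>
#include <ctime>

// Replacement for libc's time(): log the call, then forward to the real one.
// (noexcept matches glibc's declaration of time() when compiled as C++.)
extern "C" time_t time(time_t* t) noexcept
{
    using time_fn = time_t (*)(time_t*);
    static time_fn real_time =
        reinterpret_cast<time_fn>(dlsym(RTLD_NEXT, "time"));
    std::fputs("time() hooked\n", stderr);
    return real_time(t);
}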
At least one C++ framework that I use provides a set of base classes
class RunManager;
class PhysicsManager;
// ...
Each of which defines a set of actions
void PreRunAction();
void RunStartAction();
void RunStopAction();
void PostRunAction();
which are no-ops by default, but which the user can override when deriving from the parent class.
Combine that with conditional compilation (yeah, I know "Yuk!") and you can get what you want.
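A minimal sketch of that hook-method pattern, reusing the class and method names quoted above (the real framework's interfaces will differ):

class RunManager {
public:
    virtual ~RunManager() = default;

    // No-op hooks the framework calls at fixed points; users override what they need.
    virtual void PreRunAction()   {}
    virtual void RunStartAction() {}
    virtual void RunStopAction()  {}
    virtual void PostRunAction()  {}

    void Run() {                  // framework-driven sequence
        PreRunAction();
        RunStartAction();
        // ... actual work ...
        RunStopAction();
        PostRunAction();
    }
};

class LoggingRunManager : public RunManager {
    void PreRunAction() override  { /* e.g. open a log file */ }
    void PostRunAction() override { /* e.g. flush and close it */ }
};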
There has to be a way to implement the functionality without affecting the performance of code that doesn't use it. C++ is designed on the principle that you only pay performance costs for the features you use. Inserting checks in every function to see whether it's been overridden would be unacceptably slow for many C++ projects. In particular, making it work so that there's no performance cost while still allowing for independent compilation of the overridden and overriding functions would be tricky. If you only allow compile-time overriding, it's easier to do performantly (the linker can take care of overwriting addresses), but you're comparing to Ruby and JavaScript, which let you change these things at runtime.
Because it would subvert the type system. What does it mean for a function to be private or non-virtual if someone can override its behavior anyway?
Readability would greatly suffer. Any function might have its behavior overridden somewhere else in the code! The more context you need to understand what a function does, the harder it is to figure out a large code base. Hooking is a bug, not a feature. At least if being able to read what you wrote months later is a requirement.