I have read in Scott Meyers' Effective C++ book that:
When you inline a function you may enable the compiler to perform context specific optimizations on the body of function. Such optimization would be impossible for normal function calls.
Now the question is: what is context-specific optimization, and why is it necessary?
I don't think "context specific optimization" is a defined term, but I think it basically means the compiler can analyse the call site and the code around it and use this information to optimise the function.
Here's an example. It's contrived, of course, but it should demonstrate the idea:
Function:
int foo(int i)
{
if (i < 0) throw std::invalid_argument("");
return -i;
}
Call site:
int bar()
{
int i = 5;
return foo(i);
}
If foo is compiled separately, it must contain a comparison and exception-throwing code. If it's inlined in bar, the compiler sees this code:
int bar()
{
int i = 5;
if (i < 0) throw std::invalid_argument("");
return -i;
}
Any sane optimiser will evaluate this as
int bar()
{
return -5;
}
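The example above can be put together as a small self-contained sketch. The `inline` keyword, the include and the exception message are additions for illustration. Whether a given compiler really folds `bar` down to a constant depends on the compiler and the optimisation level, so it is worth checking the generated assembly rather than assuming it.

```cpp
#include <stdexcept>

// Defined in the same translation unit so the compiler can see the body
// at the call site; `inline` is only a hint, not a guarantee.
inline int foo(int i)
{
    if (i < 0) throw std::invalid_argument("negative argument");
    return -i;
}

int bar()
{
    int i = 5;      // value is known at compile time
    return foo(i);  // after inlining, the check `5 < 0` is provably false
}
```

Calling `bar()` yields -5 either way; the optimisation only changes how much work is done to get there.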
If the compiler chooses to inline a function, it replaces the call to that function with the function's body. It then has more code to optimize inside the caller's body, which often leads to better code.
Imagine that:
bool callee(bool a){
if(a) return false;
else return true;
}
void caller(){
if(callee(true)){
//Do something
}
//Do something
}
Once inlined, the code will look like this (approximately):
void caller(){
bool a = true;
bool ret;
if(a) ret = false;
else ret = true;
if(ret){
//Do something
}
//Do something
}
Which may be optimized further too:
void caller(){
if(false){
//Do something
}
//Do something
}
And then to:
void caller(){
//Do something
}
The function is now much smaller, and you avoid the cost of the function call and, especially (regarding the question), the cost of branching.
Say the function is
void fun( bool b) { if(b) do_sth1(); else do_sth2(); }
and it is called in a context where the parameter is known to be false
bool param = false;
...
fun( param);
then the compiler may reduce the function body to
...
do_sth2();
I don't think that context-specific optimization means something specific, and you probably can't find an exact definition.
A nice example would be a classical getter for a class attribute. Without inlining, the program has to:
jump to the getter body
move the value to a register (eax on x86 under Windows with default Visual Studio settings)
jump back to the caller
move the value from eax to a local variable
With inlining, it can skip almost all of this work and move the value directly to the local variable.
The optimizations depend strictly on the compiler, but a lot of things can happen (variable allocation may be skipped, code may get reordered, and so on). Whenever the call is inlined, you save the call/jump, which is a comparatively expensive instruction.
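A minimal sketch of the getter case, assuming a hypothetical `Counter` class and `read_twice` function that are not from the original answer. Because the getter is defined inside the class body it is implicitly inline, so the compiler is free to replace each call with a direct read of the member.

```cpp
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    // Defined inside the class, so it is implicitly inline.
    int value() const { return value_; }
private:
    int value_;
};

int read_twice(const Counter& c)
{
    // With inlining, each call can become a direct load of c.value_.
    int first = c.value();
    int second = c.value();
    return first + second;
}
```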
More reading on optimisation here.
Consider the following bar functions
#include <iostream>
void foo(){
std::cout << "Hello" << std::endl;
}
void bar1(){
return foo();
}
void bar2(){
foo();
}
void bar3(){
foo();
return;
}
int main()
{
bar1();
bar2();
bar3();
return 1;
}
These functions do exactly the same thing, and actually godbolt produces the same code for all three (as one would hope). The question I have is simply whether there are any software engineering paradigms/guidelines that advocate one form over the other, and if there are any reasons why you would prefer one over the other. They seem to produce the same machine code, but I am imagining that one might be viewed as "easier to maintain", or something like that.
This is quite opinion-based, though I'd say the general consensus is to write it like bar2(). Don't return explicitly unless you have to return early, and don't write return func() if func() returns void; that just confuses readers because you're not actually returning a value.
I totally agree with Sombrero Chicken's answer. But I'll also add that the construct like
void bar1(){
return foo();
}
doesn't make much sense for ordinary functions that return void, but may be useful for template code when you don't know the actual return type, e.g:
template <typename T>
auto SomeTemplateFunction(...)
{
// do some work
...
return SomeOtherTemplateFunction<T>(...);
}
This will work regardless of whether SomeOtherTemplateFunction<T>'s return type is void or not.
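Here is a small compilable sketch of that idea. The names `call_through`, `bump` and `twice` are hypothetical, and a C++11 trailing return type is used instead of plain `auto` so that it also builds without C++14 return type deduction.

```cpp
#include <utility>

// Hypothetical wrapper: forwards to any callable and returns whatever it returns.
template <typename F, typename... Args>
auto call_through(F&& f, Args&&... args)
    -> decltype(std::forward<F>(f)(std::forward<Args>(args)...))
{
    // `return expr;` is allowed even when expr has type void.
    return std::forward<F>(f)(std::forward<Args>(args)...);
}

int g_calls = 0;
void bump() { ++g_calls; }          // returns void
int twice(int x) { return 2 * x; }  // returns int
```

Both `call_through(bump)` and `call_through(twice, 21)` compile with the same template, which is the case where `return f();` on a void expression earns its keep.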
It's quite opinion-based; what I can say is that (3) is flagged by clang-tidy under the readability-redundant-control-flow check.
The idea is that control already returns at the end of the function, so the explicit return is superfluous and should be removed.
It's my first year of using C++ and learning on the way. I'm currently reading up on Return Value Optimizations (I use C++11 btw). E.g. here https://en.wikipedia.org/wiki/Return_value_optimization, and immediately these beginner examples with primitive types spring to mind:
int& func1()
{
int i = 1;
return i;
}
//error, 'i' was declared with automatic storage (in practice on the stack(?))
//and is undefined by the time function returns
...and this one:
int func1()
{
int i = 1;
return i;
}
//perfectly fine, 'i' is copied... (to previous stack frame... right?)
Now, I get to this and try to understand it in the light of the other two:
Simpleclass func1()
{
return Simpleclass();
}
What actually happens here? I know most compilers will optimise this, what I am asking is not 'if' but:
how the optimisation works (the accepted response)
does it interfere with storage duration: stack/heap (Old: Is it basically random whether I've copied from stack or created on heap and moved (passed the reference)? Does it depend on created object size?)
is it not better to use, say, explicit std::move?
You won't see any effect of RVO when returning ints.
However, when returning large objects like this:
struct Huge { ... };
Huge makeHuge() {
Huge h { x, y, z };
h.doSomething();
return h;
}
The following code...
auto h = makeHuge();
... after RVO would be implemented something like this (pseudo code) ...
h_storage = allocate_from_stack(sizeof(Huge));
makeHuge(addressof(h_storage));
auto& h = *properly_aligned(h_storage);
... and makeHuge would compile to something like this...
void makeHuge(Huge* h_storage) // in fact this address can be
// inferred from the stack pointer
// (or just 'known' when inlining).
{
Huge* phuge = new (h_storage) Huge(x, y, z); // placement new into the caller's storage
phuge->doSomething();
}
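To see what the compiler actually does, one option is to count constructor calls. The sketch below uses a hypothetical `Tracked` type; it is not from the original answer. It also touches on the explicit `std::move` question: returning a local by name is already treated as an rvalue when elision does not happen, while wrapping it in `std::move` makes the return ineligible for elision and forces a move.

```cpp
#include <utility>

struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returning a local by name: the compiler may construct `t` directly in the
// caller's storage (NRVO); if it does not, `t` is moved, never copied.
Tracked make_plain()
{
    Tracked t;
    return t;
}

// std::move(t) is not the name of a local object, so copy elision
// cannot apply and a move constructor call is required here.
Tracked make_moved()
{
    Tracked t;
    return std::move(t);
}
```

With a typical optimising compiler, `make_plain()` tends to leave both counters untouched, while `make_moved()` always increments `Tracked::moves` at least once; some compilers warn about the latter as a pessimizing move.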
In my jpg decoder I have a loop with an if statement that will always be true or always be false depending on the image. I could make two separate functions to avoid the if statement but I was wondering out of curiosity what the effect on efficiency would be using a function pointer instead of the if statement. It will point to the inline function if true or point to an empty inline function if false.
class jpg{
private:
    // empty function
    inline void nothing();
    // real function
    inline void function();
    // pointer to one of the two member functions above
    void (jpg::*functionptr)() = nullptr;
public:
    void decode();
};
inline void jpg::nothing(){}
void jpg::decode(){
    functionptr = &jpg::nothing;
    if(trueorfalse){
        functionptr = &jpg::function;
    }
    while(kazillion){
        (this->*functionptr)();
        dootherstuff();
    }
}
Could this be faster than an if statement? My guess is no, because the inline will be useless since the compiler won't know which function to inline at compile time and the function pointer address resolve is slower than an if statement.
I have profiled my program and while I expected a noticeable difference one way or the other when I ran my program... I did not experience a noticeable difference. So I'm just wondering out of curiosity.
It is very likely that the if statement would be faster than invoking a function, as the if will just be a short jump vs the overhead of a function call.
This has been discussed here: Which one is faster ? Function call or Conditional if Statement?
The "inline" keyword is just a hint asking the compiler to expand the function body at the call site. If you call an inline function through a function pointer, the call usually cannot be inlined, because the compiler generally does not know at compile time which function the pointer refers to:
Read: Do inline functions have addresses?
If you feel that the if statement is slowing it too much, you could eliminate it altogether by using separate while statements:
if (trueorfalse) {
while (kazillion) {
trueFunction();
dootherstuff();
}
} else {
while (kazillion) {
dootherstuff();
}
}
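If duplicating the loop by hand is unappealing, the same split can be sketched with a bool template parameter, so that the runtime decision is made once and each loop is compiled for a fixed choice. All names below (`Decoder`, `extra_step`, `do_other_stuff`, `loop`, `run`) are hypothetical stand-ins, not the asker's real code, and whether this beats the plain if statement is something only a profiler can say.

```cpp
struct Decoder {
    int extra_calls = 0;  // counts calls to the optional step
    int other_calls = 0;  // counts calls to the unconditional step

    void extra_step() { ++extra_calls; }
    void do_other_stuff() { ++other_calls; }

    // The test on WithExtra is on a compile-time constant, so each
    // instantiation can be compiled without a per-iteration branch.
    template <bool WithExtra>
    void loop(int iterations)
    {
        for (int i = 0; i < iterations; ++i) {
            if (WithExtra) extra_step();
            do_other_stuff();
        }
    }

    // The runtime decision is made once, outside the loop.
    void run(bool with_extra, int iterations)
    {
        if (with_extra) loop<true>(iterations);
        else            loop<false>(iterations);
    }
};
```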
Caution 1: I am not really answering the above question, on purpose. If one wants to know what is faster between an if statement and a function call via a pointer in the above example, then mbonneau gives a very good answer.
Caution 2: The following is pseudo-code.
Besides curiosity, I truly think one should not ask which is faster, an if statement or a function call, in order to optimize code. The gain would almost certainly be very small, and the resulting code might be twisted in a way that hurts readability AND maintenance.
For my research, I do care about performance; it is a fundamental notion I have to stick with. But I care more about code maintenance, and if I have to choose between a good structure and a slight optimization, I definitely choose the good structure. So, if it were me, I would write the above code as follows (avoiding if statements), using composition through a Strategy Pattern.
class MyStrategy {
public:
    virtual ~MyStrategy() = default;
    virtual void MyFunction( Stuff& ) = 0;
};
class StrategyOne : public MyStrategy {
public:
    void MyFunction( Stuff& ) override; // do something
};
class StrategyTwo : public MyStrategy {
public:
    void MyFunction( Stuff& ) override { } // do nothing, and if you
                                           // change your mind it could
                                           // do something later.
};
class jpg{
public:
    jpg( MyStrategy& strat ) : strat(strat) { }
    void func( Stuff &stuff ) { strat.MyFunction( stuff ); }
private:
    ...
    MyStrategy& strat; // a reference: MyStrategy is abstract and cannot be stored by value
};
int main(){
    StrategyOne one;
    StrategyTwo two;
    Stuff stuff;
    vector<jpg> v { jpg( one ), jpg( two ) };
    for( auto& e : v )
    {
        e.func( stuff );
        dootherstuff();
    }
}
Assume I have the following function:
// Precondition: foo is '0' or 'MAGIC_NUMBER_4711'
// Returns: -1 if foo is '0'
// 1 if foo is 'MAGIC_NUMBER_4711'
int transmogrify(int foo) {
if (foo == 0) {
return -1;
} else if (foo == MAGIC_NUMBER_4711) {
return 1;
}
}
The compiler complains "missing return statement", but I know that foo never has different values than 0 or MAGIC_NUMBER_4711, or else my function shall have no defined semantics.
What are preferable solutions to this?
Is this really an issue, i.e. what does the standard say?
Sometimes, your compiler is not able to deduce that your function actually has no missing return. In such cases, several solutions exist:
Assume the following simplified code (modern compilers will see that every path returns here; it is only meant as an illustration):
if (foo == 0) {
return bar;
} else {
return frob;
}
Restructure your code
if (foo == 0) {
return bar;
}
return frob;
This works well if you can interpret the if-statement as a kind of firewall or precondition.
abort()
if (foo == 0) {
return bar;
} else {
return frob;
}
abort(); return -1; // unreachable
Return something else accordingly. The comment tells fellow programmers and yourself why this is there.
throw
#include <stdexcept>
if (foo == 0) {
return bar;
} else {
return frob;
}
throw std::runtime_error ("impossible");
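Applied to the function from the question, the throwing variant could look roughly like this. The value given to `MAGIC_NUMBER_4711` is a placeholder assumed for illustration, since the question does not state it, and `std::invalid_argument` is one reasonable choice of exception rather than the only one.

```cpp
#include <stdexcept>

// Placeholder value, assumed for illustration; the question does not give it.
const int MAGIC_NUMBER_4711 = 4711;

int transmogrify(int foo)
{
    if (foo == 0) {
        return -1;
    }
    if (foo == MAGIC_NUMBER_4711) {
        return 1;
    }
    // Precondition violated: fail loudly instead of flowing off the end.
    throw std::invalid_argument("transmogrify: foo must be 0 or MAGIC_NUMBER_4711");
}
```

Every path now ends in either a return or a throw, so there is nothing left for the compiler to warn about.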
Disadvantages of Single Function Exit Point
flow of control
Some fall back to one-return-per-function a.k.a. single-function-exit-point as a workaround. This might be seen as obsolete in C++ because you almost never know where the function will really exit:
void foo(int&);
int bar () {
int ret = -1;
foo (ret);
return ret;
}
Looks nice and looks like SFEP, but reverse engineering the 3rd party proprietary libfoo reveals:
void foo (int &) {
if (rand()%2) throw ":P";
}
This argument does not hold true if bar() is nothrow and so can only call nothrow functions.
complexity
Every mutable variable increases the complexity of your code and places a higher burden on its maintainer. It means more code and more state to test and verify, which in turn means more state the maintainer has to keep in mind and less mental capacity left for the important stuff.
missing default constructor
Some classes have no default constructor, and you would have to write really bogus code, if it is possible at all:
File mogrify() {
File f ("/dev/random"); // need bogus init because it requires readable stream
...
}
That's quite a hack just to get it declared.
In C89 and in C99, the return statement is never required, even in a function whose return type is not void.
C99 only says:
(C99, 6.9.1p12) "If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined."
In C++11, the Standard says:
(C++11, 6.6.3p2) "Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function"
Just because you can tell that the input will only have one of two values doesn't mean the compiler can, so it's expected that it will generate such a warning.
You have a couple of options for helping the compiler figure this out.
You could use an enumerated type for which the two values are the only valid enumerators. The compiler then has a much better chance of seeing that one of the two branches has to execute and that there's no missing return, although some compilers still warn because an enum object can hold other values.
You could abort at the end of the function.
You could throw an appropriate exception at the end of the function.
Note that the latter two options are better than silencing the warning because they predictably show you when the preconditions are violated rather than allowing undefined behavior. Since the function takes an int and not a class or enumerated type, it's only a matter of time before someone calls it with a value other than the two allowed ones, and you want to catch such calls as early in the development cycle as possible rather than leaving them as undefined behavior.
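The enumerated-type and abort options can be combined, as in the sketch below. The enumerator names `Zero` and `Magic` are hypothetical. The trailing `std::abort()` is there because an enum object can still carry a value outside its enumerators (after a cast, for example), which is also why some compilers would otherwise keep warning.

```cpp
#include <cstdlib>

enum class Mode { Zero, Magic };  // hypothetical names for the two valid inputs

int transmogrify(Mode mode)
{
    switch (mode) {
    case Mode::Zero:  return -1;
    case Mode::Magic: return 1;
    }
    // Reached only if mode holds a value outside its enumerators:
    // stop here rather than fall off the end of the function.
    std::abort();
}
```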
Actually the compiler is doing exactly what it should.
int transmogrify(int foo) {
if (foo == 0) {
return -1;
} else if (foo == MAGIC_NUMBER_4711) {
return 1;
}
// you know you shouldn't get here, but the compiler has
// NO WAY of knowing that. In addition, you are putting
// great potential for the caller to create a nice bug.
// Why don't you catch the error using an ELSE clause?
else {
error( "transmogrify had invalid value %d", foo ) ;
return 0 ;
}
}
Some functions which calculate booleans:
bool a()
{
return trueorfalse;
}
bool b()
{
//...
}
bool c()
{
//...
}
This condition
//somewhere else
if((a()&&b()&&c()) || (a()&&b()&&!c()) )
{
doSomething();
}
can also be written as
if(a()&&b())
{
doSomething();
}
Will compilers usually optimize this away?
And what about pure boolean values:
if((a&&b&&c) || (a&&b&&!c))
{
doSomething();
}
Since the functions may have side effects, the conditional cannot be "optimized" in any way, since all the functions will have to be called (conditionally) in a well-defined manner.
If you do want optimization, you can assign the result to variables first:
const bool ba = a(), bb = b(), bc = c();
if (ba && bb && bc || ba && bb && !bc) { /* ... */ } // probably optimized to "ba && bb"
It's possible that constexpr functions, introduced in C++11, will allow for such optimization if they yield a constant expression, but I'm not sure.
You can even condense this down: In the following code, f() has to be called twice:
if (f() && false || f() && true)
{
// ...
}
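A small sketch makes the double call visible. The counter `calls` and the wrapper `evaluate` are additions for illustration; the condition is the one from the snippet above, with parentheses added only for readability.

```cpp
int calls = 0;  // counts how often f() runs

bool f()
{
    ++calls;  // observable side effect
    return true;
}

bool evaluate()
{
    // Left operand: f() runs, then `&& false` makes it false.
    // Because the left side is false, `||` must evaluate the right side,
    // so f() runs a second time.
    return (f() && false) || (f() && true);
}
```

After one call to `evaluate()`, `calls` is 2, and a compiler may not reduce that to one call unless it can prove `f()` has no observable effect.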
No, they won't. The reason is that the optimization would be visible to the user because it would change the observable side effects. For example, in your optimized version c() would never execute even though the user explicitly asked for it. This can and will lead to bugs.
Since your premise is flawed, no, they won't.
(a()&&b()&&c()) || (a()&&b()&&!c()) definitely can't be rewritten as (a()&&b())
C (and C++) isn't a functional programming language (like Haskell).
But the problem is that it can't be refactored in that way, generally speaking!
If any of the functions have side effects that change the result of c() then the second call would possibly return a different result from the first one.
Not only that, but due to short-circuit execution things could be muddied even further.
Very often in C the return value of a function indicates whether the function executed successfully or not, for example when calling a graphics routine or converting a file. Think how often you use pointers to change something external to the function, or call another function that outputs something. As someone said, this isn't functional programming.
If the compiler is able to determine that foo() changes nothing and has no side effects, then it may by all means simplify the expression, but I would NOT count on it.
Here is a very simple example
#include <iostream>
bool foo()
{
std::cout << "this needs to be printed each time foo() is called, even though it's called in a logical expression\n";
return true;
}
int main()
{
if ((foo() && !(foo()) || foo() && !(foo())))
return 0;
return 1;
}
Edit: any boolean algebra on plain variables should be simplified.
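For plain bool variables there are no side effects, so the rewrite from the question is safe. The following sketch, with hypothetical function names, checks all eight input combinations to confirm that the two forms agree; whether a particular compiler performs the simplification is a separate matter that the generated code would have to show.

```cpp
bool original(bool a, bool b, bool c)
{
    return (a && b && c) || (a && b && !c);
}

bool simplified(bool a, bool b)
{
    return a && b;
}

// Tries all eight combinations of a, b and c.
bool forms_agree()
{
    for (int bits = 0; bits < 8; ++bits) {
        bool a = (bits & 1) != 0;
        bool b = (bits & 2) != 0;
        bool c = (bits & 4) != 0;
        if (original(a, b, c) != simplified(a, b)) return false;
    }
    return true;
}
```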