Can the compiler optimize away a method call? - c++

I have a C++ class like this:
class MyClass {
public:
    int calculate( int arg1 ) const;
    void side_effect( int arg ) const;
};
Which is used like this:
{
    MyClass m;
    m.calculate( 100 );
    m.side_effect( 100 );
}
Is the compiler free to skip the m.calculate() call, since I do not hold on to the return value? For the side_effect() method I really hope the compiler cannot skip the call, even though the method is marked const.
EDIT: The reason I ask this question is that my calculate() function has a side effect, but through the use of mutable it is marked as const. Now in the normal case I want to hold on to the return value, and the whole problem is moot, but in the case illustrated above I am only interested in being certain that the side effect has been invoked (yes, I know it is not pretty ...). Reading the answers/comments I get the feeling you think the compiler can deduce whether a method has side effects; that was surprising to me.

It depends on what m.calculate() does.
If it only retrieves values and then throws them away then, indeed, there is nothing useful for your computer to do here and your finished program may very well not even make the call.
However, the extent to which compilers can perform optimisations is limited by the visibility of the function's definition, among other things.
The const has nothing to do with it. So, if m.side_effect() has side effects, it cannot be skipped.

The compiler will optimize away any code that has no effect at runtime, if it can tell that at compile time, e.g. branches that are never taken and values that are never used. Const data and objects can be handy here since their values are known at compile time. Putting const on a method itself won't change that: it just means that the method won't change its object. The object itself is still non-const, so even though the method is const, the compiler doesn't know for certain what the data values could be at the time the method runs. There's still the possibility that a non-const method changed the object at some point.
Additionally, methods defined inside the class body like that are implicitly inline: the call may be replaced by the method's body instead of an actual function call. Whatever "action" the side_effect method performs will then be written directly into the code in place of the call itself. The whole thing won't be optimized away unless it actually does nothing, which shouldn't be a problem.
BTW. A const method can still affect related objects. E.g. if the class holds a pointer, a const method can freely change the data that's pointed to, but can't change the pointer itself.

Let's distinguish 'can' and 'should' to be more precise.
With regard to 'can', the ability of the compiler to optimize certain code constructs depends on the code design and on how "smart" the compiler is. The more hints the compiler has, the more tightly it can optimize.
In your case, if the functions side_effect and calculate are dummies and don't do anything (and calculate returns a statically hardcoded value), they can be optimized out. Beyond that, their parameters can also be optimized away (please don't mix this case with 'should') at the place they are declared and defined. But for that (especially if a simple compiler is used) you may need to give the compiler more hints about the parameters, e.g. by declaring them as const references to int rather than passing them by value.
In this case, your code
MyClass m;
m.calculate( 100 );
m.side_effect( 100 );
won't produce any instructions in the executable, either for the functions or for their parameters.
Also, by adding const you won't force the compiler to optimize the code, but you help it recognize that the code can be optimized where it should be.

Related

Is there any point to the const variable here?

#include <iostream>

int square(int const &i) {
    return i * i;
}

int main() {
    int side = 5;
    std::cout << square(side) << "\n";
}
Just looking at some code, and this is a basic question, but the const doesn't really do anything here, does it? I mean, it ensures that I can't change the value of i, but that's kind of useless here, isn't it?
I mean it ensures that I can't change the value of i
Yes, that is what it does.
but I mean it's kinda useless isn't it?
In this example it's not useful, but imagine that your function was 300 lines long instead of 1 line long, and was being maintained over several years by multiple different programmers of varying skill levels.
When looking at code in the middle of a big function like that, it's often very useful to know what the value of i will be on a given line. If i has been marked as const, then it's easy to know that the value of i is guaranteed to be equal to the value that was passed in to the function, because the compiler (more-or-less) guarantees that to you; if any of the code earlier in the function had tried to assign a different value to i, the function would not have compiled. Without the const tag, on the other hand, you'll have to manually read through all the code earlier in the function to verify "by eye" that none of that code assigned a different value to i, or if it did, under what circumstances that might occur and what new value it might assign. That's a lot of extra programmer-time, and assignments like that might be very easy to miss.
Hence, the const tag can be a real time-saver for programmers, in some cases.
A second benefit is that with the const tag you can call the function with a temporary-value as an argument, like this:
square(9)
... whereas without the const the above would be a compile-time error. (In this case you could also get around that error by changing the argument type to a plain int or const int instead of an int &, but in general you often want to pass by reference to avoid unnecessary copying of objects during function calls.)
This isn't useless, because later on you will study the copy constructor, where you will learn about deep copies and shallow copies. The concept of const is very useful and helpful there. If you use it in your early days of C++, trust me, you will be really happy in the future.
In this case the optimizer can see both functions so performance-wise it will have no effect.
However, if the optimizer cannot see the definition of square while compiling main, and you had used side again afterwards, then there would be a difference.
In the const case all reads from side would have been constant folded to 5.
In the non const case, after calling square, all reads from side would have to read the variable because square might have changed it.
I mean it ensures that I can't change the value of i
Technically, it does not ensure that. It is conventional that a function accepting a reference to const should not modify the referred object, and it is harder to accidentally modify the referred value, but there is no guarantee and the function can change the value. I would recommend to conform to that convention whenever possible (and it nearly always is possible).
It is kinda useless to use a reference parameter in this case.
As you guessed, there is no point in making the i variable const.
There is, as far as I know, one reason for that:
i is of type int, which is smaller than the pointer/reference size, so there is no need to pass it by reference; a pass by value is actually more efficient.

Compiler deduction of rvalue-references for variables going out of scope

Why won't the compiler automatically deduce that a variable is about to go out of scope, and therefore let it be considered an rvalue-reference?
Take for example this code:
#include <string>

int foo(std::string && bob);
int foo(const std::string & bob);

int main()
{
    std::string bob(" ");
    return foo(bob);
}
Inspecting the assembly code clearly shows that the const & version of "foo" is called at the end of the function.
Compiler Explorer link here: https://godbolt.org/g/mVi9y6
Edit: To clarify, I'm not looking for suggestions for alternative ways to move the variable. Nor am I trying to understand why the compiler chooses the const& version of foo. Those are things that I understand fine.
I'm interested in knowing of a counter example where the compiler converting the last usage of a variable before it goes out of scope into an rvalue-reference would introduce a serious bug into the resulting code. I'm unable to think of code that breaks if a compiler implements this "optimization".
If there's no code that breaks when the compiler automatically makes the last usage of a variable about to go out of scope an rvalue-reference, then why wouldn't compilers implement that as an optimization?
My assumption is that there is some code that would break where compilers to implement that "optimization", and I'd like to know what that code looks like.
The code that I detail above is an example of code that I believe would benefit from an optimization like this.
The order of evaluation of function arguments and operands, as in operator+(foo(bob), foo(bob)), is unspecified. As such, code such as
return foo(bob) + foo(std::move(bob));
is dangerous, because the compiler that you're using may evaluate the right hand side of the + operator first. That would result in the string bob potentially being moved from, and leaving it in a valid, but indeterminate state. Subsequently, foo(bob) would be called with the resulting, modified string.
On another implementation, the non-move version might be evaluated first, and the code would behave the way a non-expert would expect.
If we make the assumption that some future version of the c++ standard implements an optimization that allows for the compiler to treat the last usage of a variable as an rvalue reference, then
return foo(bob) + foo(bob);
would work with no surprises (assuming appropriate implementations of foo, anyway).
Such a compiler, no matter what order of evaluation it uses for function arguments, would always evaluate the second (and thus last) usage of bob in this context as an rvalue-reference, whether that was the left hand side, or right hand side of the operator+.
Here's a piece of perfectly valid existing code that would be broken by your change:
// launch a thread that does the calculation, moving v to the thread, and
// returns a future for the result
std::future<Foo> run_some_async_calculation_on_vector(std::pmr::vector<int> v);

std::future<Foo> run_some_async_calculation() {
    char buffer[2000];
    std::pmr::monotonic_buffer_resource rsrc(buffer, 2000);
    std::pmr::vector<int> vec(&rsrc);
    // fill vec
    return run_some_async_calculation_on_vector(vec);
}
Move constructing a container always propagates its allocator, but copy constructing one doesn't have to, and polymorphic_allocator is an allocator that doesn't propagate on container copy construction. Instead, it always reverts to the default memory resource.
This code is safe with copying because run_some_async_calculation_on_vector receives a copy allocated from the default memory resource (which hopefully persists throughout the thread's lifetime), but is completely broken by a move, because then it would have kept rsrc as the memory resource, which will disappear once run_some_async_calculation returns.
The answer to your question is that the standard says it's not allowed. The compiler can only do that optimization under the as-if rule, and std::string's copy constructor does real, observable work, so the compiler isn't going to do the verification it would need to.
To build on this point a bit: all that it takes to write code that "breaks" under this optimization is to have the two different versions of foo print different things. That's it. The compiler produces a program that prints something different than the standard says that it should. That's a compiler bug. Note that RVO does not fall under this category because it is specifically addressed by the standard.
It might make more sense to ask why the standard doesn't say so, e.g. why not extend the rule governing returning a local at the end of a function, which is implicitly treated as an rvalue. The answer is most likely because it rapidly becomes complicated to define correct behavior. What do you do if the last line were return foo(bob) + foo(bob)? And so on.
Because the fact that it will go out of scope does not make it not-an-lvalue while it is in scope. So the - very reasonable - assumption is that the programmer wants the second version of foo() for it. And the standard mandates this behavior AFAIK.
So just write:
int main()
{
    std::string bob(" ");
    return foo(std::move(bob));
}
... however, it's possible that the compiler will be able to optimize the code further if it can inline foo(), to get about the same effect as your rvalue-reference version. Maybe.
Why won't the compiler automatically deduce that a variable is about to go out of scope, and can therefore be considered an rvalue-reference?
At the time the function gets called, the variable is still in scope. If the compiler changes the logic of which function is a better fit based on what happens after the function call, it will be violating the standard.

Is there any difference in performance to declare a large variable inside a function as `static`?

Not sure if this has already been asked before. While answering this very simple question, I asked myself the following instead. Consider this:
void foo()
{
int i{};
const ReallyAnyType[] data = { item1, item2, item3,
/* many items that may be potentially heavy to recreate, e.g. of class type */ };
/* function code here... */
}
Now in theory, local variables are recreated every time control reaches function, right? I.e. look at int i above - it's going to be recreated on the stack for sure. What about the array above? Can a compiler be as smart as to optimize its creation to occur only once, or do I need a static modifier here anyway? What about if the array is not const? (OK, if it's not const, there probably i snot sense in creating it only once, since re-initialization to the default state may be required between calls due to modifications being made during function execution.)
Might sound like a basic question, but for some reason I still ponder. Also, ignore the "why would you want to do this" - this is just a language question, not applied to a certain programming problem or design. I mean both C and C++ here. Should there be differences between the two regarding this question, please outline those.
There a two questions here, I think:
Can a compiler optimize a non-static const object to be effectively static so that it is only created once; and
Is it a reasonable expectation that a given compiler will do so.
I think the answer to the second question is "No", because I don't see the point of doing a huge amount of control flow analysis to save the programmer the trouble of typing the word static. However, I've often been surprised what optimizations people spend their time writing (as opposed to the optimizations which I think they should be working on :-) ). All the same, I would strongly recommend using the word static if that's what you wanted.
For the first question, there are circumstances under which the compiler could perform the optimization based on the "as-if" rule, but in very few cases would it work out.
First of all, if any object or subobject in the initializer has a non-trivial constructor/destructor, then the construction/destruction is visible, and this is not an example of copy elision. (This paragraph is C++ only, of course.)
The same would be true if any computation in the initializer list has visible side-effects.
And it should go without saying that if any subobject's value is not constant, the computation of that subobject would need to be done on each construction.
If the object and all subobjects are trivially copyable, all the initializer-list computations are constant, and the only construction cost is that of copying from a template into the object, then the compiler still couldn't perform the optimization if there is any chance that the addresses of more than one live instance of the object might be simultaneously visible. For example, if the function were recursive, and the object's address was used somewhere (hard to avoid for an array), then there would be the possibility that the addresses of two of these objects from different recursive invocations of the function might be compared. And they would have to compare unequal, since they are in fact separate objects. (And, now that I think of it, the function would not even need to be recursive in a multi-threaded environment.)
So the burden of proof for a compiler wishing to optimize that object into a single static instance is quite high. As I said, it may well be that a given compiler actually attempts to perform that task, but I definitely wouldn't expect it to.
The compiler would almost certainly do whatever is deemed most optimal, but most likely it will have it in read-only memory and turn your local variable into a pointer that points to the array in read-only memory. This assumes your array is equivalent to a POD type (or a class composed of POD types; if your class does something non-trivial and/or modifies other things, there is no way the compiler can fairly do this optimization).

is it good practice to add const at end of member functions - where appropriate?

Is it a good practice, in C++, to add const at the end of a member function definition every time the function does not modify the object, i.e., every time the function is 'eligible' for const?
I know that it's necessary in this case:
class MyClass {
public:
    int getData() const;
};

void function(const MyClass &m) { int a = m.getData(); /* do something... */ }
but other than this, and other uses of const for actual functionality, does adding const to the end actually change the way code is executed (faster/slower) or is it just a 'flag' for the compiler to handle cases such as the one above?
In other words, if const (at the end) is not needed for functionality in a class, does adding it make any difference?
Please see this excellent article about const correctness by Herb Sutter (C++ standards committee secretary for 10 years.)
Regarding optimizations, he later wrote this article where he states that "const is mainly for humans, rather than for compilers and optimizers." Optimizations are impossible because "too many things could go wrong...[your function] might perform const_casts."
However, const correctness is a good idea for two reasons: It is a cheap (in terms of your time) assertion that can find bugs; and, it signals the intention that a function should theoretically not modify the object's external state, which makes code easier to understand.
every time the function does not modify the object, i.e., every time the function is 'eligible' for const?
In my opinion, Yes. It ensures that you call such functions on const objects or const expressions involving the object:
void f(const A & a)
{
    a.inspect(); // inspect must be a const member function.
}
Even if it modifies one or a few internal variables once or twice, I usually still make it a const member function, and declare those variables with the mutable keyword:
class A
{
    mutable bool initialized_cache; // mutable is a must!
public:
    void inspect() const // const member function
    {
        if ( !initialized_cache )
        {
            initialized_cache = true;
            // todo: initialize cache here
        }
        // other code
    }
};
Yes. In general, every function that is logically const should be made const. The only gray areas are where you modify a member through a pointer (where it can be made const but arguably should not be const) or where you modify a member that is used to cache a computation but otherwise has no effect (which arguably should be made const, but will require the use of the keyword mutable to do so).
The reason why it's incredibly important to use the word const is:
It is important documentation to other developers. Developers will assume that anything marked const does not mutate the object (which is why it might not be a good idea to use const when mutating state through a pointer object), and will assume that anything not marked const mutates.
It will cause the compiler to catch unintentional mutations (by issuing an error if a function marked const unintentionally calls a non-const function or mutates a member).
Yes, it is a good practice.
At the software engineering level it allows you to have read-only objects, e.g. you can prevent objects from being modified by making them const. And if an object is const, you are only allowed to call const functions on it.
Furthermore, I believe the compiler can make certain optimizations if it knows that an object will only be read (e.g., share common data between several instances of the object, as we know they are never modified).
The 'const' system is one of the really messy features of C++. It is simple in concept: variables declared with 'const' added become constants and cannot be altered by the program. But, in the way it has to be used to bodge in a substitute for one of the missing features of C++, it gets horridly complicated and frustratingly restrictive. The following attempts to explain how 'const' is used and why it exists.
Of the mixes of pointers and 'const', the constant pointer to a variable is useful for storage that can be changed in value but not moved in memory, and the pointer (constant or otherwise) to a constant is useful for returning constant strings and arrays from functions which, because they are implemented as pointers, the program could otherwise try to alter and crash. Instead of a difficult-to-track-down crash, the attempt to alter unalterable values will be detected during compilation.
For example, if a function which returns a fixed 'Some text' string is written like
char *Function1()
{ return "Some text"; }
then the program could crash if it accidentally tried to alter the value doing
Function1()[1] = 'a';
whereas the compiler would have spotted the error if the original function had been written
const char *Function1()
{ return "Some text";}
because the compiler would then know that the value was unalterable. (Of course, the compiler could theoretically have worked that out anyway but C is not that clever.)
When a subroutine or function is called with parameters, variables passed as the parameters might be read from to transfer data into the subroutine/function, written to to transfer data back to the calling program, or both. Some languages enable one to specify this directly, such as having 'in:', 'out:' & 'inout:' parameter types, whereas in C one has to work at a lower level and specify the method for passing the variables, choosing one that also allows the desired data transfer direction.
For example, a subroutine like
void Subroutine1(int Parameter1)
{ printf("%d",Parameter1);}
accepts the parameter passed to it in the default C & C++ way, which is as a copy. Therefore the subroutine can read the value of the variable passed to it but not alter it, because any alterations it makes are only made to the copy and are lost when the subroutine ends, so
void Subroutine2(int Parameter1)
{ Parameter1=96;}
would leave the variable it was called with unchanged not set to 96.
The addition of an '&' to the parameter name in C++ (which was a very confusing choice of symbol, because an '&' in front of a variable elsewhere in C generates a pointer to it!) causes the actual variable itself, rather than a copy, to be used as the parameter in the subroutine, and therefore it can be written to, thereby passing data back out of the subroutine. Therefore
void Subroutine3(int &Parameter1)
{ Parameter1=96;}
would set the variable it was called with to 96. This method of passing a variable as itself rather than a copy is called a 'reference' in C++.
That way of passing variables was a C++ addition to C. To pass an alterable variable in original C, a rather involved method using a pointer to the variable as the parameter, and then altering what it pointed to, was used. For example
void Subroutine4(int *Parameter1)
{ *Parameter1=96;}
works, but requires every use of the variable in the called routine to be altered accordingly, and the calling routine to be altered to pass a pointer to the variable, which is rather cumbersome.
But where does 'const' come into this? Well, there is a second common use for passing data by reference or pointer instead of by copy: when copying the variable would waste too much memory or take too long. This is particularly likely with large compound user-defined variable types ('structures' in C & 'classes' in C++). So a subroutine declared
void Subroutine4(big_structure_type &Parameter1);
might be using '&' because it is going to alter the variable passed to it, or it might just be to save copying time, and there is no way to tell which if the function is compiled in someone else's library. This could be a risk if one needs to trust the subroutine not to alter the variable.
To solve this, 'const' can be used in the parameter list, like
void Subroutine4(big_structure_type const &Parameter1);
which will cause the variable to be passed without copying but stop it from then being altered. This is messy because it is essentially making an in-only variable-passing method from a both-ways variable-passing method, which was itself made from an in-only variable-passing method, just to trick the compiler into doing some optimization.
Ideally, the programmer should not need to control this detail of specifying exactly how variables are passed, just say which direction the information goes and leave the compiler to optimize it automatically, but C was designed for raw low-level programming on far less powerful computers than are standard these days, so the programmer has to do it explicitly.
My understanding is that it is indeed just a flag. However, that said, you want to add it wherever you can. If you fail to add it, and a function elsewhere in your code does something like
void function(const MyClass& foo)
{
    foo.getData();
}
You will run into issues, for the compiler cannot guarantee that getData does not modify foo.
Making member functions const ensures that calling code that has const objects can still call the function. It is about this compiler check - which helps create robust self-documenting code and avoid accidental modifications of objects - and not about run-time performance. So yes, you should always add const if the nature of the function is such that it doesn't need to modify the observable value of the object (it can still modify member variables explicitly prefixed with the mutable keyword, which is intended for some quirky uses like internal caches and counters that don't affect the client-visible behaviour of the object).

Does the compiler optimize the function parameters passed by value?

Let's say I have a function where the parameter is passed by value instead of by const reference. Further, let's assume that only the value is used inside the function, i.e. the function doesn't try to modify it. In that case, will the compiler be able to figure out that it can pass the value by const reference (for performance reasons) and generate the code accordingly? Is there any compiler which does that?
If you pass a variable instead of a temporary, the compiler is not allowed to optimize away the copy if the copy constructor does anything you would notice when running the program ("observable behavior": inputs/outputs, or changing volatile variables).
Apart from that, the compiler is free to do whatever it wants (it only needs to preserve the observable behavior, as if it hadn't optimized at all).
Only when the argument is an rvalue (most commonly a temporary) is the compiler allowed to optimize away the copy into the by-value parameter, even if the copy constructor has observable side effects.
Only if the function is not exported is there a chance for the compiler to convert call-by-reference to call-by-value (or vice versa).
Otherwise, due to the calling convention, the function must keep the call-by-value/call-by-reference semantics.
I'm not aware of any general guarantees that this will be done, but if the called function is inlined, then this would allow the compiler to see that an unnecessary copy is being made, and if the optimization level is high enough, the copy operation would be eliminated. GCC can do this, at least.
You might want to think about whether the class of this parameter has a copy constructor or not. If it doesn't, then the performance difference between pass-by-value and pass-by-const-reference is probably negligible.
On the other hand, if the class does have a copy constructor that does stuff, then the optimization you are hoping for probably will not happen, because the compiler cannot remove the call to the constructor; it cannot know that the side effects of the constructor are not important to you.
You might be able to get more useful answers if you say what the class of the parameter is, or if it is a custom class, describe what fields it has and whether it has a copy constructor.
With all optimisations the answer is generally "maybe". The only way to check is to examine the output assembly and see what it's really doing. If the standard allows it, whether or not it really happens is down to the whims of the compiler. You should not rely on it happening because an arbitrary change elsewhere in your codebase may change the heuristics used by the optimizer which might cause it to stop performing a certain optimization.
Play it safe: code it how you intend - pass by reference if that's what you want. However, if you're writing templated code which could work on types of any size, the choice is not so clear. Personally I'd side with passing by const reference - the compiler could also perform a different optimisation, where a small type which can fit inside the size of a reference is passed by value, rather than by const reference. But again, it might happen, it might not.
This post is an excellent reference to this kind of optimization:
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/