If there is C or C++ code like this:
if (func())
;
can the compiler optimise out the call to func() if it cannot be sure whether the function has any side effects?
Origin of my question: I sometimes use the assert macro like this:
if (func())
assert(0);
when I want to make sure that func() is always called and that the assertion fails in debug mode if func() returns the wrong value. But recently I was warned that my code doesn't guarantee that the function is always called.
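For reference, an equivalent formulation that keeps the call unmistakably unconditional in every build mode would be (a sketch; the declaration of func() is a hypothetical stand-in for the real function):
#include <cassert>
bool func(); // hypothetical stand-in for the real function
void check() {
    bool ok = func(); // func() is evaluated unconditionally, in all build modes
    assert(ok);       // only the check disappears when NDEBUG is defined
}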
If the compiler cannot prove that optimizing away the call to func does not change the observable behavior of your program, it is not allowed to make the optimization.
So unless the compiler can prove that not calling the function has no observable effect, the call will take place. Note that compilers can be smart sometimes, so if you want to be sure, make sure the function actually does have a side effect. (On the other hand, if it doesn't, you need not care.)
This is known as the as-if rule.
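For illustration, here is a minimal sketch (the function names are made up) of what the as-if rule allows and forbids:
#include <cstdio>
int silent() { return 42; }                           // no observable effect: a call may be elided
int noisy() { std::puts("called"); return 1; }        // I/O is observable: the call must stay
int main() {
    if (silent()) ; // the compiler may drop this call entirely
    if (noisy()) ;  // "called" must still be printed
}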
(This is a C++ answer. Please post a question for one programming language only, not two.)
No, a function that may have side effects cannot be optimised out, because then you may be "optimising out" side effects. And since by "side effects" we really mean "the things that your program does", a compiler permitted to do such a thing would not be particularly useful. That's why the standard's "as-if" rule prevents the sort of optimisation you're talking about.
Related
I have a class in which I overloaded the == operator with memcmp() on a specific member. Due to a bad copy elsewhere in the code (memcpy called with a bigger size than it should have been), I got a segfault when invoking the == operator.
I understand that UB is mysterious and obviously undefined, but still there is something I noticed that intrigues me.
While debugging, I swapped the == call with its implementation (i.e. a == b was replaced with memcmp(a.member_x, b.member_x, SIZE)) and got no segfault!
So, is there a difference between using the operator itself and replacing it with the implementation or is this just the UB?
To clarify: yes, this code includes UB. It's bad and its results are undefined. What I want to know is: does something different happen when calling an operator versus calling its body directly? The UB just made me think that a difference might exist (the bug itself, obviously, was fixed).
Undefined Behavior means that "anything can happen". "Anything" includes "working just as intended". It can mean that you can get different behavior without changing anything, and it can mean that you get the same behavior even though you changed something.
In the past, warnings about relying on undefined behavior have often included the proverbial "launching of the nuclear missiles".
However, with modern aggressively optimizing compilers, the behavior can be much more subtle. In the past, undefined behavior would usually lead to "whatever happens happens". E.g. in your example, you would either read "junk" in memory if you are allowed to access it, or segfault if you aren't. But the operation (i.e. "compare these two chunks of memory") would still happen somehow.
This is no longer "guaranteed" (not that there ever were any guarantees when it comes to UB) with modern aggressively optimizing compilers. The compiler no longer simply carries out the operation and lets the nonsense happen; it may reason about your code in ways that assume the UB never occurs.
With modern optimizing compilers, the compiler must often decide (or prove) that a certain optimization is safe, i.e. that it doesn't alter observable specified behavior. And since UB means "anything can happen" it means that the part of the optimizer that proves that certain optimizations are safe can "assume anything it wants". In essence, it can assume that all optimizations are safe, and then proceed however it wants to provide the most aggressive optimization possible.
As a result, UB is much less predictable and much less obvious than it once was. For example, UB in one place of the program can lead to the optimizer optimizing something in a way that it changes the behavior of something else in a different part of the program that is connected to this piece of code somehow (e.g. it calls it, or both manipulate the same state).
Let's say we have two threads manipulating shared mutable state. One of the two threads exhibits UB. Then, the optimizer can decide that that thread doesn't manipulate the state ("anything can happen", remember?) and since it can now prove that the state will only ever be accessed by one thread, it can optimize away all locks! [Note: I have no idea whether any compiler in reality does this, but it would be allowed to!]
Here's another example to demonstrate that "anything can happen" really, truly does mean "anything": let's assume there are two possible optimizations that could be applied in some code higher up the stack which calls your operator==. One optimization is only valid if the compiler can prove that operator== will always be true. The other optimization is only valid if the compiler can prove that it will always be false. This means, of course, that neither optimization can be applied since in general, your operator== could return either true or false.
But! We have UB. So, the compiler can decide to just assume that it will always be true and apply optimization #1. Or it could decide that it will always be false and apply optimization #2. Okay, fair enough. However, it can also decide to apply both optimizations! Remember: "anything can happen". Not just "anything that makes sense according to the logical framework of the C++ spec" but "anything" period. If the compiler needs something to be true and false at the same time, it is free to assume so in the presence of UB.
You can think of a modern optimizing compiler as trying to prove theorems about your code, and then applying optimizations based on those proofs. And UB allows it to prove any and all theorems.
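A classic concrete illustration of this (my example, not from the answer above): signed integer overflow is UB, so the compiler is free to prove the "theorem" that x + 1 > x always holds:
#include <iostream>
#include <climits>
bool always_greater(int x) {
    return x + 1 > x; // UB when x == INT_MAX, so the optimizer may assume it never is
}
int main() {
    // With optimizations enabled, many compilers fold the comparison to true
    // and print "true" even for INT_MAX, where wrapping arithmetic would give false.
    std::cout << std::boolalpha << always_greater(INT_MAX) << '\n';
}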
This is not currently a problem, but I am concerned if the code gets ported or we change compilers.
I have code with a block
{
MyClass myObj;
// copy some other variables but never touch myObj
// ...
} // expect destructor to be called on myObj
where myObj is never used in the block code, but the constructor has a side effect, and I rely on the destructor code of MyClass being executed at the close of the block. This works as expected on my current ARM compiler with some optimization turned on.
My question is: is there anything I need to do, like declaring something volatile or setting some common attribute, to prevent an optimizer from treating myObj as an unused variable or some such?
This is not a C++11 compiler. As I said, this is not currently a problem, but I did not want to leave an odd future bug for someone else.
Apart from explicitly defined cases like RVO (return value optimization), optimization is not allowed to change the observable behaviour of the program. Optimizations must follow the so-called "as-if" rule.
Insofar as the compiler you're using is even marginally compliant with the standard (I'm looking at you, Turbo C++), this is a non-issue, because the standard makes strong guarantees about construction and destruction. Those guarantees are the foundation of RAII, which is the basis of the "modern" C++ style.
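To make the guarantee concrete, here is a minimal sketch of the pattern from the question (ScopeLogger is a hypothetical name): the constructor and destructor perform I/O, which is observable behaviour, so a conforming compiler may not elide them even though the object is otherwise unused:
#include <cstdio>
struct ScopeLogger {
    ScopeLogger()  { std::puts("enter block"); } // observable side effect
    ~ScopeLogger() { std::puts("leave block"); } // observable side effect
};
void doWork() {
    ScopeLogger guard; // never referenced again
    // ... copy some other variables ...
} // "leave block" is printed here, exactly as the questioner relies on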
I have very little (read no) compiler expertise, and was wondering if the following code snippet would automatically be optimized by a relatively recent (VS2008+/GCC 4.3+) compiler:
Object *objectPtr = getPtrSomehow();
if (objectPtr->getValue() == something1) // call 1
    doSomething1();
else if (objectPtr->getValue() == something2) // call N (there are a few more)
    doSomething2();
return;
where getValue() simply returns a member variable that is an enum value. (The call has no observable effect.)
My coding style would be to make one call before the "switch" and save the value to compare it against each of the somethingX's, but I was wondering if this was a moot point with today's compilers.
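For concreteness, that hoisted style would look like this (a sketch reusing the hypothetical names above):
const int value = objectPtr->getValue(); // one call, result cached
if (value == something1)
    doSomething1();
else if (value == something2)
    doSomething2();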
I was also unsure of what to google to find the answer to this myself.
Thank you,
AK
It's not moot, especially if the method is non-const.
If getValue is not declared const, the call can't be optimized away, as subsequent calls could return different values.
If it is declared const, it's easier, but still not trivial for the compiler to optimize the call away. It would need access to the implementation, to make sure the call doesn't have side effects. There's also the chance that it returns a different value even if marked const (e.g. if it modifies and returns a global, which const does not forbid).
Unless the compiler can examine the definition of getValue() while it compiles that piece of code, it can't elide the second call because it doesn't know whether that call has observable effects and whether it returns the same value the second time around.
Even if it sees the definition, it probably (this is my wild guess from having a few peeks at some compilers' internals) won't go out of its way to check that. The only chance you stand is the implementation being trivial and inlined twice, and then caught by common subexpression elimination. EDIT: Since the definition is in the header, and quite small, it's likely that this (inlining and subsequent CSE) will occur. Still, if you want to be sure, check the output of g++ -O2 -S or your compiler's equivalent.
So in summary, you shouldn't expect the optimization to occur. Then again, getValue is probably quite cheap, so it's unlikely to be worth the manual optimizations. What's an extra line compared to a couple of machine cycles? Not much, in most cases. If you're writing code where it is much, you shouldn't be asking but just checking it (disassembly/profiling).
As other answers have noted, the compiler generally cannot eliminate the second call since there may be side effects.
However, some compilers provide a way of telling them that the function has no side effects, so that this optimization is allowed. In GCC, a function may be declared pure. For example:
int square(int) __attribute__((pure));
says that the function has “no effects except to return a value, and [the] return value depends only on the parameters and/or global variables.”
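Applied to the getValue() example from the question, a sketch might look like this (assuming GCC or Clang; Object and getValue are the question's hypothetical names, and the attribute goes on the declaration):
struct Object {
    int value;
    int getValue() const __attribute__((pure)); // promises: no side effects, result
                                                // depends only on the object's state
};
With that promise in place, the compiler is permitted to fold the repeated calls in the if/else chain into a single call.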
You wrote:
My coding style would be to make one call before the "switch" and save the value to compare
it against each of the somethingX's, but I was wondering if this was a moot point
with today's compilers.
Yes, it's a moot point. What the compiler does is its business. Your hands will be full trying to write maintainable code without trying to micromanage a piece of software that is far better at its job than any of us will ever hope to be.
Focus on writing maintainable code and trust the compiler to carry out its task. If you later find your code is too slow, then you can worry about optimizing.
Remember the proverb:
Premature optimization is the root of all evil.
Suppose I have the following:
int main() {
SomeClass();
return 0;
}
Without optimization, the SomeClass() constructor will be called, and then its destructor will be called, and the object will be no more.
However, according to an IRC channel, that constructor/destructor call may be optimized away if the compiler thinks there's no side effect to the SomeClass constructors/destructors.
I suppose the obvious way to go about this is not to use some constructor/destructor function (e.g use a function, or a static method or so), but is there a way to ensure the calling of the constructors/destructors?
However, according to an IRC channel, that constructor/destructor call may be optimized away if the compiler thinks there's no side effect to the SomeClass constructors/destructors.
The bolded part ("thinks") is wrong. That should be: knows there is no observable behaviour.
E.g. from § 1.9 of the latest standard (there are more relevant quotes):
A conforming implementation executing a well-formed program shall produce the same observable behavior
as one of the possible executions of the corresponding instance of the abstract machine with the same program
and the same input. However, if any such execution contains an undefined operation, this International
Standard places no requirement on the implementation executing that program with that input (not even
with regard to operations preceding the first undefined operation).
As a matter of fact, this whole mechanism underpins the single most ubiquitous C++ language idiom: Resource Acquisition Is Initialization.
Backgrounder
Having the compiler optimize away constructors in the trivial case is extremely helpful. It is what allows iterators to compile down to exactly the same performance code as using raw pointers/indexers.
It is also what allows a function object to compile down to the exact same code as inlining the function body.
It is what makes C++11 lambdas perfectly optimal for simple use cases:
factorial = std::accumulate(begin, end, 1, [] (int a, int b) { return a * b; });
The lambda compiles down to a functor object similar to
struct lambda_1
{
int operator()(int a, int b) const
{ return a*b; }
};
The compiler sees that the constructor/destructor can be elided and the function body gets inlined. The end result is optimal.¹
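Here is a self-contained version of the factorial example (my sketch; the container and names are made up) that you can compile and inspect:
#include <numeric>
#include <vector>
int factorial_up_to(int n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 1); // fill with 1, 2, ..., n
    return std::accumulate(v.begin(), v.end(), 1,
                           [](int a, int b) { return a * b; });
}
With optimizations on, the lambda's temporary functor object leaves no trace in the generated code.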
More (un)observable behaviour
The standard contains a very entertaining example to the contrary, to spark your imagination.
§ 20.7.2.2.3
[ Note: The use count updates caused by the temporary object construction and destruction are not
observable side effects, so the implementation may meet the effects (and the implied guarantees) via
different means, without creating a temporary. In particular, in the example:
shared_ptr<int> p(new int);
shared_ptr<void> q(p);
p = p;
q = p;
both assignments may be no-ops. —end note ]
IOW: Don't underestimate the power of optimizing compilers. This in no way means that language guarantees are to be thrown out of the window!
¹ Though there could be faster algorithms to get a factorial, depending on the problem domain :)
I'm sure that if SomeClass::SomeClass() is not implemented inline, the compiler has no way of knowing that the constructor/destructor has no side effects, and it will always call the constructor/destructor.
If the compiler is optimizing away a visible effect of the constructor/destructor call, it is buggy. If it has no visible effect, then you shouldn't notice it anyway.
However, let's assume that somehow your constructor or destructor does have a visible effect (so construction and subsequent destruction of that object isn't effectively a no-op), in such a way that the compiler could legitimately think it doesn't (not that I can think of such a situation, but then, it might just be a lack of imagination on my side). Then any of the following strategies should work:
Make sure that the compiler cannot see the definition of the constructor and/or destructor. If the compiler doesn't know what the constructor/destructor does, it cannot assume it does not have an effect. Note, however, that this also disables inlining. If your compiler does not do cross-module optimization, just putting the constructor/destructor into a different file should suffice.
Make sure that your constructor/destructor actually does have observable behaviour, e.g. through use of volatile variables (every read or write of a volatile variable is considered observable behaviour in C++).
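For instance, a minimal sketch of this second strategy (the names are made up): the volatile writes count as observable behaviour, so the constructor/destructor pair cannot legally be elided:
volatile int g_sideEffectSink = 0;
struct SomeClass {
    SomeClass()  { g_sideEffectSink = 1; } // volatile write: observable behaviour
    ~SomeClass() { g_sideEffectSink = 2; } // volatile write: observable behaviour
};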
However, let me stress again that it's very unlikely that you have to do anything, unless your compiler is horribly buggy (in which case I'd strongly advise you to change the compiler :-)).
Let's say I have a function where the parameter is passed by value instead of by const reference. Further, let's assume that only the value is used inside the function, i.e. the function doesn't try to modify it. In that case, will the compiler be able to figure out that it can pass the value by const reference (for performance reasons) and generate the code accordingly? Is there any compiler which does that?
If you pass a variable instead of a temporary, the compiler is not allowed to optimize away the copy if its copy constructor does anything you would notice when running the program ("observable behavior": inputs/outputs, or changing volatile variables).
Apart from that, the compiler is free to do everything it wants (it only needs to preserve the observable behavior, as if it hadn't optimized at all).
Only when the argument is an rvalue (most temporary), the compiler is allowed to optimize the copy to the by-value parameter even if the copy constructor has observable side effects.
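A sketch illustrating the distinction (Tracer and take are made-up names):
#include <cstdio>
struct Tracer {
    Tracer() = default;
    Tracer(const Tracer&) { std::puts("copy"); } // observable side effect
};
void take(Tracer t) {} // by-value parameter
int main() {
    Tracer named;
    take(named);    // lvalue argument: "copy" must be printed
    take(Tracer{}); // temporary: the copy may be elided despite the side effect
                    // (and in C++17 it is guaranteed to be)
}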
Only if the function is not exported is there a chance for the compiler to convert call-by-reference to call-by-value (or vice versa).
Otherwise, due to the calling convention, the function must keep the call-by-value/reference semantics.
I'm not aware of any general guarantees that this will be done, but if the called function is inlined, the compiler can see that an unnecessary copy is being made, and if the optimization level is high enough, the copy operation will be eliminated. GCC can do this, at least.
You might want to think about whether the class of this parameter value has a copy constructor or not. If it doesn't, then the performance difference between pass-by-value and pass-by-const-ref is probably negligible.
On the other hand, if class does have a copy constructor that does stuff, then the optimization you are hoping for probably will not happen because the compiler cannot remove the call to the constructor--it cannot know that the side effects of the constructor are not important to you.
You might be able to get more useful answers if you say what the class of the parameter is, or if it is a custom class, describe what fields it has and whether it has a copy constructor.
With all optimisations the answer is generally "maybe". The only way to check is to examine the output assembly and see what it's really doing. If the standard allows it, whether or not it really happens is down to the whims of the compiler. You should not rely on it happening because an arbitrary change elsewhere in your codebase may change the heuristics used by the optimizer which might cause it to stop performing a certain optimization.
Play it safe: code it how you intend - pass by reference if that's what you want. However, if you're writing templated code which could work on types of any size, the choice is not so clear. Personally I'd side with passing by const reference - the compiler could also perform a different optimisation, where a small type which can fit inside the size of a reference is passed by value, rather than by const reference. But again, it might happen, it might not.
This post is an excellent reference to this kind of optimization:
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/