Inlining infinite recursion - C++

Is this code defined behavior?
inline int a() { return 0 + a(); }
int main() { a(); }
If optimizations are enabled then Clang optimizes it out but GCC doesn't. So the code is not portable in practice. Does the C++ spec say anything about this?

As I discuss in this answer, regardless of the presence of the inline keyword, the behaviour of your code is certainly undefined since you call this function:
[C++11: 1.10/24]: The implementation may assume that any thread will eventually do one of the following:
terminate,
make a call to a library I/O function,
access or modify a volatile object, or
perform a synchronization operation or an atomic operation.
Clang is permitted to elide the entire thing, just as GCC is permitted to run it without inlining and reach a stack overflow. A compiler would also be free to attempt actual inlining, and it is even permitted to crash during compilation in such a case.
Crucially, there is no rule in the standard that makes the semantics for an infinite recursion differ just because a function is marked inline or even actually inlined ([C++11: 7.1.2]).
Of course, I reckon that if you were never to invoke this function, a compiler could elide it entirely under the as-if rule, and then you'd have no problem.
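For contrast, here is a minimal sketch (my own variant, not from the question) that does satisfy the forward-progress rule quoted above, because every recursive step performs a library I/O call. The compiler may no longer assume the thread terminates and so cannot legally elide the recursion, though the program will still eventually exhaust the stack in practice:
#include <cstdio>

// Variant of the question's function: each step now performs observable
// I/O, satisfying [C++11: 1.10/24], so the implementation cannot assume
// the recursion terminates and may not optimize it away. A stack overflow
// remains the likely practical outcome.
inline int a() {
    std::puts("still recursing");
    return 0 + a();
}

int main() { a(); }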

Related

Can compiler optimize-out call to function with possible side effects?

If there is C or C++ code like this:
if (func())
;
can the compiler optimise out the call to func() if it cannot be sure whether the function has any side effects?
Origin of my question: I sometimes call assert macros in a way like this:
if (func())
assert(0);
if I want to make sure that func() is always called and that the assertion fails in debug mode if func() returns the wrong value. But recently I was warned that my code doesn't guarantee that the function is always called.
If the compiler cannot prove that optimizing away the call to func does not change the observable behavior of your program, it is not allowed to make the optimization.
So unless the compiler can prove that not calling the function has no observable effect, the call will take place. Note that compilers can be smart sometimes, so if you want to be sure, make sure the function actually does have a side effect. (On the other hand, if it doesn't, you need not care.)
This is known as the as-if rule.
(This is a C++ answer. Please post a question for one programming language only, not two.)
No, a function that may have side effects cannot be optimised out, because then you may be "optimising out" side effects. And since by "side effects" we really mean "the things that your program does", a compiler permitted to do such a thing would not be particularly useful. That's why the standard's "as-if" rule prevents the sort of optimisation you're talking about.
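To make that guarantee concrete, here is a minimal sketch (my own example; the volatile object is an assumption, not part of the question) of a func() whose call can never be elided, because it writes to a volatile object and such writes are observable behaviour:
#include <cassert>

volatile int g_calls = 0;  // writes to a volatile object are observable

bool func() {
    g_calls = g_calls + 1;  // observable side effect: the call must happen
    return false;           // whatever condition the caller tests
}

int main() {
    if (func())   // func() is guaranteed to run in debug and release builds
        assert(0);
}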

Is a C++ optimizer allowed to move statements across a function call?

Note: No multithreading at all here. Just optimized single-threaded code.
A function call introduces a sequence point. (Apparently.)
Does it follow that a compiler (if the optimizer inlines the function) is not allowed to move/intermingle any instructions prior/after with the function's instructions? (As long as it can "prove" there are no observable effects, obviously.)
Explanatory background:
Now, there is a nice article about a benchmarking class for C++, where the author stated:
The code we time won’t be rearranged by the optimizer and will always
lie between those start / end calls to now(), so we can guarantee our
timing will be valid.
to which I asked how he could be sure, and nick replied:
You can check the comment in this answer
https://codereview.stackexchange.com/a/48884. I quote : “I would be
careful about timing things that are not functions because of
optimizations that the compiler is allowed to do. I am not sure about
the sequencing requirements and the observable behavior understanding
of such a program. With a function call the compiler is not allowed to
move statements across the call point (they are sequenced before or
after the call).”
What we do is basically abstract the callable (function, lambda, block
of code surrounded by lambda) and have a single call
callable(factor) inside the measure structure that acts as a
barrier (not the barrier in multithreading, I believe I convey the
message).
I am quite unsure about this, especially the quote:
With a function call the compiler is not allowed to
move statements across the call point (they are sequenced before or
after the call).
Now, I was always under the impression that when an optimizer inlines some function (which may very well be the case in a (simple) benchmark scenario), it is free to rearrange whatever it likes as long as it does not affect observable behavior.
That is, as far as the language / the optimizer are concerned, these two snippets are exactly the same:
void f() {
// do stuff / Multiple statements
}
auto start = ...;
f();
auto stop = ...;
vs.
auto start = ...;
// do stuff / Multiple statements
auto stop = ...;
Now, I was always under the impression that when an optimizer inlines
some function (which may very well be the case in a (simple) benchmark
scenario), it is free to rearrange whatever it likes as long as it
does not affect observable behavior.
It absolutely is. The optimizer doesn't even need to inline it for this to occur in theory.
However, timing functions are observable behaviour: specifically, they are I/O on the part of the system. The optimizer cannot know that that I/O will produce the same outcome (it obviously won't) if performed in a different order relative to other I/O calls, which can include non-obvious things like memory allocation calls that may invoke syscalls to get their memory.
What this basically means is that by and large, for most function calls, the optimizer can't do a great deal of re-arranging because there's potentially a vast quantity of state involved that it can't reason about.
Furthermore, the optimizer can't really know that re-arranging your function calls will actually make the code run faster, and doing so would make debugging harder, so compilers don't have a great deal of incentive to go screwing around with the program's stated order.
Basically, in theory the optimizer can do this, but in reality it won't because doing so would be a massive undertaking for not a lot of benefit.
You'll only encounter conditions like this if your benchmark is fairly trivial or consists virtually entirely of primitive operations like integer addition, in which case you'll want to check the assembly anyway.
Your concern is perfectly valid, the optimizer is allowed to move anything past a function call if it can prove that this does not change observable behavior (other than runtime, that is).
The point of using a function to stop the optimizer from doing things is precisely not to tell the optimizer about the function. That is, the function must not be inlined, and it must not be included in the same compilation unit. Since optimizers are generally a compiler feature, moving the function definition to a different compilation unit deprives the optimizer of the information necessary to prove anything about the function, and consequently stops it from moving anything across the function call.
Beware that this assumes that there is no linker doing global analysis for optimization. If there is one, it can still screw you.
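A sketch of that arrangement (the file layout and the timer_now name are my own assumptions): the timer lives in its own translation unit, so, absent link-time optimization, the optimizer compiling the benchmark cannot see through the call:
// timer.h -- declaration only; the benchmark TU never sees the body
#pragma once
#include <chrono>
std::chrono::steady_clock::time_point timer_now();

// timer.cpp -- compiled separately; to the benchmark TU this is an opaque call
#include "timer.h"
std::chrono::steady_clock::time_point timer_now() {
    return std::chrono::steady_clock::now();
}

// bench.cpp
#include "timer.h"
void work();  // code under test, also defined in another TU
int main() {
    auto start = timer_now();
    work();                    // cannot be hoisted above or sunk below the
    auto stop = timer_now();   // opaque timer_now() calls
    (void)(stop - start);      // ... report the elapsed time
}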
What the comment you quoted has not considered is that sequence points are not primarily about order of execution (although they do constrain it, they don't act as full barriers), but rather about values of expressions.
C++11 actually gets rid of the "sequence point" terminology completely, and instead discusses the ordering of "value computations" and "side effects".
To illustrate, the following code exhibits undefined behavior because it doesn't respect ordering:
int a = 5;
int x = a++ + a;
This version is well-defined:
int a = 5;
a++;
int x = a + a;
What the sequence point / ordering of side effects and value computations guarantees us is that the a used in x = a + a is 6, not 5. So the compiler cannot rewrite it to:
int a = 5;
int x = a + a;
a++;
However, it's perfectly legal to rewrite it as:
int a = 5;
int x = (a+1) + (a+1);
a++;
The order of execution between assigning x and assigning a isn't constrained, because neither of them is volatile or atomic<T> and they aren't externally visible side effects.
The standard definitely leaves room for the optimizer to sequence operations across the boundary of a function:
1.9/15 Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or
after the execution of the body of the called function is
indeterminately sequenced with respect to the execution of the called
function.
as long as the as-if rule is respected:
1.9/5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible
executions of the corresponding instance of the abstract machine with
the same program and the same input.
The practice of leaving the optimizer in the dark, as suggested by cmaster, is in general very effective. By the way, the global-optimization-at-link-time issue can also be circumvented by using dynamic linking of the benchmarked function.
There is, however, another hard sequencing constraint that can be used to achieve the same purpose, even within the same compilation unit:
1.9/15 When calling a function (whether or not the function is inline), every value computation and side effect associated with any
argument expression, or with the postfix expression designating the
called function, is sequenced before execution of every expression
or statement in the body of the called function.
So you may safely use an expression like:
my_timer_off(stop, f( my_timer_on(start) ) );
This "functional" writing ensures that:
my_timer_on() is evaluated before any statement of f() is executed,
f() is called before the body of my_timer_off() is executed
thus ensuring the sequence timer-on / f / timer-off (the my_timer_xx would take start/stop by reference, so the caller sees the recorded times).
Of course, this assumes that the signature of the benchmarked function f() can be changed to allow the expression above.
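Fleshed out, a compilable sketch of that pattern (the my_timer_xx names come from the answer; the rest of the details are my assumptions):
#include <chrono>

using clk = std::chrono::steady_clock;

// Evaluated as an argument of f(), so it is sequenced before every
// statement in f()'s body ([C++11: 1.9/15]).
int my_timer_on(clk::time_point& start) {
    start = clk::now();
    return 0;
}

// f() must run to produce my_timer_off's second argument, so f()'s body
// is sequenced before my_timer_off's body.
void my_timer_off(clk::time_point& stop, int /*result of f*/) {
    stop = clk::now();
}

int f(int) { /* code under test */ return 42; }

int main() {
    clk::time_point start, stop;
    my_timer_off(stop, f(my_timer_on(start)));
    (void)(stop - start);  // elapsed time
}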

Can functions be optimized away if they have side effects?

I want to initialize some static data on the main thread.
int32_t GetFoo(ptime t)
{
static HugeBarData data;
return data.Baz(t);
}
int main()
{
GetFoo(ptime()); // Avoid data race on static field.
// But will it be optimized away as unnecessary?
// Spawn threads. Call 'GetFoo' on the threads.
}
If the compiler may decide to remove it, how can I force it to stay there?
The only side-effecting functions that a C++ compiler can optimize away are unnecessary constructor calls, particularly copy constructors.
Cf. Under what conditions does C++ optimize out constructor calls?
Compilers must optimize according to the "as-if" rule. That is, after any optimization, the program must still behave (in the logical sense) as if the code were not optimized.
If there are side effects to a function, any optimization must preserve them. However, if the compiler can determine that the results of the side effects don't affect the rest of the program, it can optimize away even the side effects. Compilers are very conservative in this area. If your compiler optimizes away side effects of the HugeBarData constructor or the Baz call that are required elsewhere in the program, that is a bug in the compiler.
There are some exceptions where the compiler can make optimizations which alter the behaviour of the program from the non-optimized case, usually involving copies. I don't think any of those exceptions apply here.
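If you want to be extra safe, here is a minimal sketch (my own pattern; the stand-in types are hypothetical fillers for the question's HugeBarData and ptime) that ties the warm-up call to observable behaviour so it cannot be removed. Note also that in C++11 the initialization of a local static is thread-safe regardless:
#include <cstdint>

// Stand-ins for the question's types, just to make the sketch self-contained.
struct ptime {};
struct HugeBarData { int32_t Baz(ptime) { return 42; } };

int32_t GetFoo(ptime t) {
    static HugeBarData data;  // C++11: initialization is thread-safe ([stmt.dcl]/4)
    return data.Baz(t);
}

volatile int32_t g_sink;  // a volatile write is observable behaviour

int main() {
    // The result feeds a volatile object, so the warm-up call cannot be elided.
    g_sink = GetFoo(ptime());
    // Spawn threads. Call 'GetFoo' on the threads.
}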

Does it make sense to declare inline functions noexcept?

From what I can tell, the SO community is divided on whether declaring a function noexcept enables meaningful compiler optimizations that would not otherwise be possible. (I'm talking specifically about compiler optimizations, not library implementation optimizations based on move_if_noexcept.) For purposes of this question, let's assume that noexcept does make meaningful code-generation optimizations possible. With that assumption, does it make sense to declare inline functions noexcept? Assuming such functions are actually inlined, this would seem to require that compilers generate the equivalent of a try block around the code resulting from the inline function at the call site, because if an exception arises in that region, terminate must be called. Without noexcept, that try block would seem to be unnecessary.
My original interest was in whether it made sense to declare lambda functions noexcept, given that they are implicitly inline, but then I realized that the same issues arise for any inline function, not just lambdas.
let's assume that noexcept does make meaningful code-generation optimizations possible
OK
Assuming such functions are actually inlined, this would seem to
require that compilers generate the equivalent of a try block around
the code resulting from the inline function at the call site, because
if an exception arises in that region
Not necessarily, because it might be that the compiler can look at the function body and see that it cannot possibly throw anything. Therefore the nominal exception-handling can be elided.
If the function is "fully" inlined (that is, if the inlined code contains no function calls) then I would expect that the compiler can fairly commonly make this determination -- but not for example in a case where there's a call to vector::push_back() and the writer of the function knows that sufficient space has been reserved but the compiler doesn't.
Be aware also that in a good implementation a try block might not actually require any code at all to be executed in the case where nothing is thrown.
With that assumption, does it make sense to declare inline functions noexcept?
Yes, in order to get whatever the assumed optimizations are of noexcept.
It is worth noting that there was an interesting discussion in circles of power about nothrow-related issues. I highly recommend reading these:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3227.html
http://www.stroustrup.com/N3202-noexcept.pdf
Apparently, quite influential people are interested in adding some sort of automatic nothrow deduction to C++.
After some pondering I've changed my position to almost the opposite; see below.
Consider this:
when you call a function that has noexcept on the declaration -- you benefit from this (no need to deal with unwindability, etc.)
when the compiler compiles a function that has noexcept on the definition -- (unless the compiler can prove that the function is indeed nothrow) performance suffers (now the compiler needs to ensure that no exception can escape this function). You are asking it to enforce the no-exceptions promise
I.e. noexcept both hurts you and benefits you. Which is not the case if the function is inlined! When it is inlined -- there is no benefit from noexcept on the declaration whatsoever (declaration and definition become one thing)... That is, unless you actually want the compiler to enforce this for safety's sake. And by safety I mean you'd rather terminate than produce a wrong result.
It should be obvious now -- there is no point declaring inlined functions noexcept (keep in mind that not every inline function is going to get inlined).
Let's have a look at different categories of functions which don't throw (you just know they don't):
non-inlined, compiler can prove it doesn't throw -- noexcept won't hurt the function body (the compiler will simply ignore the specification) and call sites will benefit from this promise
non-inlined, compiler can't prove it doesn't throw -- noexcept will hurt the function body, but benefit call sites (hard to tell which is more beneficial)
inlined, compiler can prove it doesn't throw -- noexcept serves no purpose
inlined, compiler can't prove it doesn't throw -- noexcept will hurt the call site
As you see, nothrow is simply a badly designed language feature. It only works if you want to enforce a no-exception promise. There is no way to use it correctly -- it can give you "safety", but not performance.
The noexcept keyword ended up being used both as a promise (on the declaration) and as enforcement (on the definition) -- not a perfect approach, I think (lol, second stab at exception specs and we still didn't get it right).
So, what to do?
declare your behavior (alas, the language has nothing to help you here)! E.g.:
void push_back(int k); // throws only if there is no unused memory
don't put noexcept on inline functions (unless it is unlikely to be inlined, e.g. very large)
for non-inline functions (or functions that are unlikely to be inlined) -- make a call. The larger the function gets, the smaller noexcept's negative effect becomes (comparatively) -- at some point it probably makes sense specifying it for the callers' benefit
use noexcept on the move constructor and move assignment operator (and destructor?); see the sketch after this list. It could affect them negatively, but if you don't -- certain library functions (std::swap, some container operations) won't take the most efficient path (or won't provide the best exception guarantee). Basically, any place that uses the noexcept operator on your function will (as of now) force you to use the noexcept specifier.
use noexcept if you don't trust the calls your function makes and would rather die than have it behave unexpectedly
pure virtual functions -- more often than not you don't trust the people implementing these interfaces. It often makes sense to buy insurance (by specifying noexcept)
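As promised above, a minimal sketch of the move-constructor case (my own example): std::vector uses std::move_if_noexcept when it reallocates, so elements are moved only if the move constructor is noexcept; otherwise they are copied, to preserve the strong exception guarantee:
#include <utility>
#include <vector>

struct Widget {
    std::vector<int> payload;

    Widget() = default;
    Widget(const Widget&) = default;

    // noexcept here lets std::vector move Widgets during reallocation;
    // without it, std::move_if_noexcept falls back to the copy constructor.
    Widget(Widget&& other) noexcept : payload(std::move(other.payload)) {}
};

int main() {
    std::vector<Widget> v(4);
    v.reserve(v.capacity() + 1);  // reallocation: elements are moved, not copied
}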
Well, how else could noexcept be designed?
I'd use two different keywords -- one for declaring a promise and another for enforcing it. E.g. noexcept and force_noexcept. In fact, enforcement isn't really required -- it can be done with try/catch + terminate() (yes, it will unwind the stack, but who cares if it is followed by std::terminate()?)
I'd force compiler to analyze all calls in given function to determine if it can throw. If it does and a noexcept promise was made -- compiler error will be emitted
For code that can throw, but you know it doesn't, there should be a way to assure the compiler that it is ok. Something like this:
vector<int> v;
v.reserve(1);
...
nothrow { // memory pre-allocated, this won't throw
v.push_back(10);
}
If the promise is broken (i.e. someone changed the vector code and now it provides other guarantees) -- undefined behavior.
Disclaimer: this approach could be too impractical, who knows...

C++: Calling a constructor to a temporary object

Suppose I have the following:
int main() {
SomeClass();
return 0;
}
Without optimization, the SomeClass() constructor will be called, and then its destructor will be called, and the object will be no more.
However, according to an IRC channel that constructor/destructor call may be optimized away if the compiler thinks there's no side effect to the SomeClass constructors/destructors.
I suppose the obvious way to go about this is not to use some constructor/destructor function (e.g. use a free function, or a static method or so), but is there a way to ensure the calling of the constructors/destructors?
However, according to an IRC channel that constructor/destructor call may be optimized away if the compiler thinks there's no side effect to the SomeClass constructors/destructors.
The bolded part ("thinks there's no side effect") is wrong. That should be: knows there is no observable behaviour.
E.g. from § 1.9 of the latest standard (there are more relevant quotes):
A conforming implementation executing a well-formed program shall produce the same observable behavior
as one of the possible executions of the corresponding instance of the abstract machine with the same program
and the same input. However, if any such execution contains an undefined operation, this International
Standard places no requirement on the implementation executing that program with that input (not even
with regard to operations preceding the first undefined operation).
As a matter of fact, this whole mechanism underpins the single most ubiquitous C++ language idiom: Resource Acquisition Is Initialization.
Backgrounder
Having the compiler optimize away constructor calls in the trivial case is extremely helpful. It is what allows iterators to compile down to exactly the same performance as code using raw pointers/indexers.
It is also what allows a function object to compile down to the exact same code as inlining the function body.
It is what makes C++11 lambdas perfectly optimal for simple use cases:
factorial = std::accumulate(begin, end, 1, [] (int a, int b) { return a*b; });
The lambda compiles down to a functor object similar to
struct lambda_1
{
int operator()(int a, int b) const
{ return a*b; }
};
The compiler sees that the constructor/destructor can be elided and the function body gets inlined. The end result is optimal. [1]
More (un)observable behaviour
The standard contains a very entertaining example to the contrary, to spark your imagination.
§ 20.7.2.2.3
[ Note: The use count updates caused by the temporary object construction and destruction are not
observable side effects, so the implementation may meet the effects (and the implied guarantees) via
different means, without creating a temporary. In particular, in the example:
shared_ptr<int> p(new int);
shared_ptr<void> q(p);
p = p;
q = p;
both assignments may be no-ops. —end note ]
IOW: Don't underestimate the power of optimizing compilers. This in no way means that language guarantees are to be thrown out of the window!
[1] Though there could be faster algorithms to get a factorial, depending on the problem domain :)
I'm sure that if 'SomeClass::SomeClass()' is not implemented 'inline', the compiler has no way of knowing that the constructor/destructor has no side effects, and it will always call the constructor/destructor.
If the compiler is optimizing away a visible effect of the constructor/destructor call, it is buggy. If it has no visible effect, then you shouldn't notice it anyway.
However let's assume that somehow your constructor or destructor does have a visible effect (so construction and subsequent destruction of that object isn't effectively a no-op) in such a way that the compiler could legitimately think it wouldn't (not that I can think of such a situation, but then, it might be just a lack of imagination on my side). Then any of the following strategies should work:
Make sure that the compiler cannot see the definition of the constructor and/or destructor. If the compiler doesn't know what the constructor/destructor does, it cannot assume it does not have an effect. Note, however, that this also disables inlining. If your compiler does not do cross-module optimization, just putting the constructor/destructor into a different file should suffice.
Make sure that your constructor/destructor actually does have observable behaviour, e.g. through use of volatile variables (every read or write of a volatile variable is considered observable behaviour in C++).
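A minimal sketch of that second strategy (my own example): the volatile writes make construction and destruction observable, so the temporary cannot be elided:
volatile int g_trace = 0;  // every volatile write is observable behaviour

struct SomeClass {
    SomeClass()  { g_trace = g_trace + 1; }  // must actually execute
    ~SomeClass() { g_trace = g_trace - 1; }  // must actually execute
};

int main() {
    SomeClass();  // temporary: both constructor and destructor must run
    return 0;
}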
However, let me stress again that it's very unlikely that you have to do anything, unless your compiler is horribly buggy (in which case I'd strongly advise you to change the compiler :-)).