What exactly is a 'side-effect' in C++?

Is it a standard term which is well defined, or just a term coined by developers to explain a concept (... and what is the concept)? As I understand it, this has something to do with the all-confusing sequence points, but I am not sure.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
For reference, some questions talking about side effects:
Is comma operator free from side effect?
Force compiler to not optimize side-effect-less statements
Side effects when passing objects to function in C++

A "side effect" is defined by the C++ standard in [intro.execution], by:
Reading an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

The term "side-effect" arises from the distinction between imperative languages and pure functional languages. A C++ expression can do three things:
1. compute a result (or compute "no result" in the case of a void expression),
2. raise an exception instead of evaluating to a result,
3. in addition to 1 or 2, otherwise alter the state of the abstract machine on which the program is nominally running.
Category 3 covers side-effects; the "main effect" is evaluating the result of the expression. Exceptions are a slightly awkward special case, in that altering the flow of control does change the state of the abstract machine (by changing the current point of execution), but isn't a side-effect. The code to construct, handle, and destroy the exception may have its own side-effects, of course.
The same principles apply to functions, with the return value in place of the result of the expression.
So, int foo(int a, int b) { return a + b; } just computes a return value, it doesn't alter anything else. Therefore it has no side-effects, which sometimes is an interesting property of a function when it comes to reasoning about your program (e.g. to prove that it is correct, or by the compiler when it optimizes). int bar(int &a, int &b) { return ++a + b; } does have a side-effect, since modifying the caller's object a is an additional effect of the function beyond simply computing a return value. It would not be permitted in a pure functional language.
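To make that concrete, here is a minimal, self-contained sketch reusing the two functions above:

#include <iostream>

// No side-effects: the only effect is computing the return value.
int foo(int a, int b) { return a + b; }

// Side-effect: modifies the caller's object a in addition to
// computing a return value.
int bar(int &a, int &b) { return ++a + b; }

int main() {
    int x = 1, y = 2;
    std::cout << foo(x, y) << '\n';  // prints 3; x and y are untouched
    std::cout << bar(x, y) << '\n';  // prints 4; x is now 2
    std::cout << x << '\n';          // prints 2: the side-effect persisted
}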
The stuff in your quote about "has finished being evaluated" refers to the fact that the result of an expression (or return value of a function) can be a "temporary object", which is destroyed at the end of the full expression in which it occurs. So creating a temporary isn't a "side-effect" by that definition: other changes are.

What exactly is a 'side-effect' in C++? Is it a standard term which is well defined...
c++11 draft - 1.9.12: Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access to a volatile object is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
The significance is that, as expressions are being evaluated they can modify the program state and/or perform I/O. Expressions are allowed in myriad places in C++: variable assignments, if/else/while conditions, for loop setup/test/modify steps, function parameters etc.... A couple examples: ++x and strcat(buffer, "append this").
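For instance, a side effect frequently drives a loop condition; a minimal sketch:

#include <cstdio>

int main() {
    int c;
    // The assignment to c is a side effect of evaluating the condition;
    // the condition's result is just the comparison against EOF.
    while ((c = std::getchar()) != EOF)
        std::putchar(c);
}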
In a C++ program, the Standard grants the optimiser the right to generate code representing the program operations, but requires that all the operations associated with steps before a sequence point appear before any operations related to steps after the sequence point.
The reason C++ programmers tend to have to care about sequence points and side effects is that there aren't as many sequence points as you might expect. For example: given x = 1; f(++x, ++x);, you may expect a call to f(2, 3), but it's actually undefined behaviour. This behaviour is left undefined so the compiler's optimiser has more freedom to arrange operations with side effects to run in the most efficient order possible - perhaps even in parallel. It also avoids burdening compiler writers with detecting such conditions.
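A sketch of the trap and a well-defined rewrite (f here is just a hypothetical two-argument function):

void f(int, int);

void broken() {
    int x = 1;
    f(++x, ++x);  // undefined behaviour in C++11 and earlier:
                  // two unsequenced modifications of x
}

void fixed() {
    int x = 1;
    int first = ++x;   // x == 2; the statement boundary sequences this
    int second = ++x;  // x == 3
    f(first, second);  // well-defined call of f(2, 3)
}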
1. Is comma operator free from side effect?
Yes - a comma operator introduces a sequence point: the steps on the left must be complete before those on the right execute. There is a list of sequence points at http://en.wikipedia.org/wiki/Sequence_point - you should read it! (If you have to ask about side effects, then be careful in interpreting this answer - the "comma operator" is NOT invoked between function arguments, array initialisation elements, etc. The comma operator is relatively rarely used and somewhat obscure. Do some reading if you're not sure what the comma operator really is.)
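A short illustration of both points - the real comma operator versus the commas that merely separate function arguments (f is a hypothetical helper):

#include <iostream>

void f(int a, int b) { std::cout << a << ' ' << b << '\n'; }

int main() {
    int i = 0;
    // Comma operator: ++i is fully evaluated (side effect included)
    // before i * 10, so j == 10.
    int j = (++i, i * 10);
    // NOT the comma operator: these commas separate arguments, whose
    // evaluations are unsequenced relative to each other (pre-C++17).
    f(i, j);  // prints "1 10"
}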
2. Force compiler to not optimize side-effect-less statements
I assume you mean "side-effect-ful" statements. Compilers are not obliged to support any such option, and what behaviour would they exhibit if they tried? The Standard doesn't define what they should do in such situations. Sometimes a majority of programmers might share an intuitive expectation, but other times it's really arbitrary.
3. Side effects when passing objects to function in C++
When calling a function, all the parameters must have been completely evaluated - and their side effects triggered - before the function call takes place. BUT, there are no restrictions requiring the compiler to evaluate any specific parameter expression before another. Their evaluations can overlap, run in parallel, etc. So, in f(expr1, expr2), some of the steps in evaluating expr2 might run before anything from expr1, yet expr1 might still complete first - the order is unspecified.
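A minimal sketch of that freedom; which line prints first is up to the implementation:

#include <iostream>

int expr1() { std::cout << "expr1\n"; return 1; }
int expr2() { std::cout << "expr2\n"; return 2; }

void f(int, int) {}

int main() {
    // Both arguments are completely evaluated, side effects included,
    // before f's body runs - but their relative order is not fixed.
    f(expr1(), expr2());
}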

1.9.6: The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
A side-effect is anything that affects observable behavior.
Note that there are exceptions specified by the standard, where observable behavior doesn't have to conform to that of the abstract machine - see return value optimization, temporary copy elision.
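A small sketch of the distinction, assuming a hypothetical memory-mapped input register:

volatile int sensor;  // hypothetical memory-mapped input: reads are observable
int plain;            // ordinary object: a dead read can be removed

int poll() {
    int v = sensor;  // must be performed: reading a volatile glvalue
                     // is observable behavior
    int d = plain;   // may be optimized away: no observable effect
    (void)v;
    (void)d;
    return 0;
}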

Related

Show where temporaries are created in C++

What is the fastest way to uncover where temporaries are created in my C++ code?
The answer is not always easily deducible from the standard and compiler optimizations can further eliminate temporaries.
I have experimented with godbolt.org and it's fantastic. Unfortunately, when it comes to temporaries, the forest often gets hidden behind the trees of assembler. Additionally, aggressive compiler optimization options make the assembler totally unreadable.
Any other means to accomplish this?
"compiler optimizations can further eliminate temporaries."
It seems you have a slight misunderstanding of the C++ semantics. The C++ Standard talks about temporaries to define the formal semantics of a program. This is a compact way to describe a large set of possible executions.
An actual compiler doesn't need to behave like this at all, and often it won't. Real compilers know about registers; real compilers don't pretend that PODs have (trivial) constructors and destructors. This happens even before optimizations: I don't know of any compiler that will generate calls to trivial ctors in debug mode.
Now some semantics described by the Standard can only be achieved by a fairly close approximation. When destructors have visible side effects (think std::cout), temporaries of those types cannot be entirely eliminated. But real compilers might implement the visible side effect while not allocating any storage. The notion of a temporary existing or not existing is a binary view, and in reality there are intermediate forms.
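A sketch of that situation (the class name is made up):

#include <iostream>

struct Noisy {
    ~Noisy() { std::cout << "destroyed\n"; }  // visible side effect
};

Noisy make() { return Noisy{}; }

int main() {
    make();  // the temporary's destructor must still print "destroyed",
             // even if the compiler never allocates visible storage for it
}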
Due to the "as-if" rule it is probably unreliable to try to view the compilation process to see where temporaries are created.
But reading the code (and coding) while keeping in mind the following paragraph of the standard may help in finding where temporaries are created or not, [class.temporary]/2
The materialization of a temporary object is generally delayed as long as possible in order to avoid creating unnecessary temporary objects. [ Note: Temporary objects are materialized:
when binding a reference to a prvalue ([dcl.init.ref], [expr.type.conv], [expr.dynamic.cast], [expr.static.cast], [expr.const.cast], [expr.cast]),
when performing member access on a class prvalue ([expr.ref], [expr.mptr.oper]),
when performing an array-to-pointer conversion or subscripting on an array prvalue,
when initializing an object of type std::initializer_list from a braced-init-list ([dcl.init.list]),
for certain unevaluated operands ([expr.typeid], [expr.sizeof]), and
when a prvalue appears as a discarded-value expression.
In this paragraph, which comes from the C++17 standard, the term prvalue has a new definition, [basic.lval]/1:
A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of the operand of an operator, as specified by the context in which it appears.
And in the latest standard draft (pre-C++20), the paragraph [basic.lval] has been moved to Expressions [expr], so what we knew as value categories is evolving to become expression categories.
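For instance, the first and last bullets above can be observed directly; a minimal sketch:

#include <iostream>
#include <string>

std::string greet() { return "hello"; }  // the call expression is a prvalue

int main() {
    // Binding a reference to a prvalue materializes a temporary
    // std::string and extends its lifetime to that of r.
    const std::string& r = greet();
    std::cout << r << '\n';

    // A prvalue used as a discarded-value expression is also
    // materialized, then destroyed at the end of the full-expression.
    greet();
}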

C++ short circuit evaluation w.r.t optimization

Simple question, but surprisingly hard to search for.
For the statement A && B I know there is a sequence point between the evaluation of A and B, and I know that the order of evaluation is left-to-right, but what is a compiler allowed to do when it can prove that B is always false (perhaps even explicitly so)?
Namely, for function_with_side_effects() && false is the compiler allowed to optimize away the function call?
A compiler is allowed to optimise out anything, as long as it doesn't break the as-if rule. The as-if rule states that with respect to observable behaviour, a program must behave as if it was executed by the exact rules of the C++ abstract machine (basically normal, unoptimised semantics of code).
Observable behaviour is:
Access to volatile objects
Writing to files
Input & output on interactive devices
As long as the program does the three things above in correct order, it is allowed to deviate from other source code functionality as much as it wants.
Of course, in practice, the number of operations which must be left intact by the compiler is much larger than the above, simply because the compiler has to assume that any function whose code it cannot see can, potentially, have an observable effect.
So, in your case, unless the compiler can prove that no action inside function_with_side_effects can ever affect observable behaviour (directly or indirectly by e.g. setting a flag tested later), it has to execute a call of function_with_side_effects, because it could violate the as-if rule if it didn't.
As #T.C. correctly pointed out in comments, there are a few exceptions to the as-if rule, when a compiler is allowed to perform optimisations which change observable behaviour; the most commonly encountered among these exceptions being copy elision. However, none of the exceptions come into play in the code in question.
No.
In general, the C++ Standard specifies the result of the computation in terms of observable effects and as long as your code is written in a Standard-compliant way (avoid Undefined Behavior, Unspecified Behavior and Implementation-Defined Behavior) then a compliant compiler has to produce the observable effects in the order they are specified.
There are a couple of caveats in the Standard; the most notable is Copy Elision: when returning a value, the compiler is allowed to omit a call to the Copy Constructor or to the Move Constructor without a care for their (potential) observable effects.
The compiler is otherwise only allowed to optimize non-observable behavior, such as for example using less CPU registers or not writing a value in a memory location you never read afterward.
Note: in C++, the address of an object can be observed, and is thus considered observable; it's low-level like that.
In your particular case, let's refer to the Standard:
[expr.log.and] Logical AND operator
The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
The key here is (2): sequenced after/sequenced before is standardese speak for ordering the observable events.
According to the standard, 5.14 Logical AND operator:
1 The && operator groups left-to-right. The operands are both contextually converted to bool. The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
2 The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
So, according to these rules, the compiler will generate code in which function_with_side_effects() is evaluated.
The expression function_with_side_effects() && false is the same as function_with_side_effects() except that the result value is unconditionally false. The function call cannot be eliminated. The left operand of && is always evaluated; it is only the right operand whose evaluation is conditional.
Perhaps you're really asking about false && function_with_side_effects()?
In this expression, the function call must not happen. This is obvious statically, from the false. Since it must not happen, the translation of the code can be such that the function isn't referenced in the generated code.
However, suppose that we have false && nonexistent_function() which is the only reference to nonexistent_function, and that function isn't defined anywhere. If this is completely optimized away, then the program will link. Thereby, the implementation will fail to diagnose a violation of the One Definition Rule.
So I suspect that, for conformance, the symbol still has to be referenced, even if it isn't used in the code.
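Putting the two cases side by side (function_with_side_effects is the asker's hypothetical function, given a body here for illustration):

#include <iostream>

bool function_with_side_effects() {
    std::cout << "called\n";  // observable I/O: cannot be elided
    return true;
}

int main() {
    bool a = function_with_side_effects() && false;  // prints "called"; a == false
    bool b = false && function_with_side_effects();  // short-circuits: prints nothing
    (void)a;
    (void)b;
}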

Is a C++ optimizer allowed to move statements across a function call?

Note: No multithreading at all here. Just optimized single-threaded code.
A function call introduces a sequence point. (Apparently.)
Does it follow that a compiler (if the optimizer inlines the function) is not allowed to move/intermingle any instructions prior/after with the function's instructions? (As long as it can "prove" there are no observable effects, obviously.)
Explanatory background:
Now, there is a nice article about a benchmarking class for C++, where the author stated:
The code we time won't be rearranged by the optimizer and will always lie between those start / end calls to now(), so we can guarantee our timing will be valid.
to which I asked how he can be sure, and nick replied:
You can check the comment in this answer https://codereview.stackexchange.com/a/48884. I quote: "I would be careful about timing things that are not functions because of optimizations that the compiler is allowed to do. I am not sure about the sequencing requirements and the observable behavior understanding of such a program. With a function call the compiler is not allowed to move statements across the call point (they are sequenced before or after the call)."
What we do is basically abstract the callable (function, lambda, block of code surrounded by lambda) and have a single call callable(factor) inside the measure structure that acts as a barrier (not the barrier in multithreading, I believe I convey the message).
I am quite unsure about this, especially the quote:
With a function call the compiler is not allowed to move statements across the call point (they are sequenced before or after the call).
Now, I was always under the impression that when an optimizer inlines some function (which may very well be the case in a (simple) benchmark scenario), it is free to rearrange whatever it likes as long as it does not affect observable behavior.
That is, as far as the language / the optimizer are concerned, these two snippets are exactly the same:
void f() {
// do stuff / Multiple statements
}
auto start = ...;
f();
auto stop = ...;
vs.
auto start = ...;
// do stuff / Multiple statements
auto stop = ...;
Now, I was always under the impression that when an optimizer inlines some function (which may very well be the case in a (simple) benchmark scenario), it is free to rearrange whatever it likes as long as it does not affect observable behavior.
It absolutely is. The optimizer doesn't even need to inline it for this to occur in theory.
However, timing functions are observable behaviour - specifically, they are I/O on the part of the system. The optimizer cannot know that that I/O will produce the same outcome (it obviously won't) if performed in a different order to other I/O calls, which can include non-obvious things - even memory allocation calls can invoke syscalls to get their memory.
What this basically means is that by and large, for most function calls, the optimizer can't do a great deal of re-arranging because there's potentially a vast quantity of state involved that it can't reason about.
Furthermore, the optimizer can't really know that re-arranging your function calls will actually make the code run faster, and it will make debugging it harder, so they don't have a great deal of incentive to go screwing around with the program's stated order.
Basically, in theory the optimizer can do this, but in reality it won't because doing so would be a massive undertaking for not a lot of benefit.
You'll only encounter conditions like this if your benchmark is fairly trivial or consists virtually entirely of primitive operations like integer addition - in which case you'll want to check the assembly anyway.
Your concern is perfectly valid, the optimizer is allowed to move anything past a function call if it can prove that this does not change observable behavior (other than runtime, that is).
The point about using a function to stop the optimizer from doing things is not to tell the optimizer about the function. That is, the function must not be inlined, and it must not be included in the same compilation unit. Since optimizers are generally a compiler feature, moving the function definition to a different compilation unit deprives the optimizer of the information necessary to prove anything about the function, and consequently stops it from moving anything across the function call.
Beware that this assumes that there is no linker doing global analysis for optimization. If there is, it can still screw you.
What the comment you quoted has not considered is that sequence points are not primarily about order of execution (although they do constrain it, they don't act as full barriers), but rather about values of expressions.
C++11 actually gets rid of the "sequence point" terminology completely, and instead discussed ordering of "value computation" and "side effects".
To illustrate, the following code exhibits undefined behavior because it doesn't respect ordering:
int a = 5;
int x = a++ + a;
This version is well-defined:
int a = 5;
a++;
int x = a + a;
What the sequence point / ordering of side effects and value computations guarantees us is that the a used in x = a + a is 6, not 5. So the compiler cannot rewrite it to:
int a = 5;
int x = a + a;
a++;
However, it's perfectly legal to rewrite it as:
int a = 5;
int x = (a+1) + (a+1);
a++;
The order of execution between assigning x and assigning a isn't constrained, because neither of them is volatile or atomic<T> and they aren't externally visible side effects.
The standard definitely leaves room for the optimizer to sequence operations across the boundary of a function:
1.9/15 Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.
as long as the as-if rule is respected:
1.9/5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.
The practice of leaving the optimizer in the dark, as suggested by cmaster, is in general very effective. By the way, the global optimization issue at link time can also be circumvented by using dynamic linking of the benchmarked function.
There is, however, another hard sequencing constraint that can be used to achieve the same purpose, even within the same compilation unit:
1.9/15 When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function.
So you may safely use an expression like:
my_timer_off(stop, f( my_timer_on(start) ) );
This "functional" writing ensures that:
my_timer_on() is evaluated before any statement of f() is executed,
f() is called before the body of my_timer_off() is executed
thus ensuring the sequence timer-on / f / timer-off (the my_timer_xx would take the start/stop by value).
Of course, this assumes that the signature of the benchmarked function f() can be changed to allow the expression above.
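For illustration, here is a hedged sketch of what this pattern might look like with std::chrono - the exact signatures of the timer functions are an assumption, filled in only so the expression compiles:

#include <chrono>
#include <iostream>

using clk = std::chrono::steady_clock;

// Assumed signatures: the time_points are taken by reference here so the
// timestamps survive the calls; the answer above only fixes the call shape.
clk::time_point my_timer_on(clk::time_point& start) {
    return start = clk::now();
}

void my_timer_off(clk::time_point& stop, int /*result of f*/) {
    stop = clk::now();
}

int f(clk::time_point) {   // benchmarked function, adapted to take the token
    volatile int acc = 0;  // volatile so the loop isn't optimized away
    for (int i = 0; i < 1000; ++i) acc = acc + i;
    return acc;
}

int main() {
    clk::time_point start, stop;
    // [intro.execution]: my_timer_on is sequenced before f's body, and the
    // call to f is sequenced before my_timer_off's body.
    my_timer_off(stop, f(my_timer_on(start)));
    std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count()
              << " ns\n";
}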

Could a C++ implementation, in theory, parallelise the evaluation of two function arguments?

Given the following function call:
f(g(), h())
since the order of evaluation of function arguments is unspecified (still the case in C++11 as far as I'm aware), could an implementation theoretically execute g() and h() in parallel?
Such a parallelisation could only kick in were g and h known to be fairly trivial (in the most obvious case, accessing only data local to their bodies) so as not to introduce concurrency issues but, beyond that restriction I can't see anything to prohibit it.
So, does the standard allow it? Even if only by the as-if rule?
(In this answer, Mankarse asserts otherwise; however, he does not cite the standard, and my read-through of [expr.call] hasn't revealed any obvious wording.)
The requirement comes from [intro.execution]/15:
... When calling a function ... Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function [Footnote: In other words, function executions do not interleave with each other.].
So any execution of the body of g() must be indeterminately sequenced with (that is, not overlapping with) the evaluation of h() (because h() is an expression in the calling function).
The critical point here is that g() and h() are both function calls.
(Of course, the as-if rule means that the possibility cannot be entirely ruled out, but it should never happen in a way that could affect the observable behaviour of a program. At most, such an implementation would just change the performance characteristics of the code.)
As long as you can't tell, whatever the compiler does to evaluate these functions is entirely up to the compiler. Clearly, the evaluation of the functions cannot involve any access to shared, mutable data as this would introduce data races. The basic guiding principle is the "as if"-rule and the fundamental observable operations, i.e., access to volatile data, I/O operations, access to atomic data, etc. The relevant section is 1.9 [intro.execution].
Not unless the compiler knows exactly what g(), h(), and anything they call do.
The two expressions are function calls, which may have unknown side effects. Therefore, parallelizing them could cause a data-race on those side effects. Since the C++ standard does not allow argument evaluation to cause a data-race on any side effects of the expressions, the compiler can only parallelize them if it knows that no such data race is possible.
That means walking through each function and looking at exactly what it does and/or calls, then tracking through those functions, etc. In the general case, it's not feasible.
Easy answer: when the functions are sequenced, even if indeterminately, there is no possibility of a race condition between the two, which is not true if they are parallelized. Even a pair of one-line "trivial" functions could do it:
void g()
{
*p = *p + 1;
}
void h()
{
*p = *p - 1;
}
If p is a name shared by g and h, then a sequential calling of g and h in any order will result in the value pointed to by p not changing. If they are parallelized, the reading of *p and the assigning of it could be interleaved arbitrarily between the two:
g reads *p and finds the value 1.
h reads *p and also finds the value 1.
g writes 2 to *p.
h, still using the value 1 it read before, writes 0 to *p.
Thus, the behavior is different when they are parallelized.
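A self-contained version of the example (p is assumed to point at an int shared by both functions, and the functions are given return values so they can appear as arguments):

int shared = 1;
int* p = &shared;  // assumed shared state, as in the answer above

int g() { return *p = *p + 1; }
int h() { return *p = *p - 1; }

void f(int, int) {}

int main() {
    // g() and h() are indeterminately sequenced - one runs to completion
    // before the other - so *p is back to 1 here in either order.
    f(g(), h());
    return *p - 1;  // always 0 under sequential (non-interleaved) evaluation
}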

built-in type variable is returned from a function

If a local object is returned from a function, at least three steps have to happen:
1. The copy constructor is called to make a copy.
2. The local object is destroyed.
3. The copy is returned.
For example:
x = y + z
If x is an integer object, a copy of y + z should be returned, then a new object is created, and then the assignment operator of x will take this object as a parameter.
So my questions are:
Is the same process used for built-in types such as int, double, ...?
If they're not the same, how's it done?
The language specification does not say "how it is done" for built-in types and built-in operators. The language simply says that the result of binary + for built-in types is an rvalue - the sum of the operand values. That's it. There's no step-by-step description of what happens when a built-in operator is used (with some exceptions like && and the comma operator).
The reason you can come up with a step-by-step description of how an overloaded operator works (which is what you have in your question) is because the process of evaluating overloaded operator comes through several sequence points. A sequence point in a C++ program implements the concept of discrete time: it is the only thing that separates something that happens before from things that happen after. Without a separating sequence point, there's no "before" and no "after".
In the case of an overloaded operator, there are quite a few sequence points involved in the process of its evaluation, which is why you can describe this process as a sequence of steps. The evaluation of the built-in operator + has no sequence points in it, so there is absolutely no way to describe what happens there in step-by-step fashion. From the language point of view, the built-in + is evaluated through a blurry, indivisible mix of unspecified actions that produce the correct result.
It is done this way to give the compiler better optimization opportunities when evaluating built-in operators.
This depends on several factors, especially the compiler's level of or capacity for optimization. This can also, to some extent, depend on the calling convention.
All built-in types can fit in a register (except for "unusually large" built-in types like long long int). Basically, for all calling conventions, if the return type can fit in the EAX register, that is where it's put by the callee and retrieved by the caller. So, that would be the answer to your question.
For larger objects, the procedure that you described is only true in principle, but this whole copy-destroy-temporary-copy-destroy dance is very inefficient, and eliminating it is among the highest priorities of any compiler's optimization algorithm. Since such objects are too large to fit in a register, they are typically put on the stack and left there to be retrieved by the caller. Since very often they are just stored right back into another local variable, compilers will try to merge those stack slots together; often the local variable in the called function will also be at the same slot, so, in the end, you get no copy, no destruction, no temporary, no overhead... That's the ideal "optimized" situation, but the compiler is not always able to realize it, and it also requires the object to be of a POD class.
If you're just talking about the example you have right now (x = y + z), then there is no function call - the addition takes place right in the registers.
If you're actually calling a function (x = sum(y, z), say), then you can get a couple different behaviours based on calling conventions and data types (types that can't fit in a single register get special treatment), but with ints, it's pretty safe to assume they'll end up passed back in the EAX register.
No constructors / destructors are involved!
For more about the individual data types, check out the Return Values section on this page - I think these are for the cdecl calling convention, but they should be pretty ubiquitous.
For more on calling conventions in general, Wikipedia (x86 calling conventions) does a pretty thorough job. You'll be interested in cdecl (standard C functions) and thiscall (C++ class-member functions) in particular.
Hope this helps!
There are two questions here. First, your list about what needs to be done for returning a user-defined type from a function is basically correct; except that all actual compilers use return value optimization to avoid the temporary copy.
The other question is "what about built-in types?" Conceptually the same thing happens, it's just that (1) built-in types have "trivial constructors" and "trivial destructors" (i.e., the compiler knows there's no need to actually call any functions to construct/destruct these types), (2) the compiler knows more about operations on the built-in types than operations on user-defined types and won't actually need to call functions to, say, add two ints (instead the compiler will just use the relevant assembly instructions), and (3) the compiler knows more about built-in types than user-defined types and can use return value optimization even more often.
For the record, rvalue references and related changes in C++0x are largely about giving the programmer more ability to control things like return value optimization.
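A quick way to watch copy elision happen is a tracing class; a minimal sketch (the class name is made up):

#include <iostream>

struct Traced {
    Traced() { std::cout << "ctor\n"; }
    Traced(const Traced&) { std::cout << "copy\n"; }
    ~Traced() { std::cout << "dtor\n"; }
};

Traced make() { return Traced{}; }  // eligible for return value optimization

int main() {
    Traced t = make();  // with elision (guaranteed since C++17): prints
                        // "ctor", then "dtor" at end of main - no "copy"
    (void)t;
}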