C++ function evaluation order in assignment operator - c++

int& foo() {
printf("Foo\n");
static int a;
return a;
}
int bar() {
printf("Bar\n");
return 1;
}
void main() {
foo() = bar();
}
I am not sure which one should be evaluated first.
I have tried in VC that bar function is executed first. However, in compiler by g++ (FreeBSD), it gives out foo function evaluated first.
Much interesting question is derived from the above problem, suppose I have a dynamic array (std::vector)
std::vector<int> vec;
int foobar() {
vec.resize( vec.size() + 1 );
return vec.size();
}
void main() {
vec.resize( 2 );
vec[0] = foobar();
}
Based on previous result, the vc evaluates the foobar() and then perform the vector operator[]. It is no problem in such case. However, for gcc, since the vec[0] is being evaluated and foobar() function may lead to change the internal pointer of array. The vec[0] can be invalidated after executation of foobar().
Is it meant that we need to separate the code such that
void main() {
vec.resize( 2 );
int a = foobar();
vec[0] = a;
}

Order of evaluation would be unspecified in that case. Dont write such code
Similar example here

The concept in C++ that governs whether the order of evaluation is defined is called the sequence point.
Basically, at a sequence point, it is guaranteed that all expressions prior to that point (with observable side effects) have been evaluated, and that no expressions beyond that point have been evaluated yet.
Though some might find it surprising, the assignment operator is not a sequence point. A full list of all sequence points is in the Wikipedia article.

c++17 guarantees that bar() will be executed before foo().
Before c++17 this was unspecified behaviour and different compilers would evaluate in different orders. If both sides of the expression modify the same memory location then the behaviour is undefined.

Order of evaluation of an expression is Unspecified Behaviour.
It depends on the compiler which order it chooses to evaluate.
You should refrain from writing shuch codes.
Though if there is no side effect then the order shouldn't matter.
If the order matters, then your code is wrong/ Not portable/ may give different result accross different compilers**.

Related

Is "assert (this)" a viable pattern?

Suppose the C++ below. Before calling of a->method1() it has an
assert (a) to check if a is sane.
The call a->method2() has no such assertion; instead method2 itself
checks for a valid this by means of assert (this).
It that viable code re. the C++ specification?
Even if it's covered by the standard, it not good style of course, and
it's error prone if the code ever changes, e.g. if the method is
refactored to a virtual method. I am just curios about what the
standard has to say, and whether g++ code words by design or just by
accident.
The code below works as expected with g++, i.e. the assertion in
method2 triggers as intended, because just to call method2 no
this pointer is needed.
#include <iostream>
#include <cassert>
struct A
{
int a;
A (int a) : a(a) {}
void method1 ()
{
std::cout << a << std::endl;
}
void method2 ()
{
assert (this);
std::cout << a << std::endl;
}
};
void func1 (A *a)
{
assert (a);
a->method1();
}
void func2 (A *a)
{
a->method2();
}
int main ()
{
func1 (new A (1));
func2 (new A (2));
func2 (nullptr);
}
Output
1
2
Assertion failed: this, file main.cpp, line 16
Even if it's [permitted] by the standard
It isn't.
it not good style of course
Nope.
and it's error prone if the code ever changes, e.g. if the method is refactored to a virtual method.
I concede that a virtual member function is more likely to cause a "crash" here, but you already have undefined behaviour and that's not just a theoretical concern: you can expect things like the assertion or conditions to be elided, or other weird things to happen.
This pattern is a big no-no.
I am just curios about what the standard has to say
It says:
[expr.ref/2] [..] For the second option (arrow) the first expression shall be a prvalue having pointer type. The expression E1->E2 is converted to the equivalent form (*(E1)).E2 [..]
[expr.unary.op/1] The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. [..]
Notice that it doesn't explicitly say "the object must exist", but by saying that the expression refers to the object, it implicitly tells us that there must be an object. This sort of "gap" falls directly into the definition of undefined behaviour, by design.
whether g++ code words by design or just by accident.
The last one.
Answering your question up front: "C++: Is "assert (this)" a viable pattern?" - No.
assert(this); is pointless. The C++ standard guarantees that the this pointer is never nullptr in valid programs.
If your program has undefined behaviour then all bets are, of course, off and this might be nullptr. But an assert is not the correct fix in that case, fixing the UB is.
this cannot be nullptr, (else there is already undefined behavior).
in your case
a->method2(); // with a == nullptr
invokes undefined behavior, so checking afterward is useless.
Better signature to mean not null pointer is reference:
void func3(A& a)
{
a.method1();
}
int main ()
{
A a1(1); // no new, so no (missing) delete :-)
A a2(2);
func1(&a1);
func2(&a2);
func2(nullptr); :/
func3(a1);
}

Evaluation order of Operator *

I have the following piece of code :
int f(int &x, int c){
c = c - 1;
if (c == 0) return 1;
x = x + 1;
return f(x, c)*x;
}
Now, suppose I call the above function like this :
int p = 5;
std::cout << f(p, p) << std::endl;
The output is 9^4, since x is passed by reference, hence the final value of x should be 9, but when the return statement of the above function is changed to :
return x*f(x, c);
the output is 3024 (6*7*8*9). Why is there a difference in output ? Has it anything to do with the order of evaluation of Operator* ? If we are asked to predict the output of the above piece of code, is it fixed, compiler-dependent or unspecified ?
When you write:
f(x,c)*x
the compiler may choose to retrieve the stored value in x (for the second operand) either before or after calling f. So there are many possible ways that execution could proceed. The compiler does not have to use any consistency in this choice.
To avoid the problem you could write:
auto x_temp = x;
return f(x, c) * x_temp;
Note: It is unspecified behaviour; not undefined behaviour because there is a sequence point before and after any function call (or in C++11 terminology, statements within a function are indeterminately-sequenced with respect to the calling code, not unsequenced).
The cause is that f() function has side effect on its x parameter. The variable passed to this parameter is incremented by the value of the second parameter c when the function returns.
Therefore when you swap the order of the operand, you get different results as x contains different values before and after the function is called.
However, note that behaviour of the code written in such way is undefined as compiler is free to swap evaluation of operand in any order. So it can behave differently on different platforms, compilers or even with different optimization settings. Because of that it's generally necessary to avoid such side effects. For details see http://en.cppreference.com/w/c/language/eval_order

Effects of declaring a function as pure or const to GCC, when it isn't

GCC can suggest functions for attribute pure and attribute const with the flags -Wsuggest-attribute=pure and -Wsuggest-attribute=const.
The GCC documentation says:
Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure.
But what can happen if you attach __attribute__((__pure__)) to a function that doesn't match the above description, and does have side effects? Is it simply the possibility that the function will be called fewer times than you would want it to be, or is it possible to create undefined behaviour or other kinds of serious problems?
Similarly for __attribute__((__const__)) which is stricter again - the documentation states:
Basically this is just slightly more strict class than the pure attribute below, since function is not allowed to read global memory.
But what can actually happen if you attach __attribute__((__const__)) to a function that does access global memory?
I would prefer technical answers with explanations of actual possible scenarios within the scope of GCC / G++, rather than the usual "nasal demons" handwaving that appears whenever undefined behaviour gets mentioned.
But what can happen if you attach __attribute__((__pure__))
to a function that doesn't match the above description,
and does have side effects?
Exactly. Here's a short example:
extern __attribute__((pure)) int mypure(const char *p);
int call_pure() {
int x = mypure("Hello");
int y = mypure("Hello");
return x + y;
}
My version of GCC (4.8.4) is clever enough to remove second call to mypure (result is 2*mypure()). Now imagine if mypure were printf - the side effect of printing string "Hello" would be lost.
Note that if I replace call_pure with
char s[];
int call_pure() {
int x = mypure("Hello");
s[0] = 1;
int y = mypure("Hello");
return x + y;
}
both calls will be emitted (because assignment to s[0] may change output value of mypure).
Is it simply the possibility that the function will be called fewer times
than you would want it to be, or is it possible to create
undefined behaviour or other kinds of serious problems?
Well, it can cause UB indirectly. E.g. here
extern __attribute__((pure)) int get_index();
char a[];
int i;
void foo() {
i = get_index(); // Returns -1
a[get_index()]; // Returns 0
}
Compiler will most likely drop second call to get_index() and use the first returned value -1 which will result in buffer overflow (well, technically underflow).
But what can actually happen if you attach __attribute__((__const__))
to a function that does access global memory?
Let's again take the above example with
int call_pure() {
int x = mypure("Hello");
s[0] = 1;
int y = mypure("Hello");
return x + y;
}
If mypure were annotated with __attribute__((const)), compiler would again drop the second call and optimize return to 2*mypure(...). If mypure actually reads s, this will result in wrong result being produced.
EDIT
I know you asked to avoid hand-waving but here's some generic explanation. By default function call blocks a lot of optimizations inside compiler as it has to be treated as a black box which may have arbitrary side effects (modify any global variable, etc.). Annotating function with const or pure instead allows compiler to treat it more like expression which allows for more aggressive optimization.
Examples are really too numerous to give. The one which I gave above is common subexpression elimination but we could as well easily demonstrate benefits for loop invariants, dead code elimination, alias analysis, etc.

Call function with local static variable

Assume that we have simplest function with local static variable:
int f()
{
static int a = 0;
return ++a;
}
Let's call this function multiple times and print result:
int main()
{
int a = f();
int b = f();
std::cout<<a<<b;
}
Output is "12" - ok, as expected. But this call
int main()
{
std::cout<<f()<<f();
}
produces reverse order - "21". Why?
Because the order in which functions are executed in a compound statement is undefined. This means that by the end of the std::cout<<f()<<f() line, you are guaranteed to have called f() twice, and you are guaranteed to have printed the two results, but which result is first is not defined and can vary across compilers.
There is a difference because f() has side effects. Side effects are results of the function that can't be measured by its return value. In this case, the side effect is that the static variable is modified. If the function had no side effect (or if you were calling multiple functions with no overlapping side effects), which function is called first wouldn't change anything.
This has been asked/answered before: what is wrong here? associativity? evaluation order? how to change order?
Not all operators are ordered in C++. The link has a good explanation.

Why can't a function have a reference argument in C?

For example: void foo( int& i ); is not allowed. Is there a reason for this, or was it just not part of the specification? It is my understanding that references are generally implemented as pointers. In C++, is there any functional difference (not syntactic/semantic) between void foo( int* i ) and void foo( int& i )?
Because references are a C++ feature.
References are merely syntactic vinegar for pointers. Their implementation is identical, but they hide the fact that the called function might modify the variable. The only time they actually fill an important role is for making other C++ features possible - operator overloading comes to mind - and depending on your perspective these might also be syntactic vinegar.
For example: void foo( int& i ); is not allowed. Is there a reason for this, or was it just not part of the specification?
It was not a part of the specification. The syntax "type&" for references were introduced in C++.
It is my understanding that references are generally implemented as pointers. In C++, is there any functional difference (not syntactic/semantic) between void foo( int* i ) and void foo( int& i )?
I am not sure if it qualifies as a semantic difference, but references offer better protection against dereferencing nulls.
Because the & operator has only 2 meanings in C:
address of its operand (unary),
and, the bitwise AND operator (binary).
int &i; is not a valid declaration in C.
For a function argument, the difference between pointer and reference is not that big a deal, but in many cases (e.g. member variables) having references substantially limits what you can do, since it cannot be rebound.
References were not present in C. However, C did have what amounts to mutable arguments passed by reference. Example:
int foo(int in, int *out) { return (*out)++ + in; }
// ...
int x = 1; int y = 2;
x = foo(x, &y);
// x == y == 3.
However, it was a common error to forget to dereference "out" in every usage in more complicated foo()s. C++ references allowed a smoother syntax for representing mutable members of the closure. In both languages, this can confound compiler optimizations by having multiple symbols referring to the same storage. (Consider "foo(x,x)". Now it's undefined whether the "++" occurs after only "*out" or also after "in", since there's no sequence point between the two uses and the increment is only required to happen sometime after the value of the left expression is taken.)
But additionally, explicit references disambiguate two cases to a C++ compiler. A pointer passed into a C function could be a mutable argument or a pointer to an array (or many other things, but these two adequately illustrate the ambiguity). Contrast "char *x" and "char *y". (... or fail to do so, as expected.) A variable passed by reference into a C++ function is unambiguously a mutable member of the closure. If for instance we had
// in class baz's scope
private: int bar(int &x, int &y) {return x - y};
public : int foo(int &x, int &y) {return x + bar(x,y);}
// exit scope and wander on ...
int a = 1; int b = 2; baz c;
a = c.foo(a,b);
We know several things:
bar() is only called from foo(). This means bar() can be compiled so that its two arguments are found in foo()'s stack frame instead of it's own. It's called copy elision and it's a great thing.
Copy elision gets even more exciting when a function is of the form "T &foo(T &)", the compiler knows a temporary is going in and coming out, and the compiler can infer that the result can be constructed in place of the argument. Then no copying of the temporary in or the result out need be compiled in. foo() can be compiled to get its argument from some enclosing stack frame and write its result directly to some enclosing stack frame.
a recent article about copy elision and (surprise) it works even better if you pass by value in modern compilers (and how rvalue references in C++0x will help the compilers skip even more pointless copies), see http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/ .