Assume that we have a very simple function with a local static variable:
int f()
{
static int a = 0;
return ++a;
}
Let's call this function multiple times and print the results:
int main()
{
int a = f();
int b = f();
std::cout<<a<<b;
}
The output is "12", as expected. But this call
int main()
{
std::cout<<f()<<f();
}
produces the reverse order, "21". Why?
Because, before C++17, the order in which the operands of this expression are evaluated is unspecified. By the end of the std::cout<<f()<<f() line you are guaranteed to have called f() twice and to have printed both results, but which call happens first is not fixed by the language and can vary across compilers. (Since C++17 the left operand of << is evaluated before the right one, so a conforming C++17 compiler prints "12".)
There is a difference because f() has side effects. Side effects are results of the function that can't be measured by its return value. In this case, the side effect is that the static variable is modified. If the function had no side effect (or if you were calling multiple functions with no overlapping side effects), which function is called first wouldn't change anything.
This has been asked/answered before: what is wrong here? associativity? evaluation order? how to change order?
Not all operators are ordered in C++. The link has a good explanation.
Suppose I have the following:
int &f() {
static int x = 0;
return x;
}
int main() { f() = 5; }
I realize that this function returns a reference to an integer
(I have tried to use this Function returning int&).
Does it mean that x will be equal to 5 in this case?
I do not really understand what f() = 5 in that...
In addition, what would change if I omitted 'static' above?
I know that a static int is an integer that actually exists before the program runs, but I am not sure that helps me understand what would change.
I am trying to find the answers to that using a debugger.
Does it mean that x will be equal to 5 in this case?
Yes. After an integer has been assigned a value, it will be equal to that value.
what would change if I omitted 'static' above?
That depends on how you intend the program to behave. Removing only static would give the program undefined behaviour, because f() would then return a reference to a local variable that no longer exists once the function has returned, so that would not be a good idea. One possible change would be to remove the entire function declaration, and the call to it.
I have the following piece of code :
int f(int &x, int c){
c = c - 1;
if (c == 0) return 1;
x = x + 1;
return f(x, c)*x;
}
Now, suppose I call the above function like this :
int p = 5;
std::cout << f(p, p) << std::endl;
The output is 9^4, since x is passed by reference, hence the final value of x should be 9, but when the return statement of the above function is changed to :
return x*f(x, c);
the output is 3024 (6*7*8*9). Why is there a difference in output? Does it have anything to do with the order of evaluation of operator*? If we are asked to predict the output of the above piece of code, is it fixed, compiler-dependent or unspecified?
When you write:
f(x,c)*x
the compiler may choose to retrieve the stored value in x (for the second operand) either before or after calling f. So there are many possible ways that execution could proceed. The compiler does not have to use any consistency in this choice.
To avoid the problem you could write:
auto x_temp = x;
return f(x, c) * x_temp;
Note: It is unspecified behaviour; not undefined behaviour because there is a sequence point before and after any function call (or in C++11 terminology, statements within a function are indeterminately-sequenced with respect to the calling code, not unsequenced).
The cause is that f() has a side effect on its x parameter: every call that does not hit the c == 0 case increments the variable passed in, so by the time the outermost call returns the variable has grown by c - 1 in total (from 5 to 9 in the example).
Therefore, when you swap the order of the operands you get different results, as x contains different values before and after the function is called.
However, note that the behaviour of code written this way is unspecified, as the compiler is free to evaluate the operands in either order. So it can behave differently on different platforms, compilers or even with different optimization settings. Because of that, it's generally necessary to avoid such side effects. For details see http://en.cppreference.com/w/c/language/eval_order
GCC can suggest functions for attribute pure and attribute const with the flags -Wsuggest-attribute=pure and -Wsuggest-attribute=const.
The GCC documentation says:
Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure.
But what can happen if you attach __attribute__((__pure__)) to a function that doesn't match the above description, and does have side effects? Is it simply the possibility that the function will be called fewer times than you would want it to be, or is it possible to create undefined behaviour or other kinds of serious problems?
Similarly for __attribute__((__const__)) which is stricter again - the documentation states:
Basically this is just slightly more strict class than the pure attribute below, since function is not allowed to read global memory.
But what can actually happen if you attach __attribute__((__const__)) to a function that does access global memory?
I would prefer technical answers with explanations of actual possible scenarios within the scope of GCC / G++, rather than the usual "nasal demons" handwaving that appears whenever undefined behaviour gets mentioned.
But what can happen if you attach __attribute__((__pure__))
to a function that doesn't match the above description,
and does have side effects?
Exactly. Here's a short example:
extern __attribute__((pure)) int mypure(const char *p);
int call_pure() {
int x = mypure("Hello");
int y = mypure("Hello");
return x + y;
}
My version of GCC (4.8.4) is clever enough to remove the second call to mypure (the result is 2*mypure()). Now imagine if mypure were printf: the side effect of printing the string "Hello" would be lost.
Note that if I replace call_pure with
extern char s[];
int call_pure() {
int x = mypure("Hello");
s[0] = 1;
int y = mypure("Hello");
return x + y;
}
both calls will be emitted (because assignment to s[0] may change output value of mypure).
Is it simply the possibility that the function will be called fewer times
than you would want it to be, or is it possible to create
undefined behaviour or other kinds of serious problems?
Well, it can cause UB indirectly. E.g. here
extern __attribute__((pure)) int get_index();
extern char a[];
int i;
void foo() {
i = get_index(); // Returns -1
a[get_index()]; // Returns 0
}
The compiler will most likely drop the second call to get_index() and use the first returned value, -1, which will result in a buffer overflow (well, technically an underflow).
But what can actually happen if you attach __attribute__((__const__))
to a function that does access global memory?
Let's again take the above example with
int call_pure() {
int x = mypure("Hello");
s[0] = 1;
int y = mypure("Hello");
return x + y;
}
If mypure were annotated with __attribute__((const)), the compiler would again drop the second call and optimize the return to 2*mypure(...). If mypure actually reads s, this will produce a wrong result.
EDIT
I know you asked to avoid hand-waving, but here's some generic explanation. By default, a function call blocks a lot of optimizations inside the compiler, as it has to be treated as a black box which may have arbitrary side effects (modifying any global variable, etc.). Annotating a function with const or pure instead allows the compiler to treat it more like an expression, which allows for more aggressive optimization.
The examples are really too numerous to give. The one I gave above is common subexpression elimination, but we could just as easily demonstrate benefits for loop invariants, dead code elimination, alias analysis, etc.
int& foo() {
printf("Foo\n");
static int a;
return a;
}
int bar() {
printf("Bar\n");
return 1;
}
int main() {
foo() = bar();
}
I am not sure which one should be evaluated first.
I have tried VC, where the bar function is executed first. However, with g++ (FreeBSD), the foo function is evaluated first.
A more interesting question arises from the above problem. Suppose I have a dynamic array (std::vector):
std::vector<int> vec;
int foobar() {
vec.resize( vec.size() + 1 );
return vec.size();
}
int main() {
vec.resize( 2 );
vec[0] = foobar();
}
Based on the previous result, VC evaluates foobar() first and then applies the vector's operator[], so there is no problem in that case. With gcc, however, vec[0] is evaluated first, and foobar() may then change the vector's internal array pointer, so the reference obtained from vec[0] can be invalidated once foobar() has executed.
Does this mean that we need to split the code like this?
int main() {
vec.resize( 2 );
int a = foobar();
vec[0] = a;
}
The order of evaluation is unspecified in that case. Don't write such code.
Similar example here
The concept in C++ that governs whether the order of evaluation is defined is called the sequence point.
Basically, at a sequence point, it is guaranteed that all expressions prior to that point (with observable side effects) have been evaluated, and that no expressions beyond that point have been evaluated yet.
Though some might find it surprising, the assignment operator is not a sequence point. A full list of all sequence points is in the Wikipedia article.
C++17 guarantees that bar() will be executed before foo().
Before C++17 this was unspecified behaviour, and different compilers would evaluate in different orders. If both sides of the expression modify the same memory location, then the behaviour is undefined.
The order of evaluation of the operands of an expression is unspecified behaviour.
Which order is used depends on the compiler.
You should refrain from writing such code.
If there are no side effects, though, the order does not matter.
If the order matters, then your code is wrong or not portable, and may give different results across different compilers.
In my c++ program, I have this function,
char MostFrequentCharacter(ifstream &ifs, int &numOccurances);
and in main(), is this code,
ifstream in("file.htm");
int maxOccurances = 0;
cout <<"Most freq char is "<<MostFrequentCharacter(in, maxOccurances)<<" : "<<maxOccurances;
But this is not working: although I am getting the correct char, maxOccurances remains zero.
But if I replace the above code in main with this,
ifstream in("file.htm");
int maxOccurances = 0;
char maxFreq = MostFrequentCharacter(in, maxOccurances);
cout <<"Most freq char is "<<maxFreq<<" : "<<maxOccurances;
Then it works correctly. My question is why it does not work in the first case.
In C++,
cout << a << b
by associativity, groups as:
(cout << a) << b
but, before C++17, the compiler is free to evaluate the operands in any order.
That is, the compiler can evaluate b first, then a, then the first << operation and then the second << operation. This is because there is no sequence point associated with <<.
For the sake of simplicity let us consider the following code, which is equivalent:
#include<iostream>
int main()
{
int i = 0;
std::cout<<i<<i++;
return 0;
}
In the above source code:
std::cout<<i<<i++;
evaluates to the function call:
operator<<(operator<<(std::cout,i),i++);
In this function call, whether operator<<(std::cout,i) or i++ gets evaluated first is unspecified, i.e.:
operator<<(std::cout,i) may be evaluated first, or
i++ may be evaluated first, or
some magic ordering implemented by the compiler may be used.
Given the above, there is no way to predict this ordering, and hence no single explanation of the output is possible either.
Relevant Quote from the C++03 Standard:
Section 1.9
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Because in the first case, the value of maxOccurances in the expression is being resolved before the call to MostFrequentCharacter. It doesn't have to be that way though, it is unspecified behavior.
You may experience different results with different compilers, or compiler options. If you try that same thing on VC++ for example, I believe you will see different results.
You just have to note that where you see << you are actually calling the operator<< method - so the compiler is working out the value of the arguments to pass into that function before your variable is modified.
In other words, what you have is similar to
operator<<(operator<<(cout, f(x)), x);
...and since the evaluation order of function arguments is unspecified, it depends on the compiler.
In your compiler the operands of the cout chain happen to be evaluated right to left, so the rightmost one is evaluated first and then the one to its left. :)
So maxOccurances is read before the function has changed it.