In C and C++, is there a fixed order for evaluation of parameter to the function? I mean, what do the standards say? Is it left-to-right or right-to-left?
I am getting confusing information from the books.
Is it necessary that function call should be implemented using stack only? What do the C and C++ standards say about this?
C and C++ are two completely different languages; don't assume the same rules always apply to both. In the case of parameter evaluation order, however:
C99:
6.5.2.2 Function calls
...
10 The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
[Edit]
C11 (draft):
6.5.2.2 Function calls
...
10 There is a sequence point after the evaluations of the function designator and the actual
arguments but before the actual call. Every evaluation in the calling function (including
other function calls) that is not otherwise specifically sequenced before or after the
execution of the body of the called function is indeterminately sequenced with respect to
the execution of the called function.94)
...
94) In other words, function executions do not ‘‘interleave’’ with each other.
C++:
5.2.2 Function call
...
8 The order of evaluation of arguments is unspecified. All side effects of argument expression evaluations take effect
before the function is entered. The order of evaluation of the postfix expression and the argument expression list is
unspecified.
Neither standard mandates the use of the hardware stack for passing function parameters; that's an implementation detail. The C++ standard uses the term "unwinding the stack" to describe calling destructors for automatically created objects on the path from a try block to a throw-expression, but that's it. Most popular architectures do pass parameters via a hardware stack, but it's not universal.
[Edit]
I am getting confusing information from the books.
This is not in the least surprising, since easily 90% of books written about C are simply crap.
While the language standard isn't a great resource for learning either C or C++, it's good to have handy for questions like this. The official™ standards documents cost real money, but there are drafts that are freely available online, and should be good enough for most purposes.
The latest C99 draft (with updates since original publication) is available here. The latest pre-publication C11 draft (which was officially ratified last year) is available here. And a publicly availble draft of the C++ language is available here, although it has an explicit disclaimer that some of the information is incomplete or incorrect.
Keeping it safe: the standard leaves it up to the compiler to determine the order in which arguments are evaluated. So you shouldn't rely on a specific order being kept.
In C/C++ is there a fixed order for evaluation of parameter to the function. I mean what does standards says is it left-to-right or right-to-left . I am getting confusing information from the books.
No, the order of evaluation of function parameters (and of two sub-expressions in any expression) is unspecified behaviour in C and C++. In plain English that means that the left-most parameter could be evaluated first, or it could be the right-most one, and you cannot know which order that applies for a particular compiler.
Example:
static int x = 0;
int* func (int val)
{
x = val;
return &x;
}
void print (int val1, int val2)
{
cout << val1 << " " << val2 << endl;
}
print(*func(1), *func(2));
This code is very bad. It relies of order of evaluation of print's parameters. It will print either "1 1" (right-to-left) or "2 2" (left-to-right) and we cannot know which. The only thing guaranteed by the standard is that both calls to func() are completed before the call to print().
The solution to this is to be aware that the order is unspecified, and write programs that don't rely on the order of evaluation. For example:
int val1 = *func(1);
int val2 = *func(2);
print(val1, val2); // Will always print "1 2" on any compiler.
Is it necessary that function call should be implemented using stack only. what does C/C++ standards says about this.
This is known as "calling convention" and nothing that the standard specifies at all. How parameters (and return values) are passed, is entirely up to the implementation. They could be passed in CPU registers or on the stack, or in some other way. The caller could be the one responsible for pushing/popping parameters on the stack, or the function could be responsible.
The order of evaluation of function parameters is only somewhat associated with the calling convention, since the evaluation occurs before the function is called. But on the other hand, certain compilers can choose to put the right-most parameter in a CPU register and the rest of them on the stack, as one example.
Just only to speak for C language, the order of evaluation inside the function parameters depend on compiler. from The C Programming Language of Brian Kernighan and Dennis Ritchie;
Similarly, the order in which function arguments are evaluated is not
specified, so the statement
printf("%d %d\n", ++n, power(2, n)); /*WRONG */
can produce different results with different compilers,
depending on whether n is incremented before power is called. The
solution, of course, is to write
++n;
printf("%d %d\n", n, power(2, n));
As far as I know, the function printf has an exception here.
For an ordinary function foo, the order of evaluation of foo(bar(x++), baz(++x)) is undefined. Correct! However printf, being an ellipsis function has a somewhat more definite order of evaluation.
The printf in the standard library, in fact, has no information about the number of arguments that has been sent to. It just tries to figure out the number of variables from the string placeholders; namely, from the number of percent operators (%) in the string. The C compiler starts pushing arguments from the very right towards left; and the address of the string is passed as a last argument. Having no exact information about the number of arguments, printf evaluates the last address (the string) and starts replacing the %'s (from left to right) with values of the corresponding addresses in the stack. That is, for a printf like below;
{
int x = 0;
printf("%d %d %f\n", foo(x), bar(x++), baz(++x));
}
The order of evaluation is:
x is incremented by 1,
function baz is called with x = 1; return value is pushed to the stack,
function bar is called with x = 1; return value is pushed to the stack,
x is incremented by 1,
function foo is called with x = 2; return value is pushed to the stack,
the string address is pushed to the stack.
Now, printf has no information about the number of arguments that has been sent to. Moreover, if -Wall is not issued in compilation, the compiler will not even complain about the inconsistent number of arguments. That is; the string may contain 3 %'s but the number of arguments in the printf line can be 1, 2, 3, 4, 5 or even can contain just the string itself without any arguments at all.
Say, the string has 3 placeholders and you've send 5 arguments (printf("%d %f %s\n", k1, k2, k3, k4, k5)). If compilation warning is turned off, the compiler will not complain about excessive number of arguments (or insufficient number of placeholders). Knowing the stack address, printf will;
treat the first integer width in the stack and print it as an integer (without knowing whether it is an integer or not),
treat the next double width in the stack and print it as a double (without knowing whether it is a double or not),
treat the next pointer width in the stack as a string address and will try to print the string at that address, until it finds a string terminator (provided, the address pointed to belong to that process; otherwise raises segmentation error).
and ignore the remaining arguments.
Related
I've looked at a bunch of questions regarding sequence points, and haven't been able to figure out if the order of evaluation for x*f(x) is guaranteed if f modifies x, and is this different for f(x)*x.
Consider this code:
#include <iostream>
int fx(int &x) {
x = x + 1;
return x;
}
int f1(int &x) {
return fx(x)*x; // Line A
}
int f2(int &x) {
return x*fx(x); // Line B
}
int main(void) {
int a = 6, b = 6;
std::cout << f1(a) << " " << f2(b) << std::endl;
}
This prints 49 42 on g++ 4.8.4 (Ubuntu 14.04).
I'm wondering whether this is guaranteed behavior or unspecified.
Specifically, in this program, fx gets called twice, with x=6 both times, and returns 7 both times. The difference is that Line A computes 7*7 (taking the value of x after fx returns) while Line B computes 6*7 (taking the value of x before fx returns).
Is this guaranteed behavior? If yes, what part of the standard specifies this?
Also: If I change all the functions to use int *x instead of int &x and make corresponding changes to places they're called from, I get C code which has the same issues. Is the answer any different for C?
In terms of evaluation sequence, it is easier to think of x*f(x) as if it was:
operator*(x, f(x));
so that there are no mathematical preconceptions on how multiplication is supposed to work.
As #dan04 helpfully pointed out, the standard says:
Section 1.9.15: “Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.”
This means that the compiler is free to evaluate these arguments in any order, the sequence point being operator* call. The only guarantee is that before the operator* is called, both arguments have to be evaluated.
In your example, conceptually, you could be certain that at least one of the arguments will be 7, but you cannot be certain that both of them will. To me, this would be enough to label this behaviour as undefined; however, #user2079303 answer explains well why it is not technically the case.
Regardless of whether the behaviour is undefined or indeterminate, you cannot use such an expression in a well-behaved program.
The evaluation order of arguments is not specified by the standard, so the behaviour that you see is not guaranteed.
Since you mention sequence points, I'll consider the c++03 standard which uses that term while the later standards have changed wording and abandoned the term.
ISO/IEC 14882:2003(E) §5 /4:
Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified...
There is also discussion on whether this is undefined behaviour or is the order merely unspecified. The rest of that paragraph sheds some light (or doubt) on that.
ISO/IEC 14882:2003(E) §5 /4:
... Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
x is indeed modified in f and it's value is read as an operand in the same expression where f is called. And it's not specified whether x reads the modified or non-modified value. That might scream Undefined Behaviour! to you, but hold your horses, because the standard also states:
ISO/IEC 14882:2003(E) §1.9 /17:
... When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body. There is also a sequence point after the copying of a returned value and before the execution of any expressions outside the function 11) ...
So, if f(x) is evaluated first, then there is a sequence point after copying the returned value. So the above rule about UB does not apply because the read of x is not between the next and previous sequence point. The x operand will have the modified value.
If x is evaluated first, then there is a sequence point after evaluating the arguments of f(x) Again, the rule about UB does not apply. In this case x operand will have the non-modified value.
In summary, the order is unspecified but there is no undefined behaviour. It's a bug, but the outcome is predictable to some degree. The behaviour is the same in the later standards, even though the wording changed. I'll not delve into those since it's already covered well in other good answers.
Since you ask about similar situation in C
C89 (draft) 3.3/3:
Except as indicated by the syntax 27 or otherwise specified later (for the function-call operator () , && , || , ?: , and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
The function call exception is already mentioned here. Following is the paragraph that implies the undefined behaviour if there were no sequence points:
C89 (draft) 3.3/2:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.26
And here are the sequence points defined:
C89 (draft) A.2
The following are the sequence points described in 2.1.2.3
The call to a function, after the arguments have been evaluated (3.3.2.2).
...
... the expression in a return statement (3.6.6.4).
The conclusions are the same as in C++.
A quick note on something I don't see covered explicitly by the other answers:
if the order of evaluation for x*f(x) is guaranteed if f modifies x, and is this different for f(x)*x.
Consider, as in Maksim's answer
operator*(x, f(x));
now there are only two ways of evaluating both arguments before the call as required:
auto lhs = x; // or auto rhs = f(x);
auto rhs = f(x); // or auto lhs = x;
return lhs * rhs
So, when you ask
I'm wondering whether this is guaranteed behavior or unspecified.
the standard doesn't specify which of those two behaviours the compiler must choose, but it does specify those are the only valid behaviours.
So, it's neither guaranteed nor entirely unspecified.
Oh, and:
I've looked at a bunch of questions regarding sequence points, and haven't been able to figure out if the order of evaluation ...
sequence points are a used in the C language standard's treatment of this, but not in the C++ standard.
In the expression x * y, the terms x and y are unsequenced. This is one of the three possible sequencing relations, which are:
A sequenced-before B: A must be evaluated, with all side-effects complete, before B begins evaluationg
A and B indeterminately-sequenced: one of the two following cases is true: A is sequenced-before B, or B is sequenced-before A. It is unspecified which of those two cases holds.
A and B unsequenced: There is no sequencing relation defined between A and B.
It is important to note that these are pair-wise relations. We cannot say "x is unsequenced". We can only say that two operations are unsequenced with respect to each other.
Also important is that these relations are transitive; and the latter two relations are symmetric.
unspecified is a technical term which means that the Standard specifies a set number of possible results. This is different to undefined behaviour which means that the Standard does not cover the behaviour at all. See here for further reading.
Moving onto the code x * f(x). This is identical to f(x) * x, because as discussed above, x and f(x) are unsequenced, with respect to each other, in both cases.
Now we come to the point where several people seem to be coming unstuck. Evaluating the expression f(x) is unsequenced with respect to x. However, it does not follow that any statements inside the function body of f are also unsequenced with respect to x. In fact, there are sequencing relations surrounding any function call, and those relations cannot be ignored.
Here is the text from C++14:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ]
Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
with footnote:
In other words, function executions do not interleave with each other.
The bolded text clearly states that for the two expressions:
A: x = x + 1; inside f(x)
B: evaluating the first x in the expression x * f(x)
their relationship is: indeterminately sequenced.
The text regarding undefined behaviour and sequencing is:
If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, and they are not potentially concurrent (1.10), the behavior is undefined.
In this case, the relation is indeterminately sequenced, not unsequenced. So there is no undefined behaviour.
The result is instead unspecified according to whether x is sequenced before x = x + 1 or the other way around. So there are only two possible outcomes, 42 and 49.
In case anyone had qualms about the x in f(x), the following text applies:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function.
So the evaluation of that x is sequenced before x = x + 1. This is an example of an evlauation that falls under the case of "specifically sequenced before" in the bolded quote above.
Footnote: the behaviour was exactly the same in C++03, but the terminology was different. In C++03 we say that there is a sequence point upon entry and exit of every function call, therefore the write to x inside the function is separated from the read of x outside the function by at least one sequence point.
You need to distinguish:
a) Operator precedence and associativity, which controls the order in which the values of subexpressions are combined by their operators.
b) The sequence of subexpression evaluation. E.g. in the expression f(x)/g(x), the compiler can evaluate g(x) first and f(x) afterwards. Nonetheless, the resulting value must be computed by dividing respective sub-values in the right order, of course.
c) The sequence of side-effects of the subexpressions. Roughly speaking, for example, the compiler might, for sake of optimization, decide to write values to the affected variables only at the end of the expression or any other suitable place.
As a very rough approximation, you can say, that within a single expression, the order of evaluation (not associativity etc.) is more or less unspecified. If you need a specific order of evaluation, break down the expression into series of statements like this:
int a = f(x);
int b = g(x);
return a/b;
instead of
return f(x)/g(x);
For exact rules, see http://en.cppreference.com/w/cpp/language/eval_order
Order of evaluation of the operands of almost all C++ operators is
unspecified. The compiler can evaluate operands in any order, and may
choose another order when the same expression is evaluated again
As the order of evaluation is not always the same hence you may get unexpected results.
Order of evaluation
When I run this code, the output is 11, 10.
Why on earth would that be? Can someone give me an explanation of this that will hopefully enlighten me?
Thanks
#include <iostream>
using namespace std;
void print(int x, int y)
{
cout << x << endl;
cout << y << endl;
}
int main()
{
int x = 10;
print(x, x++);
}
The C++ standard states (A note in section 1.9.16):
Value computations and side effects associated with the different argument expressions are unsequenced.
In other words, it's undefined and/or compiler-dependent which order the arguments are evaluated in before their value is passed into the function. So on some compilers (which evaluate the left argument first) that code would output 10, 10 and on others (which evaluate the right argument first) it will output 11, 10. In general you should never rely on undefined behaviour.
To help you understand this, imagine that each argument expression is evaluated before the function is called like so (not that this is exactly how it actually works, it's just an easy way to think of it that will help you understand the sequencing):
int arg1 = x; // This line
int arg2 = x++; // And this line can be swapped.
print(arg1, arg2);
The C++ Standard says that the two argument expression are unsequenced. So, if we write out the argument expressions on separate lines like this, their order should not be significant, because the standard says they can be evaluated in any order. Some compilers might evaluate them in the order above, others might swap them:
int arg2 = x++; // And this line can be swapped.
int arg1 = x; // This line
print(arg1, arg2);
That makes it pretty obvious how arg2 can hold the value 10, while arg1 holds the value 11.
You should always avoid this undefined behaviour in your code.
On a whole the statement:
print(x, x++);
results in an Undefined Behavior. Once a program has an Undefined Behavior it ceases to be an valid C++ program and literally any behavior is possible.So it is pointless to find reasoning for such an program.
Why is this Undefined Behavior?
Lets evaluate the program step by step to the point where we can beyond any doubt prove that it causes Undefined Behavior.
The order of evaluation of arguments to a function is Unspecified[Ref 1].
Unspecified means that an implementation is allowed to implement this particular functionality as it desires and it is not required to document the detail about it.
Applying the above rule to your function call:
print(x, x++);
An implementation might evaluate this as:
Left to Right or
Right to Left or
Any Magical order(in case of more than two function arguments)
In short you cannot rely on an implementation to follow any specific order because it is not required to as per the C++ Standard.
In C/C++ you cannot read or write to a variable more than once without an intervening sequence point[Ref 2].If you do so it results in an Undefined Behavior.Irrespective of whether either of the arguments gets evaluated first in the said function, there is no sequence point between them,a sequence point exists only after evaluation of all function arguments[Ref 3].
In this case x is being accessed without an intervening sequence point and hence it results in an Undefined Behavior.
Simply put it is best to write any code which does not invoke such Undefined Behaviors because once you do so you cannot expect any specific behavior from such a program.
[Ref 1] C++03 Standard §5.2.2.8
Para 8:
[...] The order of evaluation of function arguments is unspecified. [...]
[Ref 2]C++03 5 Expressions [expr]:
Para 4:
....
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Ref 3]C++03 1.9 Program execution [intro.execution]:
Para 17:
When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body.
x++ is a function parameter and they may be evaluated in an unspecified order which means the behavior is undefined and not portable (or legal).
I believe this has to do with the function call stack where the last argument goes in first. So x++ is your y and x is the local x in print().
Late answer. Ignoring the issue of the order of evaluation, note that the C++ standard explains how post increment and post decrement operate: " Post-increment and post-decrement creates a copy of the object, increments or decrements the value of the object and returns the copy from before the increment or decrement."
https://en.cppreference.com/w/cpp/language/operator_incdec
As an example where the difference in outcome is significant, consider std::list::splice, such as:
mylist.splice(where, mylist, iter++);
This will move the node pointed by iter to just before the node pointed by where. The sequence will be make a copy of iter to be passed to splice, increment iter, then call splice using the copy of iter before it was incremented. After splice returns, iter will point to the next node after the node iter originally pointed to, as opposed to the next node after iter's new location in the list after it was moved.
In C and C++, is there a fixed order for evaluation of parameter to the function? I mean, what do the standards say? Is it left-to-right or right-to-left?
I am getting confusing information from the books.
Is it necessary that function call should be implemented using stack only? What do the C and C++ standards say about this?
C and C++ are two completely different languages; don't assume the same rules always apply to both. In the case of parameter evaluation order, however:
C99:
6.5.2.2 Function calls
...
10 The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
[Edit]
C11 (draft):
6.5.2.2 Function calls
...
10 There is a sequence point after the evaluations of the function designator and the actual
arguments but before the actual call. Every evaluation in the calling function (including
other function calls) that is not otherwise specifically sequenced before or after the
execution of the body of the called function is indeterminately sequenced with respect to
the execution of the called function.94)
...
94) In other words, function executions do not ‘‘interleave’’ with each other.
C++:
5.2.2 Function call
...
8 The order of evaluation of arguments is unspecified. All side effects of argument expression evaluations take effect
before the function is entered. The order of evaluation of the postfix expression and the argument expression list is
unspecified.
Neither standard mandates the use of the hardware stack for passing function parameters; that's an implementation detail. The C++ standard uses the term "unwinding the stack" to describe calling destructors for automatically created objects on the path from a try block to a throw-expression, but that's it. Most popular architectures do pass parameters via a hardware stack, but it's not universal.
[Edit]
I am getting confusing information from the books.
This is not in the least surprising, since easily 90% of books written about C are simply crap.
While the language standard isn't a great resource for learning either C or C++, it's good to have handy for questions like this. The official™ standards documents cost real money, but there are drafts that are freely available online, and should be good enough for most purposes.
The latest C99 draft (with updates since original publication) is available here. The latest pre-publication C11 draft (which was officially ratified last year) is available here. And a publicly availble draft of the C++ language is available here, although it has an explicit disclaimer that some of the information is incomplete or incorrect.
Keeping it safe: the standard leaves it up to the compiler to determine the order in which arguments are evaluated. So you shouldn't rely on a specific order being kept.
In C/C++ is there a fixed order for evaluation of parameter to the function. I mean what does standards says is it left-to-right or right-to-left . I am getting confusing information from the books.
No, the order of evaluation of function parameters (and of two sub-expressions in any expression) is unspecified behaviour in C and C++. In plain English that means that the left-most parameter could be evaluated first, or it could be the right-most one, and you cannot know which order that applies for a particular compiler.
Example:
static int x = 0;
int* func (int val)
{
x = val;
return &x;
}
void print (int val1, int val2)
{
cout << val1 << " " << val2 << endl;
}
print(*func(1), *func(2));
This code is very bad. It relies of order of evaluation of print's parameters. It will print either "1 1" (right-to-left) or "2 2" (left-to-right) and we cannot know which. The only thing guaranteed by the standard is that both calls to func() are completed before the call to print().
The solution to this is to be aware that the order is unspecified, and write programs that don't rely on the order of evaluation. For example:
int val1 = *func(1);
int val2 = *func(2);
print(val1, val2); // Will always print "1 2" on any compiler.
Is it necessary that function call should be implemented using stack only. what does C/C++ standards says about this.
This is known as "calling convention" and nothing that the standard specifies at all. How parameters (and return values) are passed, is entirely up to the implementation. They could be passed in CPU registers or on the stack, or in some other way. The caller could be the one responsible for pushing/popping parameters on the stack, or the function could be responsible.
The order of evaluation of function parameters is only somewhat associated with the calling convention, since the evaluation occurs before the function is called. But on the other hand, certain compilers can choose to put the right-most parameter in a CPU register and the rest of them on the stack, as one example.
Just only to speak for C language, the order of evaluation inside the function parameters depend on compiler. from The C Programming Language of Brian Kernighan and Dennis Ritchie;
Similarly, the order in which function arguments are evaluated is not
specified, so the statement
printf("%d %d\n", ++n, power(2, n)); /*WRONG */
can produce different results with different compilers,
depending on whether n is incremented before power is called. The
solution, of course, is to write
++n;
printf("%d %d\n", n, power(2, n));
As far as I know, the function printf has an exception here.
For an ordinary function foo, the order of evaluation of foo(bar(x++), baz(++x)) is undefined. Correct! However printf, being an ellipsis function has a somewhat more definite order of evaluation.
The printf in the standard library, in fact, has no information about the number of arguments that has been sent to. It just tries to figure out the number of variables from the string placeholders; namely, from the number of percent operators (%) in the string. The C compiler starts pushing arguments from the very right towards left; and the address of the string is passed as a last argument. Having no exact information about the number of arguments, printf evaluates the last address (the string) and starts replacing the %'s (from left to right) with values of the corresponding addresses in the stack. That is, for a printf like below;
{
int x = 0;
printf("%d %d %f\n", foo(x), bar(x++), baz(++x));
}
The order of evaluation is:
x is incremented by 1,
function baz is called with x = 1; return value is pushed to the stack,
function bar is called with x = 1; return value is pushed to the stack,
x is incremented by 1,
function foo is called with x = 2; return value is pushed to the stack,
the string address is pushed to the stack.
Now, printf has no information about the number of arguments that has been sent to. Moreover, if -Wall is not issued in compilation, the compiler will not even complain about the inconsistent number of arguments. That is; the string may contain 3 %'s but the number of arguments in the printf line can be 1, 2, 3, 4, 5 or even can contain just the string itself without any arguments at all.
Say, the string has 3 placeholders and you've send 5 arguments (printf("%d %f %s\n", k1, k2, k3, k4, k5)). If compilation warning is turned off, the compiler will not complain about excessive number of arguments (or insufficient number of placeholders). Knowing the stack address, printf will;
treat the first integer width in the stack and print it as an integer (without knowing whether it is an integer or not),
treat the next double width in the stack and print it as a double (without knowing whether it is a double or not),
treat the next pointer width in the stack as a string address and will try to print the string at that address, until it finds a string terminator (provided, the address pointed to belong to that process; otherwise raises segmentation error).
and ignore the remaining arguments.
I think it should be 01 but someone says its "undefined", any reason for that?
c++ is both an increment and an assignment. When the assignment occurs (before or after other code on that line) is left up to the discretion of the compiler. It can occur after the cout << or before.
This can be found in the C99 standard
http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
You can find it on page 28 in the pdf or section 5.1.2.3
the actual increment of p can occur at any time between the previous sequence point and the next sequence point
Since someone asked for the C++ standard (as this is a C++ question) it can be found in section 1.9.15 page 10 (or 24 in pdf format)
evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced
It also includes the following code block:
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
I feel that the C99 standard's explanation is clearer, but it is true in both languages.
It is undefined behavior if you modify a value and then read it (or try to modify it again) without an intervening sequence point. The concept of a sequence point in C++ is a bit technical (you can read a little about it here), but the bottom line is that stream insertion (<<) is not a sequence point.
The reason why this is undefined behavior is because, in the absence of a sequence point, the compiler is allowed to re-order operations in any way it sees fit. That is, it is permitted to retrieve the value of c (and hold onto it for the second insertion) and then afterwords execute c++ to get the value for the first insertion. So you can't be sure whether the increment will occur before or after the value of c for the second insertion is determined.
The reason it is undefined is that the compiler is free to calculate function parameters in any order. Consider if you where calling a function (because you are, but it's easier to envision when it's in function syntax):
cout.output(c++).output(c);
The compiler may hit the parameters in reverse order, forward order, or whatever. It may call the first output before calculating the parameter to the second output or it may do both and then call.
The behavior is defined but unspecified. The relative order of evaluating the two uses of 'c' in the expression isn't specified. However, if you convert it to functional notation, it looks like this:
cout.operator<<(c++).operator<<(c);
There's a sequence point between evaluating the arguments to a function, and executing the body of the function, and function bodies aren't interleaved, so the result is only unspecified, not undefined behavior.
If you didn't have an overloaded operator:
int c=0;
int a = c++ << c;
Then the behavior would be undefined, because both modifying and using the value of c without an intervening sequence point.
Edit: The sequence litb brings up is simply wrong. The standard specifies (§1.9/17): "When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body."
This clearly written with the idea that arguments are evaluated, then (immediately afterward) the body of the function is executed. The sequence he suggests, in which arguments to one function are evaluated, then arguments to another, then execution of both function bodies doesn't seem to have been intended, but also isn't prohibited. That, however, changes nothing -- the requirement is still that: "...there is a sequence point after the evaluation of all function arguments (if any)..."
The subsequent language about execution of the body does NOT remove the requirement for a sequence point after evaluating all function arguments. All other evaluation, whether of the function body or other function arguments follows that sequence point. I can be as pedantic and perverse as anybody about mis-reading what's clearly intended (but not quite stated) -- but I can't imagine how "there is a sequence point after the evaluation of all function arguments" can be read as meaning "there is NOT a sequence point after the evaluation of all function arguments."
Neil's point is, of course, correct: the syntax I've used above is for member functions. For a non-member overload, the syntax would be more like:
operator<<(operator<<(cout,c++), c);
This doesn't remove the requirement for sequence points either though.
As far as it being unspecified: it's pretty simple really: there's a sequence point after evaluating all function arguments, so all the arguments for one function call must be fully evaluated (including all side effects), then arguments for the other function call can be evaluated (taking into account any side effects from the other) -- BUT there's no requirement about WHICH function call's arguments must be evaluated first or second, so it could be c, then c++, or it could be c++, then c -- but it has to be one or the other, not an interleaving.
As I see it,
f(c++);
is equivalent to:
f(c); c += 1;
And
f(c++,c++);
is equivalent to:
f(c,c); c += 1; c += 1;
But it may be the case that
f(c++,c++);
becomes
f(c,c+1); c+= 2;
An experiment with gcc and clang, first in C
#include <stdio.h>
void f(int a, int b) {
printf("%d %d\n",a,b);
}
int main(int argc, char **argv) {
int c = 0;
f(c++,c++);
return 0;
}
and in C++
#include <iostream>
int main(int argc, char **argv) {
int c = 0;
std::cout << c++ << " " << c++ << std::endl;
return 0;
}
Is interesting, as gcc and g++ compiled results in
1 0
whereas clang compiled results in
0 1
There's been some debate going on in this question about whether the following code is legal C++:
std::list<item*>::iterator i = items.begin();
while (i != items.end())
{
bool isActive = (*i)->update();
if (!isActive)
{
items.erase(i++); // *** Is this undefined behavior? ***
}
else
{
other_code_involving(*i);
++i;
}
}
The problem here is that erase() will invalidate the iterator in question. If that happens before i++ is evaluated, then incrementing i like that is technically undefined behavior, even if it appears to work with a particular compiler. One side of the debate says that all function arguments are fully evaluated before the function is called. The other side says, "the only guarantees are that i++ will happen before the next statement and after i++ is used. Whether that is before erase(i++) is invoked or afterwards is compiler dependent."
I opened this question to hopefully settle that debate.
Quoth the C++ standard 1.9.16:
When calling a function (whether or
not the function is inline), every
value computation and side effect
associated with any argument
expression, or with the postfix
expression designating the called
function, is sequenced before
execution of every expression or
statement in the body of the called
function. (Note: Value computations
and side effects associated with the
different argument expressions are
unsequenced.)
So it would seem to me that this code:
foo(i++);
is perfectly legal. It will increment i and then call foo with the previous value of i. However, this code:
foo(i++, i++);
yields undefined behavior because paragraph 1.9.16 also says:
If a side effect on a scalar object is
unsequenced relative to either another
side effect on the same scalar object
or a value computation using the value
of the same scalar object, the
behavior is undefined.
To build on Kristo's answer,
foo(i++, i++);
yields undefined behavior because the order that function arguments are evaluated is undefined (and in the more general case because if you read a variable twice in an expression where you also write it, the result is undefined). You don't know which argument will be incremented first.
int i = 1;
foo(i++, i++);
might result in a function call of
foo(2, 1);
or
foo(1, 2);
or even
foo(1, 1);
Run the following to see what happens on your platform:
#include <iostream>
using namespace std;
void foo(int a, int b)
{
cout << "a: " << a << endl;
cout << "b: " << b << endl;
}
int main()
{
int i = 1;
foo(i++, i++);
}
On my machine I get
$ ./a.out
a: 2
b: 1
every time, but this code is not portable, so I would expect to see different results with different compilers.
The standard says the side effect happens before the call, so the code is the same as:
std::list<item*>::iterator i_before = i;
i = i_before + 1;
items.erase(i_before);
rather than being:
std::list<item*>::iterator i_before = i;
items.erase(i);
i = i_before + 1;
So it is safe in this case, because list.erase() specifically doesn't invalidate any iterators other than the one erased.
That said, it's bad style - the erase function for all containers returns the next iterator specifically so you don't have to worry about invalidating iterators due to reallocation, so the idiomatic code:
i = items.erase(i);
will be safe for lists, and will also be safe for vectors, deques and any other sequence container should you want to change your storage.
You also wouldn't get the original code to compile without warnings - you'd have to write
(void)items.erase(i++);
to avoid a warning about an unused return, which would be a big clue that you're doing something odd.
It's perfectly OK.
The value passed would be the value of "i" before the increment.
++Kristo!
The C++ standard 1.9.16 makes a lot of sense with respect to how one implements operator++(postfix) for a class. When that operator++(int) method is called, it increments itself and returns a copy of the original value. Exactly as the C++ spec says.
It's nice to see standards improving!
However, I distinctly remember using older (pre-ANSI) C compilers wherein:
foo -> bar(i++) -> charlie(i++);
Did not do what you think! Instead it compiled equivalent to:
foo -> bar(i) -> charlie(i); ++i; ++i;
And this behavior was compiler-implementation dependent. (Making porting fun.)
It's easy enough to test and verify that modern compilers now behave correctly:
#define SHOW(S,X) cout << S << ": " # X " = " << (X) << endl
struct Foo
{
Foo & bar(const char * theString, int theI)
{ SHOW(theString, theI); return *this; }
};
int
main()
{
Foo f;
int i = 0;
f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
SHOW("END ",i);
}
Responding to comment in thread...
...And building on pretty much EVERYONE's answers... (Thanks guys!)
I think we need spell this out a bit better:
Given:
baz(g(),h());
Then we don't know whether g() will be invoked before or after h(). It is "unspecified".
But we do know that both g() and h() will be invoked before baz().
Given:
bar(i++,i++);
Again, we don't know which i++ will be evaluated first, and perhaps not even whether i will be incremented once or twice before bar() is called. The results are undefined! (Given i=0, this could be bar(0,0) or bar(1,0) or bar(0,1) or something really weird!)
Given:
foo(i++);
We now know that i will be incremented before foo() is invoked. As Kristo pointed out from the C++ standard section 1.9.16:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. -- end note ]
Though I think section 5.2.6 says it better:
The value of a postfix ++ expression is the value of its operand. [ Note: the value obtained is a copy of the original value -- end note ] The operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic type or a pointer to a complete effective object type. The value of the operand object is modified by adding 1 to it, unless the object is of type bool, in which case it is set to true. [ Note: this use is deprecated, see Annex D. -- end note ] The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. -- end note ] The result is an rvalue. The type of the result is the cv-unqualified version of the type of the operand. See also 5.7 and 5.17.
The standard, in section 1.9.16, also lists (as part of its examples):
i = 7, i++, i++; // i becomes 9 (valid)
f(i = -1, i = -1); // the behavior is undefined
And we can trivially demonstrate this with:
#define SHOW(X) cout << # X " = " << (X) << endl
int i = 0; /* Yes, it's global! */
void foo(int theI) { SHOW(theI); SHOW(i); }
int main() { foo(i++); }
So, yes, i is incremented before foo() is invoked.
All this makes a lot of sense from the perspective of:
class Foo
{
public:
Foo operator++(int) {...} /* Postfix variant */
}
int main() { Foo f; delta( f++ ); }
Here Foo::operator++(int) must be invoked prior to delta(). And the increment operation must be completed during that invocation.
In my (perhaps overly complex) example:
f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
f.bar("A",i) must be executed to obtain the object used for object.bar("B",i++), and so on for "C" and "D".
So we know that i++ increments i prior to calling bar("B",i++) (even though bar("B",...) is invoked with the old value of i), and therefore i is incremented prior to bar("C",i) and bar("D",i).
Getting back to j_random_hacker's comment:
j_random_hacker writes: +1, but I had to read the standard carefully to convince myself that this was OK. Am I right in thinking that, if bar() was instead a global function returning say int, f was an int, and those invocations were connected by say "^" instead of ".", then any of A, C and D could report "0"?
This question is a lot more complicated than you might think...
Rewriting your question as code...
int bar(const char * theString, int theI) { SHOW(...); return i; }
bar("A",i) ^ bar("B",i++) ^ bar("C",i) ^ bar("D",i);
Now we have only ONE expression. According to the standard (section 1.9, page 8, pdf page 20):
Note: operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.(7) For example, in the following fragment: a=a+32760+b+5; the expression statement behaves exactly the same as: a=(((a+32760)+b)+5); due to the associativity and precedence of these operators. Thus, the result of the sum (a+32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an exception and in which the range of values representable by an int is [-32768,+32767], the implementation cannot rewrite this expression as a=((a+b)+32765); since if the values for a and b were, respectively, -32754 and -15, the sum a+b would produce an exception while the original expression would not; nor can the expression be rewritten either as a=((a+32765)+b); or a=(a+(b+32765)); since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However on a machine in which overflows do not produce an exception and in which the results of overflows are reversible, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur. -- end note ]
So we might think that, due to precedence, that our expression would be the same as:
(
(
( bar("A",i) ^ bar("B",i++)
)
^ bar("C",i)
)
^ bar("D",i)
);
But, because (a^b)^c==a^(b^c) without any possible overflow situations, it could be rewritten in any order...
But, because bar() is being invoked, and could hypothetically involve side effects, this expression cannot be rewritten in just any order. Rules of precedence still apply.
Which nicely determines the order of evaluation of the bar()'s.
Now, when does that i+=1 occur? Well it still has to occur before bar("B",...) is invoked. (Even though bar("B",....) is invoked with the old value.)
So it's deterministically occurring before bar(C) and bar(D), and after bar(A).
Answer: NO. We will always have "A=0, B=0, C=1, D=1", if the compiler is standards-compliant.
But consider another problem:
i = 0;
int & j = i;
R = i ^ i++ ^ j;
What is the value of R?
If the i+=1 occurred before j, we'd have 0^0^1=1. But if the i+=1 occurred after the whole expression, we'd have 0^0^0=0.
In fact, R is zero. The i+=1 does not occur until after the expression has been evaluated.
Which I reckon is why:
i = 7, i++, i++; // i becomes 9 (valid)
Is legal... It has three expressions:
i = 7
i++
i++
And in each case, the value of i is changed at the conclusion of each expression. (Before any subsequent expressions are evaluated.)
PS: Consider:
int foo(int theI) { SHOW(theI); SHOW(i); return theI; }
i = 0;
int & j = i;
R = i ^ i++ ^ foo(j);
In this case, i+=1 has to be evaluated before foo(j). theI is 1. And R is 0^0^1=1.
To build on MarkusQ's answer: ;)
Or rather, Bill's comment to it:
(Edit: Aw, the comment is gone again... Oh well)
They're allowed to be evaluated in parallel. Whether or not it happens in practice is technically speaking irrelevant.
You don't need thread parallelism for this to occur though, just evaluate the first step of both (take the value of i) before the second (increment i). Perfectly legal, and some compilers may consider it more efficient than fully evaluating one i++ before starting on the second.
In fact, I'd expect it to be a common optimization. Look at it from an instruction scheduling point of view. You have the following you need to evaluate:
Take the value of i for the right argument
Increment i in the right argument
Take the value of i for the left argument
Increment i in the left argument
But there's really no dependency between the left and the right argument. Argument evaluation happens in an unspecified order, and need not be done sequentially either (which is why new() in function arguments is usually a memory leak, even when wrapped in a smart pointer)
It's also undefined what happens when you modify the same variable twice in the same expression.
We do have a dependency between 1 and 2, however, and between 3 and 4.
So why would the compiler wait for 2 to complete before computing 3? That introduces added latency, and it'll take even longer than necessary before 4 becomes available.
Assuming there's a 1 cycle latency between each, it'll take 3 cycles from 1 is complete until the result of 4 is ready and we can call the function.
But if we reorder them and evaluate in the order 1, 3, 2, 4, we can do it in 2 cycles. 1 and 3 can be started in the same cycle (or even merged into one instruction, since it's the same expression), and in the following, 2 and 4 can be evaluated.
All modern CPU's can execute 3-4 instructions per cycle, and a good compiler should try to exploit that.
Sutter's Guru of the Week #55 (and the corresponding piece in "More Exceptional C++") discusses this exact case as an example.
According to him, it is perfectly valid code, and in fact a case where trying to transform the statement into two lines:
items.erase(i);
i++;
does not produce code that is semantically equivalent to the original statement.
To build on Bill the Lizard's answer:
int i = 1;
foo(i++, i++);
might also result in a function call of
foo(1, 1);
(meaning that the actuals are evaluated in parallel, and then the postops are applied).
-- MarkusQ