#include <iostream>
using namespace std;
int main()
{
int arr[3] = { 10, 20, 30 };
cout << arr[-2] << endl;
cout << -2[arr] << endl;
return 0;
}
Output:
4196160
-30
Here arr[-2] is out of range and invalid, causing undefined behavior.
But -2[arr] evaluates to -30. Why?
Isn't arr[-2] equivalent to -2[arr]?
-2[arr] is parsed as -(2[arr]). In C (and in C++, ignoring overloading), the definition of X[Y] is *(X+Y) (see more discussion of this in this question), which means that 2[arr] is equal to arr[2].
The compiler parses this expression
-2
like
unary_minus decimal_integer_literal
That is definitions of integer literals do not include signs.
In turn the expression
2[arr]
is parsed by the compiler as a postfix expression.
Postfix expressions have higher precedence than unary expressions. Thus this expression
-2[arr]
is equivalent to
- ( 2[arr] )
So the unary minus is applied to the lvalue returned by the postfix expression 2[arr].
On the other hand if you wrote
int n = -2;
and then
n[arr]
then this expression would be equivalent to
arr[-2]
-2[arr] is equivalent to -(2[arr]), which is equivalent to -arr[2]. However, (-2)[arr] is equivalent to arr[-2].
This is because E1[E2] is identical to (*((E1)+(E2)))
The underlying problem is with operator precedence. In C++ the [], ie the Subscript operator hold more precedence (somewhat akin to preferance) than the - unary_minus operator.
So when one writes,
arr[-2]
The compiler first executes arr[] then - , but the unary_minus is enclosed within the bounds of the [-2] so the expression is decomposed together.
In the,
-2[arr]
The same thing happens but, the compiler executes 2[] first the n the - operator so it ends up being
-(2[arr]) not (-2)[arr]
Your understanding of the concept that,
arr[i] i[arr] and *(i+arr) are all the same is correct. They are all equivalent expressions.
If you want to write in that way, write it as (-2)[arr]. You will get the same value for sure.
Check this out for future referance :http://en.cppreference.com/w/cpp/language/operator_precedence
Related
Does spaces have any meaning in these expressions:
assume:
int a = 1;
int b = 2;
1)
int c = a++ +b;
Or,
2)
int c = a+ ++b;
When I run these two in visual studio, I get different results. Is that the correct behavior, and what does the spec says?
In general, what should be evaluated first, post-increment or pre-increment?
Edit: I should say that
c =a+++b;
Does not compile on visual studio. But I think it should. The postfix++ seems to be evaluated first.
Is that the correct behavior
Yes, it is.
Postfix ++ first returns the current value, then increments it. so int c = a++ +b means compute the value of c as the sum between current a(take the current a value, and only after taking it, increment a) and b;
Prefix ++ first increments the current value, then returns the value already incremented, so in this case, int c = a+ ++b means compute c as the sum between a and the return of the next expression, ++b, which means b is first incremented, then returned.
In general, what should be evaluated first, post-increment or
pre-increment?
In this example, it is not about which gets evaluated first, it is about what each does - postfix first returns the value, then increments it; prefix first increments the value, then returns it.
Hth
Maybe it helps to understand the general architecture of how programs are parsed.
In a nutshell, there are two stages to parsing a program (C++ or others): lexer and parser.
The lexer takes the text input and maps it to a sequence of symbols. This is when spaces are handled because they tell where the symbols are. Spaces really matter at some places (like between int and c, to not confuse with the symbol intc) but not others (like between a and ++ because there is no ambiguity to separate them).
The first example:
int c = a++ +b;
gives the following symbols, each on its own row (implementations may do this in slightly different ways of course):
int
c
=
a
++
+
b
;
While in the other case:
int c = a+ ++b;
the symbols are instead:
int
c
=
a
+
++
b
;
The parser then builds a tree (Abstract Syntax Tree, AST) out of the symbols and according to some grammar. In particular, according to the C++ grammar, + as an addition has a lower precedence than the unary ++ operator (regardless of postfix or prefix). This means that the first example is semantically the same as (a++) + b while the second is like a+ (++b).
For your examples, the ASTs will be different, because the spaces already lead to a different output at the lexer phase.
Note that spaces are not required between ++ and +, so a+++b would theoretically be fine, but this is not recommended for readability. So, some spaces are important for technical reasons while others are important for us users to read the code.
Yes they should be different; the behaviour is correct.
There are a few possible sources for your confusion.
This question is not about "spaces in operators". You have different operators. If you were to remove the space, you would have a different question. See What is i+++ increment in c++
It's also not about "what should be evaluated first, post-increment or pre-increment". It's about understanding the difference between post-increment and pre-increment.
Both increment the variable to which they apply.
But the post-increment expression returns the value from before the increment.
Whereas the pre-increment expression returns the value after the increment.
I.e.
//Given:
int a = 1;
int b = 2;
//Post-increment
int c = a++ +b; =>
1 + 2; (and a == 2) =>
3;
//Pre-increment
int c = a+ ++b; =>
1 + 3; (and b == 3) =>
4;
Another thing that might be causing confusion. You wrote: a++ +b;. And you may be assuming that +b is the unary + operator. This would be an incorrect assumption because you have both left and right operands making that + a binary additive operator (as in x + y).
Final possible confusion. You may be wondering why:
in a++ +b the ++ is a post-increment operator applied to a.
whereas in a+ ++b it's a pre-increment operator applied to b.
The reason is that ++ has higher precedence than the binary additive +. And in both cases it would be impossible to apply ++ to +.
I got the this:
int main(){
int Array[] = { 10, 20, 30 };
cout << -2[Array] << endl;
system("Pause");
return 0;
}
The output is:
-30
I want to know why the output is -30 and why causes this undefined behavior?
does anyone knows?
-2[Array] is parsed as -(2[Array]), since subscripting has higher precedence than unary minus.
Now, 2[Array] is just a weird way to write Array[2], so you get -Array[2], i.e. -30. No undefined behavior is involved in the whole expression.
This is fairly simple.
First, let's analyse the expression:
-2[Array]
is
-(2[Array])
Now a[b] is *(a+b) and since addition is commutative this is also *(b+a) i.e. Array[2].
Array[2] is 30; -Array[2] is -30. Thus, -2[Array] is also -30.
I sincerely hope you do not intend to use this in production code.
In C++, the following statement is true:
a[5] == 5[a]
This is because the syntax using [] is converted to:
*(a + 5)
E.g.
a[5] == *(a + 5)
Which means that
5[a] == *(5 + a)
Thus the notation -2[Array] is converted to - *(2 + Array).
It applies the unary operator (-, minus sign) to pointer incremented by the value outside of [] thus: - *(2 + Array). You can check it out by removing the minus sign, thus applying + unary operator. I would NOT recommend using this syntax.
Why are they called "primary"? In the order of evaluence they are the first?
C++03 standard defines the expression in chapter 5 (Note 1):
An expression is a sequence of operators and operands that specifies a computation.
Then the 5.1 "Primary expressions" defines the list of primary expressions:
(1) primary-expression:
literal
this
( expression )
id-expression
My main question is in connection the third point:
( expression )
So, according to the standard, every expression with brackets are primary expressions and they are calculated firstly. It looks logical, and gives an exact explanation of the behavior of brackets in C++ expressions (precedence).
So this means that for example
(variable + 10)
is a primary expression.
var = (variable + 10) * 3
and according to my theory, it looks logic, BUT from other sources I know
(variable + 10)
is NOT a primary expression, but WHY? I don't understand, however the standard defines the (expression) as a primary expression.
Please, help me because I can't. Thank you very much, and sorry for my bad English.
Hi.
C++ expressions can be complex, which is to say they can be made up of nested expressions, combined through the use of operators, and those nested expressions may in turn be complex.
If you decompose a complex expression into ever smaller units, at some point you'll be left with units that are atomic in the sense that they cannot be decomposed further. Those are primary expressions; they include identifiers, literals, the keyword this, and lambda expressions.
However, it is true that there is one non-atomic construct that the C++ Standard defines as primary: Expressions enclosed in round brackets (aka parentheses). So the (variable + 10) example you give is a primary expression (and so are the sub-expressions variable (which is an identifier), and 10 (which is a literal).
I believe the Standard lists them as primary expressions because they play the some role as truly atomic expressions when it comes to the order of evaluation: Anything within the brackets must be evaluated before the value of the backeted expressions can enter into evaluations with other expressions: In (5+10)*a, the value of 5+10 must be evaluated before it can enter into the evaluation of *a. [Note that this does not mean 5+10 is evaluated before the expression a is evaluated. It only means that 5+10 must be evaluated before the multiplication itself can be evaluated.]
So, bracketed sub-expressions, in this sense, act as if they were atomic.
And I guess this is why the Standard doesn't use the term "atomic expressions" for this concept. They act as if they were atomic, but at least the bracketed variety is not actually atomic. "Primary" seems, to me, to be a good choice of words.
The term primary-expression is an element of the language grammar. They could equally have been called foobar-expression , it's just a name that doesn't have any deeper meaning.
A primary-expression is not necessarily atomic, evaluated first, top-level, more important than other expressions , or anything like that . And all expressions are "building blocks" because any expression can be added to to form a larger expression.
The definition was already given in the question so I won't repeat it here.
(variable + 10) is a primary-expression, your "other sources" are wrong. Here is another example of primary-expression:
([]{
int n, t1 = 0, t2 = 1, nextTerm = 0;
cout << "Enter the number of terms: ";
cin >> n;
cout << "Fibonacci Series: ";
for (int i = 1; i <= n; ++i)
{
// Prints the first two terms.
if(i == 1)
{
cout << " " << t1;
continue;
}
if(i == 2)
{
cout << t2 << " ";
continue;
}
nextTerm = t1 + t2;
t1 = t2;
t2 = nextTerm;
cout << nextTerm << " ";
}
return 0;
}())
(this calls a lambda function and parenthesizes the result; code borrowed from : here).
Also, the question flirts with a common misconception about precedence and order of evaluation. The standard doesn't mention "precedence" at all. A precedence table is a way of presenting the rules of the the language grammar in a way that is easier to read.
It refers to the way that operands are grouped with operators, not the order in which subexpressions are executed. In the case of f() + ([]{int n, t1 = 0, t2 = 1, nextTerm = 0; cout << "Enter the number of terms: ";.... etc. etc. , the f() may or may not be called before the lambda is called. The parentheses around the lambda do not cause it to be evaluated first.
So I know that C++ has an Operator Precedence and that
int x = ++i + i++;
is undefined because pre++ and post++ are at the same level and thus there is no way to tell which one will get calculated first. But what I was wondering is if
int i = 1/2/3;
is undefined. The reason I ask is because there are multiple ways to look at that (1/2)/3 OR 1/(2/3).
My guess is that it is a undefined behavior but I would like to confirm it.
If you look at the C++ operator precedence and associativity, you'll see that the division operator is Left-to-right associative, which means this will be evaluated as (1/2)/3, since:
Operators that are in the same cell (there may be several rows of operators listed in a cell) are evaluated with the same precedence, in the given direction. For example, the expression a=b=c is parsed as a=(b=c), and not as (a=b)=c because of right-to-left associativity.
In your example the compiler is free to evaluate "1" "2" and "3" in any order it likes, and then apply the divisions left to right.
It's the same for the i++ + i++ example. It can evaluate the i++'s in any order and that's where the problem lies.
It's not that the function's precedence isn't defined, it's that the order of evaluation of its arguments is.
The first code snippet is undefined behaviour because variable i is being modified multiple times inbetween sequence points.
The second code snippet is defined behaviour and is equivalent to:
int i = (1 / 2) / 3;
as operator / has left-to-right associativity.
It is defined, it goes from left to right:
#include <iostream>
using namespace std;
int main (int argc, char *argv[]) {
int i = 16/2/2/2;
cout<<i<<endl;
return 0;
}
print "2" instead of 1 or 16.
It might be saying that it is undefined because you have chosen an int, which is the set of whole numbers.
Try a double or float which include fractions.
I've read that usually statements in c++ end with a semi-colon; so that might help explain what an expression statement would be. But then what would you call an expression by giving an example?
In this case, are both just statements or expression statements or expressions?
int x;
x = 0;
An expression is "a sequence of operators and operands that specifies a computation" (that's the definition given in the C++ standard). Examples are 42, 2 + 2, "hello, world", and func("argument"). Assignments are expressions in C++; so are function calls.
I don't see a definition for the term "statement", but basically it's a chunk of code that performs some action. Examples are compound statements (consisting of zero or more other statements included in { ... }), if statements, goto statements, return statements, and expression statements. (In C++, but not in C, declarations are classified as statements.)
The terms statement and expression are defined very precisely by the language grammar.
An expression statement is a particular kind of statement. It consists of an optional expression followed by a semicolon. The expression is evaluated and any result is discarded. Usually this is used when the statement has side effects (otherwise there's not much point), but you can have a expression statement where the expression has no side effects. Examples are:
x = 42; // the expression happens to be an assignment
func("argument");
42; // no side effects, allowed but not useful
; // a null statement
The null statement is a special case. (I'm not sure why it's treated that way; in my opinion it would make more sense for it to be a disinct kind of statement. But that's the way the standard defines it.)
Note that
return 42;
is a statement, but it's not an expression statement. It contains an expression, but the expression (plus the ;) doesn't make up the entire statement.
These are expressions (remember math?):
1
6 * 7
a + b * 3
sin(3) + 7
a > b
a ? 1 : 0
func()
mystring + gimmeAString() + std::string("\n")
The following are all statements:
int x; // Also a declaration.
x = 0; // Also an assignment.
if(expr) { /*...*/ } // This is why it's called an "if-statement".
for(expr; expr; expr) { /*...*/ } // For-loop.
A statement is usually made up of an expression:
if(a > b) // a > b is an expr.
while(true) // true is an expr.
func(); // func() is an expr.
To understand what is an expression statement, you should first know what is an expression and what is an statement.
An expression in a programming language is a combination of one or more explicit values, constants, variables, operators, and functions that the programming language interprets (according to its particular rules of precedence and of association) and computes to produce ("to return", in a stateful environment) another value. This process, as for mathematical expressions, is called evaluation.
Source: https://en.wikipedia.org/wiki/Expression_(computer_science)
In other words expressions are a sort of data items. They can have single or multiple entities like constants and variables. These entities may be related or connected to each other by operators. Expressions may or may not have side effects, in that they evaluate to something by means of computation which changes a state. For instance numbers, things that look like mathematical formulas and calculations, assignments, function calls, logical evaluations, strings and string operations are all considered expressions.
function calls: According to MSDN, function calls are considered expressions. A function call is an expression that passes control and arguments (if any) to a function and has the form:
expression (expression-list opt) which is invoked by the ( ) function operator.
source: https://msdn.microsoft.com/en-us/library/be6ftfba.aspx
Some examples of expressions are:
46
18 * 3 + 22 / 2
a = 4
b = a + 3
c = b * -2
abs(c)
b >= c
c
"a string"
str = "some string"
strcat(str, " some thing else")
str2 = "some string" + " some other string" // in C++11 using string library
Statements are fragments of a program that execute in sequence and cause the computer to carry out some definite action. Some C++ statement types are:
expression statements;
compound statements;
selection statements;
iteration statements;
jump statements;
declaration statements;
try blocks;
atomic and synchronized blocks (TM TS).
Source: http://en.cppreference.com/w/cpp/language/statements
I've read usually statements in c++ ends with a semicon;
Yes usually! But not always. Consider the following piece of code which is a compound statement but does not end with a semicolon, rather it is enclosed between two curly braces:
{ // begining of a compound statement
int x; // A declaration statement
int y;
int z;
x = 2; // x = 2 is an expression, thus x = 2; with the trailing semicolon is an expression statement
y = 2 * x + 5;
if(y == 9) { // A control statement
z = 52;
} else { // A branching statement of a control statement
z = 0;
}
} // end of a compound statement
By now, as you might be guessing, an expression statement is any statement that has an expression followed by a semicolon. According to MSDN an expression statement is a statement that causes the expressions to be evaluated. No transfer of control or iteration takes place as a result of an expression statement.
Source: https://msdn.microsoft.com/en-us/library/s7ytfs2k.aspx
Some Examples of expression statements:
x = 4;
y = x * x + 10;
radius = 5;
pi = 3.141593;
circumference = 2. * pi * radius;
area = pi * radius * radius;
Therefore the following can not be considered expression statements since they transfer the control flow to another part of a program by calling a function:
printf("The control is passed to the printf function");
y = pow(x, 2);
side effects: A side effect refers to the modification of a state. Such as changing the value of a variable, writing some data on a disk showing a menu in the User Interface, etc.
Source: https://en.wikipedia.org/wiki/Side_effect_(computer_science)
Note that expression statements don't need to have side effects. That is they don't have to change or modify any state. For example if we consider a program's control flow as a state which could be modified, then the following expression statements
won't have any side effects over the program's control flow:
a = 8;
b = 10 + a;
k++;
Wheres the following expression statement would have a side effect, since it would pass the control flow to sqrt() function, thus changing a state:
d = sqrt(a); // The control flow is passed to sqrt() function
If we consider the value of a variable as a state as well, modifying it would be a side effect thus all of expression statements above have side effects, because they all modify a state. An expression statement that does not have any side effect is not very useful. Consider the following expression statements:
x = 7; // This expression statement sets the value of x to 7
x; // This expression statement is evaluated to 7 and does nothing useful
In the above example x = 7; is a useful expression statement for us. It sets the value of x to 7 by = the assignment operator. But x; evaluates to 7 and it doesn't do anything useful.
According to The C++ Programming Language by Bjarne Stroustrup Special(3rd) Edition, a statement is basically any declaration, function call, assignment, or conditional. Though, if you look at the grammar, it is much more complicated than that. An expression, in simple terms, is any math or logical operation(s).
The wikipedia links that ok posted in his answer can be of help too.
In my opinion,
a statement *states* the purpose of a code block. i.e. we say this block of code if(){} is an if-statement, or this x=42; is an expression statement. So code such as 42; serves no purporse, therefore, this is *not* a statement.
and,
an expression is any legal combination of symbols that represents a value (Credit to Webopedia); it combines variables and constants to produce new values(Quoted from Chapter 2 in The C Programming Language). Therefore, it also has a mathematical connotation. For instance, number 42 in x=42; is an expression (x=42; is not an expression but rather an expression statement), or func(x) is an expression because it will evaluate to something. On the contrary, int x; is not an expression because it is not representing any value.
I think this excerpt from a technical book is most useful and clear.
Read the paragraphs till the start of 1.4.2 statements would be useful enough.
An expression is "a sequence of operators and operands that specifies a computation"
These are expressions:
1
2 + 2
"hi"
cout << "Hello, World!"
The last one is indeed an expression; << is the output operator, cout (of type ostream) and "Hello, World!" (string literals) are the operands. The operator returns the left-hand operand, so (cout << "Hello, ") << "World!" is also a valid expression but also not a statement.
An expression becomes an expression statement when it is followed by a semicolon:
1;
2 + 2;
"hi";
cout << "Hello, World!";
An expression is part of a statement, OR a statement itself.
int x; is a statement and expression.
See this : http://en.wikipedia.org/wiki/Expression_%28programming%29
http://en.wikipedia.org/wiki/Statement_%28programming%29