Explaining the difference between a statement and an expression in C++

I am trying to thoroughly understand the difference between a statement and an expression,
but I am finding it confusing even after reading this answer:
Expression Versus Statement
Look at the following:
std::cout << "Hello there? " ;
I could say that it is a statement, as it ends with a semicolon, but I could also say
it is an expression, since I have an ostream, an output operator, and a string literal,
and this expression yields a value, which is the left-hand operand.
Which one is correct?

Let's see what the C++ grammar can tell us:
statement:
labeled-statement
attribute-specifier-seq_opt expression-statement
attribute-specifier-seq_opt compound-statement
attribute-specifier-seq_opt selection-statement
attribute-specifier-seq_opt iteration-statement
attribute-specifier-seq_opt jump-statement
declaration-statement
attribute-specifier-seq_opt try-block
expression-statement:
expression_opt ';'
So it is a statement; in particular, an "expression statement", which consists of a (potentially empty) expression followed by a semi-colon. In other words,
std::cout << "Hello there? "
is an expression, while
std::cout << "Hello there? " ;
is a statement.
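To make the distinction concrete, here is a minimal sketch. The helper name `same_stream` is made up for illustration, and a `std::ostringstream` stands in for `std::cout` so the result can be inspected. It shows that the expression yields the left-hand stream, and that the expression statement simply discards that value.

```cpp
#include <cassert>   // used by the usage check below
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: checks that the expression `os << text`
// yields the very stream that was its left-hand operand.
bool same_stream(std::ostream& os) {
    // Here the expression is used as an initializer, so its value is kept.
    std::ostream& result = (os << "Hello there? ");

    // Here the same kind of expression forms an expression statement:
    // it is evaluated for its side effect and its value is discarded.
    os << "Hello again. ";

    return &result == &os;
}
```

Given a `std::ostringstream out;`, the check `assert(same_stream(out));` should hold, and `out.str()` then contains both pieces of text.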

Which one is correct?
Both: it is an expression statement. C and C++ let you put an expression into a body of code, add a semicolon, and make it a statement.
Here are some more examples:
x++; // post-increment produces a value which you could use
a = 5; // Assignment produces a value
max(a, b); // Call of a non-void function is an expression
2 + x; // This calculation has no side effects, but it is allowed
Note that this is true in the specific case of C and C++, but may not be true in case of other languages. For example, the last expression statement from the list above would be considered invalid in Java or C#.

The definition of expression is given in the C Standard (6.5 Expressions)
1 An expression is a sequence of operators and operands that specifies
computation of a value, or that designates an object or a function, or
that generates side effects, or that performs a combination thereof.
The value computations of the operands of an operator are sequenced
before the value computation of the result of the operator.
As for expression statements, they end with a semicolon. Here is the definition of the expression statement in C++:
expression-statement:
expression_opt ;
And
An expression statement with the expression missing is called a null
statement.
Relative to the last quote I would like to point to a difference between C and C++. In C++ declarations are statements while in C declarations are not statements. So in C++ you may place a label before a declaration while in C you may not do so. So in C you have to place a null statement before a declaration. Compare
C++
Label:
int x;
C
Label: ;
int x;

Related

Does each expression in C++ have a non-reference type

Hi, I am reading about expressions in C++ and came across the statement
Statement 0.0
Each expression has some non-reference type
The quoted statement is from en.cppreference.com/w/cpp/language/value_category. Check line 2 at the top of the page.
Now I took some examples to understand what this means. For example:
int i = 100; // this expression has type int
int &j = i; // this expression has type int or int&?
My confusion is this: I know that j is a reference to int, that is, j is int&. But according to the quoted statement, every expression has a non-reference type, which would imply that int &j = i; has type int. Is this correct?
Other examples that I am getting confused about:
int a[4] = {2,4,4,9};
a[3]; // will this expression be int& type or int type?
In the statement a[3]; I know that a is an array lvalue, so a[3] returns an lvalue reference to the last element. But I am confused about whether the quoted statement 0.0 implies that the whole expression a[3] has type int or type int&.
Here is another example:
b[4]; // Here assume that b is an array rvalue. So will this expression have type int&& or int?
So my question is: does something similar happen for pointers? That is, is there a statement similar to statement 0.0 for pointers?
int x = 34;
int *l = &x; // will this expression have type int* or int?
I know that here l is a pointer to int (a compound type). If there is no similar statement for pointers, then why is this statement needed for references? That is, why do we strip off the reference part only?
int i = 100; // this expression has type int
int &j = i; // this expression has type int or int&?
These statements are not expressions at all. These are declarations. They do contain the subexpressions 100 and i, both of which have the type int. If you used the id-expression j after this declaration, the type of that expression would be int.
So my question is that does something similar happen for pointers also?
No. Pointers are non-reference types, and something similar doesn't happen to expressions with pointer types.
why do we strip off the reference part only?
This is simply how the language works. It allows us to treat objects and references to objects identically.
This is part of why you don't need to (nor can you) explicitly use an indirection operator to access the referred object, unlike a pointed-to object, which does require an indirection operator.
Here is the actual language rule (from latest standard draft):
[expr.type]
If an expression initially has the type “reference to T” ([dcl.ref], [dcl.init.ref]), the type is adjusted to T prior to any further analysis.
The expression designates the object or function denoted by the reference, and the expression is an lvalue or an xvalue, depending on the expression.
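The adjustment described in [expr.type] can be observed with `decltype`. The following is only a sketch under my own naming (the function `write_through_reference_and_pointer` is hypothetical), using compile-time checks to show that the expression `j` has type int while the declared entity `j` has type int&, and that no comparable adjustment happens for a pointer.

```cpp
#include <cassert>      // used by the usage check below
#include <type_traits>

// Hypothetical demo function; all names are illustrative.
int write_through_reference_and_pointer() {
    int i = 100;
    int& j = i;     // a declaration; j is declared with type int&
    int* l = &i;    // a declaration; l is declared with type int*

    // Applied to a plain name, decltype reports the declared type.
    static_assert(std::is_same<decltype(j), int&>::value, "declared type of j");

    // As an expression, j is adjusted to type int, so j + 0 is a
    // prvalue of type int.
    static_assert(std::is_same<decltype(j + 0), int>::value, "j + 0 is int");

    // decltype((j)) inspects the expression j: type int, category lvalue,
    // which decltype spells as int&.
    static_assert(std::is_same<decltype((j)), int&>::value, "lvalue of type int");

    // Pointers are not adjusted: l + 0 still has type int*.
    static_assert(std::is_same<decltype(l + 0), int*>::value, "pointer type kept");

    j = 7;          // no indirection operator needed for the reference
    *l += 1;        // explicit indirection needed for the pointer
    return i;       // both writes reached i
}
```

If the sketch is right, `assert(write_through_reference_and_pointer() == 8);` holds, since both the reference and the pointer refer to `i`.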
Expression : An expression is made up of one or more operands and yields a result when it is evaluated.
Note the wording " when it is evaluated ".
Examples of expressions are the literal 4 or some variable n. Again, note that the expression is not yet evaluated and so has no result yet. You can also create more complicated expressions from an operator and one or more operands. For example, 3 + 4*5 is an expression that has not been evaluated yet. An expression with two or more operators is called a compound expression.
Each expression in C++ is either an lvalue or an rvalue (since C++11, more precisely an lvalue, an xvalue, or a prvalue).
Expression Statement: Statements end with a semicolon. When we add a semicolon to an expression, it becomes an expression statement. The effect is that the expression is evaluated and its result is discarded at the end of the statement. For example, the literal 5 is an expression, but if you add a semicolon after it, you get an expression statement, and the result of the expression is discarded at the end of that statement. Let's look at another example: cout << n; is an expression statement as a whole. It consists of the following expressions:
1. expression `cout`
2. expression `n`
These are combined by one operator, <<, into the expression cout << n, and the terminating semicolon turns that into an expression statement. (The semicolon here is not a null statement; a null statement is a semicolon with no expression before it.)
This whole statement will cause a side-effect which will be the printing of value of n on the screen.
Update:
Example 1: std::cout << n; will have a side-effect of printing the value of n on the screen but more importantly it will also have a resulting value which will be the object std::cout which is discarded at the end of the statement.
Example 2: int i(20 + 1); consists of 3 things:
type : int
identifier i
expression 20 + 1
Example 3: float p; This has no expression. This is just variable definition. Also called declaration statement.
Example 4: float k = 43.2; This has an expression at the right hand side which is 43.2, a type float and an identifier k.
Example 5: i = 43;. This is an expression statement. There are two expressions and one operator here. The result is the variable i.
Example 6: int &r = i;. This is a declaration statement that contains the expression i on the right hand side. On the left hand side we have a type (int) and a declarator (&r). Since it is not an expression statement, there is no value to be discarded.
Example 7: int *p = &i; This is a declaration statement that contains the expression &i on the right hand side. On the left hand side we have a type (int) and a declarator (*p). Since it is not an expression statement, there is no value to be discarded.
Example 8: i = a < b ? a : b; This is an expression statement. Here the expressions are:
the left hand side variable i
the variable a on the right hand side
the variable b on the right hand side
There is also one condition in the middle, the expression a < b. If the condition evaluates to true, the result of the conditional expression is the variable a; otherwise it is b.
I have come to the following conclusions. To begin with, note that the type of an entity is only about what you can do with this entity, for example, addition is defined for integer types, but not for all class types.
As the question states, an expression can only have a non-reference type, i.e. can be either a fundamental type, a pointer type, a function type, ..., an array type, or a class type, but not a reference type. Which means that the available options for the type of expression listed above allow us to describe everything so that we can do everything with this expression that is possible to do with it in C++. That is, even in situations where we might expect the type to be a reference type, for example,
T a;
static_cast<T&>(a); // suspicious expression
or
T a;
T& f(int &a) { return a; }
f(a); // another suspicious expression
we cannot do anything with these expressions that we could not do with an expression of the non-reference type T and of the same value category.
Note, that we can write
int a;
static_cast<int&>(a)=3;
but the program containing following lines will be ill-formed,
int b;
static_cast<int>(b)=3; // error: lvalue required as left operand of assignment.
because the ability to stand on the left side of an assignment is determined by the value category. Thus, describing an expression with a type and a value category, as opposed to describing it with a type alone, removes the need for a separate reference type, because that property is already described by the value category alone.
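The point about value categories can be checked at compile time. This is a small sketch with a hypothetical function name (`assign_through_cast`). It relies on the fact that `decltype((e))` yields `T&` exactly when `e` is an lvalue of type `T`, and plain `T` when `e` is a prvalue.

```cpp
#include <cassert>      // used by the usage check below
#include <type_traits>

// Hypothetical demo: both casts produce expressions of type int;
// only their value categories differ.
int assign_through_cast() {
    int a = 0;

    // static_cast<int&>(a) is an lvalue of type int.
    static_assert(std::is_same<decltype((static_cast<int&>(a))), int&>::value,
                  "lvalue of type int");

    // static_cast<int>(a) is a prvalue of type int.
    static_assert(std::is_same<decltype((static_cast<int>(a))), int>::value,
                  "prvalue of type int");

    static_cast<int&>(a) = 3;   // allowed: the left operand is an lvalue
    return a;
}
```

With this definition, `assert(assign_through_cast() == 3);` holds, because the assignment goes through to `a`.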
C++ expressions can have forms like j+0 or (j). These expressions definitely have type int.
To simplify the grammar, the name of a variable by itself is also usable as an expression. If we didn't have this rule, in the grammar we'd have a lot of variable-or-expression constructs. But it means that the expression j has the same type as the expression j+0, namely int.
I also encountered the same question, and the above answers don't solve it. But I found a post written by Scott Meyers which discusses the question.
In a nutshell, if you take the standard literally, then it is true that an expression can have reference type, but that may not be very useful for understanding the language.

do while loop without {}

Is the below syntax of writing do while loop containing single statement without using braces like other loops namely while, for, etc. correct? I'm getting the required output but I want to know whether this has any undefined behavior.
int32_t i = 1;
do
std::cout << i << std::endl;
while(++i <= 10);
The original resource to answer this question should be the C++ standards: http://www.open-std.org/jtc1/sc22/wg21/docs/standards.
In C++17, 9.5 Iteration statements [stmt.iter] says do takes a statement:
do statement while ( expression ) ;
So this is definitely fine.
9.3 Compound statement or block [stmt.block] says
So that several statements can be used where one is expected, the compound statement (also, and equivalently,
called “block”) is provided.
so people tend/like to use { ... } for do statements.
The grammar for a do-while loop is given here:
do statement while ( expression ) ;
The grammar for a statement allows for an expression-statement:
statement:
attribute-specifier-seq opt expression-statement
expression-statement:
expression opt ;
So the body of a do-while doesn't need to be inside {}, and your code is valid.
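As a quick check of this reading of the grammar, here is a sketch that mirrors the loop from the question. The helper name `count_to` is my own, and it writes to a string stream rather than std::cout so that the output can be compared.

```cpp
#include <cassert>   // used by the usage check below
#include <sstream>
#include <string>

// Hypothetical helper mirroring the loop from the question.
std::string count_to(int limit) {
    std::ostringstream out;
    int i = 1;
    do
        out << i << ' ';      // the body is a single expression statement
    while (++i <= limit);
    return out.str();
}
```

For example, `assert(count_to(3) == "1 2 3 ");` holds. Because the body runs before the condition is tested, `count_to(0)` still produces "1 ".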
cppreference indicates that it's the same with do while as it is with while and if and so on.
The relevant syntax:
attr (optional) do statement while ( expression ) ;
attr(C++11) - any number of attributes
expression - any expression which is contextually convertible to bool.
This expression is evaluated after each iteration, and if it yields false, the loop is exited.
statement - any statement, typically a
compound statement, which is the body of the loop
The key here is that a statement is typically a compound statement enclosed in curly braces, but not necessarily so.
As with those constructions, I would tend to prefer braces even when it is only a single line, but it's good to know that it works.
If you have only one statement inside the do-while or if or for, the braces aren't necessary,
i.e.
do
statement;
while (cond);
is the same as
do
{
statement;
}
while (cond);
Likewise,
if(cond)
statement;
is the same as
if(cond)
{
statement;
}
If you write
if(cond)
statement1;
statement2;
then this will be treated as
if(cond)
{
statement1;
}
statement2;
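The last case is easy to verify with a small counter. This is only an illustrative sketch; the function name `count_executed` is hypothetical.

```cpp
#include <cassert>  // used by the usage check below

// Hypothetical counter: returns how many of the two increments ran.
int count_executed(bool cond) {
    int executed = 0;
    if (cond)
        ++executed;   // statement1: the only statement controlled by the if
    ++executed;       // statement2: not part of the if, always runs
    return executed;
}
```

Here `count_executed(true)` returns 2 and `count_executed(false)` returns 1, so `assert(count_executed(false) == 1);` holds.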
The body of a do-while construct must be a statement.
attr(optional) do statement while ( expression ) ;
attr(C++11) - any number of attributes
expression - any expression which is contextually convertible to bool. This expression is evaluated after each iteration, and if it yields false, the loop is exited.
statement - any statement, typically a compound statement, which is the body of the loop
cppreference notes that this statement is usually a compound statement, which is a block surrounded by curly braces.
Compound statements or blocks are brace-enclosed sequences of statements.
However, this statement can also just be an expression terminated by a semi-colon, which is the case in your example.
No, it's completely fine!
A loop in C++ whose body is not wrapped in curly braces takes the single statement that follows as its body.

What is the correct answer for cout << a++ << a;?

Recently in an interview there was a following objective type question.
int a = 0;
cout << a++ << a;
Answers:
a. 10
b. 01
c. undefined behavior
I answered choice b, i.e. output would be "01".
But to my surprise later I was told by an interviewer that the correct answer is option c: undefined.
Now, I do know the concept of sequence points in C++. The behavior is undefined for the following statement:
int i = 0;
i += i++ + i++;
but as per my understanding for the statement cout << a++ << a , the ostream.operator<<() would be called twice, first with ostream.operator<<(a++) and later ostream.operator<<(a).
I also checked the result on VS2010 compiler and its output is also '01'.
You can think of:
cout << a++ << a;
As:
std::operator<<(std::operator<<(std::cout, a++), a);
C++ guarantees that all side effects of previous evaluations will have been performed at sequence points. There are no sequence points between the evaluations of function arguments, which means that the argument a can be evaluated before or after the argument std::operator<<(std::cout, a++). So the result of the above is undefined.
C++17 update
In C++17 the rules have been updated. In particular:
In a shift operator expression E1<<E2 and E1>>E2, every value computation and side-effect of E1 is sequenced before every value computation and side effect of E2.
Which means that it requires the code to produce result b, which outputs 01.
See P0145R3 Refining Expression Evaluation Order for Idiomatic C++ for more details.
Technically, overall this is Undefined Behavior.
But, there are two important aspects to the answer.
The code statement:
std::cout << a++ << a;
is evaluated as:
std::operator<<(std::operator<<(std::cout, a++), a);
The standard does not define the order of evaluation of arguments to a function.
So either:
std::operator<<(std::cout, a++) is evaluated first, or
a is evaluated first, or
it might be any other order the implementation chooses.
This order is Unspecified[Ref 1] as per the standard.
[Ref 1]C++03 5.2.2 Function call
Para 8
The order of evaluation of arguments is unspecified. All side effects of argument expression evaluations take effect before the function is entered. The order of evaluation of the postfix expression and the argument expression list is unspecified.
Further, there is no sequence point between evaluation of arguments to a function but a sequence point exists only after evaluation of all arguments[Ref 2].
[Ref 2]C++03 1.9 Program execution [intro.execution]:
Para 17:
When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body.
Note that here the value of a is both modified and separately accessed without an intervening sequence point. Regarding this, the standard says:
[Ref 3]C++03 5 Expressions [expr]:
Para 4:
....
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
The code modifies a and also reads it without an intervening sequence point, and the read is not used to determine the value to be stored. This is a clear violation of the above clause, and hence the result as mandated by the standard is Undefined Behavior[Ref 3].
Sequence points only define a partial ordering. In your case, you have
(once overload resolution is done):
std::cout.operator<<( a++ ).operator<<( a );
There is a sequence point between the a++ and the first call to
std::ostream::operator<<, and there is a sequence point between the
second a and the second call to std::ostream::operator<<, but there
is no sequence point between a++ and a; the only ordering
constraints are that a++ be fully evaluated (including side effects)
before the first call to operator<<, and that the second a be fully
evaluated before the second call to operator<<. (There are also
causal ordering constraints: the second call to operator<< cannot
precede the first, since it requires the results of the first as an
argument.) §5/4 (C++03) states:
Except where noted, the order of
evaluation of operands of individual operators and subexpressions of
individual expressions, and the order in which side effects take place,
is unspecified. Between the previous and next sequence point a scalar
object shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored. The requirements of
this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is
undefined.
One of the allowable orderings of your expression is a++, a, first
call to operator<<, second call to operator<<; this modifies the
stored value of a (a++), and accesses it other than to determine
the new value (the second a), so the behavior is undefined.
The correct answer is to question the question. The statement is unacceptable because a reader cannot see a clear answer. Another way to look at it is that we have introduced side effects (a++) that make the statement much harder to interpret. Concise code is great, provided its meaning is clear.
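One way to act on that advice, assuming the intended output is 01, is to give the read and the increment their own statements so that no ordering question arises under any language version. The sketch below uses a hypothetical helper name (`print_then_increment`) and a string stream in place of cout.

```cpp
#include <cassert>   // used by the usage check below
#include <sstream>
#include <string>

// Hypothetical rewrite of `cout << a++ << a;` with explicit ordering.
std::string print_then_increment() {
    std::ostringstream out;
    int a = 0;
    out << a;   // reads a and writes 0
    ++a;        // the modification is a full expression of its own
    out << a;   // reads the updated a and writes 1
    return out.str();
}
```

With this version, `assert(print_then_increment() == "01");` holds regardless of how the compiler orders operand evaluation within a single expression.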

Which category of statement is an initialization?

The Question :
In programming, an assignment statement is an expression, but how about initialization? Is it an expression?
The parentheses of a while loop should contain an expression, so I tried to put an initialization into them, and the compiler gave me an error. This suggests that initialization is not an expression.
To further prove it, I tried the for loop and wrote for(int num = 3 ; num2 = 4 ; num3 = 5). Surprisingly, the compiler gave me errors again.
So if an initialization is not an expression, what kind of statement is it?
Thanks for spending time reading my question
In both C and C++, assignment is an expression. E.g. a = 5 is an assignment-expression.
In both C and C++ you can use any expression followed by a semicolon where a statement is required, such as in the body of a function. This type of statement is an expression-statement. (Technically, you can leave out the expression entirely. ; is a degenerate expression-statement.)
You can only use a declaration where a declaration is expected, not everywhere where you can use an expression.
The following is not an expression or an expression-statement, it is a declaration. (Technically, in C++, it can form a declaration-statement when used where a statement is expected, in C it is just a declaration.) Note that there is no assignment-expression sub-part to this declaration, = 3 is an initializer for the declared entity num.
int num = 3;
These two common uses of = (initialization and assignment) are sometimes confused. Where = is being used to initialize the entity being declared in a declaration, it is initialization, where it is being used to change the value of an already declared entity, it is assignment.
Here is where C and C++ differ: in C, the parenthesised entity immediately following the while keyword must be an expression so something like while (int num = 0) { /* ... */ } is not valid.
In C++ the entity can be a condition, which allows for a simple declaration with an initializer as well as a simple expression, as in C. In C++, where the condition is in the form of a declaration, the declared entity is initialized on each iteration and implicitly converted to bool to determine whether to execute the loop body.
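A minimal C++ sketch of a declaration used as a while condition follows. The helper name `sum_down_to_zero` and its variables are hypothetical, and the code assumes a non-negative `start` so that the loop terminates.

```cpp
#include <cassert>  // used by the usage check below

// Hypothetical helper; assumes start >= 0 so the loop terminates.
int sum_down_to_zero(int start) {
    int sum = 0;
    int next = start;
    // C++ only: the condition is a declaration. `remaining` is
    // initialized on every iteration and then converted to bool.
    while (int remaining = next--) {
        sum += remaining;
    }
    return sum;
}
```

For instance, `sum_down_to_zero(3)` adds 3, 2 and 1 before `remaining` is initialized to 0 and the loop ends, so `assert(sum_down_to_zero(3) == 6);` holds. The same source would be rejected by a C compiler, as described above.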
The for loop is special in both languages. In both languages the first part of the parenthesized list following the for keyword can effectively be either a declaration or an expression-statement.
The for loop takes three expressions in C and C++. The first is executed before the loop is run. The value of the second is used to determine when the loop ends. The third expression is executed at the end of each iteration of the for loop.
You can abuse the intent of the for loop and put whatever expression you want in those three sections.
The while loop while(<expression>) {<body>} is equivalent to the for loop for(;<expression>;) {<body>}.
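The equivalence can be seen by writing the same loop both ways. This is just a sketch with made-up function names (`sum_with_while`, `sum_with_for`).

```cpp
#include <cassert>  // used by the usage check below

// Hypothetical pair of functions summing 0 .. n-1.
int sum_with_while(int n) {
    int i = 0, sum = 0;
    while (i < n) {
        sum += i;
        ++i;
    }
    return sum;
}

int sum_with_for(int n) {
    int i = 0, sum = 0;
    for (; i < n;) {   // empty first and third parts
        sum += i;
        ++i;
    }
    return sum;
}
```

Both functions return the same value for any `n`; for example, `assert(sum_with_while(5) == sum_with_for(5));` holds, and both results are 10.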

Is comma operator free from side effect?

For example for such statement:
c += 2, c -= 1
Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?
Yes, it is guaranteed by the standard, as long as that comma is a non-overloaded comma operator. Quoting n3290 §5.18:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a
discarded-value expression (Clause 5) [83]. Every value computation and side effect associated with the left expression
is sequenced before every value computation and side effect associated with the right expression. The type
and value of the result are the type and value of the right operand; the result is of the same value category
as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.
And the corresponding footnote:
83 However, an invocation of an overloaded comma operator is an ordinary function call; hence, the evaluations of its argument
expressions are unsequenced relative to one another (see 1.9).
So this holds only for the non-overloaded comma operator.
The , between arguments to a function are not comma operators. This rule does not apply there either.
For C++03, the situation is similar:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is
discarded. The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard
conversions are not applied to the left expression. All side effects (1.9) of the left expression, except for the
destruction of temporaries (12.2), are performed before the evaluation of the right expression. The type and
value of the result are the type and value of the right operand; the result is an lvalue if its right operand is.
Restrictions are the same though: does not apply to overloaded comma operators, or function argument lists.
Yes, the comma operator guarantees that its operands are evaluated in left-to-right order, and the result is the value of the rightmost operand.
Be aware, however, that the comma in some contexts is not the comma operator. For example, the above is not guaranteed for function argument lists.
Yes, in C++ the comma operator is a sequence point and those expressions will be evaluated in the order they are written. See 5.18 in the current working draft:
[snip] is evaluated left-to-right. [snip]
I feel that your question is lacking some explanation as to what you mean by "side effects". Every statement in C++ is allowed to have a side effect and so is an overloaded comma operator.
Why is the statement you have written not valid in a function call?
It's all about sequence points. In C++ and C it is forbidden to modify a value twice between two sequence points. If your example truly uses the comma operator, each assignment is separated from the next by a sequence point. If you use it like this, foo(c += 2, c -= 2), the order of evaluation is unspecified. I'm actually unsure if the second case is undefined behaviour, as I do not know if an argument list is one or many sequence points. I ought to ask a question about this.
It should always be evaluated from left to right, as this is in the definition of the comma operator:
Link
You've got two questions.
The first question: "Is comma operator free from side effect?"
The answer to this is no. The comma operator naturally facilitates writing expressions with side effects, and deliberately writing expressions with side effects is what the operator is commonly used for. E.g., in while (cin >> str, str != "exit") the state of the input stream is changed, which is an intentional side effect.
But maybe you don't mean side-effect in the computer science sense, but in some ad hoc sense.
Your second question: "For example for such statement: c += 2, c -= 1 Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?"
The answer to this is yes in the case of a statement or expression, except when the comma operator is overloaded (very unusual). However, sequences like c += 2, c -= 1 can also occur in argument lists, in which case, what you've got is not an expression, and the comma is not a sequence operator, and the order of evaluation is not defined. In foo(c += 2, c -= 1) the comma is not a comma operator, but in foo((c += 2, c -= 1)) it is, so it may pay to pay attention to the parentheses in function calls.
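The difference the parentheses make can be sketched as follows. The function names (`identity`, `comma_as_statement`, `comma_in_parenthesized_argument`) are hypothetical, and the second example uses `c *= 5` rather than the `c -= 1` from the question, purely so that the left-to-right order is visible in the result.

```cpp
#include <cassert>  // used by the usage check below

// Hypothetical one-argument function used to show a parenthesized comma.
int identity(int x) { return x; }

int comma_as_statement() {
    int c = 0;
    c += 2, c -= 1;   // built-in comma operator: c += 2 is sequenced first
    return c;
}

int comma_in_parenthesized_argument() {
    int c = 1;
    // The extra parentheses make this ONE argument: a comma expression
    // whose value is that of its right operand, evaluated after the left.
    return identity((c += 2, c *= 5));
}
```

If the built-in operator behaves as the quoted wording says, `comma_as_statement()` returns 1 and `comma_in_parenthesized_argument()` returns 15 (first 1 + 2, then 3 * 5); evaluating the right operand first would have given 7 instead.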