What's the difference between the comma operator and the comma separator? [duplicate]

This question already has answers here:
How does the compiler know that the comma in a function call is not a comma operator?
(6 answers)
Closed 8 years ago.
In C++, the comma token (i.e., ,) is either interpreted as a comma operator or as a comma separator.
However, while searching the web I realized that it's not quite clear in which cases the , token is interpreted as the binary comma operator and in which it is interpreted as a separator between statements.
Moreover, when multiple statements/expressions on one line are separated by , (e.g., a = 1, b = 2, c = 3;), there is some confusion about the order in which they are evaluated.
Questions:
In which cases is a comma , token interpreted as an operator, and in which as a separator?
When one line holds multiple statements/expressions separated by commas, what is the order of evaluation for the comma operator and for the comma separator?

When a separator is appropriate -- in arguments to a function call or macro, or separating values in an initializer list (thanks for the reminder, @haccks) -- the comma will be taken as a separator. In other expressions, it is taken as an operator. For example,
my_function(a,b,c,d);
is a call passing four arguments to a function, whereas
result=(a,b,c,d);
will be understood as the comma operator. It is possible, though ugly, to intermix the two by writing something like
my_function(a,(b,c),d);
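The built-in comma operator evaluates its operands left-to-right: the left operand is fully evaluated, and its value discarded, before the right operand is evaluated.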
The original use of this operation in C was to allow a macro to perform several operations before returning a value. Since a macro instantiation looks like a function call, users generally expect it to be usable anywhere a function call could be; having the macro expand to multiple statements would defeat that. Hence, C introduced the , operator to permit chaining several expressions together into a single expression while discarding the results of all but the last.
As @haccks pointed out, the exact rules for how the compiler determines which meaning of , was intended come out of the language grammar, and have previously been discussed at How does the compiler know that the comma in a function call is not a comma operator?
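The three forms above can be sketched roughly as follows. The helpers trace and count_args are hypothetical names introduced only for this illustration; they are not part of the original answer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: records each value as it is evaluated, then returns it.
std::vector<int> evaluated;
int trace(int v) { evaluated.push_back(v); return v; }

// Hypothetical helper: reports how many arguments it was called with.
template <typename... Args>
int count_args(Args...) { return static_cast<int>(sizeof...(Args)); }

// Here the commas are separators: four arguments are passed.
int as_separator() { return count_args(1, 2, 3, 4); }

// Inside parentheses the commas are operators: the operands are evaluated
// left to right and only the value of the last one is kept.
int as_operator() {
    evaluated.clear();
    int result = (trace(1), trace(2), trace(3), trace(4));
    assert(evaluated.size() == 4);
    return result;
}

// Mixed: (trace(2), 3) is a single comma expression, so only three
// arguments reach the function.
int mixed() { return count_args(1, (trace(2), 3), 4); }
```

Calling as_operator() leaves evaluated holding 1, 2, 3, 4 in that order, which reflects the left-to-right rule of the built-in operator.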

You cannot use comma to separate statements. The , in a = 1, b = 2; is the comma operator, whose arguments are two assignment expressions. The order of evaluation of the arguments of the comma operator is left-to-right, so it's clear what the evaluation order is in that case.
In the context of the arguments to a function-call, those arguments cannot be comma-expressions, so the top-level commas must be syntactic (i.e. separating arguments). In that case, the evaluation order is not specified. (Of course, the arguments might be parenthesized expressions, and the parenthesized expression might be a comma expression.)
This is expressed clearly in the grammar in the C++ standard. The relevant productions are expression, which can be:
assignment-expression
or
expression , assignment-expression
and expression-list, which is the same as an initializer-list, which is a ,-separated list of initializer-clause, where an initializer-clause is either:
assignment-expression
or
braced-init-list
The , in the second expression production is the comma-operator.
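A minimal sketch of the difference in evaluation order, assuming a hypothetical logging helper note and a hypothetical function sum3 (neither appears in the answer above). The comma operator gives a fixed order, while the function arguments are only guaranteed to all be evaluated, so the sketch sorts the log before inspecting it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical log of the order in which operands were evaluated.
std::vector<int> order;
int note(int v) { order.push_back(v); return v; }

// Hypothetical three-argument function.
int sum3(int a, int b, int c) { return a + b + c; }

// "expression , assignment-expression": the left operand is evaluated first.
std::vector<int> comma_operator_order() {
    order.clear();
    int a = 0, b = 0;          // these commas separate declarators
    a = note(1), b = note(2);  // this comma is the operator
    assert(a == 1 && b == 2);
    return order;              // 1 is logged before 2
}

// "expression-list": every argument is evaluated, but in an unspecified order.
std::vector<int> argument_order_sorted() {
    order.clear();
    int total = sum3(note(1), note(2), note(3));
    assert(total == 6);
    std::vector<int> seen = order;
    std::sort(seen.begin(), seen.end());  // sorted because the order may vary
    return seen;
}
```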

Why +++x will be divided into ++(+x) instead of +(++x) in C++? [duplicate]

This question already has answers here:
Why doesn't a+++++b work?
(9 answers)
Closed 10 months ago.
When I type the code below
int x = 1;
+++x;
it is parsed as ++(+x), which is of course an error because the operand of ++ is an rvalue.
I am curious why it cannot be parsed as +(++x), which would be valid code.
Does this depend on the IDE or the compiler?
Can it be found in the C++ Standard? Or is it just undefined behaviour?
Thanks a lot for answering this question, and forgive my poor English.
From C++20 (draft N4860) [lex.pptoken]/3.3
— Otherwise, the next preprocessing token is the longest sequence of characters that could constitute
a preprocessing token, even if that would cause further lexical analysis to fail, ...
and [lex.pptoken]/6
[Example: The program fragment x+++++y is parsed as x ++ ++ + y, which, if x and y have integral types,
violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct
expression. —end example]
So it is a rule of the language: the first two + characters are grouped into a ++ token, which leaves the remaining + attached to the variable.
Funnily, this reminds me of an old problem where: std::vector<std::vector<int>> a used to cause problems because >> would be one token instead of two (since it's supposed to be the longest sequence of characters). This is addressed by [temp.names]/3
When a name is considered to be a template-name, and it is followed by a <, the < is always taken as the
delimiter of a template-argument-list and never as the less-than operator. When parsing a template-argument-list,
the first non-nested > is taken as the ending delimiter rather than a greater-than operator. Similarly,
the first non-nested >> is treated as two consecutive but distinct > tokens, the first of which is taken as the
end of the template-argument-list and completes the template-id. [Note: The second > token produced by this
replacement rule may terminate an enclosing template-id construct or it may be part of a different construct
(e.g., a cast). —end note]
This is a consequence of the maximum munch tokenization principle:
A C++ implementation must collect as many consecutive characters as possible into a token.
From lex.pptoken#3.3:
Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, except that a header-name is only formed within a #include directive.
And since ++ is the longest valid token that can be formed at the start of +++x, the input is tokenized as ++ +x.
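To make the tokenization concrete, here is a small sketch; the function names are made up for illustration. Adding a space changes where the token boundaries fall, and a similar run of + characters between two variables shows the same greedy grouping.

```cpp
#include <cassert>

// "+ ++x" (with a space) is unary plus applied to a pre-increment.
int unary_plus_of_preincrement() {
    int x = 1;
    int r = + ++x;  // x becomes 2, and r receives that value
    assert(x == 2);
    return r;
}

// "x+++y" is tokenized as "x ++ + y": post-increment x, then add y.
int post_increment_then_add() {
    int x = 1, y = 2;
    int r = x+++y;  // (x++) + y, using the old value of x
    assert(x == 2 && y == 2);
    return r;
}

// Left as a comment because it does not compile:
// int bad(int x) { return +++x; }  // tokenized as "++ + x"; +x is an rvalue
```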

Can parentheses override an expression's order of evaluation? [duplicate]

This question already has answers here:
Operator Precedence vs Order of Evaluation
(6 answers)
Closed 5 years ago.
The grouping of operators and operands and the order of evaluation are two important concepts for expressions in C++.
Grouping
For an expression with multiple operators, how the operands are grouped with the operators is decided by the precedence and associativity of the operators; the value of the expression may additionally depend on the order of evaluation.
Order
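In C++, only four operators have always had a specified order of evaluation for their operands (logical AND, logical OR, the conditional operator and the comma operator). For most other operators the evaluation order is unspecified, although C++17 added ordering rules for a few more, such as assignment and the shift operators.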
Parentheses
Parentheses can override the precedence and associativity, and therefore specify the grouping of a compound expression.
However, the book by Peter Gottschling claims that parentheses can change the order of evaluation. I personally doubt it; I think it's an error! In the example from the quotation below, the parentheses do not tell which of x, y and z is evaluated first, which one later and which one last. They only group the expression y + z as the right operand of the * operator.
An expression surrounded by parentheses is an expression as well,
e.g., (x + y). As this grouping by parentheses precedes all operators,
we can change the order of evaluation to suit our needs: x * (y + z)
computes the addition first. Discovering Modern C++, Chapter 1.4.1
Question
Can parentheses override expressions' order of evaluation?
The quoted sentence is poorly worded. The author didn't mean that the order of evaluation is changed, or even specified; I think the word "order" was meant in terms of how a human might read the expression (i.e. precedence).
Of course, if the three variables are independent and reading them has no side-effects, the "as if" rule makes the unspecified order irrelevant, as it wouldn't change the value of the expression.
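A rough sketch of the distinction, using a hypothetical helper operand that logs which variable was read (the name and the values 2, 3 and 4 are assumptions for illustration). The parentheses fix the result, but the sketch deliberately avoids relying on any particular order in the log.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical log of which operand was read, in evaluation order.
std::vector<char> reads;
int operand(char name, int value) { reads.push_back(name); return value; }

// The parentheses make (y + z) the right operand of *. They do not say
// whether x, y or z is read first; that order is left to the implementation.
int grouped_product() {
    reads.clear();
    return operand('x', 2) * (operand('y', 3) + operand('z', 4));
}

// All three operands are read exactly once; only their order is open.
std::vector<char> operands_read_sorted() {
    grouped_product();
    std::vector<char> seen = reads;
    std::sort(seen.begin(), seen.end());
    return seen;
}
```

The product is the same whichever operand happens to be read first, which is why the unspecified order goes unnoticed when reading the operands has no side effects.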

What is the difference between "vector<pair<int,int>> q;" and "vector<pair<int,int> > q;" [duplicate]

This question already has answers here:
Template issue with vector [duplicate]
(2 answers)
Closed 7 years ago.
I get compile error with the former but latter works just fine.
error: ‘>>’ should be ‘> >’ within a nested template argument list
Thanks
In the (now obsolete) revisions C++98 and C++03, the character sequence ">>" was unconditionally interpreted as the "right shift operator" token, so if you wanted to close multiple template argument lists, you would need to leave some intervening whitespace.
As of C++11, the rules of the language have been modified to interpret ">>" as two consecutive template argument list ends when it appears inside a template argument list, and the whitespace is no longer necessary. (However, this makes it necessary to parenthesize shift expressions in a template argument list.)
(In the same wash, C++11 also interprets <::foo, when used as the first template argument, in the "obvious" way (beginning of argument list, followed by namespace qualifier) rather than consuming <: as the alternative token for [.)
Before C++11, you had to use whitespace to separate the angle brackets in nested templates - otherwise the compiler interpreted them as the right-shift operator ">>". In C++11, you can omit the whitespace and they will be interpreted as brackets.
However, some compilers (e.g. MSVC++) ignore the standard and allow you to omit the whitespace even when not using the C++11 standard.
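As a sketch of both points, assuming a C++11 or later compiler: the two spellings declare the same type, and a shift inside a template argument list needs parentheses. The variable names q_spaced and four_ints are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// C++11 and later: ">>" closes both template argument lists.
std::vector<std::pair<int, int>> q;

// Accepted by every revision, including C++98 and C++03.
std::vector<std::pair<int, int> > q_spaced;

static_assert(std::is_same<decltype(q), decltype(q_spaced)>::value,
              "both spellings name the same type");

// A shift inside a template argument list must be parenthesized; without
// the parentheses the first > of >> would end the argument list.
std::array<int, (8 >> 1)> four_ints{};

std::size_t shifted_size() { return four_ints.size(); }
```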

Explain the difference:

int i=1,2,3,4; // Compile error
// The value of i is 1
int i = (1,2,3,4,5);
// The value of i is 5
What is the difference between these definitions of i in C and how do they work?
Edit: The first one is a compiler error. How does the second work?
= takes precedence over , [1]. So the first statement is a declaration and initialisation of i:
int i = 1;
… followed by lots of comma-separated expressions that do nothing.
The second code, on the other hand, consists of one declaration followed by one initialisation expression (the parentheses take precedence so the respective precedence of , and = are no longer relevant).
Then again, that’s purely academic since the first code isn’t valid, neither in C nor in C++. I don’t know which compiler you’re using that it accepts this code. Mine (rightly) complains
error: expected unqualified-id before numeric constant
[1] Precedence rules in C++ apply regardless of how an operator is used. = and , in the code of OP do not refer to operator= or operator,. Nevertheless, they are operators as far as C++ is concerned (§2.13 of the standard), and the precedence of the tokens = and , does not depend on their usage – it so happens that , always has a lower precedence than =, regardless of semantics.
You have run into an interesting edge case of the comma operator (,).
Basically, it evaluates the expression on its left, discards the result, and then yields the value of the expression on its right.
The problem with the first line of code is operator precedence. Because the = operator has greater precedence than the , operator, you get the result of the first expression in the comma chain (1).
Correction (thanks @jrok!) - the first line of code neither compiles, nor is it using the comma as an operator, but instead as a separator between declarators, which allows you to declare multiple variables of the same type at a time.
In the second one, all of the first values are discarded and you are given the final result in the chain of items (5).
Not sure about C++, but at least for C the first one is invalid syntax so you can't really talk about a declaration since it doesn't compile. The second one is just the comma operator misused, with the result 5.
So, bluntly, the difference is that the first isn't C while the second is.
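The two cases can be sketched like this; the function names are invented for illustration, and a compiler may warn that the discarded operands have no effect.

```cpp
#include <cassert>

// Parenthesized initializer: the commas are operators, so 1 through 4 are
// evaluated and discarded, and i is initialized with 5.
int parenthesized_initializer() {
    int i = (1, 2, 3, 4, 5);
    return i;
}

// Without parentheses a comma in a declaration separates declarators, so
// each item after a comma must introduce a new name.
int two_declarators() {
    int i = 1, j = 2;
    return i + j;
}

// Left as a comment because it does not compile:
// int i = 1, 2, 3, 4;  // "2" is not a declarator
```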

The limitations of the comma operator

I have read this question and I want to add to it: what are the things that cannot be done using the comma operator? This has confused me a lot, as I can do this:
int arr[3];
arr[0]=1,arr[1]=2,arr[2]=3;
But when I do:
int arr[3],arr[0]=1,arr[1]=2,arr[2]=3;
It gives me a compiler error.
What are the limitations of the comma operator in real practice?
One thing to realize is that not all uses of a comma in C are instances of the comma operator. Changing your second example to be a syntactically valid declaration:
int a0=1,a1=2,a2=3;
the commas are not operators, they're just syntax required to separate instances of declarators in a list.
Also, the comma used in parameter/argument lists is not the comma operator.
In my opinion the use of the comma operator is almost always a bad idea - it just causes needless confusion. In most cases, what's done using a comma operator can be more clearly done using separate statements.
Two exceptions that come to mind easily are inside the control clauses of a for statement, and in macros that absolutely need to cram more than one 'thing' into a single expression (and even this should only be done when there's no other reasonable option).
You can use the comma operator almost anywhere that an expression can appear. There are a few exceptions; notably, in C you cannot use the comma operator in a constant expression (unless it sits in a subexpression that is not evaluated).
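As an illustration of the for-statement case, here is a minimal sketch that walks two indices toward each other; the function reversed is a hypothetical example, not something from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns a reversed copy of v by swapping elements from both ends.
std::vector<int> reversed(std::vector<int> v) {
    if (v.empty()) return v;  // avoids v.size() - 1 wrapping around
    // The init clause is a declaration, so its comma separates declarators.
    // The comma in "++i, --j" is the comma operator.
    for (std::size_t i = 0, j = v.size() - 1; i < j; ++i, --j) {
        std::swap(v[i], v[j]);
    }
    return v;
}
```

Writing the two updates as separate statements inside the loop body would work as well; the comma form simply keeps both index updates in the loop header.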
You also have to be careful when using the comma operator where the comma is also used as a separator, for example, when calling functions you must use parentheses to group the comma expression:
void f(int, bool);
f(42, 32, true); // wrong
f((42, 32), true); // right (if such a thing can be considered "right")
Your example is a declaration:
int arr[3],arr[0]=1,arr[1]=2,arr[2]=3;
In a declaration, you can declare multiple things by separating them with the comma, so here too the comma is used as a separator. Also, you can't just tack on an expression to the end of a declaration like this. (Note that you can get the desired result by using int arr[3] = { 1, 2, 3 };).
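The working alternatives might look like the sketch below. The function names, and the digits helper that packs the three elements into one number for easy checking, are assumptions made for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: packs three elements into one number, e.g. 1,2,3 -> 123.
int digits(const int (&arr)[3]) { return arr[0] * 100 + arr[1] * 10 + arr[2]; }

// Declaration first, then one expression statement built with the comma operator.
int assigned_with_comma_operator() {
    int arr[3];
    arr[0] = 1, arr[1] = 2, arr[2] = 3;
    return digits(arr);
}

// Initializing in the declaration: these commas separate initializer values.
int initialized_in_declaration() {
    int arr[3] = { 1, 2, 3 };
    return digits(arr);
}

// Hypothetical two-parameter function that returns its first argument.
int first_arg(int n, bool) { return n; }

// The parentheses turn "42, 32" into one comma expression whose value is 32.
int parenthesized_call() { return first_arg((42, 32), true); }
```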