Accessing arrays by index[array] in C and C++ - c++

There is this little trick question that some interviewers like to ask for whatever reason:
int arr[] = {1, 2, 3};
2[arr] = 5; // does this line compile?
assert(arr[2] == 5); // does this assertion fail?
From what I understand, a[b] gets converted to *(a + b), and since addition is commutative the order of the operands doesn't matter, so 2[a] is really *(2 + a) and works fine.
Is this guaranteed to work by C and/or C++'s specs?

Yes. 6.5.2.1 paragraph 1 (C99 standard) describes the arguments to the [] operator:
One of the expressions shall have type "pointer to object type", the other expression shall have integer type, and the result has type "type".
6.5.2.1 paragraph 2 (emphasis added):
A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th
element of E1 (counting from zero).
It says nothing requiring the order of the arguments to [] to be sane.

In general 2[a] is identical to a[2], and this is guaranteed to be equivalent in both C and C++ (assuming no operator overloading), because, as you mentioned, it translates into *(2+a) or *(a+2), respectively. Because the plus operator is commutative, the two forms are equivalent.
Although the forms are equivalent, please for the sake of all that's holy (and future maintenance programmers), prefer the "a[2]" form over the other.
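As a minimal sketch of the equivalence described above (the function name write_reversed_read_normal is made up for this example, and only the built-in subscript operator on a plain array is assumed), the reversed form can be checked against the usual one:

```cpp
#include <cassert>

// Mirrors the interview snippet: writes through the reversed form 2[arr]
// and reads the element back through the usual form arr[2].
int write_reversed_read_normal() {
    int arr[] = {1, 2, 3};
    2[arr] = 5;                  // same as *(2 + arr) = 5

    // Both spellings designate the same object, i.e. the same address.
    assert(&2[arr] == &arr[2]);
    assert(&arr[2] == arr + 2);

    return arr[2];
}
```

Calling write_reversed_read_normal() should return 5. This only applies to the built-in operator; a class with an overloaded operator[] (std::vector, for instance) does not accept the reversed form.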
P.S., If you do get asked this at an interview, please do exact revenge on behalf of the C/C++ community and make sure that you ask the interviewer to list all trigraph sequences as a precondition to you giving your answer. Perhaps this will disenchant him/her from asking such (worthless, with regard to actually programming anything) questions in the future. In the odd event that the interviewer actually knows all nine of the trigraph sequences, you can always make another attempt to stomp them with a question about the destruction order of virtual base classes - a question that is just as mind bogglingly irrelevant for everyday programming.

Related

What does the term 'equivalent' mean in C++ standard?

According to the draft of the C++11 standard, page 421, Table 23 (CopyAssignable), the post-condition of an expression t = v for a CopyAssignable type is
t is equivalent to v, the value of v is unchanged
But I'm not sure what the term 'equivalent' means here. Does it mean t == v? Or something like all bytes being 'deeply' equivalent, in the sense of a 'deep copy'?
As far as I can tell, there is no separate definition of the term "equivalent" that would apply in the standard. It may be interpreted as plain English. Here are a few definitions from dictionaries:
corresponding or virtually identical especially in effect or function
equal to or having the same effect as something
something that is the same amount, price, size, etc. as something else or has the same purpose as something else
Another interpretation is that your quote is the definition for equivalent in this context i.e. the meaning of equality for the type in question is defined by the assignment operator of that type.
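One way to see why "equivalent" cannot simply mean t == v is that a CopyAssignable type is not required to provide operator== at all. The sketch below uses a hypothetical Celsius type and a hypothetical assign_from helper (neither comes from the standard); it illustrates the second interpretation rather than giving a definition:

```cpp
#include <cassert>

// Hypothetical type: copy-assignable, but with no operator== defined,
// so the expression `t == v` would not even compile for it.
struct Celsius {
    double degrees;
};

// Assigns v to a fresh object and returns that object.
Celsius assign_from(const Celsius& v) {
    Celsius t{0.0};
    const double before = v.degrees;
    t = v;                           // the expression t = v from Table 23
    assert(v.degrees == before);     // post-condition: the value of v is unchanged
    return t;
}
```

Here "t is equivalent to v" can only be checked member by member (t.degrees against v.degrees), which is exactly what the implicitly generated assignment operator establishes for this type.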

Is (a=1)=2 undefined behaviour in C++98?

Similar code, for example (a+=1)%=7;, where a is an int variable.
We know that operator += or = is not a sequence point, so we have two side effects between two adjacent sequence points. (We are using C++98's sequence point rules here.)
However, assignment operators like += or = are guaranteed to return the lvalue after the assignment, which means the order of execution is to some degree "defined".
So is this undefined behaviour?
(a=1)=2 was undefined prior to C++11, as the = operator did not introduce a sequence point and therefore a is modified twice without an intervening sequence point. The same applies to (a+=1)%=7.
The text was:
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
It's worth mentioning that the description of the assignment operator is defective:
The result of the assignment operation is the value stored in the left operand after the assignment has taken place; the result is an lvalue.
If the result is an lvalue then the result cannot be the stored value (that would be an rvalue). Lvalues designate memory locations. This sentence seems to imply an ordering relation, but regardless of how we want to interpret it, it doesn't use the term "sequence point" and therefore the earlier text about sequence points applies.
If anything, that wording casts a bit of doubt on expressions like (a=1) + 2. The C++11 revision of sequencing straightened out all these ambiguities.
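For comparison, here is a small sketch of how the two expressions behave once the C++11 sequencing rules apply. It assumes a C++11 or later compiler; the function names assign_twice and add_then_mod are made up for this example, and under the C++98/03 wording quoted above the same code is undefined.

```cpp
#include <cassert>

// Assumes C++11 or later: the inner assignment is sequenced before the
// outer one, so the final stored value is that of the outer assignment.
int assign_twice() {
    int a = 0;
    (a = 1) = 2;
    return a;
}

int add_then_mod(int a) {
    // The result of a compound assignment is an lvalue designating a.
    int& result = ((a += 1) %= 7);
    assert(&result == &a);
    return a;
}
```

With these rules assign_twice() yields 2, add_then_mod(6) yields 0 (6 + 1 = 7, then 7 % 7), and add_then_mod(2) yields 3. Some compilers may still emit a sequence-point warning for code like this.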

C++ Array Definition with Lower and Upper Bound?

My daughter's 12th standard C++ textbook says that
the notation of an array can (also) be given as follows: Array name
[lower bound L, upper bound U]
This was a surprise for me. I know Pascal has this notation, but C++? Had never seen this earlier. I wrote a quick program in her prescribed compiler (the ancient Turbo C++ 4.5), and that does not support it. Did not find this syntax in Stanley Lippman's book either. Internet search did not throw up this. Or maybe I didn't search correctly?
So, is it a valid declaration?
This is not valid. According to section 8.3.4 (Arrays) of the draft C++ standard, the declarator must be of this form:
D1 [ constant-expression_opt ] attribute-specifier-seq_opt
and from section 5.19 (Constant expressions) we can see that the grammar for a constant expression is:
constant-expression:
conditional-expression
This grammar does not allow us to get to the comma operator either to do something like this:
int a[ 1, 2 ] ;
        ^
as others have implied, since there is no path to the comma operator from conditional-expression. If you add parentheses, however, we can get to the comma operator, since conditional-expression allows us to get to primary-expression, which gets us (), so the following would be valid:
int a[ (1, 2) ] ;
       ^    ^
Note, in C++03 you were explicitly forbidden from using the comma operator in a constant expression.
No it's not true, unless someone has overloaded the comma operator and possibly [] as well which is very unlikely. (Boost Spirit does both but for very different reasons).
Without any overloading at all, Array[x, y] is syntactically invalid as a declaration, since the size must be a constant-expression and these cannot contain an unparenthesized comma operator; with the comma it would be an expression rather than a conditional-expression.
Burn the book and put Stroustrup in her Christmas stocking!
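To make the parenthesized case concrete, here is a minimal sketch. It assumes C++11 or later (C++03 forbade the comma operator in constant expressions, as noted above), and the function name parenthesized_comma_bound is made up for this example. The bound is the value of the comma expression, so the result is an ordinary zero-based array, not one with a lower and upper bound.

```cpp
#include <cassert>
#include <cstddef>

std::size_t parenthesized_comma_bound() {
    // int a[ 1, 2 ];   // ill-formed: no path from constant-expression to the comma operator
    int a[ (1, 2) ];    // the bound is the value of (1, 2), which is 2

    assert(sizeof(a) == 2 * sizeof(int));
    return sizeof(a) / sizeof(a[0]);   // number of elements
}
```

The function should return 2, and valid indices are still 0 and 1. Expect a compiler warning that the left operand of the comma has no effect.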

Difference between sequence points and operator precedence? 0_o

Let me present an example:
a = ++a;
The above statement is said to have undefined behavior (I already read the article on UB on SO),
but according to the precedence rules, the prefix ++ operator has higher precedence than the assignment operator =,
so a should be incremented first and then assigned back to a. Every evaluation is known, so why is it UB?
The important thing to understand here is that operators can produce values and can also have side effects.
For example ++a produces (evaluates to) a + 1, but it also has the side effect of incrementing a. The same goes for a = 5 (evaluates to 5, also sets the value of a to 5).
So what you have here is two side effects which change the value of a, both happening between sequence points (the visible semicolon and the end of the previous statement).
It does not matter that due to operator precedence the order in which the two operators are evaluated is well-defined, because the order in which their side effects are processed is still undefined.
Hence the UB.
Precedence is a consequence of the grammar rules for parsing expressions. The fact that ++ has higher precedence than = only means that ++ binds to its operand "tighter" than =. In fact, in your example, there is only one way to parse the expression because of the order in which the operators appear. In an example such as a = b++ the grammar rules or precedence guarantee that this means the same as a = (b++) and not (a = b)++.
Precedence has very little to do with the order of evaluation of expression or the order in which the side-effects of expressions are applied. (Obviously, if an operator operates on another expression according to the grammar rules - or precedence - then the value of that expression has to be calculated before the operator can be applied but most independent sub-expressions can be calculated in any order and side-effects also processed in any order.)
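A small sketch may help separate the two ideas. The functions f, g and h and the call_order log are hypothetical names introduced only for this example. Precedence fixes how the expression is grouped, so the value is always the same, but it does not fix the order in which the three calls are made.

```cpp
#include <cassert>
#include <string>

std::string call_order;  // records the order in which the operands were evaluated

int f() { call_order += 'f'; return 1; }
int g() { call_order += 'g'; return 2; }
int h() { call_order += 'h'; return 3; }

int evaluate() {
    call_order.clear();
    // Precedence guarantees this is parsed as f() + (g() * h()),
    // but the three calls may be made in any order.
    int result = f() + g() * h();
    assert(call_order.size() == 3);  // all three ran, in some order
    return result;
}
```

evaluate() always yields 7, while call_order may hold "fgh", "ghf" or any other permutation, depending on the compiler. When the operands modify the same variable instead of appending to a log, that freedom is what leads to the undefined behavior discussed here.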
why it is UB ?
Because it is an attempt to change the variable a two times before one sequence point:
++a
operator=
Sequence point evaluation #6: At the end of an initializer; for example, after the evaluation of 5 in the declaration int a = 5; (from Wikipedia).
You're trying to change the same variable, a, twice: ++a changes it, and the assignment (=) changes it. But the next sequence point isn't reached until the end of the assignment. So, while it makes complete sense to us, the standard doesn't guarantee the right behavior, because it says not to modify an object more than once between two sequence points (to put it simply).
It's kind of subtle, but it could be interpreted as either of the following (and the compiler doesn't know which):
a=(a+1);a++;
a++;a=a;
This is because of some ambiguity in the grammar.

Is "++l *= m" undefined behaviour?

I have started studying C++0x. I came across the following expression somewhere:
int l = 1, m=2;
++l *= m;
I have no idea whether the second expression has well defined behavior or not. So I am asking it here.
Isn't it UB? I am just eager to know.
The expression is well defined in C++0x. A very Standardese quoting FAQ is given by Prasoon here.
I'm not convinced that such a high ratio of (literal Standards quotes : explanatory text) is preferable, so I'm giving an additional small explanation: Remember that ++L is equivalent to L += 1, and that the value computation of that expression is sequenced after the increment of L. And in a *= b, value computation of expression a is sequenced before assignment of the multiplication result into a.
What side effects do you have?
Increment
Assignment of the multiplication result
Both side-effects are transitively sequenced by the above two sequenced after and sequenced before.
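Under that reading, the expression behaves like the increment followed by the multiplication. A minimal sketch, assuming a C++11 or later compiler (the function name increment_then_multiply is made up for this example):

```cpp
#include <cassert>

int increment_then_multiply(int l, int m) {
    const int expected = (l + 1) * m;  // increment first, then multiply

    ++l *= m;   // ++l is l += 1; its side effect is sequenced before the *= assignment

    assert(l == expected);
    return l;
}
```

For the values in the question, increment_then_multiply(1, 2) should give 4. Under the older sequence-point wording the same expression would not be guaranteed to behave this way.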
In the code above, prefix ++ has precedence over *=, and so gets executed first. The result is that l equals 4.
UPDATE: It is indeed undefined behavior. My assumption that precedence ruled was false.
The reason is that l is both an lvalue and rvalue in *=, and also in ++. These two operations are not sequenced. Hence l is written (and read) twice "without a sequence point" (old standard wording), and behavior is undefined.
As a sidenote, I presume your question stems from changes regarding sequence points in C++0x. C++0x has changed wording regarding "sequence points" to "sequenced before", to make the standard clearer. To my knowledge, this does not change the behavior of C++.
UPDATE 2: It turns out there actually is a well defined sequencing as per sections 5.17(1), 5.17(7) and 5.3.2(1) of the N3126 draft for C++0x. @Johannes Schaub's answer is correct, and documents the sequencing of the statement. Credit should of course go to his answer.