Strange behavior of arr[-1] in C++

#include <cstdio>  // printf is declared here, not in <iostream>
using namespace std;
int main(){
int arr[] = {1,2,3};
printf("outside loop trail 1: arr[-1] = %d \n", arr[-1]);
for(int i = 0; i<10; i++){
printf("\ninside loop trail i = %d, arr[-1] = %d \n", i, arr[-1]);
}
}
Question:
Why is the output inside the loop the sequence 0, 1, 2 (the same as the loop index i), while the output outside the loop changes every time I execute the code? Thanks!
Output after
g++ -o explore explore.cpp && ./explore
outside loop trail 1: arr[-1] = 537839344
inside loop trail i = 0, arr[-1] = 0
inside loop trail i = 1, arr[-1] = 1
inside loop trail i = 2, arr[-1] = 2
run ./explore for a second time:
outside loop trail 1: arr[-1] = 1214220016
inside loop trail i = 0, arr[-1] = 0
inside loop trail i = 1, arr[-1] = 1
inside loop trail i = 2, arr[-1] = 2

This is actually covered in the standard. For example, C++17 [expr.add] /4 states:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 <= i + j <= n; otherwise, the behavior is undefined.
The reason I'm discussing adding pointers and integers is because of the equivalence of array[index] and *(array + index), as per C++17 [expr.sub] /1 (that's sub as in subscripting, not subtraction):
The expression E1[E2] is identical (by definition) to *((E1)+(E2)).
That's a lot to take in, but it basically means that adding an index to a pointer to an array element must give you a pointer to either an element of the array or the position just beyond the last one(1).
Since a pointer before the first one (array[-1]) does not meet that requirement, it's undefined behaviour. Once you do that, all bets are off and the implementation is free to do what it likes. You can count yourself lucky it didn't erase your hard disk after playing derisive_laughter.ogg :-)
Note that there's nothing wrong per se with a negative index, the following code gives you the second element (the final "pointer" is still within the array):
int array[100];
int *ptrThird = &(array[2]);
int second = ptrThird[-1];
(1) Note that a pointer is allowed to point just beyond the array provided you don't try to dereference it. Unfortunately, array[index] is a dereferencing operation so, while int array[10]; int *p = &(array[10]); is valid, int x = array[10]; is not.
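To make the two allowed cases concrete, here is a minimal sketch. The helper names elementBefore and sumRange are made up for illustration and are not from the question. The first uses a negative index that stays inside the array; the second forms a one-past-the-end pointer and only compares against it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the element just before the one p points to.
// Precondition (assumed, not checkable here): p points to element 1 or later
// of an array, so p - 1 is still inside that array.
int elementBefore(const int* p) {
    assert(p != nullptr);
    return p[-1];  // by definition the same as *(p - 1)
}

// Hypothetical helper: sums n ints, using a one-past-the-end pointer as the stop marker.
int sumRange(const int* first, std::size_t n) {
    const int* last = first + n;  // one past the end: fine to form and compare
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;  // only pointers to real elements are dereferenced
    }
    return total;
}
```

With int a[] = {1, 2, 3};, elementBefore(&a[2]) reads a[1], whereas elementBefore(a) would be the same mistake as arr[-1] in the question.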

This is undefined behavior.

Generally speaking, an array index used this way is equivalent to doing pointer math thusly:
arr[n] -> *(arr+n)
By using a negative index, you are referencing memory before the start of the memory block associated with the array data. If you use an index that is outside the bounds of the array, the result is, as others have pointed out, undefined.
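If you want to catch such an index rather than rely on luck, one option is to check it before doing the pointer arithmetic. The sketch below is only an illustration. The names debugCheckedAt and checkedAt are invented here, and it assumes a built-in array of int whose size is known at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Debug-time check: assert stops the program in debug builds on a bad index.
template <std::size_t N>
int debugCheckedAt(const int (&arr)[N], std::ptrdiff_t index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < N);
    return *(arr + index);  // this is what arr[index] means
}

// Run-time check: returns nullptr instead of touching memory outside the array.
template <std::size_t N>
const int* checkedAt(const int (&arr)[N], std::ptrdiff_t index) {
    if (index < 0 || static_cast<std::size_t>(index) >= N) {
        return nullptr;
    }
    return arr + index;
}
```

For int a[] = {1, 2, 3};, checkedAt(a, -1) yields nullptr, so the caller can report the error instead of reading whatever happens to sit before the array.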

expr.sub/1:
A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions shall be a glvalue of type “array of T” or a prvalue of type “pointer to T” and the other shall be a prvalue of unscoped enumeration or integral type. The result is of type “T”. The type “T” shall be a completely-defined object type. The expression E1[E2] is identical (by definition) to *((E1)+(E2)), except that in the case of an array operand, the result is an lvalue if that operand is an lvalue and an xvalue otherwise. The expression E1 is sequenced before the expression E2.
expr.add/4:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
(4.2) Otherwise, if P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n and the expression P - J points to the (possibly-hypothetical) element x[i - j] if 0 ≤ i − j ≤ n.
(4.3) Otherwise, the behavior is undefined.

Related

When should I decrement a variable inside of a bracket and when should I do it outside?

I was implementing a version of insertion sort when I noticed my function did not work properly if implemented the following way. This version is supposed to sort the elements as they are copied into a new array while keeping the original intact.
vector<int> insertionSort(vector<int>& heights) {
vector<int> expected(heights.size(), 0);
int j, key;
for(int i = 0; i < expected.size(); i++){
expected[i] = heights[i];
j = i-1;
key = expected[i];
while(j >= 0 && expected[j] > key){
expected[j+1] = expected[j--];
}
expected[j+1] = key;
}
return expected;
}
I noticed that when doing expected[j--] the function does not work as it should but when I decrement outside of the bracket it works fine.
In other words, what is the difference between
while(j >= 0 && expected[j] > key){
expected[j+1] = expected[j--];
}
and
while(j >= 0 && expected[j] > key){
expected[j+1] = expected[j];
--j;
}
To answer this, we need to look at the order in which the operands of expected[j+1] = expected[j--]; are evaluated. Looking at cppreference's page on order of evaluation, we see the following applies for C++17 and newer:
In every simple assignment expression E1 = E2 and every compound assignment expression E1 #= E2, every value computation and side effect of E2 is sequenced before every value computation and side effect of E1
In your case, this means that every value computation and side effect of expected[j--] completes before the evaluation of expected[j+1] begins. In particular, j+1 is based on the value j has after you've decremented it with j--, not the value it had before.
Prior to C++17, the evaluations of the left-hand side and the right-hand side of the assignment were unsequenced relative to each other. This means that in C++14 and earlier, your code exhibits undefined behavior:
If a side effect on a memory location is unsequenced relative to a value computation using the value of any object in the same memory location, the behavior is undefined.
In this case "a memory location" is j and the decrement in j-- is unsequenced relative to the value computation j+1. This is very similar to cppreference's example of undefined behavior:
a[i] = i++; // undefined behavior until C++17
In the second version of your code, the decrement to j does not take place until after the assignment has been completed.
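For reference, here is one way the function could be written so that no index variable is both read and modified in the same expression. This is a sketch and not the asker's final code: the name insertionSortCopy, the const reference parameter, and the unsigned index are choices made here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies heights into a new vector, keeping the copy sorted as it grows.
// The input is left untouched.
std::vector<int> insertionSortCopy(const std::vector<int>& heights) {
    std::vector<int> expected(heights.size(), 0);
    for (std::size_t i = 0; i < heights.size(); ++i) {
        const int key = heights[i];
        std::size_t j = i;  // candidate slot for key
        while (j > 0 && expected[j - 1] > key) {
            expected[j] = expected[j - 1];  // shift the larger element right
            --j;                            // decrement in its own statement
        }
        assert(j <= i);  // key never moves to the right of its original slot
        expected[j] = key;
    }
    return expected;
}
```

Calling insertionSortCopy on a vector holding 3, 1, 2 returns a new vector in ascending order and leaves the argument as it was.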

Is memmove copying 0 bytes but referencing out of bounds safe

I have read online that memmove is expected to perform no action if the number of bytes to copy is 0. However, what I want to know is whether it is guaranteed that the source and destination pointers will not be read in that case.
Below is a simplified version of some of my code, the section I am interested in is shiftLeft:
#include <array>
#include <cstring>
#include <iostream>
class Foo final {
unsigned just = 0;
unsigned some = 0;
unsigned primitives = 0;
};
template <unsigned Len>
class Bar final {
unsigned depth = 0;
std::array<Foo, Len> arr;
public:
Bar() = default;
// Just an example
void addFoo() {
arr[depth] = Foo();
depth++;
}
void shiftLeft(unsigned index) {
// This is what my question focuses on
// If depth is 10 and index is 9 then index + 1 is out of bounds
// However depth - index - 1 would be 0 then
std::memmove(
&arr[index],
&arr[index + 1],
(depth - index - 1) * sizeof(Foo)
);
depth--;
}
};
int main() {
Bar<10> bar;
for (unsigned i = 0; i < 10; ++i)
bar.addFoo();
bar.shiftLeft(9);
return 0;
}
When Len is 10, depth is 10, and index is 9 then index + 1 would read out of bounds. However also in that case depth - index - 1 is 0 which should mean memmove would perform no action. Is this code safe or not?
The memmove function will copy n bytes. If n is zero, it will do nothing.
The only possible issue is with this, where index is already at the maximum value for array elements:
&arr[index + 1]
However, you are permitted to refer to array elements (in terms of having a pointer point to them) within the array or the hypothetical element just beyond the end of the array.
You may not dereference the latter but you're not doing that here. In other words, while arr[index + 1] on its own would attempt a dereference and therefore be invalid, evaluating the address of it is fine.
This is covered, albeit tangentially, in C++20 [expr.add]:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined.
Note the if 0 ≤ i + j ≤ n clause, particularly the final ≤. For an array int x[10], the expression &(x[10]) is valid.
It's also covered in [basic.compound] (my emphasis):
A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively.
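One caveat: the reasoning above is about built-in arrays, while the question uses std::array, whose operator[] is documented as requiring index < size(). A more conservative sketch, which is an assumption of this note and not part of the original answer, does the arithmetic on the pointer returned by data(), so only the one-past-the-end pointer rule is involved. The free function and the int element type are simplifications made here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical free-function version of shiftLeft for a std::array of ints.
// Removes the element at index by moving the tail one slot left.
// Returns the new depth.
template <std::size_t Len>
std::size_t shiftLeft(std::array<int, Len>& arr, std::size_t depth, std::size_t index) {
    assert(depth <= Len && index < depth);
    int* base = arr.data();
    // base + index + 1 may be one past the end. Forming that pointer is allowed,
    // and with a byte count of 0 memmove copies nothing.
    std::memmove(base + index, base + index + 1, (depth - index - 1) * sizeof(int));
    return depth - 1;
}
```

With a full std::array<int, 3> and index 2, the byte count is 0 and the contents stay unchanged; only the returned depth drops to 2.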

Why indexes are different from regular variables

Can anyone please explain why, in C++,
array[i] = array[++i]; doesn't work like array[i] = array[i + 1];
but i = ++i; does work like i = i + 1;
The value computations of the left and right operands of the assignment operator are unsequenced relative to each other (before C++17).
Thus this expression
array[i] = array[++i];
can behave either like
array[i] = array[i + 1];
or like
array[i + 1] = array[i + 1];
On the other hand (C++ Standard)
...in all cases, the assignment is sequenced after the value computation of the right and left operands,
Thus in this expression
i = ++i;
the left operand will be overwritten by the value of ++i.
The answer to your question concerns the values produced by pre-increment and post-increment.
The expression ++i increments the variable, then returns the incremented value.
The expression i++ increments the variable, then returns the value it had before incrementing.
So the expressions array[i + 1] and array[++i] refer to the same element, except that array[++i] has the side effect of incrementing the index variable.
The problem you are having is that you have misunderstood what is happening with ++i.
++i is an expression that increments i and yields the incremented value for use in the enclosing expression. The reason ++i works like adding one and assigning is that this is essentially what it does. These statements are equivalent:
array[i] = array[ ++i ]
array[i] = array[ i = i + 1 ]
Remember that an assignment expression yields the new value of its left operand (the value just assigned). The only reason this differs from writing array[i + 1] is the side effect: i has been changed.
Now, about i = ++i. Using the same logic we can replace that with
i = (i = i + 1)
Once the inner assignment has run, its result is the new value of i, so the whole expression is equivalent to:
i = i + 1
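Whichever standard you compile against, the ambiguity disappears if the increment gets its own statement. The sketch below spells out the two readings mentioned in the answers as separate helpers. The names copyNextThenAdvance and advanceThenCopy are invented for this illustration, and the caller is assumed to keep i + 1 inside the array.

```cpp
#include <cassert>
#include <cstddef>

// Reading 1: array[i] = array[i + 1], with i advanced afterwards.
void copyNextThenAdvance(int* array, std::size_t& i) {
    assert(array != nullptr);
    array[i] = array[i + 1];
    ++i;
}

// Reading 2: i is advanced first, so both sides name the same element.
void advanceThenCopy(int* array, std::size_t& i) {
    assert(array != nullptr);
    ++i;
    array[i] = array[i];  // self-assignment, the array is unchanged
}
```

Starting from {10, 20, 30} with i = 0, the first helper leaves {20, 20, 30} and the second leaves the array as it was; in both cases i ends up as 1.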

C++: Post-Increments resulted in the same value

The following sequence of post-increments gives these results:
n = 1;
j = n++; //j = 1, n = 2
j = n++; //j = 2, n = 3
j = n++; //j = 3, n = 4
My question is: why does the following result in n = 1 and not n = 3?
n = 1;
n = n++; //n = 1
n = n++; //n = 1
n = n++; //n = 1
If the code was done with pre-increment of n (++n), the result is n = 4 which is to be expected. I know the second code segment should never be done like that in the first place but it is something that I came across and I was curious as to why it resulted like that.
Please advise.
Your second example is not allowed and has undefined behaviour (in standards before C++17).
You should use a temporary variable if you need something like that, but you will hardly ever need it.
Quoting Wikipedia:
Since the increment/decrement operator modifies its operand, use of such an operand more than once within the same expression can produce undefined results. For example, in expressions such as x − ++x, it is not clear in what sequence the subtraction and increment operators should be performed. Situations like this are made even worse when optimizations are applied by the compiler, which could result in the order of execution of the operations to be different from what the programmer intended.
Other examples from the C++11 standard include:
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
i = i + 1; // the value of i is incremented
f(i = -1, i = -1); // the behavior is undefined
The other answers explain correctly that this code results in undefined behaviour. You may be interested as to why the behaviour is as you see it on your compiler.
As far as most compilers are concerned the expression x = n++ will be compiled into the following fundamental instructions:
take a copy of n, call it n_copy;
add 1 to n
assign n_copy to x
Therefore the expression n = n++ becomes:
take a copy of n, call it n_copy;
add 1 to n
assign n_copy to n
Which is logically equivalent to:
assign n to n
Which is logically equivalent to:
do nothing.
That's why in your case you see n == 1. Not all compilers will necessarily produce the same answer.
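The three steps above can be written out as an ordinary function, which also gives a well-defined way to observe the same result. This is a sketch: postIncrement and repeatSelfAssign are names made up here, and real compilers are not required to implement n++ this way.

```cpp
#include <cassert>

// Hypothetical spelled-out version of n++.
int postIncrement(int& n) {
    int copy = n;  // take a copy of n
    n = n + 1;     // add 1 to n
    return copy;   // hand back the value from before the increment
}

// Applies n = postIncrement(n) the given number of times and returns the final n.
int repeatSelfAssign(int n, int times) {
    assert(times >= 0);
    for (int k = 0; k < times; ++k) {
        // The call finishes before the assignment stores its result,
        // so the increment is overwritten by the old value each time.
        n = postIncrement(n);
    }
    return n;
}
```

Because the function call completes before the assignment happens, repeatSelfAssign(1, 3) returns 1, which matches the output in the question, while j = postIncrement(n) behaves like the first snippet.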

Two dimensional array, what does *(pointerArray[i] + j)?

I just got this task of finding out how this code works.
int array[rows][coloums];
int *pointerArray[rows];
for (int i = 0; i < rows; i++) {
pointerArray[i] = array[i];
for (int j = 0; j < coloums; j++) {
*(pointerArray[i] + j) = 0;
}
}
The thing I'm curious about is *(pointerArray[i] + j). I think it's the same thing as pointerArray[i][j], since you can access the element both ways. But can anyone tell me what is actually happening with the *()? How does the compiler know that I'm asking for the same thing as pointerArray[i][j]?
Thanks for the answers!
When you write pointerArray[i] + j, you take the element pointerArray[i], which is an int*, and advance that pointer by j elements (the result is also an int*). The *(...) then dereferences that pointer and yields the int at that position. * is called the dereference operator (in this case). So yes, it's equivalent to pointerArray[i][j].
In this context, the * operator is the dereference operator. Its operand is a memory location, and the result is the value stored there.
The parentheses group the addition so that the compiler applies the dereference to the result of the addition. It's simply a case of order-of-operations.
Keep in mind that the [] operator does the same thing as the dereference operator: a[b] is defined as *(a + b), and an array name converts to a pointer to its first element in most expressions. If you imagine a two-dimensional array as a 2D grid of values with rows and columns, in memory the data is laid out such that each row is strung one after the next in sequential order. The first index in the array (i) along with the type of the array (int) tells the compiler at what offset to look for the first location in the row. The second index in the array (j) tells it at what offset within that row to look.
*(pointerArray[i] + j) basically means: "Find the beginning of the ith row of data in pointerArray, then pick the jth element of that row, and give me that value."
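A small sketch can confirm that both spellings reach the same object. The sizes rows = 2 and columns = 3 and the helper names setViaPointer and sameElement are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t rows = 2;     // assumed size for the sketch
constexpr std::size_t columns = 3;  // assumed size for the sketch

// Writes value into array[i][j] using the pointer form from the question.
void setViaPointer(int (&array)[rows][columns], std::size_t i, std::size_t j, int value) {
    assert(i < rows && j < columns);
    int* rowStart = array[i];  // array[i] converts to a pointer to array[i][0]
    *(rowStart + j) = value;   // identical to rowStart[j] = value
}

// Returns true when rowStart + j and &array[i][j] are the same address.
bool sameElement(int (&array)[rows][columns], std::size_t i, std::size_t j) {
    assert(i < rows && j < columns);
    int* rowStart = array[i];
    return (rowStart + j) == &array[i][j];
}
```

After setViaPointer(grid, 1, 2, 0), reading grid[1][2] gives 0, and sameElement returns true for every valid pair of indices.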