I have a pointer which is defined as follows:
A ***b;
What does accessing it as follows do:
A** c = b[-1]
Is it an access violation because we are using a negative index to an array? Or is it a legal operation similar to *--b?
EDIT Note that negative array indexing has different support in C and C++. Hence, this is not a dupe.
X[Y] is identical to *(X + Y) as long as one of X and Y is of pointer type and the other has integral type. So b[-1] is the same as *(b - 1), which is an expression that may or may not be evaluated in a well-formed program – it all depends on the initial value of b! For example, the following is perfectly fine:
int q[24];
int * b = q + 13;
b[-1] = 9;
assert(q[12] == 9);
In general, it is your responsibility as a programmer to guarantee that pointers have permissible values when you perform operations with them. If you get it wrong, your program has undefined behaviour. For example:
int * c = q; // q as above
c[-1] = 0; // undefined behaviour!
Finally, just to reinforce the original statement, the following is fine, too:
std::cout << 2["Good morning"] << 4["Stack"] << 8["Overflow\n"];
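Applied to the pointer type from the question, the same reasoning can be sketched as follows. The struct A and the array rows are hypothetical stand-ins, since the question does not show what b actually points to; the only point is that b[-1] is fine when b points past the first element of an array.

```cpp
#include <cassert>

struct A { int id; };  // hypothetical stand-in for the question's type A

// Returns the id reached through b[-1]; rows and x are illustrative names.
int negative_index_demo() {
    A x{1};
    A* px = &x;                              // an A*
    A** rows[3] = {nullptr, &px, nullptr};   // an array of A**
    A*** b = rows + 2;                       // b points at rows[2]
    A** c = b[-1];                           // same as *(b - 1), i.e. rows[1]
    assert(c == rows[1]);
    return (**c).id;
}
```

Had b been initialised to rows itself, b[-1] would be undefined behaviour, just like c[-1] in the int example above.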
Say I have an array like so:
int val[10];
and I intentionally index it with everything from negative values to anything higher than 9, but WITHOUT using the resulting value in any way. This would be for performance reasons (perhaps it's more efficient to check the input index AFTER the array access has been made).
My questions are:
Is it safe to do so, or will I run into some sort of memory protection barriers, risk corrupting memory or similar for certain indices?
Is it perhaps not at all efficient if I access data out of range like this? (assuming the array has no built in range check).
Would it be considered bad practice? (assuming a comment is written to indicate we're aware of using out of range indices).
It is undefined behavior. By definition, undefined means "anything could happen." Your code could crash, it could work perfectly, it could bring about peace and harmony amongst all humans. I wouldn't bet on the second or the last.
It is Undefined Behavior, and you might actually run afoul of the optimizers.
Imagine this simple code example:
int select(int i) {
int values[10] = { .... };
int const result = values[i];
if (i < 0 or i > 9) throw std::out_of_range("out!");
return result;
}
And now look at it from an optimizer point of view:
int values[10] = { ... };: valid indexes are in [0, 9].
values[i]: i is an index, thus i is in [0, 9].
if (i < 0 or i > 9) throw std::out_of_range("out!");: i is in [0, 9], never taken
And thus the function rewritten by the optimizer:
int select(int i) {
int values[10] = { ... };
return values[i];
}
For more amusing stories about forward and backward propagation of assumptions based on the fact that the developer is not doing anything forbidden, see What every C programmer should know about Undefined Behavior: Part 2.
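For comparison, here is a minimal sketch of the same function with the range check moved before the access, so the optimizer has nothing to assume away. The table contents and the name select_checked are made up for illustration, since the original initializer is elided.

```cpp
#include <stdexcept>

// Hypothetical table contents; the original example elides them.
int select_checked(int i) {
    static const int values[10] = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81};
    // Validate first: values[i] is only evaluated for i in [0, 9].
    if (i < 0 || i > 9) throw std::out_of_range("out!");
    return values[i];
}
```

With this ordering the check cannot be removed, because no earlier expression implies that i is already in range.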
EDIT:
Possible work-around: if you know that you will access from -M to +N you can:
declare the array with appropriate buffer: int values[M + 10 + N]
offset any access: values[M + i]
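The work-around above could look roughly like this. The concrete values of M and N, the struct name PaddedValues and the accessor at are assumptions chosen for the sketch, not part of the original answer.

```cpp
#include <cassert>

constexpr int M = 2;  // assumed: lowest logical index is -M
constexpr int N = 3;  // assumed: highest logical index is 9 + N

struct PaddedValues {
    int storage[M + 10 + N] = {};  // one real array, zero-initialized

    // Logical index i in [-M, 9 + N] maps to storage[M + i].
    int& at(int i) {
        assert(i >= -M && i < 10 + N);
        return storage[M + i];
    }
};
```

Every access stays inside a single array object, so logical indices such as -1 or 10 are ordinary in-bounds accesses.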
As verbose said, this yields undefined behavior. A bit more precision follows.
5.2.1/1 says
[...] The expression E1[E2] is identical (by definition) to *((E1)+(E2))
Hence, val[i] is equivalent to *((val)+(i)). Since val is an array, the array-to-pointer conversion (4.2/1) occurs before the addition is performed. Therefore, val[i] is equivalent to *(ptr + i) where ptr is an int* set to &val[0].
Then, 5.7/2 explains what ptr + i points to. It also says (emphasis mine):
[...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
In the case of ptr + i, ptr is the pointer operand and the result is ptr + i. According to the quote above, both should point to an element of the array or to one past the last element. That is, in the OP's case ptr + i is a well defined expression for all i = 0, ..., 10. Finally, *(ptr + i) is well defined for 0 <= i < 10 but not for i = 10.
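As a small illustration of that distinction, the sketch below forms ptr + 10 but only ever dereferences ptr + 0 through ptr + 9. The function name sum_values is made up for the example.

```cpp
// Sums a 10-element array using a one-past-the-end pointer as the loop bound.
int sum_values(const int (&val)[10]) {
    const int* ptr = val;       // array-to-pointer conversion: &val[0]
    const int* end = ptr + 10;  // one past the last element: may be formed
    int total = 0;
    for (const int* p = ptr; p != end; ++p) {
        total += *p;            // p is in [ptr, ptr + 9] here, so *p is valid
    }
    return total;               // end itself is compared but never dereferenced
}
```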
Edit:
I'm puzzled as to whether val[10] (or, equivalently, *(ptr + 10)) yields undefined behavior or not (I'm considering C++, not C). In some circumstances it does (e.g. int x = val[10]; is undefined behavior) but in others it is not so clear. For instance,
int* p = &val[10];
As we have seen, this is equivalent to int* p = &*(ptr + 10); which could be undefined behavior (because it dereferences a pointer to one past the last element of val) or the same as int* p = ptr + 10; which is well defined.
I found these two references which show how fuzzy this question is:
May I take the address of the one-past-the-end element of an array?
Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?
If you put it in a structure with some padding ints, it should be safe (since the pointer actually points to "known" destinations).
But it's better to avoid it.
struct SafeOutOfBoundsAccess
{
int paddingBefore[6];
int val[10];
int paddingAfter[6];
};
void foo()
{
SafeOutOfBoundsAccess a;
bool maybeTrue1 = a.val[-1] == a.paddingBefore[5];
bool maybeTrue2 = a.val[10] == a.paddingAfter[0];
}
I learned to use the XOR operator to swap two integers, like this:
int a = 21;
int b = 7;
a^=b^=a^=b;
I finally get a=7 and b=21.
I tried to use the XOR operator on an array in the same way:
int main()
{
int a[] = {7,21};
a[0]^=a[1]^=a[0]^=a[1];
cout << a[0] <<',' <<a[1];
return 0;
}
The output is 0,7
I compiled the code with Xcode and g++; both show the same issue.
XOR swap on the array works fine when split over multiple lines:
int main()
{
int a[] = {7,21};
a[0]^=a[1];
a[1]^=a[0];
a[0]^=a[1];
cout << a[0] <<',' <<a[1];
return 0;
}
I get 21,7 as output.
Here is what I have already found:
- the issue is about sequence points: Array + XOR swap fails
- even for simple integers, this undefined behavior may have side effects: Why is this statement not working in java x ^= y ^= x ^= y;
- another issue with XOR swap: Weird XOR swap behavior while zeroing out data
So I should avoid the XOR swap; swapping with a temporary guarantees a correct result.
But I am still not clear about what happens in a[0]^=a[1]^=a[0]^=a[1];. What is the sequence point issue with it?
I cannot figure out what the compiler does differently between a[0]^=a[1]^=a[0]^=a[1]; and a^=b^=a^=b;.
My question is:
"How does the compiler output 0,7 for a[0]^=a[1]^=a[0]^=a[1];?"
I know this is a sequence point issue. I can understand why printf("%d,%d",i++, i++); is undefined, since some compilers evaluate function arguments from left to right and some from right to left.
But I do not see the problem with a[0]^=a[1]^=a[0]^=a[1];, which looks just the same as a^=b^=a^=b;. So I'd like to know how it works with arrays, so that I understand more about things like "sequence points with array indexing".
You cannot modify a variable more than once without an intervening sequence point; if you do so, it is Undefined Behavior.
a^=b^=a^=b;
Trying to modify the values of a and b in the above statement breaks this rule, and you end up with Undefined Behavior.
Note that Undefined Behavior means that any behavior is possible and you can get any output.
Good Reads:
Undefined Behavior and Sequence Points
C-Faq
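A version that does not depend on evaluation order at all is the three-statement form from the question, or simply std::swap. The sketch below wraps both; the function names xor_swap and plain_swap are made up for illustration.

```cpp
#include <utility>

// XOR swap with each modification in its own full statement.
void xor_swap(int& x, int& y) {
    if (&x == &y) return;  // XOR-ing an object with itself would zero it
    x ^= y;
    y ^= x;
    x ^= y;
}

// The conventional alternative: let the library do it.
void plain_swap(int& x, int& y) {
    std::swap(x, y);
}
```

Both work the same way on plain ints and on array elements such as a[0] and a[1].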
int main() {
int a = 10;
int b = a * a++;
printf("%i %i", a, b);
return 0;
}
Is the output of the above code implementation dependent?
No. In
int b = a * a++;
the behavior is undefined, so the result can be anything - that's not what "implementation dependent" means.
You might wonder why it's UB here since a is modified only once. The reason is that paragraph 5/4 of the Standard also requires that the prior value shall be accessed only to determine the value to be stored. a shall only be read to determine the new value of a, but here a is read twice: once to compute the first operand of the multiplication and once again to compute the result of a++, which has the side effect of writing a new value into a. So even though a is modified only once here, it is undefined behavior.
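If the intent was "multiply a by its old value, then increment it" (an assumption, since the question does not say), the read and the write can be separated into two statements. The names Result and square_then_increment are hypothetical.

```cpp
struct Result { int a; int b; };

// Assumed intent: b gets old_a * old_a, then a is incremented.
Result square_then_increment(int a) {
    int b = a * a;  // both reads of a finish in this statement
    ++a;            // the only modification of a, in its own statement
    return {a, b};
}
```

Starting from a = 10, this yields a = 11 and b = 100 on every conforming compiler.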
What is the output of the following code:
int main() {
int k = (k = 2) + (k = 3) + (k = 5);
printf("%d", k);
}
It does not give any error. Why? I think it should, because the assignments are on the same line as the definition of k.
What I mean is that I expected something like int i = i; not to compile.
But it compiles. Why? What will the output be, and why?
int i = i compiles because 3.3.1/1 (C++03) says
The point of declaration for a name is immediately after its complete declarator and before its initializer
So i is initialized with its own indeterminate value.
However the code invokes Undefined Behaviour because k is being modified more than once between two sequence points. Read this FAQ on Undefined Behaviour and Sequence Points
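The point-of-declaration rule quoted above can be seen without reading any indeterminate value. In this sketch (the function name is made up), each variable is used inside its own initializer in a way that never reads its value.

```cpp
#include <cstddef>

bool self_reference_demo() {
    void* p = &p;               // p is already in scope; only its address is taken
    std::size_t n = sizeof n;   // sizeof does not evaluate n, so nothing is read
    return p == static_cast<void*>(&p) && n == sizeof(std::size_t);
}
```

This only shows why such declarations compile; it does not make the multiple assignments to k in the question well defined.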
int i = i; first defines the variable and then assigns a value to it. In C you can read from an uninitialized variable. It's never a good idea, and some compilers will issue a warning message, but it's possible.
And in C, assignments are also expressions. The output will be "10", or it would be if you had a 'k' there, instead of an 'a'.
Wow, I got 11 too. I think k is getting assigned to 3 twice and then once to 5 for the addition. Making it just int k = (k=2)+(k=3) yields 6, and int k = (k=2)+(k=4) yields 8, while int k = (k=2)+(k=4)+(k=5) gives 13. int k = (k=2)+(k=4)+(k=5)+(k=6) gives 19 (4+4+5+6).
My guess? The addition is done left to right. The first two (k=x) expressions are added, and the result is stored in a register or on the stack. However, since it is k+k for this expression, both values being added are whatever k currently is, which is the second expression because it is evaluated after the other (overriding its assignment to k). However, after this initial add, the result is stored elsewhere, so it is now safe from tampering (changing k will not affect it). Moving from left to right, each successive addition reassigns k (not affecting the running sum), and adds k to the running sum.
Can you please explain this code? It seems a little confusing to me
Is "a" a double array? I would think it's just an integer, but then in the cout statement it's used as a double array. Also, the for loop condition says a<3[b]/3-3, which makes no sense to me, yet the code compiles and runs. I'm just having trouble understanding it; it seems syntactically incorrect to me.
int a,b[]={3,6,5,24};
char c[]="This code is really easy?";
for(a=0;a<3[b]/3-3;a++)
{
cout<<a[b][c];
}
Array accessors are almost syntactic sugar for pointer arithmetic. a[b] is equivalent to b[a] is equivalent to *(a+b).
That said, using index[array] rather than array[index] is utterly horrible and you should never use it.
Wow. This is really funky. This isn't really a 2-dimensional array. It works because c is an array and there is an identity in the C language that treats this
b[3]
as the same as this
3[b]
So this code translates into a loop that increments a while a < (24/3-3), since 3[b] is the same as b[3] and b[3] is 24. Then it uses a[b] (which is the same as b[a]) as an index into the array c.
So, un-obfuscated, this code is
int a;
int b[] = {3,6,5,24};
char c[] = "This code is really easy?";
for (a = 0; a < 5; a++)
{
cout << c[b[a]];
}
which is broken since b[4] doesn't exist, so the output should be the characters at indices 3, 6, 5 and 24 of the string c, or
soc?
followed by some random character or a crash.
No, two variables are declared in the first statement: int a and int b[].
a[b][c] is just a tricky way of saying c[b[a]], that is because of the syntax for arrays: b[0] and 0[b] are equivalent.
int a,b[]={3,6,5,24};
Declares two variables, an int a and an array of ints b
char c[]="This code is really easy?";
Declares an array of char with the given string
for(a=0;a<3[b]/3-3;a++)
Iterates a through the range [0..4]:
3[b] is another way of saying b[3], which is 24.
24 / 3 = 8
8 - 3 = 5
cout << a[b][c];
This outputs the following result:
a[b] is equivalent to b[a], which will be b[0..4]
b[0..4][c] is another way of saying c[b[0..4]]
Well, there is a simple trick in the code: a[3] is exactly the same as 3[a] to a C compiler.
Knowing this, your code can be transformed into something more meaningful:
int a,b[]={3,6,5,24};
char c[]="This code is really easy?";
for(a=0;a<b[3]/3-3;a++)
{
cout<<c[b[a]];
}
a<3[b]/3-3 is the same as writing
a < b[3]/3-3
and a[b] is the same as b[a], since a is an integer,
so b[a] is one of the items from {3,6,5,24},
which then means a[b][c] is b[a][c],
which is one of c[3], c[6], c[5] or c[24].
foo[bar] "expands" to "*(foo + bar)" in C. So a[b] is actually the same as b[a] (because addition is commutative), meaning the ath element of the array b. And a[b][c] is the same as c[b[a]] i.e. the ith char in c where i is the ath element in b.
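These equivalences can be checked directly. A small sketch using the arrays from the question (the function name is made up, and a = 1 is just a sample index):

```cpp
#include <cassert>

bool subscript_is_commutative() {
    int b[] = {3, 6, 5, 24};
    char c[] = "This code is really easy?";
    int a = 1;
    assert(&b[3] == b + 3);       // b[3] is *(b + 3)
    assert(3[b] == b[3]);         // *(3 + b) is the same element
    assert(a[b] == b[a]);         // both are 6 for a == 1
    assert(a[b][c] == c[b[a]]);   // (a[b])[c] is 6[c], i.e. c[6]
    return a[b][c] == 'o';        // c[6] is the 'o' in "code"
}
```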
Okay - first, let's tackle the for loop.
When you write b[3], this is equivalent to *(b+3). *(b+3) is also equivalent to *(3+b), which can be written as 3[b]. The loop can therefore be rewritten, more understandably, as:
for(a=0; a < ((b[3]/3) - 3); a++)
Since b[3] is a constant value (24), you can see this as:
for(a=0; a < ((24/3) - 3); a++)
or
for(a=0; a < (8 - 3); a++)
and finally:
for(a=0; a < 5; a++)
In your case, this will make a iterate from 0-4. You then output a[b][c], which can be rewritten as c[b[a]].
However, I don't see how this compiles and runs correctly, since it's accessing c[b[4]] - and b only has 4 elements. This, as written, is buggy.
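One way to remove the bug is to bound the loop by the actual number of elements in b instead of the computed 5. This is only a sketch; the function name decode is made up, and it assumes the intent was to print one character per element of b.

```cpp
#include <cstddef>
#include <string>

std::string decode() {
    const int b[] = {3, 6, 5, 24};
    const char c[] = "This code is really easy?";
    std::string out;
    // sizeof b / sizeof b[0] is 4, so a never reaches the missing b[4].
    for (std::size_t a = 0; a < sizeof b / sizeof b[0]; ++a) {
        out += c[b[a]];
    }
    return out;  // characters at indices 3, 6, 5, 24: "soc?"
}
```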
First: 'a' is not initialized. Let's assume that it is initialized to 0.
'3[b]/3-3' equals 5. The loop will go from 0 to 4 using 'a'. ('3[b]' is 'b[3]')
In the a==4 step 'a[b]' (so 'b[a]') will be out of bounds (the bounds of 'b' are 0..3), so it has undefined behavior. On my computer it sometimes gives 'Segmentation fault' and sometimes not. Until that point it outputs: "soc?"