Strange behavior when trying to hack a constant in C++ [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Modifying a const through a non-const pointer
I'm studying C++ and am very interested in pointers. I tried to change the value of a constant (my teacher called it a backdoor; please correct me if I'm wrong) like this:
const int i = 0;
const int* pi = &i;
int hackingAddress = (int)pi;
int *hackingPointer = (int*)pi;
*hackingPointer = 1;
cout << "Address:\t" << &i << "\t" << hackingPointer << endl;
cout << "Value: \t" << i << "\t\t" << *hackingPointer << endl;
system("PAUSE");
return 0;
However, the result is very strange. Although the two addresses are the same, the values are different.
How is my code executed? And where exactly are the values 0 and 1 stored?

You've discovered a little thing that C++ developers call undefined behavior. You told the compiler that "i is a constant with the value 0". So when you ask the compiler for the value of i, it tells you that it is 0.
Mucking around with trying to change the value of a constant violates the assumptions made by the compiler (that constants are going to be, well, constant), and so, the compiler is going to generate invalid or inconsistent code.
There are a lot of situations in C++ where it is possible to do something without the compiler catching it as an error, but the result is undefined. And if you do that, then you get results like what you're seeing. The compiler does something weird and unexpected.
Oh, and if your teacher is trying to teach you anything from an example such as this, he's wrong, and you should be very scared.
The only guarantee you get from code like this is this:
the compiler can do literally anything it likes
When you write code, you have an implicit contract with the compiler:
"If I write well-defined C++ code, then you convert it into an executable with the same effects as described by the C++ standard".
When you do something like this, you violate the contract. And then the compiler isn't obliged to follow it either. If you give the compiler code that is not well-defined according to the C++ standard, then it can't, and isn't going to, create an executable which does as the C++ standard specifies.

It seems that the compiler has optimized the code (it inlined the const int value), changing
cout << "Value: \t" << i << "\t\t" << *hackingPointer << endl;
to
cout << "Value: \t" << 0 << "\t\t" << *(0x0044ff28) << endl;
Anyway, you have still succeeded in changing the value in the memory where i is stored. But do not try this at home :-)

It is not permitted to change the value of a constant. In fact, doing so is undefined behaviour, so your program could do anything as a result.
In this instance it looks like your compiler optimised the read away at compile time because it knew the value is fixed. Lots of implementations might just crash when you try and change it, but you cannot and should not bet or rely upon the result of any undefined behaviour ever.
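One way to see why the compiler can treat the value as fixed: a const int initialised from a constant expression is itself usable as a compile-time constant. The following is only a minimal sketch of that property, not what any particular compiler emits; the names one_element and read_i are made up for illustration.

```cpp
#include <array>

// A const int initialised from a constant expression can be used
// wherever the language requires a compile-time constant.
const int i = 0;

// Checked by the compiler, before the program ever runs.
static_assert(i == 0, "the value of i is known at compile time");

// An array size must be a compile-time constant; i + 1 qualifies.
std::array<int, i + 1> one_element{};

// Hypothetical helper: a compiler is allowed to compile this as "return 0"
// without reading the memory where i lives.
int read_i() {
    return i;
}
```

Since the language itself lets i stand in for the literal 0 here, substituting the literal when printing i is a natural optimisation.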

Related

C++17 - Modifying const values

So, as of now, it seems to be impossible to actually modify a "const" value in C++ (tested in VS 2017).
const int a = 5;
int* ptr = (int*)&a; // Method 1
*((int*)(&a)) = 6; // Method 2
ptr = const_cast<int*>(&a); // Method 3 (assigns to the existing ptr; declaring ptr a second time would not compile)
*ptr = 55;
cout << a << "\t" << &a << endl;
cout << *ptr << "\t" << ptr << endl;
Result:
5 SOMEMEMORYADDRESS
55 SOMEMEMORYADDRESS
Anyone got any idea what else can be tried to achieve the effect? Really curious how it is possible to have 1 memory address (at least according to the console) with 2 values.
Please note: there are topics like this for older C++ versions (and they used to work in the past - but they don't, anymore).
"Really curious how it is possible to have 1 memory address (at least according to the console) with 2 values."
It's because you invoked undefined behavior. The C++ standard, from C++98, has expressly forbidden you from modifying an object that is declared const. And the standard has a catch-all statement such that if you do anything which causes modification of a const object, you get undefined behavior.
Because modifying an object declared const is UB, the compiler is free to assume that this object will never be modified. So, since the compiler can see that a is const and that it is initialized to 5, it is 100% valid for the compiler to replace, at compile time, everything which refers to this object with 5. So when you do cout << a, the compiler is free to not bother to access memory; it can just do cout << 5.
If you did something to modify the memory behind a, that's UB, so the compiler doesn't have to care about what happens in that case.
"they used to work in the past - but they don't, anymore"
No, they never "worked". They merely just so happened to do what you thought they should. But C++ never guaranteed that compilers would behave in this way, so you have no right to complain about compilers changing that behavior now.
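For contrast, here is a small sketch of the well-defined case: the object itself is not declared const, and only the access path to it is const. The function name observe_through_const_view is hypothetical.

```cpp
// The object a is NOT declared const, so modifying it is well-defined.
// Only the reference "view" is read-only.
int observe_through_const_view() {
    int a = 5;
    const int& view = a;  // read-only access path to the same object
    a = 55;               // fine: a is an ordinary, modifiable int
    return view;          // the change is visible through the const view
}
```

Here the compiler may not assume that the value behind view stays 5, because the underlying object was never declared const.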

Returning a variable back by reference that goes out of scope [duplicate]

This question already has answers here:
Returning a reference to a local variable in C++
(3 answers)
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 5 years ago.
int& foo()
{
int i = 4;
return i;
}
int main()
{
int& j = foo();
cout << j << endl;
cout << j << endl;
cout << j << endl;
cout << j << endl;
cout << j << endl;
cout << j << endl;
return 0;
}
Here, I would expect the first cout of j to output garbage, because the local variable i, which j references, has gone out of scope. However, the first cout statement consistently outputs 4, the value that would be printed if i were still in scope. After that, every cout statement prints the same garbage value. Here is an example of some output I've been getting:
4
528494
528494
528494
528494
528494
Press any key to continue . . .
Why does j not print garbage in the very first cout statement? Shouldn't i have already gone out of scope?
The rule is not "using a variable after it has passed out of scope gives garbage output". It is that using a reference to a variable that has gone out of scope is undefined behaviour according to all C++ standards.
Undefined behaviour means the C++ standard provides no guarantee whatsoever about what happens. A consequence is that, when behaviour is undefined, any actual observable result is permitted. Garbage output is only one possible observable result.
That means any explanation of the behaviour you're seeing will be specific to your implementation (your compiler, your chosen optimisation or debugging settings, memory management by your host system, etc.). The behaviour may also vary over time, since, when behaviour is undefined, there is no requirement that any particular behaviour occurs consistently.
As a generic explanation, in your specific case, it is probably related to how your compiler manages usage of machine registers by your program. The variable i in foo() may be stored in a register, then that register may not be cleared immediately, so the value 4 is retrieved from it in the first cout << j << endl statement. The working of output streams (implementation of operator<<() or endl) may then use the same register internally - since there is absolutely no way that C++ code with well-defined behaviour can access those registers directly - and therefore overwrite it.
But that's just a guess. As I said, it depends on the implementation - that's why I used the word "may" so liberally in the preceding paragraph. When behaviour is undefined (by the standard) then a compiler is permitted to do anything. You could see completely different behaviour by tweaking optimisation settings or the next time you update your compiler. Different compilers may do things completely differently as well.
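If the goal is simply to get the value 4 out of the function, there are well-defined ways to do it. The sketch below shows two common options; the names foo_by_value and foo_static are made up here and are not from the question.

```cpp
// Option 1: return by value. The caller receives its own copy of i,
// so nothing refers to the local variable after the function returns.
int foo_by_value() {
    int i = 4;
    return i;
}

// Option 2: return a reference to an object with static storage duration.
// The object outlives the call, so the returned reference stays valid.
int& foo_static() {
    static int i = 4;
    return i;
}
```

Returning by value is usually the simpler choice; the static variant shares one object between all callers, which may or may not be what you want.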

std::vector<T>::resize(n) vs reserve(n) with operator[] [duplicate]

This question already has answers here:
Vector going out of bounds without giving error
(4 answers)
Closed 7 years ago.
Numerous questions/answers inform me that std::vector<T>::resize(n) will increase capacity and size, whilst std::vector<T>::reserve(n) only increases capacity.
One example is Choice between vector::resize() and vector::reserve().
A comment in that question indicates that after use of reserve(n), the use of
vec[i] = .. (for i less than n)
is undefined behaviour, and many examples given are claimed to lead to segfaults.
When I compile and run
#include <vector>
#include <iostream>
void f(const std::vector<double> &s) {
std::cout << "s.size() = " << s.size() << std::endl;
std::cout << "s.capacity() = " << s.capacity() << std::endl;
}
int main() {
std::size_t n = 20121;
std::vector<double> a;
a.reserve(2*n);
a[n] = 2.5;
std::cout << "a["<<n<<"] = " << a[n] << std::endl;
f(a);
std::vector<double> b;
b.resize(2*n);
b[n] = 2.5;
std::cout << "b["<<n<<"] = " << b[n] << std::endl;
f(b);
}
my output is
a[20121] = 2.5
s.size() = 0
s.capacity() = 40242
b[20121] = 2.5
s.size() = 40242
s.capacity() = 40242
Questions:
Has there been a change that makes this ok? Is this just my compiler (g++ v5.2.0) giving me undefined, but nice, behaviour?
As a second point of curiosity, why does f(a) tell me the size is 0 (guessed answer: no push_back calls), even though a[n] returns a valid value?
By definition, "undefined behavior" means that the result you see when that line executes is not defined by the standard and may change between runs.
"Is this just my compiler (g++ v5.2.0) giving me undefined, but nice, behaviour?"
The nice behavior can be a mix of how std::vector is implemented in the version you are compiling and the state of memory when your program was executed. The compiler has almost no role to play in showing a "nice behavior".
One-line answer: what you are noticing is indeed undefined behavior. On hitting UB, the runtime is free to produce any output or behavior, including shooting monkeys out of your monitor.
As always with undefined behavior, your compiler may produce what you might think would be a "reasonable/nice" behavior, but the fact that you're seeing "nice" behavior doesn't mean that what you're doing is okay! This behavior may change at any time, with any new version of the compiler, when you compile and/or run it on any other machine or OS, or when you re-run your program during a different lunar phase.
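To make the difference between reserve and resize observable without relying on undefined behavior, one can use the bounds-checked at() instead of operator[]. This is a minimal sketch; the helper names are hypothetical and n is assumed to be greater than 0.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// After reserve(), capacity grows but size() is still 0, so at(n)
// throws std::out_of_range instead of silently writing out of bounds.
bool at_throws_after_reserve(std::size_t n) {
    std::vector<double> a;
    a.reserve(2 * n);
    try {
        a.at(n) = 2.5;  // n >= a.size(), so this throws
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// After resize(), the elements actually exist (value-initialised to 0.0),
// so the same access is valid.
double write_after_resize(std::size_t n) {
    std::vector<double> b;
    b.resize(2 * n);
    b.at(n) = 2.5;      // n < b.size() for n > 0
    return b.at(n);
}
```

With at(), the mistake from the question turns into an exception rather than "nice" behaviour that happens to work.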

Weird Behaviour with const_cast [duplicate]

This question already has answers here:
Two different values at the same memory address
(7 answers)
Closed 5 years ago.
I know that using const_cast is generally bad idea, but I was playing around with it and I came across a weird behaviour, where:
Two pointers have the same address value, yet when de-referenced, give different data values.
Does anyone have an explanation for this?
Code
#include <iostream>
int main()
{
const int M = 10;
int* MPtr = const_cast<int*>(&M);
(*MPtr)++;
std::cout << "MPtr = " << MPtr << " (*MPtr) = " << (*MPtr) << std::endl;
std::cout << " &M = " << &M << " M = " << M << std::endl;
}
Output
MPtr = 0x7fff9b4b6ce0 (*MPtr) = 11
&M = 0x7fff9b4b6ce0 M = 10
The program has undefined behaviour because you may not change a const object.
From the C++ Standard
4 Certain other operations are described in this International
Standard as undefined (for example, the effect of attempting to modify
a const object). [ Note: This International Standard imposes no
requirements on the behavior of programs that contain undefined
behavior. —end note ]
So, aside from the "it's undefined behaviour" (which it is), the compiler is perfectly entitled to use the fact that M is a constant, and thus won't change, when evaluating cout ... << M << ..., so it can use an instruction with the immediate value 10 instead of loading the actual value stored in the memory of M. (Of course, the standard does not say how this works beyond "it's undefined", and compilers are able to choose different solutions in different circumstances, so it's entirely possible that you'll get different results if you modify the code, use a different compiler or a different version of the compiler, or the wind is blowing in a different direction.)
Part of the tricky bit with "undefined behaviour" is that it includes things that are "perfectly what you may expect" as well as "nearly what you'd expect". The compiler could also decide to start tetris if it discovers this is what you are doing.
And yes, this is very much one of the reasons why you SHOULD NOT use const_cast. At the very least NOT on things that were originally const - it's OK if you have something along these lines:
int x;
void func(const int* p)
{
...
int *q = const_cast<int *>(p);
*q = 7;
}
...
func(&x);
In this case, x is not actually const; it is only treated as const through the pointer we pass to func. Of course, the compiler may still assume that x is not changed in func, and thus you could have problems....
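A compilable version of that pattern might look like the sketch below. The elided parts of the original snippet are not reconstructed; set_to_seven and demo are hypothetical names standing in for func and its caller.

```cpp
// Casting away const is well-defined here only because the object
// the pointer refers to was never declared const.
void set_to_seven(const int* p) {
    int* q = const_cast<int*>(p);
    *q = 7;
}

int demo() {
    int x = 0;         // x is a plain, modifiable int
    set_to_seven(&x);  // &x converts implicitly to const int*
    return x;
}
```

If x were declared const int, the same call would compile but the write inside set_to_seven would be undefined behaviour.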

C++ Why the parenthesis?

I found this code in a book:
#include <iostream>
using namespace std;
void ChangesAreGood(int *myparam) {
(*myparam) += 10;
cout << "Inside the function:" << endl;
cout << (*myparam) << endl;
}
int main() {
int mynumber = 30;
cout << "Before the function:" << endl;
cout << mynumber << endl;
ChangesAreGood(&mynumber);
cout << "After the function:" << endl;
cout << mynumber << endl;
return 0;
}
It says:
(*myparam) += 10;
What difference would the following produce?
*myparam += 10;
To answer your question:
In your example, there is no difference except in readability.
And, as the comments on this post all suggest, please don't use the parentheses here...
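The equivalence comes from operator precedence: unary * binds more tightly than +=, so both spellings dereference first and then add. A small sketch, with hypothetical function names:

```cpp
// Both functions add 10 to the int that p points at.
int with_parens(int value) {
    int* p = &value;
    (*p) += 10;   // explicit grouping
    return value;
}

int without_parens(int value) {
    int* p = &value;
    *p += 10;     // parsed exactly the same way: (*p) += 10
    return value;
}
```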
Interesting other cases
Using a property/method on the dereferenced object
On the other hand, there is a difference if you have something like
*myObject.myPropertyPtr += 10
compared to
(*myPointer).myProperty += 10
The names I chose here tell you what the difference is: the dereference operator * works on whatever is on its right hand side; in the first case the runtime will fetch the contents of myObject.myPropertyPtr, and dereference that, while in the second example it will dereference myPointer, and get myProperty from whatever is found on the object that myPointer points to.
The latter is so common that it even has its own syntax: myPointer->myProperty.
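The two cases can be sketched as follows. The types Holder and Widget and both function names are invented for illustration; they only mirror the myObject.myPropertyPtr and myPointer->myProperty shapes described above.

```cpp
struct Widget { int value; };     // accessed through a pointer to the object
struct Holder { int* valuePtr; }; // an object whose member is a pointer

// *holder.valuePtr is parsed as *(holder.valuePtr): member access first,
// then dereference of the member.
int add_ten_via_member_pointer(Holder holder) {
    *holder.valuePtr += 10;
    return *holder.valuePtr;
}

// (*widget).value dereferences the pointer first, then accesses the member.
// widget->value is shorthand for exactly the same thing.
int add_ten_via_object_pointer(Widget* widget) {
    (*widget).value += 10;
    widget->value += 10;
    return widget->value;
}
```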
Using the ++ operator rather than +=
Another interesting example, which I thought of after reading another (now deleted) answer to this question, is the difference between these:
*myPointer++
(*myPointer)++
*(myPointer++)
This is more interesting because ++ is an operator like any other and, in particular, takes only a single operand rather than a left- and a right-hand side, so it is more ambiguous than your examples with +=. (Of course, these don't always make sense: sometimes you will end up trying to use the ++ operator on an object that doesn't support it. But if we limit our study to ints, this won't be a problem, and it should give you a compiler error anyway...)
Since you caught my curiosity, I conducted a small experiment testing these out. This is what I found out:
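*myPointer++ does the same thing as *(myPointer++), i.e. the ++ applies to the pointer, not to the value it points to. Because this is a postfix increment, the expression dereferences the pointer's original value, and the pointer itself then points to the next element. This shouldn't be so surprising - it is what we'd expect knowing the result of running *myObject.someProperty.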
(*myPointer)++ does what you'd expect, i.e. first dereference the pointer, then increment whatever the pointer pointed to (and leave the pointer as is).
Feel free to take a closer look at the code I used to find this out if you want to. Just save it to dereftest.cpp, compile with g++ dereftest.cpp -o dereftest (assuming you have G++ installed) and run with ./dereftest.
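The file referred to above is not reproduced here. As a stand-in, the following sketch (with made-up names Outcome, post_increment_pointer and increment_pointee) records what each expression yields and what it changes, using a small array so that moving the pointer is well-defined.

```cpp
#include <cstddef>

// What the expression evaluated to, where the pointer ended up
// (as an offset from the start of the array), and the first element afterwards.
struct Outcome {
    int value;
    std::ptrdiff_t offset;
    int first;
};

Outcome post_increment_pointer() {
    int data[3] = {10, 20, 30};
    int* p = data;
    int v = *p++;    // same as *(p++): reads data[0], then p moves to data[1]
    return {v, p - data, data[0]};
}

Outcome increment_pointee() {
    int data[3] = {10, 20, 30};
    int* p = data;
    int v = (*p)++;  // reads data[0], then data[0] becomes 11; p does not move
    return {v, p - data, data[0]};
}
```

In both cases the expression yields 10; the difference is whether the pointer or the pointed-to int is changed afterwards.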