From what I know, during program startup, C++ programs initialize a memory segment for constant values.
For example, if you write cout << "Hello World!"; then the const char* argument that operator<< receives will point into that memory segment, where the string "Hello World!" is stored.
I was under the impression that all hardcoded literals in the code end up residing in that memory segment at runtime. However, I have seen a course where the following pitfall was illustrated:
class Data
{
public:
const int& y;
Data() : y(123) {}
void f()
{
int a[10000];
for (int i = 0; i < 10000; i++)
a[i] = 50;
}
};
int main()
{
Data d;
cout << d.y << endl;
d.f();
cout << d.y << endl;
}
The pitfall was described as follows: during the execution of the constructor for Data, d.y does not end up referring to the memory segment dedicated to constant values. Instead, a behaviour similar to int temp = 123; this->y = &temp; is what actually happens. Since temp is on the stack, it is gone as soon as the constructor returns. After that, f() may end up allocating the array a such that the address y refers to lies inside a, so the value may be overwritten when f() is called.
I would like some clarifications on:
1. Is this really what happens, or what can happen?
2. If the answer to 1 is "yes", why does this happen? Is there some UB somewhere, or something like that?
3. If the answer to 1 is "yes", why was this not fixed? By "fixed" I mean a solution similar to what I described above, where all literals reside in their own special memory segment.
4. Is there anything else I should know to understand this situation better? (Maybe I am not asking the right questions?)
It seems like I have a fundamental problem in understanding pointers in C++. In my understanding, the following code example should print "53" in the console, but instead it prints "33".
// Example program
#include <iostream>
#include <vector>
int main()
{
std::vector<int*> v;
{
int z = 5;
v.push_back(&z);
}
{
int a = 3;
v.push_back(&a);
}
std::cout << *v[0] << *v[1] << std::endl;
}
I originally had this problem in a bigger project I'm currently working on, and I noticed that if I do it this way, all pointers I added previously point to the same element as the last one. But why? I thought that if I add two pointers which point to different integers, they will stay different after adding them to a vector.
This:
{
int a = 3;
v.push_back(&a);
}
Results in undefined behavior once the stored pointer is dereferenced: you are storing the address of a local variable whose lifetime ends at the closing brace, and using it later. This means anything could happen: the program could print "I'm sorry David, I can't do that."
As soon as the scope ends (with }), the lifetime of a ends, and the address it used to occupy is available to be reused. That is what happens in your code: the same address gets reused, so you end up storing the same address twice. But again, undefined behavior means anything could happen. For example, if I compile with optimization enabled on my computer it prints "33", but if I disable optimization it prints "53".
The problem was with your scoping. Remove the scopes and you get the expected behaviour:
std::vector<int *> v;
int z = 5;
v.push_back(&z);
int a = 3;
v.push_back(&a);
std::cout << *v[0] << *v[1] << std::endl;
Original code has undefined behaviour:
std::vector<int*> v;
{
int z = 5;
v.push_back(&z);
} // here z no longer exists, so you can't dereference it later via *v[0]
I'm curious about why my research result is strange
#include <iostream>
int test()
{
return 0;
}
int main()
{
/*include either the next line or the one after*/
const int a = test(); // the result is 1:1
// const int a = 0; // the result is 0:1
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra << std::endl;
return 0;
}
Why does a constant that is initialized at run time change completely, while for a constant initialized at compile time only the value read through the pointer changes?
The function isn't really that relevant here. In principle you could get same output (0:1) for this code:
int main() {
const int a = 0;
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra;
}
a is a const int, not an int. You can do all sorts of senseless C-style casts, but modifying a const object invokes undefined behavior.
By the way, in the above example, even for std::cout << a << ":" << a; the compiler would be allowed to emit code that prints 1:0 (or 42:3.1415927). When your code has undefined behavior, anything can happen.
PS: the function and the different outcomes are relevant if you want to study the internals of your compiler and what it does to code that is not valid C++. For that, it is best to look at the output of the compiler and how it differs in the two cases (https://godbolt.org/).
It is undefined behavior to cast away the constness of a const variable and change its value. If you try, anything can happen.
Anyway, what seems to happen is that the compiler sees const int a = 0; and, depending on optimization, puts it on the stack but replaces all usages of a with 0 (since a will not change!). For *ra = 1;, the value stored in memory actually changes (the stack is read-write), and *ra outputs that value.
For const int a = test();, dependent on optimization, the program actually calls the function test() and puts the value on the stack, where it is modified by *ra = 1, which also changes a.
How is it possible that the value of *p and the value of DIM are different, but they have the same address in memory?
const int DIM=9;
const int *p = &DIM;
* (int *) p = 18; //like const_cast
cout<< &DIM <<" "<< p << '\n';
cout << DIM << " " << *p << '\n';
You're changing the value of a const variable, which is undefined behavior. Literally anything could happen when you do this, including your program crashing, your computer exploding, ...
If a variable is supposed to change, don't make it const. The compiler is free to optimise away accesses to const variables, so even if you found a successful way to change the value in memory, your code might not even be accessing the original memory location.
It is a compiler optimization. Given that DIM is a constant, the compiler could have substituted its known value.
The code below does what you meant to do. As mentioned in other posts, if you mean to change the value of a variable, do not define it as const.
#include <stdio.h>
int main()
{
int d= 9;
int *p_d=&d;
*p_d=18;
printf("d=%d\np_d=%d\n",d,*p_d);
return 0;
}
This code prints
d=18
p_d=18
I have come across some sample source code that uses a reference data member, and I am confused about the output. Here is the sample code.
class Test {
private:
int &t;
public:
Test (int y):t(y) { }
int getT() { return t; }
};
int main() {
int x = 20;
Test t1(x);
cout << t1.getT() << "\n"; // Prints 20 as output. however y has already been destroyed but still prints 20.
x = 30;
cout << t1.getT() << endl; // Prints Garbage as output Why ? Ideally both steps should be Garbage.
return 0;
}
And to add for more confusion here is one more piece of code for same class
int main() {
int x = 20;
int z = 60;
Test t1(x);
Test t2(z);
cout<<t1.getT()<<"\n"; // Prints 60! WHY? Should print garbage
cout<<t2.getT() << "\n"; // Prints Garbage
cout<<t1.getT() << endl; // Prints Same Garbage value as previous expression
return 0;
}
x is passed by value, so y is a copy of x that is local to the constructor, and t is a reference to that copy, not to x. The copy is destroyed when the constructor returns. Your code has undefined behavior; anything can come up as output. Your problem can be solved by passing a reference to x, like
Test (int& y):t(y) { }
but this is not a good idea. There can be cases where x goes out of scope but the Test object is still used, and then the same problem will appear.
Your constructor:
Test (int y):t(y) { }
sets t to be a reference to y, the constructor's local parameter on the stack, and not the variable in the calling function. When you change the variable's value in the calling function, it does not change anything in the object you created.
The fact that the reference is to a local variable that is lost when the constructor returns means that calling getT() is undefined behavior and may return any value.
Every call to int getT() accesses the memory address that y occupied. That stack slot was released at the end of the constructor, so it no longer belongs to any live object and may be reused at any time. The time of reuse is not defined and depends on what other code the compiler and the libraries place on the stack. The return value of int getT() therefore depends on the compiler type and version, the optimization settings, and the OS, amongst other things.
Now I get it. Yes, it is undefined, but to answer my own question of why it prints 20 or 60 before printing garbage: 20 and 60 are themselves garbage values, and ideally both getT() calls should print garbage, but they don't. That is because no other instruction runs between Test t2(z); and cout<<t1.getT()<<"\n"; so the stale stack slot still holds the old value. For the next statement, the preceding output of "\n" acts as such an instruction, and in the meantime the value on the stack gets overwritten.
I have the following sample code. I just wanted to know if it is valid to store the address of a local variable in a global pointer and then modify its contents in another function. The following program correctly modifies the value of variable a. Can such a practice cause any issues?
#include <iostream>
#include <vector>
using namespace std;
vector<int*> va;
void func()
{
int b ;
b = 10;
int * c = va[0];
cout << "VALUE OF C=" << *c << endl;
*c = 20;
cout << "VALUE OF C=" << *c << endl;
}
int main()
{
int a;
a = 1;
va.push_back(&a);
func();
cout << "VALUE IS= " << a << endl;
return 0;
}
This is OK, as long as you don't try to dereference va[0] after a has gone out of scope. You don't, so technically this code is fine.
That said, this whole approach may not be such a good idea because it makes code very hard to maintain.
I'd say that if your program grows you could forget about a change you made in some function and get some weird errors you didn't expect.
Your code is perfectly valid as long as you call func() while being in the scope of a. However, this is not considered to be a good practice. Consider
struct HugeStruct {
int a;
};
std::vector<HugeStruct*> va;
void print_va()
{
for (size_t i = 0; i < va.size(); i++)
std::cout<<va[i]->a<<' ';
std::cout<<std::endl;
}
int main()
{
for (int i = 0; i < 4; i++) {
HugeStruct hs = {i};
va.push_back(&hs);
}
print_va(); // oups ...
}
There are 2 problems in the code above.
1. Don't use global variables unless absolutely necessary. Global variables violate encapsulation and may cause name clashes. In most cases it's much easier to pass them to functions when needed.
2. The vector of pointers in this code looks awful. As you can see, I forgot that each pointer became invalid as soon as its loop iteration ended, and print_va just printed out garbage. The simple solution could be to store objects in a vector instead of pointers. But what if I don't want HugeStruct objects to be copied again and again? It can take quite a lot of time. (Suppose that instead of one int we have a vector of a million integers.) One of the solutions is to allocate HugeStructs dynamically and use a vector of smart pointers: std::vector<std::shared_ptr<HugeStruct>>. This way you don't have to bother about memory management and scope. Objects will be destroyed as soon as nothing refers to them any more.