Pointer address and reference confusion - c++

I have two nearly identical pieces of code which should produce the same output, except not only are they different, the one line I changed is somehow affecting unrelated output!
#include "stdafx.h"
#include <iostream>
using namespace std;
class Tag {
public:
int num = 0;
Tag* contains = nullptr;
Tag(int n) { num = n; }
void setContains(Tag t) { contains = &t; }
int getNum() { return num; }
Tag getContains() { return *contains; }
};
int main() {
Tag tag1 = Tag(1); Tag tag2 = Tag(2);
tag1.setContains(tag2);
cout << tag1.getContains().getNum() << endl << (*tag1.contains).getNum() << endl;
return 0;
}
This outputs
8460735
8460735
or some other random number. Which tells me I'm somehow outputting the pointer address and not the object it's referencing. So I changed the line
cout << tag1.getContains().getNum() << endl << (*tag1.contains).getNum() << endl;
to
cout << tag1.getContains().getNum() << endl << (*tag1.contains).num << endl;
and I get the output
2
2
Wait, what? I get it if the second line changes from the address to the actual number 2, but why do BOTH change to 2?

setContains makes contains point to a local variable. The variable is destroyed as soon as the function returns, leaving contains a dangling pointer. Any attempt to use it then exhibits undefined behavior.
Practically speaking, contains->num reads some random garbage from the stack where the variable used to live. Slight perturbations to the program change stack access patterns, leaving different garbage there.

Because you are invoking undefined behavior: you save the address of a by-value parameter (a local copy) in Tag* contains here:
void setContains(Tag t) { contains = &t; }
You should pass the argument by reference or by pointer instead. Otherwise you are just saving the address of a variable on the stack, which is destroyed when the function exits.
Everything based on contains afterwards is just undefined behavior.
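As a minimal sketch of the fix suggested above (one option, not the only one), setContains can take its argument by reference, so that contains stores the address of the caller's tag2 rather than of a copy. The containedNum helper is hypothetical and only exercises the class; the sketch also assumes tag2 stays alive for as long as tag1 uses the pointer.

```cpp
// Sketch: setContains takes a reference, so `contains` points at the
// caller's object instead of at a by-value copy that dies on return.
class Tag {
public:
    int num = 0;
    Tag* contains = nullptr;

    explicit Tag(int n) : num(n) {}

    void setContains(Tag& t) { contains = &t; }
    int getNum() const { return num; }

    // Precondition (assumed): setContains has been called beforehand.
    Tag& getContains() { return *contains; }
};

// Hypothetical helper, only here to exercise the class.
int containedNum() {
    Tag tag1(1);
    Tag tag2(2);
    tag1.setContains(tag2);              // stores &tag2, which is still alive
    return tag1.getContains().getNum();  // reads tag2.num
}
```

With this version both expressions from the question read tag2.num, because the pointer refers to an object that still exists.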

Related

Why does passing local variable by value work?

# include <iostream>
# include <string>
using std::string;
using std::cout;
using std::endl;
string func() {
string abc = "some string";
return abc;
}
void func1(string s) {
cout << "I got this: " << s << endl;
}
int main() {
func1(func());
}
This gives:
$ ./a.out
I got this: some string
How/why does this code work? I wonder because abc went out of scope and got destroyed as soon as the call to func() completed. So a copy of abc cannot be/should not be available in variable s in function func1. Is this understanding correct?
The return value is initialized from the local variable, effectively creating a new string object, before abc is destroyed.
However, RVO (return value optimization) usually eliminates this step.
Try single-stepping your code in a debugger. With optimizations off, you may see a std::string copy or move constructor called for the return line, although many compilers elide it even then. Be sure to compile with debug information enabled and optimizers off.
Your code is essentially asking:
"Call func1, and in order for func1 to work it has to receive a string, which is created by calling the copy constructor on the argument. The argument for func1 comes from the return value of func (which we know has to be a string since it's explicitly declared that way)."
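abc goes out of scope only after the return value of func() has been initialized from it, and that return value is what gets passed on. You could also have written func1 to take its parameter by reference or by constant reference:
void func1(string& s) {
cout << "I got this: " << s << endl;
}
This allows func1 to access the caller's string directly (and also change it, if your code was meant to). Note that a non-const reference cannot bind to the temporary returned by func(), so with this signature func1(func()) does not compile in standard C++; you would have to store the result in a named variable first.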
void func1(string const& s) {
cout << "I got this: " << s << endl;
}
This provides a constant reference to the string returned by func(). It can bind to the temporary, refers to the data without copying it, and guarantees that you won't change its contents. Passing data by constant reference (const&) is usually desirable because it avoids a copy and keeps your code from accidentally changing data that it shouldn't.
You really only need to pass by value if you're going to manipulate the data once you pass it to the new function, saving you the resources of creating another new container to handle the manipulation:
void func1(string s) {
s += " some extra stuff to add to the end of the string"; //append some new data
cout << "I got this: " << s << endl;
}
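To make the difference between the two parameter styles concrete, here is a small sketch. The names appended and length are hypothetical and not from the question; func is the function from the question. It shows that a by-value parameter is the callee's own object, while a const reference can bind directly to the temporary that func() returns.

```cpp
#include <cstddef>
#include <string>

std::string func() {
    std::string abc = "some string";
    return abc;  // the return value is initialized before abc is destroyed
}

// By value: s is this function's own string, so changing it
// never affects the caller's object.
std::string appended(std::string s) {
    s += " plus more";
    return s;
}

// By const reference: no copy is made, and the parameter can bind
// to a temporary such as the result of func().
std::size_t length(const std::string& s) {
    return s.size();
}
```

Calling appended(func()) or length(func()) is fine in both cases, because the temporary returned by func() lives until the end of the full expression that contains the call.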

Direct initialization of object's variable using input from user

For the following code:
#include <iostream>
class Test
{
public:
int i;
void get();
};
void Test::get()
{
std::cout << "Enter the value of i: ";
std::cin >> i; // Line 1
}
Test t;
int main()
{
Test t;
t.get();
std::cout << "value of i in local t: "<<t.i<<'\n';
::t.get();
std::cout << "value of i in global t: "<<::t.i<<'\n';
return 0;
}
Though I know what is happening in the above code, i.e. the values are assigned to the local and the global t, I am confused by Line 1: I am unable to understand how the value received from the user on Line 1 gets assigned to t.i or ::t.i.
It would be much appreciated if someone could explain what happens behind the scenes in the above code.
Test::get() is a member function.
Inside a member function, you can name any member variable of that class, and it'll affect the object you called the function on.
Read the chapter in your C++ book about classes.
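What happens behind the scenes is that every non-static member function receives a hidden this pointer to the object it was called on, and a bare i inside the function means this->i. The sketch below spells that out. It is an illustration only: it replaces the std::cin read with a hypothetical set(int) member so the result is deterministic, and setBoth is a made-up helper.

```cpp
#include <utility>

class Test {
public:
    int i = 0;
    void set(int value);
};

// Inside a member function, `i` is shorthand for `this->i`,
// where `this` points at the object the function was called on.
void Test::set(int value) {
    this->i = value;
}

Test t;  // global object

// Hypothetical helper: returns {local t.i, global t.i}.
std::pair<int, int> setBoth() {
    Test t;      // local object, hides the global one
    t.set(1);    // here this == &t (the local)
    ::t.set(2);  // here this == &::t (the global)
    return {t.i, ::t.i};
}
```

So the same function body writes to different objects simply because a different address is passed as this on each call.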

Why there is still valid access to struct after leaving the scope? [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 9 years ago.
First of all I'd like to understand why I get lines (5) and (6) in the output and not only one of them.
Second, why does mc1.print() print valid values? Shouldn't _ms point to an undefined place after mc1(&MyStruct(mc2)), because the place where MyStruct was created (in the cast operator) has already been unwound?
struct MyStruct
{
int w;
int h;
MyStruct()
{
cout << "MyStruct" << endl;
}
~MyStruct()
{
cout << "~MyStruct: w=" << w << "h=" << h << endl;
}
};
class MyClass1
{
public:
MyClass1(MyStruct* ms)
:_ms(ms)
{
cout << "MyClass1" << endl;
}
~MyClass1()
{
cout << "~MyClass1" << endl;
}
void print()
{
cout << "print: w=" << _ms->w << "h=" << _ms->h << endl;
}
MyStruct* _ms;
};
class MyClass2
{
public:
MyClass2()
{
cout << "MyClass2" << endl;
}
~MyClass2()
{
cout << "~MyClass2" << endl;
}
operator MyStruct()
{
MyStruct ms;
ms.h = 11;
ms.w = 22;
return ms;
}
};
int main()
{
MyClass2 mc2;
MyClass1 mc1(&MyStruct(mc2));
mc1.print();
return 0;
}
the output is
1. MyClass2
2. MyStruct
3. ~MyStruct: w=22h=11
4. MyClass1
5. ~MyStruct: w=22h=11
6. ~MyStruct: w=22h=11
7. print: w=22h=11
Why there is still valid access to struct after leaving the scope?
There isn't. It isn't valid. You can try to read arbitrary memory addresses, but doing so is not valid. The C++ standard just doesn't say what should happen if you do something "invalid". So your code is allowed to do what you're observing.
First of all I'd like to understand why I get lines (5) and (6) in the output and not only one of them.
I've no idea. I don't get both lines myself, but since the program has undefined behaviour, anything could happen in principle.
Second, why does mc1.print() print valid values?
It doesn't. It gives undefined behaviour, by printing whatever happens to be in the memory that was once occupied by the dead temporary, if that memory is still accessible. In your case, it just happens that nothing has reused or otherwise invalidated the memory, so you happen to see the values you put there when the temporary was alive.
Shouldn't _ms point to an undefined place after mc1(&MyStruct(mc2)), because the place where MyStruct was created (in the cast operator) has already been unwound?
It points to some memory location which no longer contains a valid object. Typically, the stack frame isn't released until the function returns; and even when it is, the memory typically remains accessible. So accessing a dead object gives you some kind of undefined behaviour, which may not be whatever undefined behaviour you thought it should give.
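One way to avoid the dangling pointer altogether is to let MyClass1 own a copy of the struct instead of storing a pointer to a temporary. The following is only a sketch under that assumption: the tracing output is removed, and the accessors and the makeFromConversion helper are hypothetical additions.

```cpp
struct MyStruct {
    int w = 0;
    int h = 0;
};

class MyClass2 {
public:
    // Conversion operator, as in the question (tracing removed).
    operator MyStruct() const {
        MyStruct ms;
        ms.h = 11;
        ms.w = 22;
        return ms;
    }
};

class MyClass1 {
public:
    // Stores its own copy, so nothing dangles when the argument dies.
    explicit MyClass1(const MyStruct& ms) : ms_(ms) {}
    int width() const { return ms_.w; }
    int height() const { return ms_.h; }

private:
    MyStruct ms_;
};

// Hypothetical helper: converts, then hands the value to MyClass1.
MyClass1 makeFromConversion() {
    MyClass2 mc2;
    MyStruct converted = mc2;  // uses operator MyStruct()
    return MyClass1(converted);
}
```

If sharing rather than copying is really needed, the struct has to be a named object (or otherwise owned) that outlives every MyClass1 that points to it.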

Pointer to vector: size is correct in function but 0 in caller

I have a simple function that I simplified to return just a dummy list (to ensure it's not some logic error)
vector<AttrValue>* QueryEvaluator::getCandidateList(...) {
...
values.clear();
values.push_back(2);
values.push_back(3);
cout << "values is of size " << values.size() << endl;
return &values;
}
then in a cppunit test:
vector<AttrValue>* candidateList0 = evaluator->getCandidateList(cl, 0);
cout << candidateList0->size() << endl;
But the problem is that size(), in the test, is always 0 even though the cout message prints the correct size, 2. What might be wrong?
I tried a simple program and it appears to be fine ...
#include <iostream>
#include <vector>
using namespace std;
vector<int>* test() {
vector<int> vec { 2, 3, 6, 1, 2, 3 };
return &vec;
}
int main() {
cout << test()->size() << endl;
return 0;
}
You are returning the address of a local object from getCandidateList (assuming values is declared inside the function); that object is destroyed when the function returns, and accessing it afterwards is undefined behavior. You could just return the vector by value; RVO (or a move) should apply and avoid the copy:
Try:
std::vector<AttrValue> QueryEvaluator::getCandidateList(...)
{
//blah
return values;
}
I tried a simple program and it appears to be fine ...
The local vector is destroyed when the test function returns, so this program also has undefined behavior; appearing to work is one of the possible outcomes.
Your vector appears to be declared on the stack, so it will be destroyed when it goes out of scope (when the function exits). If you want to return a pointer to a vector, allocate it on the heap instead; the caller then becomes responsible for deleting it:
vector<AttrValue>* QueryEvaluator::getCandidateList(...) {
vector<AttrValue>* values = new vector<AttrValue>();
...
values->clear();
values->push_back(2);
values->push_back(3);
cout << "values is of size " << values->size() << endl;
return values;
}
It might be easier to instead declare it in the caller and pass a reference to getCandidateList
void QueryEvaluator::getCandidateList(vector<AttrValue>& values)
...or return it by value
vector<AttrValue> QueryEvaluator::getCandidateList(...) {
So many interesting things to consider:
vector<AttrValue>* QueryEvaluator::getCandidateList(...) {
...
values.clear();
values.push_back(2);
values.push_back(3);
cout << "values is of size " << values.size() << endl;
return &values;
}
So it looks like you left out the most interesting piece of the code (the ... above). Moral of the story: try to provide compilable, working code that shows the error. Reducing your problem to a small example usually results in you finding the problem yourself. At the very least, you should provide exact definitions of all objects that are used (the type is the most important thing in C++).
Does it declare the vector as a local object?
std::vector<int> values;
In this case the vector's lifetime is bound to the function and it is destroyed at the end of the function. This means using it after the function has returned is undefined behavior (anything can happen).
But it also looks like you are using objects as part of your unit test framework. So a potential solution is to make the vector part of the object. Then the vector will live as long as the object (not just the function call) and thus returning a pointer to it will work as expected.
class QueryEvaluator
{
std::vector<AttrValue> values;
public:
std::vector<AttrValue>* getCandidateList(...);
};
An alternative would be to return the vector by value rather than by pointer. This means the object will be correctly copied (or moved) out of the function, and your calling code can manipulate and test the vector all it needs.
vector<AttrValue> QueryEvaluator::getCandidateList(...)
{
...
return values;
}
Side Note:
Also, try not to use raw pointers in your code. Pointers do not convey any ownership. This means we do not know who is responsible for deleting the object. In this case a reference would probably have been better (you never return NULL), as this gives the caller access to the object while the evaluator retains ownership (assuming you decided not to return by value).
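Putting the two safe options from this thread side by side, a compilable sketch could look like the following. Several things here are assumptions made for illustration: AttrValue is aliased to int because its real definition is not shown, the elided parameter list is dropped, and the candidates member function is hypothetical.

```cpp
#include <vector>

using AttrValue = int;  // assumption: the real AttrValue type is not shown

class QueryEvaluator {
public:
    // Option 1: return by value. The local vector is moved or elided
    // into the caller's object, so nothing dangles.
    std::vector<AttrValue> getCandidateList() const {
        std::vector<AttrValue> values;
        values.push_back(2);
        values.push_back(3);
        return values;
    }

    // Option 2 (hypothetical name): return a reference to a member.
    // Valid only for as long as this QueryEvaluator is alive.
    const std::vector<AttrValue>& candidates() {
        values_.assign({2, 3});
        return values_;
    }

private:
    std::vector<AttrValue> values_;
};
```

Returning by value is usually the simpler choice; the reference version only makes sense when the evaluator is meant to own the list and callers just inspect it.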

Modifying temporary object

Can someone tell me why the test(2) object is destroyed right after the test_method() call?
#include<iostream>
#include<string>
using namespace std;
class test
{
int n;
public:
test(int n) : n(n)
{
cout << "test: " << n << endl;
}
~test()
{
cout << "~test: " << n << endl;
}
test & test_method()
{
cout << "test_method: " << n << endl;
return *this;
}
};
int main(int argc, const char *argv[])
{
cout << "main start" << endl;
const test &test1 = test(1);
const test &test2 = test(2).test_method();
cout << "main end" << endl;
}
Output is:
main start
test: 1
test: 2
test_method: 2
~test: 2
main end
~test: 1
test(2).test_method() returns a reference, which is bound to test2, and then the object to which it refers is destroyed at the end of the full expression, since it is a temporary object. That should not be a surprise.
The real surprise is that test1 remains a valid reference, because it is directly bound to a temporary, and binding a temporary to a reference extends the lifetime of the temporary to that of the reference variable.
You only have to note that in the test(2) case, the temporary object isn't bound to anything. It's just used to invoke some member function, and then its job is done. It doesn't "babysit" member functions, or in other words, lifetime extension isn't transitive through all possible future references.
Here's a simple thought experiment showing why it would be impossible to actually have "arbitrary lifetime extension":
extern T const & get_ref(T const &);
{
T const & x = get_ref(T());
// stuff
// Is x still valid?
}
We have no idea if x remains valid beyond the first line. get_ref could be doing anything. If it's implemented as T const & get_ref(T const & x) { return x; }, we might hope for magic, but it could also be this:
namespace { T global; }
T const & get_ref(T const & unused) { return global; }
It's impossible to decide within the original translation unit whether anything needs to be extended or not. So the way the standard has it at present is that it's an entirely trivial, local decision, just made when looking at the reference declaration expression, what the lifetime of the temporary object in question should be.
Because the C++ standard requires this behavior. Give the object a name if you want it to persist. It will persist as long as the name.
Edit: In your example, test1 is the name that you gave to the first object, whereas the second object has obtained no name at all, and so it does not outlast evaluation of the expression.
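The two cases can be observed without ever touching the dead object by counting live instances. This is a sketch with hypothetical names (Tracked, self, and the two helper functions); it mirrors test(1) and test(2).test_method() from the question. Some compilers may warn about the dangling reference in the second helper, which is exactly the point being shown.

```cpp
// Counts how many Tracked objects are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
    Tracked& self() { return *this; }  // plays the role of test_method()
};
int Tracked::alive = 0;

// The temporary is bound directly to the reference, so its lifetime
// is extended to the end of r's scope.
int aliveWhenBoundDirectly() {
    const Tracked& r = Tracked();
    (void)r;
    return Tracked::alive;
}

// The reference is bound to the result of a function call, so the
// temporary dies at the end of the full expression and r dangles.
// r is deliberately never read.
int aliveWhenBoundThroughMethod() {
    const Tracked& r = Tracked().self();
    (void)r;
    return Tracked::alive;
}
```

The first helper sees one live object while the reference is in scope; the second sees none, because nothing extended the temporary's lifetime.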