Please see the following code snippets. In the second function I am returning a reference: I declare a local variable in the function and return a reference to it. As the variable is local, I believe its lifetime ends when the function exits. My question is: why is it possible to access the value from the caller without any exceptions even though the original variable has been destroyed?
int& b=funcMulRef(20,3);
int* a= funcMul(20,3);
int* funcMul(int x,int y)
{
int* MulRes = new int;
*MulRes = (x*y);
return MulRes;
}
int& funcMulRef(int x,int y)
{
int MulRes ;
MulRes = (x*y);
return MulRes;
}
The behaviour of the second function is simply undefined; anything can happen, and in many circumstances, it will appear to work, simply because nothing has overwritten where the result used to be stored on the stack.
You are accessing data that is no longer in scope.
The memory probably still has the data in it though so it appears to work properly but is likely to be reused at any time and the value will be overwritten.
The next time you call any function or allocate a local stack variable, that memory is very likely to be reused for the new data, overwriting what you had there before. It's undefined behaviour.
The original value isn't actively erased, simply because erasing it would cost extra computation for no benefit.
The value is still there, but the memory is no longer yours, and accessing it is undefined behaviour.
You are pointing to a space in memory that can be overrun by the program.
No, you shouldn't do this. The result of accessing residual data on the stack is undefined. Besides that, if your return value is of class type, its destructor will already have been called.
Are you trying to avoid temporary objects? If so, you might be interested in this:
http://en.wikipedia.org/wiki/Return_value_optimization
It most likely won't work in these cases :
funcMulRef(10,3) + funcMulRef(100,500)
alternatively, in a more nasty way :
std::cout << "10*3=" << funcMulRef(10,3) << " 100*500=" << funcMulRef(100,500) << std::endl;
GCC warns about this kind of error if you compile with -Wall.
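For comparison, here is a minimal sketch of how the second function could hand back its result without a dangling reference, assuming a plain int result is all the caller needs. The name funcMulValue is hypothetical and does not appear in the question.

```cpp
// Hypothetical replacement for funcMulRef: return the product by value.
// The caller receives its own copy, so nothing refers to the dead local.
int funcMulValue(int x, int y)
{
    int MulRes = x * y;  // local variable, destroyed when the function returns
    return MulRes;       // the value is copied out before that happens
}
```

A caller would then write int b = funcMulValue(20, 3); and expressions such as funcMulValue(10,3) + funcMulValue(100,500) no longer depend on what happens to be left on the stack.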
This is not a question about why you would write code like this, but more as a question about how a method is executed in relation to the object it is tied to.
If I have a struct like:
struct F
{
// some member variables
void doSomething(std::vector<F>& vec)
{
// do some stuff
vec.push_back(F());
// do some more stuff
}
};
And I use it like this:
std::vector<F> vec(10);
vec[0].doSomething(vec);
What happens if the push_back(...) in doSomething(...) causes the vector to expand? This means that vec[0] would be copied then deleted in the middle of executing its method. This would be no good.
Could someone explain what exactly happens here?
Does the program instantly crash? Does the method just try to operate on data that doesn't exist?
Does the method operate "orphaned" of its object until it runs into a problem like changing the object's state?
I'm interested in how a method call is related to the associated object.
Yes, it's bad. It's possible for your object to be copied (or moved in C++11, if the distinction is relevant to your code) while you are inside doSomething(). So after the push_back() returns, the this pointer may no longer point to the location of your object. For the specific case of vector::push_back(), it's possible that the memory pointed to by this has been freed and the data copied to a new array somewhere else. For other containers (list, for example) that leave their elements in place, this is (probably) not going to cause problems at all.
In practice, it's unlikely that your code is going to crash immediately. The most likely circumstance is a write to free memory and a silent corruption of the state of your F object. You can use tools like valgrind to detect this kind of behavior.
But basically you have the right idea: don't do this, it's not safe.
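One way to sidestep the problem is to stop relying on this across the push_back. The sketch below is only an illustration under stated assumptions: F is given a single hypothetical int member named value, and doSomething becomes a static function that addresses the element by index, so the element is looked up again after the vector may have reallocated.

```cpp
#include <cstddef>
#include <vector>

struct F
{
    int value = 0;  // hypothetical member, stands in for "some member variables"

    // Static, so there is no this pointer that could be left dangling.
    static void doSomething(std::vector<F>& vec, std::size_t index)
    {
        int before = vec[index].value;  // read what is needed before growing
        vec.push_back(F());             // may reallocate and move every element
        vec[index].value = before + 1;  // index again: refers to the live element
    }
};
```

It would be called as F::doSomething(vec, 0); instead of vec[0].doSomething(vec);. An index stays meaningful after reallocation, whereas a pointer or reference to the old element does not.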
Could someone explain what exactly happens here?
Yes. If you access the object, after a push_back, resize or insert has reallocated the vector's contents, it's undefined behavior, meaning what actually happens is up to your compiler, your OS, what do some more stuff is and maybe a number of other factors like maybe phase of the moon, air humidity in some distant location,... you name it ;-)
In short, this is (indirectly via the std::vector implementation) calling the destructor of the object itself, so the lifetime of the object has ended. Further, the memory previously occupied by the object has been released by the vector's allocator. Therefore the use of the object's nonstatic members results in undefined behavior, because the this pointer passed to the function does not point to an object any more. You can however access/call static members of the class:
struct F
{
static int i;
static int foo();
double d;
void bar();
// some member variables
void doSomething(std::vector<F>& vec)
{
vec.push_back(F());
int n = foo(); //OK
i += n; //OK
std::cout << d << '\n'; //UB - will most likely crash with access violation
bar(); //UB - what actually happens depends on the
// implementation of bar
}
};
I know that returning temporary variables using references doesn't work since the temporary object is lost after the function terminates, but the following piece of code works since the returned temporary is assigned to another object.
I assume the temporary objects get destroyed after the line of function call. If it is so, why isn't this working for this kind of method chaining?
Counter& Counter::doubler()
{
Counter tmp;
tmp.i = this->i * 2;
return tmp;
}
int main()
{
Counter d(2);
Counter d1, d2;
d1 = d.doubler(); // normal function call
std::cout << "d1=" << d1.get() << std::endl; // Output : d1=4
d2 = d.doubler().doubler(); // Method chaining
std::cout << "d2=" << d2.get() << std::endl; // Output : d2=0
return 0;
}
If a function returns a reference to a local object, the object will be destroyed as soon as the function returns (as local objects are). It does not persist to the end of the line of the function call.
Accessing an object after it has been destroyed will yield unpredictable results. Sometimes it may work, for some definition of "work", and sometimes it may not. Just don't do it.
Counter& doubler()
{
Counter tmp;
tmp.i=this->i*2;
return tmp;
}
It's undefined behaviour. After return from function - your reference will be dangling, since Counter destructor will be called for local object tmp.
The real question is not "why this kind of method chaining is not working?", but instead "why the first ('normal') function call works?"
The answer is there's no way to tell, because it might as well break your program.
To state it clearly: returning temporary object by reference is undefined behavior. Which, of course, means that it might work by coincidence today and stop working tomorrow. All bets are off.
When a function returns, the stack is rolled back only logically: the stack pointer is simply set to a different value. If the function returned a reference to a local variable, the memory location that held the local may still belong to the process and still hold the same bits. However, this is not guaranteed; after a few more calls the contents will no longer be valid, and accessing them is undefined behavior.
The others are all right that you should 'just not fiddle around with references to local objects'.
But as to why it works in one case and not in the other:
When you call it singly and the function returns, the object is still lying on the stack. Granted, it is a 'destructed' object, but whatever space the object used to take is still there on the stack. If you have a simple object, say with a single int member, then there is NOTHING disturbing it on the stack unless your code allocates something else on the stack, or the destructor decides to do a much more thorough job and obliterates the integer member (which most destructors do not do). Until the very next line, not much is going to happen that would remove it from the stack. Your reference is pointing to a memory location that still holds your (destructed) object. That is why it appears to work for you.
When you call it chained, the first call returns a reference to that tmp on the stack. As explained above, no problem so far: your (destructed) tmp is still very much there on the stack. But notice the moment you call the second doubler. Where is the tmp inside that second doubler call going to be placed? Right where the tmp from your first call was! The second call overwrites the object (the tmp with value 4) with a tmp with value 0 (the default-constructed one). The second call is in effect made on a Counter which has the value 0, hence you get 0. Extremely tricky, which is why you should just forget about returning references to local variables.
Now purists may scream 'undefined, no no, just don't do it', and I am with them; I have myself said twice (now thrice) not to do it. But people may try it. I bet that for a 'simple' object like the following, AND code exactly as in the question (so that nothing is disturbing the stack), everyone is going to get a consistent 4, 0: no randomness, no undefined ....
class Counter
{
public:
Counter()
{
i = 0;
}
Counter(int k)
{
i = k;
}
int get()
{
return i;
}
int i;
Counter& doubler();
};
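For contrast, a minimal sketch of the same class with doubler() returning by value, which is the usual fix for this question. It is an illustration only, not the asker's code; the const qualifiers and the explicit constructor are additions made here.

```cpp
class Counter
{
public:
    Counter() : i(0) {}
    explicit Counter(int k) : i(k) {}

    int get() const { return i; }

    // Returns a new Counter by value, so the caller never sees a reference
    // to the local tmp.
    Counter doubler() const
    {
        Counter tmp;
        tmp.i = i * 2;
        return tmp;
    }

    int i;
};
```

With this version, d.doubler().doubler() is well defined: each call runs on a temporary that lives until the end of the full expression.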
I used to think returning a reference is bad as our returned reference will refer to some garbage value. But this code works (matrix is a class):
const int max_matrix_temp = 7;
matrix& get_matrix_temp()
{
static int nbuf = 0;
static matrix buf[max_matrix_temp];
if(nbuf == max_matrix_temp)
nbuf = 0;
return buf[nbuf++];
}
matrix& operator+(const matrix&arg1, const matrix&arg2)
{
matrix& res = get_matrix_temp();
//...
return res;
}
What is buf doing here and how does it save us from having garbage values?
buf is declared as static, meaning it retains its value between calls to the function:
static matrix buf[max_matrix_temp];
i.e. it's not created on the stack as int i = 0; would be (a non-static local variable), so returning a reference to it is perfectly safe.
This following code is dangerous, because the memory for the variable's value is on the stack, so when the function returns and we move back up the stack to the previous function, all of the memory reservations local to the function cease to exist:
int * GetAnInt()
{
int i = 0; // create i on the stack
return &i; // return a pointer to that memory address
}
Once we've returned, we have a pointer to a piece of memory on the stack, and with dumb luck it will hold the value we want because it's not been overwritten yet — but the reference is invalid as the memory is now free for use as and when space on the stack is required.
I see no buf declared anywhere, which means it doesn't go out of scope with function return, so it's okay. (it actually looks like it's meant to be matrixbuf which is also fine, because it's static).
EDIT: Thanks to R. Martinho Fernandes for the guess. Of course it is matrix buf, so it makes buf static array in which temporary is allocated to make sure it doesn't get freed when the function returns and therefore the return value is still valid.
This is safe up to a point, but very dangerous. The returned reference
can't dangle, but if client code keeps it around, at some future point,
the client is in for a big surprise, as his value suddenly changes to a
new return value. And if you call get_matrix_temp more than
max_matrix_temp times in a single expression, you're going to end up
overwriting data as well.
In the days before std::string, in code using printf, I used this
technique for returning conversions of user defined types, where a
"%s" specifier was used, and the argument was a call to a formatting
function. Again, max_matrix_temp was the weak point: a single
printf which formatted more instances of my type would output wrong
data. It was a bad idea then, and it's a worse idea now.
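A sketch of the alternative that avoids the shared buffer altogether: operator+ returns a new matrix by value. The matrix type below is a hypothetical stand-in (four doubles in a std::array), since the question does not show the real class.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the question's matrix class.
struct matrix
{
    std::array<double, 4> v{};  // value-initialized to zeros
};

// Each call produces its own result object; nothing is shared between
// calls, so long expressions cannot overwrite each other's temporaries.
matrix operator+(const matrix& arg1, const matrix& arg2)
{
    matrix res;
    for (std::size_t k = 0; k < res.v.size(); ++k)
        res.v[k] = arg1.v[k] + arg2.v[k];
    return res;
}
```

An expression with more than max_matrix_temp additions, such as a + a + a + a + a + a + a + a + a, is safe with this version because no temporary is recycled.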
Being that my C++ isn't that great, this may be a really simple/obvious answer, but it sure has me stumped. Keep in mind its kinda late here and I'm a little tired. I got this code here:
void TestFunc(int *pVar)
{
cout << endl << *pVar << endl;
delete pVar;
pVar = nullptr;
}
int main(int argc, char *argv[])
{
int *z(new int);
*z = 5;
TestFunc(z);
if (z == nullptr)
cout << "z Successfully Deleted!" << endl;
else cout << "z NOT deleted!" << endl;
return 0;
}
The program compiles just fine with no errors or warnings. When I run it, it displays 5, just as I'd expect. However, it says z NOT deleted!. I am curious as to why pVar is not getting set to nullptr even though I explicitly set it in my TestFunc() function. Any help would be appreciated. If it matters, this is Visual Studio 2010 and just a regular unmanaged C++ application.
Because it's being passed by value (i.e. as a copy).
If you want the variable itself to be passed (rather than just its value, which is copied), use
void TestFunc(int *&pVar)
instead.
Note that delete only cares about the pointee, not the pointer. So "deleting" a copy of a pointer deletes the same thing as the original pointer, because in either case you're deleting their targets, which are the same.
TestFunc accepts the pointer by value, so setting it to null inside the function actually only affects the copy in the function and is not visible to the caller. So pVar is set to null, but z in main() is not because those are different variables.
To make the change visible to the caller pass the pointer by reference or via a double pointer.
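Both options can be sketched as follows. The function names deleteAndNull and deleteAndNullViaPtr are hypothetical, chosen here only for illustration.

```cpp
// Pointer passed by reference: p is another name for the caller's pointer,
// so assigning to it changes the caller's variable.
void deleteAndNull(int*& p)
{
    delete p;
    p = nullptr;
}

// The "double pointer" variant: the caller passes the address of its pointer.
void deleteAndNullViaPtr(int** pp)
{
    delete *pp;
    *pp = nullptr;
}
```

The first is called as deleteAndNull(z); and the second as deleteAndNullViaPtr(&z);. In both cases z compares equal to nullptr afterwards.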
It is late (sorry!!!).
z is passed by value. z is not pVar. You assigned the value of nullptr to pVar and not to z.
The "idiom" you present is generally advocated as Safe Delete.
It was already known in C (though it was free back then): zero out the pointer you just freed / deleted to avoid doing so twice.
The trouble is, you zero out the current pointer, but any copy of it still points to the same memory area, which now contains garbage.
Memory handling is a difficult topic, the fundamental concept is ownership. At any point in time, the owners of a particular memory zone should be well identified, and they should have the responsibility of returning it to the system when appropriate.
The first step in this direction is the use of smart pointers, for example std::unique_ptr or boost::scoped_ptr. For shared ownership (experts only), std::shared_ptr might come in handy, but you're not there yet.
If you write a delete in your code, it means you are exposing yourself to leaks. It is not bad in itself, but it calls for careful review, and makes the code brittle (ie, likely to break on change). In your case:
int main() {
boost::scoped_ptr<int> i(new int(5));
foo(*i);
} // memory returned to system by ~scoped_ptr()
I came across an issue today regarding local variables. I learned that...
int * somefunc()
{
int x = 5;
return &x;
}
int * y = somefunc();
//do something
is bad, unsafe, etc. I'd imagine that the case is the same for...
int * somefunc()
{
int * x = new int;
x = 5;
return x;
}
int * y = somefunc();
//do something
delete y;
I've been under the impression for the longest time that this would be safe as the address of x stays in scope when it's returned. However, I'm having second thoughts now and I'm thinking this would lead to memory leaks and other problems, just as the first example would. Can someone confirm this for me?
As it stands, the second example is wrong. You probably meant this:
int * somefunc()
{
int * x = new int;
*x = 5; // note the dereferencing of x here
return x;
}
Now this is technically fine, but it is prone to errors. First, if after the allocation of x an exception happens, you have to catch it, delete x and then rethrow, or you get a memory-leak. Second, if you return a pointer, the caller has to delete it - callers forget.
The recommended way would be to return a smart pointer, like boost::shared_ptr. This would solve the problems mentioned above. To understand why, read about RAII.
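A minimal sketch of that recommendation, assuming C++11's std::unique_ptr is swapped in for boost::shared_ptr because ownership is not shared here. The name somefuncOwned is hypothetical.

```cpp
#include <memory>

// Returns an owning smart pointer instead of a raw int*.
// The int is freed automatically when the last owner goes out of scope,
// including when an exception unwinds the stack.
std::unique_ptr<int> somefuncOwned()
{
    std::unique_ptr<int> x(new int);
    *x = 5;
    return x;  // ownership moves to the caller
}
```

The caller writes std::unique_ptr<int> y = somefuncOwned(); and never calls delete, so the "callers forget" problem goes away.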
Yes, you're taking the risk of leaking memory. (compile errors aside.)
Doing this for an int is silly, but the principle is the same even if it's a large structure.
But understand: you've written C-style code, where you have a function that allocates storage.
If you're trying to learn C++, you should put somefunc() and the data it operates on into a class. Methods and data together. A class can also do RAII as Space_C0wb0y pointed out.
You might be making int * as just an example, but really, in the case you noted, there is not a reason to return int *, just return int, the actual value is more than good enough. I see these situations all the time, getting overly complicated, when, what is actually needed, is just to simplify.
In the case of 'int *', I can only really think of a realistic case of returning an array of ints, if so, then you need to allocate that, return that, and hopefully, in your documentation, note that it has to be released.
The first approach certainly leads to problems, as you are now well aware.
The second is kind of OK, but it demands attention from the programmer, because he needs to explicitly delete the returned pointer (as you did). This gets harder as your application grows larger: using this method will probably cause problems (memory leaks), as the programmer will find it difficult to keep track of every single variable he needs to deallocate.
A 3rd approach for this scenario, is to pass a variable by reference to be used inside the function, which is way safer.
void somefunc(int& value)
{
value = 5;
}
// some code that calls somefunc()
int a_value = 0;
somefunc(a_value);
// printing a_value will display 5
Yes, the second is fine, so long as you dereference that 'x' before assigning!
Ok, I would analyze this by answering these questions:
What does x contain? - A memory location (since it is a pointer variable).
What is the scope of x? - Since it is an auto variable, its scope is limited to the function somefunc().
What happens to auto variables once they exit the local scope? - They are deleted from the stack space.
So what happens to x after the return from somefunc()? - Since it is an auto variable declared on the stack, its scope (lifetime) is limited to somefunc() and hence it will be deleted.
Ok, so now what happens to the value pointed to by x? - We have a memory leak, as the value is allocated on the heap and we have just lost the address when x is deleted.
What does y get? - No idea.
What happens when y is deleted? - No idea.
The point is not to return a pointer or reference to a local variable, because once the function returns, locals don't exist.
However, the return value still exists, and dynamically allocated memory certainly exists as well.
In C++, we prefer to avoid raw pointers whenever possible. To "return a value that already exists" (i.e. the function does not create a new value), use a reference. To "return a value that didn't already exist" (i.e. the function creates a new value, in the idiomatic sense, not the new keyword sense) use a value, or if necessary, some kind of smart pointer wrapper.
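The two cases can be sketched side by side. The functions largest and sum are hypothetical examples invented here to illustrate the rule; they are not from the question.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// "Value that already exists": the element lives in the caller's vector,
// so returning a reference to it is safe while the vector is unchanged.
// Precondition (assumed): values is not empty.
int& largest(std::vector<int>& values)
{
    return *std::max_element(values.begin(), values.end());
}

// "Value that didn't already exist": the total is computed inside the
// function, so it is returned by value.
int sum(const std::vector<int>& values)
{
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Writing largest(v) = 1; modifies the element inside v, while sum(v) hands back an independent int.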
It's both a memory leak and a crash (because of the delete).