C++ return by reference stack allocation details - c++

Can someone walk through exactly what happens with the memory in this operator overload function? I am confused about how exactly the object created inside the operator function gets deallocated in main.
Object& operator+(const Object& other) {
Object o(*this); //create instance of o that deep copies first argument
...
//copy contents of other and add onto o
return o;
}
int main() {
Object b;
Object c;
Object a = b + c;
}
Edit: to be more specific, isn't it bad practice to create a local object in a function and then return it by reference? Wouldn't that cause a memory leak?
Edit 2: I am referencing my textbook, Data Abstraction & Problem Solving with C++ (Carrano), which suggests an operator+ overload for LinkedLists in this format: LinkedList<ItemType>& operator+(const LinkedList<ItemType>& rightHandSide) const;. They implemented the method in the way I described.
Edit 2.5: the full method pseudocode given by the book:
LinkedList<ItemType>& operator+(const LinkedList<ItemType>& rightHandSide) const {
concatList = a new, empty instance of LinkedList
concatList.itemCount = itemCount + rightHandSide.itemCount
leftChain = a copy of the chain of nodes in this list
rightChain = a copy of the chain of nodes in the list rightHandSide
concatList.headPtr = leftChain.headPtr
return concatList
}
Edit 3: Asked my professor about this. Will get to the bottom of this by tomorrow.
Edit 4: The book is wrong.

Returning a reference to a local object
As everyone else correctly states, returning a reference to a local object results in undefined behaviour. You will end up with a handle to a destroyed function-scope object.
Returning references in arithmetic operators
If you think about it, a+b should give you a result, but it shouldn't change a or b. C++, however, leaves it up to you to define how operators work on your own types, so it's possible to implement whatever behaviour you need. This is why operator+ usually has to create a new object and can't return a reference.
Compound assignments (+=, -=, etc.), on the other hand, do change the object itself, so a += b changes a. This is why they are usually implemented by returning a reference (not to a local object, but to the instance itself):
Object& Object::operator+=(const Object& rhs)
{
// do internal arithmetics to add 'rhs' to this instance
return *this; // here we return the reference, but this isn't a local object!
}
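A common way to combine the two ideas is to write operator+ in terms of operator+= and return the result by value. The following is a minimal sketch, assuming a hypothetical Object that wraps a single int (the member name value and the helper demo_plus are illustrative, not from the question):

```cpp
#include <cassert>

// Hypothetical value type used only for illustration.
struct Object {
    int value;

    explicit Object(int v = 0) : value(v) {}

    // Compound assignment modifies *this and returns a reference to it.
    Object& operator+=(const Object& rhs) {
        value += rhs.value;
        return *this;
    }
};

// operator+ returns a new object by value, never a reference to a local.
// Taking lhs by value gives the function its own copy to modify.
Object operator+(Object lhs, const Object& rhs) {
    lhs += rhs;
    return lhs;
}

// Small check: the sum is correct and neither operand changes.
void demo_plus() {
    Object b(2);
    Object c(3);
    Object a = b + c;
    assert(a.value == 5);
    assert(b.value == 2 && c.value == 3);
}
```

Because the result is returned by value, the caller owns a complete object and nothing refers to the function's destroyed locals.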

It wouldn't cause a memory leak, but o gets destroyed when it goes out of scope, when the function returns. So the reference the caller has is junk. It might appear to work fine for a short time until the memory is overwritten later.

It is simply undefined behavior.
In terms of what happens to memory, the memory is no longer reserved for the object after the function returns (because the object's lifetime has ended).
So it can contain ANYTHING, including the same object's contents by coincidence.

Related

Returning an object from a function in C++: its process and nature

I'm quite confused about returning an object from a function. For example:
class A
{
public:
~A(){}
};
A find()
{
...
A a;
return a;
}
Does it return "a" by reference or by value? Moreover, does "find" delete "a" first, then return or return first, then delete "a"?
In your function, you are returning a value, not a reference.
A find();
The return type is A, so it returns by value: a copy of a is returned.
In order to return a reference, you would declare your function as follows.
A& find();
The return type A& means a reference to A. But the function body would have to change accordingly in order to return a valid reference.
With the current implementation, you create the object a inside the function, so it is destroyed when it goes out of scope at the end of the function's execution.
Your question: "Moreover, does "find" delete "a" first, then return or return first, then delete "a"?"
A copy of a is returned first, then a is destroyed.
If you returned a reference, the reference would be returned and then the object a would be destroyed, leaving the reference dangling.
In this code you return by value, and this is where return value optimization (RVO) comes into play. Conceptually, a copy of 'a' is created, the original 'a' is destroyed, and the copy is returned, though the exact sequence is not guaranteed.
The easy part: Does it return "a" by reference or by value?
A find()
{
...
A a;
return a;
}
Returns by value.
The hard part: Moreover, does "find" delete "a" first, then return or return first, then delete "a"?
Technically a copy of a is constructed, a is destroyed and the copy is returned. I cannot find anything in the C++ standard that specifies any particular ordering to those operations, but some are logically implied. Obviously you cannot copy after destruction.
I suspect this is left unspecified to allow C++ implementations to support a wide variety of calling conventions.
Note: This means the returned object must be copyable or, since C++11, movable. If the copy and move constructors are both deleted or inaccessible, you cannot return a local object by value like this.
There is no way to be certain of whether the return or the destruction is first. It should not matter and if you are designing a program where it does, give your head a shake.
Caveat
However in practice a modern optimizing compiler will do anything in its power to avoid copying and destroying using a variety of approaches under the blanket name of Return Value Optimization.
Note that this is a rare case where the As-If Rule is allowed to be violated. By skipping the copy construction and destruction, some side effects may not take place.
Also note that even if the copy is eliminated here, the object must still be copyable or movable.
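One way to see what a particular compiler does is to count the special member calls. The sketch below assumes a hypothetical Traced type with static counters (all names here are illustrative). It only asserts what holds regardless of whether the optimization is applied: every object that is created is also destroyed, and a local returned by name is never copied when a move constructor is available.

```cpp
#include <cassert>

// Hypothetical tracing type: counts which special members run.
struct Traced {
    static int constructed;
    static int copied;
    static int moved;
    static int destroyed;

    Traced() { ++constructed; }
    Traced(const Traced&) { ++copied; }
    Traced(Traced&&) noexcept { ++moved; }
    ~Traced() { ++destroyed; }
};

int Traced::constructed = 0;
int Traced::copied = 0;
int Traced::moved = 0;
int Traced::destroyed = 0;

Traced find() {
    Traced a;
    return a;  // the compiler may build a directly in the caller's storage
}

void demo_return_by_value() {
    {
        Traced result = find();
        (void)result;  // silence unused-variable warnings
    }  // result is destroyed here

    // Every object that came into existence has been destroyed again.
    assert(Traced::constructed + Traced::copied + Traced::moved
           == Traced::destroyed);
    // Returning a local by name prefers the move constructor over the copy.
    assert(Traced::copied == 0);
}
```

How many moves are counted depends on the compiler and its flags, which is why the sketch does not assert a specific number.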
Sidenote:
A & find()
{
...
A a;
return a;
}
will return a reference, but this is a very bad idea. a has Automatic storage duration scoped by the function and will be destroyed on return. This leaves the caller with a dangling reference, a reference to a variable that no longer exists.
To get around this,
std::unique_ptr<A> find()
{
...
auto a = std::make_unique<A>();
return a;
}
but you will find that, with a modern compiler, this is no better than returning by value in most cases.

Memory Management in C++

I'm trying to test how and where the data is located and destroyed. For the example code below:
1. A new point is created and returned by the NewCartesian method. I know that it should be stored in heap. When it's pushed back into the vector, is the point's content copied into a new Point structure? Or is it stored as a reference to the this pointer?
2. When the point is created, when is it destroyed? Do I need to destroy the points when I'm done with them? Or are they destroyed when they are not useful anymore? For example, if main were another function, the vector would not be useful once that function finished.
3. Depending on the answers above, when is it good to use a reference to an object? Should I use Point& p or Point p for the return of Point::NewCartesian?
#define _USE_MATH_DEFINES
#include <iostream>
#include <cmath>
using namespace std;
struct Point {
private:
Point(float x, float y) : x(x), y(y) {}
public:
float x, y;
static Point NewCartesian(float x, float y) {
return{ x, y };
}
};
int main()
{
vector<Point> vectorPoint;
for (int i = 0; i < 10000; i++) {
Point& p = Point::NewCartesian(5, 10);
vectorPoint.push_back( p );
// vectorPoint.push_back( Point::NewCartesian(5, 10) );
Point& p2 = Point::NewPolar(5, M_PI_4);
}
cout << "deneme" << endl;
getchar();
return 0;
}
Thank you for your help,
Cheers,
... I know that it should be stored in heap.
Firstly, please read this explanation of why it's preferable to talk about automatic and dynamic object lifetimes, rather than stack/heap.
Secondly, that object is neither dynamically-allocated nor on the heap. You can tell because dynamic allocation uses a new-expression, or a library function like malloc, calloc or possibly mmap. If you don't have any of those (and you almost never should), it's not dynamic.
You're returning something by value, so that thing's lifetime is definitely automatic.
When the point is created, when is it destroyed?
If you write the full set of copy/move constructors and assignment operators, plus a destructor, you can simply set breakpoints in them in the debugger and see where they get invoked. Or, have them all print their this pointer and input arguments (ie, the source object being moved or copied from).
However, since we know the object is automatic, the answer is easy - when it goes out of scope.
Should I use Point& p or Point p for the return of Point::NewCartesian?
Definitely the second: the first returns a reference to an object with automatic lifetime in the scope of the NewCartesian function, meaning the object referred to is already dead by the time the caller gets the reference.
Finally, this code
Point& p = Point::NewCartesian(5, 10);
is weird - it makes it hard to determine the lifetime of the Point referred to by p by reading the code. It could be some static/global/other object with dynamic lifetime to which NewCartesian returns a reference, or (as is actually the case) you could be binding a reference to an anonymous temporary. Standard C++ does not even allow a non-const lvalue reference to bind to a temporary; some compilers, such as MSVC in its permissive mode, accept it as an extension. There's no benefit to writing it this way instead of
Point p = Point::NewCartesian(5, 10);
or just passing the temporary straight to push_back as in your commented code.
As an aside, the design of Point is very odd. It has public data members, but a private constructor, and a public static method that just calls the constructor. You could omit the constructor and static method entirely and just use aggregate initialization, or omit the static method and make the constructor public.
1a. No, it's on the stack, but read the answer by Useless on why the terms stack and heap are not the best choice.
1b. It gets copied when you call push_back.
2. It's destroyed immediately after being created, because it only exists within the scope of the NewCartesian call and for the duration of the return being evaluated.
3a. You use a reference whenever you have a valid instance and want to pass it to a function without creating a copy. Specifically the function should have a reference parameter.
3b. You should use Point p, not Point& p, because right now you get a dangling reference to an object that doesn't exist anymore (see 2.)
As pointed out by Steven W. Klassen in the comments, your best option is the code that you have commented out: vectorPoint.push_back( Point::NewCartesian(5, 10) );. Passing the result of NewCartesian directly to push_back, without first storing it in a named local, lets the compiler move the temporary into the vector's storage instead of copying it, and gives it more room to optimize away intermediate objects. (More technically, it selects the push_back overload that takes an rvalue reference, which uses the move constructor.)
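The points above can be checked with a short sketch. It assumes a simplified Point without the private constructor and a hypothetical free function make_point standing in for Point::NewCartesian. It shows that the vector holds its own copy of each element and that nothing has to be destroyed by hand:

```cpp
#include <cassert>
#include <vector>

// Simplified Point for illustration (public members, no private constructor).
struct Point {
    float x, y;
};

// Hypothetical stand-in for Point::NewCartesian: returns by value.
Point make_point(float x, float y) {
    return Point{x, y};
}

void demo_vector_owns_copies() {
    std::vector<Point> points;

    Point p = make_point(5, 10);  // p is an automatic object in this scope
    points.push_back(p);          // the vector stores its own copy of p
    p.x = 99;                     // changing p does not touch the stored copy
    assert(points[0].x == 5);

    // Passing the temporary directly avoids the named local altogether.
    points.push_back(make_point(1, 2));
    assert(points.size() == 2);
    assert(points[1].y == 2);
}  // p and points (with all its elements) are destroyed here automatically
```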

Is it possible to do a swap on return in c++, instead of return by value?

Suppose I'm coding a string class in C++ (I know I can use the library). The string length is variable and the storage space is dynamically allocated in the constructor and freed in the destructor. When the main function calls c=a+b (a,b,c are strings), the operator+ member function creates a temporary object that stores the concatenated string a+b, returns it to the main function, and then the operator= member function is called to free the string originally stored in c and copy data from the temporary string a+b to c, and finally the temporary a+b is destructed.
I'm wondering if there's a way to make this happen: instead of having the operator= copy data from a+b to c, I want it to swap the data pointers of a+b and c, so that when a+b is destructed, it destructs the original data in c (which is what we want), while c now takes the result of a+b without needing to copy.
I know coding a 2-parameter member function setToStrcat and calling c.setToStrcat(a,b) can do this. For example, the function can be coded as:
void String::setToStrcat(const String& a,const String& b){
String tmp(a.len+b.len); int i,j;
for(i=0;i<a.len;i++) tmp[i]=a[i];
for(j=0;j<b.len;i++,j++) tmp[i]=b[j];
tmp[i]='\0'; this->swap(tmp);
}
void String::swap(String& a){
int n=len; len=a.len; a.len=n;
char *s=str; str=a.str; a.str=s;
}
I omitted the definitions of my constructor (which allocates len+1 char-type spaces) and operator[] (which returns a reference of the ith character). The swap function swaps the data pointers and length variables between *this and tmp, so that when tmp is destructed after the swap, it is actually the data originally stored in *this (the String c in the main function) that is destructed. What *this now has in its possession (c.str) is the concatenated string a+b.
I would like to know if there is a way to optimize the performance of c=a+b to the same level. I tried c.swap(a+b) and changed the return type of a+b to String&, but I receive warning (reference to a local variable) and GDB shows that the temporary gets destructed before the swap happens, while I want the other way.
I think my question is general. In C++ programming, we often need a temporary object to store the result of a function, but when we assign it to another object in the main function, can we not copy the data but use a (much faster) swap of pointers instead? What is a neat way of making this happen?
In C++11, you can do this by writing a move constructor. Rvalue references were added to the language to solve this exact problem.
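class String {
...
// Move constructor: start empty, then take over s's buffer.
String(String&& s) : len(0), str(nullptr) {
this->swap(s);
}
// Move assignment: swap buffers; s's destructor frees our old data.
String& operator=(String&& s) {
this->swap(s);
return *this;
}
...
String operator+(String const& other) const {
// (your implementation of concatenation here)
}
...
};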
Then code like this will not trigger an extra copy or memory allocation; it just moves the allocated memory from the temporary (the thing returned from operator+) to the new object c.
String c = a + b;
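For reference, here is a self-contained sketch of such a class. It keeps the question's member names len and str; everything else (the const char* constructor, c_str, size, the friend operator+ and demo_move) is an assumption made for illustration, not the asker's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical minimal String; len and str follow the question's naming.
class String {
public:
    explicit String(const char* s = "")
        : len(std::strlen(s)), str(new char[len + 1]) {
        std::memcpy(str, s, len + 1);  // copy the characters plus '\0'
    }

    // Copy constructor: allocate a fresh buffer and duplicate the text.
    String(const String& other) : String(other.c_str()) {}

    // Move constructor: start empty, then take over other's buffer.
    String(String&& other) noexcept : len(0), str(nullptr) { swap(other); }

    // Copy assignment: copy into a temporary, then swap with it.
    String& operator=(const String& other) {
        String tmp(other);
        swap(tmp);
        return *this;
    }

    // Move assignment: swap buffers; other's destructor frees our old data.
    String& operator=(String&& other) noexcept {
        swap(other);
        return *this;
    }

    ~String() { delete[] str; }  // delete[] on nullptr is a no-op

    void swap(String& other) noexcept {
        std::swap(len, other.len);
        std::swap(str, other.str);
    }

    std::size_t size() const { return len; }
    const char* c_str() const { return str ? str : ""; }

    friend String operator+(const String& a, const String& b) {
        String result;
        char* buf = new char[a.len + b.len + 1];
        std::memcpy(buf, a.c_str(), a.len);
        std::memcpy(buf + a.len, b.c_str(), b.len + 1);  // includes '\0'
        delete[] result.str;  // release the empty string's buffer
        result.str = buf;
        result.len = a.len + b.len;
        return result;
    }

private:
    std::size_t len;  // declared before str so it is initialized first
    char* str;
};

void demo_move() {
    String a("foo");
    String b("bar");
    String c("old contents");
    c = a + b;  // move assignment: c takes the temporary's buffer
    assert(std::strcmp(c.c_str(), "foobar") == 0);
    assert(c.size() == 6);
}
```

In c = a + b, the temporary ends up holding c's old buffer after the swap and frees it when it is destroyed, which is the behaviour the question asks for.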

Returning a class by value c++?

So I'm trying to write this class. One of the things I want to be able to do is to add two together, so I'm overloading the addition operator. But here's the thing, I don't want to return a pointer, I want to return the class "by value", so that the addition operator works without messing with pointers.
My current approach doesn't work, because the class I create goes out of scope, and the only other way I can think of is to do it with pointers. Is there any other way to do this, without calling new and allocating memory that will later have to be deleted by the user of the class?
The current code:
Polynomial operator+(const Polynomial &lhs, const Polynomial &rhs)
{
Polynomial newPoly;
newPoly.addWithOther(lhs);
newPoly.addWithOther(rhs);
return newPoly;
}
Well, I figured it out. The copy constructor is called when the return statement executes, and the caller receives that copy. After that, the function's local objects go out of scope and are destroyed. If the copy constructor does not perform a deep copy of those objects (or, in my case, a 'deep enough' copy), the copy is left pointing at memory that the destroyed local has already freed.
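One way to sidestep the deep-copy problem entirely is to store the data in a standard container, so the compiler-generated copy and move operations are already correct. The sketch below assumes a hypothetical Polynomial whose coefficients live in a std::vector<int> (index i holds the coefficient of x^i); only addWithOther and operator+ come from the question, and their bodies here are guesses at the intent.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical Polynomial: coeffs_[i] is the coefficient of x^i.
class Polynomial {
public:
    Polynomial() = default;
    explicit Polynomial(std::vector<int> coeffs) : coeffs_(std::move(coeffs)) {}

    // Adds other's coefficients onto this polynomial, growing if needed.
    void addWithOther(const Polynomial& other) {
        if (other.coeffs_.size() > coeffs_.size()) {
            coeffs_.resize(other.coeffs_.size(), 0);
        }
        for (std::size_t i = 0; i < other.coeffs_.size(); ++i) {
            coeffs_[i] += other.coeffs_[i];
        }
    }

    // Coefficient of x^i, or 0 beyond the highest stored term.
    int coefficient(std::size_t i) const {
        return i < coeffs_.size() ? coeffs_[i] : 0;
    }

private:
    std::vector<int> coeffs_;  // the vector copies and frees itself correctly
};

// Returning by value is safe: the vector member is copied or moved as a whole.
Polynomial operator+(const Polynomial& lhs, const Polynomial& rhs) {
    Polynomial newPoly;
    newPoly.addWithOther(lhs);
    newPoly.addWithOther(rhs);
    return newPoly;
}

void demo_polynomial_sum() {
    Polynomial p(std::vector<int>{1, 2});     // 1 + 2x
    Polynomial q(std::vector<int>{0, 3, 4});  // 3x + 4x^2
    Polynomial sum = p + q;                   // 1 + 5x + 4x^2
    assert(sum.coefficient(0) == 1);
    assert(sum.coefficient(1) == 5);
    assert(sum.coefficient(2) == 4);
}
```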

C++ overloading the = operator

When overloading the = operator, should one make the contents of one object equal to the contents of the other object OR do you make the pointer of the object point to the same object?
Reading back on the question it seems that the contents should be copied and not the pointers. But I just can't figure it out, So I would be grateful if someone would explain what I should do, I know how to do both, I'm just not sure which one to choose.
class IntObject
{
private:
int *pi_One;
public:
IntObject(void);
IntObject(int const &i_one);
~IntObject(void);
IntObject & operator=(const IntObject&);
};
IntObject::IntObject()
{
pi_One = new int(0);
}
IntObject::IntObject(int const &i_one)
{
pi_One = new int(i_one);
}
IntObject::~IntObject(void)
{
delete pi_One;
}
IntObject & IntObject::operator=(const IntObject& c) {
//This copies the pointer to the ints
this->pi_One = c.pi_One;
return *this;
}
It depends on what semantics you want your type to have. If you want value semantics, copy the contents (a deep copy, as std::vector does). If you want reference semantics, copy the pointer (a shallow copy, as std::shared_ptr does), but then a plain delete in the destructor is no longer safe.
You should definitely copy the contents, not the pointers. Think about what you will do when one of the objects which both hold the same pointer is destroyed; you can't delete the pointer because the other object would be affected as well, but you can't not delete it either because you'd cause memory leaks. You'd have to use reference counting and everything would get a whole lot more complicated.
The contents should be copied (in fact, changing the pointer of the object shouldn't actually be possible - I can't imagine how you would do that - and even if it is somehow, you're not supposed to do it). You also have to take care of the differences between deep and shallow copies, especially if your class contains pointers (or containers with pointers in them).
Now that I think about it, I'm not even sure which pointer you could possibly want to reassign. Unless you are already working with a pointer - those already have an '=' operator that shouldn't be overloaded though.
The principle of least astonishment would say to copy the content. When using operator= on any other object, you wouldn't expect it to copy pointers.
If you keep your destructor as it is, then you should change the assignment overload. It would also be wise to guard against assigning an IntObject to itself:
IntObject & IntObject::operator=(const IntObject& c) {
if (this != &c)
{
// Copy the pointed-to value, not the pointer itself
*this->pi_One = *c.pi_One;
}
return *this;
}
Otherwise, both objects would hold the same pointer and IntObject's destructor would try to free memory that has already been freed.
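The same reasoning applies to the copy constructor, which the question's class does not define: the compiler-generated one copies the pointer and leads to the same double delete. A minimal sketch of the class with both operations doing a deep copy follows; the get and set accessors and demo_deep_copy are hypothetical additions used only to observe the behaviour.

```cpp
#include <cassert>

class IntObject {
public:
    explicit IntObject(int value = 0) : pi_One(new int(value)) {}

    // Copy constructor: allocate a separate int holding the same value.
    IntObject(const IntObject& other) : pi_One(new int(*other.pi_One)) {}

    // Copy assignment: copy the pointed-to value, not the pointer.
    IntObject& operator=(const IntObject& other) {
        if (this != &other) {
            *pi_One = *other.pi_One;
        }
        return *this;
    }

    ~IntObject() { delete pi_One; }

    // Hypothetical accessors, added for the demonstration only.
    int get() const { return *pi_One; }
    void set(int v) { *pi_One = v; }

private:
    int* pi_One;
};

void demo_deep_copy() {
    IntObject a(1);
    IntObject b(2);
    b = a;      // b now holds its own int with value 1
    a.set(42);  // changing a must not affect b
    assert(b.get() == 1);

    IntObject c = a;  // copy construction also allocates a separate int
    a.set(7);
    assert(c.get() == 42);
}  // each object deletes only its own int
```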