Memory Management in C++

I'm trying to understand where objects are stored and when they are destroyed. For the example code below:
A new point is created and returned by the NewCartesian method. I know that it should be stored in heap. When it's pushed back into the vector, is the point's content copied into a new Point structure, or is it stored as a reference to the original object?
When the point is created, when is it destroyed? Do I need to destroy the points when I'm done with them, or are they destroyed when they are no longer useful? For example, if main were another function, the vector would no longer be needed once that function finished.
Depending on the answers above, when is it good to use references to objects? Should I use Point& p or Point p for the return of Point::NewCartesian?
#define _USE_MATH_DEFINES
#include <iostream>
#include <cmath>
#include <cstdio>
#include <vector>
using namespace std;
struct Point {
private:
Point(float x, float y) : x(x), y(y) {}
public:
float x, y;
static Point NewCartesian(float x, float y) {
return{ x, y };
}
};
int main()
{
vector<Point> vectorPoint;
for (int i = 0; i < 10000; i++) {
Point& p = Point::NewCartesian(5, 10);
vectorPoint.push_back( p );
// vectorPoint.push_back( Point::NewCartesian(5, 10) );
Point& p2 = Point::NewPolar(5, M_PI_4);
}
cout << "deneme" << endl;
getchar();
return 0;
}
Thank you for your help,
Cheers,

... I know that it should be stored in heap.
Firstly, please read this explanation of why it's preferable to talk about automatic and dynamic object lifetimes, rather than stack/heap.
Secondly, that object is neither dynamically-allocated nor on the heap. You can tell because dynamic allocation uses a new-expression, or a library function like malloc, calloc or possibly mmap. If you don't have any of those (and you almost never should), it's not dynamic.
You're returning something by value, so that thing's lifetime is definitely automatic.
When the point is created, when is it destroyed?
If you write the full set of copy/move constructors and assignment operators, plus a destructor, you can simply set breakpoints in them in the debugger and see where they get invoked. Or, have them all print their this pointer and input arguments (ie, the source object being moved or copied from).
However, since we know the object is automatic, the answer is easy - when it goes out of scope.
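A minimal sketch of the instrumentation idea above. Tracer is a hypothetical stand-in for Point (it is not part of the original code), and it counts calls instead of printing so the result is easy to check. It assumes the vector does not reallocate, hence the reserve call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Point that counts its special member calls.
struct Tracer {
    static int live;    // objects currently alive
    static int copies;  // copy constructions so far
    static int moves;   // move constructions so far
    float x, y;

    Tracer(float x, float y) : x(x), y(y) { ++live; }
    Tracer(const Tracer& o) : x(o.x), y(o.y) { ++live; ++copies; }
    Tracer(Tracer&& o) noexcept : x(o.x), y(o.y) { ++live; ++moves; }
    ~Tracer() { --live; }

    static Tracer make(float x, float y) { return Tracer{x, y}; }
};
int Tracer::live = 0;
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns how many Tracer objects are alive while the vector is in scope.
int live_inside_scope() {
    std::vector<Tracer> v;
    v.reserve(4);                      // avoid reallocation in this sketch
    v.push_back(Tracer::make(5, 10));  // temporary is moved in, then destroyed
    assert(v.size() == 1);
    return Tracer::live;               // only the vector's element remains
}   // v goes out of scope here and destroys its element
```

Calling live_inside_scope() and then reading Tracer::live shows the two lifetimes involved: the temporary ends with the push_back statement, and the vector's element ends with the vector.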
Should I use Point& p or Point p for the return of Point::NewCartesian?
Definitely the second: the first returns a reference to an object with automatic lifetime in the scope of the NewCartesian function, meaning the object referred to is already dead by the time the caller gets the reference.
Finally, this code
Point& p = Point::NewCartesian(5, 10);
is weird - it makes it hard to determine the lifetime of the Point referred to by p by reading the code. It could be some static/global/other object with dynamic lifetime to which NewCartesian returns a reference, or (as is actually the case) you could be binding a reference to an anonymous temporary. There's no benefit to writing it this way instead of
Point p = Point::NewCartesian(5, 10);
or just passing the temporary straight to push_back as in your commented code.
As an aside, the design of Point is very odd. It has public data members, but a private constructor, and a public static method that just calls the constructor. You could omit the constructor and static method entirely and just use aggregate initialization, or omit the static method and make the constructor public.
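As a rough sketch of the aggregate-initialization alternative: Point below has no constructor and no static factory. The free function from_polar and the helper make_points are hypothetical names introduced only for this illustration (the original code calls a NewPolar that is never defined).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Aggregate version: no constructor, no factory, public members.
struct Point {
    float x, y;
};

// Hypothetical free function standing in for the undefined NewPolar.
Point from_polar(float r, float theta) {
    return { r * std::cos(theta), r * std::sin(theta) };
}

std::vector<Point> make_points(std::size_t n) {
    std::vector<Point> points;
    points.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        points.push_back(Point{5.0f, 10.0f});  // aggregate initialization
    }
    assert(points.size() == n);
    return points;  // the vector owns its elements and destroys them with itself
}
```

Nothing here needs to be destroyed by hand: each Point lives inside the vector and goes away when the vector does.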

1a. No, it's on the stack, but read the answer by Useless, why the terms stack and heap are not the best choice.
1b. It gets copied when you call push_back.
2 . It's destroyed immediately after being created, because it only exists within the scope of the NewCartesian call and for the duration of the return being evaluated.
3a. You use a reference whenever you have a valid instance and want to pass it to a function without creating a copy. Specifically the function should have a reference parameter.
3b. You should use Point p, not Point& p, because right now you get a dangling reference to an object that doesn't exist anymore (see 2.)
As pointed out by Steven W. Klassen in the comments, your best option is the code that you have commented out: vectorPoint.push_back( Point::NewCartesian(5, 10) );. Passing the result of NewCartesian directly to push_back, without making a separate local copy, lets the compiler optimize the call and avoid intermediate copies. (More technically, it allows push_back to use the move constructor.)

Related

Making changes to object state from within a function

I have had to simplify some of my code to ask this question. In the code below, does the fact that I am not declaring x as a reference type mean that my change will be "forgotten" once the function has exited?
Would the smartest fix be to declare x as AnotherClass& x?
void MyClass::myFunc(unordered_map<int, AnotherClass>* dictionary, int z, int y){
AnotherClass x = dictionary->at(z);
//Does this change on x get "forgotten" in terms of what dictionary stores
//once myFunc() has finished, because x is not a reference/pointer type?
x.changeSomething(y--);
}
class MyClass{
public:
private:
void myFunc(unordered_map<int, AnotherClass>* dictionary, int z, int y);
unordered_map<int, AnotherClass>* dictionary;
};
Correct. x is a copy of an element of dictionary. You are applying changes to the copy, which is local to the function. You should see no effects in the caller side. You can either take a reference, or act directly on the result of the call to at:
dictionary->at(z).changeSomething(y--);
Note that this has nothing to do with the code being inside a function.
In languages like Java or C#, when you write Thing t = s; for a class type, you are actually creating an alias t that refers to the same object in memory as s. In C++, however, values and aliases are strictly separated:
Thing t = s; is about making a copy of s
Thing& t = s; is about creating an alias referring to the same object as s (a reference)
Thing* t = &s; is about creating an alias referring to the same object as s (a pointer)
The difference between references and pointers does not matter here, what matters is the difference between copies and aliases.
Changes to a copy are local to that copy
Changes to an object via an alias are local to that object, and visible through all aliases referring to that object
In terms of your example:
// Fix 1: take dictionary by *reference* and not by *pointer*.
void MyClass::myFunc(std::unordered_map<int, AnotherClass>& dictionary, int z, int y){
// Fix 2: dictionary.at(z) returns a "AnotherClass&"
// that is an alias to the element held within the dictionary.
// Make sure not to accidentally make a copy by using that "&"
// for the type of x.
AnotherClass& x = dictionary.at(z);
// "x" is now a mere alias, the following call is thus applied
// to the value associated to "z" within "dictionary".
x.changeSomething(y--);
}
Note that you could write dictionary.at(z).changeSomething(y--); in this case; however, naming the reference has several advantages:
if x is reused more than once, naming it makes the code clearer.
in cases where the function/method invoked has side effects, the number of calls is important and needs to be controlled.
from a performance point of view, avoiding computing the same thing over and over is always welcome... but don't get too hung up on performance ;)
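To make the copy versus alias difference concrete, here is a small sketch. AnotherClass is a hypothetical stand-in (the real class is not shown in the question), and edit_copy, edit_alias and value_after are names made up for this example.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical AnotherClass: changeSomething just stores the value.
struct AnotherClass {
    int value = 0;
    void changeSomething(int v) { value = v; }
};

// Modifies a local copy; the element in the map is left untouched.
void edit_copy(std::unordered_map<int, AnotherClass>& dictionary, int z, int y) {
    AnotherClass x = dictionary.at(z);
    x.changeSomething(y);
}

// Modifies the element stored in the map through an alias.
void edit_alias(std::unordered_map<int, AnotherClass>& dictionary, int z, int y) {
    AnotherClass& x = dictionary.at(z);
    x.changeSomething(y);
}

// Returns the value stored under key 7 after one of the two edits.
int value_after(bool use_alias) {
    std::unordered_map<int, AnotherClass> dictionary;
    dictionary[7] = AnotherClass{};
    if (use_alias) {
        edit_alias(dictionary, 7, 42);
    } else {
        edit_copy(dictionary, 7, 42);
    }
    assert(dictionary.size() == 1);
    return dictionary.at(7).value;
}
```

With the copy, the stored value stays at its initial 0; with the reference, it becomes 42.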

What happens if an object resizes its own container?

This is not a question about why you would write code like this, but more as a question about how a method is executed in relation to the object it is tied to.
If I have a struct like:
struct F
{
// some member variables
void doSomething(std::vector<F>& vec)
{
// do some stuff
vec.push_back(F());
// do some more stuff
}
};
And I use it like this:
std::vector<F> vec(10);
vec[0].doSomething(vec);
What happens if the push_back(...) in doSomething(...) causes the vector to expand? This means that vec[0] would be copied then deleted in the middle of executing its method. This would be no good.
Could someone explain what exactly happens here?
Does the program instantly crash? Does the method just try to operate on data that doesn't exist?
Does the method operate "orphaned" of its object until it runs into a problem like changing the object's state?
I'm interested in how a method call is related to the associated object.
Yes, it's bad. It's possible for your object to be copied (or moved in C++11 if the distinction is relevant to your code) while you are inside doSomething(). So after the push_back() returns, the this pointer may no longer point to the location of your object. For the specific case of vector::push_back(), it's possible that the memory pointed to by this has been freed and the data copied to a new array somewhere else. For other containers (list, for example) that leave their elements in place, this is (probably) not going to cause problems at all.
In practice, it's unlikely that your code is going to crash immediately. The most likely circumstance is a write to free memory and a silent corruption of the state of your F object. You can use tools like valgrind to detect this kind of behavior.
But basically you have the right idea: don't do this, it's not safe.
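One possible way to sidestep the problem, sketched under the assumption that the element can be identified by its index: make the function static and look the element up again after every operation that may reallocate. The steps member and the steps_after_growth helper are hypothetical, added only so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct F {
    int steps = 0;

    // Takes the element's index instead of relying on `this`, so every
    // access looks the element up again in the (possibly moved) storage.
    static void doSomething(std::vector<F>& vec, std::size_t self) {
        vec[self].steps++;   // "do some stuff"
        vec.push_back(F());  // may reallocate and move every element
        vec[self].steps++;   // "do some more stuff", via a fresh lookup
    }
};

int steps_after_growth() {
    std::vector<F> vec(10);
    F::doSomething(vec, 0);
    assert(vec.size() == 11);
    return vec[0].steps;
}
```

An index stays meaningful across a reallocation, whereas a pointer, reference or iterator into the old storage does not.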
Could someone explain what exactly happens here?
Yes. If you access the object after a push_back, resize or insert has reallocated the vector's contents, it's undefined behavior, meaning that what actually happens is up to your compiler, your OS, what "do some more stuff" actually is, and maybe a number of other factors like the phase of the moon, air humidity in some distant location... you name it ;-)
In short, this is (indirectly, via the std::vector implementation) calling the destructor of the object itself, so the lifetime of the object has ended. Further, the memory previously occupied by the object has been released by the vector's allocator. Therefore the use of the object's nonstatic members results in undefined behavior, because the this pointer passed to the function does not point to an object any more. You can however access/call static members of the class:
struct F
{
static int i;
static int foo();
double d;
void bar();
// some member variables
void doSomething(std::vector<F>& vec)
{
vec.push_back(F());
int n = foo(); //OK
i += n; //OK
std::cout << d << '\n'; //UB - will most likely crash with access violation
bar(); //UB - what actually happens depends on the
// implementation of bar
}
};

use a pointer in operation with non pointer

I'm quite new to C++ and I don't understand pointers very well yet.
This is OK when I have two non-pointer objects:
Vec2D A(0, 0), B(10, 10);
Vec2D C = A-B;
But what if one is a pointer?
Vec2D Vec2D::minus(Vec2D B) {
Vec2D that = Vec2D(this->x(), this->y());
return that-B;
}
So the question is: how can I use the this pointer with the - operation and B?
Also, I don't understand how many objects are constructed in my methods, or how I can reduce memory consumption by passing references.
If I understood your question correctly: "this is a pointer, how can I operate on it and other pointers using methods that require a non-pointer?"
You use the dereference operator *
Example:
Vec2D that = *this;
To answer your second question:
an object is created to pass it as a parameter of minus
an object is created by Vec2D(this->x(), this->y()) (but will be probably erased away as a temp by a good optimizing compiler)
an object is created by you on the stack (that)
depending on how you implemented them, and on how good your compiler is, you may create another object in your copy constructor/operator=
an object (or more) may be created by your operator- in that-B
an object is created to be returned (only one, not two, as return value optimization is done by all compilers AFAIK)
How can you optimize it? Use references...
Vec2D Vec2D::minus(const Vec2D& B) {
return *this-B;
}
And implement operator- on Vec2D to use references too...
In general, pass parameters as (const) references.
Obviously, you cannot do the same for the return value (try, the compiler will complain..); there are techniques for these as well (especially in CG/games, with vectors, I have seen object pools used a lot; for those returning a reference/pointer is actually possible, but it is rather advanced stuff)
The "this" pointer is an auto-generated pointer to the object that contains the method being called.
If you call A.minus(B), the "this" pointer points to A.
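Putting the pieces together, a minimal sketch of what such a class might look like. The question does not show Vec2D, so the members x_ and y_, the double coordinate type and the demo function are assumptions.

```cpp
#include <cassert>

class Vec2D {
public:
    Vec2D(double x, double y) : x_(x), y_(y) {}

    double x() const { return x_; }
    double y() const { return y_; }

    // `this` is a pointer, so it is dereferenced before applying operator-.
    // The argument is taken by const reference: no copy is made for it.
    Vec2D minus(const Vec2D& other) const { return *this - other; }

    friend Vec2D operator-(const Vec2D& a, const Vec2D& b) {
        return Vec2D(a.x_ - b.x_, a.y_ - b.y_);
    }

private:
    double x_, y_;
};

// Usage: A.minus(B) and A - B give the same result.
bool minus_matches_operator() {
    Vec2D A(0, 0), B(10, 10);
    Vec2D C = A.minus(B);
    Vec2D D = A - B;
    assert(C.x() == -10 && C.y() == -10);
    return C.x() == D.x() && C.y() == D.y();
}
```

The result is still returned by value; only the parameters are passed by reference.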

Binary trees in C++ using references

I wish to implement a binary tree using references instead of using pointers (which is generally what you tend to find in every book and every website on the internet). I tried the following code:
class tree_node {
private:
tree_node& left;
tree_node& right;
data_type data;
public:
void set_left(tree_node&);
// ... other functions here
};
void tree_node::set_left(tree_node& new_left) {
this->left = new_left;
}
I get the following error:
error C2582: 'operator =' function is unavailable in 'tree_node'.
I know I can easily implement it using pointers but I would like to keep my solution elegant and free of pointers. Can you tell me where I am going wrong?
You can't change the object that a reference refers to (1); once you initialize a reference, it always refers to the object with which it was initialized.
You should use pointers. There is nothing wrong with using pointers for this (it's clean using pointers as well because parent nodes own their children, so cleanup and destruction is easy!)
(1) Well, you could explicitly call the object's destructor and then use placement new in the assignment operator implementation, but that's just a mess!
You cannot assign to references. What you're trying to do can't be done... without a huge amount of bending.. (you'd essentially destroy a node and create a new one each time you want to modify it.)
There's a good reason why all those other people use pointers.
References aren't just pointers with shorter syntax. They're another name for the actual object they refer to, even when used as the lhs of an assignment.
int i = 3;
int j = 4;
int &ref = i;
ref = j;
std::cout << i << "\n"; // prints 4: i itself has been modified,
// because semantically ref *is* i
That is, ref = j has the same effect as i = j, or the same effect as *ptr = j, if you had first done int *ptr = &i;. It means, "copy the contents of the object j, into whatever object ref refers to".
For the full lifetime of ref, it will always refer to i. It cannot be made to refer to any other int, that is to say it cannot be "re-seated".
The same is true of reference data members, it's just that their lifetime is different from automatic variables.
So, when you write this->left = new_left, what that means is, "copy the contents of the object new_left into whatever object this->left refers to". Which (a) isn't what you mean, since you were hoping to re-seat this->left, and (b) even if it was what you meant, it's impossible, since the object this->left refers to has reference members which themselves cannot be reseated.
It's (b) that causes the compiler error you see, although (a) is why you should use pointers for this.
References in C++ don't work the same as references in other languages. Once a reference is set, at construction time, it can't be changed to anything else.
My recommendation is to use boost's shared_ptr class instead of a reference. This will free you of the concern for managing the pointer's deallocation. You may also be interested in Boost's graph library.
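For illustration, here is a sketch of a node that uses owning smart pointers. Note that it uses std::unique_ptr from the standard library (with std::make_unique, so C++14 or later) instead of Boost's shared_ptr, and it assumes data_type is int; sum and demo_sum are hypothetical helpers.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct tree_node {
    int data;                          // data_type assumed to be int here
    std::unique_ptr<tree_node> left;   // null means "no child"
    std::unique_ptr<tree_node> right;

    explicit tree_node(int d) : data(d) {}

    // Re-seating is an ordinary assignment; the old subtree is freed.
    void set_left(std::unique_ptr<tree_node> new_left) { left = std::move(new_left); }
    void set_right(std::unique_ptr<tree_node> new_right) { right = std::move(new_right); }
};

// Sums the data of every node reachable from `node`.
int sum(const tree_node* node) {
    if (!node) return 0;
    return node->data + sum(node->left.get()) + sum(node->right.get());
}

int demo_sum() {
    tree_node root(1);
    root.set_left(std::make_unique<tree_node>(2));
    root.set_right(std::make_unique<tree_node>(3));
    root.set_left(std::make_unique<tree_node>(10));  // replaces (and frees) node 2
    assert(root.left->data == 10);
    return sum(&root);  // 1 + 10 + 3
}
```

Unlike a reference member, the pointer can be empty and can be re-seated, and the parent still cleans up its children automatically.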

What is the difference between references and normal variable handles in C++?

In C++, if I write:
int i = 0;
int& r = i;
then are i and r exactly equivalent?
That means that r is another name for i. They will both refer to the same variable. This means that if you write (after your code):
r = 5;
then i will be 5.
References are slightly different, but for most intents and purposes a reference is used identically once it has been declared.
A reference does behave slightly differently, though; let me try to explain.
In your example 'i' represents a piece of memory. 'i' owns that piece of memory -- the compiler reserves it when 'i' is declared, and it is no longer valid (and in the case of a class it is destroyed) when 'i' goes out of scope.
However 'r' does not own its own piece of memory, it represents the same piece of memory as 'i'. No memory is reserved for it when it is declared, and when it goes out of scope it does not cause the memory to be invalid, nor will it call the destructor if 'r' was a class. If 'i' somehow goes out of scope and is destroyed while 'r' is not, 'r' will no longer represent a valid piece of memory.
For example:
class foo
{
public:
int& r;
foo(int& i) : r(i) {}
};
void bar()
{
foo* pFoo;
if(true)
{
int i=0;
pFoo = new foo(i);
}
pFoo->r=1; // r no longer refers to valid memory
}
This may seem contrived, but with an object factory pattern you could easily end up with something similar if you were careless.
I prefer to think of references as being most similar to pointers during creation and destruction, and most similar to a normal variable type during usage.
There are other minor gotchas with references, but IMO this is the big one.
A reference is an alias of an object, i.e. an alternate name for it. Read this article for more information - http://www.parashift.com/c++-faq-lite/references.html
A reference is an alias for an existing object.
Yep - a reference should be thought of as an alias for a variable, which is why you can't reassign them like you can reassign pointers (and also means that, even in the case of a non-optimizing compiler, you won't take up any additional storage space).
When used outside of function arguments, references are mostly useful to serve as shorthands for very->deeply->nested.structures->and.fields :)
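A small sketch of that shorthand use. The Config, Network and Proxy structures and the set_port function are hypothetical, invented only to have something nested to refer to.

```cpp
#include <cassert>

// Hypothetical nested structures, only for illustration.
struct Proxy { int port = 0; };
struct Network { Proxy* proxy = nullptr; };
struct Config { Network network; };

int set_port(Config& config, int port) {
    assert(config.network.proxy != nullptr);
    // One short name instead of repeating config.network.proxy->... everywhere.
    Proxy& proxy = *config.network.proxy;
    proxy.port = port;
    return config.network.proxy->port;  // same object, so the change is visible
}
```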
C++ references differ from pointers in several essential ways:
It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.
Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
From Here.
The syntax int &r = i; creates another name, r, for the variable i; hence we say that r is a reference to i. If you read the value of r, you get 0, the value of i. Remember that a reference is a direct connection: it is just another name for the same memory location.
You are writing definitions here, with initializations. That means that you're referring to code like this:
void foo() {
int i = 0;
int& r = i;
}
but not
class bar {
int m_i;
int& m_r;
bar() : m_i(0), m_r(m_i) { }
};
The distinction matters. For instance, you can talk of the effects that m_i and m_r have on sizeof(bar) but there's no equivalent sizeof(foo).
Now, when it comes to using i and r, you can distinguish a few different situations:
Reading, e.g. int anotherInt = r;
Writing, e.g. r = 5
Passing to a function taking an int, e.g. void baz(int); baz(r);
Passing to a function taking an int&, e.g. void baz(int&); baz(r);
Template argument deduction, e.g. template<typename T> void baz(T); baz(r);
As the argument of sizeof, e.g. sizeof(r)
In these cases, they're identical. But there is one very important distinction:
std::string s = std::string("hello");
std::string const& cr = std::string("world");
The reference extends the lifetime of the temporary it's bound to, but the first line makes a copy.
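The lifetime extension can be observed with a small counting class. Noisy, make and the two functions below are hypothetical names used only for this sketch.

```cpp
#include <cassert>

// Hypothetical class that tracks how many instances are currently alive.
struct Noisy {
    static int alive;
    Noisy() { ++alive; }
    Noisy(const Noisy&) { ++alive; }
    ~Noisy() { --alive; }
};
int Noisy::alive = 0;

Noisy make() { return Noisy(); }

// The temporary is bound to a const reference, so it lives as long as ref.
int alive_while_bound() {
    const Noisy& ref = make();
    (void)ref;            // silence "unused variable" warnings
    return Noisy::alive;  // the temporary still exists here
}

// The temporary is not bound to anything, so it dies with the statement.
int alive_after_statement() {
    make();
    return Noisy::alive;  // already destroyed
}
```

The extension only applies to a reference bound directly to the temporary; it does not carry over to a reference returned from a function or stored in a class member.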