Say Pokemon is a class. Consider this snippet:
Pokemon Eve(4,3); //call to constructor, creating first object on the stack
Eve=Pokemon(3,5); //call to constructor again, creating a second object on the stack
Both objects are created on the stack. After the second line executes, the first object (with parameters 4,3) can no longer be accessed. What happens to it? What is the jargon used to describe this process? Garbage collection?
I think you misunderstand what is happening. The object originally created as Eve isn't 'lost' after the second line. In fact, it is still the same object afterwards (by which I mean it still has the same memory address). The second line assigns a different value to the object by calling the assignment operator on the object Eve. The parameter passed to the assignment operator is the temporary object created by Pokemon(3,5).
It is the said temporary object - Pokemon(3,5) which is destroyed after the second line. Its destruction is not 'garbage collection' or even a heap deallocation, as it is a stack object in the first place and thus doesn't require such.
I think you're thinking about this in terms of Java/C# or similar managed languages, that only have object references and not direct objects. In C++ with the code above you're dealing with objects directly. To see something equivalent to what you have in mind, consider this:
Pokemon* Eve = new Pokemon(4,3);
Eve=new Pokemon(3,5);
Here we're using pointers and the heap, rather than the stack. In this case, the second line really does 'lose' the original object, because it is the pointer that is reassigned. This is sort of similar to what happens in Java/C# with just normal assignment. The difference is, of course, that C++ doesn't have any garbage collector so in the above the original object is truly lost in the sense of being a memory leak.
It's called "the assignment operator", or "=".
The second object is the temporary object, not the first one. After it is constructed, the first object's assignment operator is invoked, to assign the value of the second object to the first object.
After that the second object's destructor is invoked, and the second object gets destroyed. Where it goes? To the great bit bucket in the sky. It's gone.
The objects are not created on the heap, but directly on the stack, so no space is lost (i.e., there's no allocated space that becomes unreachable) -- the assignment simply reuses the memory allocated by the initialization of Eve.
Edited: More careful wording around the "overwriting" contention.
Related
I know this has probably been asked and I've looked through other answers, but still I cannot get this completely.
I want to understand the difference between the two following codes:
MyClass getClass(){
return MyClass();
}
and
MyClass* returnClass(){
return new MyClass();
}
Now let's say I call such functions in a main:
MyClass what = getClass();
MyClass* who = returnClass();
If I got this straight, in the first case the object created in the
function scope will have automatic storage, i.e. when you exit the
scope of the function its memory block will be freed. Also, before
freeing such memory, the returned object will be copied into the
"what" variable I created. So there will exist only one copy of
the object. Am I correct?
1a. If I'm correct, why is RVO (Return Value Optimization) needed?
In the second case, the object will be allocated through a dynamic storage, i.e. it will exist even out of the function scope. So I need to use a deleteon it. The function returns a pointer to such object, so there's no copy made this time, and performing delete who will free the previously allocated memory. Am I (hopefully) correct?
Also I understand I can do something like this:
MyClass& getClass(){
return MyClass();
}
and then in main:
MyClass who = getClass();
In this way I'm just telling that "who" is the same object as the one created in the function. Though, now we're out of the function scope and thus that object doesn't necessarily exists anymore. So I think this should be avoided in order to avoid trouble, right? (and the same goes for
MyClass* who = &getClass();
which would create a pointer to the local variable).
Bonus question: I assume that anything said till now is also true when returning vector<T>(say, for example, vector<double>), though I miss some pieces.
I know that a vector is allocated in the stack while the things it contains are in the heap, but using vector<T>::clear() is enough to clear such memory.
Now I want to follow the first procedure (i.e. return a vector by value): when the vector will be copied, also the onjects it contains will be copied; but exiting the function scope destroys the first object. Now I have the original objects that are contained nowhere, since their vector has been destroyed and I have no way of deleting such objects that are still in the heap. Or maybe a clear() is performed automatically?
I know that I may beatray some misunderstandings in these subjects (expecially in the vector part), so I hope you can help me clarify them.
Q1. What happens conceptually is the following: you create an object of type MyClass on the stack in the stack frame of getClass.
You then copy that object into the return value of the function, which is a bit of stack that was allocated before the function call to hold this object.
Then the function returns, the temporary gets cleaned up. You copy the return value into the local variable what. So you have one allocation and two copies.
Most (all?) compilers are smart enough to omit the first copy: the temporary is not used except as return value. However, the copy from the return value into the local variable on the caller side cannot be omitted, because the return value lives on a part of the stack that is freed as soon as the function finishes.
Q1a. Return Value Optimization (RVO) is a special feature, that does allow that final copy to be elided. That is, instead of returning the function result on the stack, it will be allocated straight away in the memory allocated for what, avoiding all copying altogether. Note that, contrary to all other compiler optimizations, RVO can change the behaviour of your program! You could give MyClass a non-default copy constructor, that has side effects, like printing a message to the console or liking a post on Facebook. Normally, the compiler is not allowed to remove such function calls unless it can prove that these side effects are absent. However, the C++ specs contain a special exception for RVO, that says that even if the copy constructor does something non-trivial, it is still allowed to omit the return value copy and reduce the whole thing to a single constructor call.
2. In the second case, the MyClass instance is not allocated on the stack, but on the heap. The result of the new operator is an integer: the address of the object on the heap. This is the only point where you will ever be able to obtain this address (provided you didn't use placement new), so you need to hold onto it: if you lose it, you cannot call delete and you will have created a memory leak.
You assign the result of new to a variable whose type is denoted by MyClass* so that the compiler can do type checking and stuff, but in memory it is just an integer large enough to hold an address on your system (32- or 64-bits). You can check this for yourself by trying to coerce the result to a size_t (which is typedef'd to typically an unsigned int or something larger depending on your architecture) and seeing the conversion succeed.
This integer is returned to the caller by value, i.e. on the stack, just as in example (1). So again,
in principle, there is copying going on, but in this case only copying of a single integer which your CPU is very good at (most of the times it will not even go on the stack but get passed in a register) and not the whole MyClass object (which in general has to go on the stack because it's very large, read: larger than an integer).
3. Yes, you should not do that. Your analysis is correct: as the function finishes, the local object is cleaned up and its address becomes meaningless. The problem is, that it sometimes seems to work. Forgetting about optimizations for the time being, the main reason the way memory works: clearing (zero-ing) memory is quite expensive, so that is hardly ever done. Instead, it is just marked as available again, but it's not overwritten until you make another allocation that needs it. Therefore, even though the object is technically dead, its data may still be in the memory so when you dereference the pointer you may still get the right data back. However, since the memory is technically free, it may be overwritten at any time between right now and at the end of the universe. You have created what C++ calls Undefined Behaviour (UB): it may seem to work right now on your computer, but there's no telling what may happen somewhere else or at another point in time.
Bonus: When you return a vector by value, as you remarked, it is not just destroyed: it is first copied to the return value or - taking RVO into account - into the target variable. There are two options now: (1) The copy creates its own objects on the heap, and modifies its internal pointers accordingly. You now have two proper (deep) copies co-existing temporarily -- then when the temporary object goes out of scope, you are just left with the one valid vector. Or (2): When copying the vector, the new copy takes ownership of all the pointers that the old one holds. This is possible, if you know that the old vector is about to be destroyed: rather than re-allocating all the contents again on the heap, you can just move them to the new vector and leave the old one in a sort of half-dead state -- as soon as the function is done cleaning that stack the old vector is no longer there anyway.
Which of these two options is used, is really irrelevant or rather, an implementation detail: they have the same result and whether the compiler is smart enough to choose (2) should not usually be your concern (though in practice option (2) will always happen: deep copying an object just to destroy the original is just pointless and easily avoided).
As long as you realize that the thing that gets copied is the part on the stack and the ownership of the pointers on the heap gets transferred: no copying happens on the heap and nothing gets cleared.
Here are my answers to your different questions:
1- You are absolutely correct. If I understand the sequentiallity correctly, your code will allocate memory, create your object, copy the variable into the what variable, and get destroyed as out of scope. The same thing happens when you do:
int SomeFunction()
{
return 10;
}
This will create a temporary that holds 10 (so allocate), copy it to the return vairbale, and then destroy the temporary (so deallocate) (Here I'm not sure of the specifics, maybe the compiler can remove some stuff via automatic inlining, constante values, ... but you get the idea). Which brings me to
1a- You need RVO when to limit this allocation, copy, and deallocation part. If your class allocates a lot of data upon construction it is a bad idea to return it directly. You can use move constructor in that case, and reuse the storage space allocated by the temporary for example. Or return a pointer. Which takes all the way down to
2- Returning a pointer works exactly as returning an int from a function. But because pointers are only 4 or 8 bytes long, allocation and deallocation cost a lot less than doing so for a class that's 10 Mb long. And instead of copying the object you copy its adress on the heap (usually less heavy, but copy nonetheless). Do not forget it is not because a pointer represents a memory that its size is 0 byte. So using a pointer requires getting the value from some memory address. Returning a reference and inlining are also good ideas to optimise your code, as you avoid chasing pointer, function calls, etc.
3- I think you are correct there. I'd have to make sure by testing, but if follow my logic you are right.
I hope I answered your questions. And I hope my answers are as correct as can be. But maybe someone more clever than me can correct me :-)
Best.
So there are two things that I'm not sure.
If I do something like this:
void sendToDifferentThread(SomeClass &&obj);
...
{
SomeClass object;
sendToDifferentThread(std::move(object));
}
What will happen? How can there ever only be one copy of object if it's created on the stack, since when we go out of the enclosing scope everything on stack is destroyed?
If I do something like this:
SomeClass object;
doSomethingOnSameThread(std::move(object));
What will happen if I do something in the current scope to object later? It was "moved away" to some other function, so did the current function "lose" ownership of it in some way?
In C++ when an object is constructed, memory is allocated at the same time. If the constructor runs to completion (without throwing) then the object is "alive". At some point if it is a stack object and goes out of scope, or its a heap object and you call delete, its destructor is called and that original memory is freed, and then the object is "dead". C++11 std::move / move constructors don't change any of this. What a move constructor does is give you a way and a simple syntax to "destructively" copy objects.
For instance if you move construct from a std::vector<int>, instead of reading all the ints and copying them, it will copy the pointer and the size count instead to the new vector, and set the old pointer to a nullptr and size to 0 (or possibly, allocate a (tiny) new vector of minimal size... depends on implementation). Basically when you move from something you have to leave it in a "valid", "alive" state -- it's not "dead" after you move from it, and the destructor is still going to be called later. It didn't "move" in the sense that it's still following the same lifetime and now it's just "somewhere else in memory". When you "move" from an object, there are definitely two different objects involved from C++'s point of view and I don't think you can make sense of it after a certain point if you try to think of it as though there's only one object that exists in that scenario.
Let's say that I have a main class SomeManager for keeping track of instances of another class SomeClass. When SomeClass is constructed it calls a method of SomeManager passing a pointer to it self. Then SomeManager takes that pointer and pushes it into a vector. The destructor of SomeClass calls another function of SomeManager which removes it's pointer from the vector.
So, my question is. When an instance of SomeClass is moved through the move operator or constructor. Does it's address change and I have to remove the old address and add the new one?
I have some ideas about this from what I've read but I'm not sure and I don't want to mess things up.
Short answer: no, the address of the moved from object doesn't change. But the old object may not be a useful state.
When you perform a move construction you are creating a new object and moving the contents of another object into the new object. The new object will always be constructed in a different memory location from the old object. Something similar happens with move assignment: you simply move the contents of one object to another, but you still have to have two different objects at two different memory locations to perform the assignment (OK, there's self assignment, but we'll ignore that). The old object is still there (OK, it could be a temporary that gets destroyed at the end of the statement), but much of the time you have no guarantee about the old object except that it's in some valid state.
An analogy may be a house filled with furniture. Move construction is like building a new house and moving the furniture to it. Move assignment is like buying a second preexisting home and moving the furniture to it. In both cases the new house has a different address from the old one, but the old one still exists. It just might not be in a useful state (hard to live in a house with no furniture!).
Yes. The move constructor and assignment operator help a new instance of the SomeClass take ownership of the state of the old instance, but the new instance has a distinct address.
One consideration for you is whether you want the removed-from SomeClass instance to "un-register" itself from SomeManager as it's moved from vs. waiting for destruction - whether it matters depends on your exact code.
Move construction essentially means "Here is an object X. Create a new object Y by stealing the innards of X." Similar for move assignment. You don't move the object; you move its contents.
The address of the moved-from object, X, remains unchanged. Its contents may change (e.g., by becoming empty), or may not (e.g., if it's a primitive type for which a move is a copy).
Although in the scheme you are describing you'd better make sure that the move constructor also registers the object it creates...
This question already has answers here:
What is The Rule of Three?
(8 answers)
Closed 9 years ago.
So I'm reading about copy control and the Rule of Three, and it seems that the main example they're giving as to why they're necessary is when using pointers as class members. It says that if you copy the pointer from one object to another, then multiple objects of that class are pointing to the same memory. Why is this bad? Don't we want the object we're pointing things to do point to the same thing? What exactly is supposed to be happening when pointers are copied between classes? I'm not sure exactly what we're supposed to do when using a copy constructor to deal with pointers..
You want to write a copy constructor if you have any dynamic content, in order to avoid shallow (member-wise) copy.
Why would it be a problem if different pointers are pointing to the same object? Because that way, if you change the copy, you change the original.
If you, for example, have two pointers to the same object, and then delete the object by using the first pointer - the object will be gone; the second pointer will not be null, but a dangling pointer, with its previous, but now - invalid value.
Consider: std::vector contains a pointer to an array of elements. If you make a vector, then make a copy of that vector, then change the copy, you don't want the changes to the copy to also change the original. And if you insert elements into the copy, to the point where the array has to be deallocated and resized, the original vector now points to deallocated memory.
The point of the Rule of Three is to prevent copies from having any residual link to their originals. Copies are meant to be independent. There are exceptions to this (google "copy on write"), but they too require special handling and tracking -- again, in the copy constructor, assignment operator, and destructor.
The copy constructor is a tool that allows you to create another object of the same class by copying an existing one not by construction of new object.
If you would have copied pointers that way so they're pointing on the same place in the memory you would create another pointers to in fact the same object instead of creating a new one. It's like creating a shortcut to file in windows :-)
Copy constructor should copy every component of object into another place of memory and provide new pointers to this place, and that way create effective copy of an object.
If you have two object sharing a pointer then if you make a change to one object it will also change the other object. Normally (but not always) this is considered a bad thing.
Additionally if two objects share a pointer then you have the issue of who is responsible for deleting the pointer.
There is actually a standard class called std::shared_ptr which implements shared pointers and automatically manages the memory for you. If you want a class that shares pointers then you should use this class. This way you will not need to worry about the rule of three.
It's bad, because when you delete the object pointer by the pointed in one object, the pointer in the other object will point at invalid memory. That's why straight up assigning the pointer is risky. That's the most dangerous part. But that's not all. If you have two pointers that point to the same place in the memory, changing the memory via one pointer will also change the contents of the second pointer. Sometimes it can make changes in an object you won't expect to change and leave you with searching a bug for days.
And to answer to your second qestion:
What exactly is supposed to be happening when pointers are copied between classes?
Well... it depends on what you want. If you just want to point to something in the memory, and never change it, or delete it, then simply assigning one pointer to another is a valid idea. However, if you'll want to change the object under the pointer, it's much safer to do a copy of it first (via the new operator and the assigment operator), and then point to the new object with the second pointer.
I am writing a dish washer program, Dishwasher has a pump, motor, and an ID. Pump, motor, date, time are other small classes which Dishwasher will use. I checked with the debugger but when I create the Dishwasher class, my desired values aren't initialized. I think I am doing something wrong but what? :(
So the Dishwasher class is below :
class Dishwasher {
Pump pump; // the pump inside the dishwasher
Motor motor;// the motor inside the dishwasher
char* washer_id;//011220001032 means first of December 2000 at 10:32h
Time time_built;// Time variable, when the Dishwasher was built
Date date_built;// Date variable, when the Dishwasher was built
Time washing_time; // a time object, like 1:15 h
public:
Dishwasher(Pump, Motor, char*, float);
};
This is how I initialize the Class :
Dishwasher::Dishwasher(Pump p, Motor m, char *str, float f)
: pump(p), motor(m), max_load(f)
{
washer_id = new char [(strlen(str)+1)];
strcpy (washer_id,str);
time_built.id2time(washer_id);
date_built.id2date(washer_id);
}
This is how I create the class:
Dishwasher siemens(
Pump(160, "011219991143"),
Motor(1300, "081220031201"),
"010720081032",
17.5);
Here is the full code if you would like to see further, because I removed some unused things for better readability : http://codepad.org/K4Bocuht
First of all, I can't get over the fact you're using a char* for shuffling strings around. For crying out loud, std::string. It's a standardized way of dealing with a string of characters. It can wiggle its way out of anything you throw at it, efficiently. Be it a string literal, a char array, a char* or yet another string.
Second of all, your professor needs professional help. It's one thing to teach the "lower level intuition" with something more raw, but to enforce this on students who ought to be studying C++ is plain bad practice.
Now, onto the problem at hand. I'll just take the toplevel example, namely the raw, "naked" pointer lying in your Dishwasher constructor, the char* str. As you know, that's a pointer, namely a pointer to a type char. A pointer stores the memory address of a variable (the first byte of any type of variable in question, which is the lowest addressable unit in memory).
That distinct difference is very important. Why? Because when you assign a pointer to something else, you're not copying the actual object, but only the address of its first byte. So, in effect, you just get two pointers pointing to the same object.
As you are, no doubt, a good memory citizen, you probably defined a destructor to take care of this:
washer_id = new char [(strlen(str)+1)];
You're basically allocating strlen(str)+1 bytes on the heap, something that is unmanaged by the system and remains in your capable hands. Hence the name, a heap. A bunch of stuff, lose the reference to it, you'll never find it again, you can just throw it all away (what actually happens to all the leaks when the program returns out of main). Therefore, it is your duty to tell the system when you're done with it. And you did, defined a destructor, that is.
A big, nasty but...
But... There is a problem with this scheme. You have a constructor. And a destructor. Which manage resource allocation and deallocation. But what about copying?
Dishwasher siemens(
Pump(160, "011219991143"),
Motor(1300, "081220031201"),
"010720081032",
17.5);
You are probably aware that the compiler will try to implicitly create a basic copy constructor, a copy assignment operator (for objects that have already been constructed) and a destructor. Since an implicitly-generated destructor has nothing to release manually (we discussed dynamic memory and our responsibility), it is empty.
Because we do work with dynamic memory and allocate varying size blocks of bytes to store our text, we have a destructor in place (as your longer code shows). That is all well, but we still deal with an implicitly generated copy constructor and a copy assignment operator which copy the direct values of variables. As a pointer is a variable whose value is a memory address, what an implicitly-generated copy constructor or copy assignment operator does is just copying that memory address (this is called a shallow copy) which just creates another reference to a unique, singular byte in memory (and the rest of the contiguous block).
What we want is the opposite, a deep copy, to allocate new memory and copy over the actual object (or any compound multi-byte type, array data structure etc.) stored at the incoming pointer's memory address. This way, they will point to distinct objects whose lifespan is tied to the lifespan of the enveloping object, not the lifespan of the object from which is being copied from.
Observe in the example above that you're creating temporary objects on the stack which are alive for the time the constuctor is in action, after which they are released and their destructors are invoked.
Dishwasher::Dishwasher(Pump p, Motor m, char *str, float f)
: pump(p), motor(m), max_load(f)
{
washer_id = new char [(strlen(str)+1)];
strcpy (washer_id,str);
time_built.id2time(washer_id);
date_built.id2date(washer_id);
}
The initialization list has an added benefit of not initializing objects to their default values and then performing the copy, since you have the honor of directly calling the copy constructor (which in this case is implicitly generated by the compiler):
pump(p) is basically invoking Pump::Pump(const Pump &) and passing in the values the temporary object has been initialized with. Your Pump class contains a char*, which holds the address of the first byte to the string literal you shoved into the temporary object: Pump(160, "011219991143").
The copy constructor takes the temporary object and copies all the data explicitly-available to it, that means it just takes the address contained in the char* pointer, not the entire string. Therefore, you end up pointing from two places to the same object.
Since the temporary objects live on the stack, once the constructor finishes dealing with them, they will be released release and their destructors invoked. These destructors actually destroy the string along with them, which you placed while creating the Dishwasher object. Now your Pump object within the Dishwasher object holds a dangling pointer, a pointer to a memory address of an object lost in the endless memory abyss.
Solution?
Write your own copy constructor and copy assignment operator overload. On the example of Pump:
Pump(const Pump &pumpSrc) // copy constructor
Pump& operator=(const Pump &pumpSrc) // copy assignment operator overload
Do this foreach class.
Of course, in addition to the destructor, which you already have. These three are the protagonists over a rule of thumb called "The Rule of Three". It states that if you have to declare any of them explicitly, you probably need to explicitly declare the rest of them, too.
The probably part of the rule is just a void of responsibility, basically. The need for explicit definitions is actually quite obvious when you get further down the road with C++. For example, thinking about what your class does is a good way of determining whether you need everything defined explicitly.
Example: Your classes depend upon a naked pointer which points to a piece of memory and the memory address is all it pulls along. It's a simple typed variable which holds only a memory address to the first byte of the object in question.
In order to prevent a leak upon destroying your object, you defined a destructor which frees the allocated memory. But, do you copy the data between objects? Very likely, as you've seen. Sometimes you're going to create a temporary object which will allocate data and store the memory address into a pointer data member whose value you will copy. After a while, that temporary is bound to be destroyed at some time, and you will lose the data with it. The only thing you'll have left is the dangling pointer with a useless and dangerous memory address.
Official disclaimer: Some simplifications are in place to prevent me from writing a book on the subject. Also, I always try to concentrate on the question at hand, not the code of the OP, which means I do not comment on the practices employed. The code might be horrible, beautiful or anything in-between. But I do not try to turn the code or the OP around, I just try to answer the question and sometimes suggest something I think could be beneficial to the OP in the long run. Yes, we could be precise as hell and qualify everything... We could also use Peano axioms to define the number sets and the basic arithmetic operations before we boldly try to state that 2 + 3 = 3 + 2 = 5, for the same amount of fun.
You have violated the Rule of Three. You can solve your problem either of these two ways:
Never use naked pointers. In your case, you can do that by replacing char* in every instance with std::string.
Implement a copy constructor, and a copy assignment operator, each of which must perform a deep copy, and a destructor.