What is the difference between a deep copy and a shallow copy?
Breadth vs Depth; think in terms of a tree of references with your object as the root node.
Shallow:
The variables A and B refer to different areas of memory. When B is assigned to A, the two variables refer to the same area of memory. Later modifications to the contents of either are instantly reflected in the contents of the other, as they share contents.
Deep:
The variables A and B refer to different areas of memory. When B is assigned to A, the values in the memory area which A points to are copied into the memory area to which B points. Later modifications to the contents of either remain unique to A or B; the contents are not shared.
Shallow copies duplicate as little as possible. A shallow copy of a collection is a copy of the collection structure, not the elements. With a shallow copy, two collections now share the individual elements.
Deep copies duplicate everything. A deep copy of a collection is two collections with all of the elements in the original collection duplicated.
For example, Object.MemberwiseClone creates a shallow copy, and by implementing the ICloneable interface you can provide a deep copy.
In short, it depends on what points to what. In a shallow copy, object B points to object A's location in memory. In deep copy, all things in object A's memory location get copied to object B's memory location.
This wiki article has a great diagram.
http://en.wikipedia.org/wiki/Object_copy
Especially For iOS Developers:
If B is a shallow copy of A, then for primitive data it's like B = [A assign]; and for objects it's like B = [A retain];
B and A point to the same memory location
If B is a deep copy of A, then it is like B = [A copy];
B and A point to different memory locations
B's memory address is different from A's
B has the same contents as A
Shallow copy: Copies the member values from one object into another.
Deep Copy: Copies the member values from one object into another.
Any pointer objects are duplicated and Deep Copied.
Example:
class String
{
int size;
char* data;
};
String s1("Ace"); // s1.size = 3 s1.data=0x0000F000
String s2 = shallowCopy(s1);
// s2.size =3 s2.data = 0X0000F000
String s3 = deepCopy(s1);
// s3.size =3 s3.data = 0x0000F00F
// (With Ace copied to this location.)
Just for the sake of easy understanding you could follow this article:
https://www.cs.utexas.edu/~scottm/cs307/handouts/deepCopying.htm
I haven't seen a short, easy to understand answer here--so I'll give it a try.
With a shallow copy, any object pointed to by the source is also pointed to by the destination (so that no referenced objects are copied).
With a deep copy, any object pointed to by the source is copied and the copy is pointed to by the destination (so there will now be 2 of each referenced object). This recurses down the object tree.
const char * Source = "Hello, world.";
const char * ShallowCopy = Source;
char * DeepCopy = new char[strlen(Source)+1]; // an array of chars, not a single char
strcpy(DeepCopy,Source);
'ShallowCopy' points to the same location in memory as 'Source' does.
'DeepCopy' points to a different location in memory, but the contents are the same.
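The "recurses down the object tree" part can be sketched in Python. This is only a minimal illustration, assuming the data consists of nested lists, dicts, and immutable leaves; the helper name deep_copy is hypothetical, and real code would normally use copy.deepcopy, which also copes with cycles and arbitrary objects.

```python
def deep_copy(value):
    """Hypothetical helper: recursively copy nested lists and dicts."""
    if isinstance(value, list):
        return [deep_copy(item) for item in value]
    if isinstance(value, dict):
        return {key: deep_copy(item) for key, item in value.items()}
    # Immutable leaves (int, str, ...) can safely be shared.
    return value


source = {"name": "Hello, world.", "tags": ["a", ["b", "c"]]}
shallow = dict(source)    # new outer dict, same inner objects
deep = deep_copy(source)  # new objects at every level

source["tags"][1].append("d")
print(shallow["tags"])  # ['a', ['b', 'c', 'd']]
print(deep["tags"])     # ['a', ['b', 'c']]
```

The shallow copy sees the appended item because it still points at the same inner lists; the recursive copy does not.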
{Imagine two objects: A and B of same type _t(with respect to C++) and you are thinking about shallow/deep copying A to B}
Shallow Copy:
Simply makes a copy of the reference to A into B. Think about it as a copy of A's Address.
So, the addresses of A and B will be the same i.e. they will be pointing to the same memory location i.e. data contents.
Deep copy:
Simply makes a copy of all the members of A, allocates memory in a different location for B, and then assigns the copied members to B to achieve a deep copy. In this way, if A becomes non-existent, B is still valid in memory. The correct term to use would be cloning, where you know that they both are totally the same, but yet different (i.e. stored as two different entities in the memory space). You can also provide your own clone wrapper where you can decide via an inclusion/exclusion list which properties to select during the deep copy. This is quite a common practice when you create APIs.
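A clone wrapper with an exclusion list might look roughly like the following Python sketch. The names clone_excluding, Image, pixels, and cache are all hypothetical and chosen only for illustration; the assumption is that attributes named in the exclusion list should stay shared while everything else is deep-copied.

```python
import copy


def clone_excluding(obj, exclude=()):
    """Hypothetical wrapper: deep-copy every attribute except those
    named in `exclude`, which stay shared with the original."""
    clone = copy.copy(obj)  # shallow copy: new object, same attribute values
    for name, value in vars(obj).items():
        if name not in exclude:
            setattr(clone, name, copy.deepcopy(value))
    return clone


class Image:
    def __init__(self, pixels, cache):
        self.pixels = pixels
        self.cache = cache


original = Image(pixels=[[0, 1], [1, 0]], cache={"thumb": [0]})
masked = clone_excluding(original, exclude=("cache",))

original.pixels[0][0] = 9
print(masked.pixels[0][0])             # 0
print(masked.cache is original.cache)  # True
```

Whether an attribute belongs in the exclusion list is a design decision: sharing is cheaper, but only safe if neither side mutates the shared object unexpectedly.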
You can choose to do a Shallow Copy ONLY_IF you understand the stakes involved. When you have enormous number of pointers to deal with in C++ or C, doing a shallow copy of an object is REALLY a bad idea.
EXAMPLE_OF_DEEP_COPY An example is, when you are trying to do image processing and object recognition you need to mask "Irrelevant and Repetitive Motion" out of your processing areas. If you are using image pointers, then you might have the specification to save those mask images. NOW... if you do a shallow copy of the image, when the pointer references are KILLED from the stack, you lose both the reference and its copy, i.e. there will be a runtime access violation at some point. In this case, what you need is a deep copy of your image by CLONING it. In this way you can retrieve the masks in case you need them in the future.
EXAMPLE_OF_SHALLOW_COPY I am not extremely knowledgeable compared to the users in StackOverflow so feel free to delete this part and put a good example if you can clarify. But I really think it is not a good idea to do shallow copy if you know that your program is gonna run for an infinite period of time i.e. continuous "push-pop" operation over the stack with function calls. If you are demonstrating something to an amateur or novice person (e.g. C/C++ tutorial stuff) then it is probably okay. But if you are running an application such as surveillance and detection system, or Sonar Tracking System, you are not supposed to keep shallow copying your objects around because it will kill your program sooner or later.
What is Shallow Copy?
Shallow copy is a bit-wise copy of an object. A new object is created that has an exact copy of the values in the original object. If any of the fields of the object are references to other objects, only the reference addresses are copied i.e., only the memory address is copied.
In this figure, MainObject1 has the fields field1 of type int and ContainObject1 of type ContainObject. When you do a shallow copy of MainObject1, MainObject2 is created with field2 containing the copied value of field1 and still pointing to ContainObject1 itself. Note that since field1 is of primitive type, its value is copied to field2, but since ContainObject1 is an object, MainObject2 still points to ContainObject1. So any changes made to ContainObject1 in MainObject1 will be reflected in MainObject2.
Now if this is a shallow copy, let's see what a deep copy is.
What is Deep Copy?
A deep copy copies all fields, and makes copies of dynamically allocated memory pointed to by the fields. A deep copy occurs when an object is copied along with the objects to which it refers.
In this figure, MainObject1 has the fields field1 of type int and ContainObject1 of type ContainObject. When you do a deep copy of MainObject1, MainObject2 is created with field2 containing the copied value of field1 and ContainObject2 containing the copied value of ContainObject1. Note that any changes made to ContainObject1 in MainObject1 will not be reflected in MainObject2.
In object oriented programming, a type includes a collection of member fields. These fields may be stored either by value or by reference (i.e., a pointer to a value).
In a shallow copy, a new instance of the type is created and the values are copied into the new instance. The reference pointers are also copied just like the values. Therefore, the references are pointing to the original objects. Any changes to the members that are stored by reference appear in both the original and the copy, since no copy was made of the referenced object.
In a deep copy, the fields that are stored by value are copied as before, but the pointers to objects stored by reference are not copied. Instead, a deep copy is made of the referenced object, and a pointer to the new object is stored. Any changes that are made to those referenced objects will not affect other copies of the object.
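As a rough illustration of the two paragraphs above, here is a Python sketch using the standard copy module. The Person and Address classes are hypothetical. Python has no by-value fields in the C++ sense, so an int attribute stands in for a value field here, which is an assumption of this sketch.

```python
import copy


class Address:
    def __init__(self, city):
        self.city = city


class Person:
    def __init__(self, age, address):
        self.age = age          # stands in for a field stored by value
        self.address = address  # field stored by reference


original = Person(30, Address("Boston"))
shallow = copy.copy(original)   # copies the fields, shares the Address
deep = copy.deepcopy(original)  # also copies the Address

original.address.city = "Austin"  # change the referenced object
original.age = 31                 # change a value field on the original only

print(shallow.address.city)   # Austin
print(deep.address.city)      # Boston
print(shallow.age, deep.age)  # 30 30
```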
I would like to give example rather than the formal definition.
var originalObject = {
a : 1,
b : 2,
c : 3,
};
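This code shows shared state. Strictly speaking, plain assignment copies only the reference (no new object is created), but it demonstrates the sharing that a shallow copy leaves in place for nested objects: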
var copyObject1 = originalObject;
console.log(copyObject1.a); // it will print 1
console.log(originalObject.a); // it will also print 1
copyObject1.a = 4;
console.log(copyObject1.a); //now it will print 4
console.log(originalObject.a); // now it will also print 4
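This code shows an independent copy. Strictly speaking, Object.assign copies only one level, so it behaves like a deep copy here only because every property is a primitive (the printed values assume a freshly created originalObject):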
var copyObject2 = Object.assign({}, originalObject);
console.log(copyObject2.a); // it will print 1
console.log(originalObject.a); // it will also print 1
copyObject2.a = 4;
console.log(copyObject2.a); // now it will print 4
console.log(originalObject.a); // !! now it will print 1 !!
Shallow Cloning:
Definition: "A shallow copy of an object copies the ‘main’ object, but doesn’t copy the inner objects."
When a custom object (eg. Employee) has just primitive, String type variables then you use Shallow Cloning.
Employee e = new Employee(2, "john cena");
Employee e2=e.clone();
You return super.clone(); in the overridden clone() method and your job is over.
Deep Cloning:
Definition: "Unlike the shallow copy, a deep copy is a fully independent copy of an object."
This applies when an Employee object holds another custom object:
Employee e = new Employee(2, "john cena", new Address(12, "West Newbury", "Massachusetts"));
Then you have to write the code to clone the 'Address' object as well in the overridden clone() method. Otherwise the Address object won't clone and it causes a bug when you change value of Address in cloned Employee object, which reflects the original one too.
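// Pseudocode: name is a nested (referenced) object
var source = { name = { firstName="Jane", lastName="Jones" } };
var shallow = ShallowCopyOf(source);
var deep = DeepCopyOf(source);
source.name.lastName = "Smith";
WriteLine(source.name.lastName); // prints Smith
WriteLine(shallow.name.lastName); // prints Smith (the nested object is shared)
WriteLine(deep.name.lastName); // prints Jones (the nested object was duplicated)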
Shallow Copy: reference variables inside the original and the shallow-copied object refer to a common object.
Deep Copy: reference variables inside the original and the deep-copied object refer to different objects.
In Java, Object.clone() performs a shallow copy by default.
public class Language implements Cloneable{
String name;
public Language(String name){
this.name=name;
}
public String getName() {
return name;
}
@Override
protected Object clone() throws CloneNotSupportedException {
return super.clone();
}
}
main class is following-
public static void main(String args[]) throws ClassNotFoundException, CloneNotSupportedException{
ArrayList<Language> list=new ArrayList<Language>();
list.add(new Language("C"));
list.add(new Language("JAVA"));
ArrayList<Language> shallow=(ArrayList<Language>) list.clone();
// clone() is used here because ArrayList.clone() makes a shallow copy.
System.out.println(list==shallow);
for(int i=0;i<list.size();i++)
System.out.println(list.get(i)==shallow.get(i));//true
ArrayList<Language> deep=new ArrayList<Language>();
for(Language language:list){
deep.add((Language) language.clone());
}
System.out.println(list==deep);
for(int i=0;i<list.size();i++)
System.out.println(list.get(i)==deep.get(i));//false
}
OutPut of above will be-
false true true
false false false
Any change made to an element of the original list is reflected in the shallow copy but not in the deep copy.
list.get(0).name="ViSuaLBaSiC";
System.out.println(shallow.get(0).getName()+" "+deep.get(0).getName());
OutPut- ViSuaLBaSiC C
Imagine there are two arrays called arr1 and arr2.
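arr1 = arr2; // no copy at all: both names refer to the same array
arr1 = arr2.clone(); // new array; fully independent only for primitive elements (object elements are still shared)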
In simple terms, a shallow copy is similar to call by reference and a deep copy is similar to call by value.
In call by reference, both the formal and the actual parameters of a function refer to the same memory location and value.
In call by value, the formal and the actual parameters of a function refer to different memory locations that hold the same value.
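The analogy can be tried out in Python, with one caveat: Python passes object references, so this is only a loose sketch of the comparison, not a demonstration of true call by value. The function name add_item is hypothetical.

```python
import copy


def add_item(items):
    """Mutates whatever list object it is handed."""
    items.append(99)


data = [1, 2, 3]

add_item(data)                 # the callee works on the caller's own list
after_shared = list(data)

add_item(copy.deepcopy(data))  # the callee only sees a private copy
after_copy = list(data)

print(after_shared)  # [1, 2, 3, 99]
print(after_copy)    # [1, 2, 3, 99]
```

The second call leaves data unchanged because the function only ever touched the deep copy.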
A shallow copy constructs a new compound object and then inserts references into it to the objects found in the original.
Unlike a shallow copy, a deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
Lets take an example.
import copy
x =[1,[2]]
y=copy.copy(x)
z= copy.deepcopy(x)
print(y is z)
The above code prints False.
Let's see how.
The original compound object is x=[1,[2]] (called compound because it has an object inside an object (Inception)).
As you can see in the image, there is a list inside a list.
Then we create a shallow copy of it using y = copy.copy(x). What Python does here is create a new compound object, but the objects inside it point to the original objects.
In the image it has created a new copy of the outer list, but the inner list remains the same as the original one.
Now we create a deep copy of it using z = copy.deepcopy(x). What Python does here is create a new object for the outer list as well as for the inner list, as shown in the image below (red highlighted).
At the end code prints False, as y and z are not same objects.
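Since the images are not reproduced here, the same picture can be checked with identity tests. This is a small sketch that reuses the x, y, and z names from the example above:

```python
import copy

x = [1, [2]]
y = copy.copy(x)      # shallow copy
z = copy.deepcopy(x)  # deep copy

print(y is x, z is x)  # False False
print(y[1] is x[1])    # True
print(z[1] is x[1])    # False

x[1].append(3)         # mutate the inner list through x
print(y)  # [1, [2, 3]]
print(z)  # [1, [2]]
```

Both copies have a new outer list, but only the deep copy has its own inner list.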
HTH.
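#include <cstdlib>
#include <cstring>
struct sample
{
char * ptr;
};
void shallowcpy(sample & dest, sample & src)
{
dest.ptr=src.ptr; // both structs now point at the same buffer
}
void deepcpy(sample & dest, sample & src)
{
dest.ptr=static_cast<char*>(malloc(strlen(src.ptr)+1)); // malloc returns void* in C++
memcpy(dest.ptr,src.ptr,strlen(src.ptr)+1); // copy the text including the terminating '\0'
}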
To add more to other answers,
A shallow copy of an object performs copy by value for value-type properties, and copy by reference for reference-type properties.
A deep copy of an object performs copy by value for value-type properties, as well as copy by value for reference-type properties deep in the hierarchy (of reference types).
A shallow copy does not create new referenced objects, whereas a deep copy does.
Here is the program to explain the deep and shallow copy.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class DeepAndShollowCopy {
int id;
String name;
List<String> testlist = new ArrayList<>();
/*
// To performing Shallow Copy
// Note: Here we are not creating any references.
public DeepAndShollowCopy(int id, String name, List<String>testlist)
{
System.out.println("Shallow Copy for Object initialization");
this.id = id;
this.name = name;
this.testlist = testlist;
}
*/
// To performing Deep Copy
// Note: Here we are creating one references( Al arraylist object ).
public DeepAndShollowCopy(int id, String name, List<String> testlist) {
System.out.println("Deep Copy for Object initialization");
this.id = id;
this.name = name;
String item;
List<String> Al = new ArrayList<>();
Iterator<String> itr = testlist.iterator();
while (itr.hasNext()) {
item = itr.next();
Al.add(item);
}
this.testlist = Al;
}
public static void main(String[] args) {
List<String> list = new ArrayList<>();
list.add("Java");
list.add("Oracle");
list.add("C++");
DeepAndShollowCopy copy=new DeepAndShollowCopy(10,"Testing", list);
System.out.println(copy.toString());
}
@Override
public String toString() {
return "DeepAndShollowCopy [id=" + id + ", name=" + name + ", testlist=" + testlist + "]";
}
}
Taken from [blog]: http://sickprogrammersarea.blogspot.in/2014/03/technical-interview-questions-on-c_6.html
Deep copy involves using the contents of one object to create another instance of the same class. In a deep copy, the two objects may contain the same information but the target object will have its own buffers and resources. The destruction of either object will not affect the remaining object. The overloaded assignment operator would create a deep copy of objects.
Shallow copy involves copying the contents of one object into another instance of the same class, thus creating a mirror image. Owing to straight copying of references and pointers, the two objects will share the same externally contained contents, which makes the behavior of one object unpredictable when the other is modified or destroyed.
Explanation:
Using a copy constructor we simply copy the data values member by member. This method of copying is called shallow copy. If the object is a simple class, comprised of built-in types and no pointers, this is acceptable: a function that receives the copy uses its values, and its behavior is not altered by the shallow copy. If the object has pointer members, however, only the addresses held by those pointers are copied and not the values they point to, so the data of the original object can be inadvertently altered by the function. When the function goes out of scope, the copy of the object with all its data is popped off the stack.
If the object has any pointers a deep copy needs to be executed. With the deep copy of an object, memory is allocated for the object in free store and the elements pointed to are copied. A deep copy is used for objects that are returned from a function.
I came to understand from the following lines.
Shallow copy copies an object's value-type fields (int, float, bool) into the target object, and the object's reference-type fields (string, class, etc.) are copied as references into the target object. The target's reference-type fields will therefore point to the same memory locations as those of the source object.
Deep copy copies an object's value types and reference types into a completely new copy. This means both the value types and the reference types are allocated new memory locations.
Shallow copying is creating a new object and then copying the non-static fields of the current object to the new object. If a field is a value type --> a bit-by-bit copy of the field is performed; for a reference type --> the reference is copied but the referred object is not; therefore the original object and its clone refer to the same object.
Deep copy is creating a new object and then copying the non-static fields of the current object to the new object. If a field is a value type --> a bit-by-bit copy of the field is performed. If a field is a reference type --> a new copy of the referred object is made. If the deep copy is implemented through serialization, the classes to be cloned must be flagged as [Serializable].
Copying arrays:
Array is a class, which means it is reference type so array1 = array2 results
in two variables that reference the same array.
But look at this example:
static void Main()
{
int[] arr1 = new int[] { 1, 2, 3, 4, 5 };
int[] arr2 = new int[] { 6, 7, 8, 9, 0 };
Console.WriteLine(arr1[2] + " " + arr2[2]);
arr2 = arr1;
Console.WriteLine(arr1[2] + " " + arr2[2]);
arr2 = (int[])arr1.Clone();
arr1[2] = 12;
Console.WriteLine(arr1[2] + " " + arr2[2]);
}
shallow clone means that only the memory represented by the cloned array is copied.
If the array contains value type objects, the values are copied;
if the array contains reference type, only the references are copied - so as a result there are two arrays whose members reference the same objects.
To create a deep copy, where reference-type elements are duplicated, you must loop through the array and clone each element manually.
The copy constructor is used to initialize a new object from a previously created object of the same class. By default, the compiler generates a shallow copy. A shallow copy works fine when dynamic memory allocation is not involved. When dynamic memory allocation is involved, both objects will point to the same memory location on the heap. To remove this problem we write a deep copy, so that both objects have their own copy of the attributes in memory.
In order to read the details with complete examples and explanations you could see the article C++ constructors.
To add a little more about the confusion between making a shallow copy and simply assigning a new variable name to a list:
"Say we have:
x = [
[1,2,3],
[4,5,6],
]
This statement creates 3 lists: 2 inner lists and one outer list. A reference to the outer list is then made available under the name x. If we do
y = x
no data gets copied. We still have the same 3 lists in memory somewhere. All this did is make the outer list available under the name y, in addition to its previous name x. If we do
y = list(x)
or
y = x[:]
This creates a new list with the same contents as x. List x contained a reference to the 2 inner lists, so the new list will also contain a reference to those same 2 inner lists. Only one list is copied—the outer list.
Now there are 4 lists in memory, the two inner lists, the outer list, and the copy of the outer list. The original outer list is available under the name x, and the new outer list is made available under the name y.
The inner lists have not been copied! You can access and edit the inner lists from either x or y at this point!
If you have a two dimensional (or higher) list, or any kind of nested data structure, and you want to make a full copy of everything, then you want to use the deepcopy() function in the copy module. Your solution also works for 2-D lists, as it iterates over the items in the outer list and makes a copy of each of them, then builds a new outer list for all the inner copies."
source: https://www.reddit.com/r/learnpython/comments/1afldr/why_is_copying_a_list_so_damn_difficult_in_python/
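The four cases described in the quote can be compared side by side. This is a minimal sketch; the variable names alias, outer_copy, row_copies, and full_copy are made up for illustration, and the list comprehension is an assumed rendering of the "your solution" the quote refers to.

```python
import copy

x = [[1, 2, 3], [4, 5, 6]]

alias = x                              # no copy, just a second name
outer_copy = list(x)                   # same effect as x[:]
row_copies = [list(row) for row in x]  # copy each inner list
full_copy = copy.deepcopy(x)

x[0][0] = 100        # edit an inner list
x.append([7, 8, 9])  # edit the outer list

print(alias)       # [[100, 2, 3], [4, 5, 6], [7, 8, 9]]
print(outer_copy)  # [[100, 2, 3], [4, 5, 6]]
print(row_copies)  # [[1, 2, 3], [4, 5, 6]]
print(full_copy)   # [[1, 2, 3], [4, 5, 6]]
```

The per-row copy matches deepcopy only because the rows contain plain numbers; with deeper nesting, only deepcopy would copy every level.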
I've read countless articles on copy constructors and move semantics. I feel like I 'sort of' understand what's going on, but a lot of the explanations leave out what's actually occurring under the hood (which is what is causing me confusion).
For example:
string b(x + y);
string(string&& that)
{
data = that.data;
that.data = 0;
}
What is actually happening in memory with the objects? So you have some object 'b' that takes x + y which is an rvalue and then that invokes the move constructor. This is really causing me confusion... Why do that?
I understand the benefit is to 'move' the data instead of copy it, but where I'm lost here is when I try to piece together what happens to each object/parameter at a memory level.
Sorry if this sounds confusing, talking about it is even confusing myself.
EDIT:
In summary, I understand the 'why' of the copy constructors and move constructors... I just don't understand the 'how'.
What's going on is a complex object will normally not be entirely stack based. Let's take an example object:
class String {
public:
// happy fun API
private:
size_t size;
char* data;
};
Like most strings, our string is a character array. It essentially is an object that keeps around a character array and a proper size.
In the case of a copy, there's two steps involved. First you copy size then you copy data. But data is just a pointer. So if we copy the object then modify the original, the two places are pointing to the same data, our copy changes. This is not what we want.
So instead what must be done is to do the same thing we did when we first made the object, new the data to the proper size.
So when we're copying the object we need to do something like:
String::String(String const& copy) {
size = copy.size;
data = new char[size]; // data is a char*, so allocate chars, not ints
memcpy(data, copy.data, size);
}
But on the other hand, if we only need to move the data, we can do something like:
String::String(String&& copy) {
size = copy.size;
data = copy.data;
copy.size = 0;
copy.data = nullptr; // So copy's dtor doesn't try to free our data.
}
Now behind the scenes, the pointer was just kinda... passed to us. We didn't have to allocate any more information. This is why moves are preferred. Allocating and copying memory on the heap can be a very expensive operation because it's not happening locally on the stack, it's happening somewhere else, so that memory has to be fetched, it might not be in cache, etc.
... (x + y);
Let's assume Short-String-Optimisation is not in play - either because the string implementation doesn't use it or the string values are too long. operator+ returns by value, so has to create a temporary with a new buffer totally unrelated to the x and y strings...
[ string { const char* _p_data; ... } ]
\
\-------------------------(heap)--------[ "hello world!" ];
Sans optimisation, that's done to prepare the argument for the string constructor - "before" considering what that constructor will do with the argument.
string b(x + y);
Here the string(string&&) constructor is invoked, as the compiler understands that the temporary above is suitable for moving from. When the constructor starts running, its pointer to text is uninitialised - something like the diagram below with the temporary shown again for context:
[ string { const char* _p_data; ... } ]
\
\-------------------------(heap)--------[ "hello world!" ];
[ string b { const char* _p_data; ... } ]
\
\----? uninitialised
What the move constructor for b then does is steal the existing heap buffer from the temporary.
nullptr
/
[ string { const char* _p_data; ... } ]
-------------------------(heap)--------[ "hello world!" ];
/
/
[ string b { const char* _p_data; ... } ]
It also needs to set the temporary's _p_data to nullptr to make sure that when the temporary's destructor runs it doesn't delete[] the buffer now considered to be owned by b. (The move constructor will "move" other data members too - the "capacity" value, either a pointer to the "end" position or a "size" value etc.).
All this avoids having b's constructor create a second heap buffer, copy all the text over into it, only to then do extra work to delete[] the temporary's buffer.
(x + y) gives you a string value. You want to store it in b without copying it. This was made possible long before C++11 and move semantics, by the Return Value Optimization (RVO).
I'm a java programmer switching over to C++. I have a list of data in a class. I want to return the contents of a list stored in a class variable and then generate a new list and store it in the class variable so I can start adding new data to the empty list. I think I know how to do it, but I want to double check since I'm new to references and c++ memory management and don't want a stupid memory leak later. (note, I can't copy my actual code easily so I'm just rewriting it, forgive me if I mistype anything).
I believe the correct syntax would be something like this:
//mylist is typedef of a list type
mylist& temporaryList=classList;
classList=mylist();
return classList;
Is this syntax correct? Also, will I have to worry about freeing either the returned variable or the classList variable at any time or will RAII take care of it all for me?
Sorry for asking such an easy question, but thank you for confirming my assumptions.
mylist& temporaryList=classList;
temporaryList is a reference. When you change classList, it will change too. Try this:
mylist temporaryList = classList;
This will invoke mylist's copy constructor, creating a new list instead of just aliasing the other one.
return classList;
This is returning the one you just decided should be a new list. You want to return temporaryList. Make sure it isn't being returned by reference, though, because temporaryList will go out of scope (cleaning itself up, since it was allocated on the stack) and you'll end up with a dangling reference.
As well, usually, rather than assigning the result of a default constructor, classes might provide a sort of reset function to do that without the overhead of another object.
As @chris points out, you have a problem in that you are using a reference. References have the nice feature that they alias the actual object at little to no cost, but in your case it means that when you reset the member list you are resetting the only list in your program.
The trivial naïve fix to your code is, as @chris also points out, to copy the list and then reset the original:
mylist getAndClear() {
mylist temporaryList=classList; // make a copy
classList=mylist(); // reset the internal
return temporaryList; // return the **copy**
}
Now, the problem with this approach is that you are incurring the cost of copying when you know that the original list is going to be destroyed immediately. You can avoid that cost by using the copy-and-swap idiom(*):
mylist getAndClear() {
mylist tmp; // empty
swap( tmp, classList ); // swap contents, now classList is empty
// tmp holds the data
return tmp;
}
This is assuming that mylist is a std::list. If it is not, make sure that you implement swap (or in C++11 move constructors, that will enable an efficient std::swap).
(*) While this may not seem a direct application of the copy-and-swap idiom, it actually is, where the modification to be applied is clearing the list. Perform a copy and apply the change over the copy (in this case the copy is avoided, as the change is emptying it), then swap the contents once the operation has completed successfully.
chris mentions using a member function to reset or clear the object like this:
mylist tmp = classList;
classList.clear();
return tmp;
But you can usually do even better by avoiding the copy in the first place.
mylist tmp;
std::swap(tmp,classList);
return tmp;
Also, will I have to worry about freeing either the returned variable or the classList variable at any time or will RAII take care of it all for me?
You don't need to delete resources unless you new them somewhere. And if you do use new then you should also use smart pointers so you still won't have to delete anything.
One of the most important things to understand about C++ coming from Java is that C++ objects are value-like objects by default. That is:
class A {
bool bar_called;
public:
A() : bar_called(false) {}
void bar() { bar_called = true; }
bool has_bar_been_called_on_this_object() { return bar_called; }
};
void foo(A a) {
a.bar();
}
int main() {
A a;
foo(a);
std::cout << a.has_bar_been_called_on_this_object() << '\n';
}
The output will indicate that bar has not been called on a. Java uses, but tries to hide, pointers. So once you figure out pointers in C++ things should make more sense to you, and then you will be able to figure out how to not use pointers.
Object o = new Object(); // Java hides the fact that o is a pointer to an Object, but fails to hide the consequences
Object b = o; // b is a pointer to an object, the same Object o points to.
// Equivalent C++
Object *o = new Object();
Object *b = o;
Judging from the C++ code you presented, in Java you'd do what you're asking about something like this:
mylist tmp = classList;
classList = new mylist();
return tmp;
The equivalent in C++ would be:
mylist *tmp = classList; // classList is a pointer to a new'd up list.
classList = new mylist();
return tmp;
However, that's not idiomatic C++. In C++ you generally don't want to use pointers, and if you do, you want to use smart pointers:
std::shared_ptr<mylist> tmp = classList; // classList is a std::shared_ptr<mylist>
classList = std::make_shared<mylist>();
return tmp;
or
std::unique_ptr<mylist> tmp = std::move(classList); // classList is a unique_ptr
classList = std::unique_ptr<mylist>(new mylist()); // careful here, don't go calling a bunch of functions inside the mylist initializer, it's dangerous for reason beyond the scope of this post
return tmp;
But the C++ way is really to avoid pointers altogether.
mylist tmp; // classList is not a pointer at all
std::swap(tmp,classList); // the values of classList and tmp are swapped
return tmp; // tmp is returned by value, tmp has the same value as classList, but is not the same object, tmp and classList are objects, not pointers to objects as they are in Java or in the above C++ examples.
EDIT: I know in this case, if it were an actual class, I would be better off not putting the string on the heap. However, this is just sample code to make sure I understand the theory. The actual code is going to be a red-black tree, with all the nodes stored on the heap.
I want to make sure I have these basic ideas correct before moving on (I am coming from a Java/Python background). I have been searching the net, but haven't found a concrete answer to this question yet.
When you reassign a pointer to a new object, do you have to call delete on the old object first to avoid a memory leak? My intuition is telling me yes, but I want a concrete answer before moving on.
For example, let's say you had a class that stored a pointer to a string:
class MyClass
{
private:
std::string *str;
public:
MyClass (const std::string &_str)
{
str=new std::string(_str);
}
void ChangeString(const std::string &_str)
{
// I am wondering if this is correct?
delete str;
str = new std::string(_str);
/*
* or could you simply do it like:
* str = _str;
*/
}
....
In the ChangeString method, which would be correct?
I think I am getting hung up on the fact that if you don't use the new keyword for the second way, it will still compile and run like you expected. Does this just overwrite the data that this pointer points to? Or does it do something else?
Any advice would be greatly appreciated :D
If you must deallocate the old instance and create another one, you should first make sure that creating the new object succeeds:
void reset(const std::string& str)
{
std::string* tmp = new std::string(str);
delete m_str;
m_str = tmp;
}
If you call delete first, and creating the new object then throws an exception, the class instance will be left with a dangling pointer. E.g., your destructor might end up attempting to delete the pointer again (undefined behavior).
You could also avoid that by setting the pointer to NULL in-between, but the above way is still better: if resetting fails, the object will keep its original value.
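For context, here is how that reset function might sit inside a complete class. The `Holder` name, the `get` accessor, and the disabled copy operations are assumptions added for this sketch; only `reset` and `m_str` come from the answer.

```cpp
#include <string>

// Hypothetical class; m_str matches the member name used in reset() above.
class Holder {
public:
    explicit Holder(const std::string &s) : m_str(new std::string(s)) {}
    ~Holder() { delete m_str; }

    void reset(const std::string &s) {
        std::string *tmp = new std::string(s);  // may throw; m_str is untouched if it does
        delete m_str;                           // only reached once the new string exists
        m_str = tmp;
    }

    const std::string &get() const { return *m_str; }

private:
    // Copying is disabled (declared, never defined) to keep the sketch short.
    Holder(const Holder &);
    Holder &operator=(const Holder &);

    std::string *m_str;
};
```

If `new` throws inside `reset`, the object still owns its original string and the destructor deletes it exactly once.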
As to the question in the code comment.
*str = _str;
This would be the correct thing to do. It is normal string assignment.
str = &_str;
This would be assigning pointers and completely wrong. You would leak the string instance previously pointed to by str. Even worse, it is quite likely that the string passed to the function isn't allocated with new in the first place (you shouldn't be mixing pointers to dynamically allocated and automatic objects). Furthermore, you might be storing the address of a string object whose lifetime ends with the function call (if the const reference is bound to a temporary).
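A small sketch of why `*str = _str` is safe: the assignment copies characters into the existing heap string and leaves the pointer itself alone. The function name and local variables are made up for illustration.

```cpp
#include <string>

// Returns true if assigning through the pointer reuses the same heap object.
bool assignsInPlace() {
    std::string *str = new std::string("old");
    const std::string *before = str;

    const std::string replacement("new");
    *str = replacement;   // string assignment: copies the characters, pointer unchanged

    bool ok = (str == before) && (*str == "new");
    delete str;           // still exactly one heap string to release
    return ok;
}
```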
Why do you think you need to store a pointer to a string in your class? Pointers to C++ collections such as string are actually very rarely necessary. Your class should almost certainly look like:
class MyClass
{
private:
std::string str;
public:
MyClass (const std::string & astr) : str( astr )
{
}
void ChangeString(const std::string & astr)
{
str = astr;
}
....
};
Just pinpointing here, but
str = _str;
would not compile (you're trying to assign _str, which is the value of a string passed by reference, to str, which is the address of a string). If you wanted to do that, you would write :
str = &_str;
(and you would have to change either _str or str so that the constness matches).
But then, as your intuition told you, you would have leaked the memory of whatever string object was already pointed to by str.
As pointed out earlier, when you add a variable to a class in C++, you must think about whether the variable is owned by the object, or by something else.
If it is owned by the object, then you're probably better off storing it as a value, and copying stuff around (but then you need to make sure that copies don't happen behind your back).
If it is not owned, then you can store it as a pointer, and you don't necessarily need to copy things all the time.
Other people will explain this better than me, because I am not really comfortable with it.
What I end up doing a lot is writing code like this :
class Foo {
private :
Bar & dep_bar_;
Baz & dep_baz_;
Bing * p_bing_;
public:
Foo(Bar & dep_bar, Baz & dep_baz) : dep_bar_(dep_bar), dep_baz_(dep_baz) {
p_bing_ = new Bing(...); // owned by Foo, so Foo must delete it
}
~Foo() {
delete p_bing_;
}
};
That is, if an object depends on something in the 'Java' / 'IoC' sense (the object exists elsewhere, you're not creating it, and you only want to call methods on it), I would store the dependency as a reference, using dep_xxxx.
If I create the object, I would use a pointer, with a p_ prefix.
This is just to make the code more "immediate". Not sure it helps.
Just my 2c.
Good luck with the memory management; you're right that it is the tricky part coming from Java. Don't write code until you're comfortable, or you're going to spend hours chasing segfaults.
Hoping this helps !
The general rule in C++ is that for every object created with "new" there must be a "delete". Making sure that always happens is the hard part ;) Modern C++ programmers avoid creating memory on the heap (i.e. with "new") like the plague and use stack objects instead. Really consider whether you need to be using "new" in your code. It's rarely needed.
If you're coming from a background with garbage collected languages and find yourself really needing to use heap memory, I suggest using the boost shared pointers. You use them like this:
#include <boost/shared_ptr.hpp>
...
boost::shared_ptr<MyClass> myPointer = boost::shared_ptr<MyClass>(new MyClass());
myPointer has pretty much the same language semantics as a regular pointer, but shared_ptr uses reference counting to determine when to delete the object it's referencing. It's basically do-it-yourself garbage collection. The docs are here: http://www.boost.org/doc/libs/1_42_0/libs/smart_ptr/smart_ptr.htm
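The reference counting can be observed with a small sketch. It uses std::shared_ptr from C++11, which was modeled on boost::shared_ptr, rather than the Boost header shown above. The `Tracked` type and the function name are hypothetical.

```cpp
#include <memory>

// Hypothetical class that counts live instances so destruction can be observed.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live Tracked objects after all owners are gone,
// or -1 if the object was destroyed too early.
int sharedOwnershipDemo() {
    {
        std::shared_ptr<Tracked> first = std::make_shared<Tracked>();
        {
            std::shared_ptr<Tracked> second = first;  // reference count is now 2
        }                                             // second goes away, count back to 1
        if (Tracked::alive != 1) return -1;           // object must still exist here
    }                                                 // last owner gone, object deleted
    return Tracked::alive;
}
```

The object is deleted only when the last shared_ptr that refers to it is destroyed, with no explicit delete anywhere.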
I'll just write a class for you.
class A
{
Foo * foo; // private by default
public:
A(Foo * foo_): foo(foo_) {}
A(): foo(0) {} // in case you need a no-arguments ("default") constructor
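A(const A &a):foo(a.foo ? new Foo(*a.foo) : 0) {} // this is tricky; explanation below
A& operator=(const A &a) { if (this != &a) { Foo * tmp = a.foo ? new Foo(*a.foo) : 0; delete foo; foo = tmp; } return *this; } // copy first, then delete the old Foo
void setFoo(Foo * foo_) { delete foo; foo = foo_; }
~A() { delete foo; }
};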
For classes that hold resources like this, the copy constructor, assignment operator, and destructor are all necessary. The tricky part of the copy constructor and assignment operator is that you need to delete each Foo precisely once. If the copy constructor initializer had said :foo(a.foo), then that particular Foo would be deleted once when the object being initialized was destroyed and once when the object being initialized from (a) was destroyed.
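As a compilable sketch of such a class, the version below assumes a trivial `Foo` holding one `int` (hypothetical, since the answer never defines `Foo`) and adds a `get` accessor and a `copyIsDeep` helper purely so the deep copy can be checked.

```cpp
// Hypothetical payload type for illustration.
struct Foo {
    int n;
    explicit Foo(int n_) : n(n_) {}
};

class A {
    Foo *foo;  // owned by this object
public:
    explicit A(Foo *foo_) : foo(foo_) {}                  // takes ownership
    A(const A &a) : foo(a.foo ? new Foo(*a.foo) : 0) {}   // deep copy
    A &operator=(const A &a) {
        if (this != &a) {
            Foo *tmp = a.foo ? new Foo(*a.foo) : 0;       // copy first
            delete foo;                                   // then release the old Foo
            foo = tmp;
        }
        return *this;
    }
    ~A() { delete foo; }
    Foo *get() const { return foo; }
};

// True if a copy owns a distinct Foo with the same value.
bool copyIsDeep() {
    A original(new Foo(7));
    A copy(original);
    return copy.get() != original.get() && copy.get()->n == 7;
}
```

Because each `A` owns its own `Foo`, both destructors run without deleting the same pointer twice.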
The class, the way I've written it, needs to be documented as taking ownership of the Foo pointer it's being passed, because Foo * f = new Foo(); A a(f); delete f; will also cause double deletion.
Another way to do that would be to use Boost's smart pointers (which were the core of the next standard's smart pointers) and have boost::shared_ptr<Foo> foo; instead of Foo * foo; in the class definition. In that case, the copy constructor should be A(const A &a):foo(a.foo) {}, since the smart pointer will take care of deleting the Foo when all the copies of the shared pointer pointing at it are destroyed. (There are problems you can get into here, too, particularly if you mix shared_ptr<>s with any other form of pointer, but if you stick to shared_ptr<> throughout you should be OK.)
Note: I'm writing this without running it through a compiler. I'm aiming for accuracy and good style (such as the use of initializers in constructors). If somebody finds a problem, please comment.
Three comments:
You need a destructor as well.
~MyClass()
{
delete str;
}
You really don't need to use heap allocated memory in this case. You could do the following:
class MyClass {
private:
std::string str;
public:
MyClass (const std::string &_str) {
str= _str;
}
void ChangeString(const std::string &_str) {
str = _str;
}
};
You can't do the commented out version. That would be a memory leak. Java takes care of that because it has garbage collection. C++ does not have that feature.
When you reassign a pointer to a new object, do you have to call delete on the old object first to avoid a memory leak? My intuition is telling me yes, but I want a concrete answer before moving on.
Yes. If it's a raw pointer, you must delete the old object first.
There are smart pointer classes that will do this for you when you assign a new value.
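As one illustration, std::unique_ptr (C++11) deletes the object it currently owns when it is given a new one. The `Node` type and its destructor counter are made up here so the deletion can be observed.

```cpp
#include <memory>

// Hypothetical type that records how many times its destructor has run.
struct Node {
    static int destroyed;
    ~Node() { ++destroyed; }
};
int Node::destroyed = 0;

// Returns how many Node objects were destroyed by the reassignment alone.
int reassignDemo() {
    std::unique_ptr<Node> p(new Node());
    int before = Node::destroyed;
    p.reset(new Node());            // old Node deleted here, no manual delete
    return Node::destroyed - before;
}
```

The second Node is deleted in turn when `p` goes out of scope at the end of the function.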