Newbie question about manual memory management and deep copying - c++

Alright, so I'm trying out C++ for the first time, as it looks like I'll have to use it for an upcoming course in college. I have a couple years of programming under my belt, but not much in the non-garbage-collected world.
I have a class, a Node for use in a doubly linked list. So basically it has a value and two pointers to other Nodes. The main constructor looks like Node(const std::string & val, Node * prev, Node * next). The exercise includes a copy constructor that does a shallow copy of another Node, with a comment above it that says to change it to make a deep copy.
Here is what I thought that meant:
Node(const Node & other)
: value(other.value)
{
prev = new Node(other.prev->value, other.prev->prev, other.prev->next);
next = new Node(other.next->value, other.next->prev, other.next->next);
}
This seems to accomplish the goal of making it so that changing the copied Node doesn't affect the new Node. However, when I do it this way, I am allocating new stuff on the heap. This worries me, because I think it means that I should also be deleting it in the Node's destructor. But this is now inconsistent with the other constructor, where pointers to the Nodes are just passed in, already pointing to something. I can't rightly go deleting next and prev in the destructor with that going on, right?
I'm really confused, guidance appreciated!
EDIT: Here is the code (before my above change to it), as requested:
#include <string>
//! Node implements a doubly-linked list node
class Node {
friend class LinkedList; //!< LinkedList can access private members of Node
public:
//! Constructor
Node(const std::string & v, Node * p, Node * n) :
value(v), prev(p), next(n)
{
}
//! Change to deep copy
Node(const Node & other) :
value(other.value), prev(other.prev), next(other.next)
{
}
//! Read-only public methods for use by clients of the LinkedList class
const std::string & GetValue() const
{
return value;
}
Node * GetPrevious()const
{
return prev;
}
Node * GetNext()const
{
return next;
}
//! Change to deep copy
Node & operator=(const Node & other)
{
if(this!=&other)
{
value=other.value;
prev=other.prev;
next=other.next;
}
return *this;
}
private:
std::string value; //!< value stored in the node
Node * prev; //!< pointer to previous node in the list
Node * next; //!< pointer to next node in the list
};

First of all, I'm not really sure how the objective of the exercise should be understood. How deep should the copy be? In a solution like yours, this->next->next and other.next->next would still be the same object. Should this object also be duplicated? And the rest of the list? Where does it end? One could of course deep-copy the whole list, but I think that would be quite unexpected behavior for the copy constructor of a single node.
Is the value member variable perhaps a pointer that is supposed to be deep copied? That would make much more sense to me.
But back to your interpretation:
Node a(...);
// ... more code that adds a whole list to a
Node b(a);
There are two problems with your implementation. First, b.next->prev points to a, while I suspect it should point back to b. Second, you need to think about the corner cases where a is the first or last node in the list, because then other.prev or other.next may be null and must not be dereferenced.
And to your main question: you are of course right, the newly created objects need to be deleted again somewhere. Whether you copy just the prev and next nodes or the whole list, I would say the user of that copy is responsible for deleting all the copied nodes. I assume that with a normal, non-copied list, the user of that list would walk through all the nodes and delete them manually one after another once they are done with the list. They would not expect the destructor of one node to delete the whole list. The same goes for copies: they should behave the same way, so the user of the copied nodes should delete all the copies. (In practice you would probably have a list class that does all that node management for you.)
But again, if the copy constructor of the node copies the whole list, or even just several of its nodes, this would be very unexpected, and people would constantly forget to clean up all these copies. That's not your node class's fault, though, but the fault of the exercise requirements.
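To make the "user walks the list and deletes every node" idea concrete, here is a minimal sketch. SimpleNode and destroy_list are hypothetical names used only for illustration; they are not part of the exercise's code, and the struct exposes its members publicly just to keep the example short.

```cpp
#include <string>

// Hypothetical stand-in for the exercise's Node (public members for brevity).
struct SimpleNode {
    std::string value;
    SimpleNode* prev;
    SimpleNode* next;
    SimpleNode(const std::string& v, SimpleNode* p, SimpleNode* n)
        : value(v), prev(p), next(n) {}
};

// Deletes every node from head onward and returns how many were freed.
// The node destructor itself deletes nothing, so ownership stays with the caller.
int destroy_list(SimpleNode* head) {
    int freed = 0;
    while (head != nullptr) {
        SimpleNode* following = head->next;  // save the link before freeing the node
        delete head;
        head = following;
        ++freed;
    }
    return freed;
}
```

The same function works for an original list and for a copied one, which is the point made above: copies should be cleaned up exactly like any other list.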

Usually "deep copy" involves traversing the data structure and copying the entire thing. In your case, given a Node, make a complete copy of the list.
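As a rough sketch of what "given a Node, make a complete copy of the list" could look like, the function below rewinds to the head, copies every node, and returns the copy that corresponds to the starting node. PlainNode and deep_copy_from are hypothetical names, and the sketch assumes the list ends are marked with null pointers.

```cpp
#include <string>

// Hypothetical stand-in for the exercise's Node (public members for brevity).
struct PlainNode {
    std::string value;
    PlainNode* prev;
    PlainNode* next;
    PlainNode(const std::string& v, PlainNode* p, PlainNode* n)
        : value(v), prev(p), next(n) {}
};

// Copies the whole list that `node` belongs to and returns the copy of `node`.
// The caller owns every node in the new list.
PlainNode* deep_copy_from(const PlainNode* node) {
    if (node == nullptr) return nullptr;
    const PlainNode* first = node;
    while (first->prev != nullptr) first = first->prev;  // rewind to the head

    PlainNode* copy_of_node = nullptr;
    PlainNode* tail = nullptr;  // last node copied so far
    for (const PlainNode* cur = first; cur != nullptr; cur = cur->next) {
        PlainNode* fresh = new PlainNode(cur->value, tail, nullptr);
        if (tail != nullptr) tail->next = fresh;  // link the previous copy forward
        tail = fresh;
        if (cur == node) copy_of_node = fresh;
    }
    return copy_of_node;
}
```

Because the loop is iterative, each node is copied exactly once and the back links of the copies point into the new list rather than the old one.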

A deep copy makes an entire copy of a structure. What I mean by structure is a collection of objects that work together to perform a task. If you had a car class that had an object for each wheel and the body - a deep copy would make a copy of the entire car (and make copies of both the wheels and the body).
In your case, the "Entire Structure" is the list. A deep copy operation would only make sense if performed at the "list level." A deep copy of a node would copy the data that the node points to - but would not assign itself to be part of a list (as a Node should be unaware of the "master" list object).
List* List::CopyList()
{
List* nlist = new List();
ListNode* prev = NULL;  // last node copied so far
ListNode* first = NULL; // head of the new list
for (ListNode* node = m_first; node != NULL; node = node->Next())
{
ListNode* copy = node->CopyNode(); // assumed to return a new node with a copy of the node's data
copy->SetNext(NULL);
copy->SetPrev(prev);
if (prev) prev->SetNext(copy);
else first = copy;
prev = copy;
}
if (first)
{
nlist->SetFirst(first);
nlist->SetLast(prev);
}
return nlist;
}
Note: I just pulled that code out of my ass so it's not tested. Hope it helps though :)

You are correct in worrying.
By passing pointers into the Node constructor there is no information about ownership passed with the pointer. This is a poor design of the constructor. Either you should pass in a reference, indicating you don't own the next node, or pass a std::auto_ptr<> (std::unique_ptr in C++11 and later, where std::auto_ptr is deprecated), which indicates that you must take ownership. One could argue that next or prev could be NULL (beginning or end of the list) and thus you cannot use references, but this can be overcome by having alternative constructors.
Of course there are exceptions:
Is the Node class a private member of another class? If so, the use of the Node class is completely controlled by the owner, and thus its correct usage would be controlled by the owning class.
What you have not provided is the definition of the destructor. With it, we would be able to tell whether the node actually takes ownership of the pointers passed in to the constructors (or whether next and prev are already smart pointers).
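One way to let the types carry the ownership information is sketched below, using std::unique_ptr (the C++11 replacement for std::auto_ptr). OwningNode and append_after are hypothetical names, and the choice that next owns while prev only observes is an assumption made for this sketch, not something the exercise prescribes.

```cpp
#include <memory>
#include <string>

// Hypothetical node whose member types state ownership:
// `next` owns the following node, `prev` only observes the preceding one.
struct OwningNode {
    std::string value;
    OwningNode* prev;                  // non-owning back link, never deleted here
    std::unique_ptr<OwningNode> next;  // owning link, freed automatically

    explicit OwningNode(const std::string& v) : value(v), prev(nullptr) {}
};

// Appends a node after `tail` (assumed to be the last node) and
// returns a non-owning pointer to the new node.
OwningNode* append_after(OwningNode& tail, const std::string& v) {
    tail.next = std::unique_ptr<OwningNode>(new OwningNode(v));
    tail.next->prev = &tail;
    return tail.next.get();
}
```

With this layout no hand-written destructor is needed: destroying the head releases the rest of the chain. For very long lists the recursive destruction can get deep, so a list class might still prefer an explicit loop.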

If every node makes copies of the nodes it points to, then you can safely delete the objects in the destructors. If you are passing pointers in (as the constructor Node(const std::string & v, Node * p, Node * n) does), then you do not "own" the pointers and should not delete them. If this is part of a linked list class, then that class should own the pointers and delete the objects as necessary. You could also make Node a private nested class of the linked list class to keep users (or yourself) from messing with your pointers.
You've also made a mistake in the recursion in your implementation: the copy constructor performs one level of deep copying and then calls the "normal" constructor, which takes pointers and is therefore shallow. As a result, your deep copy is only one level deep. It should repeatedly call the copy constructor, something like this:
Node(const Node & other) : value(other.value)
{
prev = new Node(*(other.prev));
next = new Node(*(other.next));
}
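Note that this version does not check for null pointers, and in a doubly linked list each copied neighbor copies its own neighbors in turn, so the recursion never terminates as written. AFAIK there is no benefit to using a deep copy here anyway. The only practical application I can think of is copying the entire list, which could be handled more effectively in the class representing said list.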

Related

Class Design Problems (managing memory allocation & deallocation)

1) So I have made a somewhat modified form of linked list that has index-based addressing and other delete functions. I am just going to copy the header file I made...
template<class T>
class LinkedList
{
public:
LinkedList();
~LinkedList();
int del_node(); // deletes the Node and element
int remove_node(); // deletes the Node only
int get_length();
int search_node(T*); // matches the pointer to see if its the same element
void add(T*);
void clear_list(); // calls deletes on all nodes and all elements
bool is_blank();
T& get_at(int); // operates like a vector get_at
private:
struct Node
{
T* element; // pointer passed to add(T*) is stored here.
Node* next;
};
};
Now see how I am adding an object to a linked list. I need to pass in an object pointer, which I am passing in the form of
new Object()
This is particularly useful when I am adding vertices of a graph. I just read the data and other fields from the user and call
LinkedList<Vertex> graph;
graph.add(new Vertex(arguments));
Now there comes a situation when I have to copy some elements from LinkedList A to B for temporary storage. I want to be able to remove elements from B after any kind of operation. But if I use del_node, it destroys the internal Node and also deletes the object pointed to by the element pointer I passed in. So I created an additional function, remove_node, that deletes only the Node and not the object pointed to by element.
So I wanted to ask whether it is okay to do this, or whether there is a design fault in my list and I should not be doing this. I am thinking of this from a library point of view, for example if I were to provide this class in a library. Is this suitable, or will it confuse people? Any advice would be appreciated.
Please, I don't need any suggestions to use a replacement
function/class/library like vector. I am studying data structures and I have
to design any sort of data structure myself.
The more idiomatic fashion is to have Node::~Node always call delete element;, but add a T* Node::release();. This is what std::unique_ptr does for instance.
The implementation is straightforward:
T* Node::release()
{
T* tmp = element;
element = nullptr;
return tmp;
}
That way the Node d'tor is still correct, but you can "save" the data from deletion.
This is also the first step in addressing what I sense is a flaw in your implementation. You implement all functionality in LinkedList, even that which is relevant to the behavior of the internal class Node. Don't do that. Give Node a role and an interface related to that role. Then have LinkedList work by using that interface.
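To see the release() idea in isolation, here is a small self-contained sketch. ElementNode is a hypothetical stand-in for the question's nested Node, and Tracked is a helper invented here only so that destructions can be counted; copying is disabled because two nodes deleting the same element would be a double delete.

```cpp
// Hypothetical node template: the destructor always deletes `element`,
// while release() hands the pointer back to the caller first.
template <class T>
struct ElementNode {
    T* element;
    ElementNode* next;

    explicit ElementNode(T* e) : element(e), next(nullptr) {}
    ~ElementNode() { delete element; }  // deleting a null pointer is a no-op

    // Copying would lead to a double delete, so it is disabled in this sketch.
    ElementNode(const ElementNode&) = delete;
    ElementNode& operator=(const ElementNode&) = delete;

    T* release() {
        T* tmp = element;
        element = nullptr;
        return tmp;
    }
};

// Helper type (illustration only) that counts how many objects were destroyed.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;
```

A node that has been released destroys nothing when it goes away, while a node that still holds its element deletes it, so del_node and remove_node could both be built on the same destructor.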
Ownership should be explicit when designing your class.
For that, you can use explicit method names and return std::unique_ptr when you are transferring ownership. With explicit method names you should be able to remove your comments.
template<class T>
class LinkedList
{
public:
LinkedList(const LinkedList&);
LinkedList(LinkedList&&);
LinkedList& operator=(const LinkedList&);
LinkedList& operator=(LinkedList&&);
void free_element(int); // deletes the Node and element
std::unique_ptr<T> extract_element(int); // deletes the Node only
int get_length() const;
void add_element(std::unique_ptr<T>);
void absorb_element(T*);
void free_all_elements(); // calls deletes on all nodes and all elements
};
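A possible implementation of two of those members is sketched below, just to show how the std::unique_ptr signatures play out. OwningList is a hypothetical name, adding at the front is an arbitrary choice, and std::size_t is used for indices instead of int; none of that comes from the interface above.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal list: it owns its elements through std::unique_ptr,
// so the signatures alone tell the caller who is responsible for deletion.
template <class T>
class OwningList {
    struct Node {
        std::unique_ptr<T> element;
        std::unique_ptr<Node> next;
    };
    std::unique_ptr<Node> head_;
    std::size_t length_ = 0;

public:
    std::size_t get_length() const { return length_; }

    // Takes ownership of the element and puts it at the front of the list.
    void add_element(std::unique_ptr<T> element) {
        std::unique_ptr<Node> node(new Node);
        node->element = std::move(element);
        node->next = std::move(head_);
        head_ = std::move(node);
        ++length_;
    }

    // Removes the node at `index` and transfers its element to the caller.
    // Returns an empty pointer when `index` is out of range.
    std::unique_ptr<T> extract_element(std::size_t index) {
        std::unique_ptr<Node>* link = &head_;
        while (*link && index > 0) {
            link = &(*link)->next;
            --index;
        }
        if (!*link) return nullptr;
        std::unique_ptr<Node> node = std::move(*link);
        *link = std::move(node->next);  // unlink the node from the chain
        --length_;
        return std::move(node->element);
    }
};
```

The caller of extract_element receives a std::unique_ptr, so the element is freed automatically unless it is moved somewhere else, for example into a second list.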

Implementing a templated doubly linked list of pointers to objects

I'm a little confused about implementing a doubly linked list where the data in the list are pointers.
The private part of my linked list class looks like:
private:
struct node {
node* next;
node* prev;
T* o;
};
node* first; // The pointer to the first node (NULL if none)
node* last; // The pointer to the last node (NULL if none)
unsigned int size_;
As you can see, the list is full of pointers to objects rather than just plain old objects, which makes it a little more confusing to me.
The following is the description in the spec:
Note that while this list is templated across the contained type, T, it inserts and removes only pointers to T, not instances of T. This ensures that the Dlist implementation knows that it owns inserted objects, it is responsible for copying them if the list is copied, and it must destroy them if the list is destroyed.
Here is my current implementation of insertFront(T* o):
void Dlist::insertFront(T* o) {
node* insert = new node();
insert->o = new T(*o);
insert->next = first;
insert->prev = last;
first = insert;
}
This seems wrong though. What if T doesn't have a copy constructor? And how does this ensure sole ownership of the object in the list?
Could I just do:
insert->o = o;
It seems like this is not safe, because if you had:
Object* item = new Object();
dlist.insertFront(item);
delete item;
Then the item would also be destroyed for the list. Is this correct? Is my understanding off anywhere?
Thanks for reading.
Note: While this looks like homework, it is not. I am actually a Java dev just brushing up my pointer skills by doing an old school project.
When you have a container of pointers, you have one of the two following usage scenarios:
1. A pointer is given to the container and the container takes responsibility for deleting the pointer when the containing structure is deleted.
2. A pointer is given to the container but owned by the caller. The caller takes responsibility for deleting the pointer when it is no longer needed.
Number 1 above is quite straight-forward.
In the case of number 2, it is expected that the owner of the container (presumably also the caller) will remove the item from the container prior to deleting the item.
I have purposely left out a third option, which is actually the option you took in your first code example. That is to allocate a new item and copy it. The reason I left it out is because the caller can do that.
The other reason for leaving it out is that you may want a container that can take non-pointer types. Requiring it to be a pointer by always using T* instead of T may not be as flexible as you want. There are times when you should force it to be a pointer, but I can't think of any use (off the top of my head) for doing this for a container.
If you allow the user to declare Dlist<MyClass*> instead of Dlist<MyClass> then the owner of that list is implicitly aware that it is using pointers and this forces them to assume scenario Number 2 from above.
Anyway, here are your examples with some commentary:
1. Do not allocate a new T item unless you have a very good reason. That reason may simply be encapsulation. Although I mentioned above that you shouldn't do this, there are times when you may want to. If there is no copy constructor, then your class is probably plain-old-data. If copying is non-trivial, you should follow the Rule of Three.
void Dlist::insertFront(T* o) {
node* insert = new node();
insert->o = new T(*o); //<-- Follow rule of three
insert->next = first;
insert->prev = last;
first = insert;
}
2. This is what you would normally do
insert->o = o;
3. You must not delete your item after inserting. Either pass ownership to your container, or delete the item when neither you nor the container requires it anymore.
Object* item = new Object();
dlist.insertFront(item);
delete item; //<-- The item in the list is now invalid
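For scenario 1 (the container takes ownership), a minimal sketch could look like the following. OwningDlist, front() and back() are hypothetical names added for illustration. The sketch also sets prev to null for the new front node and updates last when the list was empty, which is how a front insertion is usually linked; copying is disabled to keep it short, although the spec quoted in the question asks for a deep copy.

```cpp
// Hypothetical minimal Dlist: insertFront stores the pointer it is given,
// and the destructor deletes every stored object.
template <typename T>
class OwningDlist {
    struct node {
        node* next;
        node* prev;
        T* o;
    };
    node* first = nullptr;
    node* last = nullptr;

public:
    OwningDlist() = default;
    // Copying is disabled here; a full version would deep copy per the spec.
    OwningDlist(const OwningDlist&) = delete;
    OwningDlist& operator=(const OwningDlist&) = delete;

    ~OwningDlist() {
        while (first != nullptr) {
            node* victim = first;
            first = first->next;
            delete victim->o;  // the list owns the object
            delete victim;
        }
    }

    // Takes ownership of `o`; the caller must not delete it afterwards.
    void insertFront(T* o) {
        node* insert = new node{first, nullptr, o};
        if (first != nullptr) first->prev = insert;
        else last = insert;  // list was empty, so the new node is also the last
        first = insert;
    }

    T* front() const { return first ? first->o : nullptr; }
    T* back() const { return last ? last->o : nullptr; }
};
```

Since no copy of T is made, T does not need a copy constructor for insertion, and the only rule the caller has to follow is not to delete a pointer after handing it over.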

Some pointers for creating a singly linked list in C++

I have a uni assignment in which I have to implement a singly linked list that contains different objects that are derived from a common abstract base class called Shape.
I'll link to GitHub for the class implementation: shapes.h , shapes.cpp. So far it consists of Shape and its derived class Circle. There'll also be Rectangle, Point and Polygon later.
I should now implement a singly linked list of these different kinds of shapes. So far I've come up with the following class prototype for the List-class and the Node-class:
class Node
{
public:
Node() {}
friend class ShapeList;
private:
Shape* data;
Node* nextNode;
};
class ShapeList
{
public:
ShapeList(){head = NULL;}
void Append(Shape& inData);
private:
Node* head;
};
It should be possible to add elements to a ShapeList object by calling void Append(Shape& inData) from main in the following style:
ShapeList list1;
list1.Append( Circle(5,5,5) );
list1.Append( Rectangle( 4, 10, 2, 4) );
Given this information, how should I go about implementing void Append(Shape& inData)? I've tried several different approaches, but haven't come up with the correct solution so far.
It's also completely possible that the parameter to Append should be something else than (Shape& inData).
edit:
I've implemented Append(Shape& inData) but it works only sometimes:
Circle circle1;
ShapeList list1;
list1.Append( circle1 );
but not with
ShapeList list1;
list1.Append ( Circle(5,5,5) );
So far my Append()-implementation looks as follows:
void ShapeList::Append(Shape& inData)
{
//Create a new node
Node* newNode = new Node();
newNode->data=&inData;
newNode->nextNode=NULL;
//Create a temp pointer
Node *tmp = head;
if (tmp != NULL)
{
//Nodes already present in the list
//Traverse to the end of the list
while(tmp->nextNode != NULL)
tmp = tmp->nextNode;
tmp->nextNode=newNode;
}
else
head=newNode;
}
Does that look ok to you guys?
Since this is tagged under 'homework', I will only point you in the right direction. This may be too basic, or maybe it is enough for your needs...
In a typical situation, you would simply use a container that is already written such as std::list.
But for implementing your own linked list:
When you start from the head member of the ShapeList, you should be able to traverse the entire list and find a node for which 'nextNode' has never been assigned.
This is where you want to add a new node.
Now there are a few tricks to make things work:
1- In C++, variables are not automatically initialized. You must therefore initialize the members yourself when you create a new node, especially the next node pointer.
2- Instead of storing pointers to the referenced objects, I suggest that you either create copies of the Shapes or use some kind of smart pointer to avoid copying.
3- Don't forget about memory management: when you destroy your linked list, you will have to destroy all nodes individually.
One very nice implementation of the singly linked list is as a circular list with the "head" pointer pointing at the tail. This makes it easy to insert at either the front or append to the end: in either case you create a fresh node, make the current tail point to it, and make it point to the current head, and then in the insert case make the head pointer point to the new node.
What you appear to be missing (other than what's already been pointed out: allocating, deallocating, and copying the nodes) is a way to know that you've actually created the list. So you'll want to add in some sort of output - either an operator << or a print() routine, which will walk the list, and call your graphical objects' printing mechanisms in order.
You say that it is possible that the argument to Append might not be Shape &data. Given the requirement of the calling convention specified, it should be:
Append( const Shape &data ) // provided shapes have copy constructors
{
Node *newNode = new Node( data ); // requires a constructor of Node that copies data to a freshly allocated location and sticks a pointer to that location in its data field - then Node's destructor needs to release that pointer.
... ( and the code to manipulate the existing list and newNode's next pointer )
}
Among other things this makes responsibility for management clear and simple.
If you have a Node constructor that takes both a pointer to a Node and a Shape, you should be able to do Append in two lines - one allocating the new Node and calling the constructor appropriately, and one modifying a pointer to point to the new node.
I would add - based on your edit - that you absolutely need to do the allocation and copy inside Append.
You probably want Node to be nested inside of ShapeList so its full name will be ShapeList::Node, not just ::Node.
Since Node will own some data remotely, you probably need to define the big three for it.
In line with that, when you push something onto the list, the list will hold a dynamically allocated copy, not the original object.
Edit: Append should take a Shape const & rather than a Shape &. A reference to const can bind to a temporary object, but a reference to non-const cannot, so the calls using parameters that create temporary objects (e.g., list.Append(Circle(5,5,5))) won't compile if the parameter is a reference to non-const object.
I'd also change Node::Node to require that you pass it a parameter or two. As-is, your linked-list code is dealing with the internals of a Node more than I'd like. I'd change it to something like:
Node::Node(Shape const *d, Node *n=NULL) : data(d), nextNode(n) {}
Then in append, instead of:
Node* newNode = new Node();
newNode->data=&inData;
newNode->nextNode=NULL;
You'd use something like:
Node *newNode = new Node(&inData); // or, probably, `... = new Node(inData.clone());`
...and Node's ctor would handle things from there.
Also note that it's easier to add to the beginning of a linked list than to the end (it saves you from walking the whole list). If you really want to add to the end, it's probably worthwhile to save a pointer to the last node you added, so you can go directly to the end, rather than walking the whole list every time.
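Putting those suggestions together, a rough sketch might look like this. The Shape and Circle classes here are hypothetical stand-ins (the real ones live in the asker's shapes.h and may differ), clone() is an assumed virtual copy function, and Size() plus the tail pointer are additions made only for illustration. Copying the list itself is disabled to keep the example short.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the assignment's classes.
struct Shape {
    virtual ~Shape() {}
    virtual Shape* clone() const = 0;  // assumed: returns a heap-allocated copy
};

struct Circle : Shape {
    double x, y, r;
    Circle(double x_, double y_, double r_) : x(x_), y(y_), r(r_) {}
    Shape* clone() const override { return new Circle(*this); }
};

class ShapeList {
    struct Node {
        Shape* data;
        Node* nextNode;
        explicit Node(Shape* d, Node* n = nullptr) : data(d), nextNode(n) {}
        ~Node() { delete data; }  // the node owns its copy of the shape
    };
    Node* head;
    Node* tail;  // remembered so Append does not have to walk the list

public:
    ShapeList() : head(nullptr), tail(nullptr) {}
    ShapeList(const ShapeList&) = delete;  // copying not implemented in this sketch
    ShapeList& operator=(const ShapeList&) = delete;
    ~ShapeList() {
        while (head != nullptr) {
            Node* next = head->nextNode;
            delete head;
            head = next;
        }
    }

    // A reference to const can bind to a temporary such as Circle(5,5,5).
    void Append(const Shape& inData) {
        Node* newNode = new Node(inData.clone());
        if (tail != nullptr) tail->nextNode = newNode;
        else head = newNode;
        tail = newNode;
    }

    std::size_t Size() const {
        std::size_t n = 0;
        for (const Node* p = head; p != nullptr; p = p->nextNode) ++n;
        return n;
    }
};
```

Because the list stores its own clone, it no longer matters whether the argument was a temporary or a named object that goes out of scope later.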
Here is one way to handle the polymorphic requirement (std::shared_ptr), demonstrated with the STL singly linked list...
typedef forward_list<shared_ptr<Shape>> ShapeList;
ShapeList list1;
list1.push_front(make_shared<Circle>(5,5,5)); // forward_list has no push_back
list1.push_front(make_shared<Rectangle>(4, 10, 2, 4));
Here is how it would affect Node:
class Node
{
public:
Node() {}
friend class ShapeList;
private:
shared_ptr<Shape> data;
Node* nextNode;
};
and ShapeList...
class ShapeList
{
public:
ShapeList(){head = NULL;}
void Append(const shared_ptr<Shape>& inData);
private:
Node* head;
};

Can't use a class as template type in another class?

I have a class Stack, using template, one of its methods is "push", which is written below:
template <class T>
void Stack<T>::push(T _data){
Node<T>* temp = new Node<T>;
temp->data = _data;
temp->next = head;
head = temp;
}
The stack works well with int, double, string, char....
But it says
prog.cpp:32: note: synthesized method ‘Node<Tree>::Node()’ first required here
when I use a class "Tree" as data type.
I don't understand, why it works with "string" but not with "Tree", they are both classes, not primitive types.
http://ideone.com/NMxeF
(Ignore the other error, my IDE only gives one error at line 32 and some warnings)
Help!
Edit after reading the actual code (the "note" shown above is fairly misleading about the real problem).
Looking at the code, where you try to use new Node<T>;, that needs a default constructor for T (which in this case is Tree) because your Node template contains an instance of T:
struct Node {
T data; // <--- instance of T, not being initialized in your code.
Node *next;
};
Tree doesn't have a default constructor, so that fails (and the note is showing you where the default constructor would be needed).
You have a few choices about how to fix that. The most obvious would be for a Node to hold either a pointer or a reference to a T instead of containing an actual instance of T.
Another would be to have Node's constructor take a reference to a (probably const) T, and copy that T into the Node:
class Node {
T data;
Node *next;
public:
Node(T const &dat) : data(dat), next(0) {}
};
The choice between these two approaches is fairly fundamental. If you have Node store a pointer/reference to T, then it will be the responsibility of calling code to ensure the passed object remains valid as long as the Node exists. The node and calling code will share access to a single instance of T.
By contrast, if you copy the passed object into the Node, then this copy will be destroyed when the Node is destroyed. The original T (Tree, in your case) you passed to the Node will remain the responsibility of the calling code, and the Node will take responsibility for its copy.
In the usual case, you'd tend to favor the latter -- it gives cleaner semantics, and keeps ownership of the data clear. In the case of a Tree, however, you probably don't want to copy an entire tree into a Node if you can avoid it. One compromise position would be to use something like a Node<shared_ptr<Tree> > instead. The shared_ptr can keep copying fast and cheap, while avoiding writing a Node that's only suitable for a few kinds of objects and situations. That also makes fairly explicit that you're storing only a pointer that gives shared access to the original object.
Do you have a default constructor for Tree? If not, that might be your problem: Node holds in its data member a Tree type that must be default constructed when you call new Node<Tree>.
To fix, you can modify Node's constructor to take data and next as a parameter, so you don't require default constructor on its template type (you still need assignment operator to be available).
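A sketch of that fix is shown below. StackNode, SimpleStack and LabeledTree are hypothetical names standing in for the question's Node, Stack and Tree. In this version the data is copy-constructed in the initializer list, so it relies on T having a copy constructor rather than a default constructor or an assignment operator.

```cpp
#include <string>

// Hypothetical node: data is copy-constructed in the initializer list,
// so T needs a copy constructor but no default constructor.
template <class T>
struct StackNode {
    T data;
    StackNode* next;
    StackNode(const T& d, StackNode* n) : data(d), next(n) {}
};

template <class T>
class SimpleStack {
    StackNode<T>* head;

public:
    SimpleStack() : head(nullptr) {}
    SimpleStack(const SimpleStack&) = delete;  // copying not implemented in this sketch
    SimpleStack& operator=(const SimpleStack&) = delete;
    ~SimpleStack() {
        while (head != nullptr) {
            StackNode<T>* next = head->next;
            delete head;
            head = next;
        }
    }

    void push(const T& data) { head = new StackNode<T>(data, head); }
    const T& top() const { return head->data; }  // precondition: stack is not empty
    bool empty() const { return head == nullptr; }
};

// Stand-in for the question's Tree: it has no default constructor.
struct LabeledTree {
    std::string label;
    explicit LabeledTree(const std::string& l) : label(l) {}
};
```

With this constructor, SimpleStack<LabeledTree> compiles even though LabeledTree cannot be default constructed.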

Creating a deep copy of a doubly-linked list node

I have my node defined something like:
class LLNode
{
public:
std::shared_ptr<LLNode> prev;
std::shared_ptr<LLNode> next;
std::shared_ptr<int> data;
LLNode(void)
: prev(std::shared_ptr<LLNode>(nullptr)),
next(std::shared_ptr<LLNode>(nullptr)),
data(std::shared_ptr<int>(nullptr))
{
}
LLNode(const LLNode &node)
: prev(std::shared_ptr<LLNode>(node.prev == nullptr?nullptr:new LLNode(*node.prev))),
next(std::shared_ptr<LLNode>(node.next == nullptr?nullptr:new LLNode(*node.next))),
data(std::shared_ptr<int>(node.data == nullptr?nullptr:new int(*node.data)))
{
}
};
However, if I have a node which is linked to another node (which obviously will often be the case), copying node A will instantiate a copy of the next node B, which in turn will try to instantiate a copy of node A, which will try to copy node B, etc. etc. until there's a stackoverflow or memory error. This could be fixed by only instantiating a new copy of next (or prev), but then nothing linked previously (or next) to this node will be copied.
Is there a good way to copy a doubly linked list node?
You are making the mistake of trying to copy the whole chain/list from a single node. That does not make much sense to do in the copy ctor of a list node. Make the copy ctor just copy the members' values; do not recurse. Copying the whole chain/list is the job of a LinkedList class.
Just set next and prev to null, regardless of the next and prev values of the node being copied. Write a separate function that copies the node and all its children, which would be used to copy the whole list.
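Both suggestions can be sketched roughly as follows. ChainNode and copy_chain are hypothetical names. Using std::weak_ptr for prev is an assumption made here so that neighboring nodes do not keep each other alive through a shared_ptr cycle; the question's LLNode uses shared_ptr for both links.

```cpp
#include <memory>

// Hypothetical variant of LLNode.
struct ChainNode {
    std::shared_ptr<ChainNode> next;
    std::weak_ptr<ChainNode> prev;  // assumption: weak back link avoids a shared_ptr cycle
    std::shared_ptr<int> data;

    ChainNode() {}
    // Copies the payload only; the copy starts out unlinked.
    ChainNode(const ChainNode& other)
        : data(other.data ? std::make_shared<int>(*other.data) : nullptr) {}
};

// Copies `head` and every node after it, iteratively, and returns the new head.
std::shared_ptr<ChainNode> copy_chain(const std::shared_ptr<ChainNode>& head) {
    std::shared_ptr<ChainNode> new_head;
    std::shared_ptr<ChainNode> tail;  // last node copied so far
    for (std::shared_ptr<ChainNode> cur = head; cur; cur = cur->next) {
        std::shared_ptr<ChainNode> fresh = std::make_shared<ChainNode>(*cur);
        if (tail) {
            fresh->prev = tail;
            tail->next = fresh;
        } else {
            new_head = fresh;
        }
        tail = fresh;
    }
    return new_head;
}
```

The copy constructor never follows a link, so there is nothing to recurse on, and the loop in copy_chain visits each node exactly once.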