C++ vector implementation - removing elements - c++

I'm implementing a vector type. I'm not troubled by the algorithms or the data structure at all, but I am unsure about a remove method. For instance:
bool Remove(Node* node)
{
    /* rearrange all the links and extract the node */
    delete node;
    return true;
}
where node is a pointer to the current node that we are at. But if I delete node then how do I prevent this from happening:
Node* currentNode = MoveToRandNode();
Remove(currentNode);
cout << currentNode->value;
If currentNode were a pointer to a pointer it would be easier but...it's not.

You could add another level of abstraction to your iterator (which is currently a raw pointer).
If you hand out some sort of iterator class instead of a raw pointer, the iterator can be invalidated when its node is removed, so any later attempt to access it fails in a controlled way.
class Iterator {
public:
    Node& operator*() {
        if (node) return *node;
        else throw Something(); // placeholder for an exception type of your choice
    }
private:
    Node* node;
};
Of course this wrapping of a pointer will come at a cost of some overhead (checking the pointer on each deref). So you will have to decide how safe you want to play. Either document as suggested by others or wrap for safety.
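To make the idea above concrete, here is a minimal sketch of such a checked iterator. The Node layout, the valid()/invalidate()/get() helpers, and the Remove(Iterator&) signature are assumptions for illustration, not part of the original code, and the unlinking step is omitted.

```cpp
#include <stdexcept>

// Hypothetical node type, for illustration only.
struct Node {
    int value;
    Node* next;
};

class Iterator {
public:
    explicit Iterator(Node* n = nullptr) : node_(n) {}

    // Throws instead of dereferencing a null (invalidated) pointer.
    Node& operator*() const {
        if (!node_) throw std::logic_error("dereferenced an invalidated iterator");
        return *node_;
    }

    bool valid() const { return node_ != nullptr; }
    Node* get() const { return node_; }
    void invalidate() { node_ = nullptr; }

private:
    Node* node_;
};

// Hypothetical Remove: the relinking of neighbours is left out here.
bool Remove(Iterator& it) {
    if (!it.valid()) return false;
    delete it.get();   // free the node
    it.invalidate();   // make later dereferences throw
    return true;
}
```

Note that only the iterator passed to Remove is invalidated; any copies of it made earlier still hold the old address, so this reduces the risk rather than removing it.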

Step back first. You need to define who "owns" the memory pointed to by the vector. Is it the vector itself, or the code that uses the vector? Once you define this, the answer will be easy - either Remove() method should always delete it or never.
Note that you've just scratched the surface of the possible bugs, and your answer to "who owns it" will help with other possible issues like:
If you copy a vector, do you need to copy the items within it, or just the pointers (i.e. do a deep or a shallow copy)?
When you destroy a vector, should you destroy the items within it?
When you insert an item, should you make a copy of the item, or does the vector take ownership of it?

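Well, you cannot prevent that entirely, but some modifications to your code can improve safety.
Take the pointer by reference, so that Remove can reset the caller's pointer:
bool Remove(Node*& node)
{
    /* rearrange all the links and extract the node */
    delete node;
    node = nullptr;
    return true;
}
Then check for nullptr before using the pointer:
if (currentNode)
    cout << currentNode->value;
Note that this only resets the one pointer that was passed in; any other copies of it still dangle.
You may also want to look at std::shared_ptr.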
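As a rough sketch of the std::shared_ptr suggestion: if the list owns its nodes through std::shared_ptr, the "current node" handle can be a std::weak_ptr, which can be asked whether the node still exists. The weak_ptr handle, the Node layout, and the valueOr helper are my own assumptions here; the answer itself only mentions shared_ptr.

```cpp
#include <memory>

// Hypothetical node type owned through shared_ptr.
struct Node {
    int value = 0;
    std::shared_ptr<Node> next;
};

// Returns the node's value if it is still alive, otherwise the fallback.
int valueOr(const std::weak_ptr<Node>& observer, int fallback) {
    if (std::shared_ptr<Node> node = observer.lock()) {
        return node->value;
    }
    return fallback;
}
```

With this, a caller holding a std::weak_ptr<Node> gets the fallback after the owning shared_ptr has been reset, instead of reading freed memory.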

This is similar to "iterator invalidation". E.g., if you have a std::list l and a std::list::iterator it pointing into that list, and you call l.erase(it), then the iterator it is invalidated -- i.e., if you use it in any way then you get undefined behavior.
So following that example, you should include in your documentation of the Remove method something along the lines: "the pointer node is invalidated, and may not be used or dereferenced after this method returns."
(Of course, you could also just use std::list, and not bother to re-invent the wheel.)
For more info on iterator invalidation, see: http://www.angelikalanger.com/Conferences/Slides/CppInvalidIterators-DevConnections-2002.pdf

In addition to what innochenti wrote:
I think you have to decide what the expected/desired behavior of cout << currentNode->value; is:
Error - (as innochenti wrote, node = nullptr)
Default value - create a node default_value (which has some default value for its value), and after delete node; do node = default_value

Related

C++ - std::list.erase() not removing element

I am having a small issue with a bit of code for a school assignment (I know that's frowned upon here, but I locked myself into using the std::list library and am paying for it). I have a function that has a list of pointers to classes passed into it, along with a particular ID belonging to one of those classes that I want to destroy, shrinking my list. However, with my code, the list is never resized and the values are garbage, which crashes my program. So it looks like the actual class is being removed, but the element is never removed from the list...
If I had time to make my own doubly-linked list implementation, I'd iterate over the list looking for the element that I want to delete. If it is found, create a temporary node pointer and point that to the node I am about to delete. Set the previous node's "next" element to the iterator's "next" element, and then delete the iterator node.
But using the std::list implementation, I'm at a loss as to what to do. Here is what I have so far, where DOCO is a class, and the elements in the list are pointers to instances of that class. I've looked into remove() vs. erase(), and maybe using both would fix it, but I'm not sure how to implement remove() with an iterator like this.
bool DOCO::kill_doco(std::list < DOCO* > docolist, int docoid)
{
    for (std::list<DOCO*>::iterator it = docolist.begin(); it != docolist.end(); )
    {
        if ((*it)->id == docoid)
        {
            delete * it;
            it = docolist.erase(it);
            std::cerr << "item erased\n";
        }
        else
        {
            ++it;
        }
    }
    std::cerr << "leaving kill\n";
    return true;
}
kill_doco(std::list < DOCO* > docolist
This creates a copy of the list, because the parameter is passed by value. The copy is a list of pointers.
You proceed to modify the copy of the list, and delete an element in it.
The original list (which you copied) still has the original pointer, which is now pointing to a deleted object.
The easy fix is:
kill_doco(std::list < DOCO* >& docolist
C++ is a value-oriented language, unlike languages like Java or C#. The name of something refers to an actual value of that thing, not a reference to it.
Pointers are similarly the value of the address of the object.
Reference-like or pointer-like semantics can be achieved in C++. But, unlike Java/C#, by default every object in C++ is an actual value.
People who move from one language to the other (either way) can get confused by this.
The "default" object type in a C++ program is a regular type -- a type that acts like an integer when you copy it around and the like. It is relatively easy to move away from this, but that is the default.
So what you did was akin to:
void clear_bit( int x, int bit ) {
    x = x & ~(1 << bit);
}
and being surprised that the value x you passed in wasn't modified by the function. The "dangling" pointer left in the original list is the 2nd thing that bit you.
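Putting the fix together, a self-contained sketch of the by-reference version might look like the following. The stripped-down DOCO struct, the free-function form, and returning whether anything was erased are assumptions for illustration; the original is a member function that always returns true.

```cpp
#include <list>

// Hypothetical stand-in for the real DOCO class.
struct DOCO {
    int id;
};

// Takes the list by reference, so the caller's list is the one modified.
bool kill_doco(std::list<DOCO*>& docolist, int docoid) {
    bool erased = false;
    for (auto it = docolist.begin(); it != docolist.end(); ) {
        if ((*it)->id == docoid) {
            delete *it;               // destroy the object
            it = docolist.erase(it);  // remove the now-dangling pointer
            erased = true;
        } else {
            ++it;
        }
    }
    return erased;
}
```

Because the pointer is erased from the same list the caller holds, no dangling pointer is left behind.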

Can we implement a linked list without using a head pointer, i.e. by using a simple variable for the head instead of a pointer to the head?

Can we implement a linked list without using a head pointer, that is, by using a simple variable for the head instead of a pointer to the head?
Yes. If you are implementing a circular linked list with a sentinel node, the sentinel node can be the simple variable that also serves as the head.
Alternatively, you could use a std::optional instance to serve as the head.
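For the sentinel approach, a minimal sketch could look like this. The IntList class and its member names are made up for illustration, and it stores int rather than a template parameter to stay short. The sentinel is a plain member that holds links but no value, so no default-constructed element is needed.

```cpp
#include <cstddef>

class IntList {
    // Link-only base: the sentinel is one of these and carries no value.
    struct Node {
        Node* next;
        Node* prev;
    };
    struct ValueNode : Node {
        int value;
    };

public:
    // An empty circular list is a sentinel that points at itself.
    IntList() { sentinel_.next = sentinel_.prev = &sentinel_; }
    IntList(const IntList&) = delete;
    IntList& operator=(const IntList&) = delete;
    ~IntList() { while (!empty()) pop_front(); }

    bool empty() const { return sentinel_.next == &sentinel_; }
    std::size_t size() const { return size_; }

    void push_front(int v) {
        ValueNode* n = new ValueNode;
        n->value = v;
        n->prev = &sentinel_;
        n->next = sentinel_.next;
        sentinel_.next->prev = n;
        sentinel_.next = n;
        ++size_;
    }

    // Precondition: !empty()
    int front() const {
        return static_cast<const ValueNode*>(sentinel_.next)->value;
    }

    // Precondition: !empty()
    void pop_front() {
        Node* n = sentinel_.next;
        sentinel_.next = n->next;
        n->next->prev = &sentinel_;
        delete static_cast<ValueNode*>(n);
        --size_;
    }

private:
    Node sentinel_;          // the head, as a plain variable
    std::size_t size_ = 0;
};
```

Insertion and removal need no special case for the empty list, because every real node always has a valid next and prev.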
In specific cases you could, but in general not. And why would you want to? Here are some reasons I can think of. Take for example this code:
template<class T>
class Node
{
private:
    T value;
    Node<T> *next;
};
template<class T>
class MyLinkedList
{
private:
    bool isEmpty; // indicates whether the list is empty or not
    Node<T> head; // head as a member instead of a pointer
};
But there are several major flaws with this code:
1. You would always need to care about isEmpty when adding or deleting, or doing anything else with the list.
2. You can't initialize head if T has no default constructor.
3. When deleting the last element, you have to call the destructor of an object that technically remains in scope.
4. When deleting the last element and then destroying the empty list, the destructor of Node::value will be called twice.
I don't know if those are all the reasons, but I think #2 alone is a big enough problem to not consider this.
Of course you could use std::optional, which works even without a default constructor, so it could be an alternative. However, it would be used in much the same way as a (smart) pointer, so it's not really "a simple variable of the head".

How to "completely" delete all nodes in a linked list?

I got curious about how pointers and deleting pointers worked in C++ so I set up an experiment. I made a very simple singly linked list and the following recursive function that deletes all nodes in a list:
void deleteList(Node *node) {
    if (!node)
        return;
    deleteList(node->next);
    cout << "deleting node " << node->data << endl;
    delete node;
    node = nullptr;
}
I suppose that this successfully deletes all nodes in a list. However, after calling this function in main, I check to see if the head node still exists:
List list;
// appending a bunch of numbers to the list...
list.deleteList(list.head);
if (list.head)
    cout << true;
This prints 1 to the console, meaning that the head does indeed still exist. I would expect the head, and all other nodes after it, to be null and hence the if condition to fail, since setting pointers to null is the last thing I do in the recursive function. So why does the program report that the head still exists?
edit: changed List list(); to List list;
You freed the memory, but the assignment to nullptr only affected the copy of the pointer passed to the function, not the original pointer in the caller.
If you declared the function as receiving the pointer by reference:
void deleteList(Node *&node) {
then the assignment of node = nullptr; would affect the caller as well.
Mind you, since you tagged this C++11, it's usually much simpler to just define the linked list as a series of std::unique_ptr<Node>s in the forward direction (raw pointers in the reverse direction if it's bidirectional to avoid reference cycles), so you can avoid the need for a special deleter function, and just set the head pointer to nullptr and let C++ do the work of cascading the deletion.
Edit: There is a flaw to letting std::unique_ptr do the work; as pointed out in the comments, this means the list size is effectively limited by the stack, and too large lists will cause stack overflow when you delete them. So explicitly clearing one by one (the simplest approach being to implement popping properly, and have clearing simply be popping until head is converted to a nullptr by the popleft method) would be safer. I left the original suggestion in place for posterity, so this explanation makes sense.
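A minimal sketch of the std::unique_ptr layout with the iterative clearing described in the edit above. The List and Node definitions and the pop_front name are assumptions for illustration.

```cpp
#include <memory>
#include <utility>

struct Node {
    int data = 0;
    std::unique_ptr<Node> next;  // each node owns its successor
};

struct List {
    std::unique_ptr<Node> head;

    void push_front(int v) {
        auto n = std::make_unique<Node>();
        n->data = v;
        n->next = std::move(head);
        head = std::move(n);
    }

    // Detaches and destroys the first node only; returns false if empty.
    bool pop_front() {
        if (!head) return false;
        head = std::move(head->next);  // old head is freed with an empty next
        return true;
    }

    // Pops one node at a time, so destruction never recurses down the chain.
    void clear() {
        while (pop_front()) {}
    }

    ~List() { clear(); }
};
```

After clear() the head is null by construction, and because each node is released with an empty next pointer, the destruction depth stays constant regardless of list length.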

std::forward_list -- erasing with a stored iterator

I'm trying to keep a global list of a particular (base) class's instances so that I can track them down by iterating through this global list at any time.
I believe the most proper way to address this is with an intrusive list. I have heard that one can encounter these creatures by digging into the Linux kernel, for example.
In the situation where I'm in, I don't really need such guarantees of performance, and using intrusive lists will complicate matters somewhat for me.
Here's what I've got so far to implement this concept of a class that knows about all of its instances.
class A {
    static std::forward_list<A*> globallist;
    std::forward_list<A*>::iterator listhandle;
public:
    A() {
        globallist.push_front(this);
        listhandle = globallist.begin();
    }
    virtual ~A() {
        globallist.erase_after(...); // problem
    }
};
The problem is that there is no forward_list::erase(), and it really does not appear like saving globallist.before_begin() in the ctor would do me much good. I'm never supposed to dereference before_begin()'s iterator. Will it actually hold on to the position? If I save out before_begin's iterator, and then push_front() a new item, that iterator is probably still not capable of being dereferenced, but will it be serviceable for sending to erase_after()?
forward_list is a singly linked list. To remove a node in the middle of it, you must somehow have an iterator to the previous node. For example, you could do something like this:
class A {
    static std::forward_list<A*> globallist;
    std::forward_list<A*>::iterator prev_node;
public:
    A() {
        A* old_head = globallist.front();
        globallist.push_front(this);
        prev_node = globallist.before_begin();
        old_head->prev_node = globallist.begin();
    }
};
The case of pushing the first element into an empty list, as well as the removal logic, are left as an exercise for the reader (when removing, copy your prev_node to the next node's prev_node).
Or, just use std::list and avoid all this trouble.
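If linear-time removal is acceptable (the question says performance guarantees are not critical), a simpler alternative to storing iterators is forward_list::remove, which erases every element equal to a given value. This is a different technique from the prev_node bookkeeping above. In this sketch the registry() accessor, the function-local static, and live_count() are assumptions for illustration.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>

class A {
public:
    A() { registry().push_front(this); }
    // Copies are distinct instances, so they register themselves too.
    A(const A&) { registry().push_front(this); }
    A& operator=(const A&) = default;

    // O(n) scan: removes this instance's pointer from the global list.
    virtual ~A() { registry().remove(this); }

    static std::size_t live_count() {
        return static_cast<std::size_t>(
            std::distance(registry().begin(), registry().end()));
    }

private:
    // Function-local static avoids a separate out-of-class definition.
    static std::forward_list<A*>& registry() {
        static std::forward_list<A*> list;
        return list;
    }
};
```

This trades the constant-time erase of the stored-iterator scheme for much simpler code, which may be a reasonable choice when the number of live instances is small.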

Implementing a templated doubly linked list of pointers to objects

I'm a little confused about implementing a doubly linked list where the data in the list are pointers.
The private part of my linked list class looks like:
private:
    struct node {
        node* next;
        node* prev;
        T* o;
    };
    node* first; // The pointer to the first node (NULL if none)
    node* last; // The pointer to the last node (NULL if none)
    unsigned int size_;
As you can see, the list is full of pointers to objects rather than just plain old objects, which makes it a little more confusing to me.
The following is the description in the spec:
Note that while this list is templated across the contained type, T, it inserts and removes only pointers to T, not instances of T. This ensures that the Dlist implementation knows that it owns inserted objects, it is responsible for copying them if the list is copied, and it must destroy them if the list is destroyed.
Here is my current implementation of insertFront(T* o):
void Dlist::insertFront(T* o) {
    node* insert = new node();
    insert->o = new T(*o);
    insert->next = first;
    insert->prev = last;
    first = insert;
}
This seems wrong though. What if T doesn't have a copy constructor? And how does this ensure sole ownership of the object in the list?
Could I just do:
insert->o = o;
It seems like this is not safe, because if you had:
Object* item = new Object();
dlist.insertFront(item);
delete item;
Then the item would also be destroyed for the list. Is this correct? Is my understanding off anywhere?
Thanks for reading.
Note: While this looks like homework, it is not. I am actually a Java dev just brushing up on my pointer skills by doing an old school project.
When you have a container of pointers, you have one of the two following usage scenarios:
1. A pointer is given to the container and the container takes responsibility for deleting the pointer when the containing structure is deleted.
2. A pointer is given to the container but owned by the caller. The caller takes responsibility for deleting the pointer when it is no longer needed.
Number 1 above is quite straightforward.
In the case of number 2, it is expected that the owner of the container (presumably also the caller) will remove the item from the container prior to deleting the item.
I have purposely left out a third option, which is actually the option you took in your first code example. That is to allocate a new item and copy it. The reason I left it out is because the caller can do that.
The other reason for leaving it out is that you may want a container that can take non-pointer types. Requiring it to be a pointer by always using T* instead of T may not be as flexible as you want. There are times when you should force it to be a pointer, but I can't think of any use (off the top of my head) for doing this for a container.
If you allow the user to declare Dlist<MyClass*> instead of Dlist<MyClass> then the owner of that list is implicitly aware that it is using pointers and this forces them to assume scenario Number 2 from above.
Anyway, here are your examples with some commentary:
1. Do not allocate a new T item unless you have a very good reason. That reason may simply be encapsulation. Although I mentioned above that you shouldn't do this, there are times when you may want to. If there is no copy constructor, then your class is probably plain-old-data. If copying is non-trivial, you should follow the Rule of Three.
void Dlist::insertFront(T* o) {
    node* insert = new node();
    insert->o = new T(*o); //<-- Follow rule of three
    insert->next = first;
    insert->prev = last;
    first = insert;
}
2. This is what you would normally do
insert->o = o;
3. You must not delete your item after inserting. Either pass ownership to your container, or delete the item when neither you nor the container requires it anymore.
Object* item = new Object();
dlist.insertFront(item);
delete item; //<-- The item in the list is now invalid
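One way to make scenario 1 (the container owns the object) explicit in the interface is to pass std::unique_ptr instead of a raw T*. This is a sketch under that assumption, not the signature from the spec quoted in the question; removeFront and size are also added here for illustration, and copying is simply disabled to keep it short.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

template <class T>
class Dlist {
    struct node {
        node* next = nullptr;
        node* prev = nullptr;
        std::unique_ptr<T> o;  // the list owns the object
    };

public:
    Dlist() = default;
    Dlist(const Dlist&) = delete;
    Dlist& operator=(const Dlist&) = delete;
    ~Dlist() { while (first) removeFront(); }

    // The caller must hand over ownership; no copy of T is made.
    void insertFront(std::unique_ptr<T> o) {
        node* n = new node;
        n->o = std::move(o);
        n->next = first;
        if (first) first->prev = n;
        else last = n;           // list was empty
        first = n;
        ++size_;
    }

    // Returns ownership of the front object to the caller (null if empty).
    std::unique_ptr<T> removeFront() {
        if (!first) return nullptr;
        node* n = first;
        first = n->next;
        if (first) first->prev = nullptr;
        else last = nullptr;     // list is now empty
        std::unique_ptr<T> out = std::move(n->o);
        delete n;
        --size_;
        return out;
    }

    std::size_t size() const { return size_; }

private:
    node* first = nullptr;
    node* last = nullptr;
    std::size_t size_ = 0;
};
```

With this interface, the "insert, then delete item" mistake from example 3 no longer compiles as written, since the caller has to std::move a unique_ptr into the list and has nothing left to delete.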