I wrote a program that creates a linked list, and I get what I assume is undefined behavior (the program just stops without any error) when I grow the list past a certain size and, critically, attempt to delete it (by ending its scope). A basic version of the code is below:
#include <iostream>
#include <memory>
template<typename T> struct Nodeptr;
template<class T>
struct Node {
Nodeptr<T> next;
T data;
Node(const T& data) : data(data) { }
};
template<class T>
struct Nodeptr : public std::shared_ptr<Node<T>> {
Nodeptr() : std::shared_ptr<Node<T>>(nullptr) { }
Nodeptr(const T& data) : std::shared_ptr<Node<T>>(new Node<T>(data)) { }
};
template<class T>
struct LinkedList {
Nodeptr<T> head;
void prepend(const T& data) {
auto new_head = Nodeptr<T>(data);
new_head->next = head;
head = new_head;
}
};
int main() {
int iterations = 10000;
{
LinkedList<float> ls;
std::cout << "START\n";
for(float k = 0.0f; k < iterations; k++) {
ls.prepend(k);
}
std::cout << "COMPLETE\n";
}
std::cout << "DONE\n";
return 0;
}
Right now, when the code is run, START and COMPLETE are printed, while DONE is not. The program exits before that point without any error (for some reason).
When I decrease the variable to, say, 5000 instead of 10000, it works just fine and DONE is printed. When I delete the curly braces around the LinkedList declaration/testing block (taking it out of its smaller scope, so it is NOT destroyed before DONE is printed), everything also works fine and DONE is printed. Therefore, the error must be arising from the deletion process, and specifically from the volume of things being deleted. There is, however, no error message telling me that there is no more space left on the heap, and 10000 floats seems like awfully little to be filling up the heap anyhow. Any help would be appreciated!
Solved! The list now works directly off raw heap pointers, and I changed Node's destructor to prevent recursive calls to it:
~Node() {
    if(!next) return;
    // Walk the chain iteratively, unlinking each node before deleting it,
    // so no delete ever triggers a recursive cascade of ~Node calls.
    Node* ptr = next;
    Node* temp = nullptr;
    while(ptr) {
        temp = ptr->next;
        ptr->next = nullptr;  // the deleted node sees a null next and returns immediately
        delete ptr;
        ptr = temp;
    }
}
It is a stack overflow caused by recursive destructor calls.
This is a common issue with smart pointers that one should be aware of when writing any deeply nested data structure.
You need an explicit destructor for Node that removes the elements iteratively by resetting the smart pointers, starting from the tail of the list. Also follow the rule of 3/5 and do the same for all other operations that might destroy nodes recursively as well.
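For illustration, here is a rough sketch of such a destructor written against the question's Node/Nodeptr types. It unlinks from the front rather than literally from the tail, which avoids the recursion just as well, and it assumes each node has a single owner, as in the question's list:
#include <utility> // std::move
template<class T>
struct Node {
    Nodeptr<T> next;
    T data;
    Node(const T& data) : data(data) { }
    ~Node() {
        // Detach the rest of the chain and release it one node at a time,
        // so no destructor call ever recurses into a long tail.
        Nodeptr<T> current = std::move(next);
        while (current) {
            Nodeptr<T> rest = std::move(current->next); // unhook before the node dies
            current = std::move(rest);                  // the old node is freed here with next == null
        }
    }
};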
Because this essentially rewrites all of the object destruction by hand, it does, however, make the use of smart pointers in the first place somewhat questionable, although there is still some benefit in preventing double-delete (and leak) mistakes. It is therefore common to not use smart pointers in such a situation at all and instead fall back to raw pointers and manual lifetime management for the data structure's nodes.
Also, there is no need to use std::shared_ptr here in any case. There is only one owner per node. It should be std::unique_ptr. std::shared_ptr has a very significant performance impact and also has the issue of potentially causing leaks if circular references are formed (whether intentionally or not).
I also don't see any point in having Nodeptr as a separate class. It seems to exist just to be a shorthand for std::make_shared<Node>, which could be achieved with a plain function instead. Inheriting from a standard library smart pointer in particular seems dubious; if anything, I would use composition instead.
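Putting those points together, here is a hedged sketch of the shape this could take: std::unique_ptr ownership, a plain make_node factory function instead of the Nodeptr subclass, and an iterative destructor on the list so that no teardown recurses:
#include <memory>
#include <utility>

template<class T>
struct Node {
    std::unique_ptr<Node> next;   // sole owner of the rest of the list
    T data;
    explicit Node(const T& data) : data(data) {}
};

// Replaces the Nodeptr class: a plain factory function is enough.
template<class T>
std::unique_ptr<Node<T>> make_node(const T& data) {
    return std::make_unique<Node<T>>(data);
}

template<class T>
struct LinkedList {
    std::unique_ptr<Node<T>> head;

    void prepend(const T& data) {
        auto new_head = make_node(data);
        new_head->next = std::move(head);
        head = std::move(new_head);
    }

    ~LinkedList() {
        // Iterative teardown: each step destroys exactly one node,
        // so the destructor never recurses down the whole list.
        while (head)
            head = std::move(head->next);
    }
};
Usage stays the same as in the question (LinkedList<float> ls; ls.prepend(k);), but the teardown no longer recurses and there is no extra pointer class to maintain.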
Related
How to deal with memory leaking with template classes in C++?
In this code I defined 4 template classes:
class node and class linked_list make up a doubly linked list
class item and class bag just make up another doubly linked list
These template classes are designed to deal with objects of various classes.
In the main function, I first created a linked_list<string> and a bag<int> and everything is fine.
But when I try to make a bag<linked_list<string>>, problems arise.
I tried to trace back to see what happened, and I saw that in the function push_back in class bag, a destructor of linked_list has been called that erased all the data in the input v. I don't know why that happens.
Note that I wrote destructors for all the classes and called className.~className() in the main function to prevent memory leaks.
And that does work to prevent the leaks from ls_1 and bag_1.
I don't know which part of my code is wrong. Can somebody help me?
#include <iostream>
#include <stdlib.h>
#include <string>
using namespace std;
//node and linked_list class make up a doubly linked list
template<class T> class node {
public:
T value;
node<T> * next;
node<T> * previous;
node<T>() { next = nullptr; previous = nullptr; }
node<T>(T v) { value = v; next = nullptr; previous = nullptr; }
~node<T>() { delete next; }
};
template<class T> class linked_list { //doubly linked list
public:
node<T> * head;
node<T> * tail;
linked_list<T>() { head = nullptr; tail = nullptr; }
~linked_list<T>() { delete head; }
void push_front(T v) { //insert an item to the front
node<T> * p = new node<T>(v);
p->next = head;
head = p;
if (tail == nullptr) {
tail = p;
}
}
};
//item and bag class just make up another doubly linked list
template<class X> class item {
public:
X value;
item<X> *next;
item<X> *previous;
item<X>(X v) { value = v; next = nullptr; previous = nullptr; }
~item<X>() { delete next; }
};
template<class X> class bag { //just another doubly linked list
public:
item<X> *last;
item<X> *first;
int num_items;
int size() { return num_items; }
bag() { last = nullptr; first = nullptr; num_items = 0; }
~bag() { delete first; }
void push_back(X v) { //insert an item to the back
item<X> * p = new item<X>(v);
if (num_items == 0) {
last = first = p;
}
else {
last->next = p;
p->previous = last;
last = p;
}
num_items++;
last->next = nullptr;
}
};
int main() {
//When using built-in classes (like strings) as input
//there's no problem at all
linked_list<string> ls_1;
ls_1.push_front("David");
ls_1.push_front("John");
bag<int> bag_1;
bag_1.push_back(1);
bag_1.push_back(2);
//Problems arise here when using user defined classes (linked_list) as input
//I traced back and found out that a destructor has been called
//that erases all the data in the input. Donno how that happens
bag<linked_list<string>> bag_string;
bag_string.push_back(ls_1);
//These lines are to prevent the memory leaking
//I overwrote destructors for linked_list and node class
//otherwise there's still memory leaking
ls_1.~linked_list();
bag_1.~bag();
bag_string.~bag();
_CrtDumpMemoryLeaks();
getchar();
getchar();
}
Implement node, linked_list, item, bag copy constructors and assignments, or declare them as deleted. The default versions generated by the compiler do not do deep copying, and that leads to multiple deletes of the same objects after they have been copied.
Read the rule of three/five/zero for full details.
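For the "declare them as deleted" option, a minimal hedged sketch reusing the question's node<T>; the two deleted declarations are the only point here:
template<class T> class linked_list {
public:
    node<T>* head = nullptr;
    node<T>* tail = nullptr;
    linked_list() = default;
    linked_list(const linked_list&) = delete;            // forbid shallow copies
    linked_list& operator=(const linked_list&) = delete; // forbid shallow copy assignment
    ~linked_list() { delete head; }
    // push_front etc. as in the question
};
With that in place, passing a linked_list by value (as bag::push_back does) becomes a compile error instead of a silent double delete.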
A bit off-topic, but making a list node delete its siblings is a classic gotcha: for a sufficiently long list it ends up calling ~node<T>() recursively until it exhausts the stack. And this is the reason node pointers cannot simply be smart pointers.
A fix would be to have a default destructor for nodes and make the list destroy the nodes in a loop, rather than recursively.
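A hedged sketch of that fix applied to the question's classes: the node destructor becomes trivial and the list does the walking:
template<class T> class node {
public:
    T value;
    node<T>* next = nullptr;
    node<T>* previous = nullptr;
    node() = default;
    node(T v) : value(v) {}
    ~node() = default;            // no longer deletes its neighbours
};

template<class T> class linked_list {
public:
    node<T>* head = nullptr;
    node<T>* tail = nullptr;
    ~linked_list() {
        // Delete one node per loop iteration instead of recursing through next.
        node<T>* p = head;
        while (p) {
            node<T>* next = p->next;
            delete p;
            p = next;
        }
    }
    // push_front etc. unchanged
};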
You may also like to use a full list node as the head of the list, one that points to itself when the list is empty. That removes the nullptr-checking logic completely.
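Roughly, the self-pointing head idea looks like this (an illustrative sketch, not the question's code; it assumes T is default-constructible for the sentinel's unused value):
#include <utility>

template<class T> struct dnode {
    T value{};
    dnode* next;
    dnode* previous;
    dnode() : next(this), previous(this) {}                  // the sentinel links to itself
    explicit dnode(T v) : value(std::move(v)), next(nullptr), previous(nullptr) {}
};

template<class T> struct dlist {
    dnode<T> sentinel;                                       // always present, never null

    void push_back(T v) {
        auto* p = new dnode<T>(std::move(v));
        p->previous = sentinel.previous;                     // old last node, or the sentinel itself
        p->next = &sentinel;
        sentinel.previous->next = p;
        sentinel.previous = p;                               // no nullptr checks anywhere
    }

    ~dlist() {
        dnode<T>* p = sentinel.next;
        while (p != &sentinel) {                             // iterative teardown
            dnode<T>* next = p->next;
            delete p;
            p = next;
        }
    }
};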
I tried to trace back to see what happened, and I saw that in the function push_back in class bag, a destructor of linked_list has been called that erased all the data in the input v
Yes, this happens because your bag::push_back() takes its argument by value. This means it creates a copy of the ls_1 you created in main. You have not specified how to "copy" a list, so the compiler generated this function (a copy constructor) automatically. It can do that because your linked_list only contains two pointers, so the compiler assumes (because you have not told it otherwise) that copying the pointers over is all that is necessary to generate a copy of a linked_list. Unfortunately, that is not correct.
You now have two lists that manage the same contents: The original ls_1 in main() and the function argument v in push_back() - they both contain the same pointers.
Then the same thing happens again in your item constructor: You make a local copy of the list that holds the same pointers as the first two.
You now have several list objects pointing to the same data. Each one will try to destroy the data once it dies. This results in undefined behavior.
To correct this, you need to figure out how copying of a list should work. This is (in part) what the rule linked in the other comment is about: If the destructor of your class is not trivial (i.e. the compiler-generated version would not be sufficient, most likely because you need to release a resource like allocated memory), you should/must always care about how to handle your class being copied around. The various mechanisms that may invoke copy-like behavior (assignment, copy constructor, plus move versions in newer C++) need to be specified (or forbidden) by you.
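If copying is supposed to be allowed, one hedged way to spell it out for the question's linked_list<T> (assuming the copy constructor is declared inside the class; the copy assignment operator and, in newer C++, the move operations need the same treatment):
template<class T>
linked_list<T>::linked_list(const linked_list<T>& other) : head(nullptr), tail(nullptr) {
    // Duplicate every node, so the copy owns its own chain.
    for (node<T>* p = other.head; p != nullptr; p = p->next) {
        node<T>* n = new node<T>(p->value);
        n->previous = tail;
        if (tail) tail->next = n; else head = n;
        tail = n;
    }
}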
I am working on a medium sized C++ framework making use of the visitor pattern.
A valgrind test of a program implementing this framework reported a number of memory leaks that could be tracked down to one of the visitors, namely the copyCreator.
template<typename copyNodeType>
struct copyCreator {
copyCreator() {}
copyCreator(node * firstVisit) {
firstVisit->accept(*this);
}
~copyCreator() {
copy.reset();
for(auto ptr : openList) {
delete ptr;
}
}
std::unique_ptr<copyNodeType> copy = 0;
vector<nonterminalNode *> openList;
// push to tree
template<typename nodeType>
void push(nodeType * ptr) {
if (copy) {
// if root is set, append to tree
openList.back()->add_child(ptr);
}
else {
auto temp = dynamic_cast<copyNodeType *>(ptr);
if(temp) {
copy = std::unique_ptr<copyNodeType>(temp);
}
}
}
// ...
void visit(struct someNonterminalNode & nod) {
auto next = new someNonterminalNode(); //This is leaked
push(next);
openList.push_back(next);
nod.child->accept(*this);
openList.pop_back();
};
There are two main reasons why I am confused about this:
The two different constructors cause a different number of leaks
The leaks are reported to occur during visits
The accept methods of all nodes simply trigger a standard double dispatch to the visit method of the correct visitor.
I am fairly new to C++ programming and might have overlooked some really fundamental issue.
copyCreator<nodeType>::push(ptr) is supposed to take ownership of ptr. But it fails to do so if (a) ptr is not of type nodeType* (as determined by dynamic_cast), and (b) no node of type nodeType has been visited yet.
In other words, copyCreator<nodeType> creates, and promptly leaks, copies of all nodes until it encounters one of type nodeType.
This is precisely what happens in copyCreator<programNode> cpy2(&globalScope, a);, where a is forallNode*. cpy2 expects to encounter programNode (which it never does), and meanwhile, it copies and leaks all other nodes.
This is more of a design problem (I know why this is happening; I just want to see how people deal with it). Suppose I have a simple linked list struct:
struct List {
int head;
std::shared_ptr<List> tail;
};
The shared_ptr enables sharing of sublists between multiple lists. However, when the list gets very long, a stack overflow might happen in its destructor (caused by recursive releases of shared_ptrs). I've tried using an explicit stack, but that gets very tricky since a tail can be owned by multiple lists. How can I design my List to avoid this problem?
UPDATE: To clarify, I'm not reinventing the wheel (std::forward_list). The List above is only a simplified version of the real data structure. The real data structure is a directed acyclic graph, which if you think about it is just a lot of linked lists with shared tails/heads. It's usually prohibitively expensive to copy the graph, so data sharing is necessary.
UPDATE 2: I'm thinking about explicitly traversing down the pointer chain and std::move as I go. Something like:
~List()
{
auto p = std::move(tail);
while (p->tail != nullptr && p->tail.use_count() == 1) {
// Some other thread may start pointing to `p->tail`
// and increases its use count before the next line
p = std::move(p->tail);
}
}
This seems to work in a single thread, but I'm worried about thread safety.
If you're having problems with stack overflows on destruction for your linked data structure, the easiest fix is just to implement deferred cleanup:
struct Graph {
std::shared_ptr<Graph> p1, p2, p3; // some pointers in your datastructure
static std::list<std::shared_ptr<Graph>> deferred_cleanup;
~Graph() {
deferred_cleanup.emplace_back(std::move(p1));
deferred_cleanup.emplace_back(std::move(p2));
deferred_cleanup.emplace_back(std::move(p3));
}
static void cleanup() {
while (!deferred_cleanup.empty()) {
std::list<std::shared_ptr<Graph>> tmp;
std::swap(tmp, deferred_cleanup);
tmp.clear(); } }
};
and you just need to remember to call Graph::cleanup(); periodically.
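For completeness, a hedged usage sketch: the static member still needs its one out-of-class definition, and cleanup() is called wherever it suits the program (the chain length below is only for illustration):
#include <list>
#include <memory>

// struct Graph exactly as defined in the answer above

std::list<std::shared_ptr<Graph>> Graph::deferred_cleanup;  // required definition

int main() {
    {
        auto g = std::make_shared<Graph>();
        for (int i = 0; i < 100000; ++i) {       // build a very long chain
            auto next = std::make_shared<Graph>();
            next->p1 = g;
            g = next;
        }
    }                                            // destructors only defer, no deep recursion
    Graph::cleanup();                            // the whole chain is released iteratively here
}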
This should do it. With a little work it can easily be made thread-safe (a little locking/atomics in the deleter engine).
Synopsis:
The shared_ptrs to the nodes are created with a custom deleter which, rather than deleting the node, hands it off to a deleter engine.
The engine's implementation is a singleton. Upon being notified of a new node to be deleted, it adds the node to a delete queue. If there is no node being deleted, the nodes in the queue are deleted in turn (no recursion).
While this is happening, new nodes arriving in the engine are simply added to the back of the queue. The in-progress delete cycle will take care of them soon enough.
#include <memory>
#include <deque>
#include <stdexcept>
#include <iostream>
struct node;
struct delete_engine
{
void queue_for_delete(std::unique_ptr<node> p);
struct impl;
static impl& get_impl();
};
struct node
{
node(int d) : data(d) {}
~node() {
std::cout << "deleting node " << data << std::endl;
}
static std::shared_ptr<node> create(int d) {
return { new node(d),
[](node* p) {
auto eng = delete_engine();
eng.queue_for_delete(std::unique_ptr<node>(p));
}};
}
int data;
std::shared_ptr<node> child;
};
struct delete_engine::impl
{
bool _deleting { false };
std::deque<std::unique_ptr<node>> _delete_list;
void queue_for_delete(std::unique_ptr<node> p)
{
_delete_list.push_front(std::move(p));
if (!_deleting)
{
_deleting = true;
while(!_delete_list.empty())
{
_delete_list.pop_back();
}
_deleting = false;
}
}
};
auto delete_engine::get_impl() -> impl&
{
static impl _{};
return _;
}
void delete_engine::queue_for_delete(std::unique_ptr<node> p)
{
get_impl().queue_for_delete(std::move(p));
}
struct tree
{
std::shared_ptr<node> root;
auto add_child(int data)
{
if (root) {
throw std::logic_error("already have a root");
}
auto n = node::create(data);
root = n;
return n;
}
};
int main()
{
tree t;
auto pc = t.add_child(6);
pc = pc->child = node::create(7);
}
std::shared_ptr (and before that, boost::shared_ptr) is and was the de-facto standard for building dynamic systems involving massive DAGs.
In reality, DAGs don't get that deep (maybe 10 or 12 algorithms deep in your average FX pricing server?) so the recursive deletes are not a problem.
If you're thinking of building an enormous DAG with a depth of 10,000 then it might start to be a problem, but to be honest I think it will be the least of your worries.
Re the analogy of a DAG being like a linked list: not really. Since it's acyclic, all your pointers pointing "up" will need to be shared_ptrs, and all your back-pointers (e.g. binding message subscriptions to sink algorithms) will need to be weak_ptrs, which you lock as you fire the message.
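A tiny illustration of that convention (the names are made up for the example, not taken from any particular framework):
#include <iostream>
#include <memory>
#include <vector>

struct Sink {
    void on_message(int msg) { std::cout << "got " << msg << "\n"; }
};

struct Source {
    std::vector<std::weak_ptr<Sink>> subscribers;   // back-pointers: weak, non-owning

    void publish(int msg) {
        for (auto& w : subscribers)
            if (auto s = w.lock())                  // take temporary ownership while firing
                s->on_message(msg);                 // a dead sink is simply skipped
    }
};

int main() {
    auto sink = std::make_shared<Sink>();           // owned "up" the graph via shared_ptr
    Source src;
    src.subscribers.push_back(sink);                // stored as weak_ptr
    src.publish(42);
    sink.reset();                                   // the subscriber goes away
    src.publish(43);                                // safely skipped, no dangling access
}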
disclaimer: I've spent a lot of time designing and building information systems based on directed acyclic graphs of parameterised algorithm components, with a great deal of sharing of common components (i.e. same algorithm with same parameters).
Performance of the graph is never an issue. The bottlenecks are:
initially building the graph when the program starts - there's a lot of noise at that point, but it only happens once.
getting data into and out of the process (usually a message bus). This is invariably the bottleneck as it involves I/O.
I have been making games for a few years using the GM:S engine (though I assure you I'm not some newbie who uses drag and drop, as is all too often the case), and I have decided to start learning C++ on its own, you know, to expand my knowledge and all that good stuff =D
While doing this, I have been attempting to make a list class as a practice project: a set of nodes linked together that I can loop through to get a value at an index. Well, here is my code, and I ask because it has a single major issue that I struggle to understand.
template<class type>
class ListNode
{
public:
type content;
ListNode<type>* next;
ListNode<type>* prev;
ListNode(type content) : content(content), next(NULL), prev(NULL) {}
protected:
private:
};
template<class type>
class List
{
public:
List() : SIZE(0), start(NULL), last(NULL) {}
unsigned int Add(type value)
{
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
else
{
ListNode<type> a(value);
this->last->next = &a;
a.prev = this->last;
this->last = &a;
}
this->SIZE++;
return (this->SIZE - 1);
}
type Find(unsigned int pos)
{
ListNode<type>* a = this->start;
for(unsigned int i = 0; i<this->SIZE; i++)
{
if (i < pos)
{
a = a->next;
continue;
}
else
{
return (*a).content;
}
continue;
}
}
protected:
private:
unsigned int SIZE;
ListNode<type>* start;
ListNode<type>* last;
};
Regardless, to me at least, this code looks fine, and it works in that I am able to create a new list without crashing, as well as add elements to it with the proper index of those elements being returned. However, the problem arises when getting the value of an element from the list itself: when I ran the following test code, it didn't give me what it was built to give me.
List<int> a;
unsigned int b = a.Add(313);
unsigned int c = a.Add(433);
print<unsigned int>(b);
print<int>(a.Find(b));
print<unsigned int>(c);
print<int>(a.Find(c));
Now, this code I expected to give me
0
313
1
433
as that's what it has been told to do. However, it only half does this, giving me
0
2686684
1
2686584
Now, this is where I am at a loss. I assume that the values provided are some kind of pointer address, but I simply don't understand what they are meant to be, what is causing the values to come out like that, or why.
Hence I ask the internet: what is causing these values to be given? I am quite confused at this point.
My apologies if that was a tad long and rambling; I tend to write such things often =D
Thanks =D
You have lots of undefined behavior in your code: you store pointers to local variables and later dereference those pointers. Local variables are destroyed once the scope they were declared in ends.
Example:
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
Once the closing brace is reached, the scope of the if body ends and the variable a is destroyed. The pointer to this variable is now a dangling (stray) pointer, and using it in any way will lead to undefined behavior.
The solution is to allocate the objects dynamically using new:
auto* a = new ListNode<type>(value);
Or if you don't have a C++11 capable compiler
ListNode<type>* a = new ListNode<type>(value);
First suggestion: use valgrind or a similar memory checker to execute this program. You will probably find there are many memory errors caused by dereferencing stack pointers that are out of scope.
Second suggestion: learn about the difference between objects on the stack and objects on the heap. (Hint: you want to use heap objects here.)
Third suggestion: learn about the concept of "ownership" of pointers. Usually you want to be very clear which pointer variable should be used to delete an object. The best way to do this is to use the std::unique_ptr smart pointer. For example, you could decide that each ListNode is owned by its predecessor:
std::unique_ptr<ListNode<type>> next;
ListNode<type>* prev;
and that the List container owns the head node of the list
std::unique_ptr<ListNode<type>> start;
ListNode<type>* last;
This way the compiler will do a lot of your work for you at compile time, and you won't have to depend so much on using valgrind at runtime.
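A hedged sketch of how that ownership split could look for this List (only the members and Add are shown; Find would then advance with a = a->next.get(); instead of a = a->next;):
#include <memory>
#include <utility>

template<class type>
struct ListNode {
    type content;
    std::unique_ptr<ListNode<type>> next;   // owns its successor
    ListNode<type>* prev = nullptr;         // non-owning back-pointer
    explicit ListNode(type c) : content(std::move(c)) {}
};

template<class type>
class List {
public:
    unsigned int Add(type value) {
        auto node = std::make_unique<ListNode<type>>(std::move(value)); // heap, not stack
        node->prev = last;
        ListNode<type>* raw = node.get();
        if (last)
            last->next = std::move(node);   // the previous tail owns the new node
        else
            start = std::move(node);        // the list owns the first node
        last = raw;
        return SIZE++;                      // index of the element just added
    }

    ~List() {
        while (start)                       // iterative teardown, avoids deep recursion
            start = std::move(start->next);
    }

private:
    unsigned int SIZE = 0;
    std::unique_ptr<ListNode<type>> start;  // the list owns the head
    ListNode<type>* last = nullptr;         // non-owning pointer to the tail
};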
Hello, I am trying to use pointers and learning the basics of unique pointers in C++. Below is my code; I have commented out the lines of code in the main function to debug the problem, but I am unable to do so. What am I missing? Is my move() in insertNode() incorrect? The error I get is below the code:
#include<memory>
#include<iostream>
struct node{
int data;
std::unique_ptr<node> next;
};
void print(std::unique_ptr<node>head){
while (head)
std::cout << head->data<<std::endl;
}
std::unique_ptr<node> insertNode(std::unique_ptr<node>head, int value){
node newNode;
newNode.data = value;
//head is empty
if (!head){
return std::make_unique<node>(newNode);
}
else{
//head points to an existing list
newNode.next = move(head->next);
return std::make_unique<node>(newNode);
}
}
auto main() -> int
{
//std::unique_ptr<node>head;
//for (int i = 1; i < 10; i++){
// //head = insertNode(head, i);
//}
}
ERROR
'std::unique_ptr<_Ty,std::default_delete<_Ty>>::unique_ptr(const std::unique_ptr<_Ty,std::default_delete<_Ty>> &)' : attempting to reference a deleted function
Aside from other small problems, the main issue is this line:
return std::make_unique<node>(newNode);
You are trying to construct a unique pointer to a new node, passing newNode to the copy constructor of node. However, the copy constructor of node is deleted, since node contains a non-copyable type (i.e. std::unique_ptr<node>).
You should pass a std::move(newNode) instead, but this is problematic since you create the node on the stack and it will be destroyed at the exit from the function.
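For reference, a minimal sketch of what it looks like if the node is built on the heap up front while keeping std::unique_ptr (note this version prepends rather than reproducing the OP's exact insertion semantics):
#include <memory>
#include <utility>

struct node {
    int data;
    std::unique_ptr<node> next;
};

// Build the new node on the heap directly, so nothing has to be copied
// out of a stack object; the new node simply becomes the new head.
std::unique_ptr<node> insertNode(std::unique_ptr<node> head, int value) {
    auto newNode = std::make_unique<node>();
    newNode->data = value;
    newNode->next = std::move(head);
    return newNode;
}
It would be called as head = insertNode(std::move(head), i); inside the loop.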
Using a std::unique_ptr here is a bad idea in my opinion, since, for example, to print the list (or insert into the list), you need to std::move the head (so you lose it) and so on. I think you're much better off with a std::shared_ptr.
I was having the same problem and indeed using a shared_ptr works.
Passing the smart pointer by value into the function requires copying the pointer (not the data it points to), and a unique_ptr cannot be copied because its copy constructor is deleted, hence the "attempting to reference a deleted function" error. If you use a shared_ptr instead, the copy simply increments the reference count, and the count is decremented again once you are out of the scope of that function.
The comments on the answers above suggest that using a shared_ptr here is baseless. Those answers were written before the C++17 standard, and it is my understanding that we should be using the most up-to-date version of the language, hence the shared_ptr is appropriate here.
I don't know why we have to expose the node type to the user in any case. The whole thingamajig of C++ is to write more code now in order to write less code later, as one of my tutors said.
We would like to encapsulate everything and leave no head or tail (pun intended) of node visible to the user. A very simplistic interface would look like:
struct list
{
private:
struct node {
int data;
std::unique_ptr<node> next;
node(int data) : data{data}, next{nullptr} {}
};
std::unique_ptr<node> head;
public:
list() : head{nullptr} {};
void push(int data);
int pop();
~list(); // do we need this?
};
The implementation does something like what Ben Voigt mentioned:
void list::push(int data)
{
auto temp{std::make_unique<node>(data)};
if(head)
{
temp->next = std::move(head);
head = std::move(temp);
} else
{
head = std::move(temp);
}
}
int list::pop()
{
if(head == nullptr) {
return 0; /* Return some default. */
/* Or do unthinkable things to user. Throw things at him or throw exception. */
}
auto temp = std::move(head);
head = std::move(temp->next);
return temp->data;
}
We actually need a destructor, one which would NOT be recursive if the list gets really large. Otherwise our stack may explode, because node's destructor would call unique_ptr's destructor, which would call the managed node's destructor, which would call unique_ptr's destructor... ad nauseam.
void list::clear() { while(head) head = std::move(head->next); }
list::~list() { clear(); }
After clear() runs, the default member destruction pings unique_ptr's destructor only once for head, with no recursive iterations.
If we want to iterate through list without popping node, we'd use get() within some method designed to address that task.
node *head = list.head.get();
/* ... */
head = head->next.get();
get() returns the raw pointer without disturbing the unique_ptr's ownership.
How about this example? In addition to the sample code, he also mentioned some principles:
When you need to "assign", use std::move; when you just need to traverse, use get().