I am working on a medium-sized C++ framework that makes use of the visitor pattern.
A valgrind test of a program implementing this framework reported a number of memory leaks that could be tracked down to one of the visitors, namely the copyCreator.
template<typename copyNodeType>
struct copyCreator {
copyCreator() {}
copyCreator(node * firstVisit) {
firstVisit->accept(*this);
}
~copyCreator() {
copy.reset();
for(auto ptr : openList) {
delete ptr;
}
}
std::unique_ptr<copyNodeType> copy = 0;
vector<nonterminalNode *> openList;
// push to tree
template<typename nodeType>
void push(nodeType * ptr) {
if (copy) {
// if root is set, append to tree
openList.back()->add_child(ptr);
}
else {
auto temp = dynamic_cast<copyNodeType *>(ptr);
if(temp) {
copy = std::unique_ptr<copyNodeType>(temp);
}
}
}
// ...
void visit(struct someNonterminalNode & nod) {
auto next = new someNonterminalNode(); //This is leaked
push(next);
openList.push_back(next);
nod.child->accept(*this);
openList.pop_back();
};
There are two main reasons why I am confused about this:
The two different constructors cause a different number of leaks
The leaks are reported to occur during visits
The accept methods of all nodes simply trigger a standard double dispatch to the visit method of the correct visitor.
I am fairly new to C++ programming and might have overlooked some really fundamental issue.
copyCreator<nodeType>::push(ptr) is supposed to take ownership of ptr. But it fails to do so if (a) ptr is not of type nodeType* (as determined by dynamic_cast), and (b) no node of type nodeType has been visited yet.
In other words, copyCreator<nodeType> creates, and promptly leaks, copies of all nodes until it encounters one of type nodeType.
This is precisely what happens in copyCreator<programNode> cpy2(&globalScope, a);, where a is forallNode*. cpy2 expects to encounter programNode (which it never does), and meanwhile, it copies and leaks all other nodes.
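One way to close that gap is to let push receive a std::unique_ptr, so that a node which is neither attached to the tree nor adopted as the root is destroyed instead of leaked. The following is only a minimal sketch: node, nonterminalNode and programNode are simplified stand-ins for the framework's classes (which the question does not show), and the pointer-returning push is a hypothetical signature, not the asker's.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-ins for the framework's node classes (assumptions).
struct node {
    virtual ~node() = default;
};

struct nonterminalNode : node {
    std::vector<std::unique_ptr<node>> children;  // a parent owns its children
    void add_child(std::unique_ptr<node> child) { children.push_back(std::move(child)); }
};

struct programNode : nonterminalNode {};

template <typename copyNodeType>
struct copyCreator {
    std::unique_ptr<copyNodeType> copy;      // owns the root of the copied tree
    std::vector<nonterminalNode*> openList;  // non-owning: these nodes live inside the tree

    // Takes ownership of ptr in every case. Returns a non-owning pointer to the
    // node if it was stored, or nullptr if it was rejected (and destroyed).
    template <typename nodeType>
    nodeType* push(std::unique_ptr<nodeType> ptr) {
        nodeType* raw = ptr.get();
        if (copy) {
            if (openList.empty()) return nullptr;  // nowhere to attach; ptr is destroyed here
            openList.back()->add_child(std::move(ptr));
            return raw;
        }
        if (auto* root = dynamic_cast<copyNodeType*>(raw)) {
            ptr.release();     // hand ownership over to copy
            copy.reset(root);
            return raw;
        }
        return nullptr;  // wrong root type: ptr goes out of scope and deletes the node
    }
};
```

With this layout the visitor's destructor no longer needs to delete anything by hand, because openList only observes nodes that the tree already owns.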
I wrote a program to create a linked list, and I got undefined behavior (or I assume I did, given the program just stopped without any error) when I increased the size of the list to a certain degree and, critically, attempted to delete it (through ending its scope). A basic version of the code is below:
#include <iostream>
#include <memory>
template<typename T> struct Nodeptr;
template<class T>
struct Node {
Nodeptr<T> next;
T data;
Node(const T& data) : data(data) { }
};
template<class T>
struct Nodeptr : public std::shared_ptr<Node<T>> {
Nodeptr() : std::shared_ptr<Node<T>>(nullptr) { }
Nodeptr(const T& data) : std::shared_ptr<Node<T>>(new Node<T>(data)) { }
};
template<class T>
struct LinkedList {
Nodeptr<T> head;
void prepend(const T& data) {
auto new_head = Nodeptr<T>(data);
new_head->next = head;
head = new_head;
}
};
int main() {
int iterations = 10000;
{
LinkedList<float> ls;
std::cout << "START\n";
for(float k = 0.0f; k < iterations; k++) {
ls.prepend(k);
}
std::cout << "COMPLETE\n";
}
std::cout << "DONE\n";
return 0;
}
Right now, when the code is run, START and COMPLETE are printed, while DONE is not. The program exits before that without an error (for some reason).
When I decrease the variable to, say, 5000 instead of 10000, it works just fine and DONE is printed. When I delete the curly braces around the LinkedList declaration/testing block (taking it out of its smaller scope, so that it is NOT deleted before DONE is printed), then everything works fine and DONE is printed. Therefore, the error must be arising because of the deletion process, and specifically because of the volume of things being deleted. There is, however, no error message telling me that there is no more space left in the heap, and 10000 floats seems like awfully little to be filling up the heap anyhow. Any help would be appreciated!
Solved! It now works directly off of heap pointers, and I changed Node's destructor to prevent recursive calls to it:
~Node() {
if(!next) return;
Node* ptr = next;
Node* temp = nullptr;
while(ptr) {
temp = ptr->next;
ptr->next = nullptr;
delete ptr;
ptr = temp;
}
}
It is a stack overflow caused by recursive destructor calls.
This is a common issue with smart pointers one should be aware of when writing any deeply-nested data structure.
You need to write an explicit destructor for Node that removes elements iteratively by resetting the smart pointers, starting from the tail of the list. Also follow the rule of 3/5 and do the same for all other operations that might destroy nodes recursively.
Because this essentially rewrites all object destruction, it does, however, make the use of smart pointers in the first place somewhat questionable, although there is still some benefit in preventing double-delete (and leak) mistakes. It is therefore common to not use smart pointers at all in such a situation and instead fall back to raw pointers and manual lifetime management for the data structure's nodes.
Also, there is no need to use std::shared_ptr here in any case. There is only one owner per node. It should be std::unique_ptr. std::shared_ptr has a very significant performance impact and also has the issue of potentially causing leaks if circular references are formed (whether intentionally or not).
I also don't see any point in having Nodeptr as a separate class. It seems to be used only as an alias for std::make_shared<Node>, which can be achieved with a function instead. Inheriting from a standard library smart pointer in particular seems dubious. If anything, I would use composition instead.
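As a rough illustration of the points above, here is a sketch of the same list with std::unique_ptr links and an iterative teardown. It is not the asker's final code: putting the loop in the list's destructor (rather than in Node's) and the members head_, size_, size() and front() are choices made for this example.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

template <class T>
struct Node {
    T data;
    std::unique_ptr<Node> next;  // each node has exactly one owner
    explicit Node(const T& d) : data(d) {}
};

template <class T>
class LinkedList {
public:
    LinkedList() = default;
    LinkedList(const LinkedList&) = delete;             // copying would need a deep copy
    LinkedList& operator=(const LinkedList&) = delete;

    ~LinkedList() {
        // Detach one node at a time so that no destructor ever sees a long chain.
        while (head_) {
            std::unique_ptr<Node<T>> rest = std::move(head_->next);
            head_ = std::move(rest);  // the old head is destroyed here with an empty next
        }
    }

    void prepend(const T& data) {
        auto new_head = std::make_unique<Node<T>>(data);
        new_head->next = std::move(head_);
        head_ = std::move(new_head);
        ++size_;
    }

    std::size_t size() const { return size_; }
    const T& front() const { return head_->data; }

private:
    std::unique_ptr<Node<T>> head_;
    std::size_t size_ = 0;
};
```

Because every node is destroyed with an empty next pointer, the destruction depth stays constant no matter how long the list is.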
I have a recursive search algorithm, and I want to clean up my pointers after each call. However, I return in so many locations, it seems sloppy to put a delete or free before every one.
Is there a better way? Does me freeing them all at return of the function mean I should just allocate them on the stack instead of in the heap?
Note this is a parallel search (not shown in code), but the caller will never return before its children. Does this have any additional pitfalls for using the stack?
Example Code (Don't worry about the algorithm here):
//create a new struct state (using new), initialize and return (C style)
new_state()
free_list(state* node)//free a list
double minimax(state* node, state* bestState) {
if (base_case) {
return;
}
state* gb = new_state(); //single node
state* children = new_state(); //head of list
generate_children(children); //fill list
state* current = children; //traverse node
//recurse on child
double result = -minimax(current, gb);
if (case1) {
free(gb);
free_list(children);
return;
}
if (case2) {
//do stuff
}
while(current != NULL){
result = -minimax(current, gb);
if (case1) {
free(gb);
free_list(children);
return;
}
if (case2) {
//do stuff
}
current = current->next;
}
free(gb);
gb = NULL;
//More stuff (with children but not gb)
free_list(children);
return;
}
Here is a small sample of RAII:
First we have a struct that simply stores your items.
struct FreeAll
{
state* gb;
state* children;
FreeAll(state* g, state* c) : gb(g), children(c) {}
~FreeAll() { free(gb); free_list(children); }
};
Note that on destruction, gb is freed and free_list() releases the whole children list. How to use it?
double minimax(state* node, state* bestState)
{
if (base_case) {
return;
}
state* gb = new_state(); //single node
state* children = new_state(); //head of list
// Initialize our small RAII object with the above
// pointers
FreeAll fa(gb, children);
generate_children(children); //fill list
state* current = children; //traverse node
//recurse on child
double result = -minimax(current, gb);
if (case1) {
return;
}
if (case2) {
//do stuff
}
while(current != NULL){
result = -minimax(current, gb);
if (case1) {
return;
}
if (case2) {
//do stuff
}
current = current->next;
}
//More stuff (with children but not gb)
return;
}
The local variable fa is of type FreeAll. When it goes out of scope, its destructor runs and releases both pointers that were stored in the struct. Also note that no code is needed at the return points to free the memory; fa does this when it goes out of scope.
Note this is a simple example and has none of the sophistication of the other methods mentioned, but it gives you the basic gist of the RAII paradigm.
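The same idea can be written with std::unique_ptr and a custom deleter, which avoids a hand-written holder struct. This is a sketch under assumptions: state here is a hypothetical stand-in for the question's type (with a live counter added only to make leaks visible), and evaluate is a made-up function with two exit paths. Since the question says new_state() allocates with new, the sketch releases with delete rather than free().

```cpp
#include <memory>

// Hypothetical stand-in for the question's state type; `live` counts how many
// states currently exist so that leaks are easy to spot.
struct state {
    state* next = nullptr;
    static int live;
    state() { ++live; }
    ~state() { --live; }
};
int state::live = 0;

state* new_state() { return new state(); }  // allocated with new, as in the question

// Releases a whole list, node by node.
void free_list(state* node) {
    while (node != nullptr) {
        state* next = node->next;
        delete node;  // matches the new in new_state()
        node = next;
    }
}

// Custom deleter so that a unique_ptr to the list head frees the entire list.
struct ListDeleter {
    void operator()(state* head) const { free_list(head); }
};

double evaluate(bool early_exit) {
    std::unique_ptr<state> gb(new_state());                     // single node
    std::unique_ptr<state, ListDeleter> children(new_state());  // head of list
    children->next = new_state();  // pretend generate_children() filled the list

    if (early_exit) {
        return -1.0;  // gb and the whole list are released here
    }
    gb.reset();       // gb can still be released early, as in the original code
    return 1.0;       // the list is released here
}
```

Whichever return is taken, the destructors of gb and children run, so no cleanup code is repeated at the exit points.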
However, I return in so many locations, it seems sloppy to put a delete or free before every one.
Yes it does.
Is there a better way?
Yes. Smart pointers are a better way. But if you do not want to drop what you are doing and learn how to use smart pointers before you can continue (it can be hard the first time), keep reading further down.
Does me freeing them all at return of the function mean I should just allocate them on the stack instead of in the heap?
Yes, you could do that. It would also perform better. But it will not work if you are planning on allocating a lot of memory.
Note this is a parallel search (not shown in code), but the caller will never return before its children. Does this have any additional pitfalls for using the stack?
The pitfalls are the same. With parallel code, you have to be careful.
There are many ways to avoid this problem. Smart pointers and stack allocation have already been mentioned.
Another way is to have only one exit point. This can get clunky at times, because, for example, it would mean that you would have to set a flag within your loop right before breaking out of it so as to know whether it terminated successfully or due to an error.
Another way is to allocate your pointers in function A, call function B to do the actual work (passing it the allocated pointers), and then, once function B returns to function A, free the pointers.
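The function A / function B split might look roughly like the following. All names here (state, do_search, search) are hypothetical stand-ins, and the live counter exists only to show that nothing is leaked. Note that this plain version does not protect against exceptions thrown by the worker.

```cpp
// Hypothetical stand-in for the question's state type; `live` counts allocations.
struct state {
    state* next = nullptr;
    static int live;
    state() { ++live; }
    ~state() { --live; }
};
int state::live = 0;

// Releases a whole list, node by node.
void free_list(state* node) {
    while (node != nullptr) {
        state* next = node->next;
        delete node;
        node = next;
    }
}

// "Function B": does the work and may return from anywhere; it owns nothing.
double do_search(state* gb, state* children, bool early_exit) {
    (void)gb;
    (void)children;
    if (early_exit) {
        return -1.0;  // no cleanup needed at this exit point
    }
    return 1.0;
}

// "Function A": allocates, delegates, and frees in exactly one place.
double search(bool early_exit) {
    state* gb = new state();
    state* children = new state();
    children->next = new state();  // pretend the list was filled

    double result = do_search(gb, children, early_exit);

    delete gb;
    free_list(children);
    return result;
}
```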
ScopeGuard does the job for you.
https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Scope_Guard
void your_function()
{
    // gb and children are assumed to have been allocated before this point
    Scope_guard const final_action = [&]{
        free(gb);
        free_list(children);
    };
    // your code here
}
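For readers who do not want to pull in the linked idiom, a guard of this kind can be sketched in a few lines. simple_scope_guard and run are hypothetical names for this example, not the class from the Wikibooks page, and the guard is direct-initialized here rather than with the = syntax used above. Exceptions thrown by the cleanup action are not handled.

```cpp
#include <functional>
#include <utility>

// Minimal scope guard (hypothetical): runs the stored action when it goes out of scope.
class simple_scope_guard {
public:
    explicit simple_scope_guard(std::function<void()> action)
        : action_(std::move(action)) {}
    ~simple_scope_guard() {
        if (action_) action_();
    }
    simple_scope_guard(const simple_scope_guard&) = delete;
    simple_scope_guard& operator=(const simple_scope_guard&) = delete;

private:
    std::function<void()> action_;
};

// Example function with two exit points; the counter stands in for
// free(gb); free_list(children); in the snippet above.
int run(bool early_exit, int& cleanups) {
    simple_scope_guard guard([&] { ++cleanups; });
    if (early_exit) {
        return 1;  // guard fires here
    }
    return 2;      // and here
}
```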
This is more of a design problem (I know why this is happening, I just want to see how people deal with it). Suppose I have a simple linked list struct:
struct List {
int head;
std::shared_ptr<List> tail;
};
The shared_ptr enables sharing of sublists between multiple lists. However, when the list gets very long, a stack overflow might happen in its destructor (caused by recursive releases of shared_ptrs). I've tried using an explicit stack, but that gets very tricky since a tail can be owned by multiple lists. How can I design my List to avoid this problem?
UPDATE: To clarify, I'm not reinventing the wheel (std::forward_list). The List above is only a simplified version of the real data structure. The real data structure is a directed acyclic graph, which, if you think about it, is just a lot of linked lists with shared tails/heads. It's usually prohibitively expensive to copy the graph, so data sharing is necessary.
UPDATE 2: I'm thinking about explicitly traversing down the pointer chain and std::move as I go. Something like:
~List()
{
auto p = std::move(tail);
while (p->tail != nullptr && p->tail.use_count() == 1) {
// Some other thread may start pointing to `p->tail`
// and increases its use count before the next line
p = std::move(p->tail);
}
}
This seems to work in a single thread, but I'm worried about thread safety.
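For the single-threaded case only, the traversal idea can be sketched as below. This variant differs from the snippet above in two ways that are choices of this example: it checks the use count of the node it currently holds (so it never detaches the tail of a node that another list still shares), and it tolerates an empty tail. The two-argument constructor is added for convenience. It makes no claim about thread safety.

```cpp
#include <memory>
#include <utility>

struct List {
    int head;
    std::shared_ptr<List> tail;

    List(int h, std::shared_ptr<List> t) : head(h), tail(std::move(t)) {}

    ~List() {
        auto p = std::move(tail);
        // Continue only while this destructor holds the last reference to p.
        while (p && p.use_count() == 1) {
            auto next = std::move(p->tail);  // detach before p's node is destroyed
            p = std::move(next);             // destroys the old node, whose tail is now empty
        }
        // If p is still shared, dropping it here only decrements the count.
    }
};
```

Each node is destroyed with an empty tail, so its own destructor returns immediately and the call depth stays constant even for very long chains.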
If you're having problems with stack overflows on destruction for your linked datastructure, the easiest fix is just to implement deferred cleanup:
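struct Graph {
    std::shared_ptr<Graph> p1, p2, p3; // some pointers in your datastructure
    static std::list<std::shared_ptr<Graph>> deferred_cleanup;
    ~Graph() {
        // hand the children over instead of destroying them recursively
        deferred_cleanup.emplace_back(std::move(p1));
        deferred_cleanup.emplace_back(std::move(p2));
        deferred_cleanup.emplace_back(std::move(p3));
    }
    static void cleanup() {
        while (!deferred_cleanup.empty()) {
            std::list<std::shared_ptr<Graph>> tmp;
            std::swap(tmp, deferred_cleanup);
            tmp.clear(); // destroyed nodes push their own children onto deferred_cleanup
        }
    }
};
// the static member needs a definition in exactly one source file
std::list<std::shared_ptr<Graph>> Graph::deferred_cleanup;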
and you just need to remember to call Graph::cleanup(); periodically.
This should do it. With a little work it can easily be made thread-safe (a little locking/atomics in the deleter engine).
Synopsis:
The shared_ptrs to the nodes are created with a custom deleter which, rather than deleting the node, hands it off to a deleter engine.
The engine's implementation is a singleton. Upon being notified of a new node to be deleted, it adds the node to a delete queue. If there is no node being deleted, the nodes in the queue are deleted in turn (no recursion).
While this is happening, new nodes arriving in the engine are simply added to the back of the queue. The in-progress delete cycle will take care of them soon enough.
#include <memory>
#include <deque>
#include <stdexcept>
#include <iostream>
struct node;
struct delete_engine
{
void queue_for_delete(std::unique_ptr<node> p);
struct impl;
static impl& get_impl();
};
struct node
{
node(int d) : data(d) {}
~node() {
std::cout << "deleting node " << data << std::endl;
}
static std::shared_ptr<node> create(int d) {
return { new node(d),
[](node* p) {
auto eng = delete_engine();
eng.queue_for_delete(std::unique_ptr<node>(p));
}};
}
int data;
std::shared_ptr<node> child;
};
struct delete_engine::impl
{
bool _deleting { false };
std::deque<std::unique_ptr<node>> _delete_list;
void queue_for_delete(std::unique_ptr<node> p)
{
_delete_list.push_front(std::move(p));
if (!_deleting)
{
_deleting = true;
while(!_delete_list.empty())
{
_delete_list.pop_back();
}
_deleting = false;
}
}
};
auto delete_engine::get_impl() -> impl&
{
static impl _{};
return _;
}
void delete_engine::queue_for_delete(std::unique_ptr<node> p)
{
get_impl().queue_for_delete(std::move(p));
}
struct tree
{
std::shared_ptr<node> root;
auto add_child(int data)
{
if (root) {
throw std::logic_error("already have a root");
}
auto n = node::create(data);
root = n;
return n;
}
};
int main()
{
tree t;
auto pc = t.add_child(6);
pc = pc->child = node::create(7);
}
std::shared_ptr (and before that, boost::shared_ptr) is and was the de-facto standard for building dynamic systems involving massive DAGs.
In reality, DAGs don't get that deep (maybe 10 or 12 algorithms deep in your average FX pricing server?) so the recursive deletes are not a problem.
If you're thinking of building an enormous DAG with a depth of 10,000 then it might start to be a problem, but to be honest I think it will be the least of your worries.
Re the analogy of a DAG being like a linked list: not really. Since it's acyclic, all your pointers pointing "up" will need to be shared_ptr and all your back-pointers (e.g. binding message subscriptions to sink algorithms) will need to be weak_ptrs, which you lock as you fire the message.
disclaimer: I've spent a lot of time designing and building information systems based on directed acyclic graphs of parameterised algorithm components, with a great deal of sharing of common components (i.e. same algorithm with same parameters).
Performance of the graph is never an issue. The bottlenecks are:
initially building the graph when the program starts - there's a lot of noise at that point, but it only happens once.
getting data into and out of the process (usually a message bus). This is invariably the bottleneck as it involves I/O.
I have been making games for a few years using the GM:S engine (though I assure you I'm not some newbie who uses drag and drop, as is all too often the case), and I have decided to start learning C++ on its own, you know, to expand my knowledge and all that good stuff =D
While doing this, I have been attempting to make a list class as a practice project: a set of nodes linked together, then loop through those nodes to get the value at an index. Here is my code; I ask because it has a single major issue that I struggle to understand.
template<class type>
class ListNode
{
public:
type content;
ListNode<type>* next;
ListNode<type>* prev;
ListNode(type content) : content(content), next(NULL), prev(NULL) {}
protected:
private:
};
template<class type>
class List
{
public:
List() : SIZE(0), start(NULL), last(NULL) {}
unsigned int Add(type value)
{
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
else
{
ListNode<type> a(value);
this->last->next = &a;
a.prev = this->last;
this->last = &a;
}
this->SIZE++;
return (this->SIZE - 1);
}
type Find(unsigned int pos)
{
ListNode<type>* a = this->start;
for(unsigned int i = 0; i<this->SIZE; i++)
{
if (i < pos)
{
a = a->next;
continue;
}
else
{
return (*a).content;
}
continue;
}
}
protected:
private:
unsigned int SIZE;
ListNode<type>* start;
ListNode<type>* last;
};
To me at least, this code looks fine, and it works in that I am able to create a new list without crashing and add elements to it, with Add returning the proper index of each element. The problem arises when getting the value of an element from the list: when I ran the following test code, it didn't give me what it was built to give me.
List<int> a;
unsigned int b = a.Add(313);
unsigned int c = a.Add(433);
print<unsigned int>(b);
print<int>(a.Find(b));
print<unsigned int>(c);
print<int>(a.Find(c));
I expected this code to give me
0
313
1
433
as that's what it's been told to do. However, it only half does this, giving me
0
2686684
1
2686584
Now I am at a loss. I assume that the values printed are some kind of pointer address, but I simply don't understand what they are meant to be, or what is causing the value to become that.
Hence I ask the internet: what is causing these values to be given? I am quite confused at this point.
my apologies if that was a tad long and rambling, i tend to write such things often =D
thanks =D
Your code has undefined behavior in several places: you store pointers to local variables and later dereference those pointers. Local variables are destroyed once the scope they were declared in ends.
Example:
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
Once the closing brace is reached, the scope of the if body ends and the variable a is destroyed. The pointer to this variable is now a so-called stray (dangling) pointer, and using it in any way leads to undefined behavior.
The solution is to allocate the objects dynamically using new:
auto* a = new ListNode<type>(value);
Or, if you don't have a C++11-capable compiler:
ListNode<type>* a = new ListNode<type>(value);
First suggestion: use valgrind or a similar memory checker to execute this program. You will probably find there are many memory errors caused by dereferencing stack pointers that are out of scope.
Second suggestion: learn about the difference between objects on the stack and objects on the heap. (Hint: you want to use heap objects here.)
Third suggestion: learn about the concept of "ownership" of pointers. Usually you want to be very clear which pointer variable should be used to delete an object. The best way to do this is to use the std::unique_ptr smart pointer. For example, you could decide that each ListNode is owned by its predecessor:
std::unique_ptr<ListNode<type>> next;
ListNode<type>* prev;
and that the List container owns the head node of the list
std::unique_ptr<ListNode<type>> start;
ListNode<type>* last;
This way the compiler will do a lot of your work for you at compile-time, and you won't have to depend so much on using valgrind at runtime.
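Put together, the ownership scheme described above could look roughly like this. It is a sketch rather than a drop-in replacement: the out-of-range check in Find, the use of std::size_t, and the lowercase size member are choices made for this example.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>

template <class type>
struct ListNode {
    type content;
    std::unique_ptr<ListNode> next;  // owning link to the successor
    ListNode* prev = nullptr;        // non-owning back link
    explicit ListNode(type c) : content(std::move(c)) {}
};

template <class type>
class List {
public:
    // Appends a value and returns its index.
    std::size_t Add(type value) {
        auto node = std::make_unique<ListNode<type>>(std::move(value));
        ListNode<type>* raw = node.get();
        if (!start) {
            start = std::move(node);       // the list owns the head node
        } else {
            raw->prev = last;
            last->next = std::move(node);  // each node owns its successor
        }
        last = raw;
        return size++;
    }

    // Returns the value at pos, or throws if pos is out of range.
    const type& Find(std::size_t pos) const {
        if (pos >= size) throw std::out_of_range("List::Find");
        const ListNode<type>* a = start.get();
        for (std::size_t i = 0; i < pos; ++i) a = a->next.get();
        return a->content;
    }

private:
    std::unique_ptr<ListNode<type>> start;
    ListNode<type>* last = nullptr;
    std::size_t size = 0;
};
```

The nodes now live on the heap and are released automatically when the List is destroyed. For very long lists the implicitly generated destructor recurses through next, so an iterative teardown may be needed there.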
I am currently working on implementing something similar to this
(http://molecularmusings.wordpress.com/2011/06/27/config-values/)
while using the template system, so that I do not have to create a class for every type I want to support. The class itself works fine, but I am having trouble figuring out how to manage memory if I read config values from a file and save them to the list.
This is my ConfigSetting class:
#pragma once
#include <string>
template <typename T>
class ConfigSetting {
public:
static ConfigSetting* head;
static ConfigSetting* tail;
public:
ConfigSetting(const std::string& name, const std::string& synopsis, T initValue) : m_name(name), m_synopsis(synopsis), m_value(initValue)
{
this->addToList();
}
// Special constructor for int ranges
ConfigSetting(const std::string& name, const std::string& synopsis, T initValue, T minValue, T maxValue) : m_name(name), m_synopsis(synopsis), m_value(initValue), m_min(minValue), m_max(maxValue)
{
this->addToList();
}
ConfigSetting& operator=(T value)
{
this->m_value = value;
return *this;
}
inline operator T(void) const
{
return m_value;
}
static ConfigSetting* findSetting(const std::string& name)
{
if (head) {
ConfigSetting* temp = head;
while (temp != nullptr) {
if (temp->m_name == name) {
return temp;
}
temp = temp->m_next;
}
}
return nullptr;
}
private:
void addToList(void)
{
if (head) {
tail->m_next = this;
tail = this;
}
else {
head = this;
tail = this;
}
}
ConfigSetting* m_next = nullptr; // must start out null, findSetting() walks until it sees nullptr
const std::string m_name;
const std::string m_synopsis;
T m_value;
T m_min;
T m_max;
};
template<class T> ConfigSetting<T>* ConfigSetting<T>::head = nullptr;
template<class T> ConfigSetting<T>* ConfigSetting<T>::tail = nullptr;
And I am using it like this (from another class called ConfigReader):
ConfigSetting<std::string>* cf = new ConfigSetting<std::string>(key, synopsis, value);
Now my question is: what is the best way to manage memory in this case? Since the list is static, I cannot just run through the list deleting everything once the destructor gets called. I could use shared_ptr like this:
shared_ptr<ConfigSetting<std::string>> sp(new ConfigSetting<std::string>(key, synopsis, value));
Or another type of smart pointer? Maybe there is an even more elegant solution I didn't think of.
As far as I can see, there is nothing in your implicit destructor that must be called to ensure proper operation. If that is true, you can just forget about cleaning up your lists. Trying to do so will only increase the runtime of your program with absolutely no benefit. Just let the kernel do its job; it won't leak any memory pages just because you couldn't be bothered to clean up static data.
However, if you have a nontrivial destructor down the line somewhere which includes such important operations as flushing files or sending messages to other processes, then you must use a destructor function. I'm not talking about the normal C++ destructors here, but about a specially declared function that is executed by the runtime after main() exits.
With gcc, you declare a destructor function like this:
void foo() __attribute__((destructor));
void foo() {
//Do vitally important cleanup here.
}
Since the linker takes care of instructing the runtime to call your destructor function, you do not need to call these functions anywhere yourself; they may even be declared with file-local visibility.
Now, you ask "Am I not supposed to delete this pointer somewhere?" Yes, you are supposed to delete it. You are supposed to call delete on every object that you create with new for two reasons:
1. To give back the memory held by the object to the runtime, so that your process can reuse the memory for other purposes. If you fail to delete objects that you create on a regular basis, the memory footprint of your process will increase indefinitely until the kernel steps in and shoots down your process.
2. To run the destructor for your object, which frequently results in calling delete on other objects which are not needed anymore. In most cases, this will just give back more memory according to 1., which seems to be your case. It may do more vital operations, though.
Since the objects in question have to live until the very end of your process lifetime (they are static data, after all), you cannot possibly reuse their memory. The kernel, however, is above the level of the runtime that provides you with the new and delete keywords. The kernel is the creator of your tiny process world, in which the new and delete keywords happen to live. The kernel does not care about which parts of the virtual address space your runtime considers used/unused. The kernel will simply strip down the entire virtual address space when your process exits, and the used/unused state of your memory will dissipate into nothingness.