Imagine I have some Node struct that contains pointers to the left and right children and some data:
struct Node {
int data;
Node *left;
Node *right;
};
Now I want to do some state space search, and naturally I want to construct the graph as I go. So I will have a kind of loop that will have to create Nodes and keep them around. Something like:
Node *curNode = ... ; // starting node
while (!done) {
// ...
curNode->left = new Node();
curNode->right = new Node();
// ..
// Go left (for example)
curNode = curNode->left;
}
The problem is that I have to dynamically allocate a node on each iteration, which is slow. So the question is: how can I have pointers to some memory without allocating the nodes one by one?
The first solution I thought of is to have a std::vector<Node> that will contain all the allocated nodes. The problem is that when we push_back elements, all references might be invalidated, so all my left/right pointers will be garbage.
The second solution is to allocate a big chunk of memory up front and just grab the next available slot whenever we want to create a Node. To avoid reference invalidation, we chain a new big chunk onto a linked list of chunks whenever the current chunk is full, so every pointer we have handed out stays valid. I think that std::deque behaves like this, but it was not designed explicitly for this purpose.
Another solution would be to store vector indices instead of pointers but this is not a solution because a Node doesn't want to be associated with any container, it wants the pointer directly.
So what is a good solution here that avoids having to allocate a new node on each iteration?
You can use std::deque<Node>: it manages the memory for you, allocating elements in groups, and it does not invalidate pointers to existing elements as long as you only insert or erase at the ends. If you want more precise control over how many elements go in each group, you can quite simply write something like this:
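#include <array>
#include <cstddef>
#include <list>

class NodePool {
// A class-scope constexpr data member must be declared static.
static constexpr std::size_t blockSize = 512;
using Block = std::array<Node,blockSize>;
using Pool = std::list<Block>;
std::size_t allocated = blockSize;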
Pool pool;
public:
Node *allocate()
{
if( allocated == blockSize ) {
pool.emplace_back();
allocated = 0;
}
return &( pool.back()[ allocated++ ] );
}
};
I did not try to compile it, but it should be enough to express the idea. By changing blockSize you can fine-tune the performance of your program. You should be aware, though, that Node objects are fully constructed in groups (unlike what std::deque does). Avoiding that means managing raw storage yourself (for example with std::allocator or placement new), which takes more care to get right.
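For the std::deque route, a minimal sketch could look like the following. The helper names makeNode and buildAndSumLeftSpine are made up for illustration, and the default member initializers on Node are an assumption that is not in the question's struct. The point it relies on is that push_back/emplace_back on a std::deque invalidates iterators but not pointers or references to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct Node {
    int data = 0;          // initializers are an assumption, added for safety
    Node *left = nullptr;
    Node *right = nullptr;
};

// Hypothetical helper: create a node inside the deque and return its address.
// Growing a std::deque at the end keeps pointers to existing elements valid.
Node *makeNode(std::deque<Node> &pool, int data)
{
    pool.emplace_back();
    Node *n = &pool.back();
    n->data = data;
    return n;
}

// Builds `depth` levels below a root, always descending left, then walks the
// left spine again through the stored pointers and sums the data it finds.
int buildAndSumLeftSpine(std::size_t depth)
{
    std::deque<Node> pool;
    Node *root = makeNode(pool, 0);
    Node *cur = root;
    for (std::size_t i = 1; i <= depth; ++i) {
        cur->left = makeNode(pool, static_cast<int>(i));
        cur->right = makeNode(pool, -static_cast<int>(i));
        cur = cur->left; // go left (for example)
    }
    int sum = 0;
    for (Node *n = root; n != nullptr; n = n->left)
        sum += n->data;
    return sum;
}
```

All nodes are released together when the deque goes out of scope, so no per-node delete is needed.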
I'm trying to create custom list. This is the node structure:
struct Node {
Node* prev;
Node* next;
void* data;
};
One of the constructors creates the list from an array, so I decided to allocate the nodes in one consecutive block of memory to make the algorithm a little bit faster.
auto elements = new Node[size];
elements[0] = ...
elements[size - 1] = ...
for (int i = 1; i + 1 < size; i++) {
elements[i].data = array[i];
elements[i].prev = &elements[i - 1];
elements[i].next = &elements[i + 1];
}
head = &elements[0];
tail = &elements[size - 1];
After that I can add new elements:
Node* tmp = new Node;
tmp->prev = tail;
tmp->next = nullptr;
tmp->data = data;
tail->next = tmp; // link the old tail to the new node
tail = tmp;
Also I can change next and prev.
So I can't tell the elements apart (is an element part of the array, or was it allocated later using new?), and in the destructor I have to delete the elements using delete instead of delete[].
Node* curNode = head;
for (int i = 0; i < size; i++) {
Node* tmp = curNode->next;
delete curNode;
curNode = tmp;
}
This code doesn't delete the elements that were allocated in the array (according to Valgrind).
How can I allocate the nodes in one array (to decrease the number of cache misses) and then successfully delete them element by element?
What you are trying to do is about the hackiest linked list implementation you could think of. For all real-life purposes you should stick with the STL, and it looks like std::vector does what you want. That being said, if you are implementing your own linked list to learn how it works, let me start by saying that you are already doing it wrong.
By definition, a linked list is made of Nodes, where each Node points to the next one and can also point to the previous one. The physical order of individual nodes in memory plays no role and is irrelevant from a functionality point of view. There is a potential performance gain related to cache hits if you have a bunch of consecutive Nodes in the same page, but that is not something you should be aiming at when implementing a linked list. If your goal is top-tier performance, a pure array list will almost always beat any linked list implementation you can come up with, and std::vector is already what you should be using 99% of the time.
If you already have a function that takes a collection of elements and builds Nodes out of them one after another, the allocator will tend to hand you memory for those Nodes in a roughly contiguous fashion. It's not a strong guarantee, but I would consider it good enough.
You can't release individual elements that belong to a chunk of memory created with new[]. If you want to stick with an array as the underlying storage for your Nodes, you have two options, as already mentioned in the comments.
Option 1) Allocate a single array for the whole list and use indexes as the next and previous links in your nodes. Note that this requires you to handle the situation where the list is asked to hold more elements than the array can store, most likely by allocating more arrays or by allocating a bigger one and copying everything, which brings you to an array list with fancy ordering.
Option 2) Add a dedicated memory manager that allocates chunks of memory in the form of arrays and hands out individual entries, which is basically implementing your own memory allocator.
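Option 1 could look roughly like the sketch below. The names IndexList, npos and to_vector are hypothetical, and using a std::vector as the "single array" is an assumption: it covers the growth case by reallocating, which is safe here precisely because the links are indexes rather than pointers. Removal is left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index-linked list: all nodes live in one std::vector and refer
// to each other by position, so the vector may reallocate without breaking links.
class IndexList {
    static constexpr std::size_t npos = static_cast<std::size_t>(-1); // "no node"

    struct Node {
        std::size_t prev;
        std::size_t next;
        void *data;
    };

    std::vector<Node> nodes;
    std::size_t head = npos;
    std::size_t tail = npos;

public:
    IndexList() = default;

    // Build the list from an array of n element pointers.
    IndexList(void *const *array, std::size_t n)
    {
        nodes.reserve(n); // one allocation for the initial elements
        for (std::size_t i = 0; i < n; ++i)
            push_back(array[i]);
    }

    void push_back(void *data)
    {
        const std::size_t idx = nodes.size();
        nodes.push_back(Node{tail, npos, data});
        if (tail != npos)
            nodes[tail].next = idx;
        else
            head = idx;
        tail = idx;
    }

    // Walk the links from head to tail and collect the stored pointers.
    std::vector<void *> to_vector() const
    {
        std::vector<void *> out;
        for (std::size_t i = head; i != npos; i = nodes[i].next)
            out.push_back(nodes[i].data);
        return out;
    }
};
```

The destructor needs no loop at all: the vector frees the whole block once, which sidesteps the delete versus delete[] problem from the question.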
I have a
priority_queue<node*, std::vector<node*>, CompareNodes> heap;
Let's say the node consists of:
class node {
public:
int value;
int key;
int order = 1000000;
};
How do I free the memory after I'm done with the priority queue?
My approach doesn't seem to be working:
while (heap.top()) {
node * t = heap.top();
heap.pop();
delete t;
}
Looks like you'll want to do something more like this:
while (!heap.empty())
{ /* the rest ... */ }
Calling .top() on an empty priority_queue is undefined behavior: there is nothing to return, and that is exactly the state you reach once the last element has been popped.
Also, if available you should use
priority_queue<std::unique_ptr<node>, std::vector<std::unique_ptr<node>>, CompareNodes> heap;
so you don't have to worry about clearing the memory yourself.
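A minimal sketch of the unique_ptr variant follows. The original CompareNodes is not shown, so the comparator here (largest key on top) is an assumption, and pushNode and drainAndSumKeys are hypothetical helpers added only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

class node {
public:
    int value;
    int key;
    int order = 1000000;
};

// Assumed comparator (the original is not shown): largest key ends up on top.
struct CompareNodes {
    bool operator()(const std::unique_ptr<node> &a,
                    const std::unique_ptr<node> &b) const
    {
        return a->key < b->key;
    }
};

using NodeHeap = std::priority_queue<std::unique_ptr<node>,
                                     std::vector<std::unique_ptr<node>>,
                                     CompareNodes>;

// Hypothetical helper: allocate a node and hand ownership to the heap.
void pushNode(NodeHeap &heap, int value, int key)
{
    std::unique_ptr<node> n(new node());
    n->value = value;
    n->key = key;
    heap.push(std::move(n));
}

// Empties the heap. Each pop() destroys the unique_ptr on top, which in turn
// deletes the node it owns, so there is no manual delete anywhere.
int drainAndSumKeys(NodeHeap &heap)
{
    int sum = 0;
    while (!heap.empty()) {
        sum += heap.top()->key;
        heap.pop();
    }
    return sum;
}
```

If the heap is simply destroyed while it still holds elements, the remaining nodes are deleted the same way.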
Just like most std:: containers, the memory may or may not be freed when you want it to be. Memory is usually kept around for a longer time so that when you perform a heap.push or equivalent operation, the memory doesn't need to be allocated again.
Think of std::vector which has to allocate a new set of memory for the entire vector each time it grows (vector data must be contiguous in memory). It is more efficient for std::vector to perform a large one time allocation and keep the memory around so that the growth operation doesn't kill performance -- a) allocate new space big enough, b) copy entire contents of existing vector to new vector space, c) delete the old vector space.
Bottom line is you can't force it to free memory for individual items.
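The "memory is kept around" behavior is easy to observe on a plain std::vector. The function name capacityAfterClear below is made up for this sketch; it shows that clear() destroys the elements but leaves the buffer in place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a vector with n ints, clears it, and reports the capacity that is
// still held afterwards. clear() destroys the elements but keeps the buffer.
std::size_t capacityAfterClear(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));
    v.clear();           // size() is now 0 ...
    return v.capacity(); // ... but the storage is still reserved
}
```

If you really want the buffer released, shrink_to_fit() is a non-binding request to do so, and swapping with an empty vector is the traditional workaround. Note that this only concerns the container's own storage; objects that the stored pointers refer to are a separate matter.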
None of my code uses dynamic memory, but I do have a vector of pointers to a struct called Node, and in my code, I do lose references to those Nodes at one point. The struct looks like this:
struct Node {
int value;
Node* next;
};
I also have a for loop that tries to find the smallest value in my vector of Node pointers by taking the smallest Node off as I go. Here, lists is the vector of Node pointers, and add is the previous smallest value.
for (int i = 1; i < int(lists.size()); ++i) {
if (lists[i]->value <= add) {
add = lists[i]->value;
lists[i] = lists[i]->next;
break;
}
}
I thought I couldn't leak memory if I was just in the stack though...
If the Nodes referenced from the lists vector are dynamically allocated, you have to free each of them manually, otherwise they are leaked. You can find more details at https://en.wikipedia.org/wiki/Memory_leak
Non-duplicates:
Which STL C++ container to use for a fixed size list? (Specific use case)
std::list fixed size (See below)
Motives:
Allocation happens once (in the constructor) and deallocation happens once (in the destructor).
O(1) insertion and removal of an element anywhere in the list without needing to deal with the overhead of memory management. This isn't possible with an array-based implementation.
Is there a straightforward approach for implementing this using the standard library? Is there an implementation of something like this in Boost?
My first thought when I read that was to use a different allocator, i.e. one that pre-allocates a given number of elements to avoid the cost of allocating them one at a time. I'm not familiar with defining allocators, though; if you work it out, I'd be interested in the results.
Without that, here's a different approach. I saved myself the template ... stuff, but I guess you'll be able to do that yourself if you need.
typedef std::list<...> list_t;
struct fslist: private list_t
{
// reuse some parts from the baseclass
using list_t::iterator;
using list_t::const_iterator;
using list_t::begin;
using list_t::end;
using list_t::empty;
using list_t::size;
void reserve(size_t n)
{
size_t s = size();
// TODO: Do what std::vector does when reserving less than the size.
if(n < s)
return;
m_free_list.resize(n - s);
}
void push_back(list_t::value_type const& e)
{
reserve_one();
m_free_list.front() = e;
splice(end(), m_free_list, m_free_list.begin());
}
void erase(iterator it)
{
m_free_list.splice(m_free_list.begin(), *this, it);
}
private:
// make sure we have space for another element
void reserve_one()
{
if(m_free_list.empty())
throw std::bad_alloc();
}
list_t m_free_list;
};
This is incomplete, but it should get you started. Also note that splice() is not made public, because moving elements from or to a different list would change both size and capacity.
I think the simplest way to do it would be to have two data structures: the list itself, and a fixed-size array/vector that is used for "allocation". You simply grab an element from the array to create a node and insert it into your list. Something like this seems to meet your requirements:
struct node {
node *prev;
node *next;
int value;
};
node storage[N];
node *storage_ptr = storage;
then to create a new node:
if(storage_ptr == storage + N) {
/* out of space */
}
node *new_node = storage_ptr++;
// insert new_node into linked list
This is fixed size, allocated all at once, and when storage goes out of scope, the nodes will be destroyed with it.
As for efficiently removing items from the list, it is doable, but slightly more complex. I would have a secondary linked list for "removed" nodes. When you remove a node from the main list, insert it at the end/beginning of the "deleted" list.
When allocating, check the deleted list first before going to the storage array. If it's NULL use storage, otherwise, pluck it off the list.
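Putting the storage array and the "deleted" list together might look like this. It is only a sketch: the class name FixedNodePool and the members allocate, release and deleted are hypothetical, and returning nullptr when out of space is an assumed policy.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    node *prev;
    node *next;
    int value;
};

// Hypothetical fixed-size pool: N nodes are reserved once, handed out one by
// one, and recycled through a singly linked "deleted" list.
template <std::size_t N>
class FixedNodePool {
    node storage[N];
    node *storage_ptr = storage; // next never-used slot
    node *deleted = nullptr;     // head of the list of released nodes

public:
    // Returns a node to use, or nullptr when the pool is exhausted (assumption).
    node *allocate()
    {
        if (deleted != nullptr) { // check the deleted list first
            node *n = deleted;
            deleted = deleted->next;
            return n;
        }
        if (storage_ptr == storage + N)
            return nullptr;       // out of space
        return storage_ptr++;
    }

    // Call after unlinking n from the main list; n->next is reused as the
    // link inside the deleted list.
    void release(node *n)
    {
        n->next = deleted;
        deleted = n;
    }
};
```

Both operations are O(1), and nothing is allocated or freed after the pool object itself has been created.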
I ended up writing a template for this called rigid_list.
It's far from complete but it's a start:
https://github.com/eltomito/rigid_list
(motivated by Ulrich Eckhardt's answer)
I am building a suffix trie (unfortunately, no time to properly implement a suffix tree) for a 10 character set. The strings I wish to parse are going to be rather long (up to 1M characters). The tree is constructed without any problems, however, I run into some when I try to free the memory after being done with it.
In particular, if I set up my constructor and destructor as follows (where CNode.child is a pointer to an array of 10 pointers to other CNodes, and count is a simple unsigned int):
CNode::CNode(){
count = 0;
child = new CNode* [10];
memset(child, 0, sizeof(CNode*) * 10);
}
CNode::~CNode(){
for (int i=0; i<10; i++)
delete child[i];
}
I get a stack overflow when trying to delete the root node. I might be wrong, but I am fairly certain that this is due to the destructor calls nesting too deeply (each destructor calls up to 10 other destructors, which in turn call theirs). I know this is suboptimal both space- and time-wise; however, this is supposed to be a quick-and-dirty solution to the repeated substring problem.
tl;dr: how would one go about freeing the memory occupied by a very deep tree?
Thank you for your time.
One option is to allocate from a large buffer then deallocate that buffer all at once.
For example (untested):
class CNodeBuffer {
private:
std::vector<CNode *> nodes;
public:
~CNodeBuffer() {
empty();
}
CNode *get(...) {
CNode *node = new CNode(...);
nodes.push_back(node);
return node;
}
void empty() {
for(std::vector<CNode *>::iterator i = nodes.begin(); i != nodes.end(); ++i) {
delete *i;
}
nodes = std::vector<CNode *>();
}
};
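Pointers to a std::vector's elements are only stable as long as the vector does not reallocate, so a plain std::vector<CNode> is simpler but works only if you reserve() enough capacity up front; a std::deque<CNode> keeps pointers to existing elements valid when you add elements at the ends.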
Do you initialize the memory for the nodes themselves? From what I can see, your code only allocates memory for the pointers, not the actual nodes.
As far as your question goes, try to iterate over the tree in an iterative manner, not recursively. Recursion is bad, it's nice only when it's on the paper, not in the code, unfortunately.
Have you considered just increasing your stack size?
In visual studio you do it with /FNUMBER where NUMBER is stack size in bytes. You might also need to specify /STACK:reserve[,commit].
You're going to do quite a few deletes. That will take a lot of time, because you will access memory in a very haphazard way. However, at that point you don't need the tree structure anymore. Hence, I would do this in three steps. In the first step, create a std::vector<CNode*>, reserve() enough space for all nodes in your tree, then walk the tree and copy every CNode* into the vector. In the second step, sort them (!). In the third step, delete all of them. The second step is technically optional but likely makes the third step a lot faster. If not, try sorting in reverse order.
I think in this case a breadth-first cleanup might help, by putting all the back-tracking information into a deque rather than on the OS provided stack. It still won't pleasantly solve the problem of making it happen in the destructor though.
Pseudocode:
void CNode::cleanup()
{
std::deque<CNode*> nodes;
nodes.push_back(this);
while(!nodes.empty())
{
// Get and remove front node from deque.
// From that node, put all non-null children at end of deque.
// Delete front node.
}
}
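A compilable version of that idea is sketched below as a free function. It assumes a simplified CNode whose destructor does not delete its children (otherwise delete would recurse again), a fixed in-place child array instead of the question's heap-allocated one, and a live counter that exists only so the result can be checked. destroyTree and makeChain are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct CNode {
    static std::size_t live;   // number of nodes currently alive (for checking)
    unsigned int count = 0;
    CNode *child[10] = {};     // all children start out null
    CNode() { ++live; }
    ~CNode() { --live; }       // deliberately does NOT delete the children
};
std::size_t CNode::live = 0;

// Breadth-first cleanup: the pending work lives in a deque on the heap, so
// the depth of the tree no longer affects the call stack.
void destroyTree(CNode *root)
{
    if (root == nullptr)
        return;
    std::deque<CNode *> nodes;
    nodes.push_back(root);
    while (!nodes.empty()) {
        CNode *n = nodes.front();  // get and remove the front node
        nodes.pop_front();
        for (CNode *c : n->child)  // queue all non-null children
            if (c != nullptr)
                nodes.push_back(c);
        delete n;                  // safe: its children are already queued
    }
}

// Hypothetical helper: builds a degenerate tree that is `depth` links deep.
CNode *makeChain(std::size_t depth)
{
    CNode *root = new CNode;
    CNode *cur = root;
    for (std::size_t i = 0; i < depth; ++i) {
        cur->child[0] = new CNode;
        cur = cur->child[0];
    }
    return root;
}
```

A chain that is 100000 links deep is the kind of shape that can overflow the stack with a recursive destructor; here it only ever keeps one pointer in the deque at a time.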