Does the C++ STL contain a Binary Search Tree (BST) implementation, or should I construct my own BST object?
In case the STL contains no BST implementation, are there any libraries available?
My goal is to find the desired record as quickly as possible: I have a list of records (it should not be more than a few thousand), and I do a per-frame search in that list (it's a computer game). I use an unsigned int as the identifier of the record of interest. Whichever way is fastest will work best for me.
What you need is a way to look up some data given a key. With the key being an unsigned int, this gives you several possibilities. Of course, you could use a std::map:
typedef std::map<unsigned int, record_t> my_records;
However, there are other possibilities as well. For example, it's quite likely that a hash map would be even faster than a binary tree. Hash maps are called unordered_map in C++ and are part of the C++11 standard, likely already supported by your compiler/standard library (check your compiler version and documentation). They were first available in TR1 (std::tr1::unordered_map).
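A minimal sketch of the hash-map route, assuming C++11 and the same record_t as above (id stands in for whatever key you look up):

#include <unordered_map>

typedef std::unordered_map<unsigned int, record_t> my_records_hashed;

my_records_hashed records;
// ... fill records ...
my_records_hashed::iterator it = records.find(id); // average O(1) lookup
if (it != records.end()) {
    // it->second is the record for key id
}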
If your keys are rather closely distributed, you might even use a simple array and use the key as an index. When it comes to raw speed, nothing would beat indexing into an array. OTOH, if your key distribution is too random, you'd be wasting a lot of space.
If you store your records as pointers, moving them around is cheap, and an alternative might be to keep your data sorted by key in a vector:
typedef std::vector< std::pair<unsigned int, record_t*> > my_records;
Due to its better data locality, which plays nicely with the processor cache, a simple std::vector often performs better than data structures that theoretically should have the advantage. Its weak spot is inserting into or removing from the middle. However, in this case, on a 32-bit system, that means moving entries of 2x32 bits of POD around, which your implementation will likely do with a fast memory-move routine.
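Lookups in that sorted vector are then a binary search via std::lower_bound; a sketch (the comparator is needed because we compare keys, not whole pairs):

#include <algorithm>

bool key_less(const std::pair<unsigned int, record_t*>& e, unsigned int key)
{
    return e.first < key;
}

record_t* find_record(const my_records& v, unsigned int key)
{
    my_records::const_iterator it =
        std::lower_bound(v.begin(), v.end(), key, key_less);
    return (it != v.end() && it->first == key) ? it->second : 0;
}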
std::set and std::map are usually implemented as red-black trees, which are a variant of binary search trees. The specifics are implementation-dependent, though.
A clean and simple BST implementation in C++:
#include <iostream>

struct node {
    int val;
    node* left;
    node* right;
};

node* createNewNode(int x)
{
    node* nn = new node;
    nn->val = x;
    nn->left = nullptr;
    nn->right = nullptr;
    return nn;
}

void bstInsert(node*& root, int x)
{
    if (root == nullptr) {
        root = createNewNode(x);
        return;
    }
    if (x < root->val)
        bstInsert(root->left, x);    // recursion creates the node once the child is null
    else if (x > root->val)
        bstInsert(root->right, x);   // equal values are silently ignored
}

int main()
{
    node* root = nullptr;
    int x;
    while (std::cin >> x) {
        bstInsert(root, x);
    }
    return 0;   // nodes are leaked here; a real program would free the tree
}
The STL's set class is typically implemented as a BST. It's not guaranteed (the only thing that is guaranteed is its signature: template <class Key, class Compare = less<Key>, class Allocator = allocator<Key>> class set;), but it's a pretty safe bet.
Your post says you want speed (presumably for a tighter game loop).
So why waste time on these slow-as-molasses O(lg n) structures? Go straight for a hash map implementation.
I am writing a tree container at the moment (just for understanding and training), and so far I have a first, very basic approach to adding elements to the tree.
This is my tree code so far. No destructor, no cleanup, and no element access yet.
template <class T> class set
{
public:
    struct Node
    {
        Node(const T& val)
            : left(0), right(0), value(val)
        {}
        Node* left;
        Node* right;
        T value;
    };
    set()
        : m_Root(nullptr)   // the root must start out null
    {}
    void add(const T& value)
    {
        if (m_Root == nullptr)
        {
            m_Root = new Node(value);
            return;   // tree was empty, nothing left to do
        }
        Node* next = nullptr;
        Node* current = m_Root;
        do
        {
            if (next != nullptr)
            {
                current = next;
            }
            next = value >= current->value ? current->left : current->right;
        } while (next != nullptr);
        if (value >= current->value)
            current->left = new Node(value);
        else
            current->right = new Node(value);
    }
private:
    Node* m_Root;
};
Well, now I tested the add performance against the insert performance of a std::set with unique and balanced (low and high) values, and came to the conclusion that the performance is simply awful.
Is there a reason why the set inserts values that much faster, and what would be a decent way of improving the insert performance of my approach? (I know there might be better tree models, but as far as I know, insert performance should be broadly similar across most tree models.)
Under an i5 4570 at stock clocks, std::set needs 0.013 s to add 1,000,000 int16 values; my set needs 4.5 s to add the same values. Where does this big difference come from?
Update:
All right, here is my test code:
int main()
{
    int n = 1000000;
    test::set<test::int16> mset; // my set
    std::set<test::int16> sset;  // std set
    std::timer timer;            // simple wrapper for clock()
    test::random_engine engine(0, 500000); // simple wrapper for rand(); yes, it's seeded, and yes, I am aware that an int16 will overflow
    std::set<test::int16> values; // set of values, to ensure unique values
    bool flip = false;
    for (int i = 0; n > i; ++i)
    {
        values.insert(flip ? engine.generate() : 0 - engine.generate());
        flip = !flip; // ensure that we get high and low values and no straight line, but at least 2 paths
    }
    timer.start();
    for (std::set<test::int16>::iterator it = values.begin(); values.end() != it; ++it)
    {
        mset.add(*it);
    }
    timer.stop();
    std::cout << timer.totalTime() << "s for mset\n";
    timer.reset();
    timer.start();
    for (std::set<test::int16>::iterator it = values.begin(); values.end() != it; ++it)
    {
        sset.insert(*it);
    }
    timer.stop();
    std::cout << timer.totalTime() << "s for std\n";
}
The set won't store every value due to duplicates, but both containers will get a high number of values, and the same values in the same order, to ensure representative results. I know the test could be more accurate, but it should give some comparable numbers.
The std::set implementation usually uses a red-black tree data structure. It's a self-balancing binary search tree, and its insert operation is guaranteed to be O(log(n)) in the worst case (that is required by the standard). You use a simple binary search tree with an O(n) worst-case insert operation.
If you insert unique random values, such a big difference looks suspicious. But don't forget that randomness will not make your tree balanced, and the height of the tree could be much bigger than log(n).
Edit
It seems I found the main problem with your code. You store all generated values in a std::set, and then you add them to the sets in increasing order. That degrades your tree to a linked list.
The two obvious differences are:
the red-black tree (probably) used in std::set rebalances itself to put an upper bound on worst-case behaviour, exactly as DAle says.
If this is the problem, you should see it when plotting N (number of nodes inserted) against time-per-insert. You could also keep track of tree depth (at least for debugging purposes), and plot that against N.
the standard containers use an allocator which probably does something smarter than newing each node individually. You could try using std::allocator in your own container to see if that makes a significant improvement.
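For illustration, a toy pool that carves nodes out of big blocks instead of doing a separate new per node; this is only a sketch of the idea (assuming a default-constructible node type), not what any standard library actually does:

#include <cstddef>
#include <memory>
#include <vector>

template <class NodeT>
class NodePool {
    enum { BLOCK = 1024 };                        // nodes allocated per block
    std::vector<std::unique_ptr<NodeT[]> > blocks;
    std::size_t used;                             // nodes used in current block
public:
    NodePool() : used(BLOCK) {}
    NodeT* allocate() {                           // no per-node heap allocation
        if (used == BLOCK) {
            blocks.emplace_back(new NodeT[BLOCK]);
            used = 0;
        }
        return &blocks.back()[used++];
    }
    // all nodes are freed together when the pool is destroyed
};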
Edit 1: if you implemented a pool allocator, that's relevant information that should have been in the question.
Edit 2: now that you've added your test code, there's an obvious problem which means your set will always have the worst-case performance for insertion. You pre-sorted your input values! std::set is an ordered container, so putting your values in there first guarantees you always insert in increasing value order, so your tree (which does not self-balance) degenerates to an expensive linked list, and your inserts are always linear rather than logarithmic time.
You can verify this by storing your values in a vector instead (just using the set to detect collisions), or using an unordered_set to deduplicate without pre-sorting.
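A sketch of the first option, reusing the question's wrappers (the std::vector preserves the random generation order, so neither container receives sorted input):

std::vector<test::int16> values;   // keeps the random insertion order
std::set<test::int16> seen;        // only used to reject duplicates
bool flip = false;
for (int i = 0; n > i; ++i)
{
    test::int16 v = flip ? engine.generate() : 0 - engine.generate();
    if (seen.insert(v).second)     // .second is true if v was not seen before
        values.push_back(v);
    flip = !flip;
}
// now time mset.add(...) and sset.insert(...) while iterating over values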
I am trying to build a binary search tree; however, it is vital for the algorithm I am implementing to do so within a vector, to reduce cache misses. My original idea was to adapt something similar to the heap insertion technique, since the data placement is the same and, once you add an item, you need to bubble it up the branch to make sure the properties of each data structure are respected (hence the O(log n) complexity).
However, adapting the insert function has proven trickier than anticipated.
This is the original working code for the binary heap:
template <typename DataType>
void BinHeap<DataType>::Insert(const DataType& value)
{
    data.push_back(value);
    if (data.size() > 1)
    {
        BubbleUp(data.size() - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleUp(unsigned pos)
{
    if (pos == 0) return;   // the root (index 0) has no parent to bubble to
    int parentPos = Parent(pos);
    if (data[parentPos] < data[pos])
    {
        std::swap(data[parentPos], data[pos]);
        BubbleUp(parentPos);
    }
}
And here is my attempt to adapt it into a vector-based Binary Search Tree (please do not mind the odd naming of the class, as this is still not the final version):
template <typename DataType>
void BinHeap<DataType>::Insert(const DataType& value)
{
    data.push_back(value);
    if (data.size() > 1)
    {
        BubbleUp(data.size() - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleUp(unsigned pos)
{
    int parentPos = Parent(pos);
    bool isLeftSon = LeftSon(parentPos) == pos;
    if (parentPos >= 0)
    {
        if (isLeftSon && (data[parentPos] < data[pos]))
        {
            std::swap(data[parentPos], data[pos]);
        }
        else if (data[parentPos] > data[pos]) // right son
        {
            std::swap(data[parentPos], data[pos]);
        }
        BubbleUp(parentPos - 1);
        BubbleDown(parentPos - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleDown(unsigned pos)
{
    int leftChild = LeftSon(pos);
    int rightChild = RightSon(pos);
    bool leftExists = leftChild < data.size() && leftChild > 0;
    bool rightExists = rightChild < data.size() && rightChild > 0;
    // No children
    if (!leftExists && !rightExists)
    {
        return;
    }
    if (leftExists && data[pos] < data[leftChild])
    {
        std::swap(data[leftChild], data[pos]);
    }
    else if (rightExists && data[pos] > data[rightChild])
    {
        std::swap(data[rightChild], data[pos]);
    }
}
This approach is able to guarantee that the properties of the BST are respected locally, but not across siblings or ancestors (grandparents, etc.). For example, if every number from 1 to 16 is inserted in order, 12 will have a left child of 6 and a right child of 14. However, its parent 16 will have a left child of 8 and a right child of 12 (thus 6 is in the right subtree of 16). I feel my current approach is overcomplicating the process, but I am not sure how to rearrange it to make the necessary changes in an efficient manner. Any insight would be greatly appreciated.
The realistic answer to the question title (which, at the time I composed this answer, is "How to create the insert function for a binary search tree built with a vector?") is: Don't do that!
It is clear from your code that you are trying to preserve the compact storage and self-balancing properties of a heap while also wishing it to be searchable via classic left/right child tree navigation. But, the heap trick of using (index-1)/2 to locate the parent node only works for a "perfectly balanced" tree. That is, the N element array is perfectly packed from 0 to N-1. And then, you expect an in-order walk of this tree to be sorted (if you didn't, then your binary left/right search navigation would not be able to find the right node).
Thus, you are maintaining a sorted set of elements in your array. Except, you have some strange rules for how to navigate the array to get the sorted order.
There is no way your scheme can maintain a binary sorted array more simply than a scheme that maintains a plain sorted array. The node manipulations only lead to a complicated piece of software that is difficult to understand, to maintain, and to reason about correctly. A sorted array, on the other hand, is easy to understand and maintain, and it is easy to see how it leads to a correct result. The binary search (or, optionally, dictionary search) is fast.
While maintaining a sorted array requires a linear insertion logic, your scheme must be at least as complex, because it is also maintaining a sorted set of elements in the array.
If you want a data structure that is hardware data cache friendly, and provides logarithmic insertion and search, use a B+-tree. It is a little more complex than your average data structure, but this is a case where the complexity can be worth it. Especially if regular trees just cause too much data cache thrash. As a bit of advice, optimal performance usually results if an interior node (with keys) is sized to fit within a cache line or two.
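For a sense of the node-sizing advice, an illustrative interior-node layout (the numbers are examples only; real fan-out depends on key size, pointer size, and padding):

struct BPlusInterior {
    int   nkeys;          // number of keys currently in use
    int   keys[15];       // 15 * 4 B = 60 B: roughly one 64-byte cache line
    void* children[16];   // one more child than keys, as usual for a B+-tree
};

Scanning keys[] touches only one or two cache lines before a single child pointer is followed; that small per-level cost is what makes B+-trees cache friendly.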
I really don't understand the Big Picture, or overall view, of what you are trying to accomplish. There are many existing functions, and libraries that perform the functionality that I think you want.
Efficient Data Search
Since you are using a vector, placing a B-Tree into a vector seems moot. The general situation is that you maintain a sorted vector and perform a binary_search, upper_bound, or lower_bound on the array. Provided that your program is performing more searches than inserts, this should be faster than traversing a binary tree inside an array.
The maintenance is much easier using an array of sorted values than performing maintenance on a Balanced Binary Tree. The maintenance consists of appending a value, then sorting.
The sorted vector technique also uses less memory. There is no need for child links, so you save two indices for every slot in the vector.
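A sketch of the sorted-vector technique (using insertion at std::lower_bound, which keeps the vector sorted in a single linear pass, as an alternative to append-then-sort):

#include <algorithm>
#include <vector>

typedef std::vector<int> sorted_vec;   // int keys, for illustration only

void insert_sorted(sorted_vec& v, int key)
{
    // logarithmic search for the insertion point, linear move to make room
    v.insert(std::lower_bound(v.begin(), v.end(), key), key);
}

bool contains(const sorted_vec& v, int key)
{
    return std::binary_search(v.begin(), v.end(), key);
}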
Using A Binary Tree
There are many examples on the 'net for using Binary Trees. Looks like you want to maintain a balanced tree, so you should search the web for "c++ example balanced binary tree array". If the examples use pointers, replace the pointers by an index.
Trees are complex data structures and have maintenance overhead. I'm not sure if balancing a tree in a vector is slower than sorting a vector; but it is usually more work and more complex.
Usage Criteria
With modern computers executing instructions in nanoseconds, search performance differences only become noticeable with huge amounts of data. For small amounts of data, a linear search may even be faster than a binary search, due to the overhead costs in a binary search.
Similarly with a binary tree versus a sorted array: the overhead of node processing may be more than the overhead of sorting a vector, and the difference only becomes noticeable with large amounts of data.
Development time is crucial. The time spent developing a specialized binary tree inside a vector is definitely more than using std::vector, std::sort, and std::lower_bound, which are already implemented and tested. While you are developing this specialized algorithm and data structure, another person using a sorted vector could be finished and on to another project by the time you finish your development.
I am reading about hashing in Robert Sedgewick's book on Algorithms in C++:
We might be using a header node to streamline the code for insertion into an ordered list, but we might not want to use M header nodes for individual lists in separate chaining. Indeed, we could even eliminate the M links to the lists by having the first nodes in the lists comprise the table.
class ST
{
    struct node
    {
        Item item;
        node* next;
        node(Item x, node* t)
        { item = x; next = t; }
    };
    typedef node* link;
private:
    link* heads;
    int N, M;
    Item searchR(link t, Key v)
    {
        if (t == 0) return nullItem;
        if (t->item.key() == v) return t->item;
        return searchR(t->next, v);
    }
public:
    ST(int maxN)
    {
        N = 0; M = maxN / 5;
        heads = new link[M];
        for (int i = 0; i < M; i++) heads[i] = 0;
    }
    Item search(Key v)
    { return searchR(heads[hash(v, M)], v); }
    void insert(Item item)
    {
        int i = hash(item.key(), M);
        heads[i] = new node(item, heads[i]); N++;
    }
};
My two questions on the above text are:
"We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table." How can we modify the above code for this?
"We might not want to use M header nodes for individual lists in separate chaining." What does this statement mean?
"We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table."
Consider Node* x[n] versus Node x[n]: the former needs an extra pointer, plus memory allocated on insertion for the head Node of every non-empty element, and an extra indirection for every hash table operation. The latter eliminates the n pointers, but requires that any unused elements can be put into some discernible not-in-use state (tracking of which may or may not require extra memory), and if sizeof(Node) is greater than sizeof(Node*), it may be more wasteful of memory anyway. The difference in memory use can also affect cache efficiency: if the table has a high element-to-bucket ratio, then a Node[] gets the Node data into fewer contiguous memory pages, and iterating (in unsorted order) is very cache efficient, whereas Node*[] will jump to separate memory allocations that might be scattered all over the place (or, on the other hand, might actually be quite close together in some usefully correlated way, e.g. if both access patterns and dynamic memory allocation addresses correlate with the chronological time of object creation).
How can we modify the above code for this?
First, note a problem with your existing insert: heads[i] = new node(item, heads[i]); always prepends a new node without checking whether an item with the same key is already in the chain, so inserting the same key twice leaves duplicate entries in the list.
The design change discussed needs:
link* heads;
...changed to...
node* head;
You'd initialise it like this:
head = new node[M];
Which needs an extra node constructor (if item has an equivalent default constructor, you can leave out its initialisation below)
node() : item(nullItem), next(nullptr) { }
Then there are some knock-on changes to the rest of your code that are easy to work through; basically, you're getting rid of a layer of pointers.
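For illustration, a hedged sketch of how insert might look after the change; I am assuming a bool used flag added to node to mark empty slots (the book's code would presumably test against nullItem instead):

void insert(Item item)
{
    int i = hash(item.key(), M);
    if (!head[i].used)
    {
        head[i].item = item;   // the first node lives in the table itself
        head[i].used = true;
    }
    else
    {
        // further nodes are chained behind the in-table head node
        head[i].next = new node(item, head[i].next);
    }
    N++;
}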
"we might not want to use M header nodes for individual lists in separate chaining." What does this statement mean.
I didn't write it, so I can't say authoritatively, but it appears to be saying that when designing the list code, a decision might have been made to have an initial Node even in an empty list, as this simplifies code for several list operations. While the extra data-less Node might seem a reasonable price for "usual" uses of a list, hash tables are unusual in that you want most of the lists chained off the buckets to have 0 or 1 elements, with exponentially fewer having longer chains. So, such a list implementation is poorly suited to use in a hash table.
I'm working on an octree implementation where the tree nodes are templated with their dimensional length (as a power of 2):
template<long N>
struct node_t {
    enum { DIM = 1 << N };
    node_t<N+1>* parent;
    node_t<N-1>* children[8];
    long count;
};
And specialized for N = 0 (leaves) to point to data:
template<>
struct node_t<0> {
    enum { DIM = 1 };
    node_t<1>* parent;
    data_t data;
    long count;
};
(Aside: I suppose I probably also need a specialization for N_MAX that excludes a parent pointer, or else C++ will generate types of increasing N ad infinitum? But that's not really relevant to my question.)
I'd like to create a function that steps along a ray in the 3D space that my octree occupies, so ostensibly I could just keep a pointer to the root node (which has a known type) and traverse the octree from the root at every step. However, I would prefer a more 'local' option, in which I can keep track of the current node so that I can start lower in the tree when possible and thus avoid unnecessarily traversing the upper nodes of the octree.
But I don't know what the type of that pointer could be (or any other way of implementing this) such that I don't experience slicing.
I'm not tied down to templates, as the dimension can simply be implemented as a const long member. But then I don't know how to make it so that the leaves have a different child type than the interior nodes.
Thanks for your help!
Update
The reason I'd like to do it this way rather than something similar to this is because of the count variable in each node: if the count is 0, I'd like to jump over the whole cube, rather than wasting time going through leaves that I know to be empty. (This is for a raytracing voxel engine.)
As much as I love templates, your code might actually be simpler with:
class node {
    node* parent;   // NULL for root node
    long dim;
    long count;
    virtual void rayTrace(Ray) = 0;
};

class leafNode : public node {
    data_t data;
    virtual void rayTrace(Ray);
};

class nonLeafNode : public node {
    vector<node*> children;
    virtual void rayTrace(Ray);
};
This has the advantage that the tree can be whatever depth you want, including some subtrees can be deeper than others. It has the downside that dim must be computed at runtime, but even that has the silver lining that you can make it a double if your tree gets really big.
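To connect this to the count-based skipping from the question, a hypothetical, self-contained sketch; Ray and rayHitsCube are placeholders I made up, not part of the answer:

#include <cstddef>
#include <vector>

struct Ray {};   // placeholder

struct node {
    long count;  // number of filled voxels at or below this node
    virtual void rayTrace(Ray) = 0;
    virtual ~node() {}
};

// Placeholder: a real engine would test the ray against the child's cube.
static bool rayHitsCube(const Ray&, const node*) { return true; }

struct nonLeafNode : node {
    std::vector<node*> children;
    void rayTrace(Ray r) {
        if (count == 0) return;   // whole cube known to be empty: skip it
        for (std::size_t i = 0; i < children.size(); ++i)
            if (children[i] && rayHitsCube(r, children[i]))
                children[i]->rayTrace(r);
    }
};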
I'm currently working on building a hash table in order to calculate word frequencies, with attention to the running time of the data structure: O(1) insertion, O(n) worst-case lookup, and so on.
I've asked a few people about the difference between std::map and a hash table, and I've received an answer like this:
"std::map adds the element as a binary tree thus causes O(log n) where with the hash table you implement it will be O(n)."
Thus I've decided to implement a hash table using an array of linked lists (for separate chaining). In the code below I've assigned two values to each node: the key (the word) and the value (its frequency). It works as follows: when a node is added, if the index is empty, it is inserted directly as the first element of the linked list with a frequency of 0. If the word is already in the list (which unfortunately takes O(n) time to find), its frequency is incremented by 1. If it is not found, it is simply added to the beginning of the list.
I know there are a lot of flaws in the implementation, so I would like to ask the experienced people here: in order to calculate frequencies efficiently, how can this implementation be improved?
Code I've written so far:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;

struct Node {
    string word;
    int frequency;
    Node* next;
};

class linkedList
{
private:
    friend class hashTable;
    Node* firstPtr;
    Node* lastPtr;
    int size;
public:
    linkedList()
    {
        firstPtr = lastPtr = NULL;
        size = 0;
    }
    void insert(string word, int frequency)
    {
        Node* newNode = new Node;
        newNode->word = word;
        newNode->frequency = frequency;
        newNode->next = NULL;   // must be initialised before linking
        if (firstPtr == NULL)
            firstPtr = lastPtr = newNode;
        else {
            newNode->next = firstPtr;
            firstPtr = newNode;
        }
        size++;
    }
    int sizeOfList()
    {
        return size;
    }
    void print()
    {
        if (firstPtr != NULL)
        {
            Node* temp = firstPtr;
            while (temp != NULL)
            {
                cout << temp->word << " " << temp->frequency << endl;
                temp = temp->next;
            }
        }
        else
            printf("%s", "List is empty");
    }
};

class hashTable
{
private:
    linkedList* arr;
    int index, sizeOfTable;
public:
    hashTable(int size) // forced initializer
    {
        sizeOfTable = size;
        arr = new linkedList[sizeOfTable];
    }
    int hash(string key)
    {
        int hashVal = 0;
        for (size_t i = 0; i < key.length(); i++)
            hashVal = 37 * hashVal + key[i];
        hashVal = hashVal % sizeOfTable;
        if (hashVal < 0)
            hashVal += sizeOfTable;
        return hashVal;
    }
    void insert(string key)
    {
        index = hash(key);
        // Search for the key throughout the linked list.
        // If found, increment its frequency by 1.
        for (Node* cur = arr[index].firstPtr; cur != NULL; cur = cur->next)
        {
            if (cur->word == key)
            {
                cur->frequency++;
                return;
            }
        }
        // Not found (or chain empty): add the node to the beginning.
        arr[index].insert(key, 0);
    }
};
Do you care about the worst case? If not, use std::unordered_map (it handles collisions, and you don't want a multimap) or a trie/critbit tree (depending on the keys, it may be more compact than a hash table, which may lead to better caching behavior). If yes, use std::set or a trie.
If you want, e.g., online top-k statistics, keep a priority queue in addition to the dictionary. Each dictionary value contains the number of occurrences and whether the word currently belongs to the queue. The queue duplicates the top-k frequency/word pairs, but keyed by frequency. Whenever you scan a word, check whether it's both (1) not already in the queue and (2) more frequent than the least element in the queue. If so, extract the least queue element and insert the one you just scanned.
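A sketch of that idea (my own illustration, not established code; std::set plays the role of the priority queue so that an in-queue word's entry can be re-keyed when its count grows):

#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

struct TopK {
    std::size_t k;
    std::map<std::string, int> freq;               // word -> occurrences
    std::set<std::pair<int, std::string> > queue;  // (count, word), least first

    explicit TopK(std::size_t k) : k(k) {}

    void scan(const std::string& w) {
        int& f = freq[w];
        std::set<std::pair<int, std::string> >::iterator it =
            queue.find(std::make_pair(f, w));      // in sync if present
        ++f;
        if (it != queue.end()) {                   // already in the top-k: re-key
            queue.erase(it);
            queue.insert(std::make_pair(f, w));
        } else if (queue.size() < k) {             // queue not yet full
            queue.insert(std::make_pair(f, w));
        } else if (f > queue.begin()->first) {     // beats the least element
            queue.erase(queue.begin());
            queue.insert(std::make_pair(f, w));
        }
    }
};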
You can implement your own data structures if you like, but the programmers who work on STL implementations tend to be pretty sharp. I would make sure that's where the bottleneck is first.
1- The search time complexity for std::map and std::set is O(log(n)). The average time complexity for std::unordered_map and std::unordered_set is O(1), with an O(n) worst case. However, the constant cost of hashing can be large, and for small n it can be more than log(n). I always consider this fact.
2- If you want to use std::unordered_map, you need to make sure that std::hash is defined for your type. Otherwise, you should define it.
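For example, a minimal sketch of defining std::hash for your own key type (Point and the hash-combining recipe are illustrative only):

#include <cstddef>
#include <functional>
#include <unordered_map>

struct Point {
    int x, y;
    bool operator==(const Point& o) const { return x == o.x && y == o.y; }
};

namespace std {
    template <>
    struct hash<Point> {
        size_t operator()(const Point& p) const {
            // combine the members' standard hashes; a simple, common recipe
            return hash<int>()(p.x) ^ (hash<int>()(p.y) << 1);
        }
    };
}

int main() {
    std::unordered_map<Point, int> m;
    m[Point{1, 2}] = 42;   // works because std::hash<Point> is now defined
}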