I understand that my STL (the one that comes with g++ 4.x.x) uses red-black trees to implement containers such as map. Is it possible to use the STL's internal red-black tree directly? If so, how? If not, why does the STL not expose the red-black tree?
Surprisingly, I cannot find an answer using Google.
Edit: I'm investigating using the red-black tree as a solution to the extra allocator constructor call on insertion. See this question. My STL uses red-black trees for map implementation.
Actually, the answer is very simple, and independent of your version of gcc. You can download the STL source code from SGI's website and see the implementation and its use for yourself.
For example, in version 3.2, you can see the red-black tree implementation in the stl_tree.h file, and an example of its use in stl_set.h.
Note that since the STL classes are template classes, the implementations are actually inside the header files.
Most STL implementations of set and map are red black trees I believe, though nothing is stopping someone from implementing them using a different data structure - if I remember correctly, the C++ standard does not require a RB tree implementation.
The STL does not expose it because that would violate OOP principles. Exposing the underlying data structure could lead to undesired behavior if someone else were to use your library. Specifically, for set and map, you should only be allowed access to methods that conform to the set and map abstractions. Exposing the underlying representation could, for example, let a user put duplicates inside a set, which is bad.
That being said, there is no way (to my knowledge) to use the underlying red-black tree directly. It would depend a lot on how you want to use it. Implementing your own red-black tree would most likely be your best bet, or check out third-party libraries (perhaps Boost?).
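To make the point about duplicates concrete, here is a minimal sketch using only the public std::set and std::multiset interfaces. The helper name size_after_double_insert is made up for this illustration.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: inserts the same value twice into a fresh
// container and reports how many elements it ends up holding.
template <typename Container>
std::size_t size_after_double_insert(int value) {
    Container c;
    c.insert(value);
    c.insert(value);  // std::set ignores this, std::multiset keeps it
    return c.size();
}
```

With std::set<int> the second insert is rejected by the container itself, so the caller cannot break the uniqueness invariant. With std::multiset<int> the same underlying kind of tree is allowed to hold both copies.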
You're not even given a guarantee that the data structure used will be a red-black tree (e.g., it's been implemented at least once as an AVL tree, and something like B-, B* or B+ tree would probably be fine as well).
As such, the only way to get at the internals would be to look at a specific implementation, and make use of things it doesn't (at least try to) expose publicly.
As to the why: I think mostly because it's an attempt at abstraction, not exposing all the implementation details.
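As a sketch of what "making use of things it doesn't expose publicly" can look like, the following pokes at the internal tree of GNU libstdc++ (the library shipped with g++). Everything inside the __GLIBCXX__ branch (std::_Rb_tree, std::_Identity, _M_insert_unique, _M_insert_equal) is an undocumented implementation detail of that one library and may change between releases. The typedef IntTree and the two wrapper functions are names invented for this example, and the #else branch is only a std::multiset stand-in so the code still compiles elsewhere.

```cpp
#include <cstddef>
#include <functional>
#include <set>  // on libstdc++ this also brings in the internal tree header

#if defined(__GLIBCXX__)
// libstdc++ only: _Rb_tree<Key, Value, KeyOfValue, Compare, Alloc>.
// These names are internal and not part of the C++ standard.
typedef std::_Rb_tree<int, int, std::_Identity<int>, std::less<int> > IntTree;

// Insert rejecting duplicates; returns true if the value was added.
inline bool tree_insert_unique(IntTree& t, int v) {
    return t._M_insert_unique(v).second;
}

// Insert allowing duplicates (what std::multiset uses internally).
inline void tree_insert_equal(IntTree& t, int v) {
    t._M_insert_equal(v);
}
#else
// Fallback for other standard libraries: emulate with a public container.
typedef std::multiset<int> IntTree;

inline bool tree_insert_unique(IntTree& t, int v) {
    if (t.find(v) != t.end()) return false;
    t.insert(v);
    return true;
}

inline void tree_insert_equal(IntTree& t, int v) {
    t.insert(v);
}
#endif
```

Note that the same tree object accepts both kinds of insert, which is exactly the sort of invariant-breaking access the public set and map interfaces are designed to prevent.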
Related
Has anyone seen an implementation of the STL where stl::set is not implemented as a red-black tree?
The reason I ask is that, in my experiments, B-trees outperform std::set (and other red-black tree implementations) by a factor of 2 to 4 depending on the value of B. I'm curious if there is a compelling reason to use red-black trees when there appear to be faster data structures available.
Some folks over at Google actually built a B-tree based implementation of the C++ standard library containers. They seem to have much better performance than standard binary tree implementations.
There is a catch, though. The C++ standard guarantees that erasing an element from a map or set only invalidates iterators pointing to the erased element. With the B-tree based implementation, due to node splits and consolidations, the erase member functions on these new structures may invalidate iterators to other elements in the tree. As a result, these implementations aren't perfect replacements for the standard implementations and couldn't be used in a conformant implementation.
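The guarantee in question can be illustrated with a small sketch against std::set. The function name neighbours_survive_erase is made up for this example.

```cpp
#include <set>

// Erases one element and checks that iterators obtained beforehand
// to the *other* elements are still usable afterwards.
bool neighbours_survive_erase() {
    std::set<int> s;
    for (int i = 1; i <= 5; ++i) s.insert(i);

    std::set<int>::iterator two = s.find(2);
    std::set<int>::iterator three = s.find(3);
    std::set<int>::iterator four = s.find(4);

    s.erase(three);  // only `three` is invalidated by this call

    std::set<int>::iterator after_two = two;
    ++after_two;     // should now step from 2 directly to 4

    return *two == 2 && *four == 4 && after_two == four && s.size() == 4;
}
```

A node-based tree can honor this because each element lives in its own node. A B-tree that stores several elements per node may have to move elements between nodes on erase, which is why it cannot give the same promise.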
Hope this helps!
There is at least one implementation based on AVL trees instead of red-black trees.
I haven't tried to verify conformance of this implementation, but (unlike a B-tree based implementation) it at least could be written to conform perfectly to the requirements of the standard.
I want to use vocabulary trees (which are not necessarily binary) in my program. I already have a general idea of how to create a tree class, but I was wondering if there are any C++ libraries that are useful for that purpose. If not, I would like to know about methods I can use to manage my tree faster (adding, removing, and accessing nodes), such as storing the nodes in consecutive memory locations.
thank you
You can use The Boost Graph Library to model all kinds of trees.
Though std::map and std::set get mentioned in oleskii's link, they are binary trees. Any n-ary tree can be rearranged into a binary tree, but that may not help you, since the reorganisation will take time. The Boost Graph Library is more general purpose.
A quick Google search for "n-ary trees C++" just turned up treetree:
"Treetree is a header-only C++ library that implements a generic tree-structured container class according to STL conventions"
If you want to make your current tree implementation faster, you should measure where it is currently slow.
Check simple things, e.g. make sure you pass by reference rather than by copy.
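For the "consecutive memory locations" idea from the question, one possible layout is to keep all nodes in a single std::vector and refer to them by index. This is only a sketch under that assumption. The class FlatTree and its members are invented for illustration, removal is left out, and each node's child list is still a separately allocated vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical n-ary tree whose nodes live in one contiguous vector.
// Nodes are addressed by index, so growing the vector never leaves
// dangling pointers behind.
class FlatTree {
public:
    typedef std::size_t NodeId;

    explicit FlatTree(const std::string& root_label) {
        nodes_.push_back(Node(root_label));
    }

    NodeId root() const { return 0; }

    // Appends a new node and links it under `parent`.
    NodeId add_child(NodeId parent, const std::string& label) {
        NodeId id = nodes_.size();
        nodes_.push_back(Node(label));
        nodes_[parent].children.push_back(id);
        return id;
    }

    // Both accessors take and return by reference to avoid copies.
    const std::string& label(NodeId id) const { return nodes_[id].label; }
    const std::vector<NodeId>& children(NodeId id) const {
        return nodes_[id].children;
    }

    // Counts the node itself plus all of its descendants.
    std::size_t subtree_size(NodeId id) const {
        std::size_t total = 1;
        const std::vector<NodeId>& kids = nodes_[id].children;
        for (std::size_t i = 0; i < kids.size(); ++i)
            total += subtree_size(kids[i]);
        return total;
    }

private:
    struct Node {
        std::string label;
        std::vector<NodeId> children;
        explicit Node(const std::string& l) : label(l) {}
    };
    std::vector<Node> nodes_;
};
```

Whether this is actually faster than a pointer-based tree depends on the access pattern, so it is still worth measuring before and after.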
We are porting our game from C++ to the web; the game makes extensive use of the STL.
Can you provide a short comparison chart (and, if possible, some code samples for basic operations like insertion/deletion/searching and, where applicable, equal_range/binary_search) for the classes that are equivalent to the following STL containers:
std::vector
std::set
std::map
std::list
stdext::hash_map
?
Thanks a lot for your time!
UPD:
Wow, it seems we do not have everything we need here :(
Can anyone point to some industry standard algorithms library for AS3 programs (like boost in C++)?
I can not believe people can write non-trivial software without balanced binary search trees (std::set std::map)!
The choices of data structures are significantly more limited in as3. You have:
Array or Vector.<*> which stores a list of values and can be added to after construction
Dictionary (hash_map) which stores key/value pairs
Maps and sets aren't really supported, as there's no way to override object equality. As for binary search, most search operations take a predicate function that lets you override equality for that search.
Edit: As far as common algorithm and utility libraries, I'd take a look at as3commons
Maybe this library will fit your needs.
Looking for good source code either in C or C++ or Python to understand how a hash function is implemented and also how a hash table is implemented using it.
Very good material on how hash fn and hash table implementation works.
Thanks in advance.
Hashtables are central to Python, both as the 'dict' type and for the implementation of classes and namespaces, so the implementation has been refined and optimised over the years. You can see the C source for the dict object here.
Each Python type implements its own hash function - browse the source for the other objects to see their implementations.
If you want to learn, I suggest you look at the Java implementation of java.util.HashMap. It's clear code, well-documented and comparatively short. Admittedly, it's neither C, nor C++, nor Python, but you probably don't want to read GNU libstdc++'s upcoming implementation of a hashtable, which is dominated by the complexity of the C++ Standard Template Library.
To begin with, you should read the definition of the java.util.Map interface. Then you can jump directly into the details of the java.util.HashMap. And everything that's missing you will find in java.util.AbstractMap.
The implementation of a good hash function is independent of the programming language. The basic task of it is to map an arbitrarily large value set onto a small value set (usually some kind of integer type), so that the resulting values are evenly distributed.
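As one concrete instance of such a mapping, here is a sketch of the 32-bit FNV-1a hash, a well-known simple string hash (it is not what java.util.HashMap or Python uses). The function names fnv1a_32 and bucket_for are chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a: fold each byte into the state with XOR, then
// multiply by the FNV prime to spread the bits around.
std::uint32_t fnv1a_32(const std::string& key) {
    std::uint32_t h = 2166136261u;              // FNV offset basis
    for (std::size_t i = 0; i < key.size(); ++i) {
        h ^= static_cast<unsigned char>(key[i]);
        h *= 16777619u;                         // FNV prime
    }
    return h;
}

// Maps the large hash range onto a small number of buckets.
std::size_t bucket_for(const std::string& key, std::size_t bucket_count) {
    return fnv1a_32(key) % bucket_count;
}
```

The hash function turns arbitrarily long strings into a fixed-size integer, and the modulo step reduces that integer to a table index. Two different keys can land in the same bucket, which is why a hash table also needs a collision strategy.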
There is a problem with your question: there are as many types of hash map as there are uses.
There are many strategies for dealing with hash collisions and reallocation, depending on the constraints you have. You may find an average solution, of course, that will mostly fit, but if I were you I would look at Wikipedia (like Dennis suggested) to get an idea of the subtleties of the various implementations.
As I said, you can mostly think of the strategies in two ways:
Handling hash collisions: buckets (and which kind)? Open addressing? Double hashing? ...
Reallocation: freeze the map, or amortized linear?
Also, do you want baked-in multithreading support? Using atomic operations it's possible to get lock-free multithreaded hash maps, as has been demonstrated in Java by Cliff Click (Google Tech Talk).
As you can see, there is no one-size-fits-all solution. I would consider learning the principles first, then going down to the implementation details.
C++ std::unordered_map uses linked-list buckets and the freeze-the-map strategy; as usual with the STL, no concern is given to synchronization.
Python's dict is the foundation of the language; I don't know which strategies it uses.
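The "linked-list bucket" and "freeze the map" combination can be sketched as below. This is a teaching sketch, not how any particular standard library implements std::unordered_map. The class ChainedMap, its members, the initial bucket count of 8, and the load-factor threshold of 1 are all assumptions made for the example, and it needs C++11 for std::hash.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical string -> int hash map with separate chaining.
class ChainedMap {
    typedef std::list<std::pair<std::string, int> > Bucket;

public:
    ChainedMap() : buckets_(8), size_(0) {}

    // Inserts a new key or overwrites the value of an existing one.
    void put(const std::string& key, int value) {
        Bucket& b = buckets_[index_of(key, buckets_.size())];
        for (Bucket::iterator it = b.begin(); it != b.end(); ++it) {
            if (it->first == key) { it->second = value; return; }
        }
        b.push_back(std::make_pair(key, value));
        ++size_;
        // "Freeze the map": once the load factor exceeds 1, stop and
        // move every entry into a table twice as large.
        if (size_ > buckets_.size()) rehash(buckets_.size() * 2);
    }

    // Returns a pointer to the stored value, or nullptr if absent.
    const int* get(const std::string& key) const {
        const Bucket& b = buckets_[index_of(key, buckets_.size())];
        for (Bucket::const_iterator it = b.begin(); it != b.end(); ++it) {
            if (it->first == key) return &it->second;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }
    std::size_t bucket_count() const { return buckets_.size(); }

private:
    static std::size_t index_of(const std::string& key, std::size_t n) {
        return std::hash<std::string>()(key) % n;
    }

    void rehash(std::size_t n) {
        std::vector<Bucket> fresh(n);
        for (std::size_t i = 0; i < buckets_.size(); ++i) {
            for (Bucket::const_iterator it = buckets_[i].begin();
                 it != buckets_[i].end(); ++it) {
                fresh[index_of(it->first, n)].push_back(*it);
            }
        }
        buckets_.swap(fresh);
    }

    std::vector<Bucket> buckets_;
    std::size_t size_;
};
```

A single put can occasionally be slow because of the full rehash, but doubling the table keeps the average cost per insertion constant. Like the standard containers, this sketch does no locking at all.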
Are there any good tree manipulation (template) libraries for C++ out there that can do basic things like a binary tree?
It is not difficult to write a binary tree from scratch, but I'm really surprised that it is not so easy to find one ready for use.
Trees are a special kind of graph. There are plenty of graph libraries out there, such as the Boost Graph Library. You will have to add your vertices as you want and then use any one of the many visitors to traverse your tree.
Alternatively, you could custom make one with standard containers (think of a root node as containing 2 children that have a value and may have 2 other children).
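A hand-rolled version along those lines might look like the sketch below: each node owns up to two children through std::unique_ptr (C++11). It is an unbalanced binary search tree, and the names TreeNode, bst_insert and in_order are invented for this example.

```cpp
#include <memory>
#include <vector>

// A node holds a value and owns its (possibly empty) two children.
struct TreeNode {
    int value;
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
    explicit TreeNode(int v) : value(v) {}
};

// Inserts `value` in search-tree order; duplicates are ignored.
void bst_insert(std::unique_ptr<TreeNode>& root, int value) {
    if (!root) {
        root.reset(new TreeNode(value));
    } else if (value < root->value) {
        bst_insert(root->left, value);
    } else if (root->value < value) {
        bst_insert(root->right, value);
    }
}

// Appends the values to `out` in ascending order.
void in_order(const std::unique_ptr<TreeNode>& root, std::vector<int>& out) {
    if (!root) return;
    in_order(root->left, out);
    out.push_back(root->value);
    in_order(root->right, out);
}
```

Because there is no rebalancing, inserting already sorted input degrades this tree into a linked list, which is one reason to prefer std::map or std::set when they fit.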
What do you need the tree for? There may already be something in the STL or Boost that satisfies your need. For example: the STL std::map<key,value> is usually implemented as a balanced binary tree.
There is also tree.hh which implements an STL-like n-way tree.
ACE has an implementation of Red Black tree. It is fairly easy to use.