In the current C++ STL, where are red-black trees used? (I assume map and set use them?) Is the red-black tree equivalent to a 2-3 tree (i.e. only the left or the right child can be red) or a 2-3-4 tree (i.e. both the left and right child can be red)? Is there a red-black tree library in the STL?
std::map, std::multimap, std::set and std::multiset are often implemented in terms of red-black trees but doing so is not mandated by the standard. Since using a red-black tree is not required there is also no requirement for any particular flavor of RB tree.
I believe (though am not certain) that SGI's STL (upon which much of the original standard library is based) actually does have a red-black tree available. If it helps, I know boost::intrusive does have a reusable red-black tree implementation.
Boost provides boost::container::set/map/multiset/multimap where the underlying binary-search-tree (BST) can be configured, and it can be chosen to be an AVL tree.
One reason (maybe the most crucial one) to prefer AVL trees over red-black trees is the merge and split operations with O(log N) complexity. However, to my surprise, boost::container doesn't seem to provide these operations. The documentation describes merge as an element-wise operation of O(N log N) complexity (regardless of the underlying BST implementation!?), and it doesn't mention split at all!
I can't speak to merge, but for split I assume its absence might be justified by the constant-time size issue: a split of complexity O(log N) would not know the sizes of the two resulting parts. But this could be fixed by using an intrusive container and storing the subtree node count in each node.
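To make the subtree-count idea concrete, here is a minimal sketch. It assumes a plain, unbalanced binary search tree; `Node`, `bst_insert` and `count_less` are hypothetical names invented for illustration and are not part of Boost or the standard library. A real AVL tree would also have to repair the counts after every rotation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical node type: each node stores the size of its own subtree.
struct Node {
    int key;
    std::size_t count = 1;  // number of nodes in the subtree rooted here
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

std::size_t subtree_size(const std::unique_ptr<Node>& n) {
    return n ? n->count : 0;
}

// Unbalanced BST insert that refreshes `count` on the way back up.
void bst_insert(std::unique_ptr<Node>& n, int key) {
    if (!n) { n = std::make_unique<Node>(key); return; }
    if (key == n->key) return;  // set semantics: ignore duplicates
    bst_insert(key < n->key ? n->left : n->right, key);
    n->count = 1 + subtree_size(n->left) + subtree_size(n->right);
}

// Number of keys strictly less than `key`. This is the size the left part
// would have after a split at `key`, computed in O(height) steps.
std::size_t count_less(const std::unique_ptr<Node>& n, int key) {
    if (!n) return 0;
    if (key <= n->key) return count_less(n->left, key);
    return subtree_size(n->left) + 1 + count_less(n->right, key);
}
```

With counts like these, the sizes of both halves of a split can be read off during the descent instead of being recomputed by walking the results.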
There is also boost::intrusive::avl_set, but I couldn't find the AVL merge and split algorithms in the documentation.
So the questions are.
Is there a fully functional, ready-to-go AVL-based implementation of set/map/multiset/multimap that provides merge and split operations with O(log N) complexity?
If not, how can I build one using boost::intrusive::avl_set?
Inspired by this question: Why isn't std::set just called std::binary_tree? I came up with one of my own. Is a red-black tree the only possible data structure fulfilling the requirements of std::set, or are there others? For instance, another self-balancing tree, the AVL tree, seems to be a good alternative with very similar properties. Is it theoretically possible to replace the underlying data structure of std::set, or is there a group of requirements that makes a red-black tree the only viable choice?
AVL trees have worse performance (not to be confused with asymptotic complexity) than RB trees in most real world situations. You can base std::set on AVL trees and be fully standard-compliant, but it will not win you any customers.
If we insert random integers into a std::set and then read the set, we get an ordered sequence. Basically, we have implicit sorting. However, what kind of sorting algorithm do we have here? Is it heapsort?
At least normally, it's a tree sort. That is, the items are inserted into a balanced binary search tree (usually a red-black tree), and that tree is traversed in order.
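As a minimal sketch of that tree sort, the helper below (a hypothetical `tree_sort`, not a standard function) inserts everything into a std::multiset and reads it back in order. std::multiset is used rather than std::set only so that duplicate values are kept.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Tree sort: put every value into an ordered associative container,
// then traverse the container from begin() to end().
std::vector<int> tree_sort(const std::vector<int>& input) {
    std::multiset<int> tree(input.begin(), input.end());  // n inserts, O(log n) each
    return std::vector<int>(tree.begin(), tree.end());    // in-order traversal, O(n)
}
```

Calling `tree_sort({5, 1, 4, 1, 3})` yields the values in ascending order, for a total cost of O(n log n).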
std::set and std::map are usually implemented using self-balancing binary search trees, usually red-black trees because they tend to be the fastest in practice. For detailed information about these data structures, you might want to consult a textbook such as Introduction to Algorithms by Cormen et al. or Algorithms by Sedgewick.
The C++ standard doesn't enforce any kind of sorting algorithm for std::set or std::map. So their implementations might differ among different platforms.
With that said, they are commonly implemented as a red-black tree, which is a self-balancing binary search tree. They don't sort their contents; they maintain the order of their contents as new items are inserted. Inserting a single item into them is usually O(log n).
Has anyone seen an implementation of the STL where std::set is not implemented as a red-black tree?
The reason I ask is that, in my experiments, B-trees outperform std::set (and other red-black tree implementations) by a factor of 2 to 4 depending on the value of B. I'm curious if there is a compelling reason to use red-black trees when there appear to be faster data structures available.
Some folks over at Google actually built a B-tree based implementation of the C++ standard library containers. They seem to have much better performance than standard binary tree implementations.
There is a catch, though. The C++ standard guarantees that deleting an element from a map or set only invalidates other iterators pointing to the same element in the map or set. With the B-tree based implementation, due to node splits and consolidations, the erase member functions on these new structures may invalidate iterators to other elements in the tree. As a result, these implementations aren't perfect replacements for the standard implementations and couldn't be used in a conformant implementation.
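The guarantee in question can be illustrated with a small check on std::set. This is only a sketch: `iterators_survive_erase` is a made-up helper name, and it demonstrates the behaviour of the standard container, not of the B-tree containers.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Holds iterators to two elements, erases a third element that sits
// between them, and checks that both iterators are still usable.
bool iterators_survive_erase() {
    std::set<int> s = {10, 20, 30, 40};
    auto before = s.find(10);
    auto after = s.find(30);
    s.erase(20);  // only iterators to the erased element are invalidated
    // Both iterators still refer to their elements, and they are now adjacent.
    return *before == 10 && *after == 30 && std::next(before) == after;
}
```

A node-based tree can offer this because erasing unlinks one node and leaves every other node where it is. A B-tree stores several elements per node and may move them during erase, which is why it cannot make the same promise.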
Hope this helps!
There is at least one implementation based on AVL trees instead of red-black trees.
I haven't tried to verify conformance of this implementation, but (unlike a B-tree based implementation) it at least could be written to conform perfectly to the requirements of the standard.
I understand that my STL (that comes with g++ 4.x.x) uses red-black trees to implement containers such as the map. Is it possible to use the STL's internal red-black tree directly? If so, how? If not, why not? Why does the STL not expose the red-black tree?
Surprisingly, I cannot find an answer using google.
Edit: I'm investigating using the red-black tree as a solution to the extra allocator constructor call on insertion. See this question. My STL uses red-black trees for map implementation.
Actually, the answer is very simple, and independent of your version of gcc. You can download the STL source code from SGI's website and see the implementation and its use for yourself.
For example, in version 3.2, you can see the red-black tree implementation in the stl_tree.h file, and an example of its use in stl_set.h.
Note that since the stl classes are template classes, the implementations are actually inside the header files.
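For illustration only, here is a sketch of what using the internal tree directly can look like with GCC's libstdc++, which descends from the SGI code. It assumes the internal names `std::_Rb_tree`, `std::_Identity` and `_M_insert_unique`; these are undocumented implementation details that may change between releases, so the sketch falls back to std::set on any other standard library. `IntTree`, `tree_insert` and `contents` are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>
#include <set>     // on libstdc++ this pulls in the internal tree header
#include <vector>

#if defined(__GLIBCXX__)
// libstdc++ only: internal class template, same shape std::set uses inside.
using IntTree = std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>>;
inline bool tree_insert(IntTree& t, int v) { return t._M_insert_unique(v).second; }
#else
// Any other standard library: use the public container instead.
using IntTree = std::set<int>;
inline bool tree_insert(IntTree& t, int v) { return t.insert(v).second; }
#endif

// Copies the tree's elements out in sorted order.
std::vector<int> contents(const IntTree& t) {
    return std::vector<int>(t.begin(), t.end());
}
```

Code like this is tied to one library and possibly one version of it, which is a good reason to prefer the public containers or a third-party tree.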
Most STL implementations of set and map are red black trees I believe, though nothing is stopping someone from implementing them using a different data structure - if I remember correctly, the C++ standard does not require a RB tree implementation.
The STL does not expose it, as that would violate OOP principles. Exposing the underlying data structure could lead to undesired behavior if someone else were to use your library. That is, specifically for set and map, you should only be allowed access to methods that conform to the set and map data structures. Exposing the underlying representation could perhaps let a user put duplicates inside a set, which is bad.
That being said, there is no way (to my knowledge) to directly use the underlying red-black tree. It would depend a lot on how you want to use it. Implementing your own red-black tree would most likely be your best bet, or check out third-party libraries (perhaps Boost?).
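As a small illustration of the invariant the public interface protects, the sketch below goes through std::set::insert, which reports whether the value was actually added. `insert_once` is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <set>

// Returns true if `value` was newly inserted, false if the set already
// contained it. The set itself refuses the duplicate.
bool insert_once(std::set<int>& s, int value) {
    return s.insert(value).second;
}
```

Inserting the same value twice leaves the set with a single copy. Raw access to the tree's nodes would bypass that check.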
You're not even given a guarantee that the data structure used will be a red-black tree (e.g., it's been implemented at least once as an AVL tree, and something like B-, B* or B+ tree would probably be fine as well).
As such, the only way to get at the internals would be to look at a specific implementation, and make use of things it doesn't (at least try to) expose publicly.
As to the why: I think mostly because it's an attempt at abstraction, not exposing all the implementation details.