Has anyone seen an implementation of the STL where std::set is not implemented as a red-black tree?
The reason I ask is that, in my experiments, B-trees outperform std::set (and other red-black tree implementations) by a factor of 2 to 4 depending on the value of B. I'm curious if there is a compelling reason to use red-black trees when there appear to be faster data structures available.
Some folks over at Google actually built a B-tree based implementation of the C++ standard library containers. They seem to have much better performance than standard binary tree implementations.
There is a catch, though. The C++ standard guarantees that erasing an element from a map or set invalidates only iterators (and references) to the erased element; iterators to every other element stay valid. With the B-tree based implementation, due to node splits and consolidations, the erase member functions on these new structures may invalidate iterators to other elements in the tree. As a result, these implementations aren't perfect replacements for the standard implementations and couldn't be used in a conformant implementation.
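To make the guarantee concrete, here is a minimal sketch for std::set. The helper name iterator_survives_erase is made up for illustration; it assumes both values are present in the set and differ from each other.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical helper: erases `victim` from `s` and reports whether an
// iterator obtained beforehand for a different element (`kept`) still
// refers to that element and is still correctly linked into the set.
// Assumes `victim` and `kept` are both in `s` and are not equal.
bool iterator_survives_erase(std::set<int>& s, int victim, int kept) {
    auto kept_it = s.find(kept);  // iterator to an element we do not erase
    assert(kept_it != s.end());

    s.erase(victim);  // the tree may rebalance, but other nodes stay put

    // Well-defined for std::set: only iterators to the erased element
    // are invalidated, so kept_it may still be dereferenced and advanced.
    return *kept_it == kept && std::next(kept_it) == s.upper_bound(kept);
}
```

Calling this on a std::set<int> holding 1 through 5 with victim 3 and kept 4 is fine by the standard. The same pattern on a B-tree based container is not guaranteed to work, because an erase there can move other elements between nodes.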
Hope this helps!
There is at least one implementation based on AVL trees instead of red-black trees.
I haven't tried to verify conformance of this implementation, but (unlike a B-tree based implementation) it at least could be written to conform perfectly to the requirements of the standard.
In the current C++ STL, where are red-black trees used? (I assume map and set use them?) Is the red-black tree the 2-3 tree variant (i.e. only the left or right child can be red) or the 2-3-4 tree variant (i.e. both the left and right children can be red)? Is there a red-black tree library in the STL?
std::map, std::multimap, std::set and std::multiset are often implemented in terms of red-black trees but doing so is not mandated by the standard. Since using a red-black tree is not required there is also no requirement for any particular flavor of RB tree.
I believe (though am not certain) that SGI's STL (upon which much of the original standard library is based) actually does have a red-black tree available. If it helps, I know boost::intrusive does have a reusable red-black tree implementation.
Inspired by this question: Why isn't std::set just called std::binary_tree? I came up with one of my own. Is a red-black tree the only possible data structure fulfilling the requirements of std::set, or are there others? For instance, another self-balancing tree, the AVL tree, seems to be a good alternative with very similar properties. Is it theoretically possible to replace the underlying data structure of std::set, or is there a group of requirements that makes a red-black tree the only viable choice?
AVL trees have worse performance (not to be confused with asymptotic complexity) than RB trees in most real world situations. You can base std::set on AVL trees and be fully standard-compliant, but it will not win you any customers.
Is there any STL for segment tree?
In competitive programming it takes a lot of time to code a segment tree. I wonder if there is anything in the STL for that, so that a lot of time could be saved.
I assume by "segment tree" you actually mean range tree, which is more commonly used in programming contests than the more specialized structure for storing a set of intervals.
There is no such container in the C++ standard library but if you are competing in ACM contests you can consider writing your own and simply copying it as needed. You can find my own implementation here (including lazy propagation) but if you search the web you can probably find a more generic version.
In applications where you need the sum instead of the minimum or maximum, you can use a binary indexed tree instead of a segment tree, which is faster, uses less memory, and is also easier to code (about a dozen lines or less).
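For reference, a binary indexed (Fenwick) tree for point updates and prefix sums can be sketched roughly as follows. The class name FenwickTree, its 0-based interface, and the half-open ranges are choices made for this example, not something from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binary indexed (Fenwick) tree over n values, all initially zero.
// The public interface uses 0-based indices; internally the array is 1-based.
class FenwickTree {
public:
    explicit FenwickTree(std::size_t n) : tree_(n + 1, 0) {}

    // Adds delta to the element at index i (0-based).
    void add(std::size_t i, long long delta) {
        assert(i + 1 < tree_.size());
        // (i & (~i + 1)) isolates the lowest set bit of i.
        for (++i; i < tree_.size(); i += i & (~i + 1))
            tree_[i] += delta;
    }

    // Returns the sum of the first i elements, i.e. indices [0, i).
    long long prefix_sum(std::size_t i) const {
        assert(i < tree_.size());
        long long sum = 0;
        for (; i > 0; i -= i & (~i + 1))
            sum += tree_[i];
        return sum;
    }

    // Returns the sum of the elements at indices [lo, hi).
    long long range_sum(std::size_t lo, std::size_t hi) const {
        return prefix_sum(hi) - prefix_sum(lo);
    }

private:
    std::vector<long long> tree_;
};
```

Both add and prefix_sum touch O(log n) entries, and the whole structure is a single vector of n + 1 values.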
The C++ standard library has no segment tree. However, you can check out the Boost Interval Container Library (ICL), which may satisfy your requirements.
If we insert random integers into a std::set and then read the set, we get an ordered sequence. Basically, we have implicit sorting. However, what kind of sorting algorithm do we have here? Is it heapsort?
At least normally, it's a tree sort. That is, the items are inserted into a balanced binary search tree (usually a red-black tree), and that tree is traversed in order.
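As a small illustration of that idea, the sketch below sorts a vector by pushing it through a tree-based container and reading it back. The function name tree_sort is made up here, and std::multiset is used instead of std::set only so that duplicate values are kept.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: sorts by inserting every value into an ordered
// associative container and then traversing that container in order.
// Each insertion costs O(log n), so the whole sort is O(n log n).
std::vector<int> tree_sort(const std::vector<int>& input) {
    // std::multiset keeps duplicates; std::set would silently drop them.
    std::multiset<int> tree(input.begin(), input.end());

    // Iterating from begin() to end() visits the elements in ascending order.
    return std::vector<int>(tree.begin(), tree.end());
}
```

Whether the container is a red-black tree underneath is up to the implementation; the in-order result is what the standard guarantees.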
std::set and std::map are usually implemented using self-balancing binary search trees, usually red-black trees because they tend to be the fastest in practice. For detailed information about these data structures, you might want to consult a textbook such as Introduction to Algorithms by Cormen et al. or Algorithms by Sedgewick.
The C++ standard doesn't enforce any kind of sorting algorithm for std::set or std::map. So their implementations might differ among different platforms.
With that said, they are commonly implemented as a red-black tree, which is a self-balancing binary search tree. They don't sort their contents; they maintain the order of their contents as new items are inserted. Inserting a single item into them is O(log n).
I understand that my STL (that comes with g++ 4.x.x) uses red-black trees to implement containers such as the map. Is it possible to use the STL's internal red-black tree directly? If so, how? If not, why not? Why does the STL not expose the red-black tree?
Surprisingly, I cannot find an answer using google.
Edit: I'm investigating using the red-black tree as a solution to the extra allocator constructor call on insertion. See this question. My STL uses red-black trees for map implementation.
Actually - the answer is very simple, and independent of your version of gcc. You can download the STL source code from SGI's website, and see the implementation and how it is used for yourself.
For example, in version 3.2, you can see the red-black tree implementation in the stl_tree.h file, and an example of its use in stl_set.h.
Note that since the stl classes are template classes, the implementations are actually inside the header files.
Most STL implementations of set and map are red-black trees, I believe, though nothing is stopping someone from implementing them using a different data structure. If I remember correctly, the C++ standard does not require a red-black tree implementation.
The STL does not expose it because that would violate OOP principles. Exposing the underlying data structure could lead to undesired behavior if someone else were to use your library. That is, specifically for set and map, you should only be allowed access to methods that conform to the set and map data structures. Exposing the underlying representation could, for instance, let a user put duplicates inside a set, which is bad.
That being said, there is no way (to my knowledge) to directly use the underlying red-black tree. It would depend a lot on how you want to use it. Implementing your own red-black tree would most likely be your best bet, or check out third-party libraries (perhaps Boost?).
You're not even given a guarantee that the data structure used will be a red-black tree (e.g., it's been implemented at least once as an AVL tree, and something like B-, B* or B+ tree would probably be fine as well).
As such, the only way to get at the internals would be to look at a specific implementation, and make use of things it doesn't (at least try to) expose publicly.
As to the why: I think mostly because it's an attempt at abstraction, not exposing all the implementation details.