There is plenty of discussion on StackOverflow and other sites on what type of C++ container to use, with the not so shocking conclusion "it depends on your needs".
Currently i'm using std::list on my interfaces, however i have no direct requirement on lists as opposed to vectors or deques; and in there lies my question.
I can't say what my requirements will be down the line. Todays its a list, tomorrow... who knows?
I've been toying with the idea of creating a wrapper class 'Collection' which does nothing more than expose the STL containers interface allowing me to alter the internals without breaking my interfaces if the need arises.
Is this worth the hassle?
Should i just suck it up and make a decision on my current needs?
Any opinions?
Cheers,
Ben
EDIT:
Thread safety is important.
The recompilation of code that consumes the interface is unacceptable.
You should write such class if only you are going to make an option in your program to use different container type or create some kind of run-time optimization but in general you should know what the container is used for and so you know how it's used and that leads to what your needs are.
Don't make a class that you use just because you don't understand different containers because it's a waste of resources. In such case you should learn more about a few main container types, such as list, vector, queue, probably map, and use whenever they are needed. The only reason why there are so many of them is that different situations require different containers to make programming easier and code more efficient. For example lists are good if you put and remove a lot while vector is faster if you do more of reading. Queues are good when there is a need to do things in exact order (priority_queue is the same, by the way, except you can use a specific order), maps are good for hashing current state or something like that.
You should write your code generically. But instead of defining a generic Container, use the STL way of decoupling algorithms from containers (iterators). Since you want to link dynamically, read this article, and you may find some things in boost (any_range...).
If you need a single container and want to change its type quickly, use a typedef as recommended by #icabod.
If you're writing algorithms that should work with different containers selected at compile-time, then implement them as template code on containers, or, if possible, iterators.
Only if you need to select a container type at run-time you should implement a polymorphic Container or Collection class + subclasses.
What C++ library provides Data structures API that match the ones provided by java.util.* as much as possible.
Specifically, I am looking for the following DS and following Utility Functions:-
**DS**: Priority Queue, HashMap, TreeMap, HashSet,
TreeSet, ArrayList, String most importantly.
**Utility**: Arrays.* , Collections.*, Regex, FileHandling etc.
and other converters and algorithms like Binary Search, Sort, NthElement etc.
My guess is that Boost may be able to do all these, but I find it too bulky and is non-trivial to add it into a project, especially, when I want to quickly get started on something and when although the code would require all these data structures, the code overall is not going to be that huge to warrant spending lot of effort in setting up libraries.
An example would be if someone had to write a C++ program to do Network Flow Algorithm for a school assignment. I am sure I could come up with better examples, but this one's on top of my head.
Thanks
Ajay
All of those containers are available in some form in the SC++L:
Priority Queue std::priority_queue (this is actually a container adapter, rather than a container itself - that is, it works "on top of" another container, usually std::vector or std::deque.
HashMap std::unordered_map (or if your compiler doesn't support C++0x, there's boost::unordered_map)
TreeMap std::map
HashSet and TreeSet are basically the same as HashMap and TreeMap, except the key and value are the same thing. However, there's also std::unordered_set and std::set.
ArrayList is the venerable std::vector
String is the venerable std::string. Many of the functions you get in the Java String class can be found in the Boost.Strings library.
Do not be afraid of setting up boost. In my experience, you set it up once and then use it over and over again in all of your projects. Also, all of the libraries that I mentioned above are header-only libraries. That means, you don't actually need to build/install any libraries, just references the headers.
For the other things, I'm not so sure, since I don't know Java all that well. At the end of the day, you're not going to find a library that's "just like Java, except written in C++" because that would be kind of pointless. A C++ library is written to play to C++'s strength, a Java library is written to play to Java's strengths. To try and shoehorn a library designed for one language into another doesn't make sense to me.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I am a Java developer trying to learn C++. I have many times read on the internet (including Stack Overflow) that STL is the best collections library that you can get in any language. (Sorry, I do not have any citations for that)
However after studying some STL, I am really failing to see what makes STL so special. Would you please shed some light on what sets STL apart from the collection libraries of other languages and make it the best collection library?
What is so great about the STL ?
The STL is great in that it was conceived very early and yet succeeded in using C++ generic programming paradigm quite efficiently.
It separated efficiently the data structures: vector, map, ... and the algorithms to operate on them copy, transform, ... taking advantage of templates to do so.
It neatly decoupled concerns and provided generic containers with hooks of customization (Comparator and Allocator template parameters).
The result is very elegant (DRY principle) and very efficient thanks to compiler optimizations so that hand-generated algorithms for a given container are unlikely to do better.
It also means that it is easily extensible: you can create your own container with the interface you wish, as long as it exposes STL-compliant iterators you'll be able to use the STL algorithms with it!
And thanks to the use of traits, you can even apply the algorithms on C-array through plain pointers! Talk about backward compatibility!
However, it could (perhaps) have been better...
What is not so great about the STL ?
It really pisses me off that one always have to use the iterators, I'd really stand for being able to write: std::foreach(myVector, [](int x) { return x+1;}); because face it, most of the times you want to iterate over the whole of the container...
But what's worse is that because of that:
set<int> mySet = /**/;
set<int>::const_iterator it = std::find(mySet.begin(), mySet.end(), 1005); // [1]
set<int>::const_iterator it = mySet.find(1005); // [2]
[1] and [2] are carried out completely differently, resulting in [1] having O(n) complexity while [2] has O(log n) complexity! Here the problem is that the iterators abstract too much.
I don't mean that iterators are not worthy, I just mean that providing an interface exclusively in terms of iterators was a poor choice.
I much prefer myself the idea of views over containers, for example check out what has been done with Boost.MPL. With a view you manipulate your container with a (lazy) layer of transformation. It makes for very efficient structures that allows you to filter out some elements, transform others etc...
Combining views and concept checking ideas would, I think, produce a much better interface for STL algorithms (and solve this find, lower_bound, upper_bound, equal_range issue).
It would also avoid common mistakes of using ill-defined ranges of iterators and the undefined behavior that result of it...
It's not so much that it's "great" or "the best collections library that you can get in *any* language", but it does have a different philosophy to many other languages.
In particular, the standard C++ library uses a generic programming paradigm, rather than an object-oriented paradigm that is common in languages like Java and C#. That is, you have a "generic" definition of what an iterator should be, and then you can implement the function for_each or sort or max_element that takes any class that implements the iterator pattern, without actually having to inherit from some base "Iterator" interface or whatever.
What I love about the STL is how robust it is. It is easy to extend it. Some complain that it's small, missing many common algorithms or iterators. But this is precisely when you see how easy it is to add in the missing components you need. Not only that, but small is beautiful: you have about 60 algorithms, a handful of containers and a handful of iterators; but the functionality is in the order of the product of these. The interfaces of the containers remain small and simple.
Because it's fashion to write small, simple, modular algorithms it gets easier to spot bugs and holes in your components. Yet, at the same time, as simple as the algorithms and iterators are, they're extraordinarily robust: your algorithms will work with many existing and yet-to-be-written iterators and your iterators work with many existing and yet-to-be-written algorithms.
I also love how simple the STL is. You have containers, you have iterators and you have algorithms. That's it (I'm lying here, but this is what it takes to get comfortable with the library). You can mix different algorithms with different iterators with different containers. True, some of these have constraints that forbid them from working with others, but in general there's a lot to play with.
Niklaus Wirth said that a program is algorithms plus data-structures. That's exactly what the STL is about. If Ruby and Python are string superheros, then C++ and the STL are an algorithms-and-containers superhero.
STL's containers are nice, but they're not much different than you'll find in other programming languages. What makes the STL containers useful is that they mesh beautifully with algorithms. The flexibility provided by the standard algorithms is unmatched in other programming languages.
Without the algorithms, the containers are just that. Containers. Nothing special in particular.
Now if you're talking about container libraries for C++ only, it is unlikely you will find libraries as well used and tested as those provided by STL if nothing else because they are standard.
The STL works beautifully with built-in types. A std::array<int, 5> is exactly that -- an array of 5 ints, which consumes 20 bytes on a 32 bit platform.
java.util.Arrays.asList(1, 2, 3, 4, 5), on the other hand, returns a reference to an object containing a reference to an array containing references to Integer objects containing ints. Yes, that's 3 levels of indirection, and I don't dare predict how many bytes that consumes ;)
This is not a direct answer, but as you're coming from Java I'd like to point this out. By comparison to Java equivalents, STL is really fast.
I did find this page, showing some performance comparisons. Generally Java people are very touchy when it comes to performance conversations, and will claim that all kinds of advances are occurring all the time. However similar advances are also occurring in C/C++ compilers.
Keep in mind that STL is actually quite old now, so other, newer libraries may have specific advantages. Given the age, its' popularity is a testament to how good the original design was.
There are four main reasons why I'd say that STL is (still) awesome:
Speed
STL uses C++ templates, which means that the compiler generates code that is specifically tailored to your use of the library. For example, map will automagically generate a new class to implement a map collection of 'key' type to 'value' type. There is no runtime overhead where the library tries to work out how to efficiently store 'key' and 'value' - this is done at compile time. Due to the elegant design some operations on some types will compile down to single assembly instructions (e.g. increment integer-based iterator).
Efficiency
The collections classes have a notion of 'allocators', which you can either provide yourself or use the library-provided ones which allocate only enough storage to store your data. There is no padding nor wastage. Where a built-in type can be stored more efficiently, there are specializations to handle these cases optimally, e.g. vector of bool is handled as a bitfield.
Exensibility
You can use the Containers (collection classes), Algorithms and Functions provided in the STL on any type that is suitable. If your type can be compared, you can put it into a container. If it goes into a container, it can be sorted, searched, compared. If you provide a function like 'bool Predicate(MyType)', it can be filtered, etc.
Elegance
Other libraries/frameworks have to implement the Sort()/Find()/Reverse() methods on each type of collection. STL implements these as separate algorithms that take iterators of whatever collection you are using and operate blindly on that collection. The algorithms don't care whether you're using a Vector, List, Deque, Stack, Bag, Map - they just work.
Well, that is somewhat of a bold claim... perhaps in C++0x when it finally gets a hash map (in the form of std::unordered_map), it can make that claim, but in its current state, well, I don't buy that.
I can tell you, though, some cool things about it, namely that it uses templates rather than inheritance to achieve its level of flexibility and generality. This has both advantages and disadvantages; a disadvantage is that lots of code gets duplicated by the compiler, and any sort of dynamic runtime typing is very hard to achieve; however, a key advantage is that it is incredibly quick. Because each template specialization is really its own separate class generated by the compiler, it can be highly optimized for that class. Additionally, many of the STL algorithms that operate on STL containers have general definitions, but have specializations for special cases that result in incredibly good performance.
STL gives you the pieces.
Languages and their environments are built from smaller component pieces, sometimes via programming language constructs, sometimes via cut-and-paste. Some languages give you a sealed box - Java's collections, for instance. You can do what they allow, but woe betide you if you want to do something exotic with them.
The STL gives you the pieces that the designers used to build its more advanced functionality. Directly exposing the iterators, algorithms, etc. give you an abstract but highly flexible way of recombining core data structures and manipulations in whatever way is suitable for solving your problem. While Java's design probably hits the 90-95% mark for what you need from data structures, the STL's flexibility raises it to maybe 99%, with the iterator abstraction meaning you're not completely on your own for the remaining 1%.
When you combine that with its speed and other extensibility and customizabiltiy features (allocators, traits, etc.), you have a quite excellent package. I don't know that I'd call it the best data structures package, but certainly a very good one.
Warning: percentages totally made up.
Unique because it
focuses on basic algorithms instead of providing ready-to-use solutions to specific application problems.
uses unique C++ features to implement those algorithms.
As for being best... There is a reason why the same approach wasn't (and probably won't) ever followed by any other language, including direct descendants like D.
The standard C++ library's approach to collections via iterators has come in for some constructive criticism recently. Andrei Alexandrescu, a notable C++ expert, has recently begun working on a new version of a language called D, and describes his experiences designing collections support for it in this article.
Personally I find it frustrating that this kind of excellent work is being put into yet another programming language that overlaps hugely with existing languages, and I've told him so! :) I'd like someone of his expertise to turn their hand to producing a collections library for the so-called "modern languages" that are already in widespread use, Java and C#, that has all the capabilities he thinks are required to be world-class: the notion of a forward-iterable range is already ubiquitous, but what about reverse iteration exposed in an efficient way? What about mutable collections? What about integrating all this smoothly with Linq? etc.
Anyway, the point is: don't believe anyone who tells you that the standard C++ way is the holy grail, the best it could possibly be. It's just one way among many, and has at least one obvious drawback: the fact that in all the standard algorithms, a collection is specified by two separate iterators (begin and end) and hence is clumsy to compose operations on.
Obviously C++, C#, and Java can enter as many pissing contests as you want them to. The clue as to why the STL is at least somewhat great is that Java was initially designed and implemented without type-safe containers. Then Sun decided/realised people actually need them in a typed language, and added generics in 1.5.
You can compare the pros and cons of each, but as to which of the three languages has the "greatest" implementation of generic containers - that is solely a pissing contest. Greatest for what? In whose opinion? Each of them has the best libraries that the creators managed to come up with, subject to other constraints imposed by the languages. C++'s idea of generics doesn't work in Java, and type erasure would be sub-standard in typical C++ usage.
The primary thing is, you can use templates to make using containers switch-in/switch-out, without having to resort to the horrendous mess that is Java's interfaces.
If you fail to see what usage the STL has, I recommend buying a book, "The C++ Programming Language" by Bjarne Stroustrup. It pretty much explains everything there is about C++ because he's the dude who created it.
Does anyone know of a C++ data structure library providing functional (a.k.a. immutable, or "persistent" in the FP sense) equivalents of the familiar STL structures?
By "functional" I mean that the objects themselves are immutable, while modifications to those objects return new objects sharing the same internals as the parent object where appropriate.
Ideally, such a library would resemble STL, and would work well with Boost.Phoenix (caveat- I haven't actually used Phoenix, but as far as I can tell it provides many algorithms but no data structures, unless a lazily-computed change to an existing data structure counts - does it?)
I would look and see whether FC++ developed by Yannis Smaragdakis includes any data structures. Certainly this project more than any other is about supporting a functional style in C++.
This is more of a heads up than a detailed answer, but Bartosz Milewski appears to have done a lot of work on this. See, for example:
http://bartoszmilewski.com/2013/11/13/functional-data-structures-in-c-lists/
Looks like he's implemented a lot of algorithms from Okasiki's book Purely Functional Data Structures here:
https://github.com/BartoszMilewski/Okasaki
N.B. I haven't tried these yet, but they're the first C++ persistent data structures I've seen outside of FC++.
Hopefully, I'll get to trying them soon.