Design rationale behind the STL - C++

I was looking at a few STL implementations (SGI, STLport, libc++) and saw a few design patterns that seemed common to all or most implementations, but for which I could find no reason. I assume there must be a good reason, and I want to know what it is:
Many classes, including vector and list_iterator among others, were implemented as two classes, e.g. list_iterator_base with part of the functionality, and then list_iterator which inherits from list_iterator_base with the rest of the interface. What is the point? It seems it could be done just as easily in one class.
The iterators seem not to make use of the std::iterator class. Is there some performance penalty to using it?
Those are two questions I found in just a quick skim. If anyone knows of a good resource explaining the implementation rationale of an STL implementation, I will be happy to hear of it.

The answers are fairly straightforward:
STL is all about generic programming. The key idea is not to have duplicate code. The immediate goal is not to duplicate source code, but as it turns out it also makes sense not to duplicate binary code. Thus, it is quite common that STL components factor commonly used parts out and reuse them. The link structure of a list class or the type-independent attributes of a vector are just two examples. For vectors there are even multiple layers: some parts are entirely independent of the type (e.g., the size), others only need the type itself (e.g., all the accessors, the iterators, etc.), and some parts need to know how to deal with resource allocation (e.g., insertion and destruction need to know about the allocator being used).
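To illustrate, here is a minimal sketch of such layering. The names (vector_base and its members) are made up for illustration, loosely modeled on SGI-style implementations, and are not taken from any particular library:

#include <cstddef>

// Hypothetical layering: the base holds only the raw pointers and the
// allocator; it knows nothing about how elements are constructed, so
// this code can be shared by everything built on top of it.
template <typename T, typename Alloc>
struct vector_base {
    T* begin_ = nullptr;  // start of storage
    T* end_   = nullptr;  // one past the last constructed element
    T* cap_   = nullptr;  // one past the end of allocated storage
    Alloc alloc_;

    std::size_t size() const     { return static_cast<std::size_t>(end_ - begin_); }
    std::size_t capacity() const { return static_cast<std::size_t>(cap_ - begin_); }
};

// The derived class adds everything that must know how elements are
// created and destroyed (push_back, insert, the destructor, ...).
template <typename T, typename Alloc>
class vector : private vector_base<T, Alloc> {
public:
    using vector_base<T, Alloc>::size;
    using vector_base<T, Alloc>::capacity;
    // push_back, insert, erase, etc. would live here and use alloc_.
};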
It turns out that std::iterator<...> doesn't really work: types defined in a base class that depends on template parameters are not directly accessible in a class template deriving from such a base. That is, the types need to be qualified with the base class and need to be marked as types using typename. To make matters worse, users could in theory allocate objects of the derived class and release them through a pointer to std::iterator<...> (yes, that would be a silly thing to do). That is, there is no benefit but there is a potential drawback, i.e., it is best avoided.
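A minimal sketch of that dependent-name problem (derived_iter and plain_iter are made-up names; std::iterator was in fact later deprecated, in C++17, largely for these reasons):

#include <cstddef>
#include <iterator>

// Inside a template, names from a base class that depends on template
// parameters are not found by ordinary unqualified lookup:
template <typename T>
struct derived_iter : std::iterator<std::forward_iterator_tag, T> {
    // value_type v;  // error: 'value_type' was not declared in this scope
    typename std::iterator<std::forward_iterator_tag, T>::value_type v{};
};

// What implementations do instead: spell out the five typedefs directly.
template <typename T>
struct plain_iter {
    using iterator_category = std::forward_iterator_tag;
    using value_type        = T;
    using difference_type   = std::ptrdiff_t;
    using pointer           = T*;
    using reference         = T&;
};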
That said, I'm not aware of any good resource covering the techniques of implementing generic libraries. Most of the details applied in STL implementations were independently invented by multiple people but the literature on Generic Programming is still relatively scarce. I don't think that any of the papers describing STL actually discuss implementation techniques: They normally concentrate on design details. Given that only very few people seem to understand what STL is about, it isn't a big surprise that authors tend to concentrate on describing what STL is rather than how to implement it.

Related

Why isn't there a common base for the standard library containers?

Just out of interest...
If I were to design a library of containers, I would surely derive them from a common base class, which would have (maybe abstract) declarations of methods like size() and insert().
Is there a strong reason for the standard library containers not to be implemented like that?
In C++, inheritance is used for run-time polymorphism (read: run-time interface reuse). It comes with the overhead of an indirection through the vtable at runtime.
In order to have the same interface for multiple container classes (so that the API is predictable and algorithms can make assumptions), there's no need for inheritance. Just give them the same methods and you're done. The containers (and algorithms) in C++ are implemented as templates, which means that you get compile-time polymorphism.
This avoids any run-time overhead, and it is in line with the C++ mantra "Only pay for what you need".
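For illustration, a minimal sketch of compile-time polymorphism (report is a made-up function, not a library facility):

#include <iostream>
#include <list>
#include <vector>

// No common base class needed: any type with empty() and size()
// satisfies this template, and conformance is checked at compile time.
template <typename Container>
void report(const Container& c) {
    std::cout << (c.empty() ? "empty" : "non-empty")
              << ", size = " << c.size() << '\n';
}

int main() {
    std::vector<int> v{1, 2, 3};
    std::list<double> l;
    report(v);  // instantiates report<std::vector<int>>, static dispatch
    report(l);  // instantiates report<std::list<double>>, static dispatch
}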
Java collections (which you probably have in mind) have a bunch of common methods implemented in AbstractCollection, that sit on top of the size() and iterator() methods implemented in the concrete class. Then IIRC there are more such methods in AbstractList and so on. Some subclasses override most of those methods, a few can get away with implementing little more than the required abstract methods. Some of the methods genuinely are implemented in common across most or all collections sharing an interface. For example the one that gives you a non-modifiable view is a total PITA to write, in my bitter experience.
C++ containers have a few member functions in common to all containers, pretty much none of which could be implemented the same way for all containers[*]. Where there are common algorithms that can be performed using an iterator (like find), they are free functions in <algorithm>, far from anywhere that inheritance gets a look-in.
There are also member functions common to all sequences, to all associative arrays, etc. But again for each of those concepts there isn't much in common between the implementations for different concrete data structures.
So ultimately it comes down to a question of design philosophy. Both Java and C++ have some code reuse in respect of containers. In C++ this code reuse comes through function templates in <algorithm>. In Java some of it comes through inheritance.
The main practical difference probably is that in Java your function can accept a java.util.Collection without knowing what kind of collection it is. In C++ a function template can accept any container, but a non-template function cannot. This affects the coding style of users, and is also informed by it - Java programmers are more likely to reach for runtime polymorphism than C++ programmers.
From an implementer's point of view, if you're writing the library of C++ containers then there is nothing to stop you from sharing private base classes between different containers in std, if you feel that will help reuse code.
[*] Now that size() is guaranteed O(1) in C++11, empty() could probably be implemented in common. In C++03 size() merely "should" be O(1), and there were implementations of std::list that took advantage of this to implement one of the versions of splice in O(1) instead of the O(n) it would take to update the size.
There is no good reason to introduce such base class. Who would benefit from it? Where would it be useful?
Algorithms that require a 'generic' interface use iterators. There is no common way of inserting elements to different containers. In fact, you cannot even insert elements to some containers.
It would make sense if you could write code that is independent of the container you are using. But that's not the case. I recommend reading "Item 2: Beware the illusion of container-independent code" from the book Effective STL.
From another viewpoint: right now, if you use the containers in a generic way, the compiler has nearly all the information it can have, enabling thorough optimization, thanks to the template implementation.
Now if, for example, both list and vector implemented an abstract OO interface like push_backable, and you used an abstract pointer (push_backable*)->push_back(...) in your code, the compiler would lose a lot of information and thus opportunities for optimization.
These are typical operations that might appear in inner loops and you really want the maximal possible optimization for them. See also Frerich Raabe's answer.
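For illustration, a sketch of the contrast, using the hypothetical push_backable interface from the answer above:

// The hypothetical OO interface described above (not a real library type).
struct push_backable {
    virtual void push_back(int) = 0;
    virtual ~push_backable() = default;
};

void fill_virtual(push_backable& c, int n) {
    for (int i = 0; i < n; ++i)
        c.push_back(i);  // virtual call per element; hard to inline
}

template <typename Container>
void fill_template(Container& c, int n) {
    for (int i = 0; i < n; ++i)
        c.push_back(i);  // direct call, visible to the optimizer,
                         // so it can typically be inlined
}

With the template version the compiler sees the concrete container type at each call site, which is exactly the information the virtual version throws away.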

Why were concepts (generic programming) conceived when we already had classes and interfaces?

Also on programmers.stackexchange.com:
I understand that STL concepts had to exist, and that it would be silly to call them "classes" or "interfaces" when in fact they're only documented (human) concepts and couldn't be translated into C++ code at the time, but when given the opportunity to extend the language to accommodate concepts, why didn't they simply modify the capabilities of classes and/or introduce interfaces?
Isn't a concept very similar to an interface (a 100% abstract class with no data)? By looking at it, it seems to me interfaces only lack support for axioms, but maybe axioms could be introduced into C++'s interfaces (considering a hypothetical adoption of interfaces in C++ to take over concepts), couldn't they? I think even auto concepts could easily be added to such a C++ interface (auto interface LessThanComparable, anyone?).
Isn't a concept_map very similar to the Adapter pattern? If all the methods are inline, the adapter essentially doesn't exist beyond compile time; the compiler simply replaces calls to the interface with the inlined versions, calling the target object directly during runtime.
I've heard of something called Static Object-Oriented Programming, which essentially means effectively reusing the concepts of object-orientation in generic programming, thus permitting usage of most of OOP's power without incurring execution overhead. Why wasn't this idea further considered?
I hope this is clear enough. I can rewrite this if you think I was not; just let me know.
There is a big difference between OOP and Generic Programming: predestination.
In OOP, when you design the class, you declare the interfaces you think will be useful. And it's done.
In Generic Programming, on the other hand, as long as the class conforms to a given set of requirements (mainly methods, but also inner constants or types), then it fits the bill and may be used. The Concepts proposal is about formalizing this, so that conformance can be detected directly when checking the method signature, rather than when instantiating the method body. It also makes checking template methods easier, since some can be rejected without any instantiation if the concepts do not match.
The advantage of Concepts is that you do not suffer from predestination: you can pick a class from Library1 and a method from Library2, and if it fits, you're golden (if it does not, you may be able to use a concept map). In OO, you are required to write a full-fledged Adapter every time.
You are right that both seem similar. The difference is mainly about the time of binding (and the fact that Concepts still use static dispatch instead of dynamic dispatch like interfaces). Concepts are more open, and thus easier to use.
Classes are a form of named conformance. You indicate that class Foo conforms with interface I by inheriting from I.
Concepts are a form of structural conformance, checked at compile time. A class Foo does not need to state up front which concepts it conforms to.
The result is that named conformance reduces the ability to reuse classes in places that were not expected up front, even though they would be usable.
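A small illustration of structural conformance (Foo and sort_range are made-up names):

#include <algorithm>

// Foo never declares that it conforms to anything, yet it works with
// any template that only requires begin()/end() returning iterators.
struct Foo {
    int data[4] = {3, 1, 4, 1};
    int* begin() { return data; }
    int* end()   { return data + 4; }
};

template <typename Range>
void sort_range(Range& r) { std::sort(r.begin(), r.end()); }

int main() {
    Foo f;
    sort_range(f);  // structural conformance: no named interface involved
}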
The concepts are in fact not part of C++; they are just concepts! In C++ there is no way to "define a concept". All you have are templates and classes (the STL being all template classes, as the name says: Standard Template Library).
If you mean C++0x and not C++ (in which case I suggest you change the tag), please read here:
http://en.wikipedia.org/wiki/Concepts_(C++)
Some parts I am going to copy-paste for you:
In the pending C++0x revision of the C++ programming language, concepts and the related notion of axioms were a proposed extension to C++'s template system, designed to improve compiler diagnostics and to allow programmers to codify in the program some formal properties of templates that they write. Incorporating these limited formal specifications into the program (in addition to improving code clarity) can guide some compiler optimizations, and can potentially help improve program reliability through the use of formal verification tools to check that the implementation and specification actually match.
In July 2009, the C++0x committee decided to remove concepts from the draft standard, as they are considered "not ready" for C++0x.
The primary motivation of the introduction of concepts is to improve the quality of compiler error messages.
So as you can see, concepts are not there to replace interfaces etc.; they are just there to help the compiler optimize better and produce better errors.
While I agree with all the posted answers, they seem to have missed one point which is performance. Unlike interfaces, concepts are checked in compile-time and therefore don't require virtual function calls.

Fast dynamic casting progress

A little while ago, I found that very interesting paper on a very neat performance upgrade for dynamic_cast in C++: http://www2.research.att.com/~bs/fast_dynamic_casting.pdf.
Basically, it makes dynamic_cast in C++ much faster than the traditional walk of the inheritance tree. As stated in the paper, the method provides a fast, constant-time dynamic casting algorithm.
This paper was published in 2005. Now, I am wondering if the technique was ever implemented somewhere or if there are plans to implement it anywhere?
I do not know what implementations various compilers use beside GCC (which isn't linear). However, it is important to stress that the paper does not necessarily propose a method that is always faster than existing implementations for all (or even common) usage. It proposes a general solution that is asymptotically better as inheritance hierarchies grow.
However, it is rarely a good design to have large inheritance hierarchies, as they tend to force the application to become monolithic and inflexible to change. Programs with flexible design tend to only have hierarchies mostly with 2 levels, an abstract base and an implementation of runtime polymorphic roles to support the Open/Closed Principle. In these cases, walking the inheritance graph can be as simple as a single pointer dereference and compare, which can be faster than the index-sum-then-dereference-then-compare presented by Gibbs and Stroustrup.
Also, it is important to stress that it is never necessary to write a program that uses dynamic_cast unless your own business rules require it. The use of dynamic_cast is always an indication that polymorphism is not being properly used and reuse is being compromised. If you need a behavior based on casting up a hierarchy, adding a virtual method gives the clean solution. If you have a code section that does dynamic_cast-checks on types, that section of code will never "close" (in the meaning of the Open/Closed Principle), and will need to be updated for every new type added to the system. A virtual dispatch, on the other hand, is added only on new types, allowing you to remain open to expansion and yet closing the behaviors operating on the base type.
So this is really a rather academic suggestion (equating to changing a map to a hash_map algorithmically) that shouldn't have real world effects if good design is followed. If business rules forbid good design (some shops may have code barriers or code ownership issues where you cannot change existing architectures the way they need to be, nor do they allow adaptors to be built as would commonly be used for 3rd party libraries), then it is best not to make the decision on which compiler to use based on what algorithm is implemented. As always, if performance is key and you have to use a feature like dynamic_cast, profile your code. It is possible (and likely in many cases) that the tree-walking implementation is faster in practice.
See also the standards committee's review of implementations, including dynamic_cast and a well-known look at c++ in embedded environments and good use (which mentions Gibbs and Stroustrup in passing).

What is so great about STL? [closed]

I am a Java developer trying to learn C++. I have many times read on the internet (including Stack Overflow) that STL is the best collections library that you can get in any language. (Sorry, I do not have any citations for that)
However after studying some STL, I am really failing to see what makes STL so special. Would you please shed some light on what sets STL apart from the collection libraries of other languages and make it the best collection library?
What is so great about the STL?
The STL is great in that it was conceived very early and yet succeeded in using the C++ generic programming paradigm quite effectively.
It efficiently separated the data structures (vector, map, ...) from the algorithms that operate on them (copy, transform, ...), taking advantage of templates to do so.
It neatly decoupled concerns and provided generic containers with customization hooks (the Comparator and Allocator template parameters).
The result is very elegant (DRY principle) and very efficient thanks to compiler optimizations, so that hand-written algorithms for a given container are unlikely to do better.
It also means that it is easily extensible: you can create your own container with the interface you wish, as long as it exposes STL-compliant iterators you'll be able to use the STL algorithms with it!
And thanks to the use of traits, you can even apply the algorithms to C arrays through plain pointers! Talk about backward compatibility!
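For example (a minimal sketch; this works because std::iterator_traits has a partial specialization for plain pointers):

#include <algorithm>
#include <iostream>

int main() {
    int a[] = {5, 2, 8, 1};
    // Plain pointers model random-access iterators, so the same
    // algorithms that work on vector iterators work on a C array.
    std::sort(a, a + 4);
    std::cout << *std::max_element(a, a + 4) << '\n';  // prints 8
}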
However, it could (perhaps) have been better...
What is not so great about the STL?
It really pisses me off that one always has to use iterators. I'd really love to be able to write: std::foreach(myVector, [](int x) { return x+1; }); because, face it, most of the time you want to iterate over the whole container...
But what's worse is that because of that:
set<int> mySet = /**/;
set<int>::const_iterator it = std::find(mySet.begin(), mySet.end(), 1005); // [1]
set<int>::const_iterator it = mySet.find(1005); // [2]
[1] and [2] are carried out completely differently, resulting in [1] having O(n) complexity while [2] has O(log n) complexity! Here the problem is that the iterators abstract too much.
I don't mean that iterators are not worthy, I just mean that providing an interface exclusively in terms of iterators was a poor choice.
I much prefer the idea of views over containers; for example, check out what has been done with Boost.MPL. With a view you manipulate your container through a (lazy) layer of transformation. It makes for very efficient structures that allow you to filter out some elements, transform others, etc...
Combining views and concept checking ideas would, I think, produce a much better interface for STL algorithms (and solve this find, lower_bound, upper_bound, equal_range issue).
It would also avoid common mistakes with ill-defined ranges of iterators and the undefined behavior that results from them...
It's not so much that it's "great" or "the best collections library that you can get in *any* language", but it does have a different philosophy to many other languages.
In particular, the standard C++ library uses a generic programming paradigm, rather than an object-oriented paradigm that is common in languages like Java and C#. That is, you have a "generic" definition of what an iterator should be, and then you can implement the function for_each or sort or max_element that takes any class that implements the iterator pattern, without actually having to inherit from some base "Iterator" interface or whatever.
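For illustration, this is essentially how a for_each can be written (a sketch, not any particular library's source):

// InputIt can be a pointer, a list iterator, or any user-defined type
// supporting !=, ++ and *; no base "Iterator" interface is required.
template <typename InputIt, typename Fn>
Fn for_each(InputIt first, InputIt last, Fn f) {
    for (; first != last; ++first)
        f(*first);
    return f;
}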
What I love about the STL is how robust it is. It is easy to extend it. Some complain that it's small, missing many common algorithms or iterators. But this is precisely when you see how easy it is to add in the missing components you need. Not only that, but small is beautiful: you have about 60 algorithms, a handful of containers and a handful of iterators; but the functionality is in the order of the product of these. The interfaces of the containers remain small and simple.
Because the fashion is to write small, simple, modular algorithms, it gets easier to spot bugs and holes in your components. Yet at the same time, as simple as the algorithms and iterators are, they're extraordinarily robust: your algorithms will work with many existing and yet-to-be-written iterators, and your iterators will work with many existing and yet-to-be-written algorithms.
I also love how simple the STL is. You have containers, you have iterators and you have algorithms. That's it (I'm lying here, but this is what it takes to get comfortable with the library). You can mix different algorithms with different iterators with different containers. True, some of these have constraints that forbid them from working with others, but in general there's a lot to play with.
Niklaus Wirth said that a program is algorithms plus data structures. That's exactly what the STL is about. If Ruby and Python are string superheroes, then C++ and the STL are an algorithms-and-containers superhero.
STL's containers are nice, but they're not much different than you'll find in other programming languages. What makes the STL containers useful is that they mesh beautifully with algorithms. The flexibility provided by the standard algorithms is unmatched in other programming languages.
Without the algorithms, the containers are just that. Containers. Nothing special in particular.
Now if you're talking about container libraries for C++ only, it is unlikely you will find libraries as well used and tested as those provided by the STL, if for no other reason than that they are standard.
The STL works beautifully with built-in types. A std::array<int, 5> is exactly that -- an array of 5 ints, which consumes 20 bytes on a 32 bit platform.
java.util.Arrays.asList(1, 2, 3, 4, 5), on the other hand, returns a reference to an object containing a reference to an array containing references to Integer objects containing ints. Yes, that's 3 levels of indirection, and I don't dare predict how many bytes that consumes ;)
This is not a direct answer, but as you're coming from Java I'd like to point this out. By comparison to Java equivalents, STL is really fast.
I did find this page, showing some performance comparisons. Generally Java people are very touchy when it comes to performance conversations, and will claim that all kinds of advances are occurring all the time. However similar advances are also occurring in C/C++ compilers.
Keep in mind that the STL is actually quite old now, so other, newer libraries may have specific advantages. Given its age, its popularity is a testament to how good the original design was.
There are four main reasons why I'd say that STL is (still) awesome:
Speed
STL uses C++ templates, which means that the compiler generates code that is specifically tailored to your use of the library. For example, map will automatically cause a new class to be generated implementing a map collection from your 'key' type to your 'value' type. There is no runtime overhead where the library tries to work out how to efficiently store 'key' and 'value' - this is done at compile time. Due to the elegant design, some operations on some types will compile down to single assembly instructions (e.g. incrementing an integer-based iterator).
Efficiency
The collections classes have a notion of 'allocators', which you can either provide yourself or use the library-provided ones which allocate only enough storage to store your data. There is no padding nor wastage. Where a built-in type can be stored more efficiently, there are specializations to handle these cases optimally, e.g. vector of bool is handled as a bitfield.
Extensibility
You can use the Containers (collection classes), Algorithms and Functions provided in the STL on any type that is suitable. If your type can be compared, you can put it into a container. If it goes into a container, it can be sorted, searched, compared. If you provide a function like 'bool Predicate(MyType)', it can be filtered, etc.
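A quick illustration of that, reusing the MyType and Predicate names from the paragraph above (made up for the example):

#include <algorithm>
#include <vector>

struct MyType { int value; };

bool Predicate(const MyType& m) { return m.value > 10; }

int main() {
    std::vector<MyType> v{{5}, {12}, {42}};
    // The generic algorithm filters any type the predicate understands.
    auto n = std::count_if(v.begin(), v.end(), Predicate);
    (void)n;  // n == 2
}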
Elegance
Other libraries/frameworks have to implement the Sort()/Find()/Reverse() methods on each type of collection. STL implements these as separate algorithms that take iterators of whatever collection you are using and operate blindly on that collection. The algorithms don't care whether you're using a Vector, List, Deque, Stack, Bag, Map - they just work.
Well, that is somewhat of a bold claim... perhaps in C++0x when it finally gets a hash map (in the form of std::unordered_map), it can make that claim, but in its current state, well, I don't buy that.
I can tell you, though, some cool things about it, namely that it uses templates rather than inheritance to achieve its level of flexibility and generality. This has both advantages and disadvantages; a disadvantage is that lots of code gets duplicated by the compiler, and any sort of dynamic runtime typing is very hard to achieve; however, a key advantage is that it is incredibly quick. Because each template specialization is really its own separate class generated by the compiler, it can be highly optimized for that class. Additionally, many of the STL algorithms that operate on STL containers have general definitions, but have specializations for special cases that result in incredibly good performance.
STL gives you the pieces.
Languages and their environments are built from smaller component pieces, sometimes via programming language constructs, sometimes via cut-and-paste. Some languages give you a sealed box - Java's collections, for instance. You can do what they allow, but woe betide you if you want to do something exotic with them.
The STL gives you the pieces that the designers used to build its more advanced functionality. Directly exposing the iterators, algorithms, etc. gives you an abstract but highly flexible way of recombining core data structures and manipulations in whatever way is suitable for solving your problem. While Java's design probably hits the 90-95% mark for what you need from data structures, the STL's flexibility raises it to maybe 99%, with the iterator abstraction meaning you're not completely on your own for the remaining 1%.
When you combine that with its speed and other extensibility and customizability features (allocators, traits, etc.), you have a quite excellent package. I don't know that I'd call it the best data structures package, but certainly a very good one.
Warning: percentages totally made up.
Unique because it
focuses on basic algorithms instead of providing ready-to-use solutions to specific application problems.
uses unique C++ features to implement those algorithms.
As for being best... There is a reason why the same approach wasn't (and probably won't) ever followed by any other language, including direct descendants like D.
The standard C++ library's approach to collections via iterators has come in for some constructive criticism recently. Andrei Alexandrescu, a notable C++ expert, has recently begun working on a new version of a language called D, and describes his experiences designing collections support for it in this article.
Personally I find it frustrating that this kind of excellent work is being put into yet another programming language that overlaps hugely with existing languages, and I've told him so! :) I'd like someone of his expertise to turn their hand to producing a collections library for the so-called "modern languages" that are already in widespread use, Java and C#, that has all the capabilities he thinks are required to be world-class: the notion of a forward-iterable range is already ubiquitous, but what about reverse iteration exposed in an efficient way? What about mutable collections? What about integrating all this smoothly with Linq? etc.
Anyway, the point is: don't believe anyone who tells you that the standard C++ way is the holy grail, the best it could possibly be. It's just one way among many, and has at least one obvious drawback: the fact that in all the standard algorithms, a collection is specified by two separate iterators (begin and end) and hence is clumsy to compose operations on.
Obviously C++, C#, and Java can enter as many pissing contests as you want them to. The clue as to why the STL is at least somewhat great is that Java was initially designed and implemented without type-safe containers. Then Sun decided/realised people actually need them in a typed language, and added generics in 1.5.
You can compare the pros and cons of each, but as to which of the three languages has the "greatest" implementation of generic containers - that is solely a pissing contest. Greatest for what? In whose opinion? Each of them has the best libraries that the creators managed to come up with, subject to other constraints imposed by the languages. C++'s idea of generics doesn't work in Java, and type erasure would be sub-standard in typical C++ usage.
The primary thing is, you can use templates to make using containers switch-in/switch-out, without having to resort to the horrendous mess that is Java's interfaces.
If you fail to see what usage the STL has, I recommend buying a book, "The C++ Programming Language" by Bjarne Stroustrup. It pretty much explains everything there is about C++ because he's the dude who created it.

Why is the STL so heavily based on templates instead of inheritance?

I mean, aside from its name: the Standard Template Library (which evolved into the C++ standard library).
C++ initially introduced OOP concepts into C. That is: you could tell what a specific entity could and couldn't do (regardless of how it does it) based on its class and class hierarchy. Some compositions of abilities are more difficult to describe in this manner due to the complexities of multiple inheritance, and the fact that C++ supports interface-only inheritance in a somewhat clumsy way (compared to Java, etc.), but it's there (and could be improved).
And then templates came into play, along with the STL. The STL seems to take the classical OOP concepts and flush them down the drain, using templates instead.
There should be a distinction between cases when templates are used to generalize types where the types themselves are irrelevant for the operation of the template (containers, for example). Having a vector<int> makes perfect sense.
However, in many other cases (iterators and algorithms), templated types are supposed to follow a "concept" (Input Iterator, Forward Iterator, etc...) where the actual details of the concept are defined entirely by the implementation of the template function/class, and not by the class of the type used with the template, which is a somewhat anti-usage of OOP.
For example, you can tell the function:
void MyFunc(ForwardIterator<...> *I);
Update: As this was unclear in the original question, it is fine for ForwardIterator itself to be templated, to allow any ForwardIterator type. The contrast is with having ForwardIterator as a concept.
expects a Forward Iterator only by looking at its definition, where you'd need either to look at the implementation or the documentation for:
template <typename Type> void MyFunc(Type *I);
Two claims I can make in favor of using templates: 1. Compiled code can be made more efficient by compiling the template for each type used, instead of using dynamic dispatch (mostly via vtables). 2. Templates can be used with native types.
However, I am looking for a more profound reason for abandoning classic OOP in favor of templating for the STL?
The short answer is "because C++ has moved on". Yes, back in the late 70's, Stroustrup intended to create an upgraded C with OOP capabilities, but that was a long time ago. By the time the language was standardized in 1998, it was no longer an OOP language. It was a multi-paradigm language. It certainly had some support for OOP code, but it also had a Turing-complete template language overlaid, it allowed compile-time metaprogramming, and people had discovered generic programming. Suddenly, OOP just didn't seem all that important. Not when we can write simpler, more concise and more efficient code by using techniques available through templates and generic programming.
OOP is not the holy grail. It's a cute idea, and it was quite an improvement over procedural languages back in the 70's when it was invented. But it's honestly not all it's cracked up to be. In many cases it is clumsy and verbose and it doesn't really promote reusable code or modularity.
That is why the C++ community is today far more interested in generic programming, and why everyone is finally starting to realize that functional programming is quite clever as well. OOP on its own just isn't a pretty sight.
Try drawing a dependency graph of a hypothetical "OOP-ified" STL. How many classes would have to know about each other? There would be a lot of dependencies. Would you be able to include just the vector header, without also getting iterator or even iostream pulled in? The STL makes this easy. A vector knows about the iterator type it defines, and that's all. The STL algorithms know nothing. They don't even need to include an iterator header, even though they all accept iterators as parameters. Which is more modular, then?
The STL may not follow the rules of OOP as Java defines it, but doesn't it achieve the goals of OOP? Doesn't it achieve reusability, low coupling, modularity and encapsulation?
And doesn't it achieve these goals better than an OOP-ified version would?
As for why the STL was adopted into the language, several things happened that led to the STL.
First, templates were added to C++. They were added for much the same reason that generics were added to .NET. It seemed a good idea to be able to write stuff like "containers of a type T" without throwing away type safety. Of course, the implementation they settled on was quite a lot more complex and powerful.
Then people discovered that the template mechanism they had added was even more powerful than expected. And someone started experimenting with using templates to write a more generic library. One inspired by functional programming, and one which used all the new capabilities of C++.
He presented it to the C++ language committee, who took quite a while to grow used to it because it looked so strange and different, but ultimately realized that it worked better than the traditional OOP equivalents they'd have to include otherwise. So they made a few adjustments to it, and adopted it into the standard library.
It wasn't an ideological choice, it wasn't a political choice of "do we want to be OOP or not", but a very pragmatic one. They evaluated the library, and saw that it worked very well.
In any case, both of the reasons you mention for favoring the STL are absolutely essential.
The C++ standard library has to be efficient. If it is less efficient than, say, the equivalent hand-rolled C code, then people would not use it. That would lower productivity, increase the likelihood of bugs, and overall just be a bad idea.
And the STL has to work with primitive types, because primitive types are all you have in C, and they're a major part of both languages. If the STL did not work with native arrays, it would be useless.
Your question has a strong assumption that OOP is "best". I'm curious to hear why. You ask why they "abandoned classical OOP". I'm wondering why they should have stuck with it. Which advantages would it have had?
The most direct answer to what I think you're asking/complaining about is this: The assumption that C++ is an OOP language is a false assumption.
C++ is a multi-paradigm language. It can be programmed using OOP principles, it can be programmed procedurally, it can be programmed generically (templates), and with C++11 (formerly known as C++0x) some things can even be programmed functionally.
The designers of C++ see this as an advantage, so they would argue that constraining C++ to act like a purely OOP language when generic programming solves the problem better and, well, more generically, would be a step backwards.
My understanding is that Stroustrup originally preferred an "OOP-styled" container design, and in fact didn't see any other way to do it. Alexander Stepanov is the one responsible for the STL, and his goals did not include "make it object oriented":
That is the fundamental point: algorithms are defined on algebraic structures. It took me another couple of years to realize that you have to extend the notion of structure by adding complexity requirements to regular axioms. ... I believe that iterator theories are as central to Computer Science as theories of rings or Banach spaces are central to Mathematics. Every time I would look at an algorithm I would try to find a structure on which it is defined. So what I wanted to do was to describe algorithms generically. That's what I like to do. I can spend a month working on a well known algorithm trying to find its generic representation. ...
STL, at least for me, represents the only way programming is possible. It is, indeed, quite different from C++ programming as it was presented and still is presented in most textbooks. But, you see, I was not trying to program in C++, I was trying to find the right way to deal with software. ...
I had many false starts. For example, I spent years trying to find some use for inheritance and virtuals, before I understood why that mechanism was fundamentally flawed and should not be used. I am very happy that nobody could see all the intermediate steps - most of them were very silly.
(He does explain why inheritance and virtuals -- a.k.a. object oriented design "was fundamentally flawed and should not be used" in the rest of the interview).
Once Stepanov presented his library to Stroustrup, Stroustrup and others went through herculean efforts to get it into the ISO C++ standard (same interview):
The support of Bjarne Stroustrup was crucial. Bjarne really wanted STL in the standard and if Bjarne wants something, he gets it. ... He even forced me to make changes in STL that I would never make for anybody else ... he is the most single minded person I know. He gets things done. It took him a while to understand what STL was all about, but when he did, he was prepared to push it through. He also contributed to STL by standing up for the view that more than one way of programming was valid - against no end of flak and hype for more than a decade, and pursuing a combination of flexibility, efficiency, overloading, and type-safety in templates that made STL possible. I would like to state quite clearly that Bjarne is the preeminent language designer of my generation.
The answer is found in this interview with Stepanov, the author of the STL:
Yes. STL is not object oriented. I think that object orientedness is almost as much of a hoax as Artificial Intelligence. I have yet to see an interesting piece of code that comes from these OO people.
Why would a pure OOP design be better for a data structures & algorithms library?!
OOP is not the solution for everything.
IMHO, STL is the most elegant library I have seen ever :)
For your question:
You don't need runtime polymorphism; it is actually an advantage for the STL to implement the library using static polymorphism, which means efficiency.
Try to write a generic Sort or Distance or whatever algorithm that applies to ALL containers!
Your Sort in Java would call functions that are dispatched dynamically through n levels to be executed!
You need stupid things like boxing and unboxing to hide the nasty assumptions of the so-called pure OOP languages.
The only problem I see with the STL, and templates in general, is the awful error messages.
This will be solved using Concepts in C++0x.
Comparing the STL to Collections in Java is like comparing the Taj Mahal to my house :)
templated types are supposed to follow a "concept" (Input Iterator, Forward Iterator, etc...) where the actual details of the concept are defined entirely by the implementation of the template function/class, and not by the class of the type used with the template, which is a somewhat anti-usage of OOP.
I think you misunderstand the intended use of concepts by templates. Forward Iterator, for example, is a very well-defined concept. To find the expressions which must be valid in order for a class to be a Forward Iterator, and their semantics including computational complexity, you look at the standard or at http://www.sgi.com/tech/stl/ForwardIterator.html (you have to follow the links to Input, Output, and Trivial Iterator to see it all).
That document is a perfectly good interface, and "the actual details of the concept" are defined right there. They are not defined by the implementations of Forward Iterators, and neither are they defined by the algorithms which use Forward Iterators.
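To make that concrete, here is a sketch of a minimal type that satisfies the Forward Iterator valid expressions without inheriting from anything (cstr_iter is a made-up name):

#include <algorithm>
#include <cstddef>
#include <iterator>

// A minimal forward iterator over characters. It provides the valid
// expressions the concept requires (*i, ++i, i++, i == j, i != j,
// copying) plus the usual nested typedefs; no base class involved.
struct cstr_iter {
    using iterator_category = std::forward_iterator_tag;
    using value_type        = char;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const char*;
    using reference         = const char&;

    const char* p;

    reference operator*() const { return *p; }
    cstr_iter& operator++()    { ++p; return *this; }
    cstr_iter  operator++(int) { cstr_iter t = *this; ++p; return t; }
    friend bool operator==(cstr_iter a, cstr_iter b) { return a.p == b.p; }
    friend bool operator!=(cstr_iter a, cstr_iter b) { return a.p != b.p; }
};

int main() {
    const char* s = "hello";
    auto it = std::find(cstr_iter{s}, cstr_iter{s + 5}, 'l');
    (void)it;  // points at the first 'l'
}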
The differences in how interfaces are handled between STL and Java are three-fold:
1) STL defines valid expressions using the object, whereas Java defines methods which must be callable on the object. Of course a valid expression might be a method (member function) call, but it doesn't have to be.
2) Java interfaces are runtime objects, whereas STL concepts are not visible at runtime even with RTTI.
3) If the required valid expressions for an STL concept are not valid for your type, you get an unspecified compilation error when you instantiate some template with that type. If you fail to implement a required method of a Java interface, you get a specific compilation error saying so.
This third part is, if you like, a kind of compile-time "duck typing": interfaces can be implicit. In Java, interfaces are somewhat explicit: a class "is" Iterable if and only if it says it implements Iterable. The compiler can check that the signatures of its methods are all present and correct, but the semantics are still implicit (i.e. they're either documented or not, but only more code (unit tests) can tell you whether the implementation is correct).
In C++, like in Python, both semantics and syntax are implicit, although in C++ (and in Python if you get the strong-typing preprocessor) you do get some help from the compiler. If a programmer requires Java-like explicit declaration of interfaces by the implementing class, then the standard approach is to use type traits (and multiple inheritance can prevent this being too verbose). What's lacking, compared with Java, is a single template which I can instantiate with my type, and which will compile if and only if all the required expressions are valid for my type. This would tell me whether I've implemented all the required bits, "before I use it". That's a convenience, but it's not the core of OOP (and it still doesn't test semantics, and code to test semantics would naturally also test the validity of the expressions in question).
STL may or may not be sufficiently OO for your taste, but it certainly separates interface cleanly from implementation. It does lack Java's ability to do reflection over interfaces, and it reports breaches of interface requirements differently.
you can tell the function ... expects a Forward Iterator only by looking at its definition, where you'd need either to look at the implementation or the documentation for ...
Personally I think that implicit types are a strength, when used appropriately. The algorithm says what it does with its template parameters, and the implementer makes sure those things work: it's exactly the common denominator of what "interfaces" should do. Furthermore with STL, you're unlikely to be using, say, std::copy based on finding its forward declaration in a header file. Programmers should be working out what a function takes based on its documentation, not just on the function signature. This is true in C++, Python, or Java. There are limitations on what can be achieved with typing in any language, and trying to use typing to do something it doesn't do (check semantics) would be an error.
That said, STL algorithms usually name their template parameters in a way which makes it clear what concept is required. However this is to provide useful extra information in the first line of the documentation, not to make forward declarations more informative. There are more things you need to know than can be encapsulated in the types of the parameters, so you have to read the docs. (For example in algorithms which take an input range and an output iterator, chances are the output iterator needs enough "space" for a certain number of outputs based on the size of the input range and maybe the values therein. Try strongly typing that.)
Here's Bjarne on explicitly-declared interfaces: http://www.artima.com/cppsource/cpp0xP.html
In generics, an argument must be of a class derived from an interface (the C++ equivalent to interface is abstract class) specified in the definition of the generic. That means that all generic argument types must fit into a hierarchy. That imposes unnecessary constraints on designs, requires unreasonable foresight on the part of developers. For example, if you write a generic and I define a class, people can't use my class as an argument to your generic unless I knew about the interface you specified and had derived my class from it. That's rigid.
Looking at it the other way around, with duck typing you can implement an interface without knowing that the interface exists. Or someone can write an interface deliberately such that your class implements it, having consulted your docs to see that they don't ask for anything you don't already do. That's flexible.
"OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them." - Alan Kay, creator of Smalltalk.
C++, Java, and most other languages are all pretty far from classical OOP. That said, arguing for ideologies is not terribly productive. C++ is not pure in any sense, so it implements functionality that seems to make pragmatic sense at the time.
The STL started off with the intention of providing a large library covering the most commonly used algorithms, with the target of consistent behavior and performance. Templates came in as a key factor to make that implementation and target feasible.
Just to provide another reference:
Al Stevens Interviews Alex Stepanov, in March 1995 of DDJ:
http://www.sgi.com/tech/stl/drdobbs-interview.html
Stepanov explained his work experience and the choices made toward a large library of algorithms, which eventually evolved into the STL.
Tell us something about your long-term interest in generic programming
.....Then I was offered a job at Bell Laboratories working in the C++ group on C++ libraries. They asked me whether I could do it in C++. Of course, I didn't know C++ and, of course, I said I could. But I couldn't do it in C++, because in 1987 C++ didn't have templates, which are essential for enabling this style of programming. Inheritance was the only mechanism to obtain genericity and it was not sufficient.
Even now C++ inheritance is not of much use for generic programming. Let's discuss why. Many people have attempted to use inheritance to implement data structures and container classes. As we know now, there were few if any successful attempts. C++ inheritance, and the programming style associated with it are dramatically limited. It is impossible to implement a design which includes as trivial a thing as equality using it. If you start with a base class X at the root of your hierarchy and define a virtual equality operator on this class which takes an argument of the type X, then derive class Y from class X. What is the interface of the equality? It has equality which compares Y with X. Using animals as an example (OO people love animals), define mammal and derive giraffe from mammal. Then define a member function mate, where animal mates with animal and returns an animal. Then you derive giraffe from animal and, of course, it has a function mate where giraffe mates with animal and returns an animal. It's definitely not what you want. While mating may not be very important for C++ programmers, equality is. I do not know a single algorithm where equality of some kind is not used.
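For illustration, here is a sketch of the equality problem from the quote, using the X and Y names Stepanov mentions:

struct X {
    virtual ~X() = default;
    // The only signature a hierarchy rooted at X can offer:
    virtual bool operator==(const X&) const { return true; }
};

struct Y : X {
    int extra = 0;
    // The inherited interface compares Y with *any* X; the type system
    // no longer guarantees the right-hand side is a Y:
    bool operator==(const X& other) const override {
        const Y* y = dynamic_cast<const Y*>(&other);
        return y != nullptr && extra == y->extra;  // must handle an X that is not a Y
    }
};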
The basic problem with
void MyFunc(ForwardIterator *I);
is how do you safely get the type of the thing the iterator returns? With templates, this is done for you at compile time.
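Concretely, templates recover the element type through std::iterator_traits at compile time (first_element is a made-up example):

#include <iterator>
#include <vector>

// The element type an iterator yields is computed statically;
// no runtime query is needed, and plain pointers work too.
template <typename FwdIt>
typename std::iterator_traits<FwdIt>::value_type
first_element(FwdIt it) {
    return *it;
}

int main() {
    std::vector<int> v{7, 8};
    int a = first_element(v.begin());  // value_type is int
    int b = first_element(&v[0]);      // pointer: value_type is also int
    (void)a; (void)b;
}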
For a moment, let's think of the standard library as basically a database of collections and algorithms.
If you've studied the history of databases, you undoubtedly know that back in the beginning, databases were mostly "hierarchical". Hierarchical databases corresponded very closely to classical OOP--specifically, the single-inheritance variety, such as used by Smalltalk.
Over time, it became apparent that hierarchical databases could be used to model almost anything, but in some cases the single-inheritance model was fairly limiting. If you had a wooden door, it was handy to be able to look at it either as a door, or as a piece of some raw material (steel, wood, etc.)
So, they invented network model databases. Network model databases correspond very closely to multiple inheritance. C++ supports multiple inheritance completely, while Java supports a limited form (you can inherit from only one class, but can also implement as many interfaces as you like).
Both hierarchical model and network model databases have mostly faded from general purpose use (though a few remain in fairly specific niches). For most purposes, they've been replaced by relational databases.
Much of the reason relational databases took over was versatility. The relational model is functionally a superset of the network model (which is, in turn, a superset of the hierarchical model).
C++ has largely followed the same path. The correspondences between single inheritance and the hierarchical model, and between multiple inheritance and the network model, are fairly obvious. The correspondence between C++ templates and the relational model may be less obvious, but it's a pretty close fit anyway.
I haven't seen a formal proof of it, but I believe the capabilities of templates are a superset of those provided by multiple inheritance (which is clearly a superset of single inheritance). The one tricky part is that templates are mostly statically bound - that is, all the binding happens at compile time, not run time. As such, a formal proof that templates provide a superset of the capabilities of inheritance may well be somewhat difficult and complex (or may even be impossible).
In any case, I think that's most of the real reason C++ doesn't use inheritance for its containers--there's no real reason to do so, because inheritance provides only a subset of the capabilities provided by templates. Since templates are basically a necessity in some cases, they might as well be used nearly everywhere.
This question has many great answers. It should also be mentioned that templates support an open design. With the current state of object-oriented programming languages, one has to use the visitor pattern when dealing with such problems, whereas true OOP would support multiple dispatch. See Open Multi-Methods for C++, P. Pirkelbauer et al., for very interesting reading.
Another interesting point about templates is that they can be used for runtime polymorphism as well. For example:
#include <cstddef>

// Fixed-step forward Euler: integrates y' = func(t, y) from t_0 to t_end.
template<class Value, class T>
Value euler_fwd(std::size_t N, double t_0, double t_end, Value y_0, const T& func)
{
    auto dt = (t_end - t_0) / N;
    for (std::size_t k = 0; k < N; ++k)
        y_0 += func(t_0 + k * dt, y_0) * dt;
    return y_0;
}
Notice that this function will also work if Value is a vector of some kind (not std::vector, which should be called std::dynamic_array to avoid confusion)
If func is small, this function will gain a lot from inlining. Example usage:
auto result = euler_fwd(10000, 0.0, 1.0, 1.0,
                        [](double x, double y) { return y; });
In this case, you should know the exact answer (2.718...), but it is easy to construct a simple ODE without an elementary solution (hint: use a polynomial in y).
Now suppose you have a large expression in func, and you use the ODE solver in many places, so your executable gets polluted with template instantiations everywhere. What to do? The first thing to notice is that a regular function pointer works. Then you may want to add currying, so you write an interface and an explicit instantiation:
class OdeFunction
{
public:
    virtual double operator()(double t, double y) const = 0;
};

template double euler_fwd(std::size_t N, double t_0, double t_end, double y_0, const OdeFunction& func);
But the above instantiation only works for double, so why not write the interface as a template:
template<class Value = double>
class OdeFunction
{
public:
    virtual Value operator()(double t, const Value& y) const = 0;
};
and specialize for some common value types:
template double euler_fwd(std::size_t N, double t_0, double t_end, double y_0, const OdeFunction<double>& func);
template vec4_t<double> euler_fwd(std::size_t N, double t_0, double t_end, vec4_t<double> y_0, const OdeFunction<vec4_t<double>>& func);   // native AVX vector with four components
template vec8_t<float> euler_fwd(std::size_t N, double t_0, double t_end, vec8_t<float> y_0, const OdeFunction<vec8_t<float>>& func);      // native AVX vector with eight components
template Vector<double> euler_fwd(std::size_t N, double t_0, double t_end, Vector<double> y_0, const OdeFunction<Vector<double>>& func);   // an N-dimensional real vector, *not* std::vector, see above
If the function had been designed around an interface first, then you would have been forced to inherit from that ABC. Now you have this option, as well as a function pointer, a lambda, or any other function object. The key here is that we must have operator()(), and we must be able to use some arithmetic operators on its return type. Thus, the template machinery would break in this case if C++ did not have operator overloading.
How do you do comparisons with ForwardIterator*'s? That is, how do you check if the item you have is what you're looking for, or you've passed it by?
Most of the time, I would use something like this:
void MyFunc(ForwardIterator<MyType>& i)
which means I know that i is pointing to MyType's, and I know how to compare those. Though it looks like a template, it isn't really (no "template" keyword).
The concept of separating interface from implementation and being able to swap out the implementations is not intrinsic to Object-Oriented Programming. I believe it's an idea that was hatched in Component-Based Development like Microsoft COM. (See my answer on What is Component-Driven Development?) Growing up and learning C++, people were hyped about inheritance and polymorphism. It wasn't until the 90s that people started to say "Program to an 'interface', not an 'implementation'" and "Favor 'object composition' over 'class inheritance'" (both of which are quoted from GoF, by the way).
Then Java came along with a built-in garbage collector and the interface keyword, and all of a sudden it became practical to actually separate interface and implementation. Before you knew it the idea became part of OO. C++, templates, and the STL predate all of this.