I used C++ vectors to implement stacks, queues, heaps, priority queues and directed weighted graphs. In books and references, I have seen large classes for these data structures, all of which can be implemented briefly using vectors. (Maybe there is more flexibility in using pointers.)
Can we also implement more advanced data structures using vectors?
If yes, why do C++ books still explain these concepts with long classes that use pointers?
Is it to keep the lower-level idea in mind, because it is more vivid that way, or to make students comfortable with that kind of pointer usage?
It's true that many data structures can be implemented on top of a vector (an array, for the sake of this answer); essentially all of them can, since every computational task can be made to run on a Turing machine, which has far more basic data-access capabilities. (In the real world, you could say that any program you implement with pointers eventually runs on a CPU with a simple, array-like virtual memory space, so you could just call that one huge array.) However, it's not always wise. Two main reasons:
performance / time complexity - a vector simply can't provide all the basic operations in O(1). (There's a trick for fast initialization, but try randomly inserting values into a large vector and see how badly it performs - that's because every insertion has to shift all the following elements over, again and again. A list can do each insertion as a single operation; a quick sketch follows after these two points.) Of course other structures have their own performance shortcomings, but that's the beauty of designing complicated data structures with these basic building blocks.
structural complexity - you can think of a list along the same lines as a vector, as an ordered container, and perhaps extend this to multidimensional matrices built on top of them, since those still retain some basic ordering. But there are more complicated structures. Take a tree, for example: a simple full binary tree can be implemented with a vector very easily, since the parent-child relations reduce to index arithmetic (sketched below), but what if the tree isn't full and has a varying number of children per node? You may say it can still be done (any graph can be implemented with vectors, through an adjacency matrix or adjacency list for example), but there's almost no sense in doing so when you can have a much simpler implementation using pointer links. Just think of doing an AVL rotation with an array. :shudder:
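As a rough, minimal sketch of the first point (the function name is just for illustration), inserting at the front of a vector shifts every element already stored, while a list just links in a node:

#include <cstddef>
#include <list>
#include <vector>

void front_inserts(std::size_t n) {
    std::vector<int> vec;
    std::list<int> lst;
    for (std::size_t i = 0; i < n; ++i) {
        vec.insert(vec.begin(), static_cast<int>(i)); // shifts all existing elements: O(size)
        lst.insert(lst.begin(), static_cast<int>(i)); // links one node: O(1)
    }
}

And a sketch of the index arithmetic mentioned for a full binary tree stored in a vector - the links are computed rather than stored as pointers:

#include <cstddef>

std::size_t parent(std::size_t i)      { return (i - 1) / 2; }
std::size_t left_child(std::size_t i)  { return 2 * i + 1; }
std::size_t right_child(std::size_t i) { return 2 * i + 2; }
// e.g. for std::vector<int> tree = {9, 7, 8, 3, 5}, right_child(0) == 2, which refers to the 8.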
Mind you, the second argument may very well boil down to performance ("hey, it's an awkward approach, but I still managed to use a vector!"), but it's more than that - it would complicate your code, clutter your data structure design, and make it far more prone to bugs.
Now, here comes the "but" - even though there's much sense in using all the tools the language gives you, it's very widely accepted to use vector-based structures for performance-critical tasks. Look at almost any scientific CPU benchmark: most of them ultimately rely on vectors (uncited, but I can elaborate further if anyone is interested; suffice it to say that even the well-known Graph500 does this).
The reason is not that it's best programming practice, but that it's better suited to the CPU's internal structure and gets more "juice" out of the hardware. That's due to spatial locality - CPUs are very fond of it, as it allows the memory unit to parallelize accesses (in an array you always know where the next element is; in a list you have to wait until the current one is fetched), and to issue stream/stride prefetches that reduce the latency of future requests.
I can't say this is always good practice - when you traverse a graph, the accesses are still pretty irregular even with an array implementation - but it's still very common.
To summarize, taking the question literally - most of them can, of sorts (for a given definition of "most", ok?). But if the intention was "why teach pointers", I believe you can see that in order to understand your limits and what you can and should use, you need to know a great deal more than just arrays, and even more than pointers. A good programmer should know a bit about everything - OS design, CPU design, etc. You can't do anything decent unless you really understand the fabric you're running on, and that unfortunately (or not) includes lots of pointers.
You can implement a kind of allocator using an std::vector as the backing store. If you do that, all the standard data structures from elementary computer science can be implemented on top of vectors. It will hardly free you from using pointers, though: vectors are really just chunks of memory with a few useful additional operations, most notably the ability to expand.
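As a hypothetical, minimal sketch of what "vector as the backing store" can look like (every name below is invented for illustration): the nodes live in a vector, and the "pointers" become indices into it.

#include <cstddef>
#include <vector>

struct IndexList {                      // a singly linked list whose nodes live in a vector
    static const std::size_t npos = static_cast<std::size_t>(-1);  // plays the role of nullptr
    struct Node { int value; std::size_t next; };

    std::vector<Node> nodes;            // the backing store; only ever grows here
    std::size_t head = npos;

    void push_front(int v) {
        nodes.push_back(Node{v, head}); // "allocate" a node by growing the vector
        head = nodes.size() - 1;        // the new node's index becomes the head "pointer"
    }
};

Erasing nodes, iteration and so on are omitted; the point is only that the links are plain indices, which is why this still doesn't free you from pointer-style reasoning.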
More to the point: if you don't understand pointers, you won't understand how to use vector for advanced data structures either. vector is a useful abstraction, but it follows the C++ rule that "you don't get what you don't pay for", so it's also a very "thin" abstraction, and you do pay for it in the amount of code you have to write.
(Jonathan Wakely points out, in the comments, that you won't get the exact guarantees that the C++ standard library requires of allocators and data structures when you implement them on top of vector. But in principle, vectors are just a way of handling blocks of memory.)
If you are learning C++ you need to be familiar with pointers and how to use them, even if there are higher-level abstractions that do that job for you.
Yes, it is possible to implement most data structures with vectors or lists, and if you have just started learning programming it's probably a good idea to know how to write these data structures yourself.
With that being said, production code should always use the standard library unless there is a good reason not to do so.
What's the need to define and implement data structures (e.g. a stack) ourselves if they are already available in the C++ STL?
What are the differences between the two implementations?
First, implementing an existing data structure on your own is a useful exercise. You understand better what it does (and so you understand better what the standard containers do). In particular, you understand better why time complexity is so important.
Then, there is a quality of implementation issue. The standard implementation might not be suitable for you.
Let me give an example. Indeed, std::stack implements a stack. It is a general-purpose implementation. Have you measured sizeof(std::stack<char>)? Have you benchmarked it in the case of a million stacks of 3.2 elements on average, with a Poisson distribution?
Perhaps in your case, you happen to know that you have millions of stacks of chars (never NUL), and that 99% of them have fewer than 4 elements. With that additional knowledge, you probably should be able to implement something "better" than what the standard C++ stack provides. So std::stack<char> would work, but given that extra knowledge you'll be able to implement it differently. You would still (for readability and maintenance) use the same method names as std::stack<char> - so your WeirdSmallStackOfChar would have a push method, etc. If, later during the project, you realize that bigger stacks might be useful (e.g. in 1% of cases), you'll reimplement your stack differently (e.g. if your code base grows to a million lines of C++ and you realize that you quite often have bigger stacks, you might remove your WeirdSmallStackOfChar class and add typedef std::stack<char> WeirdSmallStackOfChar; ...).
If you happen to know that all your stacks have fewer than 4 chars and that \0 is not valid in them, representing such a "stack" as a char w[4] field is probably the wisest approach. It is fast and easy to code.
So, if performance and memory space matter, you might perhaps code something as weird as
class MyWeirdStackOfChars {
    bool small;
    union {
        std::stack<char>* bigstack;   // used in the rare "big stack" case
        char smallstack[4];           // used in the common small case
    };
    // push, pop, etc. go here
};
Of course, that is very incomplete. When small is true your implementation uses smallstack; for the 1% of cases where it is false, your implementation uses bigstack. The rest of MyWeirdStackOfChars is left as an exercise (not that easy) to the reader. Don't forget to follow the rule of five.
Ok, maybe the above example is not convincing. But what about std::map<int,double>? You might have millions of them, and you might know that 99.5% of them have fewer than 5 entries. You obviously could optimize for that case. It is highly probable that representing small maps as an array of pairs of int and double is more efficient, both in terms of memory and in terms of CPU time.
Sometimes you even know that all your maps have fewer than 16 entries (and std::map<int,double> doesn't know that) and that the key is never 0. Then you might represent them differently. In that case, I guess that I would be able to implement something much more efficient than what std::map<int,double> provides (probably, because of cache effects, an array of 16 entries each holding an int and a double is the fastest).
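To make that concrete, here is a hedged sketch of such a specialized small map (the class name and the "fail when full" behavior are invented; a real version needs a growth strategy):

#include <cstddef>

class SmallIntDoubleMap {               // at most 16 entries, key 0 means "empty slot"
    int keys[16] = {};                  // zero-initialized, so all slots start unused
    double values[16] = {};
public:
    bool insert(int key, double value) {
        for (std::size_t i = 0; i < 16; ++i)
            if (keys[i] == 0 || keys[i] == key) { keys[i] = key; values[i] = value; return true; }
        return false;                   // full; a real version might fall back to std::map
    }
    const double* find(int key) const {
        for (std::size_t i = 0; i < 16; ++i)   // a linear scan over 16 contiguous entries is cache-friendly
            if (keys[i] == key) return &values[i];
        return nullptr;
    }
};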
That is why any developer should know the classical algorithms (and have read some Introduction to Algorithms), even if in many cases they would use existing containers. Also be aware of the as-if rule.
The STL implementations of data structures are not perfect for every possible use case.
I like the example of hash tables. I have been using the STL implementation for a while, but I use it mainly for competitive programming contests.
Imagine that you are Google and you have billions of dollars in resources devoted to storing and accessing hash tables. You would probably like to have the best possible implementation for the company's use cases, since it will save resources and make searches faster in general.
Oh, and I forgot to mention that you also have some of the best engineers on the planet working for you (:
(This is a talk by Matt Kulukundis about the new hash table built by his team at Google:)
https://www.youtube.com/watch?v=ncHmEUmJZf4
Some other reasons that justify implementing your own version of Data Structures:
Test your understanding of a specific structure.
Customize part of the structure to some peculiar use case.
Seek better performance than STL for a specific data structure.
Hating STL error messages.
Benchmarking STL against some simple implementation.
I'm implementing C++ code communicating with hardware which runs a number of hardware-assisted data structures (direct-access tables and search trees). So I need to maintain a local cache which stores data before pushing it down to the hardware.
I think that to replicate the hardware tree structure I could choose std::map, but what about the direct table (basically it is implemented as a sequential array of results and allows direct-access lookups)?
Are there close enough analogues in the STL to implement such structures, or would a simple array suffice?
Thanks.
If you are working with hardware structures, you are probably best off mimicking the structures as exactly as possible using C structs and C arrays.
This will give you the ability to map the hardware structure as exactly as possible and to move the data around with a simple memcpy.
The STL will probably not be terribly useful, since it does lots of stuff behind the scenes and you have no control over the memory layout. That would mean each write to hardware involves a complex serialization exercise that you will probably want to avoid.
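For illustration only (the field layout below is invented; the real one comes from your hardware spec), the idea is that a plain struct mirrors the device layout exactly, so pushing an entry down is a single memcpy:

#include <cstdint>
#include <cstring>

struct TableEntry {          // plain-old-data: no vtable, no hidden bookkeeping
    std::uint32_t key;
    std::uint32_t result;
};

void flush_entry(void* hw_window, const TableEntry& e) {
    // hw_window stands in for the mapped device memory region
    std::memcpy(hw_window, &e, sizeof e);
}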
I believe you're looking for std::vector. Or, if the size is known at compile time, std::array (since C++11).
C++11 has std::unordered_map and std::unordered_set, which are analogous to a hash table. Maps are faster for iteration, while sets are faster for lookup.
But first you should run a profiler to see if your data structures are what slows your program down.
According to Bjarne Stroustrup's slides from his Going Native 2012 keynote, insertion and deletion in a std::list are terribly inefficient on modern hardware:
Vector beats list massively for insertion and deletion
If this is indeed true, what use cases are left for std::list? Shouldn't it be deprecated then?
Vector and list solve different problems. List provides the guarantee that iterators never become invalidated as you insert and remove other elements. Vector doesn't make that guarantee.
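A small illustration of that guarantee (just a sketch; the contents are arbitrary):

#include <iterator>
#include <list>
#include <vector>

void stability_demo() {
    std::list<int> l = {1, 2, 3};
    auto lit = std::next(l.begin());   // refers to the 2
    l.push_back(4);                    // lit is still valid: list never relocates its nodes

    std::vector<int> v = {1, 2, 3};
    auto vit = v.begin() + 1;          // refers to the 2
    v.push_back(4);                    // may reallocate: vit may now be dangling
}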
It's not all about performance. So the answer is no: list should not be deprecated.
Edit: Beyond this, C++ isn't designed to work solely on "modern hardware." It is intended to be useful across a much wider range of hardware than that. I'm a programmer in the financial industry and I use C++, but other domains such as embedded devices, programmable controllers, heart-lung machines and myriad others are just as important. The language should not be designed solely with the needs of certain domains and the performance of certain classes of hardware in mind. Just because I might not use a list doesn't mean it should be deprecated from the language.
Whether a vector outperforms a list or not also depends on the type of the elements. For example, for int elements vector is indeed very fast as most of the data fits inside the CPU cache and SIMD instructions can be used for the data copying. So the O(n) complexity of vector doesn't have much impact.
But what about larger data types, where copying doesn't translate to a stream operation, and instead data must be fetched from all over the place? Also, what about hardware that doesn't have large CPU caches and possibly also lacks SIMD instructions? C++ is used on much more than just modern desktop and workstation machines. Deprecating std::list is out of the question.
What Stroustrup is saying in that presentation is that before you pick std::list for your data, you should make sure that it's the right choice for your particular situation. In other words, benchmark and profile. It definitely doesn't say you should always pick std::vector.
No, and especially not based on one particular graph. There are instances where list will perform better than vector. See: http://www.baptiste-wicht.com/2012/12/cpp-benchmark-vector-list-deque/
And that's ignoring the non-performance differences, as others have mentioned.
Bjarne's point in that talk wasn't that you shouldn't use list. It was that people make too many assumptions about list's performance that often turn out to be wrong. He was simply justifying the stance that vector should always be your default go-to container type unless you actually find a need for the performance or other semantic characteristics of lists.
std::list can serve as a deque: it has push_front() and pop_front(). It still has a niche role as such, though it may not be the best choice for a deque.
std::list does not reallocate memory, while std::vector may. Sometimes you don't want an item to move in memory (e.g. a stackful coroutine).
Linked lists are related to tree data structures. Both contain links. If we deprecate std::list, then what about tree-based containers?
Of course not. std::list is a different data structure. Comparing different data structures is a good way to see their properties, advantages and disadvantages, but each data structure has its own advantages.
I am using C++ to code a complicated FFT algorithm, so I need to implement algebraic structures such as quaternions and Hamilton-Eisenstein codes. The algorithm works with a 2D array of those structures. What would be the overhead of implementing them as classes? In other words, should I create an array with [M][N] dimensions consisting of Quaternion classes, or should I create an [M][N][4] array and treat each [4] sub-array as a quaternion? Using classes is more convenient, but creating M*N class instances and accessing their methods instead of working with a plain array - wouldn't that be too much overhead? I'm coding the algorithm for processing large images, so performance is important to me.
IMHO you are better served by implementing them as classes, simply because this will let you write your code more quickly with fewer errors. You should do measurements to see what performs best if that is important to you, but also make sure that this code is actually the performance bottleneck. (Mandatory Donald Knuth quote: "premature optimization is the root of all evil".)
Most compilers will do a very good job of optimizing code for you, I would say. More often than not I find that it is something other than these low-level things that makes the difference, like adding an early-out test or minimizing the data set or whatnot.
For a quaternion, you can still implement the class using an array internally (in case that is actually faster), which should make the difference even less important.
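For example (a sketch only; the class name and the single operation shown are just illustrative), the class can keep a plain four-element array inside, so a 2D array of these has, in practice, the same memory layout as the [M][N][4] alternative:

#include <array>

struct Quaternion {
    std::array<double, 4> q{};   // w, x, y, z stored contiguously

    Quaternion operator+(const Quaternion& o) const {
        Quaternion r;
        for (int i = 0; i < 4; ++i) r.q[i] = q[i] + o.q[i];
        return r;
    }
};
// A Quaternion image[M][N] array then holds the same contiguous doubles as double image[M][N][4].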
You are probably better served by, for instance, making sure that you can run your algorithms in parallel on multicore machines, or making your actual calculations use SSE instructions.
Regarding overhead of classes: Unless your classes have virtual functions, there's no penalty for using classes.
So, for example, an array of complex variables may be written as:
std::complex<double> m[10][10];
Beware of STL collection classes, though, as they tend to use dynamic allocation and sometimes introduce significant overhead (e.g., I wouldn't make arrays using vector<vector<>>).
You might want to investigate the use of a library such as Eigen for fast, optimized, matrix/vector classes.
My question is mostly about the STL rather than the rest of C++, which (I guess) can be about as fast as C as long as classes aren't used at every corner.
The STL is standard in games and in engines like OGRE3D. Its features are nice to use, but the problem is that I don't really know how they work internally, so I'd like to know which features can cause serious performance hogs before I start using them.
I'm very excited to begin that game programming school, and apparently there is no way I'm going to avoid using those advanced features.
Using the STL tends to generate code that is as efficient as, if not more efficient than, hand-written code in many cases.
Use a profiler to see where you have problems.
Even where the STL might perform worse, its code is likely to be less error-prone. So only write your own code if the profiler shows there is an issue.
1) Debug builds. These severely slow down many STL containers due to excessive error checking, at least on Microsoft compilers.
2) Excessive dynamic memory allocation. If you have a routine that contains a std::vector within it and you call it a few thousand times per frame, it will be very slow, and the bottleneck will be somewhere within operator new or another memory allocation routine. If you turn this vector into some kind of static buffer (so you don't have to recreate it every time), it will be much faster. Memory allocation is slow; if you have a buffer, it is normally better to reuse it instead of making a new one during each call (see the sketch after this list).
3) Using inefficient algorithms. For example, using linear search instead of binary search, or using the wrong sort algorithm (for example, quicksort and heap sort are faster than bubble sort on unsorted data, but insertion sort can be faster than quicksort on partially sorted data). Searching instead of using std::map, and so on.
4) Using the wrong kind of container. For example, std::vector isn't suitable for inserting elements at random places. std::deque, while comparable with std::vector for random access and allowing fast push_front and push_back, can be 10 times slower than std::vector if you subtract two iterators (on MSVC, again).
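Regarding point 2, a minimal sketch of the buffer-reuse idea (the names are just illustrative):

#include <vector>

void process_frame(std::vector<float>& scratch) {  // the caller keeps scratch alive across frames
    scratch.clear();    // the length drops to zero, but the allocated capacity is kept
    // ... fill and use scratch; once the capacity has stabilized, no further calls to operator new ...
}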
Unless you're building the entire engine from scratch, you're not going to notice a difference in using or not using C++ classes or STL. In general, most of the time the CPU is going to be running code that's not even written by you anyway. Plus the overhead imposed by any scripting languages you implement (Lua, Python, etc) will eclipse any small performance penalty you may incur by using C++/STL.
In short, don't worry about it. Writing good OOP code is better than trying to write super-fast code from the get-go. "Premature optimization is the root of much programming evil."
Actually, you can use classes everywhere, and still get as good of performance as C (and often better performance than typical C).
Most STL is designed to do most of its "tricky" parts at compile time, so the run-time performance is excellent. The main thing to look out for (especially if you write for things like game consoles or mobile phones, that have less capable graphics hardware) is structuring your data to work well with a cache.
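One common way to do that structuring (a sketch with invented names): keep the fields the hot loop touches contiguous instead of interleaving them with cold data.

#include <string>
#include <vector>

struct EntityAoS {                       // array-of-structures: hot and cold data interleaved
    float x, y, z;
    std::string debugName;               // cold field dragged through the cache anyway
    bool active;
};

struct EntitiesSoA {                     // structure-of-arrays: an update loop streams x, y, z
    std::vector<float> x, y, z;          // hot data, touched every frame, tightly packed
    std::vector<std::string> debugName;  // cold data kept out of the hot cache lines
    std::vector<char> active;
};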
Here are, in my opinion, the key points to writing performant C++/STL code:
Learn what memory allocation strategies are for each STL container,
Learn what algorithms work best with what iterator categories,
Learn run-time polymorphism vs compile-time polymorphism.
Good starting points are:
SGI STL Programmer's Guide,
STL Reference,
The Definitive C++ Book Guide and List (SO).
I recommend Effective STL by Scott Meyers. The biggest performance hog of the STL is a lack of understanding of it. Please learn it well!
Also see Optimizing software in C++ by Agner Fog for C++ specific performance related topics.
The other answers are all accurate: the problems with STL and game programming are mostly with misuse.
My general approach is the following:
1. Write it with STL and carefully choose the appropriate algorithm, container, etc.
2. Profile for bottlenecks.
3. If it's STL causing the problem, replace it.
Optimizing too early can really slow you down and only cause more problems later.
Of course, it depends on platform as well. Sometimes, you have to write all of your own stuff because you simply can't afford the extra CPU/RAM overhead of STL.
Stroustrup talks about the design and performance of the STL in general, and specifically about the different performance characteristics of the various different container types, in his book The C++ Programming Language (3rd edition).
I don't have experience in gaming, but Electronic Arts developed their own (non-conforming) implementation of the STL. There is an extensive article explaining the motives and design of the library here.
Note that in many cases, you will be better off using the STL that comes with your implementation, then measuring, then measuring again and making sure that you understand what is going on and what is really a performance problem. Only then, and only if the problem is within the STL (and not in how the STL is used), would I use non-standard libraries.
These rarely become performance hogs if used correctly. A profiler should always be your primary means of finding bottlenecks in your code, short of obvious algorithmic inefficiencies (in which case it's still good practice to use a profiler to make sure, if you are on a tight deadline).
There are some legitimate efficiency concerns, however, if STL usage does show up as a profiler hotspot.
vector<ExpensiveElement> v;
// insert a lot of elements to v
v.push_back(ExpensiveElement(...));
This push_back immediately above has the worst case scenario of having to linearly copy all the ExpensiveElements inserted so far (if we've exceeded the current capacity). In the best case scenario, we still have to copy ExpensiveElement one time unnecessarily.
We can mitigate the issue by making vector store shared_ptr, but now we pay for two additional heap allocations per ExpensiveElement inserted (one for the reference counter in boost::shared_ptr, and one for ExpensiveElement) along with the overhead of a pointer indirection each time we want to access ExpensiveElement stored in the vector.
To mitigate the memory allocation/deallocation overhead (generally more likely to be a hotspot than an additional level of indirection), we can implement a fast memory allocator for ExpensiveElement. Nevertheless, imagine if std::vector provided an alloc_back method:
new (v.alloc_back()) ExpensiveElement(...);
This would avoid any copy ctor overhead, but is unsafe and prone to abuse. Nevertheless, that's exactly what I did with our vector-clone in response to hotspots. Note that I work in raytracing which is a field where performance is often one of the highest measures of quality (other than correctness) and we profile our code daily so it's not like we just decided out of the blue that vector wasn't efficient enough for our purposes.
We also had no choice but to implement a vector clone because we provide a software development kit where other people's std::vector implementations may be incompatible with our own. I don't want to give you the wrong idea: explore these kinds of solutions only if your profiler sessions really call for it!
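For what it's worth, on a post-C++11 std::vector (an assumption; the snippet above predates it), reserve() plus emplace_back() captures much of what the hypothetical alloc_back() was after, without the placement-new trick (ExpensiveElement is the type from the snippet above):

#include <cstddef>
#include <vector>

void fill(std::vector<ExpensiveElement>& v, std::size_t expectedCount) {
    v.reserve(expectedCount);   // no reallocation copies during the loop below
    for (std::size_t i = 0; i < expectedCount; ++i)
        v.emplace_back(/* constructor arguments, constructed in place */);
}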
Another common source of inefficiency is when using linked STL containers like set, multiset, map, multimap, and list. However, that's not necessarily their fault, but rather the fault of the default std::allocator being used. These perform a separate memory allocation/deallocation per node so the default allocator can be pretty slow for these purposes, especially across multiple threads (thread contention, better off with per-thread memory pools). You can really get a speed boost by writing your own memory allocator (though this is not a trivial thing to do and don't forget alignment if you do).
I can't emphasize enough that these kinds of optimizations should only be applied in response to the profiler. You'll make your code harder to use and maintain this way, so you should be doing it only in exchange for a solid, demonstrable boost in your application's performance.
This book covers what issues you face when using C++ in games.