What container should I use for the game objects that are created and deleted frequently? - c++

I am doing a game where I create objects and kill them frequently. I must be able to loop the list of objects linearly, in a way that the next object is always newer than previous, so the rendering of the objects will be correct (they will overlap). I also need to be able to store the pointers of each object into a quadtree, to quickly find nearby objects.
My first thought was to use std::list, but I have never done anything like this before, so I am looking for experts' thoughts about this.
What container should I use?
Edit: I am not just deleting from the front: the objects can be killed in any order, but they are always added at the end of the list, so the last item is the newest.

std::vector is the recommended container to start with when you're not sure what you're doing. Only when you know that's not going to work for you should you choose something else.
That said, if you're regularly adding to the back of the container and deleting from the front, you probably want std::deque. [Edit] But it appears that's not what you're doing.
I'm thinking you might want two containers, one to maintain the insert order and one for your quadtree. There are lots of quadtree implementations on the Internet, so I'll focus on the other one. Using std::list as you suggest will make the delete operation itself faster than vector or deque. It also has the advantage of letting you store iterators, because list won't invalidate the other iterators when an element is removed. Your objects in the quadtree could maintain an iterator into the insert order list. When you remove an element from the quadtree, you can remove it from the list too in O(1) time.
As always, the decision about which container to use is all about tradeoffs. A list comes with increased memory footprint over vector and the loss of contiguous memory layout. You might be surprised how much cache locality matters when your data set is large. The only way to know for sure is to try various containers and see which one runs the best for your application.
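The list-plus-stored-iterator idea from this answer could look roughly like the sketch below. GameObject, World and Handle are made-up names for illustration, and the quadtree itself is left out; it would simply store the Handle (or a pointer to the object) returned by spawn.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical game object; the fields are illustrative only.
struct GameObject {
    int id;
    float x, y;
};

// Keeps objects in creation order. A quadtree (not shown) could store Handles.
class World {
public:
    using Handle = std::list<GameObject>::iterator;

    // New objects always go to the back, so iteration is oldest to newest.
    Handle spawn(int id, float x, float y) {
        return objects_.insert(objects_.end(), GameObject{id, x, y});
    }

    // O(1) removal; handles to all other objects stay valid.
    void kill(Handle h) { objects_.erase(h); }

    // Returns ids in draw order (oldest first).
    std::vector<int> draw_order() const {
        std::vector<int> ids;
        for (const GameObject& o : objects_) ids.push_back(o.id);
        return ids;
    }

private:
    std::list<GameObject> objects_;
};
```

Whether this beats a plain std::vector with erase in practice depends on object count and size, so it is worth measuring both.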

I think boost::stable_vector fits your needs for deletion/iteration.

So, you want to be able to iterate through your container in the order in which the items have been added, but you want to be able to remove items from any point in the container. A simple queue obviously isn't going to hack it.
Happily, there are 4 containers that will do this job easily enough: std::vector, std::list, std::deque and std::set. If you use standard container idioms (e.g. begin, end, erase, insert, and to a lesser extent push_back, pop_back, front, back) you can use whichever container you like. With those 8 operations you could switch between std::vector, std::list and std::deque, and with just the first 4 you could use std::set, too (bearing in mind that std::set iterates in sorted key order rather than insertion order, so the key would have to increase with creation time). Write your code, and then you can easily chop and change between the different container types and do a little profiling to compare performance and memory overheads or whatever.
Intuitively, std::list is probably a good bet, and perhaps std::set would work too. But rather than making assumptions, just use the general tools the template library gives you, and profile and optimise things later when you have some meaningful performance data to work with.

Related

C++: What are the reasons for choosing a linked list / deque over a vector? [duplicate]

There's a well known image (cheat sheet) called "C++ Container choice". It's a flow chart to choose the best container for the wanted usage.
Does anybody know if there's already a C++11 version of it?
This is the previous one:
Not that I know of, however it can be done textually I guess. Also, the chart is slightly off, because list is not such a good container in general, and neither is forward_list. Both lists are very specialized containers for niche applications.
To build such a chart, you just need two simple guidelines:
Choose for semantics first
When several choices are available, go for the simplest
Worrying about performance is usually useless at first. The big O considerations only really kick in when you start handling a few thousands (or more) of items.
There are two big categories of containers:
Associative containers: they have a find operation
Simple Sequence containers
and then you can build several adapters on top of them: stack, queue, priority_queue. I will leave the adapters out here, they are sufficiently specialized to be recognizable.
Question 1: Associative ?
If you need to easily search by one key, then you need an associative container
If you need to have the elements sorted, then you need an ordered associative container
Otherwise, jump to the question 2.
Question 1.1: Ordered ?
If you do not need a specific order, use an unordered_ container, otherwise use its traditional ordered counterpart.
Question 1.2: Separate Key ?
If the key is separate from the value, use a map, otherwise use a set
Question 1.3: Duplicates ?
If you want to keep duplicates, use a multi, otherwise do not.
Example:
Suppose that I have several persons with a unique ID associated to them, and I would like to retrieve a person data from its ID as simply as possible.
I want a find function, thus an associative container
1.1. I couldn't care less about order, thus an unordered_ container
1.2. My key (ID) is separate from the value it is associated with, thus a map
1.3. The ID is unique, thus no duplicate should creep in.
The final answer is: std::unordered_map<ID, PersonData>.
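A minimal sketch of that final answer, assuming ID is a plain int and PersonData is a small struct (both are placeholders, as is the find_person helper):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using ID = int;  // assumed key type

struct PersonData {  // assumed payload
    std::string name;
    int age;
};

// Returns a pointer to the person's data, or nullptr when the ID is unknown.
const PersonData* find_person(const std::unordered_map<ID, PersonData>& people, ID id) {
    auto it = people.find(id);
    return it == people.end() ? nullptr : &it->second;
}
```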
Question 2: Memory stable ?
If the elements should be stable in memory (ie, they should not move around when the container itself is modified), then use some list
Otherwise, jump to question 3.
Question 2.1: Which ?
Settle for a list; a forward_list is only useful for lesser memory footprint.
Question 3: Dynamically sized ?
If the container has a known size (at compilation time), and this size will not be altered during the course of the program, and the elements are default constructible or you can provide a full initialization list (using the { ... } syntax), then use an array. It replaces the traditional C-array, but with convenient functions.
Otherwise, jump to question 4.
Question 4: Double-ended ?
If you wish to be able to remove items from both the front and back, then use a deque, otherwise use a vector.
You will note that, by default, unless you need an associative container, your choice will be a vector. It turns out it is also Sutter and Stroustrup's recommendation.
I like Matthieu's answer, but I'm going to restate the flowchart as this:
When to NOT use std::vector
By default, if you need a container of stuff, use std::vector. Thus, every other container is only justified by providing some functionality alternative to std::vector.
Constructors
std::vector requires that its contents are move-constructible, since it needs to be able to shuffle the items around. This is not a terrible burden to place on the contents (note that default constructors are not required, thanks to emplace and so forth). However, most of the other containers don't require any particular constructor (again, thanks to emplace). So if you have an object where you absolutely cannot implement a move constructor, then you will have to pick something else.
A std::deque would be the general replacement, having many of the properties of std::vector, but only insertions at either end of the deque avoid moving elements; inserts in the middle require moving. A std::list places no requirement on its contents.
Needs Bools
std::vector<bool> is... not. Well, it is standard. But it's not a vector in the usual sense, as operations that std::vector normally allows are forbidden. And it most certainly does not contain bools.
Therefore, if you need real vector behavior from a container of bools, you're not going to get it from std::vector<bool>. So you'll have to make do with a std::deque<bool>.
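To make the difference concrete, here is a small sketch. The toggle and toggle_through_reference functions are invented for the example; the static_asserts only restate what the standard says about the two reference types.

```cpp
#include <cassert>
#include <deque>
#include <type_traits>
#include <vector>

// std::vector<bool> hands out a proxy object, not a real bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy type");
// std::deque<bool> stores real bools.
static_assert(std::is_same<std::deque<bool>::reference, bool&>::value,
              "deque<bool>::reference is bool&");

// Takes a real reference; an element of std::vector<bool> cannot be passed here.
void toggle(bool& flag) { flag = !flag; }

bool toggle_through_reference() {
    std::deque<bool> flags(3, false);
    toggle(flags[1]);     // fine: flags[1] is a bool&
    bool* p = &flags[1];  // fine: elements have real addresses
    return *p;            // true after the toggle
}
```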
Searching
If you need to find elements in a container, and the search tag can't just be an index, then you may need to abandon std::vector in favor of set and map. Note the key word "may"; a sorted std::vector is sometimes a reasonable alternative. Or Boost.Container's flat_set/map, which implements a sorted std::vector.
There are now four variations of these, each with their own needs.
Use a map when the search tag is not the same thing as the item you're looking for itself. Otherwise use a set.
Use unordered when you have a lot of items in the container and search performance absolutely needs to be O(1), rather than O(log n).
Use multi if you need multiple items to have the same search tag.
Ordering
If you need a container of items to always be sorted based on a particular comparison operation, you can use a set. Or a multiset if you need multiple items to have the same value.
Or you can use a sorted std::vector, but you'll have to keep it sorted.
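Keeping a std::vector sorted by hand might look like the sketch below (sorted_insert and sorted_contains are illustrative helper names). Lookup is O(log n), but each insert still shifts the elements after the insertion point, so this suits data that is searched far more often than it is modified.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts value at the position that keeps v sorted (duplicates allowed).
void sorted_insert(std::vector<int>& v, int value) {
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}

// Binary search; only valid while v is kept sorted.
bool sorted_contains(const std::vector<int>& v, int value) {
    return std::binary_search(v.begin(), v.end(), value);
}
```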
Stability
When iterators and references are invalidated is sometimes a concern. If you need a list of items, such that you have iterators/pointers to those items in various other places, then std::vector's approach to invalidation may not be appropriate. Any insertion operation may cause invalidation, depending on the current size and capacity.
std::list offers a firm guarantee: an iterator and its associated references/pointers are only invalidated when the item itself is removed from the container. std::forward_list is there if memory is a serious concern.
If that's too strong a guarantee, std::deque offers a weaker but useful guarantee. Invalidation results from insertions in the middle, but insertions at the head or tail cause only invalidation of iterators, not pointers/references to items in the container.
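A quick way to see both guarantees in action (the function names are made up for this sketch):

```cpp
#include <cassert>
#include <deque>
#include <iterator>
#include <list>

// Pointers into a deque survive insertions at either end.
bool deque_keeps_references() {
    std::deque<int> d{1, 2, 3};
    int* middle = &d[1];
    for (int i = 0; i < 1000; ++i) {
        d.push_back(i);
        d.push_front(-i);
    }
    // The original d[1] now sits at index 1001, at the same address.
    return *middle == 2 && middle == &d[1001];
}

// A list iterator stays valid until its own element is erased.
bool list_keeps_iterators() {
    std::list<int> l{1, 2, 3};
    auto it = std::next(l.begin());  // points at 2
    l.push_front(0);
    l.insert(it, 99);                // inserted before 2
    l.erase(l.begin());              // removes 0
    return *it == 2;
}
```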
Insertion Performance
std::vector only provides cheap insertion at the end (and even then, it becomes expensive if you blow capacity).
std::list is expensive in terms of performance (each newly inserted item costs a memory allocation), but it is consistent. It also offers the occasionally indispensable ability to shuffle items around for virtually no performance cost, as well as to trade items with other std::list containers of the same type at no loss of performance. If you need to shuffle things around a lot, use std::list.
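The "trade items with other lists" part refers to std::list::splice, which relinks a node instead of copying it. A small sketch, with transfer as a hypothetical helper name:

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Moves the element at `it` from `from` to the back of `to`
// without copying the string or allocating a new node.
void transfer(std::list<std::string>& from,
              std::list<std::string>& to,
              std::list<std::string>::iterator it) {
    to.splice(to.end(), from, it);
}
```

The iterator and the element's address both remain valid after the splice; the iterator simply refers to a node that now lives in the other list.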
std::deque provides constant-time insertion/removal at the head and tail, but insertion in the middle can be fairly expensive. So if you need to add/remove things from the front as well as the back, std::deque might be what you need.
It should be noted that, thanks to move semantics, std::vector insertion performance may not be as bad as it used to be. Some implementations implemented a form of move semantic-based item copying (the so-called "swaptimization"), but now that moving is part of the language, it's mandated by the standard.
No Dynamic Allocations
std::array is a fine container if you want the fewest possible dynamic allocations. It's just a wrapper around a C-array; this means that its size must be known at compile-time. If you can live with that, then use std::array.
That being said, using std::vector and reserving a size would work just as well for a bounded std::vector. This way, the actual size can vary, and you only get one memory allocation (unless you blow the capacity).
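As a rough check of the "one allocation" claim, the sketch below reserves up front and verifies that the buffer never moves while the vector stays within the bound (fill_without_reallocation is an invented name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a vector up to `bound` elements and reports whether the buffer stayed put.
bool fill_without_reallocation(std::size_t bound) {
    if (bound == 0) return true;
    std::vector<int> v;
    v.reserve(bound);              // the single allocation
    v.push_back(0);
    const int* buffer = v.data();  // address of the first element
    for (std::size_t i = 1; i < bound; ++i) {
        v.push_back(static_cast<int>(i));
    }
    return v.data() == buffer;     // no reallocation happened
}
```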
Here is the C++11 version of the above flowchart. [originally posted without attribution to its original author, Mikael Persson]
Here's a quick spin, although it probably needs work
Should the container let you manage the order of the elements?
Yes:
    Will the container contain always exactly the same number of elements?
    Yes:
        Does the container need a fast move operator?
        Yes: std::vector
        No: std::array
    No:
        Do you absolutely need stable iterators? (be certain!)
        Yes: boost::stable_vector (as a last case fallback, std::list)
        No:
            Do inserts happen only at the ends?
            Yes: std::deque
            No: std::vector
No:
    Are keys associated with Values?
    Yes:
        Do the keys need to be sorted?
        Yes:
            Are there more than one value per key?
            Yes: boost::flat_multimap (as a last case fallback, std::multimap)
            No: boost::flat_map (as a last case fallback, std::map)
        No:
            Are there more than one value per key?
            Yes: std::unordered_multimap
            No: std::unordered_map
    No:
        Are elements read then removed in a certain order?
        Yes:
            Order is:
            Ordered by element: std::priority_queue
            First in First out: std::queue
            First in Last out: std::stack
            Other: Custom based on std::vector?????
        No:
            Should the elements be sorted by value?
            Yes: boost::flat_set
            No: std::vector
You may notice that this differs wildly from the C++03 version, primarily due to the fact that I really do not like linked nodes. The linked node containers can usually be beaten in performance by a non-linked container, except in a few rare situations. If you don't know what those situations are, and have access to boost, don't use linked node containers (std::list, std::slist, std::map, std::multimap, std::set, std::multiset). This list focuses mostly on small and mid-sized containers, because (A) that's 99.99% of what we deal with in code, and (B) large numbers of elements need custom algorithms, not different containers.

Fast data structure that supports finding the minimum element and accessing, inserting, removing and updating data at any index

I'm looking for ideas to implement a templatized sequence container data structure which can beat the performance of std::vector in as many features as possible and potentially perform much faster. It should support the following:
Finding the minimum element (and returning its index)
Insertion at any index
Removal at any index
Accessing and updating any element by index (via operator[])
What would be some good ways to implement such a structure in C++?
You can generally be pretty sure that the STL implementations of all containers tend to be very good at the range of tasks they were designed for. That is to say, you're unlikely to be able to build a container that is as robust as std::vector and quicker for all applications. However, generally speaking, it is almost always possible to beat a generic tool when optimizing for a specific application.
First, let's think about what a vector actually is. You can think of it as a pointer to a C-style array, except that its elements are stored on the heap. Unlike a C array, it also provides a bunch of methods that make it a little bit more convenient to manipulate. But like a C array, all of its data is stored contiguously in memory, so lookups are extremely cheap, but changing its size may require the entire array to be shifted elsewhere in memory to make room for the new elements.
Here are some ideas for how you could do each of the things you're asking for better than a vanilla std::vector:
1. Finding the minimum element: Search is typically O(N) for many containers, and certainly for a vector (because you need to iterate through all elements to find the lowest). You can make it O(1), or very close to free, by caching the smallest element (or its index) and only updating the cache when the container is changed. Note that removing or raising the current minimum still forces an O(N) rescan.
2. Insertion at any index: If your elements are small and there are not many, I wouldn't bother tinkering here, just do what the vector does and keep elements contiguously next to each other to keep lookups quick. If you have large elements, store pointers to the elements instead of the elements themselves (boost's stable vector will do this for you). Keep in mind that this makes lookup more expensive, because you now need to dereference the pointer, so whether you want to do this will depend on your application. If you know the number of elements you are going to insert, std::vector provides the reserve method which preallocates some memory for you, but what it doesn't do is allow you to decide how the size of the allocated memory grows. So if your application warrants lots of push_back operations without enough information to intelligently call reserve, you might be able to beat the standard std::vector implementation by tailoring the growth function of your container to your particular needs. Another option is using a linked list (e.g. std::list), which will beat a std::vector in insertions for larger containers. However, the cost here is that lookup (see 4.) will now become vastly slower (O(N) instead of O(1) for vectors), so you're unlikely to want to go down this path unless you plan to do more insertions/erasures than lookups.
3. Removal at any index: Similar considerations as for 2.
4. Accessing and updating any element by index (via operator[]): The only way you can beat std::vector in this regard is by making sure your data is in the cache when you try to access it. This is because lookup for a vector is essentially an array lookup, which is really just some pointer arithmetic and a pointer dereference. If you don't access your vector often you might be able to squeeze out a few clock cycles by using a custom allocator (see boost pools) and placing your pool close to the stack pointer.
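The cached-minimum idea from the first point could be sketched like this. MinTrackingVector is a made-up name, it is hard-coded to int for brevity, and updates go through a set() method rather than a writable operator[] so the cache can be kept in sync; a real container would need a proxy or a different design to support assignment through operator[].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: a vector wrapper that caches the index of its smallest element.
class MinTrackingVector {
public:
    void insert(std::size_t index, int value) {
        assert(index <= data_.size());
        data_.insert(data_.begin() + static_cast<std::ptrdiff_t>(index), value);
        if (data_.size() == 1) { min_index_ = 0; return; }
        if (index <= min_index_) ++min_index_;  // old minimum shifted right
        if (value < data_[min_index_]) min_index_ = index;
    }

    void erase(std::size_t index) {
        assert(index < data_.size());
        data_.erase(data_.begin() + static_cast<std::ptrdiff_t>(index));
        if (index == min_index_) rescan();      // O(N): the minimum was removed
        else if (index < min_index_) --min_index_;
    }

    void set(std::size_t index, int value) {
        assert(index < data_.size());
        const bool was_min = (index == min_index_);
        const int old = data_[index];
        data_[index] = value;
        if (value < data_[min_index_]) min_index_ = index;
        else if (was_min && value > old) rescan();  // O(N): the minimum grew
    }

    int operator[](std::size_t index) const { return data_[index]; }
    std::size_t size() const { return data_.size(); }

    // O(1); only meaningful when the container is non-empty.
    std::size_t min_index() const {
        assert(!data_.empty());
        return min_index_;
    }

private:
    void rescan() {
        min_index_ = static_cast<std::size_t>(
            std::min_element(data_.begin(), data_.end()) - data_.begin());
    }

    std::vector<int> data_;
    std::size_t min_index_ = 0;
};
```

Insertion and removal are still O(N) here because the underlying storage is a vector; only the minimum query becomes cheap.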
I stopped writing mainly because there are dozens of ways in which you could approach this problem.
At the end of the day, this is probably more of an exercise in teaching you that the implementation of std::vector is likely to be extremely efficient for most compilers. All of these suggestions are essentially micro-optimizations (which are the root of all evil), so please don't blindly apply these in important code, as they're highly likely to end up costing you a lot of time and headache.
However, that's not to say you shouldn't tinker and learn for yourself, so by all means go ahead and try to beat it for your application and let us know how you go! Good luck :)

Keeping an unordered list of small objects with frequent insertions and removals

Suppose I have a list of small objects that I iterate through (say, in a loop) with frequent insertions and removals. However, the sequential order that I iterate through the list does not matter. Instead of using std::list to store the elements, I was thinking about using std::vector in the following way (for constant time removals):
Insertion: use push_back to insert at the end of the array.
Removal: let's say I want to remove an element at position k from a vector of size n. Then, I copy the content of the nth (or (n-1)st, depending on how you see it) element to the kth element and use pop_back. Given that the elements are small, the copy operation shouldn't be costly.
This is to take advantage of contiguous memory and not having to dynamically allocate memory for every insertion. Is there a downside for this approach? I also noticed that C++11 has unordered_set, but I think this may be overkill for what I'm trying to do.
I apologize if this idea sounds blatantly obvious.
Your idea is the basic approach to keep an array efficient. If the order really doesn't matter for you, I think it's the ideal approach. You might want to encapsulate it in a class (a wrapper around std::vector) so that you can employ it in multiple places without code duplication, test it separately and generally follow the "single responsibility" principle.
If you have access to C++11 features, you won't even have to copy the n-th element - you can move it instead, making this feasible even for heavier objects.
I can't see a downside to the approach given your fairly loose requirements.
One other option to consider is that if your item is cheaper to swap than copy, you can swap the last item with the one to delete and then pop your now-swapped item off the end.
It does really sound like unordered_set is too heavyweight for your needs since it has constant time find that you don't need for your requirements.
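The move-then-pop removal described in the question and answers could be wrapped up as below; unordered_erase is just a name chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes v[k] in O(1) by overwriting it with the last element.
// The relative order of the remaining elements is NOT preserved.
template <typename T>
void unordered_erase(std::vector<T>& v, std::size_t k) {
    assert(k < v.size());
    if (k + 1 != v.size()) {
        v[k] = std::move(v.back());  // skipped when k is the last element (avoids self-move)
    }
    v.pop_back();
}
```

One thing to watch when iterating: after removing index k, the element that was last is now at k, so the loop should not advance past k on that step.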

Choosing a STL Container for a very large list

I have a very large list of items (~2 million) that I want to optimize for access speed. I iterate through the items using an iterator (++it).
Right now the code is implemented using std::map<std::wstring, STRUCT>.
I wonder if it's worth replacing the std::map with a std::deque<std::pair<std::wstring, STRUCT>>. I think I would gain from pointer arithmetic and fewer cache misses. Is it worth it?
I know that profiling is the answer but I need an opinion before implementing this ...
If you know the size in advance, then std::vector is clearly the way to go if your objects aren't too big.
std::vector<Object> list;
list.reserve(2000000);
And then fill it as usual.
This is the fastest and least memory-consuming approach. However, you need to be able to allocate enough contiguous memory. Unless your objects are around 1 KB each, that shouldn't be a problem.
With deque, you would lose ( or would have to re-implement ) the advantage of Key-Value pairs. If it's not essential for your data, I would consider using deque.
Generally, if you're only doing search in this set (no insertions/deletions), you're probably better off using a sorted sequential container, like deque or vector. You can then use simple binary search to find the needed elements. The advantage of using a sequential container is that it is better in terms of memory usage, has a very simple implementation, and provides better locality of reference. I'd write one version of the code using vector, and another version of the code using deque, then compare them in terms of performance to decide which one to use in the final version.
However, if your structure needs to be updated (new elements need to be inserted or old elements have to be deleted frequently), map is better choice. Or maybe, you just have to drop STL containers altogether and just use an in-memory database (see SQLite), but it highly depends on what problem you're solving.
The fastest container to iterate through is usually a vector, so if you want to optimize for iteration at the expense of everything else, use that.
Overall app performance of course will depend how many times you iterate, and how you construct your data in the first place. For a simple test, once your map has been populated you can construct a vector from it as follows:
vector<pair<K,V> > myvec(mymap.begin(), mymap.end());
Where K and V are the key and value types of the map. Then just use the vector iterators in place of the map iterators and compare performance.
Of course, if you want to modify the map in future, then normally it would not be appropriate to optimize for iteration at the expense of everything else.
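A fuller version of that experiment might look like the sketch below. Record stands in for the question's STRUCT, and snapshot and sum_values are invented helper names; wrap the iteration in your own timing code to compare it against iterating the map directly.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Record { int value; };  // stand-in for the question's STRUCT

using Entry = std::pair<std::wstring, Record>;

// Copies the populated map into contiguous storage, in key order.
std::vector<Entry> snapshot(const std::map<std::wstring, Record>& m) {
    return std::vector<Entry>(m.begin(), m.end());
}

// Example of a pass over the data that only needs iteration.
long long sum_values(const std::vector<Entry>& v) {
    long long total = 0;
    for (const Entry& kv : v) total += kv.second.value;
    return total;
}
```

Because the map iterates in key order, the snapshot comes out sorted by key, so std::lower_bound can still be used for lookups as long as the vector is not modified.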

C++ smart linear container

Let me explain my problem together with the background, so it is easier to understand why I'm asking for this specific type of thing. I'm developing an instant messenger. Most of the architecture is outlined by my teacher, but implementation details may vary. There is an "Engine" class, EventManager, which registers clients. To identify them and to remove them easily, I use a map (keyed by client ID) or a set of pointers. So far, so good. But this EventManager uses poll() in its main loop, which needs an array of struct pollfd. (select() would also work, but it's nowhere near as comfortable to use as poll(): you have to rebuild the array each time, which I guess is slow and not so nice. I can restrict myself to a UNIX environment, if you ask.) Now every time a client comes or goes, this array needs to be rebuilt. Either I manage a dynamic array by hand and allocate memory every time (baaaaaad), or I use a vector, which would handle inserting a new client's struct pollfd at the end of the container pretty well, or a deque, which would insert and remove anywhere pretty well. Now my two questions are:
If I choose vector, will it automatically shrink and move elements in the middle of itself instead of full reallocation? and
That would anyway copy a lot, if it's in the beginning, so I'd like to use deque. Does that have an array interface (like you would do with vector - &myVector[0]) or is it strictly non-contiguous?
If you remove something from the middle of a vector it will move all the following elements one position towards the beginning. It will not reallocate. You don't have to consider reallocations at all because they are amortized to give O(1) time per insertion.
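This is easy to confirm for yourself. The check below (erase_keeps_buffer is a made-up name) erases from the middle and verifies that the remaining elements were shifted down inside the same buffer:

```cpp
#include <cassert>
#include <vector>

// Erasing from the middle shifts the tail down; the buffer is not reallocated.
bool erase_keeps_buffer() {
    std::vector<int> v{1, 2, 3, 4, 5};
    const int* buffer = v.data();
    v.erase(v.begin() + 2);  // removes the 3
    const std::vector<int> expected{1, 2, 4, 5};
    return v == expected && v.data() == buffer;
}
```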
deque is not much better than vector. Removing from the beginning or end is efficient. Not from the middle. If you remove from anywhere, then it will hopefully be twice as fast as a vector, but not faster. Since it's a more complicated structure it'll probably be even slower. deque doesn't guarantee contiguous storage, so although indexing is allowed and done in O(1) time, you still can't reliably convert it to a pointer.
Anyway it smells like premature optimization. Use vector. Since the order of clients is not significant, you can speed up the erasure of clients by swapping the element that you want to remove with the last element in the vector and calling pop_back() after that.
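For the poll() case specifically, a sketch could look like this. It assumes a POSIX system (for <poll.h>); add_client, remove_client and wait_for_events are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>  // POSIX only
#include <vector>

// Registers a descriptor for readability events at the end of the vector.
void add_client(std::vector<pollfd>& fds, int fd) {
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    fds.push_back(p);
}

// O(1) removal: overwrite slot i with the last entry, then drop the last entry.
// The order of clients changes, which poll() does not care about.
void remove_client(std::vector<pollfd>& fds, std::size_t i) {
    assert(i < fds.size());
    fds[i] = fds.back();
    fds.pop_back();
}

// The vector's storage is contiguous, so it can be handed to poll() directly.
int wait_for_events(std::vector<pollfd>& fds, int timeout_ms) {
    return poll(fds.data(), static_cast<nfds_t>(fds.size()), timeout_ms);
}
```

If the client map stores indices into this vector, remember to update the index of the entry that was moved into slot i.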