Fast data structure that supports finding the minimum element and accessing, inserting, removing and updating data at any index - c++

I'm looking for ideas to implement a templatized sequence container data structure which can beat the performance of std::vector in as many features as possible and potentially perform much faster. It should support the following:
Finding the minimum element (and returning it's index)
Insertion at any index
Removal at any index
Accessing and updating any element by index (via operator[])
What would be some good ways to implement such a structure in C++?

You generally be pretty sure that the STL implementations of all containers tend to be very good at the range of tasks they were designed for. That is to say, you're unlikely to be able to build a container that is as robust as std::vector and quicker for all applications. However, generally speaking, it is almost always possible to beat a generic tool when optimizing for a specific application.
First, let's think about what a vector actually is. You can think of it as a pointer to a c-style array, except that its elements are stored on the heap. Unlike a c array, it also provides a bunch of methods that make it a little bit more convenient to manipulate. But like a c-array, all of it's data is stored contiguously in memory, so lookups are extremely cheap, but changing its size may require the entire array to be shifted elsewhere in memory to make room for the new elements.
Here are some ideas for how you could do each of the things you're asking for better than a vanilla std::vector:
Finding the minimum element: Search is typically O(N) for many containers, and certainly for a vector (because you need to iterate through all elements to find the lowest). You can make it O(1), or very close to free, by simply keeping the smallest element at all times, and only updating it when the container is changed.
Insertion at any index: If your elements are small and there are not many, I wouldn't bother tinkering here, just do what the vector does and keep elements contiguously next to each other to keep lookups quick. If you have large elements, store pointers to the elements instead of the elements themselves (boost's stable vector will do this for you). Keep in mind that this make lookup more expensive, because you now need to dereference the pointer, so whether you want to do this will depend on your application. If you know the number of elements you are going to insert, std::vector provides the reserve method which preallocates some memory for you, but what it doesn't do is allow you to decide how the size of the allocated memory grows. So if your application warrants lots of push_back operations without enough information to intelligently call reserve, you might be able to beat the standard std::vector implementation by tailoring the growth function of your container to your particular needs. Another option is using a linked list (e.g. std::list), which will beat an std::vector in insertions for larger containers. However, the cost here is that lookup (see 4.) will now become vastly slower (O(N) instead of O(1) for vectors), so you're unlikely to want to go down this path unless you plan to do more insertions/erasures than lookups.
Removal at any index: Similar considerations as for 2.
Accessing and updating any element by index (via operator[]): The only way you can beat std::vector in this regard is by making sure your data is in the cache when you try to access it. This is because lookup for a vector is essentially an array lookup, which is really just some pointer arithmetic and a pointer dereference. If you don't access your vector often you might be able to squeeze out a few clock cycles by using a custom allocator (see boost pools) and placing your pool close to the stack pointer.
I stopped writing mainly because there are dozens of ways in which you could approach this problem.
At the end of the day, this is probably more of an exercise in teaching you that the implementation of std::vector is likely to be extremely efficient for most compilers. All of these suggestions are essentially micro-optimizations (which are the root of all evil), so please don't blindly apply these in important code, as they're highly likely to end up costing you a lot of time and headache.
However, that's not to say you shouldn't tinker and learn for yourself, so by all means go ahead and try to beat it for your application and let us know how you go! Good luck :)

Related

Memory efficient std::map alternative

I'm using a std::map to store about 20 million entries. If they were stored without any container overhead, it would take approximately 650MB of memory. However, since they are stored using std::map, it uses up about 15GB of memory (i.e. too much).
The reason I am using an std::map is because I need to find keys that are equal to/larger/smaller than x. This is why something like sparsehash wouldn't work (since, using that, I cannot find keys by comparison).
Is there an alternative to using std::map (or ordered maps in general) that would result in less memory usage?
EDIT: Writing performance is much more important than reading performance. It will probably only read ~10 entries, but I don't know which entries it will read.
One alternative would be to use flat_map from Boost.Containers: that supports the same interface as std::map, but is backed by a sorted contiguous array (think std::vector) instead of a tree. Or hand-roll your own solution based on the same idea.
Its performance characteristic is of course different, due to the different back-end. It's up to you to evaluate whether it's usable in your case.
Are you writing on-the-fly or one time before the lookup is done? If the later is the case, you shouldn't need a map, you could use std::vector and one-time sort.
You could just insert everything unsorted to the vector, sort one-time after everything is there (O(N * log N) as well as std::map, but much better performance characteristics) and then lookup in the sorted array (O(logN) as the std::map).
And especially if you know the number of elements before reading and could reserve the vector size upfront, that could work pretty well. Or at least if you know some "upper bound" to reserve perhaps slightly more than actually needed but avoid the reallocations.
Given your requirements:
Insertion needs to be quick
There are many elements to read
Read-back can be slow
You only read back data once
I'd consider typedef std::pair<uint64, thirty_six_byte_struct> element; and populate a std::list<element>. That will be hard to beat in terms of performance.
For reading back, I'd simply traverse the linked list, checking at every point if you need one of those elements. That's a O(N) traversal but as you say, you'll only do that once.
Turns out the issue wasn't std::map.
I realized was using 3 separate maps to represent various parts of the same data, and after slimming it down to 1, the difference in memory was entirely negligible.
Looking at the code a little more, I realized code I had written to free a really expensive struct (per element of the map) didn't actually work.
Fixing that part, it now uses <1GB of memory, as it should! :)
TL;DR: std::map's overhead is entirely negligible for this. The issue was my own.

Would I see a performance gain using std::map instead of vector<pair<string, string> >?

I currently have some code where I am using a vector of pair<string,string>. This is used to store some data from XML parsing and as such, the process is quite slow in places. In terms of trying to speed up the entire process I was wondering if there would be any performance advantage in switching from vector<pair<string,string> > to std::map<string,string> ? I could code it up and run a profiler, but I thought I would see if I could get an answer that suggests some obvious performance gain first. I am not required to do any sorting, I simply add items to the vector, then at a later stage iterate over the contents and do some processing - I have no need for sorting or anything of that nature. I am guessing that perhaps I would not get any performance gain, but I have never actually used a std::map before so I don't know without asking or coding it all up.
No. If (as you say) you are simply iterating over the collection, you will see a small (probably not measurable) performance decrease by using a std::map.
Maps are for accessing a value by its key. If you never do this, map is a bad choice for a container.
If you are not modifying your vector<pair<string,string> > - just iterating it over and over - you will get perfomance degradation by using map. This is because typical map is organized with binary tree of objects, each of which can be allocated in different memory blocks (unless you write own allocator). Plus, each node of map manages pointers to neighbor objects, so it's time and memory overhead, too. But, search by key is O(log) operation. On other side, vector holds data in one block, so processor cache usually feels better with it. Searching in vector is actually O(N) operation which is not so good but acceptable. Search in sorted vector can be upgraded to O(log) using lower_bound etc functions.
It depends on operations you doing on this data. If you make many searches - probably its better to use hashing container like unordered_map since search by key in this containers is O(1) operation. For iterating, as mentioned, vector is faster.
Probably it is worth to replace string in your pair, but this highly depends on what you hold there and how access container.
The answer depends on what you are doing with these data structures and what the size of them is. If you have thousands of elements in your std::vector<std::pair<std::stringm std::string> > and you keep searching for the first element over and over, using a std::map<std::string, std::string> may improve the performance (you might want to consider using std::unordered_map<std::string, std::string> for this use case, instead). If your vectors are relatively small and you don't trying to insert elements into the middle too often, using vectors may very well be faster. If you just iterate over the elements, vectors are a lot faster than maps: iterations isn't really one of their strength. Maps are good at looking things up, assuming the number of elements isn't really small because otherwise a linear search over a vector is still faster.
The best way to determine where the time is spent is to profile the code: it is often not entirely clear up front where the time is spent. Frequently, the suspected hot-spots are actually non-problematic and other areas show unexpected performance problems. For example, you might be passing your objects my value rather than by reference at some obscure place.
If your usage pattern is such that you perform many insertions before performing any lookups, then you might benefit from implementing a "lazy" map where the elements are sorted on demand (i.e. when you acquire an iterator, perform a lookup, etc).
As C++ say std::vector sort items in a linear memory, so first it allocate a memory block with an initial capacity and then when you want to insert new item into vector it will check if it has more room or not and if not it will allocate a new buffer with more space, copy construct all items into new buffer and then delete source buffer and set it to new one.
When you just start inserting items into vector and you have lot of items you suffer from too many reallocation, copy construction and destructor call.
In order to solve this problem, if you now count of input items (not exact but its usual length) you can reserve some memory for the vector and avoid reallocation and every thing.
if you have no idea about the size you can use a collection like std::list witch never reallocate its internal items.

Quickest Queue Container (C++)

I'm using queue's and priority queues, through which I plan on pumping a lot of data quite quickly.
Therefore, I want my q's and pq's to be responsive to additions and subtractions.
What are the relative merits of using a vector, list, or deque as the underlying container?
Update:
At the time of writing, both Mike Seymour and Steve Townsend's answers below are worth reading. Thanks both!
The only way to be sure how the choice effects performance is to measure it, in a situation similar to your expected use cases. That said, here are some observations:
For std::queue:
std::deque is usually the best choice; it supports all the necessary operations in constant time, and allocates memory in chunks as it grows.
std::list also supports the necessary operations, but may be slower due to more memory allocations; in special circumstances, you might be able to get good results by allocating from a dedicated object pool, but that's not entirely straightforward.
std::vector can't be used as it doesn't have a pop_front() operation; such an operation would be slow, as it would have to move all the remaining elements.
A potentially faster, but less flexible, alternative is to implement a circular buffer over a fixed-size array (e.g. std::array, or a std::vector that you don't resize). You'll need to deal with the case of it filling up, either by reporting an error, or allocating a larger buffer and copying all the data.
For std::priority_queue:
std::vector is usually the best choice; it grows exponentially (reducing the number of memory allocations), and is a simple data structure that's very fast to access - an iterator can be implemented simply as a wrapper around a pointer.
std::deque may be slower as it typically grows linearly (requiring more memory allocation), and access is more complicated than with a vector.
std::list can't be used as it doesn't provide random access.
To summarise - the defaults are usually the best choice, but if speed really is important, then measure the alternatives.
I would use std::queue for your basic queue which is (by default at least) a wrapper on deque. Do something more special-purpose if that does not work for you.
std::priority_queue also exists (over vector by default) but the added semantics make it more likely that you could have to roll your own here, depending on perf observed for your particular access pattern.
vector has storage characteristics which make it very ill-suited to removal from front of the dataset. A lot of shuffling down to be done every time you pop_front. For a simple queue, avoid this.
list is likely to be too expensive for any high-hit queue, because by contract it has to offer function you don't need. It could be a candidate for use as a priority queue but my instinct is always to trust the STL.
vector would implement a stack as your fast insertion is at the end and fast removal is also at the end. If you want a FIFO queue, vector would be the wrong implementation to use.
deque or list both provide constant time insertion at either end. list is good for LRU caches where you want to move elements out of the middle fast and where you want your iterators to remain valid no matter how much you move them about. deque is generally used when insertions and deletions are at the end.
The main thing I need to ask about your collection is whether they are accessed by multiple threads. I sort-of assume they are, in which case one of your primary aims is to reduce locking. This is best done if you at least have a multi_push and multi_get feature so that you can put more than one element on at a time without any locking.
There are also lock-free containers or semi-lock-free containers.
You will probably find that your locking strategy is more important than any performance in the collection itself as long as your operations are all constant-time.

Best STL data structure to find unordered elements

I'm currently trying to implement a hash table in C++ for a homework...
I've chosen to use internal linking as a solution for collisions in the table...
and I'm looking for a good STL container that will find a specific entry in an unordered set of data.
I can't use an stl container that is based on trees (set, map, trees, etc...)
Right now I'm using a vector, is it a good choice? The search time will be linear, right? Can it be better?
As you're saying I assume the buckets can get big..., it's better to use std::list. Searching is linear in both cases, but adding elements is constant in std::list.
I guess they're all the same, since data isn't ordered - No, they are not. If they were, there would be just one container. Each container has it's own advantages and disadvantages, different containers are used for different situations.
A little information about vector:
std::vector has capacity, that's why it has capacity() and size() methods. They're both different. So, suppose the capacity is 4 and you have 2 elements, then size will be 2. So, adding another element will increment the size (will be 3) and it's all very fast.
But what happens when you have to add 5+ elements and the capacity is 4? Completely new memory is allocated, all old elements are copied in the new memory, all old elements are destroyed (their destructors are called, if user-defined types). Then the old memory has to be freed. These are expensive operations if you think that adding/removing elements will be more often.
You can avoid this, using std::vector::reserve method to reserve some memory in advance and not reallocate new memory all the time and copy everything over and over again. But this is useful when you know the approximate size of these vectors. I suppose you don't in your situation( reserving much memory is't a good solution, too - you should not waste memory just like that ) So, again, I'd prefer std::list.
Or double hash.
Anyway, this allocating of new memory and copying of objects will not happen that often, as std::vector is "clever" and when allocate new space, it doesn't increase the capacity with only 1 element or something. I think it doubles it, but I'm not that sure about that. Argh, I don't know how exactly this is called in English.. Probably something like "accumulative time/memory" or "accumulative complexity" :? Don't know :/
NOTE: Whatever you choose, I'd suggest you to pay your attention at the hash-function. It's the most important here. A hash container should NOT have too many elements with the same hash. So, my advice is to search for a good hash-function and then this will not matter that much.
Hope that helped (:
EDIT: I'd recommend you this article - comparing std::vector and std::deque - it's perfect - compares memory usage (allocating, deallocating, growing), CPU usage, etc. I'd recommend the whole site for such articles - there aren't many, but are really well written.
std::tr1::unordered_set could be what you need.

selection of data structure

I use C++, say i want to store 40 usernames, I will simply use an array. However, if I want to store 40000 usernames is this still a good idea in terms of search speed? Which data structure should I use to improve this speed?
You need to specify what the insertion and removal requirements are. Do things need to be removed and inserted at random points in the sequence?
Also, why the requirement to search sequentially? Are you doing searches that aren't suitable for a hash table lookup?
At the moment I'd suggest a deque or a list. Often it's best to choose a container with the interface that makes for the simplest implementation for your algorithm and then only change the choice if the performance is inadequate and an alternative provides the necessary speedup.
A vector has two principle advantages, there is no per-object memory overhead, although vectors will over-allocate to prevent frequent copying and objects are stored contiguously so sequential access tends to be fast. These are also its disadvantages. Growing vectors require reallocation and copying, and insertion and removal from anywhere other than the end of the vector also require copying. Contiguous storage can produce problems for vectors with large numbers of objects or large objects as the contiguous storage requirements can be hard to satisfy even with only mild memory fragmentation.
A list doesn't require contigous storage but list nodes usually have a per-object overhead of two pointers (in most implementation). This can be significant in list of very small objects (e.g. in a list of pointers, each node is 3x the size of the data item). Insertion and removal from the middle of a list is very cheap though and list nodes never need to me moved in memory once created.
A deque uses chunked storage, so it has a low per-object overhead similar to a vector, but doesn't require contiguous storage over the whole container so doesn't have the same problem with fragmented memory spaces. It is often a very good choice for collections and is often overlooked.
As a rule of thumb, prefer vector to list or, diety forbid, C-style array.
After the vector is filled, make sure it is properly ordered using the sort algorithm. You can then search for a particular record using either find, binary_search or lower_bound. (You don't need to sort to use find.)
Seriously unless you are in a resource constrained environment (embedded platform, phone, or other). Use a std::map, save the effort of doing sorting or searching and let the container take care of everything. This will possibly be a sorted tree structure, probably balance (e.g. Red-Black), which means you will get good searching performance. Unless the size of you data is close to the size of one or two pointers, the memory overhead of whatever data structure you pick is negligable. You Graphics Card probably has more memory that you are going to use up for the data you are think about.
As others said there is very little good reason to use vanilla array, if you don't want to use a map use std::vector or std::list depending on whether you need insert/delete data (=>list) or not (=>vector)
Also consider if you really need all that data in memory, how about putting it on disk via sqlite. Or even use sqlite for in memory access. It all depends on what you need to do with your data.
std::vector and std::list seem good for this task. You can use an array if you know the maximum number of records beforehands.
If you need only sequentially search and storage, then list is the proper container.
Also, vector wouldn't be a bad choice.