STL container that combines sequential ordering with constant access time? - c++

This strikes me as something I should have been able to find on Stackoverflow, but maybe I'm searching for the wrong terms here.
I have the scenario where there is a class
class Foo
{
int key;
int b;
...
}
and I want to push new elements of that class onto a list. Number of items is unknown beforehand.
At the same time, I want to be able to quickly check the existence of (and retrieve) an element with a certain key, e.g. key==5.
So, to summarize:
It should have O(1) (or thereabouts) for existence/retrieval/deletion
It should retain the order of the pushed items
One solution that strikes me is "use a std::list to store the items and a std::unordered_map to retrieve them, with some book-keeping". I could of course implement it myself, but I was wondering whether something like this already exists in a convenient form in the STL.
EDIT: To preempt the suggestion, std::map is not a solution, because it orders based on some key. That is, it won't retain the ordering in which I pushed the items.

The STL has no such container, but you can write your own with std::unordered_map and std::list: store the key-value pairs in the std::list, and map each key to its list iterator in the std::unordered_map.
void add(const Key& key, const Value& value)
{
    // Assumes key is not already present; otherwise the old list node is orphaned.
    auto iterator = list.insert(list.end(), std::pair<Key, Value>(key, value));
    map[key] = iterator;
}
void remove(const Key& key)
{
    auto found = map.find(key); // map[key] would insert an invalid iterator for a missing key
    if (found == map.end()) return;
    list.erase(found->second);
    map.erase(found);
}
Or use the Boost.MultiIndex container:
http://www.boost.org/doc/libs/1_57_0/libs/multi_index/doc/tutorial/index.html
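For reference, the list-plus-map idea can be fleshed out roughly as follows. This is only a minimal sketch: the class name InsertionOrderedMap and its member names are made up for illustration, duplicate keys are simply rejected, and lookups are average O(1) rather than guaranteed O(1).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical name; a sketch of the list + unordered_map idea, not a standard container.
template <typename Key, typename Value>
class InsertionOrderedMap {
    using Item = std::pair<const Key, Value>;
    using List = std::list<Item>;
    List items_;                                              // keeps insertion order
    std::unordered_map<Key, typename List::iterator> index_;  // key -> position in items_

public:
    using const_iterator = typename List::const_iterator;

    // Appends (key, value); returns false and changes nothing if the key exists.
    bool push_back(const Key& key, const Value& value) {
        if (index_.count(key) != 0) return false;
        items_.emplace_back(key, value);
        index_.emplace(key, std::prev(items_.end()));
        return true;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    Value* find(const Key& key) {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &it->second->second;
    }

    // Removes the element with this key; returns whether anything was removed.
    bool erase(const Key& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        items_.erase(it->second);  // erasing a list node leaves other iterators valid
        index_.erase(it);
        return true;
    }

    std::size_t size() const { return items_.size(); }
    const_iterator begin() const { return items_.begin(); }  // iterates in insertion order
    const_iterator end() const { return items_.end(); }
};
```

std::list is used for the ordered storage because erasing one node never invalidates iterators to the others, so the iterators kept in the map stay usable.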

Related

Container with key and sorting criteria separate

I want to have a collection of items that is searchable by a key (an unsigned value), but I want the elements to be sorted by a different criterion, i.e. the last-accessed time (which is part of the value).
How can I achieve this in C++? I can sort them separately on demand, but can I create the container itself such that sorting happens automatically?
Are there ready-made containers (in Boost) that have a similar feature built in?
You could probably implement something of this kind, using std::list and std::unordered_map pointing to each other.
#include <cassert>
#include <list>
#include <unordered_map>
template <typename A>
struct Cache {
    using key = unsigned;
    struct Composite {
        A *a; // non-owning; the caller keeps the object alive
        std::list<key>::iterator it; // position of the key in the recency list
    };
    std::unordered_map<key, Composite> map;
    std::list<key> list; // most recently accessed key at the front
    void insert(key k, A &a) { // Assuming inserting counts as accessing
        auto found = map.find(k);
        if (found != map.end()) list.erase(found->second.it); // drop the stale position
        list.push_front(k);
        map[k] = Composite{&a, list.begin()};
    }
    A &operator[](key k) {
        Composite &c = map.at(k); // throws std::out_of_range for an unknown key
        list.splice(list.begin(), list, c.it); // move to the front; c.it stays valid
        return *c.a;
    }
    A &last_accessed() { // or whatever else you wish to implement
        assert(!list.empty());
        return *map.at(list.front()).a;
    }
};
This solution is optimized for keeping track of which element was accessed last. If you want to sort given a different attribute, you can follow a similar process but use an std::set to store the values with your comparison function, and then iterators to that from an std::unordered_map hashed with a key of your choice.
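The std::set variant mentioned above might look roughly like this. It is a sketch under assumptions: the class name RankedIndex, the use of a long "rank" as the sorting attribute, and all member names are invented for the example, and only the keys are stored.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: keys are looked up through a hash map but iterated in the
// order of a separate "rank" (for example a timestamp).
class RankedIndex {
    using Entry = std::pair<long, unsigned>;  // (rank, key); ordered by rank, then key
    std::set<Entry> by_rank_;
    std::unordered_map<unsigned, std::set<Entry>::iterator> by_key_;

public:
    // Inserts the key, or moves an existing key to a new rank.
    void set_rank(unsigned key, long rank) {
        auto found = by_key_.find(key);
        if (found != by_key_.end()) {
            by_rank_.erase(found->second);  // drop the old position
            found->second = by_rank_.emplace(rank, key).first;
        } else {
            by_key_.emplace(key, by_rank_.emplace(rank, key).first);
        }
    }

    bool contains(unsigned key) const { return by_key_.count(key) != 0; }

    // Key with the smallest rank; precondition: the index is not empty.
    unsigned lowest() const {
        assert(!by_rank_.empty());
        return by_rank_.begin()->second;
    }

    // Key with the largest rank; precondition: the index is not empty.
    unsigned highest() const {
        assert(!by_rank_.empty());
        return by_rank_.rbegin()->second;
    }
};
```

Lookup by key is average O(1) through the hash map, while changing a rank costs O(log n) because the entry has to be re-inserted into the set.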

Why does `std::unordered_map` "speak like the Yoda" - re-arrange elements?

When trying to write the std::string keys of an std::unordered_map in the following example, the keys get written in a different order than the one given by the initializer list:
#include <initializer_list>
#include <iostream>
#include <string>
#include <unordered_map>
class Data
{
typedef std::unordered_map<std::string, double> MapType;
typedef MapType::const_iterator const_iterator;
MapType map_;
public:
Data(const std::initializer_list<std::string>& i)
{
int counter = 0;
for (const auto& name : i)
{
map_[name] = counter++;
}
}
const_iterator begin() const
{
return map_.begin();
}
const_iterator end() const
{
return map_.end();
}
};
std::ostream& operator<<(std::ostream& os, const Data& d)
{
for (const auto& pair : d)
{
os << pair.first << " ";
}
return os;
}
using namespace std;
int main(int argc, const char *argv[])
{
Data d = {"Why", "am", "I", "sorted"};
// The unordered_map speaks like Yoda.
cout << d << endl;
return 0;
}
I expected to see 'Why am I sorted', but I got a Yoda-like output:
sorted I am Why
Reading on the unordered_map here, I saw this:
Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its key. This allows fast access to individual elements, since once hash is computed, it refers to the exact bucket the element is placed into.
Is this why the elements are not ordered in the same way as in the initializer list?
What data structure do I then use when I want the keys to be ordered in the same way as the initializer list? Should I internally keep a vector of strings to somehow save the argument order? Can the bucket organization be turned off somehow by choosing a specific hashing function?
"What data structure do I then use when I want the keys to be ordered in the same way as the initializer list? Should I internally keep a vector of strings to somehow save the argument order?"
Maybe all you want is actually a list/vector of (key, value) pairs?
If you want both O(1) lookup (hashmap) and iteration in the same order as insertion - then yes, using a vector together with an unordered_map sounds like a good idea. For example, Django's SortedDict (Python) does exactly that, here's the source for inspiration:
https://github.com/django/django/blob/master/django/utils/datastructures.py#L122
Python 2.7's OrderedDict is a bit more fancy (map values point to doubly-linked list links), see:
http://code.activestate.com/recipes/576693-ordered-dictionary-for-py24/
I'm not aware of an existing C++ implementation in standard libs, but this might get you somewhere. See also:
a C++ hash map that preserves the order of insertion
A std::map that keep track of the order of insertion?
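As a rough illustration of the vector-plus-unordered_map suggestion, here is a variant of the question's Data class. The name OrderedData and its members are hypothetical, repeated names are simply skipped, and erasing is left out because removing from the middle of the vector would shift the stored positions.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical variant of the question's Data class that remembers insertion order.
class OrderedData {
    std::vector<std::pair<std::string, double>> items_;   // insertion order
    std::unordered_map<std::string, std::size_t> index_;  // name -> position in items_

public:
    OrderedData(std::initializer_list<std::string> names) {
        double counter = 0;
        for (const auto& name : names) {
            // Skip repeated names so that each key appears only once.
            if (index_.emplace(name, items_.size()).second) {
                items_.emplace_back(name, counter++);
            }
        }
    }

    // Returns a pointer to the value stored for name, or nullptr if it is absent.
    const double* find(const std::string& name) const {
        auto it = index_.find(name);
        return it == index_.end() ? nullptr : &items_[it->second].second;
    }

    // Returns the names in the order they were given.
    std::vector<std::string> names() const {
        std::vector<std::string> out;
        for (const auto& item : items_) out.push_back(item.first);
        return out;
    }
};
```

Constructed from {"Why", "am", "I", "sorted"}, names() returns the words in that same order, and find() still goes through the hash map.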
unordered_map is, by definition, unordered, so you should not expect any particular ordering when iterating over the map.
If you don't want the elements sorted by key, just use a container that keeps your insertion order, be it a vector, deque, list or whatever, with pair<key, value> as the element type if you insist on using pairs.
Then, if an element B is added after element A, it will always appear later. This holds true for initializer_list initialization as well.
You could probably use something like Boost.MultiIndex to keep it both sorted by insertion order and arbitrary key.

std::map override element under certain circumstances at insertion time

Let's say I have a map whose key is a pair and whose custom comparator guarantees uniqueness with respect to the first element of that pair.
class comparator
{
public:
    bool operator()(const std::pair<std::string, int>& left,
                    const std::pair<std::string, int>& right) const
    {
        return left.first < right.first;
    }
};
std::map<std::pair<std::string, int>, foo, comparator> myMap;
Now I'd like this map to be more intelligent than that, if possible.
Instead of the insertion being rejected when a key with the same string as its first element already exists, I'd like to overwrite the existing element if the integer (.second) of the pair being inserted is bigger.
Of course I can do this by looking up the key in the map, inspecting it and overwriting it if necessary.
Alternatively I could adopt a post-insertion approach: use a multimap and then iterate over it to clean up duplicates, keeping just the key with the biggest integer.
The question is: can I do this natively, either by overriding part of the STL implementation (operator[], the insert method) or by improving my custom comparator and then simply relying on map's insert method?
I don't know if this is allowed, but one could imagine a non-const comparator that is able to update the already stored (key, value) pair under certain circumstances.
The answer to your question is that you cannot do it.
There are two problems with your proposed implementation:
The keys must remain const as they are the index for the map
Independent of what the comparator did to the elements it is comparing the std::map would still insert the item before or after left based on the return of the comparator
The solution to the problem is as suggested by @MvG. Your key should not be paired, it is your value that should be paired.
This has the added benefit that you don't need a custom comparator.
The problem is that you will need a custom inserter:
std::pair< int, foo >& tempValue = _myMap[ keyToInsert ];
if( valueToInsert.first >= tempValue.first )
{
tempValue = valueToInsert;
}
Note that this only works if every valueToInsert.first you use is non-negative, because operator[] value-initializes a missing entry and a value-initialized int is 0. If valueToInsert.first were negative, the default-constructed pair would stay in the map instead of your element.
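One way to avoid the restriction on the sign might be to go through map::insert, which reports whether the key was already there, so no default-constructed entry is ever compared against. A sketch, assuming a std::string key and a hypothetical helper called insert_if_greater; it replaces only on a strictly greater integer, which is how I read "bigger" in the question:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: stores (priority, payload) under key and replaces an existing
// entry only when the new priority is strictly greater. Returns true if it was stored.
template <typename Payload>
bool insert_if_greater(std::map<std::string, std::pair<int, Payload>>& m,
                       const std::string& key, int priority, const Payload& payload) {
    auto result = m.insert(std::make_pair(key, std::make_pair(priority, payload)));
    if (result.second) return true;  // the key was new, nothing to compare against
    auto& stored = result.first->second;
    if (priority > stored.first) {
        stored = std::make_pair(priority, payload);
        return true;
    }
    return false;
}
```

Because the comparison only happens when the key already exists, negative priorities behave the same as positive ones.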

hybrid linked list constructed on unordered_map?

Hi, I wonder if I can set up a linked struct of my own to impose my own order between the keys of an unordered_map, or is there a standard library facility for this? I need the fast lookup of unordered_map...
For example:
#include <string>
#include <unordered_map>
struct linker
{
    const std::string *pt; // keys stored in an unordered_map are const
    const std::string *child1;
    const std::string *child2;
};
std::unordered_map<std::string,int> map({{"aaa",1},{"bbb",2},{"ccc",3},{"ddd",4}});
linker node1; // a plain object; `new linker` would yield a pointer instead
node1.pt = &map.find("aaa")->first;
node1.child1 = &map.find("ccc")->first;
node1.child2 = &map.find("ddd")->first;
One way to optimize hash lookup is to find a hash function that produces the smallest number of hash collisions on the keys you are going to use.
With std::unordered_map you can also get local iterators to buckets and inspect the elements in each bucket, if you are so inclined, although the interface offers no way to reorder them.
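For what it's worth, the bucket interface looks roughly like this. The helper name found_in_own_bucket is made up for the example; bucket_count(), bucket(), begin(n) and end(n) are the standard members. Which bucket a key lands in depends on the implementation and the hash, so the sketch only walks a bucket and imposes no order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Returns true if key is found by walking only the bucket that its hash selects.
bool found_in_own_bucket(const std::unordered_map<std::string, int>& map,
                         const std::string& key) {
    if (map.bucket_count() == 0) return false;  // bucket() needs at least one bucket
    const std::size_t b = map.bucket(key);
    for (auto it = map.begin(b); it != map.end(b); ++it) {  // local iterators for bucket b
        if (it->first == key) return true;
    }
    return false;
}
```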
A far better solution IMHO would be as follows:
struct comparator {
    bool operator()(string const& lhs, string const& rhs) const {
        return ...;//Your definition of order here!!!
    }
};
std::map<string, int, comparator> map{{"aaa",1},{"bbb",2},{"ccc",3},{"ddd",4}};//note the elided parentheses
Now you can simply use the iterator pair begin()/end() of this map, which will traverse it in the order the comparator specifies; see the accepted answer to this question.
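To make the comparator idea concrete, here is one possible ordering, chosen purely for illustration: shorter keys first, ties broken alphabetically. The names shorter_first and keys_in_order are hypothetical. Keep in mind that lookups in a std::map are O(log n), not the average O(1) of unordered_map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical ordering: shorter keys first, ties broken alphabetically.
struct shorter_first {
    bool operator()(const std::string& lhs, const std::string& rhs) const {
        if (lhs.size() != rhs.size()) return lhs.size() < rhs.size();
        return lhs < rhs;
    }
};

// Collects the keys in the map's iteration order.
std::vector<std::string> keys_in_order(const std::map<std::string, int, shorter_first>& m) {
    std::vector<std::string> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}
```

The comparator has to be a strict weak ordering; keys that compare equivalent under it are treated as the same key by the map.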

Searching for suitable data structure in c++

Suggest a suitable data structure (in C++), such that the below mentioned purpose is solved:
insert an element to the end.
read and delete an element from the end.
read and delete an element from beginning.
find out if a particular element exists.
Right now I am using vectors, but finding out whether a particular element exists is expensive (linear time) because my elements are not sorted.
Is there a better data structure than a vector to accomplish this? If yes, which one? Please give an example.
One possibility is to use std::set (a balanced tree) or std::unordered_set (a hash table) for lookup and maintain the order between the elements yourself. This gives you O(log n) or average O(1) lookup, respectively, and insertion/deletion at the beginning/end costs only one update of the lookup structure on top of a constant-time queue operation. In Java this is called LinkedHashSet. Unfortunately the STL doesn't provide this kind of data structure out of the box, but it should be easy to implement on top of set/unordered_set or map/unordered_map.
Here's a piece of code that illustrates the idea:
#include <deque>
#include <set>
template <typename T>
class linked_set {
private:
    // Comparator of values with dereferencing.
    struct value_deref_less {
        bool operator()(const T *lhs, const T *rhs) const {
            return *lhs < *rhs;
        }
    };
    typedef std::set<const T*, value_deref_less> Set;
    Set set_; // Used for quick lookup
    std::deque<T> store_; // Used for ordered storage. deque is used instead of
                          // vector because pushing at either end of a deque
                          // never invalidates pointers to existing elements.
public:
    void push_back(const T& value) {
        store_.push_back(value);
        set_.insert(&store_.back());
        // TODO: handle the case of duplicate elements.
    }
    // TODO: better provide your own iterator.
    typedef typename Set::iterator iterator;
    iterator find(const T& value) { return set_.find(&value); }
    // ...
};
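A simpler variant of the same idea, with the duplicate case handled and deletion at both ends added, could look like the sketch below. The class name LinkedHashSet is borrowed from the Java class mentioned above, and the member names are assumptions. To keep it short, each value is stored twice (once in the deque, once in an unordered_set), so T needs std::hash support.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <unordered_set>

// Hypothetical sketch; stores each value twice to keep the code short.
template <typename T>
class LinkedHashSet {
    std::deque<T> order_;           // elements in insertion order
    std::unordered_set<T> lookup_;  // same elements, for average O(1) membership tests

public:
    // Appends value; returns false and changes nothing if it is already present.
    bool push_back(const T& value) {
        if (!lookup_.insert(value).second) return false;
        order_.push_back(value);
        return true;
    }

    bool contains(const T& value) const { return lookup_.count(value) != 0; }
    bool empty() const { return order_.empty(); }
    std::size_t size() const { return order_.size(); }

    // Removes and returns the oldest element; precondition: !empty().
    T pop_front() {
        T value = order_.front();
        order_.pop_front();
        lookup_.erase(value);
        return value;
    }

    // Removes and returns the newest element; precondition: !empty().
    T pop_back() {
        T value = order_.back();
        order_.pop_back();
        lookup_.erase(value);
        return value;
    }
};
```

All four operations from the question are covered this way: push_back, pop_back and pop_front are constant-time deque operations plus one hash-set update, and contains is a single hash lookup.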
You won't be able to have both fast insertions at the two sides AND fast searches with the same container, at least if you restrict the possibilities to the STL. More exotic non-standard containers may help.
But the approach I generally choose in these cases is to use two containers. For storing the elements, the obvious option is std::deque. For searches, make a std::map<K,V> in which V is a pointer to the element in the deque. Pointers are the safe choice here: inserting at either end of a deque invalidates all of its iterators, but pointers and references to existing elements stay valid, and erasing at either end only affects the erased element. It should be OK if you always remember to synchronize the map and the deque (i.e. when you do an insert or delete on the deque, do that also on the map).
Another simpler/safer option, instead of storing iterators or raw pointers - if after a search in the map you just need the element found (you don't need to visit nearby elements, etc.) - is to have in both the deque and the map smart pointers to the actual objects (more specifically, shared_ptr). Again, you have to be careful to keep both in sync; if they lose sync the result won't be as catastrophic (no dangling references), but the consistency of your program will probably still be compromised.
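#include <cassert>
#include <deque>
#include <iostream>
#include <map>
#include <memory>
#include <string>
struct MyItem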
{
std::string name;
int something;
int another;
MyItem(const std::string &name_, int something_, int another_)
:name(name_), something(something_), another(another_) {}
};
class MyContainer
{
public:
typedef std::shared_ptr<MyItem> MyItemPtr;
void push_front(MyItemPtr item)
{
deque.push_front(item);
assert(map.find(item->name) == map.end());
map[item->name] = item;
}
void push_back(MyItemPtr item)
{
deque.push_back(item);
assert(map.find(item->name) == map.end());
map[item->name] = item;
}
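MyItemPtr pop_front()
{
MyItemPtr item = deque.front(); // precondition: the container is not empty
deque.pop_front();
map.erase(item->name);
return item;
}
MyItemPtr pop_back()
{
MyItemPtr item = deque.back(); // precondition: the container is not empty
deque.pop_back();
map.erase(item->name);
return item;
}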
MyItemPtr find(const std::string &name)
{
std::map<std::string, MyItemPtr>::iterator iter = map.find(name);
if (iter == map.end())
return MyItemPtr();
else
return iter->second;
}
private:
std::deque<MyItemPtr> deque;
std::map<std::string, MyItemPtr> map;
};
To use it:
MyContainer container;
MyContainer::MyItemPtr a(new MyItem("blah", 1, 2));
container.push_back(a);
MyContainer::MyItemPtr b(new MyItem("foo", 5, 6));
container.push_front(b);
MyContainer::MyItemPtr f = container.find("blah");
if (f)
std::cout << f->name << ", " << f->something << ", " << f->another;
You can keep the vector, but also use a std::set for fast queries.
The set is not enough for deleting an element from the beginning/end, as you don't really know which is the first/last element you've inserted. You could keep references to those elements, but then in order to support deletion, you would need the next ones and so on, which degrades back to using one more container.
You should start with a std::map to see if logarithmic complexity is suitable.
A B+Tree would be a bit more complex and would require your own implementation or some research to find an open source implementation. But it is a reasonable choice given the requirements and the pain point you cited (searching), if the std::map still proves inadequate.
You would map an element's value to its iterator in a std::list, for example. All operations would be O(lg n) with std::map.
Use std::deque. This is a double-ended queue, and it is also the default underlying container for standard adaptors such as std::stack and std::queue.
It is usually implemented as a sequence of fixed-size blocks and has amortized O(1) time complexity for insertions and deletions at both ends.
If there are a lot of insertions and deletions, a linked list would be more appropriate.
Beware that a linked list (single or double) has quite an overhead per element (usually one pointer per link, so two per node for a doubly linked list, but implementations vary).
The standard library offers std::list.