Is there a flat unsorted map/set implementation? - c++

There is the boost.container flat_map and others, and the Loki AssocVector and many others like these which keep the elements sorted.
Is there a modern (c++11 move-enabled, etc.) implementation of an unsorted vector adapted as a map/set?
The idea is to use it for very small maps/sets (less than 20 elements) and with simple keys (for which hashing wouldn't always make sense)

Something like this?
template<class Key, class Value, template<class...>class Storage=std::vector>
struct flat_map {
struct kv {
Key k;
Value v;
template<class K, class V>
kv( K&& kin, V&& vin ):k(std::forward<K>(kin)), v(std::forward<V>(vin)){}
};
using storage_t = Storage<kv>;
storage_t storage;
// TODO: adl upgrade
using iterator=decltype(std::begin(std::declval<storage_t&>()));
using const_iterator=decltype(std::begin(std::declval<const storage_t&>()));
// boilerplate:
iterator begin() {
using std::begin;
return begin(storage);
}
const_iterator begin() const {
using std::begin;
return begin(storage);
}
const_iterator cbegin() const {
using std::begin;
return begin(storage);
}
iterator end() {
using std::end;
return end(storage);
}
const_iterator end() const {
using std::end;
return end(storage);
}
const_iterator cend() const {
using std::end;
return end(storage);
}
size_t size() const {
return storage.size();
}
bool empty() const {
return storage.empty();
}
// these only have to be valid if called:
void reserve(size_t n) {
storage.reserve(n);
}
size_t capacity() const {
return storage.capacity();
}
// map-like interface:
// TODO: SFINAE check for type of key
template<class K>
Value& operator[](K&& k){
auto it = find(k);
if (it != end()) return it->v;
storage.emplace_back( std::forward<K>(k), Value{} );
return storage.back().v;
}
private: // C++14, but you can just inject the lambda at point of use in 11:
template<class K>
auto key_match( K& k ) {
return [&k](kv const& kv){
return kv.k == k;
};
}
public:
template<class K>
iterator find(K&& k) {
return std::find_if( begin(), end(), key_match(k) );
}
template<class K>
const_iterator find(K&& k) const {
return const_cast<flat_map*>(this)->find(k);
}
// iterator-less query functions:
template<class K>
Value* get(K&& k) {
auto it = find(std::forward<K>(k));
if (it==end()) return nullptr;
return std::addressof(it->v);
}
template<class K>
Value const* get(K&& k) const {
return const_cast<flat_map*>(this)->get(std::forward<K>(k));
}
// key-based erase: (SFINAE should be is_comparible, but that doesn't exist)
template<class K, class=std::enable_if_t<std::is_convertible<K, Key>{}>>
bool erase(K&& k) {
auto it = std::remove_if(
storage.begin(), storage.end(), key_match(k)
);
if (it == storage.end()) return false;
storage.erase( it, storage.end() );
return true;
}
// classic erase, for iterating:
iterator erase(const_iterator it) {
return storage.erase(it);
}
template<class K2, class V2,
class=std::enable_if_t<
std::is_convertible< K2, Key >{}&&
std::is_convertible< V2, Value >{}
>
>
void set( K2&& kin, V2&& vin ) {
auto it = find(kin);
if (it != end()){
it->v = std::forward<V2>(vin);
return;
} else {
storage.emplace_back( std::forward<K2>(kin), std::forward<V2>(vin) );
}
}
};
I left the container type as a template argument, so you can use a SBO vector-like structure if you choose.
In theory, I should expose a template parameter for replacing equals on the keys. I did, however, make the key-search functions transparent.

If the sets are sure to be small then you can just use a std::vector (or std::deque) and do lookup using linear searches. An O(n) linear search over a small vector can be faster than an O(log(n)) search over a more complicated structure such as a red-black tree.
So you could just put elements in a vector without sorting them. You would still need to do some shuffling if you remove elements, but that will always be true for a flat vector-like structure whether it's sorted or not, unless you only ever remove the back element. Why is it important that it's flat anyway?
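To illustrate the point about small unsorted vectors, here is a minimal sketch. The names flat_contains, flat_insert and flat_erase are made up for this example. It assumes the order of the elements does not matter, which lets erase overwrite the removed element with the last one instead of shifting the whole tail.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helpers: an unsorted "flat set" on top of std::vector.

// Linear membership test: O(n), but cache-friendly for a handful of elements.
template <class T>
bool flat_contains(const std::vector<T>& v, const T& x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}

// Insert only if absent; returns true when the element was added.
template <class T>
bool flat_insert(std::vector<T>& v, T x) {
    if (flat_contains(v, x)) return false;
    v.push_back(std::move(x));
    return true;
}

// Erase without shifting: move the last element into the hole, then pop.
// This does not preserve the relative order of the remaining elements.
template <class T>
bool flat_erase(std::vector<T>& v, const T& x) {
    auto it = std::find(v.begin(), v.end(), x);
    if (it == v.end()) return false;
    assert(!v.empty());
    if (it != v.end() - 1) *it = std::move(v.back());  // skip self-move
    v.pop_back();
    return true;
}
```

With this approach the only cost of removal, beyond the linear search, is a single move, so the "shuffling" mentioned above only applies when the order has to be kept.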

There's std::unordered_set and std::unordered_map but as far as I know they are not implemented using vectors.
A possible option is to write your own hash vector: hash the key using std::hash<Key> and index the vector with the resulting number modulo the length of the vector. You would then have to handle collisions and all the resulting problems manually, so I'm not sure I would recommend that.
An alternative would be to pass a custom allocator to std::unordered_set and std::unordered_map that performs the allocation on a vector (for example by owning an internal vector), as suggested by @BeyelerStudios.
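As a rough illustration of the hash-vector idea, the sketch below resolves collisions with linear probing in a single vector. The class probing_set and its members are hypothetical names, not part of any library. It assumes Key is default-constructible, copyable and comparable with ==, and it is insert-only: erase is left out because it would need tombstones or backward shifting.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: insert-only open-addressing set with linear probing.
template <class Key, class Hash = std::hash<Key>>
class probing_set {
    struct slot { Key key{}; bool used = false; };
    std::vector<slot> slots_;
    std::size_t size_ = 0;
    Hash hash_;

    // Walk from the home slot until the key or an empty slot is found.
    // Terminates because the load factor is kept at or below 1/2.
    std::size_t probe(const std::vector<slot>& s, const Key& k) const {
        assert(!s.empty());
        std::size_t i = hash_(k) % s.size();
        while (s[i].used && !(s[i].key == k)) i = (i + 1) % s.size();
        return i;
    }

    // Double the table and re-insert every stored key.
    void grow() {
        std::vector<slot> bigger(slots_.size() * 2);
        for (const slot& old : slots_) {
            if (!old.used) continue;
            slot& dst = bigger[probe(bigger, old.key)];
            dst.key = old.key;
            dst.used = true;
        }
        slots_.swap(bigger);
    }

public:
    probing_set() : slots_(8) {}

    // Returns true if the key was added, false if it was already present.
    bool insert(const Key& k) {
        if ((size_ + 1) * 2 > slots_.size()) grow();
        slot& s = slots_[probe(slots_, k)];
        if (s.used) return false;
        s.key = k;
        s.used = true;
        ++size_;
        return true;
    }

    bool contains(const Key& k) const {
        return slots_[probe(slots_, k)].used;
    }

    std::size_t size() const { return size_; }
};
```

Even this small version has to deal with growth and rehashing, which is the kind of manual work the paragraph above warns about.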

Evgeny Panasyuk is correct, I believe what you want is an Open Address Hash Map.
This fits exactly your requirement, only 1 flat buffer, no allocation of nodes, no pointers to follow, and unsorted.
Otherwise you also have flat_map/AssocVector, but they are sorted, contrary to your requirement.
For OAHM, I have an implementation of a STL-like generic one here:
https://sourceforge.net/projects/cgenericopenaddresshashmap/
Also you might want to take a look at the benchmark page of flat_map:
boost::flat_map and its performance compared to map and unordered_map
The OAHM is performing very close to the flat_map in all tests, except iteration.

Please look at the sfl library that I have recently uploaded to GitHub: https://github.com/slavenf/sfl-library
It is a C++11 header-only library that offers flat ordered and unordered containers that store elements contiguously in memory. All containers meet the requirements of Container, AllocatorAwareContainer and ContiguousContainer. The library is licensed under the zlib license.

Related

Which C++20 standard library containers have a .at member access function?

Which C++20 standard library containers have a .at function? For example, at least std::map, std::unordered_map and std::vector do. What others are there?
Is there some way to work this out without going through the 2000 page standard by hand?
From comments, it seems you want something like:
template <typename Container, typename T>
typename Container::pointer my_at(Container&, const T&)
requires (requires(Container c, const T& key) { c.at(key); })
{
// ...
}
I don't think a pointer is the appropriate type to return for this operation, instead it should be an iterator. At which point the definition splits into two cases:
Random access containers:
iterator at_(size_type index) noexcept { return index < size() ? begin() + index : end(); }
const_iterator at_(size_type index) const noexcept { return index < size() ? begin() + index : end(); }
map and unordered_map:
template<typename T>
iterator at_(T && key) noexcept { return find(std::forward<T>(key)); }
template<typename T>
const_iterator at_(T && key) const noexcept { return find(std::forward<T>(key)); }
And then rather than testing against nullptr, you test against the container's end().
It's also a reasonable question whether (unordered_)map needs an alias for an existing member just to match your nomenclature.
I ended up grepping libstdc++ include directory for \bat( and that has given me:
std::basic_string
std::basic_string_view
std::array
std::vector
std::vector<bool>
std::unordered_map
std::map
std::deque
I'm not sure if this is exhaustive.
There is also rope and vstring but I don't think these are standard.
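If the goal is to check a particular container rather than to list them all, the compiler can answer the question. The following is a sketch using a SFINAE detection trait; has_at is a hypothetical name, not a standard facility. It is a different technique from the requires clause shown earlier and compiles as C++11 or later.

```cpp
#include <array>
#include <cstddef>
#include <deque>
#include <list>
#include <map>
#include <set>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: has_at<C, Arg>::value is true when c.at(arg) compiles
// for an lvalue c of type C and an argument of type Arg.
template <class C, class Arg, class = void>
struct has_at : std::false_type {};

template <class C, class Arg>
struct has_at<C, Arg,
              decltype(void(std::declval<C&>().at(std::declval<Arg>())))>
    : std::true_type {};

// Spot checks, evaluated entirely at compile time.
static_assert(has_at<std::vector<int>, std::size_t>::value, "vector has at()");
static_assert(has_at<std::deque<int>, std::size_t>::value, "deque has at()");
static_assert(has_at<std::array<int, 3>, std::size_t>::value, "array has at()");
static_assert(has_at<std::string, std::size_t>::value, "string has at()");
static_assert(has_at<std::map<int, int>, int>::value, "map has at()");
static_assert(!has_at<std::list<int>, std::size_t>::value, "list has no at()");
static_assert(!has_at<std::set<int>, int>::value, "set has no at()");
```

This does not enumerate the containers for you, but it turns each guess into a check that fails at compile time if it is wrong.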

Using range-based for loop with a third-party container

I'm currently using a third-party library which contains a class that only provides indexed lookup, i.e. operator[].
I'd like to do a range-based for loop on this class's contents. However, having never written an iterator or iterator adapter, I'm quite lost. It seems that writing iterators is not a straightforward task.
My desired usage is:
for(auto element : container)
{
...
}
Instead of having to write:
for(int i = 0; i < container.size(); ++i)
{
auto element = container[i];
...
}
How can this be achieved? Does Boost provide this functionality?
Writing iterators is actually a rather straightforward task, but it gets extremely tedious. Since your container supports indexing by an integer, I assume its iterators would fall into the random access iterator category (if they existed). That needs a lot of boilerplate!
However, to support the range-based for loop, all you'll need is a forward iterator. We'll write a simple wrapper for the container that implements the forward iterator requirements, and then write two functions Iterator begin(Container&) and Iterator end(Container&) that enable the container to be used in the range-based for loop.
This Iterator will contain a reference to the container, the size of the container, and the current index within that container:
template<template<typename> class C, typename T>
class indexer : public std::iterator<std::forward_iterator_tag, T>
{
public:
indexer(C<T>& c, std::size_t i, std::size_t end)
: c_(std::addressof(c)), i_(i), end_(end) {}
T& operator*() const {
return (*c_)[i_];
}
indexer<C, T>& operator++() {
++i_;
return *this;
}
indexer<C, T> operator++(int) {
auto tmp = *this;
operator++();
return tmp;
}
bool operator==(indexer<C, T> const& other) const {
return i_ == other.i_;
}
bool operator!=(indexer<C, T> const& other) const {
return !(*this == other);
}
private:
C<T>* c_;
std::size_t i_, end_;
};
Inheriting from std::iterator conveniently declares the appropriate typedefs for use with std::iterator_traits.
Then, you would define begin and end as follows:
template<typename T>
indexer<Container, T> begin(Container<T>& c) {
return indexer<Container, T>(c, 0, c.size());
}
template<typename T>
indexer<Container, T> end(Container<T>& c) {
auto size = c.size();
return indexer<Container, T>(c, size, size);
}
Switch out Container for whatever the type of container is in your example, and with that, your desired usage works!
The requirements and behavior of all the various kinds of iterators can be found in the tables of section 24.2.2 of the standard, which are mirrored at cppreference.com here.
A random-access iterator implementation of the above, along with a demo of usage with a simple vector_view class can be found live on Coliru or ideone.com.
You can do the following:
1) Define your own iterator It that holds a reference ref to your container and an index i. When the iterator is dereferenced, it returns ref[i] by reference. You can write it yourself or use Boost, which has an iterator library that makes it easy to define your own iterators. The constructor should accept a container& and a size_t. You can also provide a const_iterator if that concept applies.
2) When operator++ is invoked on one iterator, it simply does ++i on the internal member. operator== and operator!= should simply compare i. You can assert, for security, that both iterators are coherent, that means their refs point to the same object.
3) add begin and end to your container class or, if this is not possible, define a global begin and end that accept your container& c. begin should simply return It(c, 0). end should return It(c, c.size()).
There could be a problem copying the iterators as they contain a reference and some other minor details, but I hope the overall idea is clear and correct.
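The three steps above can be sketched as follows. The class legacy_array is a made-up stand-in for the third-party class, which the question does not show, and index_iterator is likewise a hypothetical name. The iterator stores a pointer instead of a reference so that it stays copyable and assignable, and it implements only the operations the range-based for loop actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the third-party class: only size() and operator[] exist.
class legacy_array {
    std::vector<int> data_;
public:
    explicit legacy_array(std::vector<int> d) : data_(std::move(d)) {}
    std::size_t size() const { return data_.size(); }
    int& operator[](std::size_t i) { return data_[i]; }
};

// Minimal iterator: a pointer to the container plus an index.
// It provides only what the range-based for loop needs: *, ++ and !=.
class index_iterator {
    legacy_array* c_;
    std::size_t i_;
public:
    index_iterator(legacy_array& c, std::size_t i) : c_(&c), i_(i) {}
    int& operator*() const { return (*c_)[i_]; }
    index_iterator& operator++() { ++i_; return *this; }
    bool operator!=(const index_iterator& o) const {
        assert(c_ == o.c_);  // both iterators must refer to the same container
        return i_ != o.i_;
    }
};

// Free functions, found by argument-dependent lookup in the range-based for.
index_iterator begin(legacy_array& c) { return index_iterator(c, 0); }
index_iterator end(legacy_array& c) { return index_iterator(c, c.size()); }

// Usage: sums the elements and doubles each one in place.
int sum_and_double(legacy_array& a) {
    int sum = 0;
    for (int& x : a) { sum += x; x *= 2; }
    return sum;
}
```

This is not a full forward iterator (it has no typedefs, no operator== and no post-increment), so it is enough for the loop but not for the standard algorithms.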

accessing map with C++98 standard

I have the following C++11 compatible code and I need to compile it with C++98 which doesn't have support for '.at'. How to rewrite it to be compatible with C++98 ?
String suffix("sss");
headers_t& meta = ...;
typedef std::map<std::string, std::string> headerpair_t;
typedef std::map<std::string, headerpair_t> addheader_t;
addheader_t addheader;
for(headerpair_t::const_iterator piter = addheader.at(suffix).begin(); piter != addheader.at(suffix).end(); ++piter){
// Adding header
meta[(*piter).first] = (*piter).second;
}
Just create an at() function which mimics what C++11 std::map<...>::at() does:
template <typename K, typename V, typename C, typename A>
V const& at(std::map<K, V, C, A> const& m, K const& k) {
typename std::map<K, V, C, A>::const_iterator it(m.find(k));
if (it == m.end()) {
throw std::out_of_range("key not found in map");
}
return it->second;
}
Note that calling at() in each iteration of a loop is a Bad Idea! Searching a std::map<...> is efficient in the theoretical sense, but that doesn't mean that it is fast in practice! You are much better off searching for the relevant node just once and then reusing it.
You shouldn't use at() in a for loop condition like that. The element does not change between iterations and there is an overhead in retrieving it at every turn. So you should just retrieve it once using find and then loop on the iterators:
addheader_t::const_iterator header_iter = addheader.find(suffix); // Retrieve the element
if (header_iter != addheader.end()) // Check that it does exist
{
// Retrieve the sub-map in the pair
const headerpair_t& header_pair_map = header_iter->second;
// Loop on the elements
for (headerpair_t::const_iterator it = header_pair_map.begin(); it != header_pair_map.end(); ++it)
{
// Use insert to avoid a useless element construction
// Use also `std::make_pair`, but can we directly insert the pair from headerpair ?
meta.insert(std::make_pair(it->first, it->second));
}
}

Insert order std::map

I would like to create an associative array (like std::map) that stores the elements in order of insertion. I wrote this class:
template <typename K, typename V>
class My_Map : public std::unordered_map<K,V>
{
public:
V& operator[]( const K&& key )
{
typename std::unordered_map<K,V>::iterator __i = find(key);
if (__i == std::unordered_map<K,V>::end()) // if it was not found...
{
__i = insert(__i, std::make_pair(std::move(key), V()) );
mHistory.push_back(__i);std::cout<<"Inserting: "<<key<<std::endl;
}
return (*__i).second;
}
typename std::unordered_map<K,V>::iterator cbegin() const
{
return *mHistory.cbegin();
}
typename std::unordered_map<K,V>::iterator cend() const
{
return *mHistory.cend();
}
private:
std::list<typename std::unordered_map<K,V>::iterator> mHistory;
};
using namespace std;
int main()
{
My_Map<string,int> myMap;
myMap["1"] = 1;
myMap["23"] = 23;
myMap["-3"] = 3;
myMap["23"] = 28;
myMap["last element"] = 33;
for (auto x = myMap.cbegin(); x != myMap.cend(); ++x)//{std::cout<<"I'm inside\n";}
cout<<(*x).first <<"\t"<<x->second<<endl;
}
I used unordered_map instead of std::map because std::map mixes up the iterators when I insert a new element.
This code has a problem: the for loop in main() fails with a segmentation fault. The iterators returned by cbegin() and cend() are not valid... why? What's wrong?
The first thing is that you can't dereference an end iterator of a list. Secondly, I also doubt if yourMap.cend will be necessarily reachable from yourMap.cbegin.
It looks like you might need an adaptor for the list iterator that automatically dereferences the stored map iterator pointer to map item.
In any case, you need to iterate over the list, not from a random point in the unordered_map to another random point therein.
Also: adding elements can cause rehashing, which will invalidate iterators (but not pointers or references to elements). You should not even store iterators into an unordered_map.
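One way to follow this advice is to avoid stored iterators altogether. The sketch below is only an illustration under a few assumptions: the class name insertion_ordered_map is made up, erase is not supported, and references returned by operator[] are only valid until the next insertion. The elements live in a vector in insertion order, and an unordered_map maps each key to its position in that vector, so neither rehashing nor vector reallocation invalidates the bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: vector for the order, unordered_map for the lookup.
template <class K, class V>
class insertion_ordered_map {
    std::vector<std::pair<K, V>> items_;         // elements in insertion order
    std::unordered_map<K, std::size_t> index_;   // key -> position in items_
public:
    typedef typename std::vector<std::pair<K, V>>::const_iterator const_iterator;

    V& operator[](const K& key) {
        auto it = index_.find(key);
        if (it == index_.end()) {
            // New key: remember its position, then append the element.
            it = index_.emplace(key, items_.size()).first;
            items_.emplace_back(key, V());
        }
        assert(it->second < items_.size());
        return items_[it->second].second;
    }

    // Iteration walks the vector, so elements come out in insertion order.
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }
    std::size_t size() const { return items_.size(); }
};

// Usage mirroring the question's main(): collects the keys in iteration order.
std::vector<std::string> keys_in_order() {
    insertion_ordered_map<std::string, int> m;
    m["1"] = 1;
    m["23"] = 23;
    m["-3"] = 3;
    m["23"] = 28;  // existing key: value is updated, position is kept
    m["last element"] = 33;
    std::vector<std::string> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}
```

Each key is stored twice here, which is the price for not keeping iterators or pointers into either container.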

What is a C++ container with a "contains" operation?

I want to use a structure in which I insert integers, and then can ask
if (container.contains(3)) { /**/ }
There has to be something like this.
You can use std::vector.
std::vector<int> myVec;
myVec.push_back(3);
if (std::find(myVec.begin(), myVec.end(), 3) != myVec.end())
{
// do your stuff
}
You can even make a little helper function:
template <class T>
bool contains(const std::vector<T> &vec, const T &value)
{
return std::find(vec.begin(), vec.end(), value) != vec.end();
}
Here is how you would use it:
if (contains(myVec, 3)) { /*...*/ }
Simple algorithm:
template <typename Container>
bool contains(Container const& c, typename Container::const_reference v) {
return std::find(c.begin(), c.end(), v) != c.end();
}
You can customize it for more efficient search on some known containers:
template <typename Key, typename Cmp, typename Alloc>
bool contains(std::set<Key,Cmp,Alloc> const& s, Key const& k) {
return s.find(k) != s.end();
}
template <typename Key, typename Value, typename Cmp, typename Alloc>
bool contains(std::map<Key,Value,Cmp,Alloc> const& m, Key const& k) {
return m.find(k) != m.end();
}
And this way you obtain a single algorithm that performs the search on any container type, and is special cased to be faster on those containers which are ordered.
find on an unsorted vector is O(n).
std::set supports O(log n) insertions and lookups and is a good choice.
std::tr1::unordered_set provides a similar interface but supports near-constant-time lookups. It is the best choice if you have TR1 (or C++0x) and do not need to enumerate the elements in order.
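For comparison, here is a short sketch of a membership test on both containers. It uses std::unordered_set from C++11 rather than the TR1 version, and contains_key is a hypothetical helper name. Since C++20 the associative containers also have a contains member function, which makes the helper unnecessary there.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>

// Hypothetical helper: uses the container's own find(), so the lookup is
// O(log n) for std::set and average O(1) for std::unordered_set.
template <class AssocContainer, class Key>
bool contains_key(const AssocContainer& c, const Key& k) {
    return c.find(k) != c.end();
}

// Usage: the same call works for ordered and hashed sets.
bool demo_membership() {
    std::set<int> ordered;            // sorted iteration, O(log n) lookup
    std::unordered_set<int> hashed;   // no ordering, average O(1) lookup
    ordered.insert(3);
    hashed.insert(3);
    assert(contains_key(ordered, 3) && contains_key(hashed, 3));
    return !contains_key(ordered, 4) && !contains_key(hashed, 4);
}
```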
What you want is the find_first_of method from the algorithms library. (or binary search, or anything along those lines)
http://www.cplusplus.com/reference/algorithm/find_first_of/
If you want to use a C++ standard container, due to its design, the containers themselves do not necessarily have "contains", but you can always use the find algorithm.
You should choose your container according to the characteristics of your dataset and the access "workload".
For a good reference of the containers and algorithms on the C++ standard library check http://www.cplusplus.com
Containers, Algorithms
If as it seems, your data is made of unique items, for which you want to associate a value, you probably will be well served by the map container. If all you care about is "membership", then set is a better choice.