I'm trying to use both a list and an unordered_map to store the same set of objects. I'm new to C++, so still getting comfortable with iterators.
Say I have the following test code:
class Test {
public:
int x;
int y;
int z;
Test (int, int, int);
};
Test t1 = Test(1,2,3);
Test t2 = Test(2,4,6);
Test t3 = Test(3,6,9);
std::list<Test> list;
std::unordered_map<int, Test> map;
list.push_back(t3);
list.push_back(t2);
list.push_back(t1);
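// emplace is used because operator[] would require Test to be default-constructible
map.emplace(101, t1);
map.emplace(102, t2);
map.emplace(103, t3);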
Is it possible to look up an object by key, and then generate a list iterator from the reference of the object (or from the unordered_map generator?)
So if I have the key 102, I could look up t2 in constant time. I then want to iterate forward/backward/insert/delete relative to t2's position in the list.
I can use find to get an unordered_map iterator pointing to t2. I don't know how to generate a list iterator that starts at t2 (I can only generate iterators at the beginning or the end of the list, and iterate through).
Would appreciate anyone pointing me to good tutorials on the STL and iterators.
Thanks!
Afterthought:
Is this an acceptable approach? I have many objects and need to efficiently look them up by integer key. I also need to preserve their order (unrelated to these integer keys) and insert/delete/traverse efficiently.
If what you want to do is this:
Is it possible to look up an object by key, and then generate a list iterator from the reference of the object (or from the unordered_map generator?)
Then you can take advantage of the fact that list iterators aren't invalidated by insertion or erasure (unless you erase the element that particular iterator refers to) and reorganize your structures like this:
std::list<Test> list;
std::unordered_map<int, std::list<Test>::iterator> map;
map.insert(std::make_pair(101,
list.insert(list.end(), t1)));
map.insert(std::make_pair(102,
list.insert(list.end(), t2)));
map.insert(std::make_pair(103,
list.insert(list.end(), t3)));
That way your map lookup gives you exactly what you want: a list iterator.
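To make the pattern concrete, here is a minimal sketch that wraps both containers so they stay in sync. The KeyedList class and its member names are invented for this illustration (they are not from the answer above), Test is simplified to a plain aggregate, and keys are assumed to be unique.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <unordered_map>

struct Test { int x, y, z; };

// Hypothetical wrapper: a list keeps the order, a map from key to list iterator gives fast lookup.
class KeyedList {
public:
    using iterator = std::list<Test>::iterator;

    // Appends value at the end of the list; returns false if the key is already in use.
    bool push_back(int key, const Test& value) {
        if (index_.count(key)) return false;
        index_.emplace(key, items_.insert(items_.end(), value));
        assert(items_.size() == index_.size());
        return true;
    }

    // Inserts value immediately before the element stored under anchor_key.
    bool insert_before(int anchor_key, int key, const Test& value) {
        auto anchor = index_.find(anchor_key);
        if (anchor == index_.end() || index_.count(key)) return false;
        index_.emplace(key, items_.insert(anchor->second, value));
        assert(items_.size() == index_.size());
        return true;
    }

    // Removes the element from both containers; other stored iterators stay valid.
    bool erase(int key) {
        auto found = index_.find(key);
        if (found == index_.end()) return false;
        items_.erase(found->second);
        index_.erase(found);
        return true;
    }

    // Average constant-time lookup; returns end() when the key is missing.
    iterator find(int key) {
        auto found = index_.find(key);
        return found == index_.end() ? items_.end() : found->second;
    }

    iterator begin() { return items_.begin(); }
    iterator end() { return items_.end(); }

private:
    std::list<Test> items_;
    std::unordered_map<int, iterator> index_;
};
```

After pushing t3, t2, t1 under keys 103, 102, 101, find(102) returns a list iterator at t2, and std::prev / std::next on it reach t3 and t1 without scanning the list from either end.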
While Barry's approach is good, there is another one that is more advanced and more complicated. You can put your data object, the (integer) key, and all bookkeeping bits in a single chunk of memory. This improves data locality and puts less pressure on the memory allocator. An example using boost::intrusive:
#include <boost/intrusive/list.hpp>
#include <boost/intrusive/unordered_set.hpp>
#include <array>
#include <cstddef>
#include <functional>
#include <utility>
using namespace boost::intrusive;
class Foo {
// bookkeeping bits
list_member_hook<> list_hook;
unordered_set_member_hook<> set_hook;
const int key;
// some payload...
public:
// there are even more options to configure the container types
using list_type = list<Foo, member_hook<Foo, list_member_hook<>, &Foo::list_hook>>;
using set_type = unordered_set<Foo, member_hook<Foo, unordered_set_member_hook<>, &Foo::set_hook>>;
Foo(int key): key(key) {};
bool operator ==(const Foo &rhs) const {
return key == rhs.key;
}
friend std::size_t hash_value(const Foo &foo) {
return std::hash<int>()(foo.key);
}
};
class Bar {
Foo::list_type list;
std::array<Foo::set_type::bucket_type, 17> buckets;
Foo::set_type set{Foo::set_type::bucket_traits(buckets.data(), buckets.size())};
public:
template<typename... Args>
Foo &emplace(Args&&... args) {
auto foo = new Foo(std::forward<Args>(args)...);
// no more allocations
list.push_front(*foo);
set.insert(*foo);
return *foo;
}
void pop(const Foo &foo) {
set.erase(foo);
list.erase(list.iterator_to(foo));
// Lifetime management fun...
delete &foo;
}
};
int main() {
Bar bar;
auto &foo = bar.emplace(42);
bar.pop(foo);
}
Measure how well both approaches perform on your data. My idea may give you nothing but greater code complexity.
I have an unordered_map that uses a string-type as a key:
std::unordered_map<string, value> map;
A std::hash specialization is provided for string, as well as a suitable operator==.
Now I also have a "string view" class, which is a weak pointer into an existing string, avoiding heap allocations:
class string_view {
string *data;
size_t begin, len;
// ...
};
Now I'd like to be able to check if a key exists in the map using a string_view object. Unfortunately, std::unordered_map::find takes a Key argument, not a generic T argument.
(Sure, I can "promote" one to a string, but that causes an allocation I'd like to avoid.)
What I would've liked instead was something like
template<class Key, class Value>
class unordered_map
{
template<class T> iterator find(const T &t);
};
which would require operator==(T, Key) and std::hash<T>() to be suitably defined, and would return an iterator to a matching value.
Is there any workaround?
P0919R2, "Heterogeneous lookup for unordered containers", has been merged into the C++2a working draft (the standard since published as C++20)!
The abstract seems a perfect match w.r.t. my original question :-)
Abstract
This proposal adds heterogeneous lookup support to the unordered associative containers in the C++ Standard Library. As a result, a creation of a temporary key object is not needed when different (but compatible) type is provided as a key to the member function. This also makes unordered and regular associative container interfaces and functionality more compatible with each other.
With the changes proposed by this paper the following code will work without any additional performance hits:
template<typename Key, typename Value>
using h_str_umap = std::unordered_map<Key, Value, string_hash>;
h_str_umap<std::string, int> map = /* ... */;
map.find("This does not create a temporary std::string object :-)"sv);
As mentioned above, C++14 does not provide heterogeneous lookup for std::unordered_map (unlike std::map). You can use Boost.MultiIndex to define a fairly close substitute for std::unordered_map that allows you to look up string_views without allocating temporary std::strings:
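#include <boost/functional/hash.hpp> // boost::hash_range
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <algorithm> // std::equal
#include <cstddef>
#include <string>
using namespace boost::multi_index;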
struct string_view
{
std::string *data;
std::size_t begin,len;
};
template<typename T,typename Q>
struct mutable_pair
{
T first;
mutable Q second;
};
struct string_view_hash
{
std::size_t operator()(const string_view& v)const
{
return boost::hash_range(
v.data->begin()+v.begin,v.data->begin()+v.begin+v.len);
}
std::size_t operator()(const std::string& s)const
{
return boost::hash_range(s.begin(),s.end());
}
};
struct string_view_equal_to
{
// the comparisons return bool, not std::size_t
bool operator()(const std::string& s1,const std::string& s2)const
{
return s1==s2;
}
bool operator()(const std::string& s1,const string_view& v2)const
{
return s1.size()==v2.len&&
std::equal(
s1.begin(),s1.end(),
v2.data->begin()+v2.begin);
}
bool operator()(const string_view& v1,const std::string& s2)const
{
return v1.len==s2.size()&&
std::equal(
v1.data->begin()+v1.begin,v1.data->begin()+v1.begin+v1.len,
s2.begin());
}
};
template<typename Q>
using unordered_string_map=multi_index_container<
mutable_pair<std::string,Q>,
indexed_by<
hashed_unique<
member<
mutable_pair<std::string,Q>,
std::string,
&mutable_pair<std::string,Q>::first
>,
string_view_hash,
string_view_equal_to
>
>
>;
#include <iostream>
int main()
{
unordered_string_map<int> m={{"hello",0},{"boost",1},{"bye",2}};
std::string str="helloboost";
auto it=m.find(string_view{&str,5,5});
std::cout<<it->first<<","<<it->second<<"\n";
}
Output
boost,1
I faced the same problem.
We need two structs:
struct string_equal {
using is_transparent = std::true_type ;
bool operator()(std::string_view l, std::string_view r) const noexcept
{
return l == r;
}
};
struct string_hash {
using is_transparent = std::true_type ;
auto operator()(std::string_view str) const noexcept {
return std::hash<std::string_view>()(str);
}
};
For unordered_map:
template <typename Value>
using string_unordered_map = std::unordered_map<std::string, Value, string_hash, string_equal>;
For unordered_set:
using string_unordered_set = std::unordered_set<std::string, string_hash, string_equal>;
Now lookups with a string_view are possible without creating a temporary std::string (heterogeneous lookup in the unordered containers requires C++20).
It looks like only as recently as C++14 did even the basic map get such a templated find for is_transparent types in the comparison. Most likely the correct implementation for hashed containers was not immediately evident.
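For comparison, here is a small sketch of what that C++14 feature looks like on the ordered std::map. It assumes C++17 so that std::string_view is available; the names TransparentMap and lookup_or are made up for this example. Note that this is an ordered container, so the lookup is logarithmic rather than hashed.

```cpp
#include <cassert>      // for the checks in the usage note
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> (the void specialization) defines is_transparent,
// which enables the templated find/count/lower_bound overloads of std::map.
using TransparentMap = std::map<std::string, int, std::less<>>;

// Looks up key without constructing a temporary std::string;
// returns fallback when the key is missing.
int lookup_or(const TransparentMap& m, std::string_view key, int fallback) {
    auto it = m.find(key);  // heterogeneous find: compares std::string against std::string_view
    return it == m.end() ? fallback : it->second;
}
```

With a map holding "hello", "boost" and "bye", a view over characters 5..9 of "helloboost" can be passed straight to lookup_or, e.g. assert(lookup_or(m, view, -1) == 1), and no std::string is built for the lookup.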
As far as I can see your two options are:
Just do the allocation and profile to see if maybe it's not actually a problem.
Take a look at boost::multi_index (http://www.boost.org/doc/libs/1_60_0/libs/multi_index/doc/index.html) and have both string and string_view indexes into the container.
This solution has drawbacks, which may or may not make it unviable for your context.
You can make a wrapper class:
struct str_wrapper {
const char *start, *end; // both members must be pointers
};
And change your map to use str_wrapper as its key. You'd have to add 2 constructors to str_wrapper, one for std::string and one for your string_view. The major decision is whether to make these constructors perform deep or shallow copies.
For example, if you use std::string only for inserts and string_view only for lookups, you'd make the std::string constructor deep and the string_view one shallow (this can be enforced at compile time if you use a custom wrapper around unordered_map). If you care to avoid memory leaks on the deep copy, you would need additional fields to support proper destruction.
If your usage is more varied (looking up std::strings or inserting by string_view), there will be drawbacks, which again might make the approach too distasteful to be viable. It depends on your intended usage.
Yet another option is to split the lookup and the data management by using multiple containers:
std::unordered_map<string_view, value> map;
std::vector<unique_ptr<const char[]>> mapKeyStore;
Lookups are done using string_views without the need of allocations.
Whenever a new key is inserted we need to add a real string allocation first:
mapKeyStore.push_back(conv(str)); // str can be string_view, char*, string... as long as it converts to unique_ptr<const char[]> or whatever type
map.emplace(mapKeyStore.back().get(), value);
It would be much more intuitive to use std::string in the mapKeyStore. However, using std::string does not guarantee unchanging string memory (e.g. if the vector resizes). With the unique_ptr this is enforced. However, we need some special conversion/allocation routine, called conv in the example. If you have a custom string container which guarantees data consistency under moves (and forces the vector to use moves), then you can use it here.
The drawback
The disadvantage of the above method is that handling deletions is non-trivial and expensive if done naively. If the map is only created once or only grows, this is a non-issue and the above pattern works quite well.
Running example
The below example includes a naive deletion of one key.
#include <vector>
#include <unordered_map>
#include <string>
#include <string_view>
#include <iostream>
#include <memory>
#include <algorithm>
#include <cstring> // memcpy, strcmp
using namespace std;
using PayLoad = int;
unique_ptr<const char[]> conv(string_view str) {
unique_ptr<char[]> p (new char [str.size()+1]);
// a string_view need not be null-terminated, so copy size() bytes and terminate explicitly
memcpy(p.get(), str.data(), str.size());
p[str.size()] = '\0';
return move(p);
}
int main() {
unordered_map<string_view, PayLoad> map;
vector<unique_ptr<const char[]>> mapKeyStore;
// Add multiple values
mapKeyStore.push_back(conv("a"));
map.emplace(mapKeyStore.back().get(), 3);
mapKeyStore.push_back(conv("b"));
map.emplace(mapKeyStore.back().get(), 1);
mapKeyStore.push_back(conv("c"));
map.emplace(mapKeyStore.back().get(), 4);
// Search all keys
cout << map.find("a")->second;
cout << map.find("b")->second;
cout << map.find("c")->second;
// Delete the "a" key
map.erase("a");
mapKeyStore.erase(remove_if(mapKeyStore.begin(), mapKeyStore.end(),
[](const auto& a){ return strcmp(a.get(), "a") == 0; }),
mapKeyStore.end());
// Test if everything is OK.
cout << '\n';
for(auto it : map)
cout << it.first << ": " << it.second << "\n";
return 0;
}
Of course, the two containers can be put into a wrapper which handles the insertion and deletion on its own.
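One possible shape for such a wrapper is sketched below. It is a variation on the idea rather than the exact layout above: instead of a separate vector of keys, each map entry owns the bytes its string_view key points at, so erasing an entry releases the key storage too. ViewKeyedMap, Entry and the member names are hypothetical, and C++17 is assumed.

```cpp
#include <algorithm>
#include <cassert>      // for the checks in the usage note
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper: keys are string_views into storage owned by the mapped entry.
template <class Value>
class ViewKeyedMap {
public:
    // Copies the key bytes into owned storage, then indexes them by a view of that storage.
    bool insert(std::string_view key, Value value) {
        if (map_.find(key) != map_.end()) return false;
        auto storage = std::make_unique<char[]>(key.size());
        std::copy(key.begin(), key.end(), storage.get());
        std::string_view stable(storage.get(), key.size());
        // Moving the unique_ptr does not move the buffer, so `stable` stays valid.
        map_.emplace(stable, Entry{std::move(storage), std::move(value)});
        return true;
    }

    // Allocation-free lookup; returns nullptr when the key is absent.
    const Value* find(std::string_view key) const {
        auto it = map_.find(key);
        return it == map_.end() ? nullptr : &it->second.value;
    }

    // Erasing the node destroys the key view and its backing storage together.
    bool erase(std::string_view key) {
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        map_.erase(it);
        return true;
    }

    std::size_t size() const { return map_.size(); }

private:
    struct Entry {
        std::unique_ptr<char[]> storage;
        Value value;
    };
    std::unordered_map<std::string_view, Entry> map_;
};
```

The trade-off is one extra pointer per entry in exchange for deletions that need no scan over a key store; whether that is worth it depends on how often keys are removed.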
I'll just present one variation I found on GitHub. It involves defining a new map class that wraps the std one,
redefining some key API functions to intercept the argument types we want, and using a static string to copy the key into.
It's not necessarily a good solution, but it's interesting to know it exists for people who deem it good enough.
original:
https://gist.github.com/facontidavide/95f20c28df8ec91729f9d8ab01e7d2df
code gist:
template <typename Value>
class StringMap: public std::unordered_map<std::string, Value>
{
public:
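typename std::unordered_map<std::string, Value>::iterator find(const nonstd::string_view& v )
{
// tmp_ keeps its capacity between calls, so this stops allocating once it has grown enough
tmp_.reserve( v.size() );
tmp_.assign( v.data(), v.size() );
return std::unordered_map<std::string, Value>::find(tmp_);
}
typename std::unordered_map<std::string, Value>::iterator find(const std::string& v )
{
return std::unordered_map<std::string, Value>::find(v);
}
typename std::unordered_map<std::string, Value>::iterator find(const char* v )
{
tmp_.assign(v);
// look up tmp_, not v: passing v would build a temporary std::string
return std::unordered_map<std::string, Value>::find(tmp_);
}
private:
thread_local static std::string tmp_;
};
// the static member also needs an out-of-class definition
template <typename Value>
thread_local std::string StringMap<Value>::tmp_;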
credits:
Davide Faconti
Sorry for answering this very old question, but it still comes up in search engine results...
In this case your unordered_map is using the string type as its key, the find method is looking for a reference to a string which will not generate an allocation. Your string_view class stores a pointer to a string. Therefore your string_view class can dereference the pointer into a ref of the type needed for your map without causing an allocation. The method would look like this...
string &string_view::getRef() const
{
return *_ptr;
}
and to use the string_view with the map it would look like this
auto found=map.find(string_view_inst.getRef());
Note that this only gives the right result when the view covers the whole underlying string, and that it will not work for the C++17 std::string_view class, as that does not internally store a std::string object.
ps.
Your string_view class is probably not great for cpu caches as it stores a pointer to a string allocated somewhere on the heap, and the string itself stores a pointer to the actual data located somewhere else on the heap. Every time you access your string_view it will result in a double dereference.
You could allow your view to be implicitly convertible to a std::string:
class StringView {
// ...
operator std::string() const
{
return data->substr(begin, len);
}
// ...
};
I have a class that stores a std::vector of stuff. In my program, I create a std::unordered_set of std::shared_ptr to objects of this class (see code below). I defined custom functions to compute hashes and equality so that the unordered_set "works" with the objects instead of the pointers. This means: Two different pointers to different objects that have the same content should be treated as equal, let's call it "equivalent".
So far everything worked as expected but now I stumbled across a strange behaviour: I add a pointer to an object to the unordered_set and create a different pointer to a different object with the same content. As said I would expect that my_set.find(different_object) would return a valid iterator to the equivalent pointer stored in the set. But it doesn't.
Here is a minimal working code example.
#include <boost/functional/hash.hpp>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <memory>
#include <unordered_set>
#include <vector>
class Foo {
public:
Foo() {}
bool operator==(Foo const & rhs) const {
return bar == rhs.bar;
}
std::vector<int> bar;
};
struct FooHash {
size_t operator()(std::shared_ptr<Foo> const & foo) const {
size_t seed = 0;
for (size_t i = 0; i < foo->bar.size(); ++i) {
boost::hash_combine(seed, foo->bar[i]);
}
return seed;
}
};
struct FooEq {
bool operator()(std::shared_ptr<Foo> const & rhs,
std::shared_ptr<Foo> const & lhs) const {
return *lhs == *rhs;
}
};
int main() {
std::unordered_set<std::shared_ptr<Foo>, FooHash, FooEq> fooSet;
auto empl = fooSet.emplace(std::make_shared<Foo>());
(*(empl.first))->bar.emplace_back(0);
auto baz = std::make_shared<Foo>();
baz->bar.emplace_back(0);
auto eqFun = fooSet.key_eq();
auto hashFun = fooSet.hash_function();
if (**fooSet.begin() == *baz) {
std::cout << "Objects equal" << std::endl;
}
if (eqFun(*fooSet.begin(), baz)) {
std::cout << "Keys equal" << std::endl;
}
if (hashFun(*fooSet.begin()) == hashFun(baz)) {
std::cout << "Hashes equal" << std::endl;
}
if (fooSet.find(baz) != fooSet.end()) {
std::cout << "Baz in fooSet" << std::endl;
} else {
std::cout << "Baz not in fooSet" << std::endl;
}
return 0;
}
Output
Objects equal
Keys equal
Hashes equal
And here is the problem:
Baz not in fooSet
What am I missing here? Why does the set not find the equivalent object?
Possibly of interest: I played around with this and found that if my class stores a plain int instead of a std::vector, it works. If I stick to the std::vector but change my constructor to
Foo(int i) : bar{i} {}
and initialize my objects with
std::make_shared<Foo>(0);
it also works. If I remove the whole pointer stuff, it breaks, as std::unordered_set::find returns constant iterators and thus modification of objects in the set cannot be done (this way). However, none of these changes is applicable in my real program anyway.
I compile with g++ version 7.3.0 using -std=c++17
You can't modify an element of a set (and expect the set to work). Because you have provided FooHash and FooEq which inspect the referent's value, that makes the referent part of the value from the point of view of the set!
If we change the initialisation of fooSet to set up the element before inserting it, we get the result you want/expect:
std::unordered_set<std::shared_ptr<Foo>, FooHash, FooEq> fooSet;
auto e = std::make_shared<Foo>();
e->bar.emplace_back(0); // modification is _before_
fooSet.insert(e); // insertion
Looking up the object in the set depends on the hash value not changing. If we really need to modify a member after it has been added, we need to remove it, make the changes, then add the modified object - see Yakk's answer.
To avoid running into issues like this, it may be safer to use std::shared_ptr<const Foo> as elements, which will prevent modification of the pointed-at Foo through the set (although you're still responsible for the use of any non-const pointers you may also have).
Any operation that changes the hash or == result of an element while it is stored in an unordered_set violates the rules of unordered_set; the result is undefined behavior.
You changed the result of a hash of an element in an unordered_set, because your elements are shared pointers, but their hash and == is based off of the value pointed to. And your code changes the value pointed to.
Make all std::shared_ptr<Foo> in your code std::shared_ptr<Foo const>.
This includes the equals and hash code and unordered set code.
auto empl = fooSet.emplace(std::make_shared<Foo>());
(*(empl.first))->bar.emplace_back(0);
This code is then ruled out: once the set holds std::shared_ptr<Foo const>, it will fail to compile, which is the safe outcome.
If you want to mutate an element in a fooSet,
template<class C, class It, class F>
void mutate(C& c, It it, F&& f) {
auto e = **it; // *it is the shared_ptr, **it the pointed-to object; this makes a mutable copy
f(e); // do this before erasing, more exception-safe
auto new_elem = std::make_shared<decltype(e)>(std::move(e));
c.erase(it);
c.insert( new_elem ); // could throw, but hard to avoid.
}
now the code reads:
auto empl = fooSet.emplace(std::make_shared<Foo>());
mutate(fooSet, empl.first, [&](auto&& elem) {
elem.bar.emplace_back(0);
});
mutate copies the element out, calls the function on the copy, removes the old pointer from the set, then inserts the modified copy back into the fooSet.
Of course in this case it is dumb; we just put it in and now we take it out mutate it and put it back.
But in a more general case it will be less dumb.
Here you add an object and it's stored using its current hash value.
auto empl = fooSet.emplace(std::make_shared<Foo>());
Here you change the hash value:
(*(empl.first))->bar.emplace_back(0);
The set now has an object stored using the old/wrong hash value. If you need to change anything in an object that affects its hash value, you need to extract the object, change it and re-insert it. If all mutable members of the class are used to calculate the hash value, make it a set of std::shared_ptr<const Foo> instead.
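Since the question compiles with -std=c++17, the extract / change / re-insert step can be written with node handles. The following is a rough sketch under that assumption; Item, ItemHash, ItemEq and modify are stand-in names for this example, and the hash combiner is a hand-written substitute for boost::hash_combine so the snippet needs no Boost.

```cpp
#include <cassert>      // for the checks in the usage note
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

struct Item {
    std::vector<int> values;
};

// Hash and equality look through the pointer at the pointed-to contents.
struct ItemHash {
    std::size_t operator()(const std::shared_ptr<Item>& p) const {
        std::size_t seed = 0;
        for (int v : p->values)
            seed ^= std::hash<int>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};
struct ItemEq {
    bool operator()(const std::shared_ptr<Item>& a, const std::shared_ptr<Item>& b) const {
        return a->values == b->values;
    }
};
using ItemSet = std::unordered_set<std::shared_ptr<Item>, ItemHash, ItemEq>;

// Takes the element out of the set, applies change to it, and re-inserts it,
// so the set hashes it again with its new contents.
// Returns false (and drops the element) if an equivalent one is already present.
template <class F>
bool modify(ItemSet& set, ItemSet::iterator it, F&& change) {
    auto node = set.extract(it);   // the element is no longer in the set
    change(*node.value());         // safe: the set is not relying on its hash right now
    return set.insert(std::move(node)).inserted;
}
```

Used on the example from the question, inserting an empty Item and then calling modify to append 0 leaves the set in a state where find on a second pointer with the same contents succeeds.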
To make future declarations of sets of shared_ptr<const Foo> easier, you may also extend the std namespace with your specializations.
class Foo {
public:
Foo() {}
size_t hash() const {
size_t seed = 0;
for (auto& b : bar) {
boost::hash_combine(seed, b);
}
return seed;
}
bool operator==(Foo const & rhs) const {
return bar == rhs.bar;
}
std::vector<int> bar;
};
namespace std {
template<>
struct hash<Foo> {
size_t operator()(const Foo& foo) const {
return foo.hash();
}
};
template<>
struct hash<std::shared_ptr<const Foo>> {
size_t operator()(const std::shared_ptr<const Foo>& foo) const {
/* A version using std::hash<Foo>:
std::hash<Foo> hasher;
return hasher(*foo);
*/
return foo->hash();
}
};
template<>
struct equal_to<std::shared_ptr<const Foo>> {
bool operator()(std::shared_ptr<const Foo> const & rhs,
std::shared_ptr<const Foo> const & lhs) const {
return *lhs == *rhs;
}
};
}
With that in place, you can simply declare your unordered_set like this:
std::unordered_set<std::shared_ptr<const Foo>> fooSet;
which now is the same as declaring it like this:
std::unordered_set<
std::shared_ptr<const Foo>,
std::hash<std::shared_ptr<const Foo>>,
std::equal_to<std::shared_ptr<const Foo>>
> fooSet;
Why does the hash function in the following code (which returns a constant 0) seem to have no effect?
Since the hash function returns a constant, I was expecting all output values to be 3. However, the map seems to give each std::vector key its own value, regardless of my hash function being constant.
#include <iostream>
#include <map>
#include <unordered_map>
#include <vector>
// Hash returning always zero.
class TVectorHash {
public:
std::size_t operator()(const std::vector<int> &p) const {
return 0;
}
};
int main ()
{
std::unordered_map<std::vector<int> ,int, TVectorHash> table;
std::vector<int> value1({0,1});
std::vector<int> value2({1,0});
std::vector<int> value3({1,1});
table[value1]=1;
table[value2]=2;
table[value3]=3;
std::cout << "\n1=" << table[value1];
std::cout << "\n2=" << table[value2];
std::cout << "\n3=" << table[value3];
return 0;
}
Obtained output:
1=1
2=2
3=3
Expected output:
1=3
2=3
3=3
What am I missing about hash?
You misunderstood the use of the hash function: it's not used to compare elements. Internally, the map organizes the elements into buckets, and the hash function is used to determine the bucket in which an element resides. Comparison of the elements is performed with another template parameter; look at the full declaration of the unordered_map template:
template<
class Key,
class T,
class Hash = std::hash<Key>,
class KeyEqual = std::equal_to<Key>,
class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;
The next template parameter after the hasher is the key comparator. To get the behavior you expect, you would have to do something like this:
class TVectorEquals {
public:
bool operator()(const std::vector<int>& lhs, const std::vector<int>& rhs) const {
return true;
}
};
std::unordered_map<std::vector<int> ,int, TVectorHash, TVectorEquals> table;
Now your map will have a single element and all your results will be 3.
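The difference between the two roles can be checked directly. The sketch below reuses the idea of a constant hash and an always-true comparator; the type aliases and the fill helper are names chosen for this illustration only.

```cpp
#include <cassert>      // for the checks in the usage note
#include <cstddef>
#include <unordered_map>
#include <vector>

using Key = std::vector<int>;

// Every key hashes to 0, so every key lands in the same bucket.
struct ZeroHash {
    std::size_t operator()(const Key&) const { return 0; }
};

// Every pair of keys compares equal, so the map can hold only one key.
struct AlwaysEqual {
    bool operator()(const Key&, const Key&) const { return true; }
};

using CollidingMap = std::unordered_map<Key, int, ZeroHash>;               // default std::equal_to
using SingleKeyMap = std::unordered_map<Key, int, ZeroHash, AlwaysEqual>;

// Performs the same three assignments as the question's code.
template <class Map>
Map fill() {
    Map table;
    table[Key{0, 1}] = 1;
    table[Key{1, 0}] = 2;
    table[Key{1, 1}] = 3;
    return table;
}
```

fill<CollidingMap>() ends up with three entries that share one bucket (lookups degrade to a linear scan but stay correct), while fill<SingleKeyMap>() ends up with a single entry whose value is 3.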
A sane hash table implementation should not lose information, even in the presence of hash collisions. There are several techniques for resolving collisions (usually trading runtime performance for data integrity).
std::unordered_map implements one of them.
See: Hash Collision Resolution
Add a predicate key comparer class.
class TComparer {
public:
bool operator()(const std::vector<int> &a, const std::vector<int> &b) const {
return true; // this means that all keys are considered equal
}
};
Use it like this:
std::unordered_map<std::vector<int> ,int, TVectorHash, TComparer> table;
Then the rest of your code will work as expected.
I have an unordered_map that uses a string-type as a key:
std::unordered_map<string, value> map;
A std::hash specialization is provided for string, as well as a
suitable operator==.
Now I also have a "string view" class, which is a weak pointer into an existing string, avoiding heap allocations:
class string_view {
string *data;
size_t begin, len;
// ...
};
Now I'd like to be able to check if a key exists in the map using a string_view object. Unfortunately, std::unordered_map::find takes a Key argument, not a generic T argument.
(Sure, I can "promote" one to a string, but that causes an allocation I'd like to avoid.)
What I would've liked instead was something like
template<class Key, class Value>
class unordered_map
{
template<class T> iterator find(const T &t);
};
which would require operator==(T, Key) and std::hash<T>() to be suitably defined, and would return an iterator to a matching value.
Is there any workaround?
P0919R2 Heterogeneous lookup for unordered containers has been merged in the C++2a's working draft!
The abstract seems a perfect match w.r.t. my original question :-)
Abstract
This proposal adds heterogeneous lookup support to the unordered associative containers in the C++ Standard Library. As a result, a creation of a temporary key object is not needed when different (but compatible) type is provided as a key to the member function. This also makes unordered and regular associative container interfaces and functionality more compatible with each other.
With the changes proposed by this paper the following code will work without any additional performance hits:
template<typename Key, typename Value>
using h_str_umap = std::unordered_map<Key, Value, string_hash>;
h_str_umap<std::string, int> map = /* ... */;
map.find("This does not create a temporary std::string object :-)"sv);
As mentioned above, C++14 does not provide heterogeneous lookup for std::unordered_map (unlike std::map). You can use Boost.MultiIndex to define a fairly close substitute for std::unordered_map that allows you to look up string_views without allocating temporary std::strings:
Live Coliru Demo
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <string>
using namespace boost::multi_index;
struct string_view
{
std::string *data;
std::size_t begin,len;
};
template<typename T,typename Q>
struct mutable_pair
{
T first;
mutable Q second;
};
struct string_view_hash
{
std::size_t operator()(const string_view& v)const
{
return boost::hash_range(
v.data->begin()+v.begin,v.data->begin()+v.begin+v.len);
}
std::size_t operator()(const std::string& s)const
{
return boost::hash_range(s.begin(),s.end());
}
};
struct string_view_equal_to
{
std::size_t operator()(const std::string& s1,const std::string& s2)const
{
return s1==s2;
}
std::size_t operator()(const std::string& s1,const string_view& v2)const
{
return s1.size()==v2.len&&
std::equal(
s1.begin(),s1.end(),
v2.data->begin()+v2.begin);
}
std::size_t operator()(const string_view& v1,const std::string& s2)const
{
return v1.len==s2.size()&&
std::equal(
v1.data->begin()+v1.begin,v1.data->begin()+v1.begin+v1.len,
s2.begin());
}
};
template<typename Q>
using unordered_string_map=multi_index_container<
mutable_pair<std::string,Q>,
indexed_by<
hashed_unique<
member<
mutable_pair<std::string,Q>,
std::string,
&mutable_pair<std::string,Q>::first
>,
string_view_hash,
string_view_equal_to
>
>
>;
#include <iostream>
int main()
{
unordered_string_map<int> m={{"hello",0},{"boost",1},{"bye",2}};
std::string str="helloboost";
auto it=m.find(string_view{&str,5,5});
std::cout<<it->first<<","<<it->second<<"\n";
}
Output
boost,1
I faced an equal problem.
We need two structs:
struct string_equal {
using is_transparent = std::true_type ;
bool operator()(std::string_view l, std::string_view r) const noexcept
{
return l == r;
}
};
struct string_hash {
using is_transparent = std::true_type ;
auto operator()(std::string_view str) const noexcept {
return std::hash<std::string_view>()(str);
}
};
For unordered_map:
template <typename Value>
using string_unorderd_map = std::unordered_map<std::string, Value, string_hash, string_equal>;
For unordered_set:
using string_unorderd_set = std::unordered_set<std::string, string_hash, string_equal>;
Now using string_view is possible.
It looks like only as recently as C++14 did even the basic map get such a templated find for is_transparent types in the comparison. Most likely the correct implementation for hashed containers was not immediately evident.
As far as I can see your two options are:
Just do the allocation and profile to see if maybe it's not actually a problem.
Take a look at boost::multi_index (http://www.boost.org/doc/libs/1_60_0/libs/multi_index/doc/index.html) and have both string and string_view indexes into the container.
This solution has drawbacks, which may or may not make it unviable for your context.
You can make a wrapper class:
struct str_wrapper {
const char* start, end;
};
And change your map to use str_wrapper as its key. You'd have to add 2 constructors to str_wrapper, one for std::string and one for your string_view. The major decision is whether to make these constructors perform deep or shallow copies.
For example, if you use std::string only for inserts and str_view only for lookups, you'd make the std::string constructor deep and the str_view one shallow (this can be enforced at compile time if you use a custom wrapper around unordered_map). If you care to avoid memory leaks on the deep copy you would need additional fields to support proper destruction.
If your usage is more varied, (looking up std::string's or inserting by str_view), there will be drawbacks, which again, might make the approach too distasteful so as to be unviable. It depends on your intended usage.
Yet another option is to split the lookup and the data management by using multiple containters:
std::unordered_map<string_view, value> map;
std::vector<unique_ptr<const char[]>> mapKeyStore;
Lookups are done using string_views without the need of allocations.
Whenever a new key is inserted we need to add a real string allocation first:
mapKeyStore.push_back(conv(str)); // str can be string_view, char*, string... as long as it converts to unique_ptr<const char[]> or whatever type
map.emplace(mapKeyStore.back().get(), value)
It would be much more intuitive to use std::string in the mapKeyStore. However, using std::string does not guarantee unchanging string memory (e.g. if the vector resizes). With the unique_ptr this is enforced. However, we need some special conversion/allocation routine, called conv in the example. If you have a custom string container which guarantees data consistency under moves (and forces the vector to use moves), then you can use it here.
The drawback
The disadvantage of the above method is that handling deletions is non-trivial and expensive if done naive. If the map is only created once or only growing this is a non-issue and the above pattern works quite well.
Running example
The below example includes a naive deletion of one key.
#include <vector>
#include <unordered_map>
#include <string>
#include <string_view>
#include <iostream>
#include <memory>
#include <algorithm>
using namespace std;
using PayLoad = int;
unique_ptr<const char[]> conv(string_view str) {
unique_ptr<char[]> p (new char [str.size()+1]);
memcpy(p.get(), str.data(), str.size()+1);
return move(p);
}
int main() {
unordered_map<string_view, PayLoad> map;
vector<unique_ptr<const char[]>> mapKeyStore;
// Add multiple values
mapKeyStore.push_back(conv("a"));
map.emplace(mapKeyStore.back().get(), 3);
mapKeyStore.push_back(conv("b"));
map.emplace(mapKeyStore.back().get(), 1);
mapKeyStore.push_back(conv("c"));
map.emplace(mapKeyStore.back().get(), 4);
// Search all keys
cout << map.find("a")->second;
cout << map.find("b")->second;
cout << map.find("c")->second;
// Delete the "a" key
map.erase("a");
mapKeyStore.erase(remove_if(mapKeyStore.begin(), mapKeyStore.end(),
[](const auto& a){ return strcmp(a.get(), "a") == 0; }),
mapKeyStore.end());
// Test if everything is OK.
cout << '\n';
for(const auto& it : map)
cout << it.first << ": " << it.second << "\n";
return 0;
}
Of course, the two containers can be put into a wrapper class that handles insertion and deletion on its own.
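One possible shape for such a wrapper is sketched below. It is a variation, not the code from the answer: instead of a separate vector, each map entry owns the buffer its string_view key points into, so erasing a key frees its storage without a linear search. The class name StringViewMap and its member names are made up for illustration, and C++17 is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper (not from the answer above): each entry owns the
// character buffer that its string_view key points into.
template <typename Value>
class StringViewMap {
public:
    // Copies the key into a new heap buffer; returns false if the key exists.
    bool insert(std::string_view key, Value value) {
        if (map_.find(key) != map_.end()) return false;
        auto buffer = std::make_unique<char[]>(key.size());
        std::copy(key.begin(), key.end(), buffer.get());
        // The view refers to the heap buffer, which does not move when the
        // unique_ptr that owns it is moved into the map.
        std::string_view stored(buffer.get(), key.size());
        map_.emplace(stored, Entry{std::move(buffer), std::move(value)});
        return true;
    }

    // Lookup takes a string_view, so no string is allocated.
    Value* find(std::string_view key) {
        auto it = map_.find(key);
        return it == map_.end() ? nullptr : &it->second.value;
    }

    // Erasing the node destroys the key buffer together with the value.
    bool erase(std::string_view key) { return map_.erase(key) > 0; }

    std::size_t size() const { return map_.size(); }

private:
    struct Entry {
        std::unique_ptr<char[]> buffer;  // owns the bytes the key views
        Value value;
    };
    std::unordered_map<std::string_view, Entry> map_;
};
```

The trade-off is one extra pointer per entry in exchange for constant-time deletion on average.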
I'll just present one variation I found on GitHub. It defines a new map class that derives from std::unordered_map,
redefines some key API functions to intercept the argument types we want, and uses a static string to copy the key.
It's not necessarily a good solution, but it's interesting to know it exists for people who deem it good enough.
original:
https://gist.github.com/facontidavide/95f20c28df8ec91729f9d8ab01e7d2df
code gist:
template <typename Value>
class StringMap: public std::unordered_map<std::string, Value>
{
    using Base = std::unordered_map<std::string, Value>;
public:
    // Copy the view into a reusable buffer, so no new string is allocated
    // once the buffer has grown large enough.
    typename Base::iterator find(const nonstd::string_view& v )
    {
        tmp_.assign( v.data(), v.size() );
        return Base::find(tmp_);
    }
    typename Base::iterator find(const std::string& v )
    {
        return Base::find(v);
    }
    typename Base::iterator find(const char* v )
    {
        tmp_.assign(v);
        // Look up the buffer, not v: passing v would build a temporary std::string.
        return Base::find(tmp_);
    }
private:
    thread_local static std::string tmp_;
};
// The static data member also needs a definition outside the class.
template <typename Value>
thread_local std::string StringMap<Value>::tmp_;
credits:
Davide Faconti
Sorry for answering this very old question, but it still comes up in search engine results...
In this case your unordered_map uses std::string as its key type, and find takes a reference to a string, which does not cause an allocation. Your string_view class stores a pointer to a string, so it can dereference that pointer to obtain a reference of the type the map needs, again without an allocation. The method would look like this:
string &string_view::getRef() const
{
return *_ptr;
}
To use the string_view with the map, it would look like this:
auto found=map.find(string_view_inst.getRef());
Note that this will not work for the C++17 std::string_view class, as it does not internally store a std::string object.
P.S.
Your string_view class is probably not great for CPU caches, as it stores a pointer to a string allocated somewhere on the heap, and the string itself stores a pointer to the actual data located somewhere else on the heap. Every access through your string_view results in a double dereference.
You could allow your view to be implicitly convertible to a std::string:
class StringView {
// ...
operator std::string() const
{
return data->substr(begin, len);
}
// ...
};
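A self-contained sketch of this idea follows. The constructor and the contains helper are assumptions added for illustration; only the conversion operator comes from the snippet above. Note that the conversion builds a new std::string on every lookup, so it is convenient but may still allocate.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Minimal view over part of an existing std::string (hypothetical layout).
class StringView {
public:
    StringView(const std::string& source, std::size_t offset, std::size_t length)
        : data(&source), begin(offset), len(length) {}

    // Implicit conversion: copies the viewed characters into a new string.
    operator std::string() const { return data->substr(begin, len); }

private:
    const std::string* data;  // the string being viewed; must outlive the view
    std::size_t begin;
    std::size_t len;
};

// find() expects const std::string&, so the view is converted implicitly.
inline bool contains(const std::unordered_map<std::string, int>& map,
                     const StringView& key) {
    return map.find(key) != map.end();
}
```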
I have a need for a "container" that acts like the following. It has 2 subcontainers, called A and B, and I need to be able to iterate over just A, just B, and A and B combined. I don't want to use extra space for redundant data, so I thought of making my own iterator to iterate over A and B combined. What is the easiest way to make your own iterator? Or, what is another way to do this?
EDIT Ultimately, I don't think it was good design. I have redesigned the entire class hierarchy. +1 for refactoring. However, I did solve this problem sufficiently. Here's an abbreviated version of what I did, for reference; it uses boost::filter_iterator. Let T be the type in the container.
enum Flag
{
A_flag,
B_flag
};
class T_proxy
{
public:
T_proxy(const T& t, Flag f) : t_(t), flag_(f) {}
operator T() const {return t_;}
Flag flag() const {return flag_;}
class Compare
{
public:
Compare(Flag f) : matchFlag_(f) {}
bool operator() (const T_proxy& tp) const {return tp.flag() == matchFlag_;}
private:
Flag matchFlag_;
};
private:
T t_;
Flag flag_;
};
class AB_list
{
public:
typedef T_proxy::Compare Compare;
typedef vector<T_proxy>::iterator iterator;
typedef boost::filter_iterator<Compare, iterator> sub_iterator;
void insert(const T& val, Flag f) {data_.push_back(T_proxy(val, f));}
// other methods...
// whole sequence
iterator begin() {return data_.begin();}
iterator end() {return data_.end();}
// just A
sub_iterator begin_A() {return sub_iterator(Compare(A_flag), begin(), end());}
sub_iterator end_A() {return sub_iterator(Compare(A_flag), end(), end());}
// just B is basically the same
private:
vector<T_proxy> data_;
};
// usage
AB_list mylist;
mylist.insert(T(), A_flag);
for (AB_list::sub_iterator it = mylist.begin_A(); it != mylist.end_A(); ++it)
{
T temp = *it; // T_proxy is convertible to T
cout << temp;
}
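For the original goal of iterating over A and B combined without copying either one, a hand-written forward iterator is another option that needs no Boost. The sketch below is one way to write it, assuming both subcontainers have the same type; ChainIterator, chain_begin and chain_end are made-up names.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Forward iterator that walks range A and then continues into range B.
template <typename It>
class ChainIterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = typename std::iterator_traits<It>::value_type;
    using difference_type = std::ptrdiff_t;
    using pointer = typename std::iterator_traits<It>::pointer;
    using reference = typename std::iterator_traits<It>::reference;

    ChainIterator() = default;
    ChainIterator(It current, It endA, It beginB, bool inA)
        : current_(current), endA_(endA), beginB_(beginB), inA_(inA) { hop(); }

    reference operator*() const { return *current_; }
    ChainIterator& operator++() { ++current_; hop(); return *this; }
    ChainIterator operator++(int) { ChainIterator old = *this; ++*this; return old; }

    // Iterators are equal only if they are in the same range and position;
    // the flag avoids comparing iterators that belong to different containers.
    friend bool operator==(const ChainIterator& l, const ChainIterator& r) {
        return l.inA_ == r.inA_ && l.current_ == r.current_;
    }
    friend bool operator!=(const ChainIterator& l, const ChainIterator& r) {
        return !(l == r);
    }

private:
    // When the end of A is reached, jump to the start of B.
    void hop() {
        if (inA_ && current_ == endA_) { current_ = beginB_; inA_ = false; }
    }

    It current_{};
    It endA_{};
    It beginB_{};
    bool inA_ = false;
};

template <typename C>
ChainIterator<typename C::iterator> chain_begin(C& a, C& b) {
    return {a.begin(), a.end(), b.begin(), true};
}

template <typename C>
ChainIterator<typename C::iterator> chain_end(C& a, C& b) {
    return {b.end(), a.end(), b.begin(), false};
}
```

Iterating over just A or just B then uses the subcontainers' own begin() and end(), and no element is stored twice.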
I will repost my answer to a similar question. I think this will do what you want.
Use a library like Boost.MultiIndex to do what you want. It scales well, and there is a lot less boilerplate code if you want to add new indexes. It is also usually more space- and time-efficient.
typedef multi_index_container<
Container,
indexed_by<
sequenced<>, // gives you a list-like interface
ordered_unique<member<Container, std::string, &Container::a_value> >, // gives you a lookup by name, like a map
ordered_unique<member<Container, std::string, &Container::b_value> > // gives you a lookup by name, like a map
>
> container;
If you are iterating over one index, you can switch to another index by using the iterator projection concept in the library.
Have one container which stores the value you are interested in together with a flag indicating whether it is in A or B.
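A minimal sketch of that idea using only the standard library might look like this. The names Group, FlaggedList, for_each and for_each_in are invented for illustration, and visiting with a callback is just one of several possible interfaces.

```cpp
#include <vector>

enum class Group { A, B };

// Single container; each element remembers which group it belongs to.
template <typename T>
class FlaggedList {
public:
    void insert(const T& value, Group group) { data_.push_back({value, group}); }

    // Visit every element, regardless of group.
    template <typename Fn>
    void for_each(Fn fn) const {
        for (const auto& item : data_) fn(item.value);
    }

    // Visit only the elements of one group, in insertion order.
    template <typename Fn>
    void for_each_in(Group group, Fn fn) const {
        for (const auto& item : data_)
            if (item.group == group) fn(item.value);
    }

private:
    struct Item {
        T value;
        Group group;
    };
    std::vector<Item> data_;
};
```

Visiting one group still walks the whole vector, which is the price of not storing anything twice.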
You could also create a single container containing std::pair<> objects.
Billy3