Say we have a map with larger objects and an index value. The index value is also part of the larger object.
What I would like to know is whether it is possible to replace the map with a set, extracting the index value.
It is fairly easy to create a set that sorts on a functor comparing two larger objects by extracting the index value.
Which leaves searching by index value, which is not supported by default in a set, I think.
I was thinking of using std::find_if, but I believe that searches linearly, ignoring the fact we have set.
Then I thought of using std::binary_search with a functor comparing the larger object and the value, but I believe that it doesn't work in this case as it wouldn't make use of the structure and would use traversal as it doesn't have a random access iterator. Is this correct? Or are there overloads which correctly handle this call on a set?
And then finally I was thinking of using a boost::containter::flat_set, as this has an underlying vector and thus presumably should be able to work well with std::binary_search?
But maybe there is an all together easier way to do this?
Before you answer just use a map where a map ought to be used - I am actually using a vector that is manually sorted (well std::lower_bound) and was thinking of replacing it with boost::containter::flat_set, but it doesn't seem to be easily possible to do so, so I might just stick with the vector.
C++14 will introduce the ability to lookup by a key that does not require the construction of the entire stored object. This can be used as follows:
#include <set>
#include <iostream>
struct StringRef {
StringRef(const std::string& s):x(&s[0]) { }
StringRef(const char *s):x(s) { std::cout << "works: " << s << std::endl; }
const char *x;
};
struct Object {
long long data;
std::size_t index;
};
struct ObjectIndexer {
ObjectIndexer(Object const& o) : index(o.index) {}
ObjectIndexer(std::size_t index) : index(index) {}
std::size_t index;
};
struct ObjComp {
bool operator()(ObjectIndexer a, ObjectIndexer b) const {
return a.index < b.index;
}
typedef void is_transparent; //Allows the comparison with non-Object types.
};
int main() {
std::set<Object, ObjComp> stuff;
stuff.insert(Object{135, 1});
std::cout << stuff.find(ObjectIndexer(1))->data << "\n";
}
More generally, these sorts of problems where there are multiple ways of indexing your data can be solved using Boost.MultiIndex.
Use boost::intrusive::set which can utilize the object's index value directly. It has a find(const KeyType & key, KeyValueCompare comp) function with logarithmic complexity. There are also other set types based on splay trees, AVL trees, scapegoat trees etc. which may perform better depending on your requirements.
If you add the following to your contained object type:
less than operator that only compares the object indices
equality operator that only compares the object indices
a constructor that takes your index type and initializes a dummy object with that value for the index
then you can pass your index type to find, lower_bound, equal_range, etc... and it will act the way you want. When you pass your index to the set's (or flat_set's) find methods it will construct a dummy object of the contained type to use for the comparisons.
Now if your object is really big, or expensive to construct, this might not be the way you want to go.
When trying to write the std::string keys of an std::unordered_map in the following example, the keys get written in a different order than the one given by the initializer list:
#include <iostream>
#include <unordered_map>
class Data
{
typedef std::unordered_map<std::string, double> MapType;
typedef MapType::const_iterator const_iterator;
MapType map_;
public:
Data(const std::initializer_list<std::string>& i)
{
int counter = 0;
for (const auto& name : i)
{
map_[name] = counter;
}
}
const_iterator begin() const
{
return map_.begin();
}
const_iterator end() const
{
return map_.end();
}
};
std::ostream& operator<<(std::ostream& os, const Data& d)
{
for (const auto& pair : d)
{
os << pair.first << " ";
}
return os;
}
using namespace std;
int main(int argc, const char *argv[])
{
Data d = {"Why", "am", "I", "sorted"};
// The unordered_map speaks like Yoda.
cout << d << endl;
return 0;
}
I expected to see 'Why am I sorted', but I got a Yoda-like output:
sorted I am Why
Reading on the unordered_map here, I saw this:
Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its key. This allows fast access to individual elements, since once hash is computed, it refers to the exact bucket the element is placed into.
Is this why the elements are not ordered in the same way as in the initializer list?
What data structure do I then use when I want the keys to be ordered in the same way as the initializer list? Should I internally keep a vector of strings to somehow save the argument order? Can the bucket organization be turned off somehow by choosing a specific hashing function?
What data structure do I then use when I want the keys to be ordered in the same way as the initializer list? Should I internally keep a vector of strings to somehow save the argument order?
Maybe all you want is actually a list/vector of (key, value) pairs?
If you want both O(1) lookup (hashmap) and iteration in the same order as insertion - then yes, using a vector together with an unordered_map sounds like a good idea. For example, Django's SortedDict (Python) does exactly that, here's the source for inspiration:
https://github.com/django/django/blob/master/django/utils/datastructures.py#L122
Python 2.7's OrderedDict is a bit more fancy (map values point to doubly-linked list links), see:
http://code.activestate.com/recipes/576693-ordered-dictionary-for-py24/
I'm not aware of an existing C++ implementation in standard libs, but this might get you somewhere. See also:
a C++ hash map that preserves the order of insertion
A std::map that keep track of the order of insertion?
unordered_map is, by definition, unordered, so you shall not expect any ordering when accessing the map sequentially.
If you don't want elements sorted by the key value, just use a container that keeps your order of insertion, be it a vector, deque, list or whatever, of pair<key, value> element type if you insist on using it.
Then, if an alement B is added after element A, it will always appear later. This holds true for initializer_list initialization as well.
You could probably use something like Boost.MultiIndex to keep it both sorted by insertion order and arbitrary key.
I am working in a C++03 environment, and applying a function to every key of a map is a lot of code:
const std::map<X,Y>::const_iterator end = m_map.end();
for (std::map<X,Y>::const_iterator element = m_map.begin(); element != end; ++element)
{
func( element->first );
}
If a key_iterator existed, the same code could take advantage of std::for_each:
std::for_each( m_map.key_begin(), m_map.key_end(), &func );
So why isn’t it provided? And is there a way to adapt the first pattern to the second one?
Yes, it is a silly shortcoming. But it's easily rectified: you can write your own generic key_iterator class which can be constructed from the map (pair) iterator. I've done it, it's only a few lines of code, and it's then trivial to make value_iterator too.
There is no need for std::map<K, V> to provide iterators for the keys and/or the values: such an iterator can easily be built based on the existing iterator(s). Well, it isn't as easy as it should/could be but it is certainly doable. I know that Boost has a library of iterator adapters.
The real question could be: why doesn't the standard C++ library provide iterator adapters to project iterators? The short answer is in my opinion: because, in general, you don't want to modify the iterator to choose the property accessed! You rather want to project or, more general, transform the accessed value but still keep the same notion of position. Formulated different, I think it is necessary to separate the notion of positioning (i.e., advancing iterator and testing whether their position is valid) from accessing properties at a given position. The approach I envision is would look like this:
std::for_each(m_map.key_pm(), m_map.begin(), m_map.end(), &func);
or, if you know the underlying structure obtained from the map's iterator is a std::pair<K const, V> (as is the case for std::map<K, V> but not necessarily for other containers similar to associative containers; e.g., a associative container based on a b-tree would benefit from splitting the key and the value into separate entities):
std::for_each(_1st, m_map.begin(), m_map.end(), &func);
My STL 2.0 page is an [incomplete] write-up with a bit more details on how I think the standard C++ library algorithms should be improved, including the above separation of iterators into positioning (cursors) and property access (property maps).
So why isn’t it provided?
I don't know.
And is there a way to adapt the first pattern to the second one?
Alternatively to making a “key iterator” (cf. my comment and other answers), you can write a small wrapper around func, e.g.:
class FuncOnFirst { // (maybe find a better name)
public:
void operator()(std::map<X,Y>::value_type const& e) const { func(e.first); }
};
then use:
std::for_each( m_map.begin(), m_map.end(), FuncOnFirst() );
Slightly more generic wrapper:
class FuncOnFirst { // (maybe find a better name)
public:
template<typename T, typename U>
void operator()(std::pair<T, U> const& p) const { func(p.first); }
};
There is no need for key_iterator or value_iterator as value_type of a std::map is a std::pair<const X, Y>, and this is what function (or functor) called by for_each() will operate on. There is no performance gain to be had from individual iterators as the pair is aggregated in the underlying node in the binary tree used by the map.
Accessing the key and value through a std::pair is hardly strenuous.
#include <iostream>
#include <map>
typedef std::map<unsigned, unsigned> Map;
void F(const Map::value_type &v)
{
std::cout << "Key: " << v.first << " Value: " << v.second << std::endl;
}
int main(int argc, const char * argv[])
{
Map map;
map.insert(std::make_pair(10, 20));
map.insert(std::make_pair(43, 10));
map.insert(std::make_pair(5, 55));
std::for_each(map.begin(), map.end(), F);
return 0;
}
Which gives the output:
Key: 5 Value: 55
Key: 10 Value: 20
Key: 43 Value: 10
Program ended with exit code: 0
#include <iostream>
#include <hash_map>
using namespace stdext;
using namespace std;
class CompareStdString
{
public:
bool operator ()(const string & str1, const string & str2) const
{
return str1.compare(str2) < 0;
}
};
int main()
{
hash_map<string, int, hash_compare<string, CompareStdString> > Map;
Map.insert(make_pair("one", 1));
Map.insert(make_pair("two", 2));
Map.insert(make_pair("three", 3));
Map.insert(make_pair("four", 4));
Map.insert(make_pair("five", 5));
hash_map<string, int, hash_compare<string, CompareStdString> > :: iterator i;
for (i = Map.begin(); i != Map.end(); ++i)
{
i -> first; // they are ordered as three, five, two, four, one
}
return 0;
}
I want to use hash_map to keep std::string as a key. But when i insert the next pair order is confused. Why order is do not match to insert order ? how should i get the order one two three four five ??
Why order is do not match to insert order?
That's because a stdext::hash_map (and the platform-independent standard library version std::unordered_map from C++11) doesn't maintain/guarantee any reasonable order of its elements, not even insertion order. That's because it is a hashed container, with the individual elements' position based on their hash value and the size of the container. So you won't be able to maintain a reasonable order for your data with such a container.
What you can use to keep your elements in a guaranteed order is a good old std::map. But this also doesn't order elements by insertion order, but by the order induced by the comparison predicate (which can be confugured to respect insertion time, but that would be quite unintuitive and not that easy at all).
For anything else you won't get around rolling your own (or search for other libraries, don't know if boost has something like that). For example add all elements to a linear std::vector/std::list for insertion order iteration and maintain an additional std::(unordered_)map pointing into that vector/list for O(1)/O(log n) retrieval if neccessary.
This is my code
map<string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(map<string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
cout<< (*i).first << ":"<<(*i).second<<endl;
}
Expected output:
B:123
A:321
But output it gives is:
A:321
B:123
I want it to maintain the order in which keys and values were inserted in the map<string,int>.
Is it possible? Or should I use some other STL data structure? Which one?
There is no standard container that does directly what you want. The obvious container to use if you want to maintain insertion order is a vector. If you also need look up by string, use a vector AND a map. The map would in general be of string to vector index, but as your data is already integers you might just want to duplicate it, depending on your use case.
Like Matthieu has said in another answer, the Boost.MultiIndex library seems the right choice for what you want. However, this library can be a little tough to use at the beginning especially if you don't have a lot of experience with C++. Here is how you would use the library to solve the exact problem in the code of your question:
struct person {
std::string name;
int id;
person(std::string const & name, int id)
: name(name), id(id) {
}
};
int main() {
using namespace::boost::multi_index;
using namespace std;
// define a multi_index_container with a list-like index and an ordered index
typedef multi_index_container<
person, // The type of the elements stored
indexed_by< // The indices that our container will support
sequenced<>, // list-like index
ordered_unique<member<person, string,
&person::name> > // map-like index (sorted by name)
>
> person_container;
// Create our container and add some people
person_container persons;
persons.push_back(person("B", 123));
persons.push_back(person("C", 224));
persons.push_back(person("A", 321));
// Typedefs for the sequence index and the ordered index
enum { Seq, Ord };
typedef person_container::nth_index<Seq>::type persons_seq_index;
typedef person_container::nth_index<Ord>::type persons_ord_index;
// Let's test the sequence index
persons_seq_index & seq_index = persons.get<Seq>();
for(persons_seq_index::iterator it = seq_index.begin(),
e = seq_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// And now the ordered index
persons_ord_index & ord_index = persons.get<Ord>();
for(persons_ord_index::iterator it = ord_index.begin(),
e = ord_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// Thanks to the ordered index we have fast lookup by name:
std::cout << "The id of B is: " << ord_index.find("B")->id << "\n";
}
Which produces the following output:
B:123
C:224
A:321
A:321
B:123
C:224
The id of B is: 123
Map is definitely not right for you:
"Internally, the elements in the map are sorted from lower to higher key value following a specific strict weak ordering criterion set on construction."
Quote taken from here.
Unfortunately there is no unordered associative container in the STL, so either you use a nonassociative one like vector, or write your own :-(
I had the same problem every once in a while and here is my solution: https://github.com/nlohmann/fifo_map. It's a header-only C++11 solution and can be used as drop-in replacement for a std::map.
For your example, it can be used as follows:
#include "fifo_map.hpp"
#include <string>
#include <iostream>
using nlohmann::fifo_map;
int main()
{
fifo_map<std::string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(fifo_map<std::string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout<< (*i).first << ":"<<(*i).second << std::endl;
}
}
The output is then
B:123
A:321
Besides Neil's recommendation of a combined vector+map if you need both to keep the insertion order and the ability to search by key, you can also consider using boost multi index libraries, that provide for containers addressable in more than one way.
maps and sets are meant to impose a strict weak ordering upon the data. Strick weak ordering maintains that no entries are equavalent (different to being equal).
You need to provide a functor that the map/set may use to perform a<b. With this functor the map/set sorts its items (in the STL from GCC it uses a red-black tree). It determines weather two items are equavalent if !a<b && !b<a -- the equavelence test.
The functor looks like follows:
template <class T>
struct less : binary_function<T,T,bool> {
bool operator() (const T& a, const T& b) const {
return a < b;
}
};
If you can provide a function that tells the STL how to order things then the map and set can do what you want. For example
template<typename T>
struct ItemHolder
{
int insertCount;
T item;
};
You can then easily write a functor to order by insertCount. If your implementation uses red-black trees your underlying data will remain balanced -- however you will get a lot of re-balancing since your data will be generated based on incremental ordering (vs. Random) -- and in this case a list with push_back would be better. However you cannot access data by key as fast as you would with a map/set.
If you want to sort by string -- provide the functor to search by string, using the insertCount you could potentiall work backwards. If you want to search by both you can have two maps.
map<insertcount, string> x; // auxhilary key
map<string, item> y; //primary key
I use this strategy often -- however I have never placed it in code that is run often. I'm considering boost::bimap.
Well, there is no STL container which actually does what you wish, but there are possibilities.
1. STL
By default, use a vector. Here it would mean:
struct Entry { std::string name; int it; };
typedef std::vector<Entry> container_type;
If you wish to search by string, you always have the find algorithm at your disposal.
class ByName: std::unary_function<Entry,bool>
{
public:
ByName(const std::string& name): m_name(name) {}
bool operator()(const Entry& entry) const { return entry.name == m_name; }
private:
std::string m_name;
};
// Use like this:
container_type myContainer;
container_type::iterator it =
std::find(myContainer.begin(), myContainer.end(), ByName("A"));
2. Boost.MultiIndex
This seems way overkill, but you can always check it out here.
It allows you to create ONE storage container, accessible via various indexes of various styles, all maintained for you (almost) magically.
Rather than using one container (std::map) to reference a storage container (std::vector) with all the synchro issues it causes... you're better off using Boost.
For preserving all the time complexity constrains you need map + list:
struct Entry
{
string key;
int val;
};
typedef list<Entry> MyList;
typedef MyList::iterator Iter;
typedef map<string, Iter> MyMap;
MyList l;
MyMap m;
int find(string key)
{
Iter it = m[key]; // O(log n)
Entry e = *it;
return e.val;
}
void put(string key, int val)
{
Entry e;
e.key = key;
e.val = val;
Iter it = l.insert(l.end(), e); // O(1)
m[key] = it; // O(log n)
}
void erase(string key)
{
Iter it = m[key]; // O(log n)
l.erase(it); // O(1)
m.erase(key); // O(log n)
}
void printAll()
{
for (Iter it = l.begin(); it != l.end(); it++)
{
cout<< it->key << ":"<< it->val << endl;
}
}
Enjoy
You could use a vector of pairs, it is almost the same as unsorted map container
std::vector<std::pair<T, U> > unsorted_map;
Use a vector. It gives you complete control over ordering.
I also think Map is not the way to go. The keys in a Map form a Set; a single key can occur only once. During an insert in the map the map must search for the key, to ensure it does not exist or to update the value of that key. For this it is important (performance wise) that the keys, and thus the entries, have some kind of ordering. As such a Map with insert ordering would be highly inefficient on inserts and retrieving entries.
Another problem would be if you use the same key twice; should the first or the last entry be preserved, and should it update the insert order or not?
Therefore I suggest you go with Neils suggestion, a vector for insert-time ordering and a Map for key-based searching.
Yes, the map container is not for you.
As you asked, you need the following code instead:
struct myClass {
std::string stringValue;
int intValue;
myClass( const std::string& sVal, const int& iVal ):
stringValue( sVal ),
intValue( iVal) {}
};
std::vector<myClass> persons;
persons.push_back( myClass( "B", 123 ));
persons.push_back( myClass( "A", 321 ));
for(std::vector<myClass>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout << (*i).stringValue << ":" << (*i).intValue << std::endl;
}
Here the output is unsorted as expected.
Map is ordered collection (second parametr in template is a order functor), as set. If you want to pop elements in that sequenses as pushd you should use deque or list or vector.
In order to do what they do and be efficient at it, maps use hash tables and sorting. Therefore, you would use a map if you're willing to give up memory of insertion order to gain the convenience and performance of looking up by key.
If you need the insertion order stored, one way would be to create a new type that pairs the value you're storing with the order you're storing it (you would need to write code to keep track of the order). You would then use a map of string to this new type for storage. When you perform a look up using a key, you can also retrieve the insertion order and then sort your values based on insertion order.
One more thing: If you're using a map, be aware of the fact that testing if persons["C"] exists (after you've only inserted A and B) will actually insert a key value pair into your map.
Instead of map you can use the pair function with a vector!
ex:
vector<::pair<unsigned,string>> myvec;
myvec.push_back(::pair<unsigned,string>(1,"a"));
myvec.push_back(::pair<unsigned,string>(5,"b"));
myvec.push_back(::pair<unsigned,string>(3,"aa"));`
Output:
myvec[0]=(1,"a"); myvec[1]=(5,"b"); myvec[2]=(3,"aa");
or another ex:
vector<::pair<string,unsigned>> myvec2;
myvec2.push_back(::pair<string,unsigned>("aa",1));
myvec2.push_back(::pair<string,unsigned>("a",3));
myvec2.push_back(::pair<string,unsigned>("ab",2));
Output: myvec2[0]=("aa",1); myvec2[1]=("a",3); myvec2[2]=("ab",2);
Hope this can help someone else in the future who was looking for non sorted maps like me!
struct Compare : public binary_function<int,int,bool> {
bool operator() (int a, int b) {return true;}
};
Use this to get all the elements of a map in the reverse order in which you entered (i.e.: the first entered element will be the last and the last entered element will be the first). Not as good as the same order but it might serve your purpose with a little inconvenience.
Use a Map along with a vector of iterators as you insert in Map. (Map iterators are guaranteed not to be invalidated)
In the code below I am using Set
set<string> myset;
vector<set<string>::iterator> vec;
void printNonDuplicates(){
vector<set<string>::iterator>::iterator vecIter;
for(vecIter = vec.begin();vecIter!=vec.end();vecIter++){
cout<<(*vecIter)->c_str()<<endl;
}
}
void insertSet(string str){
pair<set<string>::iterator,bool> ret;
ret = myset.insert(str);
if(ret.second)
vec.push_back(ret.first);
}
If you don't want to use boost::multi_index, I have put a proof of concept class template up for review here:
https://codereview.stackexchange.com/questions/233157/wrapper-class-template-for-stdmap-stdlist-to-provide-a-sequencedmap-which
using std::map<KT,VT> and std::list<OT*> which uses pointers to maintain the order.
It will take O(n) linear time for the delete because it needs to search the whole list for the right pointer. To avoid that would need another cross reference in the map.
I'd vote for typedef std::vector< std::pair< std::string, int > > UnsortedMap;
Assignment looks a bit different, but your loop remains exactly as it is now.
There is std::unordered_map that you can check out. From first view, it looks like it might solve your problem.