Having a composite key for hash map in c++ - c++

I have a data structure which has,
<Book title>, <Author>, and <rate>
Since Book title or Author can be duplicated, I'd like to build a composite key.
(let's say I cannot make extra unique key, such as ID)
Since data is pretty huge, I'm using GCC unordered_map for the sake of speed,
and I built my structure like this:
typedef pair<string, string> keys_t
typedef unordered_map<keys_t, double> map_t;
Everything works okay in general,
But the problem happens when I want to refer one specific key.
For example, let's suppose I'd like to find the best-rated book among books titled as "math", or I'd like to find the average rate of "Tolstoy"'s books.
In this case, this becomes very bothersome, since I cannot only refer only one of the key pair.
I happened to find boost::multi_index but I'm having some trouble understanding the documents.
Does anyone have some idea or guideline for this?
Solution to make multiple indices, succinct example for multi_index, any other approach, etc.. any help will be appreciated.
Thank you.

I figured out how to use boost::multi_index
I referred this code: Boost multi_index composite keys using MEM_FUN
and here's my code for your reference.
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/mem_fun.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/member.hpp>
#include <iostream>
#include <string>
using namespace boost::multi_index;
using namespace std;
class Book {
public:
Book(const string &lang1, const string &lang2, const double &value) : m_lang1(lang1) , m_lang2(lang2) , m_value(value) {}
friend std::ostream& operator << (ostream& os,const Book& n) {
os << n.m_lang1 << " " << n.m_lang2 << " " << n.m_value << endl;
return os;
}
const string &lang1() const { return m_lang1; }
const string &lang2() const { return m_lang2; }
const double &value() const { return m_value; }
private:
string m_lang1, m_lang2;
double m_value;
};
// These will be Tag names
struct lang1 {};
struct lang2 {};
struct value {};
typedef multi_index_container <
Book,
indexed_by<
ordered_non_unique<tag<lang1>, BOOST_MULTI_INDEX_CONST_MEM_FUN( Book, const string &, lang1)
>,
ordered_non_unique<tag<lang2>, BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang2)
>,
ordered_non_unique<tag<value>, BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const double &, value), greater<double>
>,
ordered_unique<
// make as a composite key with Title and Author
composite_key<
Book,
BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang1),
BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang2)
>
>
>
> Book_set;
// Indices for iterators
typedef Book_set::index<lang1>::type Book_set_by_lang1;
typedef Book_set::index<lang2>::type Book_set_by_lang2;
typedef Book_set::index<value>::type Book_set_by_value;
int main() {
Book_set books;
books.insert(Book("Math", "shawn", 4.3));
books.insert(Book("Math", "john", 4.2));
books.insert(Book("Math2", "abel", 3.8));
books.insert(Book("Novel1", "Tolstoy", 5.0));
books.insert(Book("Novel1", "Tolstoy", 4.8)); // This will not be inserted(duplicated)
books.insert(Book("Novel2", "Tolstoy", 4.2));
books.insert(Book("Novel3", "Tolstoy", 4.4));
books.insert(Book("Math", "abel", 2.5));
books.insert(Book("Math2", "Tolstoy", 3.0));
cout << "SORTED BY TITLE" << endl;
for (Book_set_by_lang1::iterator itf = books.get<lang1>().begin(); itf != books.get<lang1>().end(); ++itf)
cout << *itf;
cout << endl<<"SORTED BY AUTHOR" << endl;
for (Book_set_by_lang2::iterator itm = books.get<lang2>().begin(); itm != books.get<lang2>().end(); ++itm)
cout << *itm;
cout << endl<<"SORTED BY RATING" << endl;
for (Book_set_by_value::iterator itl = books.get<value>().begin(); itl != books.get<value>().end(); ++itl)
cout << *itl;
// Want to see Tolstoy's books? (in descending order of rating)
cout << endl;
Book_set_by_lang2::iterator mitchells = books.get<lang2>().find("Tolstoy");
while (mitchells->lang2() == "Tolstoy")
cout << *mitchells++;
return 0;
}
Thank you all who made comments!

What I did in a similar case was use a single container to contain the
objects and separate std::multiset<ObjectType const*, CmpType> for
each possible index; when inserting, I'd do a push_back, then recover
the address from back(), and insert it into each of the std::set.
(std::unordered_set and std::unordered_multiset would be better in
your case: in my case, not only was the order significant, but I didn't
have access to a recent compiler with unordered_set either.)
Note that this supposes that the objects are immutable once they are in
the container. If you're going to mutate one of them, you probably should
extract it from all of the sets, do the modification, and reinsert it.
This also supposes that the main container type will never invalidate
pointers and references to an object; in my case, I knew the maximum
size up front, so I could do a reserve() and use std::vector.
Failing this, you could use std::deque, or simply use an std::map
for the primary (complete) key.
Even this requires accessing the complete element in the key. It's not
clear from your posting whether this is enough—“books titled
math” makes me thing that you might need a substring search in the
title (and should “Tolstoy” match “Leo
Tolstoy”?). If you want to match an arbitrary substring, either
your multiset will be very, very large (since you'll insert all possible
substrings as entries), or you'll do a linear search. (On a long
running system where the entries aren't changing, it might be worth
compromizing: do the linear search the first time the substring is
requested, but cache the results in a multiset, so that the next time,
you can find them quickly. It's likely that people will often use the
same substrings, e.g. “math” for any book with
“math” in its title.)

There is an article on the same subject:
http://marknelson.us/2011/09/03/hash-functions-for-c-unordered-containers/
The author, Mark Nelson, was trying to do the similar : "use of a simple class or structure to hold the person’s name", basically he's using a pair as key (just like you) for his unordered_map:
typedef pair<string,string> Name;
int main(int argc, char* argv[])
{
unordered_map<Name,int> ids;
ids[Name("Mark", "Nelson")] = 40561;
ids[Name("Andrew","Binstock")] = 40562;
for ( auto ii = ids.begin() ; ii != ids.end() ; ii++ )
cout << ii->first.first
<< " "
<< ii->first.second
<< " : "
<< ii->second
<< endl;
return 0;
}
He realized that the unordered_map doesn’t know how to create a hash for the given key type of std::pair.
So he demonstrates 4 ways of creating a hash function for use in unordered_map.

If it is an infrequent operation you can search for the value.
for(auto& p : m)
{
if(p.second.name==name_to_find)
{
//you now have the element
}
}
however if the map is large this will be problematic because it will be a linear procedure rather than O(log n), this is a problem because maps are inherently slow.

Related

How to preserve insertion order in Map? [duplicate]

I currently have a std::map<std::string,int> that stores an integer value to a unique string identifier, and I do look up with the string. It does mostly what I want, except that it does not keep track of the insertion order. So when I iterate the map to print out the values, they are sorted according to the string; but I want them to be sorted according to the order of (first) insertion.
I thought about using a vector<pair<string,int>> instead, but I need to look up the string and increment the integer values about 10,000,000 times, so I don't know whether a std::vector will be significantly slower.
Is there a way to use std::map or is there another std container that better suits my need?
I'm on GCC 3.4, and I have probably no more than 50 pairs of values in my std::map.
If you have only 50 values in std::map you could copy them to std::vector before printing out and sort via std::sort using appropriate functor.
Or you could use boost::multi_index. It allows to use several indexes.
In your case it could look like the following:
struct value_t {
string s;
int i;
};
struct string_tag {};
typedef multi_index_container<
value_t,
indexed_by<
random_access<>, // this index represents insertion order
hashed_unique< tag<string_tag>, member<value_t, string, &value_t::s> >
>
> values_t;
You might combine a std::vector with a std::tr1::unordered_map (a hash table). Here's a link to Boost's documentation for unordered_map. You can use the vector to keep track of the insertion order and the hash table to do the frequent lookups. If you're doing hundreds of thousands of lookups, the difference between O(log n) lookup for std::map and O(1) for a hash table might be significant.
std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;
// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");
/* Increment things in myTable 100000 times */
// Print the final results.
for (int i = 0; i < insertOrder.size(); ++i)
{
const std::string &s = insertOrder[i];
std::cout << s << ' ' << myTable[s] << '\n';
}
Tessil has a very nice implementaion of ordered map (and set) which is MIT license. You can find it here: ordered-map
Map example
#include <iostream>
#include <string>
#include <cstdlib>
#include "ordered_map.h"
int main() {
tsl::ordered_map<char, int> map = {{'d', 1}, {'a', 2}, {'g', 3}};
map.insert({'b', 4});
map['h'] = 5;
map['e'] = 6;
map.erase('a');
// {d, 1} {g, 3} {b, 4} {h, 5} {e, 6}
for(const auto& key_value : map) {
std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
map.unordered_erase('b');
// Break order: {d, 1} {g, 3} {e, 6} {h, 5}
for(const auto& key_value : map) {
std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
}
Keep a parallel list<string> insertionOrder.
When it is time to print, iterate on the list and do lookups into the map.
each element in insertionOrder // walks in insertionOrder..
print map[ element ].second // but lookup is in map
If you need both lookup strategies, you will end up with two containers. You may use a vector with your actual values (ints), and put a map< string, vector< T >::difference_type> next to it, returning the index into the vector.
To complete all that, you may encapsulate both in one class.
But I believe boost has a container with multiple indices.
What you want (without resorting to Boost) is what I call an "ordered hash", which is essentially a mashup of a hash and a linked list with string or integer keys (or both at the same time). An ordered hash maintains the order of the elements during iteration with the absolute performance of a hash.
I've been putting together a relatively new C++ snippet library that fills in what I view as holes in the C++ language for C++ library developers. Go here:
https://github.com/cubiclesoft/cross-platform-cpp
Grab:
templates/detachable_ordered_hash.cpp
templates/detachable_ordered_hash.h
templates/detachable_ordered_hash_util.h
If user-controlled data will be placed into the hash, you might also want:
security/security_csprng.cpp
security/security_csprng.h
Invoke it:
#include "templates/detachable_ordered_hash.h"
...
// The 47 is the nearest prime to a power of two
// that is close to your data size.
//
// If your brain hurts, just use the lookup table
// in 'detachable_ordered_hash.cpp'.
//
// If you don't care about some minimal memory thrashing,
// just use a value of 3. It'll auto-resize itself.
int y;
CubicleSoft::OrderedHash<int> TempHash(47);
// If you need a secure hash (many hashes are vulnerable
// to DoS attacks), pass in two randomly selected 64-bit
// integer keys. Construct with CSPRNG.
// CubicleSoft::OrderedHash<int> TempHash(47, Key1, Key2);
CubicleSoft::OrderedHashNode<int> *Node;
...
// Push() for string keys takes a pointer to the string,
// its length, and the value to store. The new node is
// pushed onto the end of the linked list and wherever it
// goes in the hash.
y = 80;
TempHash.Push("key1", 5, y++);
TempHash.Push("key22", 6, y++);
TempHash.Push("key3", 5, y++);
// Adding an integer key into the same hash just for kicks.
TempHash.Push(12345, y++);
...
// Finding a node and modifying its value.
Node = TempHash.Find("key1", 5);
Node->Value = y++;
...
Node = TempHash.FirstList();
while (Node != NULL)
{
if (Node->GetStrKey()) printf("%s => %d\n", Node->GetStrKey(), Node->Value);
else printf("%d => %d\n", (int)Node->GetIntKey(), Node->Value);
Node = Node->NextList();
}
I ran into this SO thread during my research phase to see if anything like OrderedHash already existed without requiring me to drop in a massive library. I was disappointed. So I wrote my own. And now I've shared it.
Here is solution that requires only standard template library without using boost's multiindex:
You could use std::map<std::string,int>; and vector <data>; where in map you store the index of the location of data in vector and vector stores data in insertion order. Here access to data has O(log n) complexity. displaying data in insertion order has O(n) complexity. insertion of data has O(log n) complexity.
For Example:
#include<iostream>
#include<map>
#include<vector>
struct data{
int value;
std::string s;
}
typedef std::map<std::string,int> MapIndex;//this map stores the index of data stored
//in VectorData mapped to a string
typedef std::vector<data> VectorData;//stores the data in insertion order
void display_data_according_insertion_order(VectorData vectorData){
for(std::vector<data>::iterator it=vectorData.begin();it!=vectorData.end();it++){
std::cout<<it->value<<it->s<<std::endl;
}
}
int lookup_string(std::string s,MapIndex mapIndex){
std::MapIndex::iterator pt=mapIndex.find(s)
if (pt!=mapIndex.end())return it->second;
else return -1;//it signifies that key does not exist in map
}
int insert_value(data d,mapIndex,vectorData){
if(mapIndex.find(d.s)==mapIndex.end()){
mapIndex.insert(std::make_pair(d.s,vectorData.size()));//as the data is to be
//inserted at back
//therefore index is
//size of vector before
//insertion
vectorData.push_back(d);
return 1;
}
else return 0;//it signifies that insertion of data is failed due to the presence
//string in the map and map stores unique keys
}
You cannot do that with a map, but you could use two separate structures - the map and the vector and keep them synchronized - that is when you delete from the map, find and delete the element from the vector. Or you could create a map<string, pair<int,int>> - and in your pair store the size() of the map upon insertion to record position, along with the value of the int, and then when you print, use the position member to sort.
One thing you need to consider is the small number of data elements you are using. It is possible that it will be faster to use just the vector. There is some overhead in the map that can cause it to be more expensive to do lookups in small data sets than the simpler vector. So, if you know that you will always be using around the same number of elements, do some benchmarking and see if the performance of the map and vector is what you really think it is. You may find the lookup in a vector with only 50 elements is near the same as the map.
Another way to implement this is with a map instead of a vector. I will show you this approach and discuss the differences:
Just create a class that has two maps behind the scenes.
#include <map>
#include <string>
using namespace std;
class SpecialMap {
// usual stuff...
private:
int counter_;
map<int, string> insertion_order_;
map<string, int> data_;
};
You can then expose an iterator to iterator over data_ in the proper order. The way you do that is iterate through insertion_order_, and for each element you get from that iteration, do a lookup in the data_ with the value from insertion_order_
You can use the more efficient hash_map for insertion_order since you don't care about directly iterating through insertion_order_.
To do inserts, you can have a method like this:
void SpecialMap::Insert(const string& key, int value) {
// This may be an over simplification... You ought to check
// if you are overwriting a value in data_ so that you can update
// insertion_order_ accordingly
insertion_order_[counter_++] = key;
data_[key] = value;
}
There are a lot of ways you can make the design better and worry about performance, but this is a good skeleton to get you started on implementing this functionality on your own. You can make it templated, and you might actually store pairs as values in data_ so that you can easily reference the entry in insertion_order_. But I leave these design issues as an exercise :-).
Update: I suppose I should say something about efficiency of using map vs. vector for insertion_order_
lookups directly into data, in both cases are O(1)
inserts in the vector approach are O(1), inserts in the map approach are O(logn)
deletes in the vector approach are O(n) because you have to scan for the item to remove. With the map approach they are O(logn).
Maybe if you are not going to use deletes as much, you should use the vector approach. The map approach would be better if you were supporting a different ordering (like priority) instead of insertion order.
This is somewhat related to Faisals answer. You can just create a wrapper class around a map and vector and easily keep them synchronized. Proper encapsulation will let you control the access method and hence which container to use... the vector or the map. This avoids using Boost or anything like that.
// Should be like this man!
// This maintains the complexity of insertion is O(logN) and deletion is also O(logN).
class SpecialMap {
private:
int counter_;
map<int, string> insertion_order_;
map<string, int> insertion_order_reverse_look_up; // <- for fast delete
map<string, Data> data_;
};
There is no need to use a separate std::vector or any other container for keeping track of the insertion order. You can do what you want as shown below.
If you want to keep the insertion order then you can use the following program(version 1):
Version 1: For counting unique strings using std::map<std::string,int> in insertion order
#include <iostream>
#include <map>
#include <sstream>
int findExactMatchIndex(const std::string &totalString, const std::string &toBeSearched)
{
std::istringstream ss(totalString);
std::string word;
std::size_t index = 0;
while(ss >> word)
{
if(word == toBeSearched)
{
return index;
}
++index;
}
return -1;//return -1 when the string to be searched is not inside the inputString
}
int main() {
std::string inputString = "this is a string containing my name again and again and again ", word;
//this map maps the std::string to their respective count
std::map<std::string, int> wordCount;
std::istringstream ss(inputString);
while(ss >> word)
{
//std::cout<<"word:"<<word<<std::endl;
wordCount[word]++;
}
std::cout<<"Total unique words are: "<<wordCount.size()<<std::endl;
std::size_t i = 0;
std::istringstream gothroughStream(inputString);
//just go through the inputString(stream) instead of map
while( gothroughStream >> word)
{
int index = findExactMatchIndex(inputString, word);
if(index != -1 && (index == i)){
std::cout << word <<"-" << wordCount.at(word)<<std::endl;
}
++i;
}
return 0;
}
The output of the above program is as follows:
Total unique words are: 9
this-1
is-1
a-1
string-1
containing-1
my-1
name-1
again-3
and-2
Note that in the above program, if you have a comma or any other delimiter then it is counted as a separate word. So for example lets say you have the string this is, my name is then the string is, has count of 1 and the string is has count of 1. That is is, and is are different. This is because the computer doesn't know our definition of a word.
Note
The above program is a modification of my answer to How do i make the char in an array output in order in this nested for loop? which is given as version 2 below:
Version 2: For counting unique characters using std::map<char, int> in insertion order
#include <iostream>
#include <map>
int main() {
std::string inputString;
std::cout<<"Enter a string: ";
std::getline(std::cin,inputString);
//this map maps the char to their respective count
std::map<char, int> charCount;
for(char &c: inputString)
{
charCount[c]++;
}
std::size_t i = 0;
//just go through the inputString instead of map
for(char &c: inputString)
{
std::size_t index = inputString.find(c);
if(index != inputString.npos && (index == i)){
std::cout << c <<"-" << charCount.at(c)<<std::endl;
}
++i;
}
return 0;
}
In both cases/versions there is no need to use a separate std::vector or any other container to keep track of the insertion order.
Use boost::multi_index with map and list indices.
A map of pair (str,int) and static int that increments on insert calls indexes pairs of data. Put in a struct that can return the static int val with an index () member perhaps?

How to use STL unsorted key-value as pair in map

I need to use STL C++ map to store key value pairs.
I need to store more than one data information in stl map.
e.g
Need to store DataType,Data and its behavior as(in param/outparam) all in string format.
But map always use key value pair
so if I
store it like
std::map<map<"int","50",>"behavior">.
But always it sorts the the data on basis of keys which I dont want. If I use like ..
pair<string, pair<string,string> >;
pair<string, pair<string,string>>("int",("100","in"));
This prompts compile time error!
error C2664: 'std::pair<_Ty1,_Ty2>::pair(const std::pair<_Ty1,_Ty2> &)' : cannot convert parameter 1 from 'const char *' to 'const std::pair<_Ty1,_Ty2> &'
What should be the exact solution of the above problem?
Regards
If you don't want ordering, don't use ordered containers like map or set. You could achieve what you're looking for using a class and a vector. The sorted nature of std::map is what makes it efficient to lookup by key. If you want/need unsorted and more hash like behavior for lookups, look at Boost unordered containers. Note that this doesn't guarantee the order in is going to be the order out. I'm assuming you want to preserve the order of the types your putting in the container, the example below will do that.
#include <iostream>
#include <vector>
using namespace std;
class DataBehavior
{
public:
DataBehavior(const string &type,
const string &data,
const string &behavior)
:type(type), data(data), behavior(behavior)
{
}
const string &getType() const { return type; }
const string &getData() const { return data; }
const string &getBehavior() const { return behavior; }
private:
string type;
string data;
string behavior;
};
typedef vector<DataBehavior> Ctr;
int main (int argc, char *argv[])
{
Ctr ctr;
ctr.push_back(DataBehavior("int", "50", "behavior"));
ctr.push_back(DataBehavior("int", "80", "behavior"));
ctr.push_back(DataBehavior("string", "25", "behavior2"));
for (Ctr::const_iterator it = ctr.begin(); it != ctr.end(); ++it)
{
cout << "Type: " << it->getType() << " "
<< "Data: " << it->getData() << " "
<< "Behavior: " << it->getBehavior() << "\n";
}
return 1;
}
You're declaring the variables wrong in the
pair<string, pair<string,string> >;
pair<string, pair<string,string>>("int",("100","in"));
You'd need to do
pair<string, pair<string,string>>("int",pair<string,string>("100","in"));
which is fairly verbose. You could use the make_pair function to significantly shorten that:
make_pair( "int", make_pair( "100", "in" ) );
Also, as the others have said, if you don't want sorted order, use unordered_map from TR1 or Boost. If you're on MS you can use hash_map instead.
Need to store DataType,Data and its
behavior as(in param/outparam) all in
string format. But map always use key
value pair
To me it seems you need to use different data structure, e.g.
typedef std::tuple<std::string, // DataType
std::string, // Data
std::string> element_type; // behaviour
// use can replace tuple with sth like:
//struct element_type {
// std::string m_DataType;
// std::string m_Data;
// std::string m_Behaviour;
// ... ctors, operators ...
//};
// or even with std::array<std::string, 3>
std::set<element_type> elements_set;
// or
typedef int32_t element_id;
std::map<element_id, element_type> elements_map;
And you've got rather a design problem than a programming one.
Why do you use a map in the first place? What do you want to use as lookup key -- the datatype? The value? The "param/outparam" characteristic?
Whatever you want to use as a key should go first, and the rest could be a struct / pointer / string (*urg*) / ... .
So I suggest you give some more thought to your needs and design a datastructure accordingly.

c++ container search by key and value

I'm trying to build a container of string representations of ordinal numbers, searchable both by the string and by the number. For example, I can do it trivially, but inefficiently, with an array:
std::string ordinalStrings = {"zeroth", "first", "second",..}
getting the string with ordinalStrings[number], and the integer with a while(not found) loop, but I figure that one of the STL containers would do a better job, just not sure which one.
It needn't be searchable by both, for example, simply getting the numerical key of a map with a string key would work, but there doesn't seem to be a function for it, and besides, it seems a little messy.
As an aside, is it possible to have a constant STL container? I only need the searching functionality, not the insertion. Also, since I assume I can't, is it possible to fill the container without an initializer function? That is, can a header simply say:
std::map<std::string, int> ordinals;
ordinals["zero"] = 0;
ordinals["one"] = 1;
...
It just seems silly to have an initializer function for what is essentially a constant value;
Thanks,
Wyatt
Stephen suggested using Boost.MultiIndex, however it's a bit an overkill.
Boost.Bimap has been developped over it, and offer extra functionalities. It is in fact tailored right for this task. It's also one of the few libraries which has (imho) a good documentation :)
Here is an example right from this page.
#include <iostream>
#include <boost/bimap.hpp>
struct country {};
struct place {};
int main()
{
using namespace boost::bimaps;
// Soccer World cup.
typedef bimap
<
tagged< std::string, country >,
tagged< int , place >
> results_bimap;
typedef results_bimap::value_type position;
results_bimap results;
results.insert( position("Argentina" ,1) );
results.insert( position("Spain" ,2) );
results.insert( position("Germany" ,3) );
results.insert( position("France" ,4) );
std::cout << "Countries names ordered by their final position:"
<< std::endl;
for( results_bimap::map_by<place>::const_iterator
i = results.by<place>().begin(),
iend = results.by<place>().end() ;
i != iend; ++i )
{
std::cout << i->get<place >() << ") "
<< i->get<country>() << std::endl;
}
std::cout << std::endl
<< "Countries names ordered alphabetically along with"
"their final position:"
<< std::endl;
for( results_bimap::map_by<country>::const_iterator
i = results.by<country>().begin(),
iend = results.by<country>().end() ;
i != iend; ++i )
{
std::cout << i->get<country>() << " ends "
<< i->get<place >() << "º"
<< std::endl;
}
return 0;
}
You can provide two different lookup mechanisms using boost::multi_index (here's an example of a bidirectional map using multi_index). Another (maybe simpler) option is to maintain two containers: one to lookup by ordinal, one to search by string. You could use two std::map or a std::map and a std::vector (for constant time lookup of ordinality).
As an aside, is it possible to have a constant STL container? I only need the searching functionality, not the insertion.
Yes, but you must initialize a non-const container. You can then copy that into a const container. For example, you can use a helper function: const std::map<string, int> ordinals = create_map();
That is, can a header simply say: ...
No, you should initialize the container within a proper control flow.

C++ STL map I don't want it to sort!

This is my code
map<string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(map<string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
cout<< (*i).first << ":"<<(*i).second<<endl;
}
Expected output:
B:123
A:321
But output it gives is:
A:321
B:123
I want it to maintain the order in which keys and values were inserted in the map<string,int>.
Is it possible? Or should I use some other STL data structure? Which one?
There is no standard container that does directly what you want. The obvious container to use if you want to maintain insertion order is a vector. If you also need look up by string, use a vector AND a map. The map would in general be of string to vector index, but as your data is already integers you might just want to duplicate it, depending on your use case.
Like Matthieu has said in another answer, the Boost.MultiIndex library seems the right choice for what you want. However, this library can be a little tough to use at the beginning especially if you don't have a lot of experience with C++. Here is how you would use the library to solve the exact problem in the code of your question:
struct person {
std::string name;
int id;
person(std::string const & name, int id)
: name(name), id(id) {
}
};
int main() {
using namespace::boost::multi_index;
using namespace std;
// define a multi_index_container with a list-like index and an ordered index
typedef multi_index_container<
person, // The type of the elements stored
indexed_by< // The indices that our container will support
sequenced<>, // list-like index
ordered_unique<member<person, string,
&person::name> > // map-like index (sorted by name)
>
> person_container;
// Create our container and add some people
person_container persons;
persons.push_back(person("B", 123));
persons.push_back(person("C", 224));
persons.push_back(person("A", 321));
// Typedefs for the sequence index and the ordered index
enum { Seq, Ord };
typedef person_container::nth_index<Seq>::type persons_seq_index;
typedef person_container::nth_index<Ord>::type persons_ord_index;
// Let's test the sequence index
persons_seq_index & seq_index = persons.get<Seq>();
for(persons_seq_index::iterator it = seq_index.begin(),
e = seq_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// And now the ordered index
persons_ord_index & ord_index = persons.get<Ord>();
for(persons_ord_index::iterator it = ord_index.begin(),
e = ord_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// Thanks to the ordered index we have fast lookup by name:
std::cout << "The id of B is: " << ord_index.find("B")->id << "\n";
}
Which produces the following output:
B:123
C:224
A:321
A:321
B:123
C:224
The id of B is: 123
Map is definitely not right for you:
"Internally, the elements in the map are sorted from lower to higher key value following a specific strict weak ordering criterion set on construction."
Quote taken from here.
Unfortunately there is no unordered associative container in the STL, so either you use a nonassociative one like vector, or write your own :-(
I had the same problem every once in a while and here is my solution: https://github.com/nlohmann/fifo_map. It's a header-only C++11 solution and can be used as drop-in replacement for a std::map.
For your example, it can be used as follows:
#include "fifo_map.hpp"
#include <string>
#include <iostream>
using nlohmann::fifo_map;
int main()
{
fifo_map<std::string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(fifo_map<std::string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout<< (*i).first << ":"<<(*i).second << std::endl;
}
}
The output is then
B:123
A:321
Besides Neil's recommendation of a combined vector+map if you need both to keep the insertion order and the ability to search by key, you can also consider using boost multi index libraries, that provide for containers addressable in more than one way.
maps and sets are meant to impose a strict weak ordering upon the data. Strick weak ordering maintains that no entries are equavalent (different to being equal).
You need to provide a functor that the map/set may use to perform a<b. With this functor the map/set sorts its items (in the STL from GCC it uses a red-black tree). It determines weather two items are equavalent if !a<b && !b<a -- the equavelence test.
The functor looks like follows:
template <class T>
struct less : binary_function<T,T,bool> {
bool operator() (const T& a, const T& b) const {
return a < b;
}
};
If you can provide a function that tells the STL how to order things then the map and set can do what you want. For example
template<typename T>
struct ItemHolder
{
int insertCount;
T item;
};
You can then easily write a functor to order by insertCount. If your implementation uses red-black trees your underlying data will remain balanced -- however you will get a lot of re-balancing since your data will be generated based on incremental ordering (vs. Random) -- and in this case a list with push_back would be better. However you cannot access data by key as fast as you would with a map/set.
If you want to sort by string -- provide the functor to search by string, using the insertCount you could potentiall work backwards. If you want to search by both you can have two maps.
map<insertcount, string> x; // auxhilary key
map<string, item> y; //primary key
I use this strategy often -- however I have never placed it in code that is run often. I'm considering boost::bimap.
Well, there is no STL container which actually does what you wish, but there are possibilities.
1. STL
By default, use a vector. Here it would mean:
struct Entry { std::string name; int it; };
typedef std::vector<Entry> container_type;
If you wish to search by string, you always have the find algorithm at your disposal.
class ByName: std::unary_function<Entry,bool>
{
public:
ByName(const std::string& name): m_name(name) {}
bool operator()(const Entry& entry) const { return entry.name == m_name; }
private:
std::string m_name;
};
// Use like this:
container_type myContainer;
container_type::iterator it =
std::find(myContainer.begin(), myContainer.end(), ByName("A"));
2. Boost.MultiIndex
This seems way overkill, but you can always check it out here.
It allows you to create ONE storage container, accessible via various indexes of various styles, all maintained for you (almost) magically.
Rather than using one container (std::map) to reference a storage container (std::vector) with all the synchro issues it causes... you're better off using Boost.
For preserving all the time complexity constrains you need map + list:
struct Entry
{
string key;
int val;
};
typedef list<Entry> MyList;
typedef MyList::iterator Iter;
typedef map<string, Iter> MyMap;
MyList l;
MyMap m;
int find(string key)
{
Iter it = m[key]; // O(log n)
Entry e = *it;
return e.val;
}
void put(string key, int val)
{
Entry e;
e.key = key;
e.val = val;
Iter it = l.insert(l.end(), e); // O(1)
m[key] = it; // O(log n)
}
void erase(string key)
{
Iter it = m[key]; // O(log n)
l.erase(it); // O(1)
m.erase(key); // O(log n)
}
void printAll()
{
for (Iter it = l.begin(); it != l.end(); it++)
{
cout<< it->key << ":"<< it->val << endl;
}
}
Enjoy
You could use a vector of pairs, it is almost the same as unsorted map container
std::vector<std::pair<T, U> > unsorted_map;
Use a vector. It gives you complete control over ordering.
I also think Map is not the way to go. The keys in a Map form a Set; a single key can occur only once. During an insert in the map the map must search for the key, to ensure it does not exist or to update the value of that key. For this it is important (performance wise) that the keys, and thus the entries, have some kind of ordering. As such a Map with insert ordering would be highly inefficient on inserts and retrieving entries.
Another problem would be if you use the same key twice; should the first or the last entry be preserved, and should it update the insert order or not?
Therefore I suggest you go with Neils suggestion, a vector for insert-time ordering and a Map for key-based searching.
Yes, the map container is not for you.
As you asked, you need the following code instead:
struct myClass {
std::string stringValue;
int intValue;
myClass( const std::string& sVal, const int& iVal ):
stringValue( sVal ),
intValue( iVal) {}
};
std::vector<myClass> persons;
persons.push_back( myClass( "B", 123 ));
persons.push_back( myClass( "A", 321 ));
for(std::vector<myClass>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout << (*i).stringValue << ":" << (*i).intValue << std::endl;
}
Here the output is unsorted as expected.
Map is ordered collection (second parametr in template is a order functor), as set. If you want to pop elements in that sequenses as pushd you should use deque or list or vector.
In order to do what they do and be efficient at it, maps use hash tables and sorting. Therefore, you would use a map if you're willing to give up memory of insertion order to gain the convenience and performance of looking up by key.
If you need the insertion order stored, one way would be to create a new type that pairs the value you're storing with the order you're storing it (you would need to write code to keep track of the order). You would then use a map of string to this new type for storage. When you perform a look up using a key, you can also retrieve the insertion order and then sort your values based on insertion order.
One more thing: If you're using a map, be aware of the fact that testing if persons["C"] exists (after you've only inserted A and B) will actually insert a key value pair into your map.
Instead of map you can use the pair function with a vector!
ex:
vector<::pair<unsigned,string>> myvec;
myvec.push_back(::pair<unsigned,string>(1,"a"));
myvec.push_back(::pair<unsigned,string>(5,"b"));
myvec.push_back(::pair<unsigned,string>(3,"aa"));`
Output:
myvec[0]=(1,"a"); myvec[1]=(5,"b"); myvec[2]=(3,"aa");
or another ex:
vector<::pair<string,unsigned>> myvec2;
myvec2.push_back(::pair<string,unsigned>("aa",1));
myvec2.push_back(::pair<string,unsigned>("a",3));
myvec2.push_back(::pair<string,unsigned>("ab",2));
Output: myvec2[0]=("aa",1); myvec2[1]=("a",3); myvec2[2]=("ab",2);
Hope this can help someone else in the future who was looking for non sorted maps like me!
struct Compare : public binary_function<int,int,bool> {
bool operator() (int a, int b) {return true;}
};
Use this to get all the elements of a map in the reverse order in which you entered (i.e.: the first entered element will be the last and the last entered element will be the first). Not as good as the same order but it might serve your purpose with a little inconvenience.
Use a Map along with a vector of iterators as you insert in Map. (Map iterators are guaranteed not to be invalidated)
In the code below I am using Set
set<string> myset;
vector<set<string>::iterator> vec;
void printNonDuplicates(){
vector<set<string>::iterator>::iterator vecIter;
for(vecIter = vec.begin();vecIter!=vec.end();vecIter++){
cout<<(*vecIter)->c_str()<<endl;
}
}
void insertSet(string str){
pair<set<string>::iterator,bool> ret;
ret = myset.insert(str);
if(ret.second)
vec.push_back(ret.first);
}
If you don't want to use boost::multi_index, I have put a proof of concept class template up for review here:
https://codereview.stackexchange.com/questions/233157/wrapper-class-template-for-stdmap-stdlist-to-provide-a-sequencedmap-which
using std::map<KT,VT> and std::list<OT*> which uses pointers to maintain the order.
It will take O(n) linear time for the delete because it needs to search the whole list for the right pointer. To avoid that would need another cross reference in the map.
I'd vote for typedef std::vector< std::pair< std::string, int > > UnsortedMap;
Assignment looks a bit different, but your loop remains exactly as it is now.
There is std::unordered_map that you can check out. From first view, it looks like it might solve your problem.

Checking value exist in a std::map - C++

I know find method finds the supplied key in std::map and return an iterator to the element. Is there anyway to find the value and get an iterator to the element? What I need to do is to check specified value exist in std::map. I have done this by looping all items in the map and comparing. But I wanted to know is there any better approach for this.
Here is what I have wrote
bool ContainsValue(Type_ value)
{
bool found = false;
Map_::iterator it = internalMap.begin(); // internalMap is std::map
while(it != internalMap.end())
{
found = (it->second == value);
if(found)
break;
++it;
}
return found;
}
Edit
How about using another map internally which stores value,key combination. So I can call find on it? Is find() in std::map doing sequential search?
Thanks
You can use boost::multi_index to create a bidirectional map - you can use either value of the pair as a key to do a quick lookup.
If you have access to the excellent boost library then you should be using boost::multi_index to create bidirectional map as Mark says. Unlike a std::map this allows you to look up by either the key or the value.
If you only have the STL to hand the following code will do the trick (templated to work with any kind of map where the mapped_type supports operator==):
#include <map>
#include <string>
#include <algorithm>
#include <iostream>
#include <cassert>
template<class T>
struct map_data_compare : public std::binary_function<typename T::value_type,
typename T::mapped_type,
bool>
{
public:
bool operator() (typename T::value_type &pair,
typename T::mapped_type i) const
{
return pair.second == i;
}
};
int main()
{
typedef std::map<std::string, int> mapType;
mapType map;
map["a"] = 1;
map["b"] = 2;
map["c"] = 3;
map["d"] = 4;
map["e"] = 5;
const int value = 3;
std::map<std::string, int>::iterator it = std::find_if( map.begin(), map.end(), std::bind2nd(map_data_compare<mapType>(), value) );
if ( it != map.end() )
{
assert( value == it->second);
std::cout << "Found index:" << it->first << " for value:" << it->second << std::endl;
}
else
{
std::cout << "Did not find index for value:" << value << std::endl;
}
}
How about using another map internally which stores value,key combination. So I can call find on it?
Yes: maintain two maps, with one map using one type of key and the other using the other.
Is find() in std::map doing sequential search?
No it's a binary search of a sorted tree: its speed is O(log(n)).
Look into boost's bidirectional maps: http://www.boost.org/doc/libs/1_38_0/libs/bimap/doc/html/index.html
It lets both values act like a key.
Otherwise, iteration is the way to go.
What you are requesting is precisely what std::find does (not the member function)
template< class InputIt, class T >
InputIt find( InputIt first, InputIt last, const T& value );
No, you have to loop over the std::map and check all values manually. Depending on what you want to do, you could wrap the std::map in a simple class that also caches all of the values that are inserted into the map in something that's easily search-able and doesn't allow duplicates, like a std::set. Don't inherit from the std::map (it doesn't have a virtual destructor!), but wrap it so that you can do something like this:
WrappedMap my_map< std::string, double >;
my_map[ "key" ] = 99.0;
std::set< double > values = my_map.values(); // should give back a set with only 99.0 in it
An alternative to rolling your own would be to use the Boost bidirectional map, which is easily found in the posts below or by Google.
It really depends on what you want to do, how often you want to do it, and how hard it is to roll your own little wrapper class versus installing and using Boost. I love Boost, so that's a good way to go - but there's something nice and complete about making your own wrapper class. You have the advantage of understanding directly the complexity of operations, and you may not need the full reverse mapping of values => keys that's provided by the Boost bidirectional map.
Not a very best option but might be useful in few cases where user is assigning default value like 0 or NULL at initialization.
Ex.
< int , string >
< string , int >
< string , string >
consider < string , string >
mymap["1st"]="first";
mymap["second"]="";
for (std::map<string,string>::iterator it=mymap.begin(); it!=mymap.end(); ++it)
{
if ( it->second =="" )
continue;
}
I am adding this answer, if someone come here and looks for c++11 and above..
//DECLARE A MAP
std::map<int, int> testmap;
//SAMPLE DATA
testmap.insert(std::make_pair(1, 10));
testmap.insert(std::make_pair(2, 20));
testmap.insert(std::make_pair(3, 30));
testmap.insert(std::make_pair(4, 20));
//ELEMENTS WITH VALUE TO BE FOUND
int value = 20;
//RESULTS
std::map<int, int> valuesMatching;
//ONE STEP TO FIND ALL MATCHING MAP ELEMENTS
std::copy_if(testmap.begin(), testmap.end(), std::inserter(valuesMatching, valuesMatching.end()), [value](const auto& v) {return v.second == value; });
Possible that I don't fully understand what you're trying to accomplish. But to simply test whether or not a map contains a value, I believe you can use the std::map's built in find.
bool ContainsValue(Type_ value)
{
return (internalMap.find(value) != internalMap.end());
}