std::map override element under certain circumstances at insertion time - c++

let's say I have a map whose key is a pair and whose custom comparator guarantees unicity against the first element of that pair.
class comparator
{
public:
bool operator()(const std::pair<std::string, std::int>& left,
const std::pair<std::string, std::int>& right)
{
return left.first < right.first;
}
};
std::map<std::pair<std::string, std::int>, foo, comparator>;
Now I'd like this map to be more intelligent than that, if possible.
Instead of being rejected at insertion time in case a key with the same string as first element of the pair already exists, I'd to overwrite the "already existing element" if the pair's integer (.second) of the "possibly going to be inserted element" is bigger.
Of course I can do this by looking in to the map for the key, getting the key details and overwriting it if necessary.
Alternatively I could adopt a post-insertion approach with a multimap on top of which I would iterate to clean up duplicates keeping just the key with the biggest pair integer.
The question is : can I do that natively by overriding part of the stl implementation ([] operator - insert method) or improving my custom comparator and then simply relying on map's insert method ?
I don't know if this is accepted but we could imagine having a non const comprator which would be able of updating the already stored (key, value) pair under certain circumstances.

ValueThe answer to your question is that you cannot do it.
There are two problems with your proposed implementation:
The keys must remain const as they are the index for the map
Independent of what the comparator did to the elements it is comparing the std::map would still insert the item before or after left based on the return of the comparator
The solution to the problem is as suggested by #MvG. Your key should not be paired, it is your value that should be paired.
This has the added benefit that you don't need a custom comparator.
The problem is that you will need a custom inserter:
std::pair< int, foo >& tempValue = _myMap[ keyToInsert ];
if( valueToInsert.first >= tempValue.first )
{
tempValue = valueToInsert;
}
Note that this will only work if all the valueToInsert.firsts that you use are positive, cause the default constructor for an int is 0. If you had negative valueToInsert.firsts the default constructed value pair would be inserted instead of your element.

Related

implement linear probing in c++ for a generic type

I wanted to implement linear probing for hashtabe in c++,but the key,value
pair would be of generic type like: vector< pair< key,value> >(where key,value is of generic type).
Now,in linear probing if a cell is occupied we traverse the vector until we find an empty cell and then place the new pair in that cell.
The problem is,that in generic type,how would I be able to check if a particular cell is occupied or not?
I cannot use these conditions:
if(key == '\0')//As key is not of type string
OR
if(key == 0)//As key is not of type int
So,how will be able to check if a paricular cell in the vector is empty or not?
Please correct me if I misunderstood the concept.
I think you can just check if the element of the vector has meaningful key and value:
if(vector[i] == std::pair<key, value>())
//empty
or, if you only care about the keys
if(vector[i].first == key())
//empty
This approach assumes that the default constructor of key type constructs an object that would be considered an "empty" or "invalid" key.
You could use a free function isEmpty for checking if a key type is empty. Define a templated default function that works for most types, and make special functions that cannot be handled by the default.
E.g.
template<typename T>
bool isEmpty(const T &t) {
return !t;
}
bool isEmpty(const std::string &s) {
return s.length() == 0;
}
bool isEmpty(double d) {
return d < 0;
}
isEmpty(0); // true
isEmpty(1); // false
isEmpty(std::string()); // true
isEmpty(std::string("not empty")); // false
isEmpty(1.0); // false
isEmpty(-1.0); // true
You only need to specialize for key types that do not have an operator ! or where a different logic is needed for the check
In case you don't want to rule out the possibility to have "default constructed" elements in your hash you can build a parallel data structure, like a std::bitset (if you know in advance the maximum size of your data structure) or a std::vector<bool>, that in the following i will call has. has[i] will be true if vector[i] contains a valid, actually inserted element.
So the operations should be modified as follows:
insert: go on scanning the vector until you find a position i such that has[i] == false
remove element in position i: set has[i] to false
Hope this helps

STL container that combines sequential ordering with constant access time?

This strikes me as something I should have been able to find on Stackoverflow, but maybe I'm searching for the wrong terms here.
I have the scenario where there is a class
class Foo
{
int key;
int b;
...
}
and I want to push new elements of that class onto a list. Number of items is unknown beforehand.
At the same time, I want to be able to quickly check the existence of (and retrieve) an element with a certain key, e.g. key==5.
So, to summarize:
It should have O(1) (or thereabouts) for existence/retrieval/deletion
It should retain the order of the pushed items
One solution to this strikes me a "use a std::list to store the items, and std::unordered_map to retrieve them, with some book-keeping". I could of course implement it myself, but was wondering whether something like this already exists in a convenient form in STL.
EDIT: To preempt the suggestion, std::map is not a solution, because it orders based on some key. That is, it won't retain the ordering in which I pushed the items.
STL does not have the kind of container, you may write your own using std::unordered_map and std::list. Map your keys to the list iterators in std::unordered_map and store key-value pairs in std::list.
void add(const Key& key, const Value& value)
{
auto iterator = list.insert(list.end(), std::pair<Key, Value>(key, value));
map[key] = iterator;
}
void remove(const Key& key)
{
auto iterator = map[key];
map.erase(key);
list.erase(iterator);
}
Or use boost multi-index container
http://www.boost.org/doc/libs/1_57_0/libs/multi_index/doc/tutorial/index.html

C++ find and erase a multimap element

I need to add, store and delete some pairs of objects, e.g. Person-Hobby. Any person can have several hobbies and several persons can have the same hobby. So, multimap is a good container, right?
Before adding a pair I need to know, if it's not added yet. As I can see here there is no standard class-method to know, if the concrete pair e.g. Peter-Football exists in the MM. Thus, I've written a method which returns a positive integer (equal to the distance between mm.begin() and pair iterator) if the pair exists and -1 otherwise.
Then I need to delete some pair. I call my find method, which returns, some positive integer. I call myMultiMap.erase(pairIndex); but the pair is not being deleted for some reason. That is my problem. Obviously the erase method needs an iterator, not the int. The question is: how do I convert an integer to an iterator?
Thanks!
UPDATE:
I've tried this c.begin() + int_value but got an error error: no match for ‘operator+’ on this line....
Not that I favour your approach, but if the int is distance between begin() and the iterator in question, you can just use
c.begin() + int_value
or
std::advance(c.begin(), int_value)
to get the iterator. The second version is needed for iterators which are not random-access-iterators.
In the interest of your personal sanity (and the program's speed), I'd suggest you return the iterator directly in some form.
There are many possible interfaces that solve this one way or the other. What I would call the "old C way" would be returning by an out parameter:
bool find_stuff(stuff, container::iterator* out_iter) {
...
if(found && out_iter)
*out_iter = found_iter;
return found;
}
use it:
container::iterator the_iter;
if(find_stuff(the_stuff, &the_iter)) ...
or
if(find_stuff(the_stuff, 0)) // if you don't need the iterator
This is not idiomatic C++, but Linus would be pleased with it.
The second possible and theoretically sound version is using something like boost::optional to return the value. This way, you return either some value or none.
boost::optional<container::iterator> find_stuff(stuff) {
...
if(found && out_iter)
return found_iter;
return boost::none;
}
Use:
boost::optional<container::iterator> found = find_stuff(the_stuff);
if(found) {
do something with *found, which is the iterator.
}
or
if(find_stuff(the_stuff)) ...
Third possible solution would be going the std::set::insert way, ie. returning a pair consisting of a flag and a value:
std::pair<bool, container::iterator> find_stuff(stuff) {
...
return std::make_pair(found, found_iter);
}
Use:
std::pair<bool, container::iterator> found = find_stuff(the_stuff);
if(found.first) ...
Consider to change your mulitmap<Person,Hoobby> to set<pair<Person,Hobby> > - then you will not have problems you have now. Or consider to change to map<Person, set<Hobby> >. Both options will not allow to insert duplicate pairs.
Use 2 sets(not multi sets) one for hobbies and one for persons, these two acts as filters so you don't add the same person twice(or hobbie). the insert opertations on these sets gives the iterator for the element that is inserted(or the "right" iterator for element if it allready was inserted). The two iterators you get from inserting into hobbies_set and person_set are now used as key and value in a multimap
Using a third set(not multi_set) for the relation instead of a multi_map, may give the advantage of not needing to check before inserting a relation if it is allready there it will not be added again, and if it's not there it will be added. In both ways it will return an iterator and bool(tells if it was allready there or if it was added)
datastructures:
typedef std::set<Hobbie> Hobbies;
typedef std::set<Person> Persons;
typedef std::pair<Hobbies::iterator,bool> HobbiesInsertRes;
typedef std::pair<Persons::iterator,bool> PersonsInsertRes;
struct Relation {
Hobbies::iterator hobbieIter;
Persons::iterator personIter;
// needed operator<(left for the as an exercies for the reader);
};
typedef set<Relation> Relations;
Hobbies hobbies;
Persons persons;
Relations relations;
insert:
HobbiesInsertRes hres = hobbies.insert(Hobbie("foo"));
PersonsInsertRes pres = persons.insert(Person("bar"));
relations.insert(Relation(hres.first, pres.first));
// adds the relation if does not exists, if it allready did exist, well you only paid the same amount of time that you would have if you would to do a check first.
lookup:
// for a concrete Person-Hobbie lookup use
relations.find(Relation(Hobbie("foo"),Person("Bar")));
// to find all Hobbies of Person X you will need to do some work.
// the easy way, iterate all elements of relations
std::vector<Hobbie> hobbiesOfX;
Persons::iterator personX = persons.find(Person("bar"));
std::for_each(relations.begin(), relations.end(), [&hobbiesOfBar, personX](Relation r){
if(r.personIter = personX)
hobbiesOfX.push_back(r.hobbieIter);
});
// other way to lookup all hobbies of person X
Persons::iterator personX = persons.find(Person("bar"));
relations.lower_bound(Relation(personX,Hobbies.begin());
relations.upper_bound(Relation(personX,Hobbies.end());
// this needs operator< on Relation to be implemented in a way that does ordering on Person first, Hobbie second.

predicate for a map from string to int

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that only appear once, will be ordered to be at the beginning, then the words that appeared twice 7 so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If some one could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}
std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
You can't resort a map, it's order is predefined (by default, from std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which will be ordered on the key type (int, the number of occurences), which will give you the order that you want, without having to define a predicate.
You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
A map has its keys sorted, not its values. That's what makes the map efficent. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)
As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.
You probably want to transform map<string,int> to vector<pair<const string, int> > then sort the vector on the int member.
You could do
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
Now
std::vector< std::map<std::string,int>::value_type > byCount;
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );

c++ map find() to possibly insert(): how to optimize operations?

I'm using the STL map data structure, and at the moment my code first invokes find(): if the key was not previously in the map, it calls insert() it, otherwise it does nothing.
map<Foo*, string>::iterator it;
it = my_map.find(foo_obj); // 1st lookup
if(it == my_map.end()){
my_map[foo_obj] = "some value"; // 2nd lookup
}else{
// ok do nothing.
}
I was wondering if there is a better way than this, because as far as I can tell, in this case when I want to insert a key that is not present yet, I perform 2 lookups in the map data structures: one for find(), one in the insert() (which corresponds to the operator[] ).
Thanks in advance for any suggestion.
Normally if you do a find and maybe an insert, then you want to keep (and retrieve) the old value if it already existed. If you just want to overwrite any old value, map[foo_obj]="some value" will do that.
Here's how you get the old value, or insert a new one if it didn't exist, with one map lookup:
typedef std::map<Foo*,std::string> M;
typedef M::iterator I;
std::pair<I,bool> const& r=my_map.insert(M::value_type(foo_obj,"some value"));
if (r.second) {
// value was inserted; now my_map[foo_obj]="some value"
} else {
// value wasn't inserted because my_map[foo_obj] already existed.
// note: the old value is available through r.first->second
// and may not be "some value"
}
// in any case, r.first->second holds the current value of my_map[foo_obj]
This is a common enough idiom that you may want to use a helper function:
template <class M,class Key>
typename M::mapped_type &
get_else_update(M &m,Key const& k,typename M::mapped_type const& v) {
return m.insert(typename M::value_type(k,v)).first->second;
}
get_else_update(my_map,foo_obj,"some value");
If you have an expensive computation for v you want to skip if it already exists (e.g. memoization), you can generalize that too:
template <class M,class Key,class F>
typename M::mapped_type &
get_else_compute(M &m,Key const& k,F f) {
typedef typename M::mapped_type V;
std::pair<typename M::iterator,bool> r=m.insert(typename M::value_type(k,V()));
V &v=r.first->second;
if (r.second)
f(v);
return v;
}
where e.g.
struct F {
void operator()(std::string &val) const
{ val=std::string("some value")+" that is expensive to compute"; }
};
get_else_compute(my_map,foo_obj,F());
If the mapped type isn't default constructible, then make F provide a default value, or add another argument to get_else_compute.
There are two main approaches. The first is to use the insert function that takes a value type and which returns an iterator and a bool which indicate if an insertion took place and returns an iterator to either the existing element with the same key or the newly inserted element.
map<Foo*, string>::iterator it;
it = my_map.find(foo_obj); // 1st lookup
my_map.insert( map<Foo*, string>::value_type(foo_obj, "some_value") );
The advantage of this is that it is simple. The major disadvantage is that you always construct a new value for the second parameter whether or not an insertion is required. In the case of a string this probably doesn't matter. If your value is expensive to construct this may be more wasteful than necessary.
A way round this is to use the 'hint' version of insert.
std::pair< map<foo*, string>::iterator, map<foo*, string>::iterator >
range = my_map.equal_range(foo_obj);
if (range.first == range.second)
{
if (range.first != my_map.begin())
--range.first;
my_map.insert(range.first, map<Foo*, string>::value_type(foo_obj, "some_value") );
}
The insertiong is guaranteed to be in amortized constant time only if the element is inserted immediately after the supplied iterator, hence the --, if possible.
Edit
If this need to -- seems odd, then it is. There is an open defect (233) in the standard that hightlights this issue although the description of the issue as it applies to map is clearer in the duplicate issue 246.
In your example, you want to insert when it's not found. If default construction and setting the value after that is not expensive, I'd suggest simpler version with 1 lookup:
string& r = my_map[foo_obj]; // only lookup & insert if not existed
if (r == "") r = "some value"; // if default (obj wasn't in map), set value
// else existed already, do nothing
If your example tells what you actually want, consider adding that value as str Foo::s instead, you already have the object, so no lookups would be needed, just check if it has default value for that member. And keep the objs in the std::set. Even extending class FooWithValue2 may be cheaper than using map.
But If joining data through the map like this is really needed or if you want to update only if it existed, then Jonathan has the answer.