C++ - Modifying key for all map elements


Let's consider this code:
std::map< int, char > charMap;
for( auto& i : charMap )
{
charMap[ i.first + 1 ] = charMap[ i.first ];
charMap.erase( i.first );
}
Let's say that the map has some values with random keys. I am trying to shift the keys by 1.
This won't work because the loop goes on forever.
Is there a fast way to make it work?

In C++17, you can use node extraction and splicing (see P0083R3):
std::map<int, char> tmpMap;
for (auto it = charMap.begin(); it != charMap.end(); )
{
auto nh = charMap.extract(it++); // node handle
++nh.key();
tmpMap.insert(tmpMap.end(), std::move(nh));
}
tmpMap.swap(charMap);
The loop extracts consecutive map nodes, mutates them, and reinserts the node into tmpMap (now with the different key). At the end, charMap is empty and tmpMap contains all the elements with their modified keys, so we swap the two.
Before C++17, you would have to copy (or move) the value data around to insert a new element with a new key.
std::map<int, char> tmpMap;
for (auto & p : charMap)
tmpMap.emplace_hint(tmpMap.end(), p.first + 1, std::move(p.second));
tmpMap.swap(charMap);
This requires memory allocations for the nodes, though, so the new splicing-based solution is more efficient.
In either case we can use the hinted insertion, because we are reconstructing elements in the same order and so the newest element is always inserted at the end.
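For reference, the C++17 loop above can be wrapped into a small function that compiles on its own. This is only a sketch: the name shiftKeys and the delta parameter are made up for illustration, and it assumes the same offset is applied to every key (so the relative order is unchanged) and that the shifted keys do not overflow int.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical helper (not part of the original answer): shifts every key of
// charMap by delta using C++17 node extraction, so no element is reallocated.
void shiftKeys(std::map<int, char>& charMap, int delta)
{
    std::map<int, char> tmpMap;
    while (!charMap.empty())
    {
        auto nh = charMap.extract(charMap.begin()); // node handle for the smallest key
        nh.key() += delta;                          // the key is mutable while detached
        tmpMap.insert(tmpMap.end(), std::move(nh)); // hint: this key is the largest so far
    }
    tmpMap.swap(charMap);
}
```

Calling shiftKeys(charMap, 1) on a map with keys 0, 5 and 7 should leave it with keys 1, 6 and 8 and the same values.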

Ad hoc solution using the known impact on order
You could simply opt for a backward iteration, starting from the last element:
for( auto pi = charMap.end(); pi-- != charMap.begin(); pi=charMap.erase( pi ))
charMap[ pi->first + 1 ] = charMap[ pi->first ];
This will not loop forever here, because the new element that you insert will always be after the current one and will hence not be reprocessed again and again.
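One caveat with the loop as written: the condition pi-- != charMap.begin() still decrements pi when it already equals begin() (on the final check, or immediately for an empty map), which is formally undefined behaviour even though it tends to work in practice. A possible variant that avoids this is sketched below; the function name shiftKeysUpInPlace is invented for illustration.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper: shifts every key up by one, in place, walking from the
// largest key down so that a freshly inserted key is never visited again.
void shiftKeysUpInPlace(std::map<int, char>& charMap)
{
    auto pi = charMap.end();
    while (pi != charMap.begin())
    {
        --pi;                                // safe: pi is not begin() at this point
        charMap[pi->first + 1] = pi->second; // create the entry one key above
        pi = charMap.erase(pi);              // returns the element after the erased one
    }
}
```

Because the walk goes from the largest key downwards, the key one above the current element has always been vacated already, so nothing is overwritten.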
More general solution
For a more general transformation where you can't be sure about the impact on element ordering, I'd rather go for a std::transform():
std::map<int, char> tmp;
std::transform(charMap.begin(), charMap.end(), std::inserter(tmp,tmp.begin()),
[](auto e) { return std::make_pair(e.first+1, e.second); });
std::swap(tmp, charMap); // the old map will be discarded when tmp goes out of scope

You cannot use this kind of range iteration for two fundamental reasons:
The first reason is that a fundamental property of a map is that iterating over the map iterates in key order.
You are iterating over the map. So, if the first key in the map is key 0, you will copy the value to key 1. Then, you iterate to the next key in the map, which is the key 1 that you just created, and then copy it to key 2. Lather, rinse, repeat.
There are several ways to solve this, but none of that matters because of a second fundamental aspect of the map:
charMap[1]=charMap[0];
This copies charMap[0] to charMap[1]. It does nothing to charMap[0]. It is still there. Nothing happened to it. So, presuming that the lowest key in the map is 0, and you shifted the keys correctly, you will still have a value in the map with key 0. Ditto for everything else in the map.
But let's say you solved the first problem in one of the several ways that it could be solved. Then, let's say your map has values for keys 0, 5, and 7.
After you copy key #0 to key #1, key #5 to key #6, and key #7 to key #8, take a paper and pencil, and figure out what you have now in your map.
Answer: it is not going to be keys 1, 6, and 8. It will be keys 0, 1, 5, 6, 7, and 8.
All you did was copy each value to the next key. This is because a computer does exactly what you tell it to do, no more, no less. A computer does not do what you want it to do.
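The paper-and-pencil exercise can also be checked with a few lines of code. This is a minimal sketch with an invented function name; it copies each value to the next key without erasing anything and returns the keys that end up in the map.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: copies every value to key + 1 without erasing the
// original key, then returns all keys present afterwards.
std::vector<int> keysAfterCopyOnly(std::map<int, char> m)
{
    // Snapshot the original keys first, so the loop never sees keys it creates.
    std::vector<int> original;
    for (const auto& p : m)
        original.push_back(p.first);

    // Go from the largest key down so adjacent keys are not clobbered early.
    for (auto it = original.rbegin(); it != original.rend(); ++it)
        m[*it + 1] = m[*it];

    std::vector<int> keys;
    for (const auto& p : m)
        keys.push_back(p.first);
    return keys;
}
```

For a map with keys 0, 5 and 7 this returns 0, 1, 5, 6, 7, 8, which is the result described above.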
The easiest way to do this is to create a new map, and copy the contents of the old map to the new map, with an updated key value. You can still use range iteration for that. Then, replace the old map with the new map.
Of course, this becomes impractical if the map is very large. In that case, it is still possible to do this without using a second map, but the algorithm is going to be somewhat complicated. The capsule summary is:
1) Iterate over the keys in reverse order. You can't use range iteration here.
2) After copying the value to the next key in the map, explicitly remove the original key.

Related

Validity of std::map::iterator after erasing elements

I have written code to solve the following problem: we have a map<double,double> with a (relatively) huge number of items. We want to merge adjacent items in order to reduce the size of the map while keeping a certain "loss factor" as low as possible.
To do so, I first populate a list containing adjacent iterators and the associated loss factor. Let's say each list element has the following type:
struct myPair {
map<double,double>::iterator curr, next;
double loss;
myPair(map<double,double>::iterator c, map<double,double>::iterator n,
double l): curr(c), next(n), loss(l) {}
};
This is done as follows:
for (map<double,double>::iterator it1 = myMap.begin(); it1 != --(myMap.end());
it1++) {
map<double,double>::iterator it2 = it1; it2++;
double l = computeLoss(it1,it2);
List.push(myPair(it1,it2,l));
}
Then, I find the list element corresponding to the lowest loss factor, erase the corresponding elements from the map and insert a new element (result of merging curr and next) in the map. Since this also changes the list elements corresponding to the element after next or before curr I update the corresponding entries and also the associated loss factor.
(I don't get into the details of how to implement the above efficiently but basically I am combining a double linked list and a heap).
Although the erase operations should not invalidate the remaining iterators, for some specific inputs I get a "double free or corruption" error exactly at the point where I attempt to erase the elements from the map.
I tried to track this and it seems this happens when both first and second entries of the two map elements are very close (more precisely when the firsts of curr and next are very close).
A strange thing is that I put an assert while populating the list to ensure that in all entries curr and next are different and the same assert in the loop of removing elements. The second one fails!
I would appreciate it if anyone could help me.
P.S. I am sorry for not being very precise, but I wanted to keep the details to a minimum.
UPDATE: This is (a very simplified version of) how I erase the elements from the map:
while (myMap.size() > MAX_SIZE) {
t = list.getMin();
/* compute the merged version ... let's call the result as (a,b) */
myMap.erase(t.curr);
myMap.erase(t.next);
myMap.insert(pair<double,double>(a,b));
/* update the adjacent entries */
}

Iterators stored in myPair that refer to erased elements stay invalid after the container is modified. You should avoid this technique. Perhaps if you look into the header file you will find something ready-made for your task?

As other people have already mentioned, it turns out that using double as the key of the map is problematic, in particular when the values are computed.
Hence, my solution was to use std::multimap instead of map (and then merge the elements with the same key just after populating the map). With this, even if a is very close to the keys of t.curr and t.next or of any other element, the insert operation is guaranteed to create a new element, so no existing iterator in the list can point to it.
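One plausible way this kind of failure arises (an assumption, since the full program is not shown) is that the merged key a compares equal to a key already in the map. In that case std::map::insert does nothing and returns an iterator to the existing element, so any bookkeeping that assumes a fresh element was created goes wrong. The sketch below, with invented helper names, shows the difference between the two containers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Returns true when the insertion actually created a new element.
// std::map::insert leaves the map untouched if an equal key already exists.
bool insertedNewElement(std::map<double, double>& m, double key, double value)
{
    return m.insert({key, value}).second;
}

// std::multimap::insert always creates a new element, even for an equal key.
std::size_t sizeAfterInsert(std::multimap<double, double>& mm, double key, double value)
{
    mm.insert({key, value});
    return mm.size();
}
```

With a map that already holds key 1.0, insertedNewElement(m, 1.0, 99.0) returns false and the stored value is unchanged, whereas the multimap grows by one element.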

Get index of element in C++ map

I have a std::map called myMap in my C++ application, and I want to get an element using either myMap.find(key) or myMap[key]. However, I would also like to get the index of that element in the map.
std::map<string, int> myMap;
// Populate myMap with a bunch of items...
myElement = myMap["myKey"];
// Now I need to get the index of myElement in myMap
Is there a clean way to do that?
Thank you.

I came here looking for this answer, and this is what I found:
the distance function takes two iterators and returns the number of steps between them, which is the index when the first iterator is begin().
cout << distance(mymap.begin(),mymap.find("198765432"));
Hope this helps :D
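Note that find() returns end() when the key is missing, in which case the expression above yields the size of the map rather than a valid index. A small wrapper can make that case explicit. This is a sketch; the name indexOfKey and the -1 convention are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical helper: returns the zero-based position of key in iteration
// order, or -1 if the key is absent. This takes linear time, because map
// iterators are bidirectional and std::distance steps through them one by one.
std::ptrdiff_t indexOfKey(const std::map<std::string, int>& m, const std::string& key)
{
    auto it = m.find(key);
    if (it == m.end())
        return -1;
    return std::distance(m.begin(), it);
}
```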

A std::map doesn't really have an index, instead it has an iterator for a key / value pair. This is similar to an index in that it represents a position of sorts in the collection but it is not numeric. To get the iterator of a key / value pair use the find method
std::map<string, int>::iterator it = myMap.find("myKey");

Most of the time when you are working with indices and maps, it usually means that your map is fixed after some insertions. If this assumption holds true for your use case, you can use my answer.
If your map is already fixed (you wouldn't add/delete any key afterward), and you want to find an index of a key, just create a new map that maps from key to index.
std::map<string, int> key2index; // you can use unordered_map for it to be faster
int i = 0;
for (const auto& entry : yourMap) {
key2index[entry.first] = i++;
}
From this key2index map you can query the key as often as you like. Just call key2index["YourKey"] to get your index.
The benefit of this method over the distance function is access time complexity: O(log n) with std::map, or O(1) on average with unordered_map, instead of O(n). That is much faster if you query often.
Extra Section
If you want to do the opposite, you want to access key from index then do the following.
Create an array or vector that stores keys of your entire map. Then you can access the key by specifying the index.
vector<string> keys;
for (const auto& entry : yourMap) {
keys.push_back(entry.first);
}
To access the element at index i of your map, use yourMap[keys[i]]. Looking up keys[i] is O(1) because it only touches the vector; the map lookup that follows still costs O(log n).

Well, a map keeps the key and the data as a pair,
so you can extract the key by dereferencing the map's iterator into a pair, or directly into the pair's first element.
std::map<string, int> myMap;
std::map<string, int>::iterator it;
for(it=myMap.begin();it!=myMap.end();it++)
{
std::cout<<it->first<<std::endl;
}

Use
int k = distance(mymap.begin(), mymap.find(mykey));
It will give you the index of the key element.

There is no such thing as an index in a map. Maps are not necessarily stored as a sequence of "pairs" (and indeed they are not in most implementations).
Regardless of the implementation, however, std::map does not model a container having an index.
Depending on what you are asking this question for, the "index" can be an iterator (as suggested by others) or the key itself.
However, it sounds strange you asked this question. If you could give us a bit more details we would probably be able to point you to a better solution to your problem.

The semantics of a map do not include indexes. To understand that, note that maps are typically implemented as trees. Therefore, their elements do not have an index (try to define an index in a natural way for a tree).

A map is a key-value data structure which internally stores its data in a tree structure. The solutions stated above are O(n).
"distance(mymap.begin(),mymap.find("198765432"))" has to walk the tree element by element, so it will not give you an efficient answer.
For your requirement, you have to build your own segment-tree-like data structure to get competitive O(log n) operations.

A use case: as you progress through a vector, you want to know how many items so far are smaller than or equal to the current one (constraint: i <= j; how many v[i] are smaller than or equal to v[j]). Let's insert the items into a map or set.
vector<int> v={1, 4, 2, 3};
set<int> s;
s = {1}; // 1's position is 1 (one-based)
s = {1, 4}; // 4's position is 2
s = {1, 2, 4}; // 2's position is 2
s = {1, 2, 3, 4}; // 3's position is 3
It seems std::distance would need O(n) time.
I could achieve the same effect using set.lower_bound() and counting backward to set.begin(). Does anyone have a better solution than one requiring O(n), perhaps using additional data structures?
OK, on second thought, here is a solution that stores the index (one-based) for this specific problem. However, it may not give the correct index of items in the finished map.
vector<int> arr={1 , 1 , 2, 4, 2};
multimap<int, int> track;
for(auto a:arr)
{
auto it = track.insert(make_pair(a, 1)); //first item is 1
if(it!=track.begin())
{
--it;
int prev=it->second;
it++;
it->second+=prev;
}
cout<<a<<','<<it->second-1<<endl;
}
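For the running-count use case described above, one possible alternative to walking the set is a Fenwick tree (binary indexed tree) over the ranks of the values, which answers each query in O(log n). This is a sketch under the assumption that the whole vector is known up front, so the values can be ranked before processing; the function name is invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// For each j, counts how many v[i] with i <= j satisfy v[i] <= v[j].
// The element itself is included, so the result is a one-based position.
std::vector<int> countSmallerOrEqualSoFar(const std::vector<int>& v)
{
    // Coordinate compression: rank each value among the distinct values.
    std::vector<int> sorted(v);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

    std::vector<int> tree(sorted.size() + 1, 0); // Fenwick tree, one-based
    std::vector<int> result;
    result.reserve(v.size());

    for (int x : v)
    {
        const std::size_t rank = static_cast<std::size_t>(
            std::lower_bound(sorted.begin(), sorted.end(), x) - sorted.begin()) + 1;

        // Point update: record one more occurrence of this rank.
        for (std::size_t i = rank; i < tree.size(); i += i & (~i + 1))
            ++tree[i];

        // Prefix sum: occurrences of every rank up to and including this one.
        int count = 0;
        for (std::size_t i = rank; i > 0; i -= i & (~i + 1))
            count += tree[i];
        result.push_back(count);
    }
    return result;
}
```

For v = {1, 4, 2, 3} this yields the positions 1, 2, 2, 3 listed in the comments above.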

QMultiHash insert() behavior for duplicates

I have a QMultiHash<Key, Value*>. I may have more than one Value* per Key so I do want to store every Value* that corresponds to each Key, but I don't want to store exact duplicates where key1 == key2 && value1 == value2 more than once.
If I call QMultiHash::insert( Key, Value* ) with a Key/Value* pair that is already in the hash, will it add a second copy? In other words, if I call insert() multiple times with the same Key/Value* pair, and then call QMultiHash::values( Key ) will I get back the same Value* once, or will I get a list with the Value* occurring the same number of times that I called insert?

Yes, it will add a second copy. QMultiHash, by definition, allows multiple values associated with a given key. That's the "Multi" part of the QMultiHash. For example,
QMultiHash<int, int> multi; //multi.size() = 0
multi.insert(5, 1); //multi.size() = 1
multi.insert(5, 2); //multi.size() = 2
QList<int> list(multi.values(5)); //list = {2, 1};
If you want to enforce unique keys, you should use QHash to communicate this fact to other programmers and check 'QHash::contains(key)' prior to insertion. Also note that insertion order matters!
Similarly, QMultiHash allows duplicate key-value pairs, not just duplicate keys. For example,
QMultiHash<int, int> multi; //multi.size() = 0
multi.insert(5, 2); //multi.size() = 1
multi.insert(5, 2); //multi.size() = 2
QList<int> list(multi.values(5)); //list = {2, 2};
If you want to allow multiple values with a single key but still enforce unique key-value pairs, you have to manually check for the unique pair prior to insertion using QMultiHash::contains(key, value).
Both of these facts are intended features in Qt, putting the burden on the programmer to enforce uniqueness instead of taking a performance hit doing that check with every insertion. This is what a C++ programmer should expect from a well-designed class.
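The check-before-insert idea can be sketched without Qt by using std::unordered_multimap as a stand-in. This is a deliberate substitution so that the example compiles with the standard library alone; with Qt the same role is played by QMultiHash::contains(key, value) followed by insert(). The helper name insertUniquePair is invented.

```cpp
#include <cassert>
#include <unordered_map>

// Standard-library analogue of checking for an existing key/value pair before
// inserting into a multi-hash. Returns true if the pair was actually added.
bool insertUniquePair(std::unordered_multimap<int, int>& multi, int key, int value)
{
    auto range = multi.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second == value)
            return false; // this exact key/value pair is already stored
    multi.insert({key, value});
    return true;
}
```

The scan only touches the values stored under the given key, so the extra cost is proportional to the number of values for that key, not to the size of the container.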

No, you will not get back the Value* only once. You will get back one copy for every time you called insert. QMultiHash::values( key ) will return a QList<Value*> that contains the same number of duplicates as calls to QMultiHash::insert(). This was determined by running a test to see what would happen.

How to access/iterate over all non-unique keys in an unordered_multimap?

I would like to access/iterate over all non-unique keys in an unordered_multimap.
The hash table is basically a map from a signature <SIG>, which does indeed occur more than once in practice, to identifiers <ID>. I would like to find those entries in the hash table where <SIG> occurs more than once.
Currently I use this approach:
// map <SIG> -> <ID>
typedef unordered_multimap<int, int> HashTable;
HashTable& ht = ...;
for(HashTable::iterator it = ht.begin(); it != ht.end(); ++it)
{
size_t n=0;
std::pair<HashTable::iterator, HashTable::iterator> itpair = ht.equal_range(it->first);
for ( ; itpair.first != itpair.second; ++itpair.first) {
++n;
}
if( n > 1 ){ // access those items again as the previous iterators are not valid anymore
std::pair<HashTable::iterator, HashTable::iterator> itpair = ht.equal_range(it->first);
for ( ; itpair.first != itpair.second; ++itpair.first) {
// do something with those items
}
}
}
This is certainly not efficient as the outer loop iterates over all elements of the hash table (via ht.begin()) and the inner loop tests if the corresponding key is present more than once.
Is there a more efficient or elegant way to do this?
Note: I know that with an unordered_map instead of an unordered_multimap I wouldn't have this issue, but due to application requirements I must be able to store multiple keys <SIG> pointing to different identifiers <ID>. Also, an unordered_map<SIG, vector<ID> > is not a good choice for me, as it uses roughly 150% of the memory: I have many unique keys, and vector<ID> adds quite a bit of overhead for each item.

Use std::unordered_multimap::count() to determine the number of elements with a specific key. This saves you the first inner loop.
You cannot avoid iterating over the whole HashTable. For that, the HashTable would have to maintain a second index that maps cardinality to keys. This would introduce significant runtime and storage overhead and is only useful in a small number of cases.
You can hide the outer loop using std::for_each(), but I don't think it's worth it.
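A sketch of how count() and equal_range() might be combined so that each key group is examined only once: elements with equivalent keys are adjacent in the iteration order of an unordered_multimap, so the outer iterator can jump straight past a group. The function name is invented, and the result is sorted only to make the output deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

typedef std::unordered_multimap<int, int> HashTable;

// Collects the identifiers of every signature that occurs more than once.
std::vector<int> idsOfRepeatedSignatures(const HashTable& ht)
{
    std::vector<int> ids;
    for (auto it = ht.begin(); it != ht.end(); )
    {
        auto range = ht.equal_range(it->first);
        if (ht.count(it->first) > 1)
            for (auto jt = range.first; jt != range.second; ++jt)
                ids.push_back(jt->second); // "do something with those items"
        it = range.second;                 // jump past the whole key group
    }
    std::sort(ids.begin(), ids.end());
    return ids;
}
```

The outer loop still visits every distinct key once, which matches the point above that a full pass cannot be avoided.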

I think that you should change your data model to something like:
std::map<int, std::vector<int> > ht;
Then you could easily iterate over map, and check how many items each element contains with size()
But in this situation building a data structure and reading it in linear mode is a little bit more complicated.

Inserting elements at desired positions in a STL map

map <int, string> rollCallRegister;
map <int, string> :: iterator rollCallRegisterIter;
map <int, string> :: iterator tempRollCallRegisterIter;
rollCallRegisterIter = rollCallRegister.begin ();
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (55, "swati"));
rollCallRegisterIter++;
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (44, "shweta"));
rollCallRegisterIter++;
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (33, "sindhu"));
// Displaying contents of this map.
cout << "\n\nrollCallRegister contains:\n";
for (rollCallRegisterIter = rollCallRegister.begin(); rollCallRegisterIter != rollCallRegister.end(); ++rollCallRegisterIter)
{
cout << (*rollCallRegisterIter).first << " => " << (*rollCallRegisterIter).second << endl;
}
Output:
rollCallRegister contains:
33 => sindhu
44 => shweta
55 => swati
I have incremented the iterator. Why is it still getting sorted? And if the position is supposed to be changed by the map on its own, then what's the purpose of providing an iterator?

Because std::map is a sorted associative container.
In a map, the key value is generally used to uniquely identify the element, while the mapped value is some sort of value associated to this key.
According to the reference documentation, the position parameter is
the position of the first element to be compared for the insertion
operation. Notice that this does not force the new element to be in
that position within the map container (elements in a set always
follow a specific ordering), but this is actually an indication of a
possible insertion position in the container that, if set to the
element that precedes the actual location where the element is
inserted, makes for a very efficient insertion operation. iterator is
a member type, defined as a bidirectional iterator type.
So the purpose of this parameter is mainly to speed up insertion slightly by narrowing the range of elements that must be searched.
You can use std::vector<std::pair<int,std::string>> if the order of insertion is important.

The interface is indeed slightly confusing, because it looks very much like std::vector<int>::insert (for example) and yet does not produce the same effect...
For associative containers, such as set, map and the new unordered_set and co, you completely relinquish the control over the order of the elements (as seen by iterating over the container). In exchange for this loss of control, you gain efficient look-up.
It would not make sense to suddenly give you control over the insertion, as it would let you break invariants of the container, and you would lose the efficient look-up that is the reason to use such containers in the first place.
And thus insert(It position, value_type&& value) does not insert at said position...
However, this gives us some room for optimization: when inserting an element in an associative container, a look-up needs to be performed to locate where to insert this element. By letting you specify a hint, you are given an opportunity to help the container speed up the process.
This can be illustrated for a simple example: suppose that you receive elements already sorted by way of some interface, it would be wasteful not to use this information!
template <typename Key, typename Value, typename InputStream>
void insert(std::map<Key, Value>& m, InputStream& s) {
typename std::map<Key, Value>::iterator it = m.begin();
for (; s; ++s) {
it = m.insert(it, *s); // the hinted overload returns an iterator, not a pair
}
}
Some of the items might not be well sorted, but it does not matter, if two consecutive items are in the right order, then we will gain, otherwise... we'll just perform as usual.

The map is always sorted, but you give a "hint" as to where the element may go as an optimisation.
Insertion is normally O(log N), but if you correctly tell the container where the element goes, it is amortized constant time.
Thus if you are creating a large container from already-sorted values, each value is inserted at the end, although the tree will still need rebalancing quite a few times.
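A minimal sketch of hinted insertion for input that is assumed to be sorted already (the function name buildFromSorted is made up for illustration). If the assumption is wrong the hint is simply ignored, so the resulting map is still ordered by key.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Builds a map from input that is assumed to be sorted by key, passing end()
// as the hint so that each insertion can skip the usual look-up.
std::map<int, std::string> buildFromSorted(
    const std::vector<std::pair<int, std::string>>& sortedInput)
{
    std::map<int, std::string> m;
    for (const auto& entry : sortedInput)
        m.insert(m.end(), entry); // a wrong hint is harmless, only slower
    return m;
}
```

Feeding it the entries 55, 44, 33 in that order still produces a map that iterates as 33, 44, 55, which is the behaviour the question observed.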

As sad_man says, it's associative. If you set a value with an existing key through operator[], you overwrite the previous value; insert() leaves the existing element unchanged.
Now the iterators are necessary because you don't know what the keys are, usually.
