Partial match for the key of a std::map - c++

I have an std::map and I want to search for a key using a substring. For example, I have the following code:
#include <iostream>
#include <map>
#include <string>
using namespace std;
typedef std::map<std::string, std::string> TStrStrMap;
typedef std::pair<std::string, std::string> TStrStrPair;
int main(int argc, char *argv[])
{
TStrStrMap tMap;
tMap.insert(TStrStrPair("John", "AA"));
tMap.insert(TStrStrPair("Mary", "BBB"));
tMap.insert(TStrStrPair("Mother", "A"));
tMap.insert(TStrStrPair("Marlon", "C"));
return 0;
}
Now I want to find the position of a key by searching for "Marl" rather than the full "Marlon" (the search should also match "Marla" if that key were stored in the map). In other words, I want to find a key that starts with "Marl". I need to find at most one position. Is this possible? If so, how?
I don't want to use any Boost libraries!

You can't efficiently search for substring, but you can for prefix:
#include <iostream>
#include <map>
#include <string>
#include <algorithm>
using namespace std;
typedef map<string, string> TStrStrMap;
typedef pair<string, string> TStrStrPair;
TStrStrMap::const_iterator FindPrefix(const TStrStrMap& map, const string& search_for) {
TStrStrMap::const_iterator i = map.lower_bound(search_for);
if (i != map.end()) {
const string& key = i->first;
if (key.compare(0, search_for.size(), search_for) == 0) // Really a prefix?
return i;
}
return map.end();
}
void Test(const TStrStrMap& map, const string& search_for) {
cout << search_for;
auto i = FindPrefix(map, search_for);
if (i != map.end())
cout << '\t' << i->first << ", " << i->second;
cout << endl;
}
int main(int argc, char *argv[])
{
TStrStrMap tMap;
tMap.insert(TStrStrPair("John", "AA"));
tMap.insert(TStrStrPair("Mary", "BBB"));
tMap.insert(TStrStrPair("Mother", "A"));
tMap.insert(TStrStrPair("Marlon", "C"));
Test(tMap, "Marl");
Test(tMap, "Mo");
Test(tMap, "ther");
Test(tMap, "Mad");
Test(tMap, "Mom");
Test(tMap, "Perr");
Test(tMap, "Jo");
return 0;
}
This prints:
Marl Marlon, C
Mo Mother, A
ther
Mad
Mom
Perr
Jo John, AA

When your substring is a prefix as in your example, you can use lower_bound to search for "Marl".
map<string,string>::const_iterator m = tMap.lower_bound("Marl");
cerr << (*m).second << endl;
This does not work for non-prefix substrings: in the general case, searching a map is not much different from searching other containers.

I'd like to expand on the answer by Sergey by providing a full solution using map::lower_bound(). As mentioned in the comments on that answer, you have to check whether lower_bound() returns tMap.end(). If it does not, you also have to check whether the found key actually starts with the search string. The latter can be checked, for example, with string::compare(). As a result, my C++11 solution looks as follows:
std::map<std::string, std::string> myMap{
{"John", "AA"}, {"Mary", "BBB"}, {"Mother", "A"}, {"Marlon", "C"}, {"Marla", "D"}
};
std::string prefix("Marl");
auto it = myMap.lower_bound(prefix);
if (it != std::end(myMap) && it->first.compare(0, prefix.size(), prefix) == 0)
std::cout << it->first << ": " << it->second << std::endl;
Output:
Marla: D
However, if you want to find all keys in your map that are prefixed with the search string, then you can use the following loop:
for (auto it = myMap.lower_bound(prefix); it != std::end(myMap) && it->first.compare(0, prefix.size(), prefix) == 0; ++it)
std::cout << it->first << ": " << it->second << std::endl;
Output:
Marla: D
Marlon: C

To search for a substring of a key in a map you have no choice but to either use a new map on a special kind of key type or to search your map in O(n). std::map uses (by default) operator<() for ordering keys and for searching, and that compare function for std::string is a plain lexicographical compare.
If you create a new map on a special key type whose operator<() compares on the basis of a substring, take note that this will also affect the decision of whether a new element to insert would be a duplicate. In other words, such a map will only have elements that are not substrings of each other.
The O(n) search practically means you use std::find_if() over the map, with a custom predicate that takes a map element (a std::pair<const std::string, std::string>) and returns true if the search string is a substring of the key (the first element of the pair).

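// Requires <algorithm> and <functional>.
// Note: std::binary_function and std::bind2nd are deprecated since C++11 and removed in C++17.
typedef TStrStrMap::value_type map_value_type;
struct key_contains_substring
: std::binary_function<map_value_type, std::string, bool>
{
// Must be const: std::bind2nd invokes the functor through a const member.
bool operator()(const map_value_type& map_value, const std::string& substr) const
{
return std::search(map_value.first.begin(), map_value.first.end(),
substr.begin(), substr.end()) != map_value.first.end();
}
};
...
TStrStrMap::const_iterator it = std::find_if(tMap.begin(), tMap.end(),
std::bind2nd(key_contains_substring(), std::string("Marl")));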

Related

Composite map: take data from another Map

I need to count the occurrences of words read from a file using a map<string, int>, then copy them into a map<int, vector<string>, cmpDec> and print them in decreasing order.
I tried to read the word frequencies from a file into a map<string, int> and then copy them into a map<int, vector<string>>, with no success.
I have declared 2 maps:
map<string, int> text;
map<int, vector<string>, cmpDec> freq;
I take the text from a file in the first map with the word frequencies:
while (rf >> words) {
text[words]++;
}
Now I have to put the frequencies in the second map (required), which needs an int key for the word frequency, a vector with the words that have that frequency as the value, and a comparator for decreasing frequencies.
I'm trying to put the data from the first map into the second like this:
map<string, int>::iterator iter_map1 = text.begin();
map<int, vector<string>>::iterator iter = freq.begin();
vector<string>::iterator iter_v;
for (; iter_map1 != text.end(); ++iter_map1) {
iter->first.insert(make_pair(iter_map1->second, iter->second.push_back(iter_map1->first)));
}
It gives 2 errors on the iter->second.... line:
...\BagOfWords.cpp|56|error: request for member 'insert' in 'iter.std::_Rb_tree_iterator<_Tp>::operator-><std::pair<const int, std::vector<std::__cxx11::basic_string<char> > > >()->std::pair<const int, std::vector<std::__cxx11::basic_string<char> > >::first', which is of non-class type 'const int'|
and
...\BagOfWords.cpp|56|error: invalid use of void expression|
What am I doing wrong? Is there an easier way to take words (and their frequencies) from a file and put them on the second map without passing from the first?
With C++17 you can do structured binding, which helps a lot when iterating through a map.
#include <functional>
#include <map>
#include <vector>
#include <string>
#include <iostream>
using WordCounts = std::map<std::string, int>;
using FrequencyOfWords = std::map<int, std::vector<std::string>, std::greater<int>>;
int main()
{
WordCounts word_counts;
FrequencyOfWords words_freq;
std::vector<std::string> words = {"test", "hello", "test", "hello", "word"};
for(const auto& word : words)
word_counts[word]++;
for(const auto& [word, count] : word_counts)
words_freq[count].push_back(word);
for (const auto& [freq, words] : words_freq)
{
std::cout << "freq " << freq << " words";
for (auto const& word: words)
std::cout << " " << word;
std::cout << '\n';
}
}
I don't think that you can do this in one pass as you don't know the word counts upfront.
First, a couple of recommendations. Use typedef (or using for C++11 or later). This will save you some typing and also ensure that your types are correct. In your code freq and iter don't have the same underlying container type (they differ in the comparison used).
Secondly, try to use the standard library as much as possible. You don't show cmpDec but I guess that it is a comparator based on greater-than rather than the default less-than. I would prefer to see std::greater<int> rather than a custom comparator.
For your errors, in the line
iter->first.insert(...
iter is at the start of freq and you are trying to insert to first which is int.
This should probably be something like
freq[iter_map1->second].push_back(iter_map1->first);
Breaking that down
freq[iter_map1->second] This uses the int word count from text to lookup an entry in freq. If there is no entry an empty one will be inserted to freq.
.push_back(iter_map1->first) This inserts the string from text to the vector that was found or created in the previous step
Here is a full example of what I think you are trying to achieve.
#include <map>
#include <vector>
#include <string>
#include <functional>
#include <fstream>
#include <iostream>
using std::map;
using std::vector;
using std::string;
using std::greater;
using std::ifstream;
using std::cout;
using WordCounts = map<string, int>;
using FrequencyOfWords = map<int, vector<string>, greater<int>>;
int main()
{
WordCounts text;
FrequencyOfWords freq;
ifstream rf("so26.cpp");
string words;
while (rf >> words)
{
text[words]++;
}
WordCounts::const_iterator iter_map1 = text.begin();
for (; iter_map1 != text.end(); ++iter_map1)
{
freq[iter_map1->second].push_back(iter_map1->first);
}
for (auto const& e: freq)
{
cout << "freq " << e.first << " words";
for (auto const& w: e.second)
{
cout << " " << w;
}
cout << "\n";
}
}
Perhaps I misunderstood the question, but I reckon the following does what you want (I prefer unordered maps, as they are usually faster and the counting itself doesn't need the ordering):
std::unordered_map<std::string,int> word_counts;
std::string word;
while(input >> word)
word_counts[word]++;
std::unordered_map<int,std::vector<std::string>> words_by_freq;
for(const auto& counted : word_counts)
words_by_freq[counted.second].push_back(counted.first);

Efficient way to get key from value when map contain vector of string as value

How can I get the key from a value when the value is a vector of strings, and vice versa? Below is my code.
#include<iostream>
#include<map>
#include<string>
#include <unordered_map>
#include <vector>
using namespace std;
int main()
{
std::unordered_map<std::string, std::vector<std::string>> Mymap;
Mymap["unique1"] = {"hello", "world"};
Mymap["unique2"] = {"goodbye", "goodmorning", "world"};
Mymap["unique3"] = {"sun", "mon", "tue"};
for(auto && pair : Mymap) {
for(auto && value : pair.second) {
std::cout << pair.first<<" " << value<<"\n";
if(value == "goodmorning") // how get key i.e unique2 ?
}}
}
case 1: When value is input. key is output.
Input : goodmorning
output : unique2
case 2: When key is input value is output.
Input : unique3
output: sun ,mon ,tue
Note : No boost library available.
For case 1, a combination of find_if and any_of will do the job.
For case 2, you can simply use the find method of unordered_map.
#include<iostream>
#include<map>
#include<string>
#include <unordered_map>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
unordered_map<string, vector<string>> Mymap;
Mymap["unique1"] = { "hello", "world" };
Mymap["unique2"] = { "goodbye", "goodmorning", "world" };
Mymap["unique3"] = { "sun", "mon", "tue" };
// Case 1
string test_value = "goodmorning";
auto iter1 = find_if(Mymap.begin(), Mymap.end(),
[&test_value](const decltype(*Mymap.begin()) &pair)
{
return any_of(pair.second.begin(), pair.second.end(), [&test_value](const string& str) { return str == test_value; });
});
if (iter1 != Mymap.end())
{
cout << "Key: " << iter1->first << endl;
}
else
{
cout << "No key found for " << test_value << endl;
}
// Case 2
test_value = "unique3";
auto iter2 = Mymap.find(test_value);
if (iter2 != Mymap.end())
{
bool first = true;
for (const auto& v : iter2->second)
{
cout << (first ? "" : ", ") << v;
first = false;
}
cout << endl;
}
else
{
cout << "No value found for key " << test_value << endl;
}
return 0;
}
The key is stored in pair.first. Just use that if your use-case is in loop iteration as you illustrated.
If you mean in any use, without iteration, that is, given a value obtain the associated key, there is not a direct way to do that. You could build inverse maps for each value to key but that would not be really efficient considering also the fact that you would also need unique values.
You could create a second map going the other way, with one entry for every vector element.
If the vector entries are not unique across keys, the second map would need to map each value to a vector of keys, or you could use a multimap.
Also consider a hash map (unordered_map) and std::string_view (C++17) as ways to reduce the memory usage of the second map.
But the best answer would be the Boost two-way map (Boost.Bimap), sorry. Without Boost, you could wrap the two maps in your own class that exposes the functionality of a two-way map.

How to change the key value of inner map in c++?

I'm working with maps and have the following nested map, initialized with some values:
map<string, map<int, int> > wordsMap;
map<int, int> innerMap;
map<int, int>::iterator iti;
for(int i = 2; i < argc; i++)
{
wordsMap[argv[i]].insert(pair<int, int>(0,0));
}
After some processing I'm trying to change the content of the inner map, using the following code:
while(some_condition)
{
i = 0;
for( it = wordsMap.begin() ; it != wordsMap.end(); it++)
{
innerMap = it->second;
int cnt = count(words.begin(), words.end(), it->first);
if(cnt != 0){
wordsMap[it->first][i] = cnt;
}
}
i++;
}
In the above scenario, how can I replace the first key (i.e. "0") and its value, which were used to initialize the inner map, with another key-value pair?
You can't change the key of an element in an std::map. Doing so would break ordering.
Instead, you must insert a new element in the map with the key you want, and delete the previous element from the map.
I'm not sure I understand your intent. I assume you want to store
<KEY : file_name, VALUE : <KEY : line, VALUE : words count>>
and that you don't want to store anything in the inner map when no words are found.
So I wrote the code below.
If you don't want entries in the inner map, just keep it empty by not inserting any key-value pair.
Additionally, since std::map is an associative container that keeps its elements sorted by key, you should avoid trying to change a key after inserting it.
#include <algorithm>
#include <cstdio>
#include <string>
#include <iostream>
#include <vector>
#include <map>
using namespace std;
typedef std::map<int, int> WORDS_COUNT_MAP_T; //for line, words count
typedef std::map<string, WORDS_COUNT_MAP_T> FILE_WORDS_COUNT_MAP_T; //for file name, WORDS_COUNT_MAP_T
int main()
{
FILE_WORDS_COUNT_MAP_T file_words_count_map;
//Input dummy data for test
//init file names
std::vector<string> file_names;
file_names.push_back("first");
file_names.push_back("second");
file_names.push_back("third");
//get and set words count in each file
for_each(file_names.begin(), file_names.end(), [&](const string& file_name)
{
//Just for test
WORDS_COUNT_MAP_T words_count_map;
if(file_name == "second")
{
//not find words, so nothing to do
}
else
{
words_count_map[0] = 10;
words_count_map[1] = 20;
}
file_words_count_map.insert(FILE_WORDS_COUNT_MAP_T::value_type(file_name, words_count_map));
});
//print
for_each (file_words_count_map.begin(), file_words_count_map.end(), [&](FILE_WORDS_COUNT_MAP_T::value_type& file_words_map)
{
cout << "file name : " << file_words_map.first << endl;
WORDS_COUNT_MAP_T words_count_map = file_words_map.second;
for_each (words_count_map.begin(), words_count_map.end(), [](WORDS_COUNT_MAP_T::value_type& words_map)
{
cout << "line : " << words_map.first << ", count : " << words_map.second << endl;
});
cout << "----" << endl;
});
getchar();
return 0;
}
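This code will print the following:
file name : first
line : 0, count : 10
line : 1, count : 20
----
file name : second
----
file name : third
line : 0, count : 10
line : 1, count : 20
----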

Text Histogram, tokens stored in map

I am reading from a file and take the words as tokens with strtok. I am trying to store the words in a map structure. I don't really know how to insert the tokens in the map.
My code so far:
#include <iostream>
#include <string.h>
#include <fstream>
#include <map>
using namespace std;
//std::map <string, int> grade_list;
int main()
{
std::map <string, int> grade_list;
char text[100];
int nr=0, i=1;
char *ptr;
ifstream myfile("ana.txt");
if(!myfile.is_open())
cout << "Could not open file" << endl;
else
{
myfile.get(text, 100);
ptr = strtok(text, " ,.-?!");
while(ptr != NULL)
{
nr++;
cout << ptr << endl;
ptr = strtok(NULL, " ,.-?!");
grade_list.insert(ptr);
i++;
}
}
cout << "\nAveti " << nr << " cuvinte." << endl;
return 0;
}
std::map is an associative container that provides a Key -> Value relationship. In your case it is std::string -> int. So you should specify the Value when inserting, too:
grade_list[ptr] = nr;
Also, instead of a char array and strtok, I suggest using std::string and boost::algorithm::split or boost::tokenizer.
I want to see for each word in the file how many times it appears in the text.
So you have to change the Value type of the map to std::size_t (since you don't need negative values):
std::map <string, std::size_t> grade_list;
And just write:
++grade_list[ptr];
You should probably look at the std::map::insert definition, the value_type parameter is a std::pair< std::string, int > so you should probably write the insert statement as:
grade_list.insert(std::pair< std::string, int >(std::string(ptr), 1));
This will add an entry into the map with the key "token" and the value 1.
What you probably want is to add an entry if it does not exist, or increment its value otherwise.
This can be achieved by writing something like:
if (grade_list.find(ptr) == grade_list.end())
{
// insert new entry
grade_list.insert(std::pair< std::string, int >(std::string(ptr), 1)); // can be written as grade_list[ptr] = 1;
}
else
{
// increment token
grade_list[ptr] += 1; // can be written as grade_list[ptr]++;
}

can I put multimap iteration logic to another function?

I'm particularly interested in looping backwards through the keys without repeating them:
#include <map>
#include <iostream>
std::multimap<int,int> myMap = {
{1,2}, {1,2}, {2,2}, {2,2}, {3,2},
};
int main() {
using namespace std;
cout << "the keys backwards:" << endl;
typedef multimap<int, int> multimap_type;
typedef std::reverse_iterator<multimap_type::iterator> reverse_iterator;
for (auto it = myMap.rbegin(), end = myMap.rend(); it != end; it = reverse_iterator(myMap.lower_bound(it->first)))
{
cout << it->first << endl;
}
}
As you can see, I must repeat the multimap name three times, among other things. Can I write my own function to handle all that and then simply call it in a while or range-based for loop? Like this:
while( (auto it = myIterFunc(myMap)) {
//...
}
for ( auto it : myIterFunc(myMap)) {
//...
}
for ( auto it : myIterFunc(myMap)) {
The names it and myIterFunc imply you are confused about the new range-based for loop. The variable it is not an iterator, it's an element of the range. The function myIterFunc should not return iterators, it should return something that looks like a range i.e. has begin() and end() members that allow iterating over the desired range.
You can use a Boost.Range adaptor to loop through it in reverse:
#include <boost/range/adaptors.hpp>
for (auto& val : boost::adaptors::reverse(myMap))
cout << val.first << endl;
You could combine that with a filter adaptor to skip over duplicate keys. (There is a uniqued adaptor but it uses == to determine uniqueness, instead of only inspecting keys)