Finding the key with most values in map<string, vector<string>>


set<string> myFunc (const map<string, vector<string>>& m)
I want to return, as a set of strings, all the keys that map to the most values (several keys if they map to the same number of values). My attempt was:
set<string> ret;
auto max_e = *max_element(m.begin(), m.end(), [] (const pair<string, vector<string>>& m1, const pair<string, vector<string>>& m2) {
return m1.second.size() < m2.second.size();
});
ret.insert(max_e.first);
return ret;
Logically, this cannot work (I think) since this would only return one key with the highest value. Any ideas?

One way of doing it would be iterating twice:
1st one to get the maximum size out of all keys.
2nd one to get the keys that map to that size.
It should look along the lines of:
set <string> myFunc(const map<string, vector<string>>& m) {
set <string> ret;
size_t maximumSize = 0;
for (const auto& e : m) {
maximumSize = max(maximumSize, e.second.size());
}
for (const auto& e : m) {
if (e.second.size() == maximumSize) {
ret.insert(e.first);
}
}
return ret;
}
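The two loops can also be folded into one. The following is only a sketch of that variant, not part of the answer above: the function name keysWithMostValues is made up for illustration, and the idea is to restart the result set whenever a strictly larger vector shows up.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical single-pass variant: the result set is cleared whenever a
// strictly larger vector is found, so it only ever holds keys that are tied
// for the current maximum.
std::set<std::string> keysWithMostValues(
    const std::map<std::string, std::vector<std::string>>& m) {
    std::set<std::string> ret;
    std::size_t maximumSize = 0;
    for (const auto& e : m) {
        const std::size_t size = e.second.size();
        if (size > maximumSize) {
            maximumSize = size;  // new maximum: earlier keys no longer qualify
            ret.clear();
        }
        if (size == maximumSize) {
            ret.insert(e.first);
        }
    }
    return ret;
}
```

Like the two-pass version, this returns an empty set for an empty map and every key when all vectors are empty. Whether it is actually faster depends on how often the maximum changes, since each change discards the set built so far.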

In addition to @a.Li's answer, if possible, you can also optimize quite a few things along the way.
Of course, iterating the map twice is probably the simplest and least expensive way of solving the issue:
using StringMapType = std::map<std::string, std::vector<std::string>>;
using StringMapVectorType = StringMapType::value_type::second_type;
std::set<StringMapType::key_type> findKeys(const StringMapType &stringMap) {
StringMapVectorType::size_type maximumSize {};
for (const auto &[key, values] : stringMap)
maximumSize = std::max(maximumSize, values.size());
std::set<StringMapType::key_type> results {};
for (const auto &[key, values] : stringMap)
if (values.size() == maximumSize)
results.emplace(key);
return results;
}
However, I would recommend the following, if possible:
if you aren't interested in ordering the keys in your map type, use std::unordered_map,
replace the return value type (std::set) with std::vector, if you aren't interested in the order of the keys stored in the results.
Object lifetime specific optimizations:
use std::string_view for the keys you find; this will avoid additional copies of the strings, assuming they aren't optimized out with short string optimization (the views stay valid only as long as the map entries they refer to),
return an array of iterators, instead of their keys
If applied, the code could look something like this:
std::vector<StringMapType::const_iterator> findKeys(const StringMapType &stringMap) {
StringMapVectorType::size_type maximumSize {};
for (const auto &[key, values] : stringMap)
maximumSize = std::max(maximumSize, values.size());
std::vector<StringMapType::const_iterator> results {};
for (auto iterator = stringMap.cbegin(); iterator !=
stringMap.cend(); ++iterator)
if (const auto &values = iterator->second;
values.size() == maximumSize)
results.emplace_back(iterator);
return results;
}
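The std::string_view suggestion from the list above could be sketched as follows. This is an illustration under assumptions rather than the answer's own code: the name findKeyViews is invented here, and the caller is assumed to keep the map alive and unmodified while the views are in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

using StringMapType = std::map<std::string, std::vector<std::string>>;

// Hypothetical helper: returns non-owning views of the keys with the most
// values. No key string is copied, but each view is valid only while the
// map entry it refers to still exists.
std::vector<std::string_view> findKeyViews(const StringMapType& stringMap) {
    std::size_t maximumSize = 0;
    for (const auto& [key, values] : stringMap)
        maximumSize = std::max(maximumSize, values.size());
    std::vector<std::string_view> results;
    for (const auto& [key, values] : stringMap)
        if (values.size() == maximumSize)
            results.emplace_back(key);  // builds a view onto the stored key
    return results;
}
```

Because std::map iterates in key order, the views come out sorted by key, which is the same order a std::set of the keys would have.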
Of course, if you'd like to avoid the whole issue, you can instead keep your entries sorted by their number of values at the time of insertion using a custom comparator, or find the entry with the most elements in its array and insert the new entry before it (of course, you'd have to use a different container for this, since std::map always keeps its entries ordered by key).
Useful things for the future:
In which scenario do I use a particular STL container?

Related

The best practice for (unordered) map keys and values modification

The map of the form map<long long, vector<long long>> is given. One has to take all keys and values modulo some integer N. Some keys can merge and corresponding values must join accordingly. For example, the map {{1,{2,6,4}}, {5,{8,4,9}}, {10,{5,1,7}}} should be equal to {{1,{2,1,4}}, {0,{0,1,2,3,4}}} after reduction modulo 5.
My way is in using a new map but I think there should be a better way.
code added
vector<long long> tmp;
//integer N, for example N = 5
int N = 5;
unordered_map<long long, vector<long long>> map;
//temporary map
unordered_map<long long, vector<long long>> map_tmp;
for (auto & x : map)
{
tmp.clear();
for (auto & y : x.second) tmp.push_back(y % N);
long long ind = x.first % N;
map_tmp[ind].insert(map_tmp[ind].end(), tmp.begin(), tmp.end());
sort(map_tmp[ind].begin(), map_tmp[ind].end());
map_tmp[ind].erase(unique(map_tmp[ind].begin(), map_tmp[ind].end()), map_tmp[ind].end());
}
map = map_tmp;

Since the values in the map are apparently unique, and must remain unique after the modulo operation is applied, you should use a different data structure. For example:
using Map = std::unordered_map<int, std::set<int>>;
std::set will handle uniqueness and order of items for given key.
Now the whole trick is to inspect API of std::unordered_map and std::set and how item can be inserted there. See:
std::unordered_map::insert
std::set::insert
Note the return value: std::pair<iterator,bool>, which gives you an iterator to the inserted or already existing item in the map/set.
Knowing this, writing code that meets your requirements is quite simple:
using Map = std::unordered_map<int, std::set<int>>;
Map moduloMap(const Map& in, int mod)
{
Map out;
for (const auto& [k, s] : in) {
if (s.empty())
continue;
auto& destSet = out.insert({ k % mod, {} }).first->second;
for (auto x : s) {
destSet.insert(x % mod);
}
}
return out;
}

Sometimes a for loop can be the easiest, clearest way to do something.
map<long long, vector<long long>> result;
for (const auto& [key, vec] : input) {
process (result[key%5], vec);
}
Here process takes the destination vector by (non-const) reference and appends the reduced values from the second (const) argument.
update
After seeing the code you posted, I have several suggestions:
use a set instead. You are spending multiple steps to append the new values, sort the whole thing together, then remove duplicates. Just use a set which maintains a single copy of each value automatically.
use structured binding in your loop. Instead of x.second and x.first you can just name them key and vec as in my earlier post.
Assuming you still need tmp, declare it where you are calling .clear() now, instead of declaring it way up at the top of your code. You don't need to clear it each time through the loop; it will be empty each time through the loop naturally.
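Put together, these suggestions might look like the sketch below. The names InputMap, ReducedMap and reduceModulo are chosen here for illustration only, and the code assumes n > 0 and non-negative keys and values, because operator% keeps the sign of the dividend.

```cpp
#include <map>
#include <set>
#include <vector>

using InputMap = std::map<long long, std::vector<long long>>;
using ReducedMap = std::map<long long, std::set<long long>>;

// Hypothetical helper: reduces every key and value modulo n. Keys that
// collide after reduction share one std::set, which drops duplicates and
// keeps the values sorted without an explicit sort/unique step.
// Assumption: n > 0 and all keys and values are non-negative.
ReducedMap reduceModulo(const InputMap& input, long long n) {
    ReducedMap result;
    for (const auto& [key, vec] : input) {
        auto& dest = result[key % n];  // creates the set on first use
        for (long long value : vec)
            dest.insert(value % n);
    }
    return result;
}
```

With the map from the question, {{1,{2,6,4}}, {5,{8,4,9}}, {10,{5,1,7}}}, and n = 5, keys 5 and 10 both reduce to 0, so their values end up in the same set.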

How to iterate through a container within a const container?

Suppose you have the incomplete function iterate that has the following parameter, a const map that has pair<int, vector<string>> and the following loops:
string iterate(const map<int, vector<string>>& m) {
for (map<int, vector<string>>::const_iterator start_iter = m.begin(); start_iter != m.end(); ++start_iter) {
for (auto vector_iter = m[(*start_iter).first].begin(); vector_iter != m[(*start_iter).first].end(); ++vector_iter) {
}
}
}
Iterating through a const map requires that the iterator be const: map<int, vector<string>>::const_iterator. That makes sense. However, when trying to iterate through the vector within the const map, what type does auto have to be? Or is it not possible to iterate through a container within a const container? I tried replacing auto with vector<string>::const_iterator, but the function still fails. Am I missing something?

Really, do yourself a favor and use range based for loops:
for (auto&& [key, vec] : m) {
for (auto&& vec_element : vec) {
// All vec elements
}
}
If you don't want to use C++17 (you should, C++17 is great!) then do that:
for (auto&& m_element : m) {
for (auto&& vec_element : m_element.second) {
// All vec elements
}
}
If you really want to use iterators (you really should use range based for loops) then continue reading.
You are not using the iterator correctly. You take an iterator that points to an element in the map, then use its key to look the same element up again. This is not how to use an iterator.
Instead, you should use the object the iterator is pointing to directly.
for (auto start_iter = m.begin(); start_iter != m.end(); ++start_iter) {
for (auto vector_iter = start_iter->second.begin(); vector_iter != start_iter->second.end(); ++vector_iter) {
// stuff with vector_iter
}
}
Now why does operator[] fail?
That's because it's a std::map. With a map, you should be able to create something on the fly:
// creates element at key `123`
m[123] = {"many", "strings", "in", "vector"};
For the container to create an element at a new key, it must store that new element in itself, so operator[] has to be able to mutate the map. That is why operator[] is not a const member function and cannot be called on a const map.
You could also have used std::map::at(), but it isn't worth it in your case. Using either operator[] or map.at(key) makes the map search for the key, which takes logarithmic time on every call.
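For cases where a lookup by key on a const map really is needed, the const-qualified members find() and at() are available. The helpers below are a small sketch with invented names (countStrings, stringsAt), meant only to show the two calls side by side.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using IntStringsMap = std::map<int, std::vector<std::string>>;

// Hypothetical helper: counts the strings stored under `key`, or returns 0
// if the key is absent. find() is const-qualified and never inserts.
std::size_t countStrings(const IntStringsMap& m, int key) {
    const auto it = m.find(key);
    return it == m.end() ? 0 : it->second.size();
}

// Hypothetical helper: at() is const-qualified as well, but it throws
// std::out_of_range when the key is missing.
const std::vector<std::string>& stringsAt(const IntStringsMap& m, int key) {
    return m.at(key);
}
```

Both still search the map on every call, so inside a loop over the map itself the iterator (or the range based for loop) remains the better choice.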

Getting all the keys of a map of the form <pair<int,int>, int*> in C++

In my C++ code I am using a map like this:
std::map<std::pair<int,int>,int*> patterns;
The problem is that I cannot figure out how I get all the keys of that map which are of the form
pair<int,int>
I have seen a few questions related to it, but in all the cases keys are single integers.

If you wanted to just iterate through all the keys:
C++03
for (std::map<std::pair<int,int>,int*>::iterator I = patterns.begin(); I != patterns.end(); I++) {
// I->first is a const reference to a std::pair<int,int> stored in the map
}
C++11
for (auto& kv : patterns) {
// kv.first is a const reference to a std::pair<int,int> stored in the map
}
If you wanted to copy the keys into a new container:
C++03
std::vector<std::pair<int,int> > V;
std::set<std::pair<int,int> > S;
for (std::map<std::pair<int,int>,int*>::iterator I = patterns.begin(); I != patterns.end(); I++) {
V.push_back(I->first);
S.insert(I->first);
}
C++11
std::vector<std::pair<int,int>> V;
std::set<std::pair<int,int>> S;
for (auto& kv : patterns) {
V.push_back(kv.first);
S.insert(kv.first);
}
Because I'm bored, here are a few additional solutions:
You could also do it with standard algorithms and a lambda function, but I don't think this is really better than just writing the loop yourself:
std::vector<std::pair<int,int>> V(patterns.size());
std::transform(patterns.begin(), patterns.end(), V.begin(),
[](decltype(patterns)::value_type& p){ return p.first; });
std::set<std::pair<int,int>> S;
std::for_each(patterns.begin(), patterns.end(),
[&S](decltype(patterns)::value_type& p){ S.insert(p.first); });
You could also use a Boost transform iterator to wrap iterators from the map, such that when the wrapped iterator is dereferenced, it gives you just the key from the map. Then you could call std::vector::insert or std::set::insert directly on a range of transform iterators.

What could be reason it crashes when I use vector::erase?

I am trying to perform some operations on a vector, calling erase on it only in some cases.
Here is my code:
while(myQueue.size() != 1)
{
vector<pair<int,int>>::iterator itr = myQueue.begin();
while(itr != myQueue.end())
{
if(itr->first%2 != 0)
myQueue.erase(itr);
else
{
itr->second = itr->second/2;
itr++;
}
}
}
I get a crash in the 2nd iteration, with the message "vector iterator incompatible".
What could be the reason of crash?

If erase() is called the iterator is invalidated and that iterator is then accessed on the next iteration of the loop. std::vector::erase() returns the next iterator after the erased iterator:
itr = myQueue.erase(itr);

Given an iterator range [b, e), where b is the beginning and e one past the end of the range for a vector, an erase operation on an iterator i somewhere in the range will invalidate all iterators from i up to e. This is why you need to be very careful when calling erase. The erase member does return a new iterator which you can use for subsequent operations, and you ought to use it:
itr = myQueue.erase( itr );
Another way would be to swap the element at i with the last element and then erase the last one. This is more efficient, since none of the elements beyond i have to be moved, but it does not preserve the order of the remaining elements.
std::swap( *i, myQueue.back() );
myQueue.pop_back();
Also, from the looks of it, why are you using vector? If you need a queue you might as well use std::queue.
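As a rough sketch, the swap-and-pop approach applied to the whole loop from the question could look like this. The function name dropOddUnordered is invented for illustration, and indices are used instead of iterators so that pop_back() cannot invalidate the loop variable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: removes pairs with an odd first member by swapping
// each one with the last element and popping it, and halves the second
// member of every pair that stays.
// Note: the relative order of the remaining elements is not preserved.
void dropOddUnordered(std::vector<std::pair<int, int>>& v) {
    for (std::size_t i = 0; i < v.size(); ) {
        if (v[i].first % 2 != 0) {
            std::swap(v[i], v.back());
            v.pop_back();  // do not advance: v[i] now holds an unchecked element
        } else {
            v[i].second /= 2;
            ++i;
        }
    }
}
```

Each removal costs a single swap regardless of the vector's size, which is where the saving over erase() in the middle comes from.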

That is undefined behavior. In particular, once you erase an iterator, it becomes invalid and you can no longer use it for anything. The idiomatic way of writing the loop would be something like:
for ( auto it = v.begin(); it != v.end(); ) {
if ( it->first % 2 != 0 )
it = v.erase(it);
else {
it->second /= 2;
++it;
}
}
But then again, it will be more efficient and idiomatic not to roll your own loop and rather use the algorithms:
v.erase( std::remove_if( v.begin(),
v.end(),
[]( std::pair<int,int> const & p ) {
return p.first % 2 != 0;
}),
v.end() );
std::transform( v.begin(), v.end(), v.begin(),
[]( std::pair<int,int> const & p ) {
return std::make_pair(p.first, p.second/2);
} );
The advantage of this approach is that fewer copies of the elements are made while erasing (each valid element left in the range will have been copied no more than once), and it is harder to get it wrong (for example by misusing an invalidated iterator). The disadvantage is that there is no remove_if_and_transform, so this is a two-pass algorithm, which might be less efficient if there is a large number of elements.
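If the second pass matters, the missing remove_if_and_transform can be written by hand. The following is only a sketch with a made-up name (dropOddAndHalve); it compacts the kept elements toward the front in one pass, the way remove_if does internally, and halves them as they are moved.

```cpp
#include <utility>
#include <vector>

// Hypothetical single-pass helper: drops pairs with an odd first member and
// halves the second member of the pairs that are kept. `out` never runs
// ahead of `in`, so no unread element is overwritten, and order is preserved.
void dropOddAndHalve(std::vector<std::pair<int, int>>& v) {
    auto out = v.begin();
    for (auto in = v.begin(); in != v.end(); ++in) {
        if (in->first % 2 != 0)
            continue;  // odd key: skip it, it will be cut off below
        *out = std::make_pair(in->first, in->second / 2);
        ++out;
    }
    v.erase(out, v.end());  // single range erase of the leftover tail
}
```

Unlike the swap-and-pop approach, this keeps the surviving elements in their original order.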

Modifying a container while iterating over it is generally tricky.
Therefore, there is a specific C++ idiom usable with non-associative sequences: the erase-remove idiom.
It combines the use of the remove_if algorithm with the range overload of the erase method:
myQueue.erase(
std::remove_if(myQueue.begin(), myQueue.end(), /* predicate */),
myQueue.end());
where the predicate is expressed either as a typical functor object or using the new C++11 lambda syntax.
// Functor
struct OddKey {
bool operator()(std::pair<int, int> const& p) const {
return p.first % 2 != 0;
}
};
/* predicate */ = OddKey()
// Lambda
/* predicate */ = [](std::pair<int, int> const& p) { return p.first % 2 != 0; }
The lambda form is more concise but may be less self-documenting (no name) and is only available in C++11. Depending on your tastes and constraints, pick the one that suits you most.
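It is possible to elevate your way of writing code: use Boost.Range.
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/algorithm/transform.hpp>
#include <iterator>
#include <utility>
#include <vector>
typedef std::vector< std::pair<int, int> > PairVector;
void pass(PairVector& pv) {
// Keep only the pairs whose key is even.
auto const filter = [](std::pair<int, int> const& p) {
return p.first % 2 == 0;
};
auto const transformer = [](std::pair<int, int> const& p) {
return std::make_pair(p.first, p.second / 2);
};
// Build the result in a separate vector: appending to pv while iterating
// over it would invalidate the iterators in use.
PairVector result;
boost::transform(pv | boost::adaptors::filtered( filter ),
std::back_inserter(result),
transformer);
pv.swap(result);
}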
You can find transform and the filtered adaptor in the documentation, along with many others.

Composability of STL algorithms

The STL algorithms are a pretty useful thing in C++. But one thing that kind of irks me is that they seem to lack composability.
For example, let's say I have a vector<pair<int, int>> and want to transform that to a vector<int> containing only the second member of the pair. That's simple enough:
std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;
std::transform(values.begin(), values.end(), std::back_inserter(result),
[] (std::pair<int, int> p) { return p.second; });
Or maybe I want to filter the vector for only those pairs whose first member is even. Also pretty simple:
std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> result;
std::copy_if(values.begin(), values.end(), std::back_inserter(result),
[] (std::pair<int, int> p) { return (p.first % 2) == 0; });
But what if I want to do both? There is no transform_if algorithm, and using both transform and copy_if seems to require allocating a temporary vector to hold the intermediate result:
std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> temp;
std::vector<int> result;
std::copy_if(values.begin(), values.end(), std::back_inserter(temp),
[] (std::pair<int, int> p) { return (p.first % 2) == 0; });
std::transform(temp.begin(), temp.end(), std::back_inserter(result),
[] (std::pair<int, int> p) { return p.second; });
This seems rather wasteful to me. The only way I can think of to avoid the temporary vector is to abandon transform and copy_if and simply use for_each (or a regular for loop, whichever suits your fancy):
std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;
std::for_each(values.begin(), values.end(),
[&result] (std::pair<int, int> p)
{ if( (p.first % 2) == 0 ) result.push_back(p.second); });
Am I missing something here? Is there a good way to compose two existing STL algorithms into a new one without needing temporary storage?

You're right. You can use Boost.Range adaptors to achieve composition.

I think the problem is unfortunately structural
C++ uses two iterators to represent a sequence
C++ functions are single-valued
so you cannot chain them because a function cannot return "a sequence".
An option would have been to use single-object sequences instead (like the range approach from boost). This way you could have combined the result of one processing as the input of another... (one object -> one object).
In the standard C++ library instead the processing is (two objects -> one object) and it's clear that this cannot be chained without naming the temporary object.

Back in 2000, the problem was already noted. Gary Powell and Martin Weiser came up with a "view" concept, and coined the name "View Template Library". It didn't take off then but the idea makes sense. A "view" adaptor essentially applies an on-the-fly transform. For instance, it can adapt the value_type.
The concept probably should be readdressed now we have C++0x. We've made quite some progress in generic programming since 2000.
For example, let's use the vector<pair<int, int>> to vector<int> example. That could be quite simple:
std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, [](std::pair<int, int> p) { return p.first; });
std::vector<int> result(v.begin(), v.end());
Or, using the boost::bind techniques, even simpler:
std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, &std::pair<int, int>::first);
std::vector<int> result(v.begin(), v.end());

Since C++20 you can use std::ranges::copy together with the range adaptors std::views::filter and std::views::values from the Ranges library as follows:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <ranges>
#include <utility>
#include <vector>
int main() {
std::vector<std::pair<int, int>> values = { {1,2}, {4,5}, {6,7}, {9,10} };
std::vector<int> result;
auto even = [](const auto& p) { return (p.first % 2) == 0; };
std::ranges::copy(values | std::views::filter(even) | std::views::values,
std::back_inserter(result));
for (int i : result)
std::cout << i << std::endl;
return 0;
}
Output:
5
7
In the solution above, no temporary vector is created for an intermediate result, because the view adaptors create ranges that don't contain elements. These ranges are just views over the input vector, but with a customized iteration behavior.

Not sure if this is still active, but...
There is a new lightweight header-only library that does what you describe. Its documentation talks about lazy evaluation and composable generators.
Doc snippet:
Read in up to 10 integers from a file "test.txt".
Filter for the even numbers, square them and sum their values.
int total = lz::read<int>(ifstream("test.txt")) | lz::limit(10) |
lz::filter([](int i) { return i % 2 == 0; }) |
lz::map([](int i) { return i * i; }) | lz::sum();
You can split that line up into multiple expressions.
auto numbers = lz::read<int>(ifstream("test.txt")) | lz::limit(10);
auto evenFilter = numbers | lz::filter([](int i) { return i % 2 == 0; });
auto squares = evenFilter | lz::map([](int i) { return i * i; });
int total = squares | lz::sum();
Even though this expression is split over multiple variable assignments, it is not any less efficient. Each intermediate variable simply describes a unit of code to be executed. All of them are held on the stack.
https://github.com/SaadAttieh/lazyCode
