finding a key in a map - c++

I have a map which I have declared as follows:
map<int, bool> index;
and I insert values into the map as:
int x; cin>>x;
index[x]=true;
However,
cout<<index[y]; // for any number y not in index, this gives me 0
As I get the value 0 when I check for a key which is not present in the map, how can I reliably find out if a key is present in the map or not?
I'm using a map to find out whether two sets are disjoint, along with two vectors to store the input. Is this shabby in any way? Is there some other data structure I should be using?

You can use if (index.find(key) == index.end()) to determine that a key is not present (use != to test that it is present). Using index[key] you default-construct a new value (in this case, you call bool(), and it gets printed as 0). The newly constructed value also gets inserted into the map (i.e. index[key] has the same effect in this case as index.insert(std::make_pair(key, bool()))).
Using two data structures for the same data is OK. However, is there any need to use a map? Wouldn't a set suffice in your use case, i.e. if the key is present, the value is true, and false otherwise?
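To make the difference concrete, here is a minimal sketch contrasting a lookup through find with a read through operator[]. The helper names contains_key and entries_added_by_subscript are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper: reports whether key is present without modifying the map.
bool contains_key(const std::map<int, bool>& index, int key)
{
    return index.find(key) != index.end();
}

// Hypothetical helper: reads index[key] and returns how many entries that
// read added to the map (1 if the key was absent, 0 if it was present).
std::size_t entries_added_by_subscript(std::map<int, bool>& index, int key)
{
    const std::size_t before = index.size();
    (void)index[key];  // default-constructs and inserts bool() if key is absent
    return index.size() - before;
}
```

Calling contains_key never changes the map, whereas entries_added_by_subscript shows that merely reading a missing key through operator[] grows it by one entry.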

To find if two sets (given as std::set) are disjoint, you can simply compute their intersection:
std::set<T> X, Y; // populate
std::set<T> I;
std::set_intersection(X.begin(), X.end(), Y.begin(), Y.end(), std::inserter(I, I.begin()));
const bool disjoint = I.empty();
If your containers aren't std::sets, you have to make sure the ranges are ordered.
If you want to be more efficient, you can implement the algorithm for set_intersection and stop once you have a common element:
template <typename Iter1, typename Iter2>
bool disjoint(Iter1 first1, Iter1 last1, Iter2 first2, Iter2 last2)
{
while (first1 != last1 && first2 != last2)
{
if (*first1 < *first2) ++first1;
else if (*first2 < *first1) ++first2;
else { return false; }
}
return true;
}

Use map::find.

you can use index.find(key) != index.end() or index.count(key) > 0
Depending on the range of index items, it might be good to use a bitmap (this only makes sense for a reasonably small range of possible index items, but it makes checks for being disjoint very easy and efficient) or to use a set instead of a map (a map stores additional bools that are not really needed). A set also offers the methods count(key) and find(key).
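As a rough sketch of the bitmap idea, assuming every input value is known to lie in a small fixed range (the constant kRange and both function names are made up for this example):

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Assumption for this sketch: every input value lies in [0, kRange).
constexpr std::size_t kRange = 1024;

// Marks each value of the input in a bitmap.
std::bitset<kRange> to_bitmap(const std::vector<int>& values)
{
    std::bitset<kRange> bits;
    for (int v : values) {
        // set() throws std::out_of_range if the position is not below kRange.
        bits.set(static_cast<std::size_t>(v));
    }
    return bits;
}

// Two bitmaps are disjoint when their bitwise AND has no bit set.
bool disjoint_bitmaps(const std::bitset<kRange>& a, const std::bitset<kRange>& b)
{
    return (a & b).none();
}
```

The disjointness check itself is then a single AND over the two bitmaps, independent of how many values were inserted.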

1, Use index.count(y). It's more concise than and equivalent to index.find(y) != index.end(), except for the fact that it's an integer 1 or 0, whereas of course != gives you a bool.
The downside is that count is potentially less efficient for multimap than it is for map, since it may have to count more than one entry. Since you aren't using a multimap, no problem.
2, You could sort both vectors and use std::set_intersection, but it's not a perfect fit if all you care about is whether the intersection is empty or not. Depending on where the input comes from, you may be able to get rid of both vectors and just construct a map as you go from the first load of input, then check each element of the second load of input against it. Finally, use a set instead of a map.
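The "build a set from the first load of input, then check the second load against it" suggestion could look roughly like this. It is only a sketch: the function name are_disjoint is invented here, and the inputs are taken as vectors purely for illustration.

```cpp
#include <set>
#include <vector>

// Sketch: returns true when no element of second also occurs in first.
bool are_disjoint(const std::vector<int>& first, const std::vector<int>& second)
{
    // Build the lookup structure from the first load of input.
    const std::set<int> seen(first.begin(), first.end());

    // Check each element of the second load, stopping at the first common one.
    for (int value : second) {
        if (seen.count(value) > 0) {
            return false;
        }
    }
    return true;
}
```

If the input is read from a stream, the same loop can run directly on the values as they arrive, so the second vector is not needed at all.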

Related

std::map get value - find vs handcrafted loop [closed]

I have a std::map object map<string , Property*> _propertyMap, where string is the property's name and Property* contains the property values.
I need to process the property values and convert them to a specific data format. Each property has its own format; e.g., if the map is initialized as follows:
_propertyMap["id"] = new Property(Property::UUID, "12345678");
_propertyMap["name"] = new Property(Property::STRING, "name");
....
then "id" should be processed differently than "name" etc.
This means that I need to look for each property in the map and process its values accordingly.
I thought about two ways to do that.
One, use std::map::find method to get a specific property, like that:
map<string , Property*>::iterator it1 = _propertyMap.find("id");
if(it1 != _propertyMap.end())
{
//element found - process id values
}
map<string , Property*>::iterator it2 = _propertyMap.find("name");
if(it2 != _propertyMap.end())
{
//element found - process name values
}
....
Two, iterate the map and for each entry check what the property's name is and proceed accordingly:
for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
{
//if it is events - append the values to the matching nodes
if (it->first == "id")
{
//process id values
}
else if (it->first == "name")
{
//process name values
}
.....
}
Given that the Time complexity of std::map::find is O(logN), the complexity of the first solution is O(NlogN). I'm not sure about the complexity of the second solution, because it iterates the map once (O(N)), but performs a lot of if-else each iteration. I tried to google common map::find() questions, but couldn't find any useful information; most of them just need to get one value from the map, and then find() does this with better complexity (O(logN) vs O(N)).
What is a better approach? or perhaps there is another one which I didn't think of?
Also, in terms of code style, which one is better and clearer?
I see a few different use-cases here, depending on what you have in mind:
Fixed properties
(Just for completeness; I guess it is not what you want.) If both the name and the type of the possible properties are fixed, the best option is to use a simple class/struct, possibly using boost::optional (std::optional since C++17) for values that may or may not be present:
struct Data{
int id = 0;
std::string name = "";
boost::optional<int> whatever = boost::none;
};
Pros:
All "lookups" are resolved at compile-time
Cons:
No flexibility to expand at runtime
Process only specific options depending on their key
If you want to process only a specific subset of options, but keep the option to have (unprocessed) custom keys your approaches seem suitable.
In this case remember that using find like this:
it1 = _propertyMap.find("id");
has complexity O(logN) but is used M times, with M being the number of processed options. This is not the size of your map; it is the number of times you call find() to get a specific property. In your (shortened) example this means a complexity of O(2 * logN), since you only look for 2 keys.
So basically, calling find() M times scales better than looping when only the size of the map increases, but worse if the number of finds grows at the same rate. Only profiling can tell you which one is faster for your size and use case.
Process all options depending on type
Since your map looks a lot like the keys can be custom but the types are from a small subset, consider looping over the map and using the types instead of the names to determine how to process them. Something like this:
for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
{
if (it->second->type() == Property::UUID) // the Property* is the mapped value, not the key
{
//process UUID values
}
else if (it->second->type() == Property::STRING)
{
//process STRING values
}
.....
}
This has the advantage that you do not need any information about what the keys of your map really are, only about what types it is able to store.
Suppose we have a map of N properties, and we are looking for a subset of P properties. Here a rough analysis, not knowing the statistical distribution of the keys:
In the pure map approach you search P times, each with a complexity of O(log(N)); that is O(P*log(N)) in total.
In the chained-if approach you traverse the map once. That's O(N). But you should not forget that an if-then chain is also a (hidden) traversal of a list of P elements. So for each of the N elements you are doing a search over potentially up to P elements, which gives a complexity of O(P*N).
This means that the map approach will outperform your traversal, and the performance gap will increase significantly with N. Of course, this doesn't take into account the function call overhead of the map, which you don't have in the if-chain. So if P and N are small, your approach could still hold up against the theoretical comparison.
To increase performance further, you could use an unordered_map, whose lookups are O(1) on average, reducing your problem complexity to O(P).
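A minimal sketch of the unordered_map variant follows. The Property struct is a simplified stand-in for the class in the question, and process_wanted and its handler parameter are hypothetical names introduced for this example.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for the question's Property class (an assumption).
struct Property {
    enum Type { UUID, STRING };
    Type type;
    std::string value;
};

// Looks up each wanted name once (average O(1) per lookup), calls handler for
// every name that exists, and returns the names that were processed.
std::vector<std::string> process_wanted(
    const std::unordered_map<std::string, Property>& properties,
    const std::vector<std::string>& wanted,
    const std::function<void(const std::string&, const Property&)>& handler)
{
    std::vector<std::string> processed;
    for (const std::string& name : wanted) {
        auto it = properties.find(name);
        if (it != properties.end()) {
            handler(it->first, it->second);
            processed.push_back(name);
        }
    }
    return processed;
}
```

With P wanted names this performs P hash lookups regardless of how many properties the map holds; whether that beats a std::map in practice still depends on the sizes involved.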
There is another option which combines the best of both. Given a function like this (which is an adaptation of std::set_intersection):
template<class InputIt1, class InputIt2,
class Function, class Compare>
void match(InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2,
Function f, Compare comp)
{
while (first1 != last1 && first2 != last2) {
if (comp(*first1,*first2)) {
++first1;
} else {
if (!comp(*first2,*first1)) {
f(*first1++,*first2);
}
++first2;
}
}
}
You can use it to process all your properties in O(N+M) time. Here is an example:
#include <map>
#include <string>
#include <functional>
#include <cassert>
using std::map;
using std::string;
using std::function;
struct Property {
enum Type { UUID, STRING };
Type type;
string value;
};
int main()
{
map<string,Property> properties;
map<string,function<void(Property&)>> processors;
properties["id"] = Property{Property::UUID,"12345678"};
properties["name"] = Property{Property::STRING,"name"};
bool id_found = false;
bool name_found = false;
processors["id"] = [&](Property&){ id_found = true; };
processors["name"] = [&](Property&){ name_found = true; };
match(
properties.begin(),properties.end(),
processors.begin(),processors.end(),
[](auto &a,auto &b){ b.second(a.second); },
[](auto &a,auto &b) { return a.first < b.first; }
);
assert(id_found && name_found);
}
The processors map can be built separately and reused to reduce the overhead.

How to take unique values from file with vector in CPP using map?

I have a vector of some data type (let's say int) and I need to push back only the unique values from a file. I am new to the STL, so I don't know how I can do it using a map; I read that a map only takes unique values. If I simply push back, then it will take all the values irrespective of their uniqueness.
The correct container to use for unique values is either std::set or std::unordered_set:
std::set<int> s;
s.insert(4); // s has size 1
s.insert(5); // s has size 2
s.insert(4); // s still has size 2
If you want to use a vector, you'd have to keep it sorted, which is a lot more code and work, and doesn't have the nice characteristic of set that everybody knows the contents are unique:
void add_value(std::vector<int>& v, int value) {
// do a binary search to find value
std::vector<int>::iterator it = std::lower_bound(v.begin(), v.end(), value);
if (it != v.end() && *it == value) {
// duplicate - do nothing
}
else {
// insert our value here
v.insert(it, value);
}
}
... or I guess you could delete the duplicates at the end using a rarely-used algorithm (std::unique) that will probably raise some eyebrows:
void uniqify(std::vector<int>& v) {
std::sort(v.begin(), v.end());
v.erase(std::unique(v.begin(), v.end()), v.end());
}
[UPDATE] It has been pointed out to me that I completely misunderstood your question - and that you may have been looking for just which values occur exactly once - not a list of which values occur without duplicate. For that, the correct container to use is either a std::map or std::unordered_map - so you can associate a count with a particular key:
std::map<int, int> keyCounts;
int value;
while (fileStream >> value) { // or whatever
++keyCounts[value]; // operator[] gives us a reference to the value
// if it wasn't present before, it'll insert a default
// one - which for int is zero - so this handles
// both cases correctly
}
// Now, any key with value 1 is a unique key
// what you want to do with them is up to you
// e.g., let's put it in a vector
std::vector<int> uniq;
uniq.reserve(keyCounts.size());
for (std::map<int, int>::iterator it = keyCounts.begin(); it != keyCounts.end(); ++it)
{
if (it->second == 1) {
uniq.push_back(it->first);
}
}
A std::map will let you handle a mapping of unique keys to some values (which may or may not be unique). Math-wise, you may see it as a surjective function from the set of keys to the set of values of your dataset.
If your goal is to keep unique indices (or keys), then std::map is what you need. Otherwise, use std::set to store unique values.
Now, to keep only unique values from your dataset, you basically want to remove values which appear more than once. The simplest algorithm is to add values from the file as keys in a map, with its corresponding value being a counter for the number of occurrences of that entry in the file. Initialize a counter to 1 the first time the value is met in the file, and increment it each time it is met again. After having parsed the whole file, simply keep the keys whose values are exactly 1.
Counting the values:
template <typename Key>
void count(std::istream &is, std::map<Key,int> &map){
Key key;
while (is >> key){ // stops at end of input or on a failed extraction
auto it = map.find(key);
if (it == map.end())
map[key] = 1;
else ++it->second; // increment the counter, not the pair
}
}
The above assumes that operator>> has been overloaded for the key type to extract values from the stream sequentially. You will have to adapt the algorithm to fit your own way of parsing the data.
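Filtering the resulting map to keep unique values cannot be done with std::remove_if, because the elements of a map are not assignable. Instead, erase the entries for which a predicate returns true when the counter is above 1:
The function:
bool duplicate (const std::pair<const int, int> &entry){ return entry.second > 1;}
The map filtering:
for (auto it = map.begin(); it != map.end(); )
it = duplicate(*it) ? map.erase(it) : std::next(it);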
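Putting the count-then-filter idea together, a self-contained sketch might look like the following. It filters with an erase loop, since std::remove_if cannot be applied to a std::map (its elements are not assignable). The function name values_occurring_once is hypothetical, and the input is assumed to be whitespace-separated ints.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <vector>

// Reads whitespace-separated ints and returns those that occur exactly once,
// in ascending order.
std::vector<int> values_occurring_once(std::istream& in)
{
    std::map<int, int> counts;
    int value;
    while (in >> value) {
        ++counts[value];  // a missing key starts at 0, so this handles both cases
    }

    // Erase every entry whose counter is above 1; erase returns the next iterator.
    for (auto it = counts.begin(); it != counts.end(); ) {
        if (it->second > 1) {
            it = counts.erase(it);
        } else {
            ++it;
        }
    }

    std::vector<int> result;
    result.reserve(counts.size());
    for (const auto& entry : counts) {
        result.push_back(entry.first);
    }
    return result;
}
```

For example, feeding it a std::istringstream holding "4 5 4 7 9 9" leaves only 5 and 7, the two values that appear a single time.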

How can I sort a std::map first by value, then by key?

I need to sort a std::map by value, then by key. The map contains data like the following:
1 realistically
8 really
4 reason
3 reasonable
1 reasonably
1 reassemble
1 reassembled
2 recognize
92 record
48 records
7 recs
I need to get the values in order, but the kicker is that the keys need to be in alphabetical order after the values are in order. How can I do this?
std::map will sort its elements by keys. It doesn't care about the values when sorting.
You can use std::vector<std::pair<K,V>> then sort it using std::sort followed by std::stable_sort:
std::vector<std::pair<K,V>> items;
//fill items
//sort by key (the secondary criterion) using std::sort
std::sort(items.begin(), items.end(), key_comparer);
//sort by value (the primary criterion) using std::stable_sort
std::stable_sort(items.begin(), items.end(), value_comparer);
The first sort should use std::sort since it is n log(n), and then use std::stable_sort, which is n(log(n))^2 in the worst case.
Note that while std::sort is chosen for performance reasons, std::stable_sort is needed for correct ordering, as you want the order-by-key to be preserved among items with equal values.
As @gsf noted in the comments, you could use only std::sort if you choose a comparer which compares values first and, if they're equal, compares the keys.
auto cmp = [](std::pair<K,V> const & a, std::pair<K,V> const & b)
{
return a.second != b.second? a.second < b.second : a.first < b.first;
};
std::sort(items.begin(), items.end(), cmp);
That should be efficient.
But wait, there is a better approach: store std::pair<V,K> instead of std::pair<K,V> and then you don't need any comparer at all — the standard comparer for std::pair would be enough, as it compares first (which is V) first then second which is K:
std::vector<std::pair<V,K>> items;
//...
std::sort(items.begin(), items.end());
That should work great.
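As a sketch of the std::pair<V,K> approach for the word-count data in the question, assuming the original container is a std::map<std::string, int> (the function name is invented for this example):

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Copies a word -> count map into (count, word) pairs and sorts them with the
// default pair ordering: by count first, then by word.
std::vector<std::pair<int, std::string>>
sorted_by_value_then_key(const std::map<std::string, int>& m)
{
    std::vector<std::pair<int, std::string>> items;
    items.reserve(m.size());
    for (const auto& kv : m) {
        items.emplace_back(kv.second, kv.first);  // flip key and value
    }
    std::sort(items.begin(), items.end());
    return items;
}
```

No comparer is passed to std::sort here; the built-in operator< of std::pair already compares first and falls back to second on ties.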
You can use std::set instead of std::map.
You can store both key and value in std::pair and the type of container will look like this:
std::set< std::pair<int, std::string> > items;
std::set will sort its values by both the original keys and the values that were stored in the std::map.
As explained in Nawaz's answer, you cannot sort your map by itself as you need it, because std::map sorts its elements based on the keys only. So, you need a different container, but if you have to stick to your map, then you can still copy its content (temporarily) into another data structure.
I think, the best solution is to use a std::set storing flipped key-value pairs as presented in ks1322's answer.
The std::set is sorted by default and the order of the pairs is exactly as you need it:
3) If lhs.first<rhs.first, returns true. Otherwise, if rhs.first<lhs.first, returns false. Otherwise, if lhs.second<rhs.second, returns true. Otherwise, returns false.
This way you don't need an additional sorting step and the resulting code is quite short:
std::map<std::string, int> m; // Your original map.
m["realistically"] = 1;
m["really"] = 8;
m["reason"] = 4;
m["reasonable"] = 3;
m["reasonably"] = 1;
m["reassemble"] = 1;
m["reassembled"] = 1;
m["recognize"] = 2;
m["record"] = 92;
m["records"] = 48;
m["recs"] = 7;
std::set<std::pair<int, std::string>> s; // The new (temporary) container.
for (auto const &kv : m)
s.emplace(kv.second, kv.first); // Flip the pairs.
for (auto const &vk : s)
std::cout << std::setw(3) << vk.first << std::setw(15) << vk.second << std::endl;
Output:
1 realistically
1 reasonably
1 reassemble
1 reassembled
2 recognize
3 reasonable
4 reason
7 recs
8 really
48 records
92 record
Note: Since C++17 you can use range-based for loops together with structured bindings for iterating over a map.
As a result, the code for copying your map becomes even shorter and more readable:
for (auto const &[k, v] : m)
s.emplace(v, k); // Flip the pairs.
std::map already sorts its elements by key, using a predicate you define or std::less if you don't provide one. std::set will also store items in the order defined by its comparator. However, neither set nor map allows duplicate keys. I would suggest defining a std::map<int, std::set<string>> if you want to accomplish this using your data structure alone. You should also realize that std::less for string will sort lexicographically, not alphabetically.
EDIT: The other two answers make a good point. I'm assuming that you want to order them into some other structure, or in order to print them out.
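The std::map<int, std::set<string>> suggestion could be sketched like this, assuming the original data is a std::map<std::string, int>. The function name group_by_count is hypothetical.

```cpp
#include <map>
#include <set>
#include <string>

// Groups words by their count. The outer map is ordered by count and each
// inner set is ordered lexicographically by word.
std::map<int, std::set<std::string>>
group_by_count(const std::map<std::string, int>& m)
{
    std::map<int, std::set<std::string>> grouped;
    for (const auto& kv : m) {
        grouped[kv.second].insert(kv.first);
    }
    return grouped;
}
```

Iterating the outer map and then each inner set visits the words by value first and by key second, without any explicit sort call.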
"Best" can mean a number of different things. Do you mean "easiest," "fastest," "most efficient," "least code," "most readable?"
The most obvious approach is to loop through twice. On the first pass, order the values:
if(current_value > examined_value)
{
current_value = examined_value
(and then swap them, however you like)
}
Then on the second pass, alphabetize the words, but only if their values match.
if(current_value == examined_value)
{
(alphabetize the two)
}
Strictly speaking, this is a "bubble sort" which is slow because every time you make a swap, you have to start over. One "pass" is finished when you get through the whole list without making any swaps.
There are other sorting algorithms, but the principle would be the same: order by value, then alphabetize.

C++ map start search from iterator position

My question is the following:
After using find on a std::map to get an iterator pointing to the desired element pair, is it possible to reuse that iterator in subsequent find() calls, to take advantage of knowing that the elements I'm looking for afterwards are close to the first found element? Something like:
std::map<key, value> map_elements;
std::map<key, value>::iterator it;
it = map_elements.find(some_key);
it = it.find(a_close_key)
Thank you in advance
If you're sure it's really nearby, you could use std::find_if (instead of map::find) with a predicate that compares the key, to do a linear search for the item. If it's within approximately log(N) items of the current position (where N is the number of items in the map), this is likely to be a win.
Also note that you'll have to figure out whether you want to search before or after the current position, and specify current, end() if it's after, and begin(), current if it's before. If it's before, you'll want to search in reverse (for example with reverse iterators), since the target is presumably close to the end of that range.
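A sketch of such a linear search in both directions is shown below. It assumes int keys and string values for concreteness; the alias Map and the names find_forward and find_backward are made up for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>

using Map = std::map<int, std::string>;

// Linear search for key in [current, end()).
Map::const_iterator find_forward(const Map& m, Map::const_iterator current, int key)
{
    return std::find_if(current, m.end(),
        [key](const Map::value_type& entry) { return entry.first == key; });
}

// Linear search for key in [begin(), current), walking backwards from current.
// Returns end() if the key is not in that range.
Map::const_iterator find_backward(const Map& m, Map::const_iterator current, int key)
{
    // A reverse iterator built from current starts at the element before it.
    Map::const_reverse_iterator rfirst(current);
    Map::const_reverse_iterator rit = std::find_if(rfirst, m.rend(),
        [key](const Map::value_type& entry) { return entry.first == key; });
    return rit == m.rend() ? m.end() : std::prev(rit.base());
}
```

Note that both helpers scan to the end of their range when the key is missing, so they only pay off when the target really is close to the starting position.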
Your question does not say how far Item1 (found by map::find) can be from Item2. In some cases it's more efficient to call map::find again; in other cases you can just advance your iterator to where the second item should be. A fresh map::find has O(log n) complexity, which can be around 10-20 steps.
So, if you know Item2 is not far away, you can just advance the iterator to find it. The most important thing here is how to decide when to stop searching. std::map uses std::less<T> by default to order items, so the same comparison can be used to find out that the container does not contain Item2 at all. Something like this (not tested):
std::map<key, value> map_elements;
std::map<key, value>::iterator it, it2;
it2 = it = map_elements.find(some_key);
bool found=false;
while( it2!=map_elements.end() && !(a_close_key < it2->first) ) {
if( !(a_close_key < it2->first) && !(it2->first < a_close_key) ) {
//Equivalency is not ==, but its what used in std::map
found=true;
break;
}
it2++;
}
if( found ) {
//.... use it2
}
Inside the if( found ) block, the value of your iterator it2 should be the same as if you had called map_elements.lower_bound(a_close_key).

Get all the keys which matches a query in a map

Say I have more than one key with the same value in a map. In that case, how do I retrieve all keys that match a query?
Or, is there any possibility to tell the find operation to search for a specific value?
I am using an std::map, C++.
Would something like this work for you:
void FindKeysWithValue(Value aValue, list<Key>& aList)
{
aList.clear();
for_each(iMap.begin(), iMap.end(), [&] (const pair<Key, Value>& aPair)
{
if (aPair.second == aValue)
{
aList.push_back(aPair.first);
}
});
}
The associative containers probably won't help you too much, because for std::map<K, V> the key is unique, and the chances that your chosen query matches the ordering relation you used may not be too high. If the order matches, you can use the std::map<K, V> members lower_bound() and upper_bound(). For std::multimap<K, V> you can also use equal_range().
In general, i.e., if your query isn't really related to the order, you can use std::copy_if() to get a sequence of objects matching a predicate:
Other other;
// ...
std::vector<Other::value_type> matches;
std::copy_if(other.begin(), other.end(),
std::back_inserter(matches), predicate);
When copying the elements is too expensive, you should probably consider using std::find_if() instead:
for (auto it(other.begin());
other.end() != (it = std::find_if(it, other.end(), predicate));
++it) {
// do something with it
}
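For the case where the query does match the map's ordering, the lower_bound()/upper_bound() members mentioned above can be used as in the following sketch. It assumes int keys and string values, and the function name values_in_key_range is hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the values whose keys lie in the closed interval [low, high].
std::vector<std::string>
values_in_key_range(const std::map<int, std::string>& m, int low, int high)
{
    std::vector<std::string> result;
    if (high < low) {
        return result;  // empty interval; also keeps the iterator range valid
    }
    auto first = m.lower_bound(low);   // first element with key >= low
    auto last = m.upper_bound(high);   // first element with key > high
    for (auto it = first; it != last; ++it) {
        result.push_back(it->second);
    }
    return result;
}
```

Both bounds are found in logarithmic time, so only the matching elements are visited instead of the whole map.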
The only way is to iterate over map.
this link may be useful: Reverse map lookup
Provided you want quick access and you don't mind using some more space, then you maintain another map that gets stored as value, key. In your case, you would need to handle the duplicate values (that you will be storing as keys).
Not a great idea but definitely an option.
A map is meant for efficient lookup of keys. Lookup based on values is not efficient, and you basically have to iterate through the map, extracting matches yourself:
for(map<A,B>::iterator i = m.begin(); i != m.end(); i++)
if(i->second == foo)
you_found_a_match();
If you intend to do this often, you can build up a multimap mapping the other way, so you can efficiently perform a value-based lookup:
multimap<B,A> reverse;
for(map<A,B>::iterator i = m.begin(); i != m.end(); i++)
reverse.insert(pair<B,A>(i->second,i->first));
You can now easily find the keys with a given value:
pair<multimap<B,A>::iterator, multimap<B,A>::iterator> matches = reverse.equal_range(value);
for(multimap<B,A>::iterator i = matches.first; i != matches.second; i++)
A & key = i->second;
If these maps aren't going to grow continuously, it may be more efficient to simply maintain a vector<pair<A,B> > instead, define a comparator for it based on the value, and use equal_range on that instead.
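A rough sketch of that sorted-vector variant, assuming string keys and int values; Entry, ByValue, sorted_by_value and keys_with_value are names invented for this example:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, int>;  // (key, value)

// Orders entries by value only. The mixed overloads let std::equal_range
// compare an entry against a bare value.
struct ByValue {
    bool operator()(const Entry& a, const Entry& b) const { return a.second < b.second; }
    bool operator()(const Entry& a, int v) const { return a.second < v; }
    bool operator()(int v, const Entry& b) const { return v < b.second; }
};

// Copies the map into a vector sorted by value; stable_sort keeps the keys of
// equal values in their original (key) order.
std::vector<Entry> sorted_by_value(const std::map<std::string, int>& m)
{
    std::vector<Entry> entries(m.begin(), m.end());
    std::stable_sort(entries.begin(), entries.end(), ByValue());
    return entries;
}

// Binary-searches the sorted vector for all entries with the given value.
std::vector<std::string> keys_with_value(const std::vector<Entry>& entries, int value)
{
    auto range = std::equal_range(entries.begin(), entries.end(), value, ByValue());
    std::vector<std::string> keys;
    for (auto it = range.first; it != range.second; ++it) {
        keys.push_back(it->first);
    }
    return keys;
}
```

The vector has to be rebuilt or re-sorted whenever the map changes, which is why this only pays off when the data is mostly static.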