Is there a C++ analog of ContainsKey(TKey) from C#, and what is the best practice? - c++

Does C++ have an analog of IDictionary.ContainsKey(TKey) or List.Contains(T) from C#?
For example, I have an array of elements and need to know whether it contains some value.
What is the best way, or best practice, to do this without a "foreach" over every element?
It would be good if it came from the std library.
UPD 1: The std library has many containers, but I want to find the best way: fast, little code, uncomplicated, and so on.
Following this logic, the better decision looks like std::unordered_set:
#include <unordered_set>
std::unordered_set<std::string> NamesOfValues = {
"one",
"two",
"Date",
"Time"
};
// and now check whether value exists in the set
if(NamesOfValues.count(value))
{
// value exists
}

It seems most concise to use count, which is a member of all the associative containers (map, set, and their multi and unordered variants).
if ( my_map.count(key) ) { // shorthand for `count(key) != 0`
// It exists
} else {
// It does not
}
If we're talking about [unordered_]map and [unordered_]set, which are closest to your original dictionary type, then these containers enforce unique keys, so the value returned by .count() can only be 0 or 1, and there's no need to worry about the code pointlessly iterating the rest of the container once it finds a match (as would occur for containers supporting duplicates).
Either way, simply using implicit conversion to bool leads to the most concise code. And if you end up having a design that might allow/need different counts per key, then you could compare against a specific value.

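You are looking for std::find. It searches an iterator range for a value and returns an iterator to the first match, or the end iterator if there is none.
For example, to find an element in a dictionary you can do the following (the elements of a map are key/value pairs, so comparing against a key needs std::find_if with a predicate):
std::unordered_map<char,int> my_map = { {'a', 1},{'b', 2}};
auto found_element = std::find_if(my_map.begin(), my_map.end(), [](const std::pair<const char,int>& kv){ return kv.first == 'a'; });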
if( found_element == my_map.end() ){
//nothing was found
}
else{
// do something
}
For the standard maps you also have the member function find(key), which is O(log n) for std::map and O(1) on average for std::unordered_map, instead of O(n).
if( my_map.find('a') != my_map.end() ){
//something was found!
}
else{
//nothing was found
}
This is clearer than my_map.count(); you would only use count if you actually needed to know how many elements match, i.e. if you were using non-unique keys.
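To get something that reads like C#'s ContainsKey, the find-based check can be wrapped in a small helper. This is only a sketch: contains_key is a made-up name, not a standard function, and it assumes the container provides a find() member (as the map and set families do).

```cpp
#include <map>
#include <set>
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of the standard library): true if `key`
// is present in an associative container that provides find().
template <typename Container, typename Key>
bool contains_key(const Container& c, const Key& key)
{
    return c.find(key) != c.end();
}
```

With this, a check reads as if (contains_key(NamesOfValues, value)) { ... }. If C++20 is available, the associative containers also offer a contains() member that performs the same test.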

Related

How to implement something like std::copy_if but apply a function before inserting into a different container

Full disclosure, this may be a hammer and nail situation, trying to use STL algorithms when none are needed. I have seen a recurring pattern in some C++14 code I am working with. We have a container that we iterate through, and if the current element matches some condition, then we copy one of the element's fields to another container.
The pattern is something like:
for (auto it = std::begin(foo); it!=std::end(foo); ++it){
auto x = it->Some_member;
// Note: the check usually uses the field that would be added to the new container.
if(f(x) && g(x)){
bar.emplace_back(x);
}
}
The idea is almost an accumulate where the function being applied does not always return a value. I can only think of solutions that either
require a function for accessing the member you want to accumulate and another function for checking the condition, i.e. How to combine std::copy_if and std::transform?
or are worse than the thing I want to replace.
Is this even a good idea?
A quite general solution to your issue would be the following (working example):
#include <iostream>
#include <vector>
using namespace std;
template<typename It, typename MemberType, typename Cond, typename Do>
void process_filtered(It begin, It end, MemberType iterator_traits<It>::value_type::*ptr, Cond condition, Do process)
{
for(It it = begin; it != end; ++it)
{
if(condition((*it).*ptr))
{
process((*it).*ptr);
}
}
}
struct Data
{
int x;
int y;
};
int main()
{
// thanks to iterator_traits, vector could also be an array;
// kudos to @Yakk-AdamNevraumont
vector<Data> lines{{1,2},{4,3},{5,6}};
// filter even numbers from Data::x and output them
process_filtered(std::begin(lines), std::end(lines), &Data::x, [](int n){return n % 2 == 0;}, [](int n){cout << n;});
// output is 4, the only x value that is even
return 0;
}
It does not use STL algorithms, that is right, but you merely pass it an iterator pair, the member to look up, and two lambdas/functions: the first filters and the second consumes the filtered values.
I like your general solutions but here you do not need to have a lambda that extracts the corresponding attribute.
Clearly, the code can be refined to work with const_iterator but for a general idea, I think, it should be helpful. You could also extend it to have a member function that returns a member attribute instead of a direct member attribute pointer, if you'd like to use this method for encapsulated classes.
Sure. There are a bunch of approaches.
1. Find a library with transform_if, like boost.
2. Find a library with transform_range, which takes a transformation and range or container and returns a range with the value transformed. Compose this with copy_if.
3. Find a library with filter_range like the above. Now, use std::transform with your filtered range.
4. Find one with both, and compose filtering and transforming in the appropriate order. Now your problem is just copying (std::copy or whatever).
5. Write your own back-inserter wrapper that transforms while inserting. Use that with std::copy_if.
6. Write your own range adapters, like 2, 3 and/or 4.
7. Write transform_if.
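The last option, writing transform_if yourself, could look roughly like this. It is a minimal sketch, not a library implementation: the signature is modelled on std::copy_if and std::transform, and Item, some_member and even_members are illustrative names standing in for the question's element type and loop.

```cpp
#include <iterator>
#include <vector>

// Hypothetical transform_if (the standard library has no such algorithm):
// applies `op` to each element that satisfies `pred` and writes the result to `out`.
template <typename InputIt, typename OutputIt, typename Pred, typename UnaryOp>
OutputIt transform_if(InputIt first, InputIt last, OutputIt out, Pred pred, UnaryOp op)
{
    for (; first != last; ++first) {
        if (pred(*first)) {
            *out = op(*first);
            ++out;
        }
    }
    return out;
}

// Illustrative element type, standing in for the question's container elements.
struct Item {
    int some_member;
};

// Collects the some_member values that are even.
std::vector<int> even_members(const std::vector<Item>& items)
{
    std::vector<int> result;
    transform_if(items.begin(), items.end(), std::back_inserter(result),
                 [](const Item& i) { return i.some_member % 2 == 0; },  // condition
                 [](const Item& i) { return i.some_member; });          // projection
    return result;
}
```

Whether this beats the plain loop is a matter of taste; it mainly pays off when the same filter-then-project pattern recurs in many places.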

Fast 'group by/count' std::vector<std::u16string> into a std::map<u16string, int>

I have a function that reads ~10000 words into a vector. I then want to group all the words into a map to 'count' how many times each word appears.
While the code 'works' it can sometimes take 2 seconds to re-build the map.
NB: Unfortunately, I cannot change the 'read' function, I have to work with the vector of std::u16string.
std::vector<std::u16string> vValues;
vValues.push_back( ... )
...
std::map<std::u16string, int> mValues;
for( auto it = vValues.begin(); it != vValues.end(); ++it )
{
if( mValues.find( *it ) == mValues.end() )
{
mValues[*it] = 1;
}
else
{
++mValues[*it];
}
}
How could I speed up the 'group by' while keeping track of the number of times the word appears in the vector?
If you call std::map::operator[] with a new key, the mapped value is value-initialized (to 0 for an int). So, your loop can be simplified to:
for (auto it = vValues.begin(); it != vValues.end(); ++it)
++mValues[*it];
If there is no key *it, then the default value will be 0, but then it is incremented immediately, and it becomes 1.
If the key already exists, then it is simply incremented.
Furthermore, it doesn't look like you need the map to be ordered, so you can use a std::unordered_map instead, as insertion is average constant time, instead of logarithmic, which would speed it up even further.
std::vector<std::u16string> vValues;
vValues.push_back( ... )
...
std::sort( vValues.begin(), vValues.end() );
struct counted {
std::u16string value;
std::size_t count;
};
std::vector<counted> result;
auto it = vValues.begin();
while (it != vValues.end()) {
auto r = std::equal_range( it, vValues.end(), *it );
result.push_back({ *it, static_cast<std::size_t>(r.second - r.first) }); // cast avoids a narrowing conversion inside the braces
it = r.second;
}
After this is done, result will contain {value, count} for each value and will be sorted.
As all work was done in contiguous containers, it should be faster than your implementation.
If you aren't allowed to mutate vValues, one thing you could do is create a vector of gsl::span<char16_t> from it, sort that, and then create the result vector similarly. (If you don't have gsl::span, write one; they aren't hard to write.)
Failing that, even copying result once may be faster than your original solution.
Using a gsl::span<char16_t const> in counted would save some allocations as well (it reuses the storage within vValues, at the cost of tying their lifetimes together).
One serious concern is that if your strings are extremely long, determining that two strings are equal is expensive. And if they have common prefixes, determining they are different can be expensive. We do log(n) comparisons per distinct element in the equal_range code, and n log(n) in the sort; sometimes sorting (hash of string, string) pairs can be faster than sorting (string)s alone, as it makes unlike strings easy to detect.
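The "don't mutate vValues" variant can be sketched without gsl::span by sorting plain pointers to the strings instead. This is an assumption-laden illustration, not the answer's code: count_words is a made-up name, pointers stand in for spans, and a linear scan for the end of each run replaces equal_range.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Counts occurrences without modifying `words`: sort pointers to the
// strings, then measure each run of equal values.
std::vector<std::pair<std::u16string, std::size_t>>
count_words(const std::vector<std::u16string>& words)
{
    std::vector<const std::u16string*> ptrs;
    ptrs.reserve(words.size());
    for (const std::u16string& w : words)
        ptrs.push_back(&w);

    // Order the pointers by the strings they point to.
    std::sort(ptrs.begin(), ptrs.end(),
              [](const std::u16string* a, const std::u16string* b) { return *a < *b; });

    std::vector<std::pair<std::u16string, std::size_t>> result;
    auto it = ptrs.begin();
    while (it != ptrs.end()) {
        // Find the first pointer whose string differs from the current one.
        auto run_end = std::find_if(it, ptrs.end(),
            [it](const std::u16string* p) { return *p != **it; });
        result.emplace_back(**it, static_cast<std::size_t>(run_end - it));
        it = run_end;
    }
    return result;
}
```

The result is sorted by word, and each distinct string is copied once; the input vector is left in its original order.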
Live example with 4 different versions. Simply change the test1 to test2 or test3 or test4.
test3 is fastest in every test I did:
std::unordered_map<std::string, int> test3(std::vector<std::string> vValues)
{
std::unordered_map<std::string, int> mValues;
for( auto it = vValues.begin(); it != vValues.end(); ++it )
{
++mValues[std::move(*it)];
}
return mValues;
}
It outperformed all the other versions.
And here’s an alternative. You might consider storing a non-owning shared pointer, but if you can’t control the format of your inputs, Yakk’s suggestion of gsl::span might work. This is from the Guidelines Support Library.
std::unordered_map<std::u16string, unsigned> hash_corpus;
// constexpr float heuristic_parameter = ?;
// hash_corpus.max_load_factor(heuristic_parameter);
/* The maximum possible number of entries in the hash table is the size of
* the input vector.
*/
hash_corpus.reserve(corpus.size());
// Paul McKenzie suggested this trick in the comments:
for ( const std::u16string& s : corpus)
++hash_corpus[s]; // If the key is not in the table, [] inserts with value 0.

C++ elegant way to mark index which doesn't belong to a vector

I was wondering about a proper and elegant way to mark an index which doesn't belong to a vector/array. Here is a brief example of what I mean (using some pseudocode phrases):
std::vector<**type**> vector;
int getIndex()
{
if (**user has selected something**)
{
return **index of the thing in our vector**;
} else
return -1;
}
int main()
{
int selectedItem = getIndex();
if (selectedItem<vector.size()) //checking if selected index is valid, -1 is not
{
**do something using selected object**
}
}
Of course I mean to use it in a much more sophisticated way, but I hope the example shows the problem. Is it a good idea to mark an index which is not in a vector using the constant -1? It leads to a warning about comparing signed and unsigned values, but it still works as I want it to.
I don't want to additionally check whether my selectedItem variable is -1; that adds one more, unnecessary condition. So is this a good solution or should I consider something else?
The most elegant way to indicate that something you're looking for wasn't found in a vector is to use the C++ Standard Library facilities the way they were intended -- with iterators:
std::vector<type>::iterator it = std::find (vec.begin(), vec.end(), something_to_find);
if (it != vec.end())
{
// we found it
}
else
{
// we didn't find it -- it's not there
}
It's better to use iterators, but if you decide to stick with the indices, it's better to make getIndex return size_t as string::find() does:
size_t getIndex()
{
//...
return -1; // the same as std::numeric_limits<size_t>::max()
}
This way getIndex(element) < vec.size() if and only if the element is present in vector.
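A small sketch of that size_t convention follows. The names not_found and index_of are hypothetical and chosen for illustration; the sentinel mirrors std::string::npos.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed sentinel, mirroring std::string::npos: the largest size_t value.
constexpr std::size_t not_found = static_cast<std::size_t>(-1);

// Hypothetical helper: index of the first element equal to `value`,
// or not_found if there is none.
template <typename T>
std::size_t index_of(const std::vector<T>& vec, const T& value)
{
    auto it = std::find(vec.begin(), vec.end(), value);
    return it == vec.end() ? not_found
                           : static_cast<std::size_t>(it - vec.begin());
}
```

Because not_found is larger than any valid index, a single check such as index_of(vec, x) < vec.size() covers both the range test and the "nothing selected" case, with no signed/unsigned warning.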
If you insist on using integer indexes instead of iterators, then -1 is the usual sentinel value used to say "not found". However, instead of comparing against vec.size(), compare against 0 (selectedItem >= 0) to avoid the signed/unsigned mismatch.
#include <algorithm>
#include <vector>
struct SelectableItem {bool selected;/*more stuff here*/};
bool IsSelected(const SelectableItem& sel) {return sel.selected;}
int main(int argc, char** argv)
{
std::vector<SelectableItem> vec;
//populate vector
auto found = std::find_if(vec.begin(), vec.end(), IsSelected);
if (found != vec.end())
{
SelectableItem& selected_item = *found;
/*do something*/
}
}
Don't reinvent the wheel.
If you decide to use vec.end() then you can guard yourself against invalidated iterators (e.g. you insert an element in the vector after you have created the iterator) by compiling with -D_GLIBCXX_DEBUG in debug mode.
I would use -1 though, but use the size_t type everywhere. Iterators are so error prone and the ISO standard is ambiguous and diffuse when it comes to the details.

std::map - Element access without exception and without insertion

I have a recurrent pattern with the use of std::map.
I want to retrieve the value only when the key is present; otherwise I don't want to insert an element. Currently I'm using count(key) or find(key) (which one is better? From the documentation the complexity seems to be the same), and if it returns a positive result I then access the map. However, I would like to avoid performing two operations on the map. Something like:
map<string, int> myMap;
int returnValue;
bool result = myMap.get("key1", returnValue); // wished-for member; std::map has no get()
if(result){
// use returnValue
}
Reading the std::map documentation on cplusplus.com I found two functions for accessing map elements:
at(): which throws an exception if the key is not present
[]: which inserts a new value if the key is not present
Neither of them meets my needs.
Use map::find:
auto it = myMap.find(key);
if (it != myMap.end())
{
// use it->second
}
else
{
// not found
}
This part was easy. The harder problem is when you want to look up if an element exists and return it if it does, but otherwise insert a new element at that key, all without searching the map twice. For that you need to use lower_bound followed by hinted insertion.
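The lower_bound plus hinted insertion idiom might be sketched as follows. find_or_insert is a hypothetical name, and the map type is fixed to std::map<std::string, int> only to keep the example short.

```cpp
#include <map>
#include <string>

// Returns a reference to the value stored under `key`, inserting
// `fallback` first if the key is absent. The tree is searched only once.
int& find_or_insert(std::map<std::string, int>& m, const std::string& key, int fallback)
{
    // First element whose key is not less than `key` (or end()).
    auto it = m.lower_bound(key);
    if (it != m.end() && !m.key_comp()(key, it->first))
        return it->second;  // key already present

    // `it` is the position just after where `key` belongs, so it serves as the hint.
    return m.emplace_hint(it, key, fallback)->second;
}
```

The second half of the condition checks that key is not less than the found key; combined with lower_bound's guarantee that the found key is not less than key, that means the two are equivalent.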
Use count() to make sure the key exists,
then use find() to get the key/value pair (note that this searches the map twice):
if (myMap.count(key))
{
auto it = myMap.find(key);
// use it->second
}
else
{
// not found
}
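For completeness, the single-lookup find pattern can be wrapped so that it resembles the get(key, value) call from the question. This is only a sketch; try_get is a made-up name and nothing like it exists on std::map.

```cpp
#include <map>
#include <string>

// Hypothetical helper shaped like the get() the question asks for:
// copies the value into `out` and returns true only if `key` exists.
template <typename Map>
bool try_get(const Map& m, const typename Map::key_type& key,
             typename Map::mapped_type& out)
{
    auto it = m.find(key);  // one search, never inserts
    if (it == m.end())
        return false;       // `out` is left untouched
    out = it->second;
    return true;
}
```

Usage then matches the question's shape: int value; if (try_get(myMap, "key1", value)) { /* use value */ }.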

Searching in std::set without iterator C++

I have a std::set which contains some int values. Currently I use an iterator to find out whether the set contains a value.
But my application performs this search very often and searching with an iterator is too slow. Can I do something like this:
std::set<int> fdsockets;
void myfunc(int fd)
{
if(fdsockets[fd] != fdsockets.end())
{
// my code
}
}
But I get an error when compiling with G++:
no match for 'operator[]' in 'fdsockets[fd]'
Maybe i can use something instead of std::set?
Thanks!
std::unordered_set, or a sorted vector with binary search, is more efficient for a simple membership test. If the maximum value of the integers is low, a lookup table might be an alternative.
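The sorted-vector option could look something like this. SortedIntSet is a hypothetical class written for illustration; whether it actually beats std::set or std::unordered_set for your workload is something to measure.

```cpp
#include <algorithm>
#include <vector>

// Sketch of a membership set backed by a sorted std::vector<int>.
class SortedIntSet {
public:
    // Inserts `v` at its sorted position unless it is already present.
    void insert(int v)
    {
        auto it = std::lower_bound(data_.begin(), data_.end(), v);
        if (it == data_.end() || *it != v)
            data_.insert(it, v);
    }

    // O(log n) membership test over contiguous storage.
    bool contains(int v) const
    {
        return std::binary_search(data_.begin(), data_.end(), v);
    }

private:
    std::vector<int> data_;  // kept sorted and free of duplicates by insert()
};
```

Lookups stay logarithmic as with std::set, but the elements sit in one contiguous block; the trade-off is that each insertion may shift elements, so this suits sets that are read far more often than they change.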
It sounds like you want set::find()
if( fdsockets.find(fd) != fdsockets.end() )
{
// my code
}
There is no operator[] in std::set.
You probably mean
if(fdsockets.find(fd) != fdsockets.end())
If you don't need the iterator that set::find returns (you're just testing existence, not actually going to access the fdsocket), here's an alternative:
if(fdsockets.count(fd))
{
// my code
}