Searching in std::set without iterator C++

Searching in std::set without iterator C++ - c++

I have std::set which contains come int values. Now i use iterator to find out whether set contans value.
But my application use this search very ofter and search using iterator too slow, can i do something like that:
std::set<int> fdsockets;
void myfunc(int fd)
{
if(fdsockets[fd] != fdsockets.end())
{
// my code
}
}
But i have error when compile using G++
no match for 'operator[]' in 'fdsockets[fd]'
Maybe i can use something instead of std::set?
Thanks!

std::unorered_set or an ordered vector with binary search are more effective for simple membership test. If the maximum value of intergers is low a lookup table might be an alternative.

It sounds like you want set::find()
if( fdsockets.find(fd) != fdsockets.end() )
{
// my code
}

There is no operator[] in std::set.
You probably mean
if(fdsockets.find(fd) != fdsockets.end())

If you don't need the iterator that set::find returns (you're just testing existence, not actually going to access the fdsocket), here's an alternative:
if(fdsockets.count(fd))
{
// my code
}

Related

How to implement something like std::copy_if but apply a function before inserting into a different container

Full disclosure, this may be a hammer and nail situation trying to use STL algorithms when none are needed. I have seen a reappearing pattern in some C++14 code I am working with. We have a container that we iterate through, and if the current element matches some condition, then we copy one of the elements fields to another container.
The pattern is something like:
for (auto it = std::begin(foo); it!=std::end(foo); ++it){
auto x = it->Some_member;
// Note, the check usually uses the field would add to the new container.
if(f(x) && g(x)){
bar.emplace_back(x);
}
}
The idea is almost an accumulate where the function being applied does not always return a value. I can only think of a solutions that either
Require a function for accessing the member your want to accumulate and another function for checking the condition. i.e How to combine std::copy_if and std::transform?
Are worse then the thing I want to replace.
Is this even a good idea?

A quite general solution to your issue would be the following (working example):
#include <iostream>
#include <vector>
using namespace std;
template<typename It, typename MemberType, typename Cond, typename Do>
void process_filtered(It begin, It end, MemberType iterator_traits<It>::value_type::*ptr, Cond condition, Do process)
{
for(It it = begin; it != end; ++it)
{
if(condition((*it).*ptr))
{
process((*it).*ptr);
}
}
}
struct Data
{
int x;
int y;
};
int main()
{
// thanks to iterator_traits, vector could also be an array;
// kudos to #Yakk-AdamNevraumont
vector<Data> lines{{1,2},{4,3},{5,6}};
// filter even numbers from Data::x and output them
process_filtered(std::begin(lines), std::end(lines), &Data::x, [](int n){return n % 2 == 0;}, [](int n){cout << n;});
// output is 4, the only x value that is even
return 0;
}
It does not use STL, that is right, but you merely pass an iterator pair, the member to lookup and two lambdas/functions to it that will first filter and second use the filtered output, respectively.
I like your general solutions but here you do not need to have a lambda that extracts the corresponding attribute.
Clearly, the code can be refined to work with const_iterator but for a general idea, I think, it should be helpful. You could also extend it to have a member function that returns a member attribute instead of a direct member attribute pointer, if you'd like to use this method for encapsulated classes.

Sure. There are a bunch of approaches.
Find a library with transform_if, like boost.
Find a library with transform_range, which takes a transformation and range or container and returns a range with the value transformed. Compose this with copy_if.
Find a library with filter_range like the above. Now, use std::transform with your filtered range.
Find one with both, and compose filtering and transforming in the appropriate order. Now your problem is just copying (std::copy or whatever).
Write your own back-inserter wrapper that transforms while inserting. Use that with std::copy_if.
Write your own range adapters, like 2 3 and/or 4.
Write transform_if.

Have in C++ analog ContainsKey (TKey) from C# - what is best practics?

Have in C++ analog IDictionary.ContainsKey (TKey) or List.Contains (T) from C# ?
For example I have array of elements and need to know have this array some value or not ?
What is best way or best practics - without "foreach" for each element !
It will be good if it's will from std library for example.
UPD 1: In std lib have many containers, but I want to find a best way - faster, little code, less complicated and so on ...
Lookind that better desigion is std::unordered_set if going on this logic
#include <unordered_set>
std::unordered_set<std::string> NamesOfValues = {
"one",
"two",
"Date",
"Time"
};
// and now check is value exists in set
if(NamesOfValues.count(value))
{
// value exists
}

It seems most concise to use count, and this should work for any container.
if ( my_map.count(key) ) { // shorthand for `count(key) != 0`
// It exists
} else {
// It does not
}
If we're talking about [unordered_]map and [unordered_]set, which are closest to your original dictionary type, then these containers enforce unique keys, so the returned .count() can only be 0 or 1, and there's no need to worry about the code pointlessly iterating the rest of the container once it finds a match (as would occur for containers supporting duplicates)
Either way, simply using implicit conversion to bool leads to the most concise code. And if you end up having a design that might allow/need different counts per key, then you could compare against a specific value.

Your are looking for std::find. Find looks for an arbitrary type input to an arbitrary iterable and returns an iterator to that element.
For example, to find an element in a dictionary you can do the following:
std::unordered_map<char,int> my_map = { {'a', 1},{'b', 2}};
auto found_element = std::find(my_map.begin(), my_map.end(), 'a');
if( found_element == my_map.end() ){
//nothing was found
}
else{
// do something
}
For standard map you also have map.find(T) for O(1) access instead of O(n).
if( my_map.find('a') != my_map.end() ){
//something was found!
}
else{
//nothing was found
}
This is more clear than my_map.count()... you would only use that if you were actually trying to figure out how many elements you have and if you were using non unique keys.

Why does std::set not have a "contains" member function?

I'm heavily using std::set<int> and often I simply need to check if such a set contains a number or not.
I'd find it natural to write:
if (myset.contains(number))
...
But because of the lack of a contains member, I need to write the cumbersome:
if (myset.find(number) != myset.end())
..
or the not as obvious:
if (myset.count(element) > 0)
..
Is there a reason for this design decision ?

I think it was probably because they were trying to make std::set and std::multiset as similar as possible. (And obviously count has a perfectly sensible meaning for std::multiset.)
Personally I think this was a mistake.
It doesn't look quite so bad if you pretend that count is just a misspelling of contains and write the test as:
if (myset.count(element))
...
It's still a shame though.

To be able to write if (s.contains()), contains() has to return a bool (or a type convertible to bool, which is another story), like binary_search does.
The fundamental reason behind the design decision not to do it this way is that contains() which returns a bool would lose valuable information about where the element is in the collection. find() preserves and returns that information in the form of an iterator, therefore is a better choice for a generic library like STL. This has always been the guiding principle for Alex Stepanov, as he has often explained (for example, here).
As to the count() approach in general, although it's often an okay workaround, the problem with it is that it does more work than a contains() would have to do.
That is not to say that a bool contains() isn't a very nice-to-have or even necessary. A while ago we had a long discussion about this very same issue in the
ISO C++ Standard - Future Proposals group.

It lacks it because nobody added it. Nobody added it because the containers from the STL that the std library incorporated where designed to be minimal in interface. (Note that std::string did not come from the STL in the same way).
If you don't mind some strange syntax, you can fake it:
template<class K>
struct contains_t {
K&& k;
template<class C>
friend bool operator->*( C&& c, contains_t&& ) {
auto range = std::forward<C>(c).equal_range(std::forward<K>(k));
return range.first != range.second;
// faster than:
// return std::forward<C>(c).count( std::forward<K>(k) ) != 0;
// for multi-meows with lots of duplicates
}
};
template<class K>
containts_t<K> contains( K&& k ) {
return {std::forward<K>(k)};
}
use:
if (some_set->*contains(some_element)) {
}
Basically, you can write extension methods for most C++ std types using this technique.
It makes a lot more sense to just do this:
if (some_set.count(some_element)) {
}
but I am amused by the extension method method.
The really sad thing is that writing an efficient contains could be faster on a multimap or multiset, as they just have to find one element, while count has to find each of them and count them.
A multiset containing 1 billion copies of 7 (you know, in case you run out) can have a really slow .count(7), but could have a very fast contains(7).
With the above extension method, we could make it faster for this case by using lower_bound, comparing to end, and then comparing to the element. Doing that for an unordered meow as well as an ordered meow would require fancy SFINAE or container-specific overloads however.

You are looking into particular case and not seeing bigger picture. As stated in documentation std::set meets requirement of AssociativeContainer concept. For that concept it does not make any sense to have contains method, as it is pretty much useless for std::multiset and std::multimap, but count works fine for all of them. Though method contains could be added as an alias for count for std::set, std::map and their hashed versions (like length for size() in std::string ), but looks like library creators did not see real need for it.

Although I don't know why std::set has no contains but count which only ever returns 0 or 1,
you can write a templated contains helper function like this:
template<class Container, class T>
auto contains(const Container& v, const T& x)
-> decltype(v.find(x) != v.end())
{
return v.find(x) != v.end();
}
And use it like this:
if (contains(myset, element)) ...

The true reason for set is a mystery for me, but one possible explanation for this same design in map could be to prevent people from writing inefficient code by accident:
if (myMap.contains("Meaning of universe"))
{
myMap["Meaning of universe"] = 42;
}
Which would result in two map lookups.
Instead, you are forced to get an iterator. This gives you a mental hint that you should reuse the iterator:
auto position = myMap.find("Meaning of universe");
if (position != myMap.cend())
{
position->second = 42;
}
which consumes only one map lookup.
When we realize that set and map are made from the same flesh, we can apply this principle also to set. That is, if we want to act on an item in the set only if it is present in the set, this design can prevent us from writing code as this:
struct Dog
{
std::string name;
void bark();
}
operator <(Dog left, Dog right)
{
return left.name < right.name;
}
std::set<Dog> dogs;
...
if (dogs.contain("Husky"))
{
dogs.find("Husky")->bark();
}
Of course all this is a mere speculation.

Since c++20,
bool contains( const Key& key ) const
is available.

I'd like to point out , as mentioned by Andy, that since C++20 the standard added the contains Member function for maps or set:
bool contains( const Key& key ) const; (since C++20)
Now I'd like to focus my answer regarding performance vs readability.
In term of performance if you compare the two versions:
#include <unordered_map>
#include <string>
using hash_map = std::unordered_map<std::string,std::string>;
hash_map a;
std::string get_cpp20(hash_map& x,std::string str)
{
if(x.contains(str))
return x.at(str);
else
return "";
};
std::string get_cpp17(hash_map& x,std::string str)
{
if(const auto it = x.find(str); it !=x.end())
return it->second;
else
return "";
};
You will find that the cpp20 version takes two calls to std::_Hash_find_last_result while the cpp17 takes only one call.
Now I find myself with many data structure with nested unordered_map.
So you end up with something like this:
using my_nested_map = std::unordered_map<std::string,std::unordered_map<std::string,std::unordered_map<int,std::string>>>;
std::string get_cpp20_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
if(x.contains(level1) &&
x.at(level1).contains(level2) &&
x.at(level1).at(level2).contains(level3))
return x.at(level1).at(level2).at(level3);
else
return "";
};
std::string get_cpp17_nested(my_nested_map& x,std::string level1,std::string level2,int level3)
{
if(const auto it_level1=x.find(level1); it_level1!=x.end())
if(const auto it_level2=it_level1->second.find(level2);it_level2!=it_level1->second.end())
if(const auto it_level3=it_level2->second.find(level3);it_level3!=it_level2->second.end())
return it_level3->second;
return "";
};
Now if you have plenty of condition in-between these ifs, using the iterator really is painful, very error prone and unclear, I often find myself looking back at the definition of the map to understand what kind of object was at level 1 or level2, while with the cpp20 version , you see at(level1).at(level2).... and understand immediately what you are dealing with.
So in term of code maintenance/review, contains is a very nice addition.

What about binary_search ?
set <int> set1;
set1.insert(10);
set1.insert(40);
set1.insert(30);
if(std::binary_search(set1.begin(),set1.end(),30))
bool found=true;

contains() has to return a bool. Using C++ 20 compiler I get the following output for the code:
#include<iostream>
#include<map>
using namespace std;
int main()
{
multimap<char,int>mulmap;
mulmap.insert(make_pair('a', 1)); //multiple similar key
mulmap.insert(make_pair('a', 2)); //multiple similar key
mulmap.insert(make_pair('a', 3)); //multiple similar key
mulmap.insert(make_pair('b', 3));
mulmap.insert({'a',4});
mulmap.insert(pair<char,int>('a', 4));
cout<<mulmap.contains('c')<<endl; //Output:0 as it doesn't exist
cout<<mulmap.contains('b')<<endl; //Output:1 as it exist
}

Another reason is that it would give a programmer the false impression that std::set is a set in the math set theory sense. If they implement that, then many other questions would follow: if an std::set has contains() for a value, why doesn't it have it for another set? Where are union(), intersection() and other set operations and predicates?
The answer is, of course, that some of the set operations are already implemented as functions in (std::set_union() etc.) and other are as trivially implemented as contains(). Functions and function objects work better with math abstractions than object members, and they are not limited to the particular container type.
If one need to implement a full math-set functionality, he has not only a choice of underlying container, but also he has a choice of implementation details, e.g., would his theory_union() function work with immutable objects, better suited for functional programming, or would it modify its operands and save memory? Would it be implemented as function object from the start or it'd be better to implement is a C-function, and use std::function<> if needed?
As it is now, std::set is just a container, well-suited for the implementation of set in math sense, but it is nearly as far from being a theoretical set as std::vector from being a theoretical vector.

C++ elegant way to mark index which doesn't belong to a vector

I was wondering about a proper and elegant way to mark index which doesn't belong to a vector/an array. Let me show you a brief example showing what I mean (using some pseudocode phrases):
std::vector<**type**> vector;
int getIndex()
{
if (**user has selected something**)
{
return **index of the thing in our vector**;
} else
return -1;
}
int main()
{
int selectedItem = getIndex();
if (selectedItem<vector.size()) //checking if selected index is valid, -1 is not
{
**do something using selected object**
}
}
Of course I mean to use it in much more sophisticated way, but I hope the problem is shown in the example. Is it a good idea to mark an index which is not in a vector using -1 constans? It leads to a warning about comparing signed and unsigned values, but still it works as I want it to.
I don't want to check additionaly if my selectedItem variable is -1, that gives one additional, unnecessary condition. So is this a good solution or should I consider something else?

The most elegant way to indicate that something you're looking for wasn't found in a vector is to use the C++ Standard Library facilities the way they were intended -- with iterators:
std::vector<type>::iterator it = std::find (vec.begin(), vec.end(), something_to_find);
if (it != vec.end())
{
// we found it
}
else
{
// we didn't find it -- it's not there
}

It's better to use iterators, but if you decide to stick with the indices, it's better to make getIndex return size_t as string::find() does:
size_t getIndex()
{
//...
return -1; // the same as std::numeric_limits<size_t>::max()
}
This way getIndex(element) < vec.size() if and only if the element is present in vector.

If you insist on using integer indexes instead of iterators, then -1 is the usual sentinel value used to say "not found". However instead of comparing against vec.size() you should compare to 0 instead, to avoid the signed/unsigned mismatch.

struct SelectableItem {bool selected;/*more suff here*/};
struct IsSelected(const SelectableItem& sel) {return sel.selected;}
int main(int argc, char** argv)
{
std::vector<SelectableItem> vec;
//populate vector
auto found = vec.find_if(vec.begin(), vec.end(), IsSelected());
if (found != vec.end())
{
SelectedItem& selected_item = *found;
/*do something*/
}
}
Don't reinvent the wheel.

If you decide to use vec.end() then you can guard yourself against invalidated iterators (e.g. you insert an element in the vector after you have created the iterator) by compiling with -D_GLIBCXX_DEBUG in debug mode.
I would use -1 though, but use the size_t type everywhere. Iterators are so error prone and the ISO standard is ambiguous and diffuse when it comes to the details.

Skipping iterator

I have a sequence of values that I'd like to pass to a function that takes a (iterator begin, iterator end) pair. However, I only want every second element in the original sequence to be processed.
Is there a nice way using Standard-Lib/Boost to create an iterator facade that will allow me to pass in the original sequence? I figured something simple like this would already be in the boost iterators or range libraries, but I didn't find anything.
Or am I missing another completely obvious way to do this? Of course, I know I always have the option of copying the values to another sequence, but that's not what I want to do.
Edit: I know about filter_iterator, but that filters on values - it doesn't change the way the iteration advances.

I think you want boost::adaptors::strided

struct TrueOnEven {
template< typename T >
bool operator()(const T&) { return mCount++ % 2 == 0; }
TrueOnEven() : mCount(0) {}
private:
int mCount;
};
int main() {
std::vector< int > tVec, tOtherVec;
...
typedef boost::filter_iterator< TrueOnEven, int > TakeEvenFilterType;
std::copy(
TakeEvenFilterType(tVec.begin(), tVec.end()),
TakeEvenFilterType(tVec.end(), tVec.end()),
std::back_inserter(tOtherVec));
}
To be honest, this is anything else than nice and intuitive. I wrote a simple "Enumerator" library including lazy integrated queries to avoid hotchpotch like the above. It allows you to write:
Query::From(tVec.begin(), tVec.end())
.Skip<2>()
.ToStlSequence(std::back_inserter(tOtherVec));
where Skip<2> basically instantiates a generalized "Filter" which skips every N-th (in this case every second) element.

Here's Boost's filter iterator. It is exactly what you want.
UPDATE: Sorry, read wrongly-ish. Here's a list of all iterator funkiness in Boost:
http://www.boost.org/doc/libs/1_46_1/libs/iterator/doc/#specialized-adaptors
I think a plain iterator_adaptor with an overloaded operator++ that increments the underlying iterator value twice is all you need.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Searching in std::set without iterator C++ - c++

std::unorered_set or an ordered vector with binary search are more effective for simple membership test. If the maximum value of intergers is low a lookup table might be an alternative.

It sounds like you want set::find() if( fdsockets.find(fd) != fdsockets.end() ) { // my code }

There is no operator[] in std::set. You probably mean if(fdsockets.find(fd) != fdsockets.end())

If you don't need the iterator that set::find returns (you're just testing existence, not actually going to access the fdsocket), here's an alternative: if(fdsockets.count(fd)) { // my code }

Related

How to implement something like std::copy_if but apply a function before inserting into a different container

Have in C++ analog ContainsKey (TKey) from C# - what is best practics?

Why does std::set not have a "contains" member function?

C++ elegant way to mark index which doesn't belong to a vector

Skipping iterator

Categories

Resources