What is a C++ container with a "contains" operation? - c++

I want to use a structure in which I insert integers, and then can ask
if (container.contains(3)) { /**/ }
There has to be something like this.

You can use std::vector.
std::vector<int> myVec;
myVec.push_back(3);
if (std::find(myVec.begin(), myVec.end(), 3) != myVec.end())
{
// do your stuff
}
You can even make a little helper function:
template <class T>
bool contains(const std::vector<T> &vec, const T &value)
{
return std::find(vec.begin(), vec.end(), value) != vec.end();
}
Here is how you would use it:
if (contains(myVec, 3)) { /*...*/ }

Simple algorithm:
template <typename Container>
bool contains(Container const& c, typename Container::const_reference v) {
return std::find(c.begin(), c.end(), v) != c.end();
}
You can customize it for more efficient search on some known containers:
template <typename Key, typename Cmp, typename Alloc>
bool contains(std::set<Key,Cmp,Alloc> const& s, Key const& k) {
return s.find(k) != s.end();
}
template <typename Key, typename Value, typename Cmp, typename Alloc>
bool contains(std::map<Key,Value,Cmp,Alloc> const& m, Key const& k) {
return m.find(k) != m.end();
}
And this way you obtain a single algorithm that performs the search on any container type, and is special cased to be faster on those containers which are ordered.

find on an unsorted vector is O(n).
std::set supports O(log n) insertions and lookups and is a good choice.
std::tr1::unordered_set provides a similar interface but supports near-constant-time lookups. It is the best choice if you have TR1 (or C++0x) and do not need to enumerate the elements in order.
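As a rough illustration of the set-based options above, here is a minimal sketch. It assumes a C++11 compiler (so std::unordered_set rather than the TR1 version), and the names contains_key and membership_demo are made up for this example.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>

// Works for std::set and std::unordered_set alike: both offer find() and end().
template <typename SetLike, typename Key>
bool contains_key(const SetLike& s, const Key& k) {
    return s.find(k) != s.end();
}

// Hypothetical usage helper, only here to show the calls.
inline void membership_demo() {
    std::set<int> ordered;            // O(log n) insert and lookup, sorted iteration
    std::unordered_set<int> hashed;   // average O(1) insert and lookup, no ordering
    ordered.insert(3);
    hashed.insert(3);
    assert(contains_key(ordered, 3));
    assert(!contains_key(ordered, 4));
    assert(contains_key(hashed, 3));
    assert(!contains_key(hashed, 4));
}
```

Both containers also have count(), which returns 0 or 1 for a set and can serve as the membership test directly.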

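What you want is std::find from the algorithms library (or std::binary_search on a sorted range, or anything along those lines). Note that std::find_first_of, linked below, searches for any one of several values rather than for a single value.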
http://www.cplusplus.com/reference/algorithm/find_first_of/

If you want to use a C++ standard container, due to its design, the containers themselves do not necessarily have "contains", but you can always use the find algorithm.
You should choose your container according to the characteristics of your dataset and the access "workload".
For a good reference of the containers and algorithms on the C++ standard library check http://www.cplusplus.com
Containers, Algorithms
If as it seems, your data is made of unique items, for which you want to associate a value, you probably will be well served by the map container. If all you care about is "membership", then set is a better choice.

Related

Template to check if vector and map contains value

I'm a beginner in C++. I was searching for templates that could check whether a vector or map, independent of its data type, contains a given value, and I found these:
template <typename Container, typename Value>
bool vector_contains(const Container& c, const Value& v)
{
return std::find(std::begin(c), std::end(c), v) != std::begin(c);
}
template< typename container, typename key >
auto map_contains(container const& c, key const& k)
-> decltype(c.find(k) != c.end())
{
return c.find(k) != c.end();
}
My doubt is, does using templates to do this kind of verification impact performance somehow?
I have found these
Ok, but do analyze them. They are sub optimal and/or plain wrong.
template <typename Container, typename Value>
bool vector_contains(const Container& c, const Value& v)
{
return std::find(std::begin(c), std::end(c), v) != std::begin(c);
}
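This will return true whenever v is not the first element of a non-empty container, including when v is not found at all, because the result of std::find is compared against std::begin(c). The comparison should be against std::end(c).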
A vector, without any other information, is unsorted, which means that contains will have to search from the first element to the last if the value is not found. Such searches are considered expensive.
If you on the other hand std::sort the vector and use the same Comparator when using std::binary_search, it'll have a quicker lookup. Sorting takes time too, though.
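A small sketch of both points, under the assumption that the element type supports == and <. The names linear_contains, sorted_contains and contains_demo are invented for this example and are not from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Linear search: the result of std::find is compared against end(), not begin().
template <typename Container, typename Value>
bool linear_contains(const Container& c, const Value& v) {
    return std::find(std::begin(c), std::end(c), v) != std::end(c);
}

// Binary search: only valid if c is already sorted with operator<.
template <typename Container, typename Value>
bool sorted_contains(const Container& c, const Value& v) {
    return std::binary_search(std::begin(c), std::end(c), v);
}

// Hypothetical usage helper.
inline void contains_demo() {
    std::vector<int> v = {5, 1, 4};
    assert(linear_contains(v, 5));    // the first element is found
    assert(!linear_contains(v, 7));   // a missing element is reported as missing
    std::sort(v.begin(), v.end());    // sort once, then query many times
    assert(sorted_contains(v, 4));
    assert(!sorted_contains(v, 3));
}
```

Sorting only pays off when the vector is queried many times between modifications.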
template< typename container, typename key >
auto map_contains(container const& c, key const& k) -> decltype(c.find(k) != c.end())
{
return c.find(k) != c.end();
}
This looks like it may work for types matching the function template. If it's meant to be used with maps and C++20 is available, it should use map::contains instead.

Is there a flat unsorted map/set implementation?

There is the boost.container flat_map and others, and the Loki AssocVector and many others like these which keep the elements sorted.
Is there a modern (c++11 move-enabled, etc.) implementation of an unsorted vector adapted as a map/set?
The idea is to use it for very small maps/sets (less than 20 elements) and with simple keys (for which hashing wouldn't always make sense)
Something like this?
template<class Key, class Value, template<class...>class Storage=std::vector>
struct flat_map {
struct kv {
Key k;
Value v;
template<class K, class V>
kv( K&& kin, V&& vin ):k(std::forward<K>(kin)), v(std::forward<V>(vin)){}
};
using storage_t = Storage<kv>;
storage_t storage;
// TODO: adl upgrade
using iterator=decltype(std::begin(std::declval<storage_t&>()));
using const_iterator=decltype(std::begin(std::declval<const storage_t&>()));
// boilerplate:
iterator begin() {
using std::begin;
return begin(storage);
}
const_iterator begin() const {
using std::begin;
return begin(storage);
}
const_iterator cbegin() const {
using std::begin;
return begin(storage);
}
iterator end() {
using std::end;
return end(storage);
}
const_iterator end() const {
using std::end;
return end(storage);
}
const_iterator cend() const {
using std::end;
return end(storage);
}
size_t size() const {
return storage.size();
}
bool empty() const {
return storage.empty();
}
// these only have to be valid if called:
void reserve(size_t n) {
storage.reserve(n);
}
size_t capacity() const {
return storage.capacity();
}
// map-like interface:
// TODO: SFINAE check for type of key
template<class K>
Value& operator[](K&& k){
auto it = find(k);
if (it != end()) return it->v;
storage.emplace_back( std::forward<K>(k), Value{} );
return storage.back().v;
}
private: // C++14, but you can just inject the lambda at point of use in 11:
template<class K>
auto key_match( K& k ) {
return [&k](kv const& kv){
return kv.k == k;
};
}
public:
template<class K>
iterator find(K&& k) {
return std::find_if( begin(), end(), key_match(k) );
}
template<class K>
const_iterator find(K&& k) const {
return const_cast<flat_map*>(this)->find(k);
}
// iterator-less query functions:
template<class K>
Value* get(K&& k) {
auto it = find(std::forward<K>(k));
if (it==end()) return nullptr;
return std::addressof(it->v);
}
template<class K>
Value const* get(K&& k) const {
return const_cast<flat_map*>(this)->get(std::forward<K>(k));
}
// key-based erase: (SFINAE should be is_comparable, but that doesn't exist)
template<class K, class=std::enable_if_t<std::is_convertible<K, Key>{}>>
bool erase(K&& k) {
// remove_if takes a predicate; key_match binds k by reference, so pass the lvalue
auto it = std::remove_if(
storage.begin(), storage.end(), key_match(k)
);
if (it == storage.end()) return false;
storage.erase( it, storage.end() );
return true;
}
// classic erase, for iterating:
iterator erase(const_iterator it) {
return storage.erase(it);
}
template<class K2, class V2,
class=std::enable_if_t<
std::is_convertible< K2, Key >{}&&
std::is_convertible< V2, Value >{}
>
>
void set( K2&& kin, V2&& vin ) {
auto it = find(kin);
if (it != end()){
it->v = std::forward<V2>(vin);
return;
} else {
storage.emplace_back( std::forward<K2>(kin), std::forward<V2>(vin) );
}
}
};
I left the container type as a template argument, so you can use a SBO vector-like structure if you choose.
In theory, I should expose a template parameter for replacing equals on the keys. I did, however, make the key-search functions transparent.
If the sets are sure to be small then you can just use a std::vector (or std::deque) and do lookup using linear searches. An O(n) linear search over a small vector can be faster than an O(log(n)) search over a more complicated structure such as a red-black tree.
So you could just put elements in a vector without sorting them. You would still need to do some shuffling if you remove elements, but that will always be true for a flat vector-like structure whether it's sorted or not, unless you only ever remove the back element. Why is it important that it's flat anyway?
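For what it's worth, a minimal sketch of such an unsorted vector used as a set might look like this. The class name small_vector_set is made up. Since order does not matter, erase can overwrite the removed slot with the last element instead of shifting the tail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tiny unsorted set backed by a vector; every lookup is a linear scan.
template <typename T>
class small_vector_set {
public:
    bool contains(const T& value) const {
        return std::find(items_.begin(), items_.end(), value) != items_.end();
    }
    // Returns true if the value was inserted, false if it was already present.
    bool insert(const T& value) {
        if (contains(value)) return false;
        items_.push_back(value);
        return true;
    }
    // Returns true if the value was present and has been removed.
    bool erase(const T& value) {
        typename std::vector<T>::iterator it =
            std::find(items_.begin(), items_.end(), value);
        if (it == items_.end()) return false;
        // Move the last element into the hole (unless the hole is the last slot).
        if (it != items_.end() - 1) *it = std::move(items_.back());
        items_.pop_back();
        return true;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Hypothetical usage helper.
inline void small_vector_set_demo() {
    small_vector_set<int> s;
    const bool first = s.insert(3);
    const bool second = s.insert(7);
    const bool duplicate = s.insert(3);
    assert(first && second && !duplicate);
    const bool erased = s.erase(3);
    assert(erased && !s.contains(3) && s.contains(7) && s.size() == 1);
}
```

Note that erasing this way changes the iteration order of the remaining elements.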
There's std::unordered_set and std::unordered_map but as far as I know they are not implemented using vectors.
A possible option is to write your own hash vector and hash the key using std::hash<Key> and then index the resulting number modulo the length of the vector, but then you'll have to figure out a way to handle collisions and all the resulting problems manually. I'm not sure I'd recommend that.
An alternative would be to pass a custom allocator to std::unordered_set and std::unordered_map which performs the allocation on a vector (for example by owning an internal vector), as suggested by @BeyelerStudios.
Evgeny Panasyuk is correct, I believe what you want is an Open Address Hash Map.
This fits exactly your requirement, only 1 flat buffer, no allocation of nodes, no pointers to follow, and unsorted.
Otherwise you also have flat_map/AssocVector, but they are sorted, unlike your requirement.
For OAHM, I have an implementation of a STL-like generic one here:
https://sourceforge.net/projects/cgenericopenaddresshashmap/
Also you might want to take a look at the benchmark page of flat_map:
boost::flat_map and its performance compared to map and unordered_map
The OAHM is performing very close to the flat_map in all tests, except iteration.
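To make the open-addressing idea concrete, here is a deliberately small sketch. It is not the linked implementation: the class probing_set is invented for this example, has a fixed capacity, supports no erase, requires a default-constructible key, and keeps the occupancy flags in a second vector rather than in one buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical fixed-capacity open-addressing set using linear probing.
template <typename Key>
class probing_set {
public:
    explicit probing_set(std::size_t capacity)
        : slots_(capacity), used_(capacity, 0), size_(0) {}

    // Returns false if the key is already present or the table is full.
    bool insert(const Key& key) {
        std::size_t i = 0;
        if (locate(key, i)) return false;           // already present
        if (size_ == slots_.size()) return false;   // no free slot left
        slots_[i] = key;                            // i is the first free slot
        used_[i] = 1;
        ++size_;
        return true;
    }
    bool contains(const Key& key) const {
        std::size_t i = 0;
        return locate(key, i);
    }
private:
    // Probes from the key's home slot. Returns true with i at the key's slot,
    // or false with i at the first free slot on the probe path (if any).
    bool locate(const Key& key, std::size_t& i) const {
        const std::size_t n = slots_.size();
        if (n == 0) return false;
        i = std::hash<Key>()(key) % n;
        for (std::size_t step = 0; step < n; ++step) {
            if (!used_[i]) return false;        // free slot: the key is absent
            if (slots_[i] == key) return true;
            i = (i + 1) % n;                    // collision: try the next slot
        }
        return false;                           // table is full and key is absent
    }
    std::vector<Key> slots_;
    std::vector<char> used_;
    std::size_t size_;
};

// Hypothetical usage helper.
inline void probing_set_demo() {
    probing_set<int> s(8);
    const bool a = s.insert(3);
    const bool b = s.insert(11);
    const bool again = s.insert(3);
    assert(a && b && !again);
    assert(s.contains(3) && s.contains(11) && !s.contains(4));
}
```

A real implementation would also need growth and a deletion strategy (tombstones or backward shifting), which this sketch leaves out.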
Please look at the sfl library that I have recently uploaded to GitHub: https://github.com/slavenf/sfl-library
It is a C++11 header-only library that offers flat ordered and unordered containers that store elements contiguously in memory. All containers meet the requirements of Container, AllocatorAwareContainer and ContiguousContainer. The library is licensed under the zlib license.

accessing map with C++98 standard

I have the following C++11 compatible code and I need to compile it with C++98 which doesn't have support for '.at'. How to rewrite it to be compatible with C++98 ?
String suffix("sss");
headers_t& meta = ...;
typedef std::map<std::string, std::string> headerpair_t;
typedef std::map<std::string, headerpair_t> addheader_t;
addheader_t addheader;
for(headerpair_t::const_iterator piter = addheader.at(suffix).begin(); piter != addheader.at(suffix).end(); ++piter){
// Adding header
meta[(*piter).first] = (*piter).second;
}
Just create an at() function which mimics what C++11 std::map<...>::at() does:
template <typename K, typename V, typename C, typename A>
V const& at(std::map<K, V, C, A> const& m, K const& k) {
typename std::map<K, V, C, A>::const_iterator it(m.find(k));
if (it == m.end()) {
throw std::out_of_range("key not found in map");
}
return it->second;
}
Note that calling at() in each iteration of a loop is a Bad Idea! Searching a std::map<...> is efficient in the theoretical sense but that doesn't mean that it is fast in practice! You are much better off searching for the relevant node just once and then keeping it around.
You shouldn't use at() in a for loop condition like that. The element does not change between iterations and there is an overhead in retrieving it at every turn. So you should just retrieve it using find and then loop on the iterators:
addheader_t::const_iterator header_iter = addheader.find(suffix); // Retrieve the element
if (header_iter != addheader.end()) // Check that it does exist
{
// Retrieve the sub-map in the pair
const headerpair_t& header_pair_map = header_iter->second;
// Loop on the elements
for (headerpair_t::const_iterator it = header_pair_map.begin(); it != header_pair_map.end(); ++it)
{
// Use insert to avoid a useless element construction
// Use also `std::make_pair`, but can we directly insert the pair from headerpair ?
meta.insert(std::make_pair(it->first, it->second));
}
}
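Putting the pieces together, a self-contained C++98-style sketch could look like the following. The question does not show headers_t, so treating it as a std::map<std::string, std::string> is an assumption, and the function name merge_headers is invented for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> headerpair_t;
typedef std::map<std::string, headerpair_t> addheader_t;
typedef std::map<std::string, std::string> headers_t; // assumption: not shown in the question

// Copies the headers stored under `suffix` into `meta`.
// Returns false if there is no entry for `suffix` (where at() would throw).
bool merge_headers(const addheader_t& addheader, const std::string& suffix,
                   headers_t& meta) {
    addheader_t::const_iterator found = addheader.find(suffix); // single lookup
    if (found == addheader.end()) return false;
    const headerpair_t& headers = found->second;
    for (headerpair_t::const_iterator it = headers.begin(); it != headers.end(); ++it) {
        meta[it->first] = it->second; // overwrites existing keys, like the original loop
    }
    return true;
}

// Hypothetical usage helper.
inline void merge_headers_demo() {
    addheader_t addheader;
    addheader["sss"]["Content-Type"] = "text/plain";
    headers_t meta;
    meta["Content-Type"] = "old";
    const bool found = merge_headers(addheader, "sss", meta);
    const bool missing = merge_headers(addheader, "missing", meta);
    assert(found && !missing);
    assert(meta["Content-Type"] == "text/plain");
}
```

One design point to be aware of: meta[key] = value replaces an existing entry, whereas meta.insert(...) leaves an existing entry untouched, so the two are only interchangeable when the keys are new.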

Get all vector elements that don't belong to another vector

In C# if I want to get all elements in a List List1, which don't belong to another List List2 I can do
var result = List1.Except(List2);
Is there something equivalent for std::vectors in C++? (C++11 is allowed)
The following populates List3 with the content from List1 that is not in List2. I hope it is what you're looking for:
std::vector<Type> List1, List2;
//
// populate List1 and List2
//
std::vector<Type> List3;
std::copy_if(List1.begin(), List1.end(), std::back_inserter(List3),
[&List2](const Type& arg)
{ return (std::find(List2.begin(), List2.end(), arg) == List2.end());});
Alternatively, the same filter can be written with std::remove_copy_if. Performance should be essentially the same, since std::find stops at the first match in both versions. Note the logic flip in the predicate:
std::vector<Type> List3;
std::remove_copy_if(List1.begin(), List1.end(), std::back_inserter(List3),
[&List2](const Type& arg)
{ return (std::find(List2.begin(), List2.end(), arg) != List2.end());});
You need to write your own function something like this:
for (auto element : List1)
{
auto it = std::find(List2.begin(), List2.end(), element);
if(it == List2.end())
{
result.push_back(element);
}
}
You should consider if a std::list is the right data structure for that, as it is - at least in C++ - not sorted by default, so in the worst case you will have to iterate size(list2) times through all elements of the list1, using an algorithm like Asha pointed out.
A better approach would be the use of an ordered container, e.g. multiset and use std::set_difference to create a result.
For any arbitrary container you can always use the std::remove_if + container::erase combination:
template <typename Cont, typename FwdIt>
void subtract(Cont& cont, FwdIt first, FwdIt last) {
using std::begin; using std::end;
using const_reference = typename Cont::value_type const&;
cont.erase(std::remove_if(begin(cont), end(cont),
[first, last](const_reference value){
return last != std::find(first, last, value);
}), end(cont));
}
template <typename Cont1, typename Cont2>
void subtract(Cont1& cont1, Cont2 const& cont2) {
using std::begin; using std::end;
subtract(cont1, begin(cont2), end(cont2));
}
In the case of std::list you can overload the subtract function, because std::list has a dedicated remove_if member function:
template <typename T, typename Alloc, typename FwdIt>
void subtract(std::list<T, Alloc>& l, FwdIt first, FwdIt last) {
l.remove_if([first, last](T const& value){
return last != std::find(first, last, value);
});
}
template <typename T, typename Alloc, typename Cont>
void subtract(std::list<T, Alloc>& l, Cont const& cont) {
using std::begin; using std::end;
subtract(l, begin(cont), end(cont));
}
These implementations are generic and make no assumption about the sorting of the sequences.
If only your second container is guaranteed to be sorted, you can use std::binary_search instead of find.
If both sequences are sorted, you should use std::set_difference.
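A minimal sketch of the std::set_difference route, assuming the element type has operator< and that it is acceptable to sort copies of both inputs. The function name difference_sorted is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Returns the elements of a that are not in b, in sorted order.
// Both vectors are taken by value so the caller's data is not reordered.
template <typename T>
std::vector<T> difference_sorted(std::vector<T> a, std::vector<T> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<T> result;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(result));
    return result;
}

// Hypothetical usage helper.
inline void difference_sorted_demo() {
    const std::vector<int> list1 = {5, 1, 3, 2};
    const std::vector<int> list2 = {3, 5};
    const std::vector<int> expected = {1, 2};
    assert(difference_sorted(list1, list2) == expected);
}
```

With duplicates, std::set_difference works on counts: an element that occurs m times in the first range and n times in the second appears max(m - n, 0) times in the output, which is not the same as filtering out every match.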

A reduce function (for many set unions) in C++

What I am trying to do:
I have a simple set union function in C++ using STL, and I'm trying to wrap it in a function that will let me perform the union of arbitrarily many sets contained in STL data structures (e.g. std::list, std::vector, std::forward_list, ...).
How I tried to do it:
To start, my simple set union:
#include <algorithm>
template <typename set_type>
set_type sunion(const set_type & lhs, const set_type & rhs)
{
set_type result;
std::set_union( lhs.begin(), lhs.end(), rhs.begin(), rhs.end(), std::inserter(result, result.end()) );
return result;
}
where set_type defines some STL std::set<T>, e.g. std::set<int>.
I have noticed several times that I end up needing to perform several unions over a sequence of sets (in Python this would be a reduce of my sunion function over some iterable of set_types). For instance, I might have
std::vector<std::set<int> > all_sets;
or
std::list<std::set<int> > all_sets;
etc., and I want to get the union of all sets in all_sets. I am trying to implement a simple reduce for this, which essentially does a (faster, more elegant, non-copying) version of:
sunion(... sunion( sunion( all_sets.begin(), all_sets.begin()+1 ), all_sets.begin()+2 ) , ... )
Essentially, to do this quickly, I just want to declare a set_type result and then iterate through all_sets, inserting every value of every set in all_sets into the result object:
template <typename set_type>
set_type sunion_over_iterator_range(const std::iterator<std::forward_iterator_tag, set_type> & begin, const std::iterator<std::forward_iterator_tag, set_type> & end)
{
set_type result;
for (std::iterator<std::forward_iterator_tag, set_type> iter = begin; iter != end; iter++)
{
insert_all(result, *iter);
}
return result;
}
where insert_all is defined:
// |= operator; faster than making a copy and performing union
template <typename set_type>
void insert_all(set_type & lhs, const set_type & rhs)
{
for (typename set_type::iterator iter = rhs.begin(); iter != rhs.end(); iter++)
{
lhs.insert(*iter);
}
}
How it didn't work:
Unfortunately, my sunion_over_iterator_range(...) doesn't work with arguments std::vector<set_type>::begin(), std::vector<set_type>::end(), which are of type std::vector<set_type>::iterator. I thought std::vector<T>::iterator was an iterator<random_access_iterator_tag, T>.
After compilation failed because of type incompatibility of the iterators, I looked at the stl vector source (located in /usr/include/c++/4.6/bits/stl_vector.h for g++ 4.6 & Ubuntu 11.10), and was surprised to see the typedef for vector<T>::iterator to be typedef __gnu_cxx::__normal_iterator<pointer, vector> iterator;. I had thought that a ForwardIterator was a subtype of RandomAccessIterator, and so should be fine, but clearly I was incorrect, or I would not be here.
How I am grateful and ashamed of inciting your frustration due to my inexperience:
Apologies if I'm showing my ignorance-- I am trying to learn to be a better object oriented programmer (in the past I have simply hacked everything out in C-style code).
I'm doing my best, coach! Please help me out and spare the world from bad code that I would produce without your code ninja insight...
Here's a very naive approach:
std::set<T> result;
std::vector<std::set<T>> all_sets;
for (std::set<T> & s : all_sets)
{
result.insert(std::make_move_iterator(s.begin()),
std::make_move_iterator(s.end()));
}
This invalidates the elements in the source sets, though it doesn't actually move the element nodes over. If you want to leave the source sets intact, just remove the make_move_iterator.
Unfortunately there's no interface for std::set that lets you "splice" two sets in a way that doesn't reallocate the internal tree nodes, so this is more or less as good as you can get.
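For readers on a newer toolchain: C++17 added std::set::merge, which transfers nodes between sets without reallocating them. The sketch below assumes a C++17 compiler; the function name merge_all is invented for this example.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Requires C++17. Moves nodes out of every source set into the result.
// Elements already present in the result stay behind in their source set.
template <typename T>
std::set<T> merge_all(std::vector<std::set<T>>& all_sets) {
    std::set<T> result;
    for (std::set<T>& s : all_sets) {
        result.merge(s);
    }
    return result;
}

// Hypothetical usage helper.
inline void merge_all_demo() {
    std::vector<std::set<int>> all_sets = {{1, 2}, {2, 3}};
    const std::set<int> result = merge_all(all_sets);
    assert((result == std::set<int>{1, 2, 3}));
    assert(all_sets[0].empty());                  // every node was transferred
    assert((all_sets[1] == std::set<int>{2}));    // the duplicate 2 was left behind
}
```

Like the move-iterator loop, this modifies the source sets, so it only fits when they are no longer needed afterwards.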
Here's a variadic template approach:
// named unite because union is a reserved keyword in C++
template <typename RSet> void unite(RSet &) { }
template <typename RSet, typename ASet, typename ...Rest>
void unite(RSet & result, ASet const & a, Rest const &... r)
{
result.insert(a.begin(), a.end());
unite(result, r...);
}
Usage:
std::set<T> result;
unite(result, s1, s2, s3, s4);
(Similar move-optimizations are feasible here; you can even add some branching that will copy from immutables but move from mutables, or from rvalues only, if you like.)
Here's a version using std::accumulate:
std::set<T> result =
std::accumulate(all_sets.begin(), all_sets.end(), std::set<T>(),
[](std::set<T> s, std::set<T> const & t)
{ s.insert(t.begin(), t.end()); return s; } );
This version seems to rely on return value optimisation a lot, though, so you might like to compare it to this hacked up and rather ugly version:
std::set<T> result;
std::accumulate(all_sets.begin(), all_sets.end(), 0,
[&result](int, std::set<T> const & t)
{ result.insert(t.begin(), t.end()); return 0; } );
Usually, when using iterators we don't care about the actual category. Just let the implementation sort it out. That means, just change the function to accept any type:
template <typename T>
typename std::iterator_traits<T>::value_type sunion_over_iterator_range(T begin, T end)
{
typename std::iterator_traits<T>::value_type result;
for (T iter = begin; iter != end; ++ iter)
{
insert_all(result, *iter);
}
return result;
}
Note that I have used typename std::iterator_traits<T>::value_type, which is the type of *iter.
BTW, the iterator pattern is not related to OOP. (That doesn't mean it's a bad thing).
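To show that the iterator_traits version accepts iterators from different containers, here is a compact, self-contained sketch. It assumes C++11 for the brace initialization, shortens the function name to union_over_range, and uses the range form of insert inside insert_all; none of these choices come from the question.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <set>
#include <vector>

// Inserts every element of rhs into lhs.
template <typename SetType>
void insert_all(SetType& lhs, const SetType& rhs) {
    lhs.insert(rhs.begin(), rhs.end());
}

// Accepts any iterator type whose value_type is a set-like container.
template <typename Iter>
typename std::iterator_traits<Iter>::value_type
union_over_range(Iter first, Iter last) {
    typename std::iterator_traits<Iter>::value_type result;
    for (; first != last; ++first) {
        insert_all(result, *first);
    }
    return result;
}

// Hypothetical usage helper.
inline void union_over_range_demo() {
    std::vector<std::set<int>> in_vector = {{1, 2}, {2, 3}};
    std::list<std::set<int>> in_list = {{4}, {4, 5}};
    assert((union_over_range(in_vector.begin(), in_vector.end())
            == std::set<int>{1, 2, 3}));
    assert((union_over_range(in_list.begin(), in_list.end())
            == std::set<int>{4, 5}));
}
```

The same template works for vector and list iterators because it is instantiated separately for each iterator type, so no common base class is involved.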