I want to find the minimum element of a filtered list. In Python, I would write:
it = (x for x in [1, 8, 4, 3] if x % 2 == 0)
min(it, default=None)
I hoped that the C++ equivalent would read something like:
const std::vector<int> array {1, 8, 4, 3};
const auto arr_end = std::end(array);
auto it = std::find_if(std::begin(array), arr_end, [](int value) { return value % 2 == 0; });
auto jt = std::min_element(it, arr_end);
if (jt != arr_end) {
std::cout << "Min even element is: " << *jt << std::endl;
} else {
std::cout << "No even element exists!" << std::endl;
}
The expected result is 4, but of course the actual result is 3. The reason: find_if skips to 8. Then from 8 to end the min element is chosen, which is 3.
My question: Is there a way to create an iterator over all even values that can be used to find the minimum element? I am not allowed to use Boost, create a copy, or write to array. We are using C++17.
The standard library has no answer for this as of C++17. In C++20 you can use std::ranges::filter_view; outside of std you can use ranges::filter_view from the range-v3 library, which was the demonstration implementation for the C++20 ranges proposal.
auto filtered = ranges::filter_view(array, [](int value) { return value % 2 == 0; });
auto it = std::min_element(filtered.begin(), filtered.end());
if (it != filtered.end()) {
std::cout << "Min even element is: " << *it << std::endl;
} else {
std::cout << "No even element exists!" << std::endl;
}
My question: Is there a way to create an iterator over all even values that can be used to find the minimum element?
Yes!
It's slightly unfortunate that you're limited to C++17 with no Boost, because you ideally want ranges - specifically ranges::filter_view etc. which was added in C++20, and preceded by the Boost.Range library.
You may possibly be able to use the intermediate experimental range extension.
If none of those are viable, you can of course write your own filtered_iterator to use with std::min_element.
It's not much fun: although it's probably more reusable (and easier to test) than encoding all the logic into a single lambda, it's a lot of work if you're not planning to reuse it. Also, C++ iterators aren't ideally suited to emulating a Python-style generator, as demonstrated by the redundant end iterator e_ and the copy-assignment operator. You can't elide the end & predicate members of the filtered end iterator either, because both iterators usually need to be the same type.
template <typename BaseIterator, typename UnaryPredicate>
class filter_iterator
{
BaseIterator i_;
BaseIterator e_;
UnaryPredicate pred_;
public:
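// Member types that std::iterator_traits (and hence standard algorithms) look for.
using iterator_category = std::forward_iterator_tag;
using difference_type = typename std::iterator_traits<BaseIterator>::difference_type;
using pointer = typename std::iterator_traits<BaseIterator>::pointer;
using reference = typename std::iterator_traits<BaseIterator>::reference;
using value_type = typename std::iterator_traits<BaseIterator>::value_type;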
filter_iterator(filter_iterator &&) = default;
filter_iterator(filter_iterator const&) = default;
filter_iterator(BaseIterator i, BaseIterator e, UnaryPredicate p)
: i_(i), e_(e), pred_(p)
{}
filter_iterator& operator=(filter_iterator &&) = default;
filter_iterator& operator=(filter_iterator const& other) {
i_ = other.i_;
e_ = other.e_;
// This is questionable, because we can't copy the predicate without adding
// a level of indirection (ie, always wrapping it in std::function).
// For now, just assume it is stateless for convenience.
return *this;
}
bool operator==(filter_iterator const& other) const
{
return i_ == other.i_;
}
filter_iterator& operator++() {
// We could check i_ is not already e_ here,
// but the caller is required to check this outside anyway
i_ = std::find_if(std::next(i_), e_, pred_);
return *this;
}
filter_iterator operator++(int) {
// Post-increment: advance this iterator, return its previous state.
filter_iterator old(*this);
++*this;
return old;
}
reference operator*() { return *i_; }
std::add_const_t<reference> operator*() const { return *i_; }
};
template <typename BaseIterator, typename UnaryPredicate>
bool operator!=(filter_iterator<BaseIterator, UnaryPredicate> const& a,
filter_iterator<BaseIterator, UnaryPredicate> const& b)
{
return !(a == b);
}
Then the wrapper function hides most of this ugliness for us:
template <typename BaseIterator, typename UnaryPredicate>
std::pair<filter_iterator<BaseIterator, UnaryPredicate>,
filter_iterator<BaseIterator, UnaryPredicate>>
filter(BaseIterator b, BaseIterator e, UnaryPredicate p)
{
using f = filter_iterator<BaseIterator, UnaryPredicate>;
auto fbegin = std::find_if(b, e, p);
return {f{fbegin, e, p}, {e, e, p}};
}
and we can use it like:
int main() {
std::vector<int> a {7, 1, 8, 4, 3, 2};
auto be = filter(a.begin(), a.end(),
[](int i){ return (i%2) == 0;});
auto min = std::min_element(be.first, be.second);
return *min;
}
If you are limited to C++17, the standard library offers no ready-made way to do this without making a copy.
If you can move to C++20, the solution is pretty easy. C++20 introduced views and added the <ranges> library. A view does not create a copy of the underlying container, and it does not modify the container's values. Behind the scenes, views are built on iterators (there is a bit more to it, but let's stay with the basics).
So in your case you could write something like this:
const std::vector<int> array {1, 8, 4, 3};
auto isEven = [](auto i) { return i % 2 == 0; };
//This is actually an iterator pair(begin, end)
//No copies of the container ever made, the container does not change
auto filtered = array | std::views::filter(isEven);
auto min = std::ranges::min_element(filtered);
if (min != filtered.end())
std::cout << "Min " << *min << std::endl;
else
std::cout << "No min\n";
//You can try to print the vector, it will be unchanged!!!
std::find_if does not filter the vector. It only returns the first element for which the predicate is true. I suppose there is an elegant solution using ranges. The rather inelegant way is to use a custom comparator with min_element:
#include <vector>
#include <algorithm>
#include <iostream>
int main() {
const std::vector<int> array {1, 8, 4, 3};
if (!array.empty()) {
auto it = std::min_element(begin(array),end(array),
[](auto a, auto b){
if ((a % 2) && (b % 2)) return a < b;
if (a % 2) return false;
if (b % 2) return true;
return a < b;
});
if (*it % 2 == 0) std::cout << *it;
}
}
Odd elements are considered to be not < than other elements. When both are odd or both are even the "normal" < is used. Output is:
4
Note that I have to check if (*it % 2 == 0) because when there is no even element then the call to min_element will return an iterator to the smallest odd element.
PS: The tricky part of custom comparators is to get strict weak ordering correct. The above comparator can be written in a more concise way (thanks to Jarod42) like this:
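return std::tuple{ a % 2 != 0, a } < std::tuple{ b % 2 != 0, b };
Tuples have an operator< that implements a strict weak ordering (given that the element types provide one), so written this way it is much easier to convince yourself that the comparator really is a strict weak ordering. The flag is written as a % 2 != 0 because bool{a%2} is a narrowing conversion and does not compile; the tuple version also needs #include <tuple>.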
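To see the tuple-based comparator in context, here is a small sketch that wraps it in a function. The name min_even and the use of std::optional for the "no even element" case are my own assumptions, not part of the answer above.

```cpp
#include <algorithm>
#include <optional>
#include <tuple>
#include <vector>

// Hypothetical helper: smallest even value, or std::nullopt if there is none.
std::optional<int> min_even(const std::vector<int>& values) {
    // The bool flag sorts even values (false) before odd values (true);
    // within each group the values themselves decide.
    auto it = std::min_element(values.begin(), values.end(), [](int a, int b) {
        return std::tuple{a % 2 != 0, a} < std::tuple{b % 2 != 0, b};
    });
    // If the "minimum" is odd, the input contained no even element at all.
    if (it == values.end() || *it % 2 != 0) return std::nullopt;
    return *it;
}
```

The final check is the same one the full program performs with if (*it % 2 == 0).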
I want to compare one value against several others and check whether it matches at least one of them. I assumed it would be something like
if (x == any_of(1, 2, 3))
// do something
But the examples of it I've seen online have been
bool any_of(InputIt first, InputIt last, UnaryPredicate)
What does that mean?
New to C++ so apologies if this is a stupid question.
There is plenty of literature and video tutorials on the subject of "iterators in C++", you should do some research in that direction because it's a fundamental concept in C++.
A quick summary on the matter: an iterator is something that points to an element in a collection (or range) of values. A few examples of such collections:
std::vector is the most common one. It's basically a resizable array.
std::list is a linked list.
std::array is a fixed size array with some nice helpers around C style arrays
int myInt[12] is a C style array of integers. This one shouldn't be used anymore.
Algorithms from the C++ standard library that operate on a collection of values (such as std::any_of) take the collection by two iterators. The first iterator InputIt first points to the beginning of said collection, while InputIt last points to the end of the collection (actually one past the end).
A UnaryPredicate is a function that takes 1 argument (unary) and returns a bool (predicate).
In order to make std::any_of do what you want, you have to put your values in a collection and x in the UnaryPredicate:
int x = 3;
std::vector values = {1, 2, 3};
if (std::any_of(values.begin(), values.end(), [x](int y) { return x == y; }))
// ...
The UnaryPredicate in this case is a lambda function.
As you can see, this is quite verbose code given your example. But once you have a dynamic number of values to compare against, or you want to check for more complex things than just equality, this algorithm becomes far more useful.
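As an example of a check that goes beyond equality, the sketch below asks whether any value lies outside an allowed interval. The function name any_outside and its parameters are invented for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical example: true if at least one value is outside [low, high].
bool any_outside(const std::vector<int>& values, int low, int high) {
    return std::any_of(values.begin(), values.end(),
                       [low, high](int y) { return y < low || y > high; });
}
```

The predicate captures low and high the same way the equality lambda above captures x.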
Fun little experiment
Just for fun, I made a little code snippet that implements an any_of like the one you wanted. It's quite a lot of code and pretty complicated as well (definitely not beginner level!), but it is very flexible and actually nice to use. The full code can be found here.
Here is how you would use it:
int main()
{
int x = 7;
std::vector dynamic_int_range = {1, 2, 3, 4, 5, 6, 7, 8};
if (x == any_of(1, 2, 3, 4, 5))
{
std::cout << "x is in the compile time collection!\n";
}
else if (x == any_of(dynamic_int_range))
{
std::cout << "x is in the run time collection!\n";
}
else
{
std::cout << "x is not in the collection :(\n";
}
std::string s = "abc";
std::vector<std::string> dynamic_string_range = {"xyz", "uvw", "rst", "opq"};
if (s == any_of("abc", "def", "ghi"))
{
std::cout << "s is in the compile time collection!\n";
}
else if (s == any_of(dynamic_string_range))
{
std::cout << "s is in the run time collection!\n";
}
else
{
std::cout << "s is not in the collection :(\n";
}
}
And here is how it's implemented:
namespace detail
{
template <typename ...Args>
struct ct_any_of_helper
{
std::tuple<Args...> values;
constexpr ct_any_of_helper(Args... values) : values(std::move(values)...) { }
template <typename T>
[[nodiscard]] friend constexpr bool operator==(T lhs, ct_any_of_helper const& rhs) noexcept
{
return std::apply([&](auto... vals) { return ((lhs == vals) || ...); }, rhs.values);
}
};
template <typename Container>
struct rt_any_of_helper
{
Container const& values;
constexpr rt_any_of_helper(Container const& values) : values(values) { }
template <typename T>
[[nodiscard]] friend constexpr bool operator==(T&& lhs, rt_any_of_helper&& rhs) noexcept
{
return std::any_of(cbegin(rhs.values), cend(rhs.values), [&](auto val)
{
return lhs == val;
});
}
};
template <typename T>
auto is_container(int) -> decltype(cbegin(std::declval<T>()) == cend(std::declval<T>()), std::true_type{});
template <typename T>
std::false_type is_container(...);
template <typename T>
constexpr bool is_container_v = decltype(is_container<T>(0))::value;
}
template <typename ...Args>
[[nodiscard]] constexpr auto any_of(Args&&... values)
{
using namespace detail;
if constexpr (sizeof...(Args) == 1 && is_container_v<std::tuple_element_t<0, std::tuple<Args...>>>)
return rt_any_of_helper(std::forward<Args>(values)...);
else
return ct_any_of_helper(std::forward<Args>(values)...);
}
In case an expert sees this code and wants to complain about the dangling reference: come on, who would write something like this:
auto a = any_of(std::array {1, 2, 3, 4});
if (x == std::move(a)) // ...
That's not what this function is for.
Your values must already exist somewhere else, it is very likely that it will be a vector.
std::any_of operates on iterators.
A pair of iterators describes a range: two values that tell you where the range begins and where it ends.
Most C++ Standard Template Library collections, including std::vector, support the iterator API, so you can use std::any_of on them.
For the sake of a full example, let's check whether a vector contains 42 in an over-the-top way, just to use std::any_of.
Since we only want to check whether a value exists in the vector without changing anything (std::any_of doesn't modify the collection), we use .cbegin() and .cend(), which return constant iterators to the beginning and end of the vector. std::any_of needs both, as it may have to iterate over the entire vector to check whether at least one value matches the given predicate.
The last parameter must be a unary predicate, meaning a function that accepts a single argument and returns whether that argument fits some criterion.
To put it simply, std::any_of is used to check whether there's at least one value in a collection, that has some property that you care about.
Code:
#include <algorithm>
#include <iostream>
#include <vector>
bool is_42(int value) {
return value == 42;
}
int main() {
std::vector<int> vec{
1, 2, 3,
// 42 // uncomment this
};
if(std::any_of(vec.cbegin(), vec.cend(), is_42)) {
std::cout << "42 is in vec" << std::endl;
} else {
std::cout << "42 isn't in vec" << std::endl;
}
}
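For a plain equality test like this one, std::find is the more direct tool, and std::any_of earns its keep once the condition is more than "equals X". A minimal sketch, with contains being a name I made up:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true if value occurs anywhere in vec.
bool contains(const std::vector<int>& vec, int value) {
    // std::find returns the end iterator when nothing matches.
    return std::find(vec.cbegin(), vec.cend(), value) != vec.cend();
}
```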
As stated by user #a.abuzaid, you can create your own method for this. The method they provided, however, lacks in a number of areas stated in the comments of the answer. I can't really get my head around std::any_of as of right now and just decided to create this template:
template <typename Iterable, typename type>
bool any_of(const Iterable& iterable, const type& value) {
for (const auto& comparison : iterable) {
if (comparison == value) {
return true;
}
}
return false;
}
An example use here would be if (any_of(myVectorOfStrings, std::string("Find me!"))) { do stuff }, in which the iterable is a vector of strings and the value is the string "Find me!".
You can just create a function that compares x to two other numbers to check whether it equals either of them, for instance:
bool anyof(int x, int y, int z) {
return (x == y) || (x == z);
}
and then within your main you can call the function like this:
if (anyof(x, 1, 2))
cout << "Matches a number";
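The fixed two-candidate version can be generalized to any number of candidates with a C++17 fold expression. This is a sketch under my own naming (equals_any); it assumes every candidate is comparable to x with ==.

```cpp
// Hypothetical helper: true if x compares equal to at least one candidate.
// The fold expands to (x == c1) || (x == c2) || ... and is false for no candidates.
template <typename T, typename... Us>
constexpr bool equals_any(const T& x, const Us&... candidates) {
    return ((x == candidates) || ...);
}
```

It is used the same way as anyof, for example if (equals_any(x, 1, 2, 3)).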
I'm using multithreading and want to merge the results. For example:
std::vector<int> A;
std::vector<int> B;
std::vector<int> AB;
I want AB to have the contents of A and the contents of B in that order. What's the most efficient way of doing something like this?
AB.reserve( A.size() + B.size() ); // preallocate memory
AB.insert( AB.end(), A.begin(), A.end() );
AB.insert( AB.end(), B.begin(), B.end() );
This is precisely what the member function std::vector::insert is for
std::vector<int> AB = A;
AB.insert(AB.end(), B.begin(), B.end());
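If A and B are not needed after the merge (plausible for per-thread result buffers, though that is my assumption), the elements can be moved instead of copied. For int this makes no difference, but for element types such as std::string it avoids copying each element. concat_moved is a hypothetical name.

```cpp
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper: consumes both inputs and returns their concatenation.
template <typename T>
std::vector<T> concat_moved(std::vector<T>&& a, std::vector<T>&& b) {
    std::vector<T> ab = std::move(a);  // steals a's buffer, no element copies
    ab.reserve(ab.size() + b.size());  // at most one reallocation
    ab.insert(ab.end(),
              std::make_move_iterator(b.begin()),
              std::make_move_iterator(b.end()));
    return ab;
}
```

Both arguments are left in a valid but unspecified state and should not be read afterwards without being reassigned.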
Depends on whether you really need to physically concatenate the two vectors or you want to give the appearance of concatenation for the sake of iteration. The boost::join function
http://www.boost.org/doc/libs/1_43_0/libs/range/doc/html/range/reference/utilities/join.html
will give you this.
std::vector<int> v0;
v0.push_back(1);
v0.push_back(2);
v0.push_back(3);
std::vector<int> v1;
v1.push_back(4);
v1.push_back(5);
v1.push_back(6);
...
BOOST_FOREACH(const int & i, boost::join(v0, v1)){
cout << i << endl;
}
should give you
1
2
3
4
5
6
Note boost::join does not copy the two vectors into a new container
but generates a pair of iterators (range) that cover the span of
both containers. There will be some performance overhead but maybe
less than copying all the data to a new container first.
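When Boost is not available and all that is needed is to visit both vectors in order, a small helper can give the same "appearance of concatenation" for iteration. This is a sketch, not a replacement for boost::join: for_each_joined is an invented name and it only supports calling a function on each element.

```cpp
#include <initializer_list>
#include <vector>

// Hypothetical helper: calls f on every element of first, then of second,
// without copying either vector.
template <typename T, typename F>
void for_each_joined(const std::vector<T>& first, const std::vector<T>& second, F f) {
    for (const std::vector<T>* part : {&first, &second}) {
        for (const T& x : *part) f(x);
    }
}
```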
In the direction of Bradgonesurfing's answer, many times one doesn't really need to concatenate two vectors (O(n)), but instead just work with them as if they were concatenated (O(1)). If this is your case, it can be done without the need of Boost libraries.
The trick is to create a vector proxy: a wrapper class which manipulates references to both vectors, externally seen as a single, contiguous one.
USAGE
std::vector<int> A{ 1, 2, 3, 4, 5};
std::vector<int> B{ 10, 20, 30 };
VecProxy<int> AB(A, B); // ----> O(1). No copies performed.
for (size_t i = 0; i < AB.size(); ++i)
std::cout << AB[i] << " "; // 1 2 3 4 5 10 20 30
IMPLEMENTATION
template <class T>
class VecProxy {
private:
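// Both members must be references; "std::vector<T>& v1, v2;" would make v2 a copy.
std::vector<T>& v1;
std::vector<T>& v2;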
public:
VecProxy(std::vector<T>& ref1, std::vector<T>& ref2) : v1(ref1), v2(ref2) {}
const T& operator[](const size_t& i) const;
size_t size() const;
};
template <class T>
const T& VecProxy<T>::operator[](const size_t& i) const{
return (i < v1.size()) ? v1[i] : v2[i - v1.size()];
}
template <class T>
size_t VecProxy<T>::size() const { return v1.size() + v2.size(); }
MAIN BENEFIT
It's O(1) (constant time) to create it, and with minimal extra memory allocation.
SOME STUFF TO CONSIDER
You should only go for it if you really know what you're doing when dealing with references. This solution is intended for the specific purpose of the question made, for which it works pretty well. To employ it in any other context may lead to unexpected behavior if you are not sure on how references work.
In this example, AB does not provide a non-const
access operator ([ ]). Feel free to include it, but keep in mind: since AB contains references, to assign it
values will also affect the original elements within A and/or B. Whether or not this is a
desirable feature, it's an application-specific question one should
carefully consider.
Any changes directly made to either A or B (like assigning values,
sorting, etc.) will also "modify" AB. This is not necessarily bad
(actually, it can be very handy: AB never needs to be explicitly
updated to keep itself synchronized with both A and B), but it's
certainly a behavior one must be aware of. Important caveat: resizing A and/or B changes AB.size() and shifts the indices at which B's elements appear, and if a vector reallocates, any element references previously obtained through AB become invalid (AB itself stays usable, because it refers to the vector objects rather than to their storage).
Because every access to an element is preceded by a test (namely, "i
< v1.size()"), VecProxy access time, although constant, is also
a bit slower than that of vectors.
This approach can be generalized to n vectors. I haven't tried, but
it shouldn't be a big deal.
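A rough sketch of what the generalization to n vectors could look like follows. The class name MultiVecProxy and the choice to store pointers are my assumptions; unlike the two-vector version, element access here is linear in the number of vectors.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <vector>

// Hypothetical n-vector proxy: read-only, non-owning view over several vectors.
template <class T>
class MultiVecProxy {
    std::vector<const std::vector<T>*> parts_;  // the vectors must outlive the proxy
public:
    MultiVecProxy(std::initializer_list<const std::vector<T>*> parts) : parts_(parts) {}

    std::size_t size() const {
        std::size_t total = 0;
        for (const auto* p : parts_) total += p->size();
        return total;
    }

    // Walks the parts until the index falls inside one of them.
    const T& operator[](std::size_t i) const {
        for (const auto* p : parts_) {
            if (i < p->size()) return (*p)[i];
            i -= p->size();
        }
        throw std::out_of_range("MultiVecProxy: index out of range");
    }
};
```

Construction looks like MultiVecProxy<int> ABC{&A, &B, &C}; and only copies three pointers.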
Based on Kiril V. Lyadvinsky's answer, I made a new version. This snippet uses templates and operator overloading. With it, you can write vector3 = vector1 + vector2 and vector4 += vector3. Hope it helps.
template <typename T>
std::vector<T> operator+(const std::vector<T> &A, const std::vector<T> &B)
{
std::vector<T> AB;
AB.reserve(A.size() + B.size()); // preallocate memory
AB.insert(AB.end(), A.begin(), A.end()); // add A;
AB.insert(AB.end(), B.begin(), B.end()); // add B;
return AB;
}
template <typename T>
std::vector<T> &operator+=(std::vector<T> &A, const std::vector<T> &B)
{
A.reserve(A.size() + B.size()); // preallocate memory without erase original data
A.insert(A.end(), B.begin(), B.end()); // add B;
return A; // here A could be named AB
}
One more simple variant which was not yet mentioned:
copy(A.begin(),A.end(),std::back_inserter(AB));
copy(B.begin(),B.end(),std::back_inserter(AB));
And using the merge algorithm:
#include <algorithm>
#include <vector>
#include <iterator>
#include <iostream>
#include <sstream>
#include <string>
template<template<typename, typename...> class Container, class T>
std::string toString(const Container<T>& v)
{
std::stringstream ss;
std::copy(v.begin(), v.end(), std::ostream_iterator<T>(ss, " "));
return ss.str();
};
int main()
{
std::vector<int> A(10);
std::vector<int> B(5); //zero filled
std::vector<int> AB(15);
std::for_each(A.begin(), A.end(),
[](int& f)->void
{
f = rand() % 100;
});
std::cout << "before merge: " << toString(A) << "\n";
std::cout << "before merge: " << toString(B) << "\n";
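// A comparator that always returns false treats all elements as equivalent,
// so std::merge copies its first range (A) and then its second range (B).
std::merge(A.begin(), A.end(), B.begin(), B.end(), AB.begin(), [](int, int) { return false; });
std::cout << "after merge: " << toString(AB) << "\n";
return 0;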
}
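All the solutions are correct, but I found it easier to just write a function to implement this, like this:
template <class T1, class T2>
void ContainerInsert(T1& t1, const T2& t2)
{
// t1 is taken by reference so the caller's container is modified;
// t2 may be a temporary, such as the result of a function call.
t1.insert(t1.end(), t2.begin(), t2.end());
}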
That way you can avoid the temporary placement like this:
ContainerInsert(vec, GetSomeVector());
For this use case, if you know beforehand the number of results each thread produces, you could preallocate AB and pass a std::span to each thread. This way the concatenation need not be done. Example:
std::vector<int> AB(total_number_of_results, 0);
std::size_t chunk_length = …;
std::size_t chunk2_start = chunk_length;
std::size_t chunk3_start = 2 * chunk_length; // If needed
…
// Pass these to the worker threads.
std::span<int> A(AB.data(), chunk_length);
std::span<int> B(AB.data() + chunk2_start, chunk_length);
…
My answer is based on Mr. Ronald Souza's original solution. In addition to his original solution, I've written a vector proxy that supports iterators too!
A short description for people who are not aware of the context of the original solution: the joined_vector template class (i.e. the vector proxy) takes references to two vectors as constructor arguments and then treats them as one contiguous vector. My implementation also supports a forward iterator.
USAGE:
int main()
{
std::vector<int> a1;
std::vector<int> a2;
joined_vector<std::vector<int>> jv(a1,a2);
for (int i = 0; i < 5; i++)
a1.push_back(i);
for (int i = 5; i <=10; i++)
a2.push_back(i);
for (auto e : jv)
std::cout << e<<"\n";
for (int i = 0; i < jv.size(); i++)
std::cout << jv[i] << "\n";
return 0;
}
IMPLEMENTATION:
template<typename _vec>
class joined_vector
{
_vec& m_vec1;
_vec& m_vec2;
public:
struct Iterator
{
typedef typename _vec::iterator::value_type value_type;
typedef typename _vec::iterator::value_type* pointer;
typedef typename _vec::iterator::value_type& reference;
typedef std::forward_iterator_tag iterator_category;
typedef std::ptrdiff_t difference_type;
_vec* m_vec1;
_vec* m_vec2;
Iterator(pointer ptr) :m_ptr(ptr)
{
}
Iterator& operator++()
{
// Jump from the last element of the first vector to the first element of the second.
if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
m_ptr = &(*m_vec2)[0];
else
++m_ptr;
return *this;
}
Iterator operator++(int)
{
// Return a full copy (including the vector pointers) of the previous state.
Iterator old = *this;
++(*this);
return old;
}
reference operator *()
{
return *m_ptr;
}
pointer operator ->()
{
return m_ptr;
}
friend bool operator == (const Iterator& itr1, const Iterator& itr2)
{
return itr1.m_ptr == itr2.m_ptr;
}
friend bool operator != (const Iterator& itr1, const Iterator& itr2)
{
return itr1.m_ptr != itr2.m_ptr;
}
private:
pointer m_ptr;
};
joined_vector(_vec& vec1, _vec& vec2) :m_vec1(vec1), m_vec2(vec2)
{
}
Iterator begin()
{
//checks whether m_vec1 is empty and gets the first element's address;
//if it's empty then it gets the first address of the second vector m_vec2;
//if both of them are empty then nullptr is returned as the first pointer
Iterator itr_beg((m_vec1.size() != 0) ? &m_vec1[0] : ((m_vec2.size() != 0) ? &m_vec2[0] : nullptr));
itr_beg.m_vec1 = &m_vec1;
itr_beg.m_vec2 = &m_vec2;
return itr_beg;
}
Iterator end()
{
//checks whether m_vec2 is empty and gets the address of its last element;
//if the second vector is empty then the address of the first vector's (m_vec1's) last element is taken;
//if both of them are empty then a null pointer is returned as the end pointer
typename _vec::value_type* p = ((m_vec2.size() != 0) ? &m_vec2[m_vec2.size() - 1] : ((m_vec1.size()) != 0 ? &m_vec1[m_vec1.size() - 1] : nullptr));
Iterator itr_beg(p != nullptr ? p + 1 : nullptr);
itr_beg.m_vec1 = &m_vec1;
itr_beg.m_vec2 = &m_vec2;
return itr_beg;
}
typename _vec::value_type& operator [](int i)
{
if (i < m_vec1.size())
return m_vec1[i];
else
return m_vec2[i - m_vec1.size()];
}
size_t size()
{
return m_vec1.size() + m_vec2.size();
}
};
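If your vectors are sorted, check out set_union from <algorithm>. Note that it produces the sorted union rather than a plain concatenation: a value that appears in both A and B is not duplicated in the result.
std::set_union(A.begin(), A.end(), B.begin(), B.end(), std::back_inserter(AB));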
There's a more thorough example in the link.
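A small self-contained sketch of the set_union approach is given here, mainly to show how its result differs from concatenation. The function name sorted_union is invented, and both inputs are assumed to be sorted in ascending order.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: sorted union of two already-sorted vectors.
std::vector<int> sorted_union(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> result;
    result.reserve(a.size() + b.size());  // upper bound on the result size
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(result));
    return result;
}
```

For inputs {1, 2, 3} and {2, 3, 4} this yields {1, 2, 3, 4}, whereas concatenation would yield {1, 2, 3, 2, 3, 4}.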
With almost all code I write, I am often dealing with set reduction problems on collections that ultimately end up with naive "if" conditions inside of them. Here's a simple example:
for(int i=0; i<myCollection.size(); i++)
{
if (myCollection[i] == SOMETHING)
{
DoStuff();
}
}
With functional languages, I can solve the problem by reducing the collection to another collection (easily) and then perform all operations on my reduced set. In pseudocode:
newCollection <- myCollection where <x=true
map DoStuff newCollection
And in other C variants, like C#, I could reduce with a where clause like
foreach (var x in myCollection.Where(c=> c == SOMETHING))
{
DoStuff();
}
Or better (at least to my eyes)
myCollection.Where(c=>c == Something).ToList().ForEach(d=> DoStuff(d));
Admittedly, I am doing a lot of paradigm mixing and subjective/opinion based style, but I can't help but feel that I am missing something really fundamental that could allow me to use this preferred technique with C++. Could someone enlighten me?
IMHO it's more straightforward and more readable to use a for loop with an if inside it. However, if this is annoying for you, you could use a for_each_if like the one below:
template<typename Iter, typename Pred, typename Op>
void for_each_if(Iter first, Iter last, Pred p, Op op) {
while(first != last) {
if (p(*first)) op(*first);
++first;
}
}
Use case:
std::vector<int> v {10, 2, 10, 3};
for_each_if(v.begin(), v.end(), [](int i){ return i > 5; }, [](int &i){ ++i; });
Boost provides ranges that can be used with a range-based for. Ranges have the advantage that they don't copy the underlying data structure; they merely provide a 'view' (that is, begin() and end() for the range, and operator++() and operator==() for the iterator). This might be of interest to you: http://www.boost.org/libs/range/doc/html/range/reference/adaptors/reference/filtered.html
#include <boost/range/adaptor/filtered.hpp>
#include <iostream>
#include <vector>
struct is_even
{
bool operator()( int x ) const { return x % 2 == 0; }
};
int main(int argc, const char* argv[])
{
using namespace boost::adaptors;
std::vector<int> myCollection{1,2,3,4,5,6,7,8,9};
for( int i: myCollection | filtered( is_even() ) )
{
std::cout << i;
}
}
Instead of creating a new algorithm, as the accepted answer does, you can use an existing one with a function that applies the condition:
std::for_each(first, last, [](auto&& x){ if (cond(x)) { ... } });
Or if you really want a new algorithm, at least reuse for_each there instead of duplicating the iteration logic:
template<typename Iter, typename Pred, typename Op>
void
for_each_if(Iter first, Iter last, Pred p, Op op) {
std::for_each(first, last, [&](auto& x) { if (p(x)) op(x); });
}
The idea of avoiding
for(...)
if(...)
constructs as an antipattern is too broad.
It is completely fine to process multiple items that match a certain expression from inside a loop, and the code cannot get much clearer than that. If the processing grows too large to fit on screen, that is a good reason to use a subroutine, but still the conditional is best placed inside the loop, i.e.
for(...)
if(...)
do_process(...);
is vastly preferable to
for(...)
maybe_process(...);
It becomes an antipattern when only one element will match, because then it would be clearer to first search for the element, and perform the processing outside of the loop.
for(int i = 0; i < size; ++i)
if(i == 5)
is an extreme and obvious example of this. More subtle, and thus more common, is a factory pattern like
for(creator &c : creators)
if(c.name == requested_name)
{
unique_ptr<object> obj = c.create_object();
obj->owner = this;
return obj;
}
This is hard to read, because it isn't obvious that the body code will be executed once only. In this case, it would be better to separate the lookup:
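creator &lookup(string const &requested_name)
{
for(creator &c : creators)
if(c.name == requested_name)
return c;
// Without this, falling off the end of the function is undefined behavior.
throw std::out_of_range("no creator with the requested name");
}
creator &c = lookup(requested_name);
unique_ptr<object> obj = c.create_object();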
There is still an if within a for, but from the context it becomes clear what it does, there is no need to change this code unless the lookup changes (e.g. to a map), and it is immediately clear that create_object() is called only once, because it is not inside a loop.
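The separated lookup can also be written with std::find_if, which removes the hand-written loop entirely. The sketch below uses a stripped-down, hypothetical creator type (just a name and an id) and returns a null pointer when nothing matches; both choices are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the creator type discussed above.
struct creator {
    std::string name;
    int id;
};

// Returns a pointer to the first creator with the requested name, or nullptr.
const creator* lookup(const std::vector<creator>& creators,
                      const std::string& requested_name) {
    auto it = std::find_if(creators.begin(), creators.end(),
                           [&](const creator& c) { return c.name == requested_name; });
    return it != creators.end() ? &*it : nullptr;
}
```

The caller then checks the result once and performs the processing outside of any loop.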
Here is a quick relatively minimal filter function.
It takes a predicate. It returns a function object that takes an iterable.
It returns an iterable that can be used in a for(:) loop.
template<class It>
struct range_t {
It b, e;
It begin() const { return b; }
It end() const { return e; }
bool empty() const { return begin()==end(); }
};
template<class It>
range_t<It> range( It b, It e ) { return {std::move(b), std::move(e)}; }
template<class It, class F>
struct filter_helper:range_t<It> {
F f;
void advance() {
while(true) {
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
if (this->empty())
return;
if (f(*this->begin()))
return;
}
}
filter_helper(range_t<It> r, F fin):
range_t<It>(r), f(std::move(fin))
{
while(true)
{
if (this->empty()) return;
if (f(*this->begin())) return;
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
}
}
};
template<class It, class F>
struct filter_psuedo_iterator {
using iterator_category=std::input_iterator_tag;
filter_helper<It, F>* helper = nullptr;
bool m_is_end = true;
bool is_end() const {
return m_is_end || !helper || helper->empty();
}
void operator++() {
helper->advance();
}
typename std::iterator_traits<It>::reference
operator*() const {
return *(helper->begin());
}
It base() const {
if (!helper) return {};
if (is_end()) return helper->end();
return helper->begin();
}
friend bool operator==(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
if (lhs.is_end() && rhs.is_end()) return true;
if (lhs.is_end() || rhs.is_end()) return false;
return lhs.helper->begin() == rhs.helper->begin();
}
friend bool operator!=(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
return !(lhs==rhs);
}
};
template<class It, class F>
struct filter_range:
private filter_helper<It, F>,
range_t<filter_psuedo_iterator<It, F>>
{
using helper=filter_helper<It, F>;
using range=range_t<filter_psuedo_iterator<It, F>>;
using range::begin; using range::end; using range::empty;
filter_range( range_t<It> r, F f ):
helper{{r}, std::forward<F>(f)},
range{ {this, false}, {this, true} }
{}
};
template<class F>
auto filter( F&& f ) {
return [f=std::forward<F>(f)](auto&& r)
{
using std::begin; using std::end;
using iterator = decltype(begin(r));
return filter_range<iterator, std::decay_t<decltype(f)>>{
range(begin(r), end(r)), f
};
};
}
I took shortcuts. A real library should make real iterators, not the for(:)-qualifying pseudo-facades I did.
At point of use, it looks like this:
int main()
{
std::vector<int> test = {1,2,3,4,5};
for( auto i: filter([](auto x){return x%2;})( test ) )
std::cout << i << '\n';
}
which is pretty nice, and prints
1
3
5
There is a library called range-v3, the basis of the C++20 ranges proposal, which does this kind of thing and more. Boost also has filter ranges/iterators available, as well as helpers that make writing the above much shorter.
One style that gets used enough to mention, but hasn't been mentioned yet, is:
for(int i=0; i<myCollection.size(); i++) {
if (myCollection[i] != SOMETHING)
continue;
DoStuff();
}
Advantages:
Doesn't change the indentation level of DoStuff(); when condition complexity increases. Logically, DoStuff(); should be at the top-level of the for loop, and it is.
Immediately makes it clear that the loop iterates over the SOMETHINGs of the collection, without requiring the reader to verify that there is nothing after the closing } of the if block.
Doesn't require any libraries or helper macros or functions.
Disadvantages:
continue, like other flow control statements, gets misused in ways that lead to hard-to-follow code so much that some people are opposed to any use of them: there is a valid style of coding that some follow that avoids continue, that avoids break other than in a switch, that avoids return other than at the end of a function.
for(auto const &x: myCollection) if(x == something) doStuff();
Looks pretty much like a C++-specific for comprehension to me. To you?
If DoStuff() would be dependent on i somehow in the future then I'd propose this guaranteed branch-free bit-masking variant.
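unsigned int times = 0;
const int kSize = sizeof(unsigned int)*8;
const int kBlocks = myCollection.size()/kSize;
for(int i = 0; i < kBlocks; i++){
unsigned int mask = 0;
for (int j = 0; j<kSize; j++){
// cast before shifting so the top bit is set on an unsigned value
mask |= static_cast<unsigned int>(myCollection[i*kSize+j]==SOMETHING) << j;
}
times+=popcount(mask);
}
// leftover elements that do not fill a whole mask
for(int i = kBlocks*kSize; i < myCollection.size(); i++)
times += (myCollection[i]==SOMETHING);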
for(int i=0;i<times;i++)
DoStuff();
Where popcount is any function doing a population count (counting the bits set to 1). This leaves some freedom to apply more advanced constraints involving i and its neighbors. If that is not needed, we can strip the inner loop and rewrite the outer loop as
for(int i = 0; i < myCollection.size(); i++)
times += (myCollection[i]==SOMETHING);
followed by a
for(int i=0;i<times;i++)
DoStuff();
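For reference, here is a minimal self-contained sketch of the mask-and-popcount counting described above. The names popcount_u32 and count_matches are made up for this illustration, and the popcount shown (Kernighan's bit-clearing loop) is only a portable stand-in: it contains a loop of its own, so a hardware popcount such as C++20's std::popcount would be the better fit if branch-freedom really matters.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Portable population count (Kernighan's method); a hypothetical stand-in
// for whatever "popcount" is available on your platform.
inline unsigned popcount_u32(unsigned int mask) {
    unsigned count = 0;
    while (mask != 0) {
        mask &= mask - 1;  // clears the lowest set bit
        ++count;
    }
    return count;
}

// Counts the elements equal to `something` by building one bit mask per
// block of kBits elements; the last block may be shorter than kBits.
inline std::size_t count_matches(const std::vector<int>& values, int something) {
    constexpr std::size_t kBits = sizeof(unsigned int) * 8;
    std::size_t times = 0;
    for (std::size_t base = 0; base < values.size(); base += kBits) {
        const std::size_t block = std::min(kBits, values.size() - base);
        unsigned int mask = 0;
        for (std::size_t j = 0; j < block; ++j)
            mask |= static_cast<unsigned int>(values[base + j] == something) << j;
        times += popcount_u32(mask);
    }
    return times;
}
```

The returned count is what the final loop would use as its bound for calling DoStuff().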
Also, if you don't mind reordering the collection, std::partition is cheap.
#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>
void DoStuff(int i)
{
std::cout << i << '\n';
}
int main()
{
using namespace std::placeholders;
std::vector<int> v {1, 2, 5, 0, 9, 5, 5};
const int SOMETHING = 5;
std::for_each(v.begin(),
std::partition(v.begin(), v.end(),
std::bind(std::equal_to<int> {}, _1, SOMETHING)), // some condition
DoStuff); // action
}
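The same idea can be wrapped in a small helper that uses a lambda instead of std::bind. This is only a sketch: process_matching is a name invented here, and it assumes a std::vector<int> that you are allowed to reorder in place.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Moves every element equal to `something` to the front of `values`
// (reordering it in place), calls `action` on each of them, and returns
// how many elements matched.
template <class Action>
std::size_t process_matching(std::vector<int>& values, int something, Action action) {
    auto matches_end = std::partition(values.begin(), values.end(),
                                      [something](int x) { return x == something; });
    std::for_each(values.begin(), matches_end, action);
    return static_cast<std::size_t>(matches_end - values.begin());
}
```

Note that std::partition does not preserve the relative order of the elements; std::stable_partition does, at some extra cost.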
I am in awe of the complexity of the above solutions. I was going to suggest a simple #define foreach(a,b,c,d) for(a; b; c)if(d) but it has a few obvious deficits, for example, you have to remember to use commas instead of semicolons in your loop, and you can't use the comma operator in a or c.
#include <list>
#include <iostream>
using namespace std;
#define foreach(a,b,c,d) for(a; b; c)if(d)
int main(){
list<int> a;
for(int i=0; i<10; i++)
a.push_back(i);
for(auto i=a.begin(); i!=a.end(); i++)
if((*i)&1)
cout << *i << ' ';
cout << endl;
foreach(auto i=a.begin(), i!=a.end(), i++, (*i)&1)
cout << *i << ' ';
cout << endl;
return 0;
}
Another solution in case the values of i are important. This one builds a list of the indexes for which to call doStuff(). Once again the main point is to avoid the branching and trade it for pipelineable arithmetic costs.
int buffer[someSafeSize] = {}; // zero-initialised: the loop below reads buffer[cnt] before writing it
int cnt = 0; // counter to keep track where we are in list.
for( int i = 0; i < container.size(); i++ ){
int lDecision = (container[i] == SOMETHING);
buffer[cnt] = lDecision*i + (1-lDecision)*buffer[cnt];
cnt += lDecision;
}
for( int i=0; i<cnt; i++ )
doStuff(buffer[i]); // now we could pass the index or a pointer as an argument.
The "magical" line is the buffer loading line, which arithmetically decides whether to keep the old value and stay in position, or to store the new value and advance the position. So we trade away a potential branch for some logic and arithmetic and maybe some cache hits. A typical scenario where this would be useful is if doStuff() does a small amount of pipelineable calculations and any branch in between calls could interrupt those pipelines.
Then just loop over the buffer and run doStuff() until we reach cnt. This time we will have the current i stored in the buffer so we can use it in the call to doStuff() if we would need to.
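A compact sketch of the index-buffer idea, under a couple of assumptions: matching_indices is a hypothetical name, the buffer is a std::vector sized to the worst case, and the store is simplified to an unconditional write (a slot that did not match is simply overwritten on the next iteration) rather than the arithmetic blend shown above.

```cpp
#include <cstddef>
#include <vector>

// Returns the indices i for which container[i] == something.
inline std::vector<std::size_t> matching_indices(const std::vector<int>& container,
                                                 int something) {
    // One slot per element is the "safe size": every index could match.
    std::vector<std::size_t> buffer(container.size());
    std::size_t cnt = 0;
    for (std::size_t i = 0; i < container.size(); ++i) {
        buffer[cnt] = i;  // unconditional store, overwritten later if no match
        cnt += static_cast<std::size_t>(container[i] == something);
    }
    buffer.resize(cnt);  // keep only the slots that were committed
    return buffer;
}
```

The caller can then loop over the returned indices and pass each one to doStuff(). Whether this actually beats the plain if depends on the compiler and the data, so it is worth measuring.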
One can describe your code pattern as applying some function to a subset of a range, or in other words: applying it to the result of applying a filter to the whole range.
This is achievable in the most straightforward manner with Eric Niebler's range-v3 library; although it's a bit of an eyesore, because you want to work with indices:
using namespace ranges;
auto mycollection_has_something =
[&](std::size_t i) { return myCollection[i] == SOMETHING; };
auto filtered_view =
views::iota(std::size_t{0}, myCollection.size()) |
views::filter(mycollection_has_something);
for (auto i : filtered_view) { DoStuff(); }
But if you're willing to forego indices, you'd get:
auto is_something = [&SOMETHING](const decltype(SOMETHING)& x) { return x == SOMETHING; };
auto filtered_collection = myCollection | views::filter(is_something);
for (const auto& x : filtered_collection) { DoStuff(); }
which is nicer IMHO.
PS - The ranges library is mostly going into the C++ standard in C++20.
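If neither range-v3 nor C++20 is available, the "apply a function to a filtered subset" pattern can be approximated with a tiny helper. This is a sketch only: for_each_if is not a standard algorithm, and both it and sum_of_odds are names made up for this illustration.

```cpp
#include <vector>

// Calls func(element) for every element of range for which pred(element)
// is true. Works with anything usable in a range-based for loop.
template <class Range, class Pred, class Func>
void for_each_if(Range&& range, Pred pred, Func func) {
    for (auto&& element : range) {
        if (pred(element)) func(element);
    }
}

// Usage example: sums the odd values of a vector.
inline int sum_of_odds(const std::vector<int>& values) {
    int sum = 0;
    for_each_if(values,
                [](int x) { return x % 2 != 0; },
                [&sum](int x) { sum += x; });
    return sum;
}
```

Unlike a filtered view, this does not give you an iterator pair, so it cannot be passed on to other algorithms.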
I'll just mention Mike Acton, he would definitely say:
If you have to do that, you have a problem with your data. Sort your data!
I'm using multithreading and want to merge the results. For example:
std::vector<int> A;
std::vector<int> B;
std::vector<int> AB;
I want AB to have the contents of A and the contents of B, in that order. What's the most efficient way of doing something like this?
AB.reserve( A.size() + B.size() ); // preallocate memory
AB.insert( AB.end(), A.begin(), A.end() );
AB.insert( AB.end(), B.begin(), B.end() );
This is precisely what the member function std::vector::insert is for
std::vector<int> AB = A;
AB.insert(AB.end(), B.begin(), B.end());
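If the source vector is no longer needed afterwards (which may well be the case for per-thread results), the elements can be moved instead of copied. A minimal sketch, where append_moved is a hypothetical helper name; for plain int this makes no difference, but it can matter for element types that are expensive to copy.

```cpp
#include <iterator>
#include <utility>
#include <vector>

// Appends the elements of src to dst by moving them, then empties src.
template <class T>
void append_moved(std::vector<T>& dst, std::vector<T>&& src) {
    dst.reserve(dst.size() + src.size());  // at most one reallocation
    dst.insert(dst.end(),
               std::make_move_iterator(src.begin()),
               std::make_move_iterator(src.end()));
    src.clear();  // the remaining elements are moved-from; drop them
}
```

It would be called as append_moved(AB, std::move(B)); the rvalue-reference parameter makes it visible at the call site that B is given up.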
Depends on whether you really need to physically concatenate the two vectors or you want to give the appearance of concatenation for the sake of iteration. The boost::join function
http://www.boost.org/doc/libs/1_43_0/libs/range/doc/html/range/reference/utilities/join.html
will give you this.
std::vector<int> v0;
v0.push_back(1);
v0.push_back(2);
v0.push_back(3);
std::vector<int> v1;
v1.push_back(4);
v1.push_back(5);
v1.push_back(6);
...
BOOST_FOREACH(const int & i, boost::join(v0, v1)){
cout << i << endl;
}
should give you
1
2
3
4
5
6
Note boost::join does not copy the two vectors into a new container but generates a pair of iterators (range) that cover the span of both containers. There will be some performance overhead, but maybe less than copying all the data to a new container first.
In the direction of Bradgonesurfing's answer, many times one doesn't really need to concatenate two vectors (O(n)), but instead just work with them as if they were concatenated (O(1)). If this is your case, it can be done without the need of Boost libraries.
The trick is to create a vector proxy: a wrapper class which manipulates references to both vectors, externally seen as a single, contiguous one.
USAGE
std::vector<int> A{ 1, 2, 3, 4, 5};
std::vector<int> B{ 10, 20, 30 };
VecProxy<int> AB(A, B); // ----> O(1). No copies performed.
for (size_t i = 0; i < AB.size(); ++i)
std::cout << AB[i] << " "; // 1 2 3 4 5 10 20 30
IMPLEMENTATION
template <class T>
class VecProxy {
private:
std::vector<T>& v1;
std::vector<T>& v2; // each member needs its own &; "std::vector<T>& v1, v2;" would make v2 a copy
public:
VecProxy(std::vector<T>& ref1, std::vector<T>& ref2) : v1(ref1), v2(ref2) {}
const T& operator[](const size_t& i) const;
const size_t size() const;
};
template <class T>
const T& VecProxy<T>::operator[](const size_t& i) const{
return (i < v1.size()) ? v1[i] : v2[i - v1.size()];
};
template <class T>
const size_t VecProxy<T>::size() const { return v1.size() + v2.size(); };
MAIN BENEFIT
It's O(1) (constant time) to create it, and with minimal extra memory allocation.
SOME STUFF TO CONSIDER
You should only go for it if you really know what you're doing when dealing with references. This solution is intended for the specific purpose of the question made, for which it works pretty well. To employ it in any other context may lead to unexpected behavior if you are not sure on how references work.
In this example, AB does not provide a non-const
access operator ([ ]). Feel free to include it, but keep in mind: since AB contains references, to assign it
values will also affect the original elements within A and/or B. Whether or not this is a
desirable feature, it's an application-specific question one should
carefully consider.
Any changes made directly to either A or B (like assigning values, sorting, etc.) will also "modify" AB. This is not necessarily bad (actually, it can be very handy: AB never needs to be explicitly updated to keep itself synchronized with both A and B), but it's certainly a behavior one must be aware of. Resizing A and/or B is reflected in AB as well: the proxy holds references to the vector objects themselves, not pointers to their elements, so a reallocation of their storage does not invalidate it. What does invalidate AB is destroying A or B while AB is still in use.
Because every access to an element is preceded by a test (namely, "i
< v1.size()"), VecProxy access time, although constant, is also
a bit slower than that of vectors.
This approach can be generalized to n vectors. I haven't tried, but
it shouldn't be a big deal.
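As a rough sketch of that generalization: the class below (MultiVecProxy, a name invented here) stores pointers to any number of vectors and walks them on each access. Two things differ from the two-vector version and are worth stating as assumptions: construction is O(k) for k vectors with one small allocation, and element access is O(k) rather than a single comparison.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <vector>

// Read-only view over several vectors, seen as one contiguous sequence.
// The referenced vectors must outlive the proxy.
template <class T>
class MultiVecProxy {
public:
    MultiVecProxy(std::initializer_list<const std::vector<T>*> parts)
        : parts_(parts) {}

    std::size_t size() const {
        std::size_t total = 0;
        for (const auto* part : parts_) total += part->size();
        return total;
    }

    const T& operator[](std::size_t i) const {
        for (const auto* part : parts_) {
            if (i < part->size()) return (*part)[i];
            i -= part->size();  // skip this vector and adjust the index
        }
        throw std::out_of_range("MultiVecProxy: index out of range");
    }

private:
    std::vector<const std::vector<T>*> parts_;
};
```

Usage would look like MultiVecProxy<int> ABC{&A, &B, &C}; followed by the same indexed loop as before.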
Based on Kiril V. Lyadvinsky's answer, I made a new version. This snippet uses templates and operator overloading. With it, you can write vector3 = vector1 + vector2 and vector4 += vector3. Hope it helps.
template <typename T>
std::vector<T> operator+(const std::vector<T> &A, const std::vector<T> &B)
{
std::vector<T> AB;
AB.reserve(A.size() + B.size()); // preallocate memory
AB.insert(AB.end(), A.begin(), A.end()); // add A;
AB.insert(AB.end(), B.begin(), B.end()); // add B;
return AB;
}
template <typename T>
std::vector<T> &operator+=(std::vector<T> &A, const std::vector<T> &B)
{
A.reserve(A.size() + B.size()); // preallocate memory without erase original data
A.insert(A.end(), B.begin(), B.end()); // add B;
return A; // here A could be named AB
}
One more simple variant which was not yet mentioned:
copy(A.begin(),A.end(),std::back_inserter(AB));
copy(B.begin(),B.end(),std::back_inserter(AB));
And using merge algorithm:
#include <algorithm>
#include <vector>
#include <iterator>
#include <iostream>
#include <sstream>
#include <string>
#include <cstdlib> // rand
template<template<typename, typename...> class Container, class T>
std::string toString(const Container<T>& v)
{
std::stringstream ss;
std::copy(v.begin(), v.end(), std::ostream_iterator<T>(ss, ""));
return ss.str();
};
int main()
{
std::vector<int> A(10);
std::vector<int> B(5); //zero filled
std::vector<int> AB(15);
std::for_each(A.begin(), A.end(),
[](int& f)->void
{
f = rand() % 100;
});
std::cout << "before merge: " << toString(A) << "\n";
std::cout << "before merge: " << toString(B) << "\n";
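// the comparator must return a value; always answering "not less" makes std::merge copy all of B, then all of A
std::merge(B.begin(), B.end(), A.begin(), A.end(), AB.begin(), [](int, int) { return false; });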
std::cout << "after merge: " << toString(AB) << "\n";
return 0;
}
All the solutions are correct, but I found it easier to just write a function for this, like so:
template <class T1, class T2>
void ContainerInsert(T1& t1, const T2& t2) // references, so it matches the call below
{
t1.insert(t1.end(), t2.begin(), t2.end());
}
That way you can avoid naming a temporary, like this:
ContainerInsert(vec, GetSomeVector());
For this use case, if you know beforehand the number of results each thread produces, you could preallocate AB and pass a std::span (C++20) to each thread. This way no concatenation is needed at all. Example:
std::vector<int> AB(total_number_of_results, 0);
std::size_t chunk_length = …;
std::size_t chunk2_start = chunk_length;
std::size_t chunk3_start = 2 * chunk_length; // If needed
…
// Pass these to the worker threads.
std::span<int> A(AB.data(), chunk_length);
std::span<int> B(AB.data() + chunk2_start, chunk_length);
…
My answer is based on Ronald Souza's original solution. In addition to his original solution, I've written a vector proxy that supports iterators too!
A short description for people who are not aware of the context of the original solution: the joined_vector template class (i.e. the vector proxy) takes references to two vectors as constructor arguments and then treats them as one contiguous vector. My implementation also supports a forward iterator.
USAGE:
int main()
{
std::vector<int> a1;
std::vector<int> a2;
joined_vector<std::vector<int>> jv(a1,a2);
for (int i = 0; i < 5; i++)
a1.push_back(i);
for (int i = 5; i <=10; i++)
a2.push_back(i);
for (auto e : jv)
std::cout << e<<"\n";
for (int i = 0; i < jv.size(); i++)
std::cout << jv[i] << "\n";
return 0;
}
IMPLEMENTATION:
template<typename _vec>
class joined_vector
{
_vec& m_vec1;
_vec& m_vec2;
public:
struct Iterator
{
typedef typename _vec::iterator::value_type value_type;
typedef typename _vec::iterator::value_type* pointer;
typedef typename _vec::iterator::value_type& reference;
typedef std::forward_iterator_tag iterator_category;
typedef std::ptrdiff_t difference_type;
_vec* m_vec1;
_vec* m_vec2;
Iterator(pointer ptr) :m_ptr(ptr)
{
}
Iterator& operator++()
{
if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
m_ptr = &(*m_vec2)[0];
else
++m_ptr;
return *this; // keeps m_vec1/m_vec2 instead of rebuilding an iterator from a bare pointer
}
Iterator operator++(int)
{
Iterator curr = *this;
++(*this);
return curr;
}
reference operator *()
{
return *m_ptr;
}
pointer operator ->()
{
return m_ptr;
}
friend bool operator == (const Iterator& itr1, const Iterator& itr2)
{
return itr1.m_ptr == itr2.m_ptr;
}
friend bool operator != (const Iterator& itr1, const Iterator& itr2)
{
return itr1.m_ptr != itr2.m_ptr;
}
private:
pointer m_ptr;
};
joined_vector(_vec& vec1, _vec& vec2) :m_vec1(vec1), m_vec2(vec2)
{
}
Iterator begin()
{
//if m_vec1 is not empty, take the address of its first element;
//otherwise take the address of the first element of the second vector, m_vec2;
//if both of them are empty, nullptr is used as the begin pointer
Iterator itr_beg((m_vec1.size() != 0) ? &m_vec1[0] : ((m_vec2.size() != 0) ? &m_vec2[0] : nullptr));
itr_beg.m_vec1 = &m_vec1;
itr_beg.m_vec2 = &m_vec2;
return itr_beg;
}
Iterator end()
{
//if m_vec2 is not empty, take the address of its last element;
//otherwise take the address of the last element of the first vector, m_vec1;
//if both of them are empty, a null pointer is used as the end pointer
typename _vec::value_type* p = ((m_vec2.size() != 0) ? &m_vec2[m_vec2.size() - 1] : ((m_vec1.size()) != 0 ? &m_vec1[m_vec1.size() - 1] : nullptr));
Iterator itr_beg(p != nullptr ? p + 1 : nullptr);
itr_beg.m_vec1 = &m_vec1;
itr_beg.m_vec2 = &m_vec2;
return itr_beg;
}
typename _vec::value_type& operator [](int i)
{
if (i < m_vec1.size())
return m_vec1[i];
else
return m_vec2[i - m_vec1.size()];
}
size_t size()
{
return m_vec1.size() + m_vec2.size();
}
};
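If your vectors are sorted, check out std::set_union from <algorithm>. Note that it produces a sorted union rather than a plain concatenation: a value that occurs in both inputs is written only once (more precisely, max(m, n) times if it occurs m times in A and n times in B).
std::set_union(A.begin(), A.end(), B.begin(), B.end(), std::back_inserter(AB)); // AB starts empty, so append (needs <iterator>)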
There's a more thorough example in the link.
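To make the difference between a union and a merge concrete, here is a small sketch comparing std::set_union with std::merge on sorted input. The wrapper names sorted_union and sorted_merge are made up for this illustration; both assume the inputs are already sorted in ascending order.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sorted union: values common to both inputs are not duplicated.
inline std::vector<int> sorted_union(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());  // upper bound on the result size
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}

// Sorted merge: every element of both inputs is kept.
inline std::vector<int> sorted_merge(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));
    return out;
}
```

For A = {1, 3, 5} and B = {3, 4}, the union is {1, 3, 4, 5} while the merge is {1, 3, 3, 4, 5}. Neither keeps "all of A, then all of B" unless every element of A is less than or equal to every element of B.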