How to implement a lazily evaluated function on two C++20 ranges? - c++

There is a zip_with function provided by Eric Niebler.
But now that C++20 has support for ranges, I would like to build something similar.
The problem with filter and transform is that they iterate over a single range.
How would I go about doing this? I have been stuck with this for a while and would hate to use Expression Templates for the same.
Let's say for example I have two vectors M1{1,2,3} and M2{4,5,6}.
I would like to use the ranges library to overload an operator to return a view which contains the matrix addition of these two: M1+M2 := {5,7,9}.
With range-v3, I can write auto sum = ranges::views::zip_with(std::plus<>{}, M1, M2);
The above expression is evaluated lazily. How can I re-create this expression with C++20 Ranges?

The principle is quite simple. Create an iterator that stores one iterator for each vector; when it is incremented, it increments the two stored iterators, and it performs the addition only when it is dereferenced.
Here is a piece of code that shows the principle:
template <class It1, class It2>
struct adder_iterator{
It1 it1;
It2 it2;
decltype(auto)
operator++(){
++it1; ++it2;
return *this;
}
auto
operator *()const{
return *it1+*it2;
}
//....
};
You will also need to implement a sentinel and a view (by deriving from std::ranges::view_interface).
The sentinel plays the role of the end iterator. You can use the adder_iterator class for that. But you can think about an optimization: in your view constructor, ensure that the begin iterator of the shortest vector is always it1, and then use only this iterator to test for the end of the iteration. You should give it a try.

I don't know what is allowed in c++20, but the following works with range-v3's cpp20 namespace.
#include <range/v3/all.hpp>
#include <vector>
#include <iostream>
int main() {
std::vector<int> m1 = {1, 2, 3};
std::vector<int> m2 = {4, 5, 6};
auto sum = ranges::cpp20::views::transform(m1, m2, std::plus{});
for (auto i : sum)
std::cout << i << " "; // 5 7 9
}

Related

Is it valid to call operator-- for an iterator when it points to std::begin()

Is it valid to call operator-- on an iterator that already points to the first element of the collection? Does the answer change for different collections (e.g. list vs vector vs set). E.g. see below
#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>
int main()
{
std::vector<int> v {1, 2, 4, 8, 16};
auto i=v.begin();
auto j=i;
--i; // not sure what's the effect of this
v.erase(j);
++i; // also not sure if this is not std::begin() anymore, what's the effect of ++
v.erase(i);
// Print vector.
std::for_each(v.begin(), v.end(), [](const int n) { std::cout << n << ' '; });
}
I suspect it's undefined behaviour but not quite sure.
Furthermore, what about removing elements from a std::list like below
std::list<int> v = { 1, 2, 3, 4, 5, 6 };
for (auto it = v.begin(); it != v.end(); it++)
{
v.erase(it--);
}
Let's take std::list as an example, because essentially the same reasoning will apply to the other containers.
Looking at the member types of std::list, we see that std::list::iterator is a LegacyBidirectionalIterator. Checking the description there, we see the following precondition listed for operator-- to be valid:
Preconditions:
a is decrementable (there exists such b that a == ++b)
This is not the case for an iterator to the first element in a container, and indeed cppreference explicitly calls this out:
The begin iterator is not decrementable and the behavior is undefined if --container.begin() is evaluated.
Other containers like std::vector use more expansive notions like LegacyRandomAccessIterator, but there's nothing there that changes the behavior of decrementing a begin iterator.
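For the erase loop in the question, a common way to avoid decrementing the begin iterator altogether is to continue from the iterator that erase returns. The following is only a sketch; erase_even is a made-up helper, and the "remove even values" condition is just an example.

```cpp
#include <list>

// Hypothetical helper: removes every even value from the list.
inline void erase_even(std::list<int>& values) {
    for (auto it = values.begin(); it != values.end(); /* no increment here */) {
        if (*it % 2 == 0)
            it = values.erase(it);  // erase returns the iterator after the removed element
        else
            ++it;
    }
}
```

No iterator is ever moved before begin(), so the loop stays within defined behaviour even when the first element is removed.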

Applying std::min_element only on elements that satisfy a condition

Is there a way to find the minimum odd element of a vector of integers without basically reimplementing std::min_element and without doing additional work like computing the vector of odd integers first?
While a custom comparison object suggested in another answer will be a simple solution for std::min_element (and similar) in particular, it won't work with all standard algorithms. A general approach that works with any standard algorithm is to define a custom iterator.
Customising, combining and extending standard algorithms can nearly always be achieved with iterators. Writing custom iterators from scratch involves a lot of boilerplate and unfortunately standard doesn't provide templates for many iterator adaptors. Boost does provide plenty of iterator adaptor templates, and in this case boost::filter_iterator should prove useful.
Instead of the more traditional iterator algorithms, you could use range algorithms instead.
Since C++20, there are a host of standard range adaptors for range algorithms which are easy to compose:
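auto filtered = container | std::views::filter(condition);
// Keep the view in a named variable: passing the temporary filter view
// directly would make min_element return std::ranges::dangling.
auto it = std::ranges::min_element(filtered);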
Note that at the moment of writing, only libstdc++ has implemented the ranges standard library.
A simple solution consists in using a custom comparator function with std::min_element.
What should be added to the following code is a check that the obtained value is indeed odd, as mentioned by @MSalters in their answer and by @Kevin in a comment.
#include <iostream>
#include <vector>
#include <algorithm>
int main() {
std::vector<int> v = {0, 3, 4, 1};
auto comp = [](int a, int b) {
if ((a%2) and (b%2 == 0)) return true;
if ((a%2 == 0) and (b%2)) return false;
return a < b;
};
auto min_odd = std::min_element (v.begin(), v.end(), comp);
std::cout << *min_odd << std::endl;
}
A C++20 solution:
std::vector<int> ints{0, 1, 2, 3, 4, 5};
auto odd = [](int i) { return bool(i % 2); };
auto odd_ints = ints | std::views::filter(odd); // named view, so the result is not std::ranges::dangling
auto e = std::ranges::min_element(odd_ints);
Yes, that's not very hard. Implement a custom comparison that sorts each even element above all odd elements. You still need to sort the odd elements in their usual order, and at the end check that there was at least one odd element in the vector.
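The comparison described above, including the final check for "no odd element at all", could look roughly like this. It is a sketch only; min_odd is a made-up name, and returning v.end() for "not found" is my own convention.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns an iterator to the smallest odd element,
// or v.end() if the vector contains no odd element.
inline std::vector<int>::const_iterator min_odd(const std::vector<int>& v) {
    auto it = std::min_element(v.begin(), v.end(), [](int a, int b) {
        const bool a_odd = a % 2 != 0;
        const bool b_odd = b % 2 != 0;
        if (a_odd != b_odd) return a_odd;  // every odd value sorts before every even one
        return a < b;                      // same parity: usual order
    });
    // If the overall minimum is even, there was no odd element.
    return (it != v.end() && *it % 2 != 0) ? it : v.end();
}
```

For {0, 3, 4, 1} the returned iterator points at 1; for a vector of even numbers only, or an empty one, it is v.end().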

What is the difference between std::transform and std::for_each?

Both can be used to apply a function to a range of elements.
On a high level:
std::for_each ignores the return value of the function, and
guarantees order of execution.
std::transform assigns the return value to the iterator, and does
not guarantee the order of execution.
When do you prefer using the one versus the other? Are there any subtle caveats?
std::transform is the same as map in functional languages. The idea is to apply a function to each element between the two iterators and obtain a different container composed of the results of applying that function. You may want to use it for, e.g., projecting an object's data member into a new container. In the following, std::transform is used to transform a container of std::strings into a container of std::size_ts.
std::vector<std::string> names = {"hi", "test", "foo"};
std::vector<std::size_t> name_sizes;
std::transform(names.begin(), names.end(), std::back_inserter(name_sizes), [](const std::string& name) { return name.size();});
On the other hand, you execute std::for_each for the sole side effects. In other words, std::for_each closely resembles a plain range-based for loop.
Back to the string example:
std::for_each(name_sizes.begin(), name_sizes.end(), [](std::size_t name_size) {
std::cout << name_size << std::endl;
});
Indeed, starting from C++11 the same can be achieved with a terser notation using range-based for loops:
for (std::size_t name_size: name_sizes) {
std::cout << name_size << std::endl;
}
Your high level overview
std::for_each ignores the return value of the function and guarantees order of execution.
std::transform assigns the return value to the iterator, and does not guarantee the order of execution.
pretty much covers it.
Another way of looking at it (to prefer one over the other);
Do the results (the return value) of the operation matter?
Is the operation on each element a member method with no return value?
Are there two input ranges?
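The last point refers to the binary overload of std::transform, which takes a second input range and a binary operation; std::for_each has no counterpart for that. A small sketch (elementwise_sum is a made-up helper name):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper: adds two vectors element by element,
// stopping at the end of the shorter one.
inline std::vector<int> elementwise_sum(const std::vector<int>& a,
                                        const std::vector<int>& b) {
    const auto n = static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(n));
    // Binary std::transform: reads from [a.begin(), a.begin() + n) and from
    // the range starting at b.begin(), and writes op(x, y) to the output.
    std::transform(a.begin(), a.begin() + n, b.begin(),
                   std::back_inserter(out), std::plus<int>());
    return out;
}
```

Calling elementwise_sum({1, 2, 3}, {4, 5, 6}) gives a vector holding 5, 7, 9.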
One more thing to bear in mind (subtle caveat) is the change in the requirements of the operations of std::transform before and after C++11 (from en.cppreference.com);
Before C++11, they were required to "not have any side effects",
After C++11, this changed to "must not invalidate any iterators, including the end iterators, or modify any elements of the ranges involved"
Basically these were to allow the undetermined order of execution.
When do I use one over the other?
If I want to manipulate each element in a range, then I use for_each. If I have to calculate something from each element, then I would use transform. When using the for_each and transform, I normally pair them with a lambda.
That said, I find my current usage of the traditional for_each being diminished somewhat since the advent of the range based for loops and lambdas in C++11 (for (element : range)). I find its syntax and implementation very natural (but your mileage here will vary) and a more intuitive fit for some use cases.
Although the question has been answered, I believe that this example would clarify the difference further.
for_each belongs to non-modifying STL operations, meaning that these operations do not change elements of the collection or the collection itself. Therefore, the value returned by for_each is always ignored and is not assigned to a collection element.
Nonetheless, it is still possible to modify elements of collection, for example when an element is passed to the f function using reference. One should avoid such behavior as it is not consistent with STL principles.
In contrast, the transform function belongs to the modifying STL operations: it applies the given operation (unary_op or binary_op) to the elements of the collection or collections and stores the results in another collection.
#include <vector>
#include <iostream>
#include <algorithm>
#include <functional>
using namespace std;
void printer(int i) {
cout << i << ", ";
}
int main() {
int mynumbers[] = { 1, 2, 3, 4 };
vector<int> v(mynumbers, mynumbers + 4);
for_each(v.begin(), v.end(), negate<int>());//no effect as returned value of UnaryFunction negate() is ignored.
for_each(v.begin(), v.end(), printer); //guarantees order
cout << endl;
transform(v.begin(), v.end(), v.begin(), negate<int>());//negates elements correctly
for_each(v.begin(), v.end(), printer);
return 0;
}
which will print:
1, 2, 3, 4,
-1, -2, -3, -4,
A real-world example of using std::transform is converting a string to uppercase. You can write code like this:
std::transform(s.begin(), s.end(), std::back_inserter(out), ::toupper);
If you try to achieve the same thing with std::for_each, like this:
std::for_each(s.begin(), s.end(), ::toupper);
it won't convert the string to uppercase, because the return value of ::toupper is discarded.
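For completeness, here is a sketch of both variants that do work. The std::for_each version only has an effect because its lambda takes the character by reference and assigns to it. The helper names are made up, and the unsigned char conversions are there because std::toupper expects a value representable as unsigned char (or EOF).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: std::transform writes each result to the output.
inline std::string to_upper_copy(const std::string& s) {
    std::string out(s.size(), '\0');
    std::transform(s.begin(), s.end(), out.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return out;
}

// Hypothetical helper: std::for_each ignores return values, so the lambda
// has to modify the element itself through a reference.
inline void to_upper_in_place(std::string& s) {
    std::for_each(s.begin(), s.end(), [](char& c) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
}
```

Both produce "HELLO" from "hello"; the first returns a new string, the second changes its argument.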

How to use functional programming in C++11 to obtain the keys from a map?

A std::map<K,V> m is, in a mathematical view, a function fm: the set of all pairs (x,y) ∈ K × V such that fm(x) = y.
So, I want to get the domain of fm, i.e. the set of all keys (or perhaps the range - the set of all values). I can do this procedurally with C++11, like so:
std::unordered_set<K> keys;
for (const auto& kv_pair : m) { keys.insert(kv_pair.first); }
right? But - I want to do it functionally (read: In a fancy way which makes me feel superior). How would I do that?
Notes:
I do not necessarily need the result to be an std::unordered_set; something which could replace such a set would probably work too (e.g. a set Facade).
Readability, (reasonable) terseness and avoiding gratuitous copying of data are all considerations.
Boost.Range provides exactly that, with the adaptor map_keys. Look at this example from the documentation.
You can write:
auto keys = m | boost::adaptors::map_keys;
// keys is a range view to the keys in your map, no copy involved
// you can use keys.begin() and keys.end() to iterate over it
EDIT: I'll leave my old answer below; it uses iterators instead of ranges. Notice that the range represented by the two boost::transform_iterator objects still represents the set of keys in your map.
IMO the functional way to do that would require an iterator that points to the keys of the map, so that you can simply use std::copy.
It makes sense because you are not transforming or accumulating anything, you are just copying the keys.
Unfortunately the standard does not provide iterator adaptors, but you can use those provided by Boost.Iterator.
#include <algorithm>
#include <iterator> // std::inserter
#include <map>
#include <string>
#include <unordered_set>
#include <boost/iterator/transform_iterator.hpp>
struct get_first
{
template<class A, class B>
const A & operator()(const std::pair<A,B> & val) const
{
return val.first;
}
};
int main()
{
std::map<int, std::string> m;
std::unordered_set<int> r;
// ...
std::copy(boost::make_transform_iterator(m.begin(), get_first{}),
boost::make_transform_iterator(m.end(), get_first{}),
std::inserter(r, r.end()) );
}
It would be more expressive to have an iterator that dereferences the Kth element of a tuple/pair, but transform_iterator will do the job fine.
IMHO, an important characteristic for intuitive functional code is that the algorithm actually return the result, rather than setting some variable elsewhere as a side effect. This can be done with std::accumulate, e.g.:
#include <iostream>
#include <set>
#include <map>
#include <numeric> // std::accumulate is declared here, not in <algorithm>
int main()
{
typedef std::map<int, int> M;
M m { {1, -1}, {2, -2}, {3, -3}, {4, -4} };
auto&& x = std::accumulate(std::begin(m), std::end(m), std::set<int>{},
[](std::set<int> s, const M::value_type& e) // by value: since C++20, accumulate passes std::move(acc)
{
return s.insert(e.first), std::move(s);
// .first is key,
}); // .second is value
for (auto& i : x)
std::cout << i << ' ';
std::cout << '\n';
}
Output:
1 2 3 4
The std::begin(m), std::end(m) bit is actually a big headache, as it frustrates chaining of such operations. For example, it'd be ideal if we could chain "functional" operations like our "GET KEYS" above alongside others...
x = m. GET KEYS . SQUARE THEM ALL . REMOVE THE ODD ONES
...or at least...
x = f(f(f(m, GET KEYS), SQUARE THEM ALL), REMOVE THE ODD ONES)
...but you'll have to write some trivial code yourself to get there or pick up a library supporting functional "style".
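As a rough idea of what that "trivial code" might look like, here is a sketch of three small helpers that can be nested in the f(f(f(...))) style. All names (keys_of, mapped, without) are invented for this example, and the helpers are eager: each step produces a new vector rather than a lazy view.

```cpp
#include <algorithm>
#include <map>
#include <vector>

// All helper names below are hypothetical.

// GET KEYS: copies the keys of a map into a vector.
template <class K, class V>
std::vector<K> keys_of(const std::map<K, V>& m) {
    std::vector<K> out;
    out.reserve(m.size());
    for (const auto& kv : m) out.push_back(kv.first);
    return out;
}

// Applies f to every element and returns the result.
template <class T, class F>
std::vector<T> mapped(std::vector<T> v, F f) {
    std::transform(v.begin(), v.end(), v.begin(), f);
    return v;
}

// Removes every element for which pred returns true.
template <class T, class P>
std::vector<T> without(std::vector<T> v, P pred) {
    v.erase(std::remove_if(v.begin(), v.end(), pred), v.end());
    return v;
}
```

With the map m from the example above, without(mapped(keys_of(m), [](int k) { return k * k; }), [](int k) { return k % 2 != 0; }) gets the keys, squares them all and removes the odd ones, leaving 4 and 16.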
There's a number of ways you could write this. One slightly more 'functional' way is:
vector<string> keys;
transform(begin(m), end(m), back_inserter(keys), [](const auto& p){ return p.first; });
But to really improve on this and enable a more functional style using the standard library we need something like Eric Niebler's Range Proposal to be standardized. In the meantime, there are a number of non-standard range based libraries like Eric's range-v3 and boost Range you can use to get a more functional style.
std::map<int, int> m;
std::unordered_set<int> keys;
std::for_each(m.begin(), m.end(), [&keys](decltype(*m.begin()) kv)-> void {keys.insert(kv.first);});

Saving function evaluations while using std::min_element()

Suppose you are given a vector of 2D points and are expected to find the point with the least Euclidean norm.
The points are provided as std::vector<point_t> points with the following typedef: std::pair<double, double> point_t. The norm can be calculated using
double norm(point_t p)
{
return pow(p.first, 2) + pow(p.second, 2);
}
Writing the loop myself I would do the following:
auto leastPoint = points.cend();
auto leastNorm = std::numeric_limits<double>::max();
for (auto iter = points.cbegin(), end = points.cend(); iter != end; ++iter)
{
double const currentNorm = norm(*iter);
if (currentNorm < leastNorm)
{
leastNorm = currentNorm;
leastPoint = iter;
}
}
But one should use STL algorithms instead of writing one's own loops, so I'm tempted to do the following:
auto const leastPoint = std::min_element(points.cbegin(), points.cend(),
[](point_t const lhs, point_t const rhs){ return norm(lhs) < norm(rhs); });
But there is a caveat: if n = points.size() then the first implementation needs n evaluations of norm(), but the second implementation needs 2n-2 evaluations. (at least if this possible implementation is used)
So my question is if there exists any STL algorithm with which I can find that point but with only n evaluations of norm()?
Notes:
I am aware that big-Oh complexity is the same, but still the latter will lead to twice as many evaluations
Creating a separate vector and populating it with the distances seems a bit overkill just to enable the usage of an STL algorithm - different opinions on that?
edit: I actually need an iterator to that vector element to erase that point.
You could use std::accumulate (in the <numeric> header).
std::accumulate receives:
a range
an initial value
a binary operator (optional; if not passed, operator+ is used)
The initial value and the first element of the range are fed into the operator. The operator returns a result of the type of the initial value, which is fed into the next call together with the next element of the range, and so on.
Example Code (Tested GCC 4.9.0 with C++11):
#include <cmath>
#include <iostream>
#include <numeric> // std::accumulate
#include <utility> // std::pair
#include <vector>
typedef std::pair<double, double> point_t;
struct norm_t {
point_t p;
double norm;
};
double norm(const point_t& p) {
return std::pow(p.first, 2) + std::pow(p.second, 2);
}
norm_t min_norm(const norm_t& x, const point_t& y) {
double ny = norm(y);
if (ny < x.norm)
return {y, ny};
return x;
}
int main() {
std::vector<point_t> v{{1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}};
norm_t first_norm{v[0], norm(v[0])};
auto min_norm_point =
std::accumulate(v.begin(), v.end(), first_norm, min_norm);
std::cout << "(" << min_norm_point.p.first << "," << min_norm_point.p.second
<< "): " << min_norm_point.norm << '\n';
}
You could cache the minimum norm in the functor to avoid the extra calculation (be aware: this relies on knowledge of how std::min_element is implemented, which the standard does not guarantee). The second argument is the smallest element found so far and the first is the element currently being visited.
struct minimum_norm {
minimum_norm() : cached_norm(-1) {}
bool operator()(const point_t& first, const point_t& second) {
if (cached_norm == -1)
cached_norm = norm(second);
double norm_first = norm(first);
if (norm_first < cached_norm) {
cached_norm = norm_first;
return true;
}
return false;
}
private:
double cached_norm;
};
int main()
{
std::vector<point_t> v{{3, 4}, {5, 6}, {1, 2}, {7, 8}, {9, 10}};
auto result = std::min_element(std::begin(v), std::end(v), minimum_norm());
std::cout << "min element at: " << std::distance(std::begin(v), result) << std::endl;
}
This is the sort of problem that boost::transform_iterator from the boost iterator library is designed to solve. There are limitations with the decorated iterator approach however and the C++ standards committee Ranges working group is looking into adding ranges to the standard which would potentially allow for a more functional approach of piping e.g. a transform to a min_element without needing temporary storage.
Eric Niebler has some interesting posts on ranges at his blog.
Unfortunately transform_iterator doesn't quite solve your problem given the way min_element is typically implemented - both iterators are dereferenced for each comparison so your function will still end up getting called more often than necessary. You could use the boost iterator_adaptor to implement something like a 'caching_transform_iterator' which avoids recomputing on each dereference but it would probably be overkill for something like norm(). It might be a useful technique if you had a more expensive computation though.
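If neither Boost nor a cached functor appeals, another option is a small hand-written algorithm that evaluates the key exactly once per element and still returns an iterator, which the question needs for erasing. This is only a sketch: min_element_by is not a standard algorithm, and squared_norm is a stand-in for the norm() from the question.

```cpp
#include <utility>
#include <vector>

using point_t = std::pair<double, double>;

// Stand-in for norm() from the question (same formula, without pow).
inline double squared_norm(const point_t& p) {
    return p.first * p.first + p.second * p.second;
}

// Hypothetical algorithm, not part of the standard library: returns an
// iterator to the element with the smallest key, calling key() once per element.
template <class It, class Key>
It min_element_by(It first, It last, Key key) {
    if (first == last) return last;
    It best = first;
    auto best_key = key(*first);
    for (++first; first != last; ++first) {
        auto current = key(*first);
        if (current < best_key) {
            best_key = current;
            best = first;
        }
    }
    return best;
}
```

For example, points.erase(min_element_by(points.begin(), points.end(), squared_norm)) removes the point closest to the origin, provided points is not empty.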
EDIT: Never mind this, I misread the question.
I think you are mistaken in your assumption that min_element will perform 2N-2 comparisons.
Per the C++ reference for min_element, the algorithm performs essentially N comparisons, which is the minimum for an unsorted array.
Here is a copy for the (very) unlikely case that www.cplusplus.com ever fails.
template <class ForwardIterator>
ForwardIterator min_element ( ForwardIterator first, ForwardIterator last )
{
if (first==last) return last;
ForwardIterator smallest = first;
while (++first!=last)
if (*first<*smallest) // or: if (comp(*first,*smallest)) for version (2)
smallest=first;
return smallest;
}
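The listing above also shows where the 2n-2 in the question comes from: the algorithm makes n-1 comparisons, and each call of the question's comparison lambda evaluates the norm twice. A small sketch that counts the evaluations (count_norm_calls is a made-up helper, and the norm is written out inline):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using point_t = std::pair<double, double>;

// Hypothetical helper: runs std::min_element with a norm-based comparison
// and returns how many times the norm was evaluated.
inline std::size_t count_norm_calls(const std::vector<point_t>& points) {
    std::size_t calls = 0;
    auto norm = [&calls](const point_t& p) {
        ++calls;
        return p.first * p.first + p.second * p.second;
    };
    // The result iterator is not needed here, only the call count.
    static_cast<void>(std::min_element(
        points.begin(), points.end(),
        [&norm](const point_t& lhs, const point_t& rhs) {
            return norm(lhs) < norm(rhs);  // two evaluations per comparison
        }));
    return calls;
}
```

For five points this reports 8 evaluations, i.e. 2 * (5 - 1), while the hand-written loop in the question needs only 5.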