Range transformations with stateful lambdas and std::views::drop

This is my first time digging into the new <ranges> library, and I tried a little experiment combining std::views::transform with a stateful lambda and piping the resulting range to std::views::drop:
#include <iostream>
#include <ranges>
#include <vector>
using namespace std;
int main() {
auto aggregator = [sum = 0](int val) mutable
{
return sum += val;
};
vector<int> data{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
cout << "Expected:\n";
int sum = 0;
for (int val : data) {
cout << (sum += val) << ' ';
}
cout << '\n';
cout << "Transformation:\n- - - ";
for (int val : data | views::transform(aggregator) | views::drop(3)) {
cout << val << ' ';
}
cout << '\n';
}
The output was:
Expected:
1 3 6 10 15 21 28 36 45 55
Transformation:
- - - 4 9 15 22 30 39 49
Now, the difference between each expected and actual output is 1 + 2 + 3 = 6. I am guessing it is a result of the lazy evaluation of ranges that causes std::views::drop to disregard the first three transformations.
Is there a way for me to force the evaluation of the aggregator functor for the three elements I drop? Or are stateful lambdas and ranges considered incompatible?

The function passed to transform_view is required to be pure. This is codified in the regular_invocable concept:
The invoke function call expression shall be equality-preserving and shall not modify the function object or the arguments.
This is important to allow transform_view to not lie about its iterator status. Forward iterators, for example, are supposed to allow multipass iteration. This means that the value at each iterator position within the range must be independent of any other iterator position. That's not possible if the transformation functor is not pure.
Note that all predicate functors are also regular_invocable. So this also applies to things like filter_view and take_while_view.
Note that the algorithm transform does not have this requirement.

What is wrong with my comparison function?

I am trying to sort a vector whose elements each pair a pair of ints with an int, as shown below, but I am not getting the expected output. In the actual output, the last element is supposed to come before the second element. Can someone please explain what I am missing?
int main()
{
using elem_type = std::pair<std::pair<int,int>,int>;
std::vector<elem_type> vec;
vec.push_back(std::make_pair(std::make_pair(3, 1), 2));
vec.push_back(std::make_pair(std::make_pair(6, 5), 4));
vec.push_back(std::make_pair(std::make_pair(6, 4), 7));
vec.push_back(std::make_pair(std::make_pair(5, 4), 6));
auto cmp = [](const elem_type & left, const elem_type & right){
return ((left.first.first< right.first.first)
&&
(left.first.second < right.first.second));
};
std::sort(vec.begin(), vec.end(), cmp);
//print sorted vector
for(size_t i = 0; i < vec.size(); ++i){
std::cout << vec[i].first.first << " " << vec[i].first.second << " " << vec[i].second << "\n";
}
}
Expected output
3 1 2
5 4 6
6 4 7
6 5 4
Actual output
3 1 2
6 5 4
6 4 7
5 4 6

You haven't explained how you want to sort your triples, so all I can say is that your expectations are wrong.
Your comparison function considers your last three elements to be equal.
A triple (x0,x1,x2) is considered less than another triple (y0,y1,y2) if x0 < y0 and x1 < y1. For example, when comparing (6,4,7) and (6,5,4), neither triple is considered less than the other because the first number in each triple is the same (6 < 6 is false). Similarly, (5,4,6) is considered equal to (6,4,7) because neither is less than the other (4 < 4 is false).
The only thing you might reasonably expect is that (5,4,6) < (6,5,4), but your comparison function also says both of those are equal to (6,4,7). In other words, the function claims there are values a, b, c where a = b and b = c but a < c. This makes no sense, so your comparison function is broken.
If all you want is a lexicographical ordering, you don't need to do anything special:
std::sort(vec.begin(), vec.end());
std::pair sorts by its first component first; if those are equal, it compares the second components. That seems to be exactly the behavior you expect.
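If an explicit comparator is still wanted, it has to be a strict weak ordering. The sketch below is one way to write it with std::tie; lexicographic_less and sorted_copy are hypothetical names, and the ordering is assumed to be the same lexicographic one that std::pair already provides.

```cpp
#include <algorithm>
#include <tuple>
#include <utility>
#include <vector>

using elem_type = std::pair<std::pair<int, int>, int>;

// Compares first.first, then first.second, then second.
// Later fields are consulted only when all earlier ones are equal,
// which is what makes this a valid strict weak ordering.
bool lexicographic_less(const elem_type& left, const elem_type& right) {
    return std::tie(left.first.first, left.first.second, left.second)
         < std::tie(right.first.first, right.first.second, right.second);
}

// Returns a sorted copy so the caller's vector is left untouched.
std::vector<elem_type> sorted_copy(std::vector<elem_type> vec) {
    std::sort(vec.begin(), vec.end(), lexicographic_less);
    return vec;
}
```

Applied to the four elements from the question, this gives the order listed under "Expected output".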

A vector as a patchwork of two other vectors

Subset a vector
Below is the benchmark of two different solutions to subset a vector
#include <vector>
#include <iostream>
#include <iomanip>
#include <sys/time.h>
using namespace std;
int main()
{
struct timeval timeStart,
timeEnd;
// Build the vector 'whole' to subset
vector<int> whole;
for (int i = 0 ; i < 10000000 ; i++)
{
whole.push_back(i);
}
// Solution 1 - Use a for loop
gettimeofday(&timeStart, NULL);
vector<int> subset1;
subset1.reserve(9123000 - 1200);
for (int i = 1200 ; i < 9123000 ; i++)
{
subset1.push_back(i);
}
gettimeofday(&timeEnd, NULL);
cout << "Solution 1 took " << ((timeEnd.tv_sec - timeStart.tv_sec) * 1000000 + timeEnd.tv_usec - timeStart.tv_usec) << " us" << endl;
// Solution 2 - Use iterators and constructor
gettimeofday(&timeStart, NULL);
vector<int>::iterator first = whole.begin() + 1200;
vector<int>::iterator last = whole.begin() + 9123000;
vector<int> subset2(first, last);
gettimeofday(&timeEnd, NULL);
cout << "Solution 2 took " << ((timeEnd.tv_sec - timeStart.tv_sec) * 1000000 + timeEnd.tv_usec - timeStart.tv_usec) << " us" << endl;
}
On my old laptop, it outputs
Solution 1 took 243564 us
Solution 2 took 164220 us
Clearly solution 2 is faster.
Make a patchwork of two vectors
I would like to create a vector as a patchwork of two different vectors of the same size. The vector starts as one and then takes the values of the other, back and forth. I guess I don't fully understand how to copy values to a vector by using iterators pointing to elements in another vector. The only implementation I can think of is analogous to solution 1 above. Something like...
#include <vector>
#include <iostream>
#include <cmath>
#include <iomanip>
#include <sys/time.h>
#include <limits.h>
using namespace std;
int main()
{
// input
vector<int> breakpoints = {2, 5, 7, INT_MAX};
vector<int> v1 = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
vector<int> v2 = { 10, 20, 30, 40, 50, 60, 70, 80, 90 };
// Create output
vector<int> ExpectedOutput;
ExpectedOutput.reserve(v1.size());
int origin = 0;
int breakpoints_index = 0;
for (int i = 0 ; i < v1.size() ; i++)
{
if (origin)
{
ExpectedOutput.push_back(v1[i]);
} else
{
ExpectedOutput.push_back(v2[i]);
}
if (breakpoints[breakpoints_index] == i)
{
origin = !origin;
breakpoints_index++;
}
}
// print output
cout << "output: ";
for (int i = 0 ; i < ExpectedOutput.size() ; i++)
{
cout << ExpectedOutput[i] << " ";
}
cout << endl;
return 0;
}
which outputs
output: 10 20 30 4 5 6 70 80 9
It feels like there must be a better solution such as something analogous to Solution 2 from above. Is there a faster solution?

Repeating push_back() means that every time around the loop, a check is being performed to ensure capacity() is large enough (if not, then more space must be reserved). When you copy a whole range, only one capacity() check needs to be done.
You can still be a bit smarter with your interleaving by copying chunks. Here's the very basic idea:
int from = 0;
for( int b : breakpoints )
{
std::swap( v1, v2 );
int to = 1 + std::min( b, static_cast<int>( v1.size() ) - 1 );
ExpectedOutput.insert( ExpectedOutput.end(), v1.begin() + from, v1.begin() + to );
from = to;
}
For the sake of brevity, this code actually swaps v1 and v2 and so always operates on v1. I did the swap before the insert, to emulate the logic in your code (which is acting on v2 first). You can do this in a non-modifying way instead if you want.
Of course, you can see a bit more is going on in this code. It would only make sense if you have considerably fewer breakpoints than values. Note that it also assumes v1 and v2 are the same length.

Shorthand for for-loop - syntactic sugar in C++(11)

Actually these are two related questions.
I know there is a new syntax in C++11 for range-based for loops of the form:
//v is some container
for (auto &i: v){
// Do something with i
}
First question: how can I infer at which iteration I am in this loop? (Say I want to fill a vector with value j at position j).
Second question: I wanted to know if there also is some other way to write a loop of the form
for (int i=0; i<100; i++) { ... }
I find this way of writing it a bit cumbersome, and since I do it so often, I would love to have a more concise syntax for it.
Something along the lines:
for(i in [0..99]){ ... }
would be great.
For both questions I would like to avoid having to use additional libraries.

First answer: you don't. You've used a simple construct for a simple purpose; you'll need something more complicated if you have more complicated needs.
Second answer: You could make an iterator type that yields consecutive integer values, and a "container" type that gives a range of those. Unless you have a good reason to do it yourself, Boost has such a thing:
#include <boost/range/irange.hpp>
for (int i : boost::irange(0,100)) {
// i goes from 0 to 99 inclusive
}
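For readers who want to avoid Boost, the "iterator type plus container type" idea can be sketched in plain C++11 as below. The names int_range and collect are hypothetical, and the iterator implements only the three operations a range-based for loop needs, so it is not a complete standard iterator.

```cpp
#include <vector>

// A half-open integer range [first, last) usable in a range-based for loop.
class int_range {
public:
    class iterator {
    public:
        explicit iterator(int value) : value_(value) {}
        int operator*() const { return value_; }
        iterator& operator++() { ++value_; return *this; }
        bool operator!=(const iterator& other) const { return value_ != other.value_; }
    private:
        int value_;
    };

    // If last < first the range is treated as empty, so the loop always terminates.
    int_range(int first, int last) : first_(first), last_(last < first ? first : last) {}
    iterator begin() const { return iterator(first_); }
    iterator end() const { return iterator(last_); }

private:
    int first_;
    int last_;
};

// Small demonstration helper: gathers the values produced by the range.
std::vector<int> collect(int first, int last) {
    std::vector<int> out;
    for (int i : int_range(first, last)) {
        out.push_back(i);
    }
    return out;
}
```

With this in place, for (int i : int_range(0, 100)) visits 0 through 99.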

Use this:
size_t pos = 0;
for (auto& i : v) {
i = pos;
++pos;
}
(Boost is good, but it is not universally accepted.)

For the first question, the answer is pretty simple: if you need the iteration count, don't use the syntactic construct which abstracts away the iteration count. Just use a normal for loop and not the range-based one.
For the second question, I don't think there's anything currently in the standard library, but you could use a boost::irange for it:
for (int i : boost::irange(0, 100))

For the second question - if Boost is too heavy, you could always use this library:
cppitertools
for(auto i : range(10, 15)) { cout << i << '\n'; } will print 10 11 12 13 14
for(auto i : range(20, 30, 2)) { cout << i << '\n'; } will print 20 22 24 26 28
Doubles and other numeric types are supported too.
It has other pythonic iteration tools and is header-only.

You can do both of these things with Boost.Range: http://boost.org/libs/range
boost::adaptors::indexed: element value & index
boost::irange: integer range
For brevity (and to spice things up a little, since boost::irange has already been demonstrated in isolation), here's some sample code demonstrating these features working together:
// boost::adaptors::indexed
// http://www.boost.org/doc/libs/master/libs/range/doc/html/range/reference/adaptors/reference/indexed.html
#include <boost/range/adaptor/indexed.hpp>
// boost::irange
// http://www.boost.org/doc/libs/master/libs/range/doc/html/range/reference/ranges/irange.html
#include <boost/range/irange.hpp>
#include <iostream>
#include <vector>
int main()
{
std::vector<int> input{11, 22, 33, 44, 55};
std::cout << "boost::adaptors::indexed" << '\n';
for (const auto & element : input | boost::adaptors::indexed())
{
std::cout << "Value = " << element.value()
<< " Index = " << element.index()
<< '\n';
}
endl(std::cout);
std::cout << "boost::irange" << '\n';
for (const auto & element : boost::irange(0, 5) | boost::adaptors::indexed(100))
{
std::cout << "Value = " << element.value()
<< " Index = " << element.index()
<< '\n';
}
return 0;
}
Sample output:
boost::adaptors::indexed
Value = 11 Index = 0
Value = 22 Index = 1
Value = 33 Index = 2
Value = 44 Index = 3
Value = 55 Index = 4
boost::irange
Value = 0 Index = 100
Value = 1 Index = 101
Value = 2 Index = 102
Value = 3 Index = 103
Value = 4 Index = 104

If v is a vector (or any std contiguous container), then
for(auto& x : v ) {
size_t i = &x-v.data();
x = i;
}
will set the ith entry to the value i.
An iterator that counts is reasonably easy to write. Boost has one and has an easy-to-generate range of them called irange.
Extracting the indexes of a container is relatively easy. I have written a function called indexes that can take a container, or a range of integers, and produces random-access iterators over the range in question.
That gives you:
for (size_t i : indexes(v) ) {
v[i] = i;
}
There probably is an equivalent container-to-index range function in Boost.
If you need both, and you don't want to do the work, you can write a zipper.
for( auto z : zip( v, indexes(v) ) ) {
auto& x = std::get<0>(z);
size_t i = std::get<1>(z);
x = i;
}
where zip takes two or more iterable ranges (or containers) and produces a range view over tuples of iterator_traits<It>::references to the elements.
Here is Boost zip iterator: http://www.boost.org/doc/libs/1_41_0/libs/iterator/doc/zip_iterator.html -- odds are there is a Boost zip range that handles syntax like the above zip function.

For the 2nd question:
There is another way, but I would not use or recommend it. However, for quickly setting up a test, if you do not want to use a library and you are fine with only providing the upper bound of the range, you can write:
for (auto i:vector<bool>(10)) {
cout << "x";
}
This will create a boolean vector of size 10, with every element initialized to false. The loop iterates over those ten elements, so it prints "x" 10 times. Note that i holds the element value (always false), not the iteration index, so do not use i.

For the second question, if you are using a recent version of Visual Studio, type 'for', then press Tab, Tab, and Tab to fill in the init value, step-up and so on.

STL algorithm to generate Fibonacci numbers until a certain value is reached

The following code will generate the first 10 Fibonacci numbers using the adjacent_difference algorithm:
std::vector<int> v = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
std::adjacent_difference(v.begin(), v.end() - 1, v.begin() + 1, std::plus<int>());
for (auto n : v) {
std::cout << n << ' ';
}
std::cout << '\n';
Output: 1 1 2 3 5 8 13 21 34 55
But what if I want to continue generating Fibonacci numbers until one with a value of (say) 4,000,000 is reached (i.e. not the four-millionth Fibonacci number, but rather the Nth Fibonacci number whose value happens to be 4 million or greater)?
Obviously a do while loop with a push_back would do the job, but I was wondering if it were possible to combine an STL algorithm with a back_inserter and lambda function to specify the repeat until condition (e.g. stop inserting after value 4 million is reached or exceeded)?
The problem I see is that most algorithms operate on a range, and ahead of time we do not know how many elements will be required to produce the Fibonacci number with 4 million.

Standard algorithms are there to capture patterns that are common in programming practice. Using them makes the code easier to understand, both for you and for whoever reads it. Using built-in algorithms to accumulate the Fibonacci numbers up to a given value is overkill, both for you and for whoever reads your code.
Writing a 'dumb' solution for your use case is really easy, and the result will be easier to maintain. For instance:
void fibUpTo(int limit) {
int a, b, c;
a = b = 1;
while (a < limit) {
cout << a << endl;
c = a + b;
a = b;
b = c;
}
}

int my_plus(int a, int b)
{
int result = a + b;
if (result >= 4000000)
throw result;
return result;
}
try {
adjacent_difference(v.begin(), v.end() - 1, v.begin() + 1, my_plus);
} catch (int final) {
cout << final << endl;
}
That's what I'd consider a "dumb hack," but I think it will work. If you want to pretty it up a little, make an exception class to hold the final result instead of throwing a raw integer. And make the threshold a template parameter.
But really, don't do any of this, because it's a dumb hack: just use a "for" loop as you mentioned.

With find_if and a little help from boost iterator lib:
#include <boost/iterator/function_input_iterator.hpp>
#include <algorithm>
#include <climits>
struct fibonacci_generator {
typedef int result_type;
fibonacci_generator() : n(0) {}
// dummy generator
// put the code to generate fibonacci
// sequence here
int operator()() { return n++; }
private:
int n;
};
int main()
{
fibonacci_generator g;
int i = *std::find_if(
make_function_input_iterator(g, boost::infinite()),
make_function_input_iterator(g, boost::infinite()),
[](int i) { return i > 1000000; });
}
A copy_until algorithm could be useful here to push back results to a vector, but you need to write your own.
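A hand-written algorithm in that spirit might look like the sketch below. The names generate_until and fibonacci_until are hypothetical, not standard or Boost algorithms. The stopping rule assumed here is the one from the question: keep inserting until a value at or above the limit has been written.

```cpp
#include <iterator>
#include <vector>

// Writes values produced by `gen` to `out` until `stop(value)` is true.
// The value that triggers the stop is written as well.
template <class OutputIt, class Generator, class Predicate>
OutputIt generate_until(OutputIt out, Generator gen, Predicate stop) {
    for (;;) {
        const auto value = gen();
        *out++ = value;
        if (stop(value)) {
            return out;
        }
    }
}

// Fibonacci numbers 1, 1, 2, 3, ... up to and including the first one >= limit.
// long long is used so that typical limits stay far away from overflow.
std::vector<long long> fibonacci_until(long long limit) {
    std::vector<long long> result;
    long long a = 1;
    long long b = 1;
    generate_until(
        std::back_inserter(result),
        [&a, &b] {
            const long long current = a;
            const long long next = a + b;
            a = b;
            b = next;
            return current;
        },
        [limit](long long value) { return value >= limit; });
    return result;
}
```

Calling fibonacci_until(4000000) fills the vector without knowing the element count in advance; the last element is 5702887, the first Fibonacci number at or above 4 million.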

std::set::equal_range for a container of std::pair

I'm trying to find all the ranges [a,b] enclosing int values i, where a <= i <= b. I'm using std::set<std::pair<int,int>> for the set of ranges.
In the following, using equal_range on a set<int> yields the start and one past the end of the range.
When I do the same for a set<pair<int,int>>, the result starts and ends at one past the end of the range and therefore doesn't include the range enclosing the value.
#include <set>
#include <iostream>
#include <algorithm>
using namespace std;
int main()
{
int ia[] = {1,2,3,4,5,6,7,8,9,10};
set<int> s1(begin(ia),end(ia));
auto range1 = s1.equal_range(5);
cout << *range1.first << " " << *range1.second << endl; //prints 5 6
pair<int,int> p[] = {make_pair(1,10),
make_pair(11,20),
make_pair(21,30),
make_pair(31,40)};
set<pair<int,int>> s(begin(p), end(p));
auto range = s.equal_range(make_pair(12,12));
cout << range.first->first << " " << range.first->second << endl; //prints 21 30, why?
cout << range.second->first << " " << range.second->second << endl; //prints 21 30
}
prints 5 6
21 30
21 30
Why does equal_range on the set<pair<int,int>> not include the range that encloses the value (12), namely [11,20]?

equal_range is behaving completely correctly:
assert( std::make_pair(11, 20) < std::make_pair(12, 12) );
assert( std::make_pair(12, 12) < std::make_pair(21, 30) );
[11,20] is not a range, it's a pair. The pair [12,12] is not "within" another pair, that makes no sense to even say.
[12,12] is not "within" [11,20], it's greater than it. The less-than operator for std::pair compares the first elements first, and only if they're equal does it look at the second elements, so make_pair(11, x) is less than make_pair(12, y) for any x and y.
So equal_range tells you that [12,12] would be inserted after [11,20] and before [21,30], which is correct.
If you want to treat pairs as ranges of values you need to write code to do that, not assume the built-in comparisons for pairs do that. You're actually trying to find an int 12 in a range of pairs of ints, but have written code to find a pair [12,12] in a range of pairs of ints. That's not the same thing.
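As a sketch of what such code could look like, the function below finds the stored range that encloses an int. It assumes the stored ranges do not overlap, as in the question's data. find_enclosing and range_set are hypothetical names.

```cpp
#include <limits>
#include <set>
#include <utility>

using range_set = std::set<std::pair<int, int>>;

// Returns an iterator to the range [a, b] with a <= i <= b, or s.end() if none exists.
// Assumption: the ranges in s are non-overlapping, so at most one can contain i.
range_set::const_iterator find_enclosing(const range_set& s, int i) {
    // First element whose start is greater than i.
    auto it = s.upper_bound(std::make_pair(i, std::numeric_limits<int>::max()));
    if (it == s.begin()) {
        return s.end();  // every range starts after i
    }
    --it;  // the last range that starts at or before i
    return (i <= it->second) ? it : s.end();
}
```

For the set in the question, find_enclosing(s, 12) points at [11,20]. The result is compared against s.end() before it is dereferenced.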

It does not include [11, 20] in the range, because it doesn't include anything in the range. There are no equal elements to [12, 12], so it returns an empty range (represented by the half-open interval [x, x)).
BTW dereferencing the upper bound of the range may invoke undefined behavior, since that may be equal to s.end().

The pair [12, 12] is sorted after [11, 20] and before [21, 30].
std::set::equal_range() includes a range of equal elements. There is no equal element in your set (especially not [11, 20]), so equal_range() returns [21, 30], [21, 30].

equal_range is typically implemented by calling lower_bound first and then calling upper_bound to search the rest of the data set:
template <class ForwardIterator, class T>
pair<ForwardIterator,ForwardIterator>
equal_range ( ForwardIterator first, ForwardIterator last, const T& value )
{
ForwardIterator it = lower_bound (first,last,value);
return make_pair ( it, upper_bound(it,last,value) );
}
Look at your sample:
It calls lower_bound to locate the lower bound of value (which is pair(12,12)), which arrives at
pair<int,int> p[] = {make_pair(1,10),
make_pair(11,20),
make_pair(21,30), // <--- lower_bound() points to here
make_pair(31,40)};
Then it calls upper_bound() to search the remaining elements (21,30), (31,40). The first element greater than (12,12) is again (21,30), so upper_bound() also returns an iterator to (21,30).
http://www.cplusplus.com/reference/algorithm/upper_bound/

I don't think your std::set<std::pair<int, int>> will help you much in intersecting it with your integer. You can use s.lower_bound(std::make_pair(i + 1, i + 1)) to cut off the search, but all ranges starting at an index lower than i + 1 can potentially include the value i if the second boundary is large enough. What might help is knowing the maximum size of the ranges, in which case you can bound the search towards the front with s.lower_bound(std::make_pair(i - max_range_size, i - max_range_size)). You'll need to inspect each of the ranges in turn to identify whether your i falls into them:
auto it = s.lower_bound(std::make_pair(i - max_range_size, i - max_range_size));
auto upper = s.lower_bound(std::make_pair(i + 1, i + 1));
for (; it != upper; ++it) {
    // The ranges are inclusive (a <= i <= b), so use <= here.
    if (i <= it->second) {
        std::cout << "range includes " << i << ": "
                  << "[" << it->first << ", " << it->second << "]\n";
    }
}
(or something like this...)
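Wrapped into a function, the idea could be sketched as follows. ranges_containing and range_set are hypothetical names, and max_range_size is assumed to be an upper bound on b - a for every stored range [a, b]. Unlike a single lookup, this also handles overlapping ranges.

```cpp
#include <set>
#include <utility>
#include <vector>

using range_set = std::set<std::pair<int, int>>;

// Collects every stored range [a, b] with a <= i <= b.
// Only ranges starting in [i - max_range_size, i] can contain i, so the scan is bounded.
std::vector<std::pair<int, int>> ranges_containing(const range_set& s, int i, int max_range_size) {
    std::vector<std::pair<int, int>> hits;
    auto it = s.lower_bound(std::make_pair(i - max_range_size, i - max_range_size));
    const auto upper = s.lower_bound(std::make_pair(i + 1, i + 1));
    for (; it != upper; ++it) {
        if (it->first <= i && i <= it->second) {
            hits.push_back(*it);
        }
    }
    return hits;
}
```

The explicit it->first <= i check is a defensive addition of this sketch; it only matters if the set holds malformed pairs whose second value is smaller than the first.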
