range v3 flattening a sequence - c++

So I recently watched this talk on c++:
https://www.youtube.com/watch?v=mFUXNMfaciE
And I was very interested in trying it out. After writing a few toy programs, I am stuck on how to properly flatten a vector of vectors into a vector. According to the documentation here: https://ericniebler.github.io/range-v3/ this is possible using ranges::view::for_each. However, I just can't seem to get it to work. Here is some minimal code:
#include <range/v3/all.hpp>
#include <iostream>
#include <vector>
int main()
{
auto nums = std::vector<std::vector<int>>{
{0, 1, 2, 3},
{5, 6, 7, 8},
{10, 20},
{30},
{55}
};
auto filtered = nums
| ranges::view::for_each([](std::vector<int> num) { return ranges::yield_from(num); })
| ranges::view::remove_if([](int i) { return i % 2 == 1; })
| ranges::view::transform([](int i) { return std::to_string(i); });
for (const auto i : filtered)
{
std::cout << i << std::endl;
}
}

range-v3 error messages tend to be pretty horrible, so much so that this one is actually better than most:
prog.cc: In lambda function:
prog.cc:16:90: error: no match for call to '(const ranges::v3::yield_from_fn) (std::vector<int>&)'
| ranges::view::for_each([](std::vector<int> num) { return ranges::yield_from(num); })
^
In file included from /opt/wandbox/range-v3/include/range/v3/view.hpp:38:0,
from /opt/wandbox/range-v3/include/range/v3/all.hpp:21,
from prog.cc:1:
/opt/wandbox/range-v3/include/range/v3/view/for_each.hpp:133:17: note: candidate: template<class Rng, int _concept_requires_132, typename std::enable_if<((_concept_requires_132 == 43) || ranges::v3::concepts::models<ranges::v3::concepts::View, T>()), int>::type <anonymous> > Rng ranges::v3::yield_from_fn::operator()(Rng) const
Rng operator()(Rng rng) const
^~~~~~~~
To someone with a bit of knowledge of range-v3's concepts emulation layer, this "clearly" states that the call to yield_from failed because the type of the argument you passed to it - std::vector<int> - does not satisfy the View concept.
The View concept characterizes a subset of ranges that do not own their elements, and therefore have all operations - move/copy construction/assignment, begin, end, and default construction - computable in O(1). The range composition algebra in range-v3 works only on views to avoid having to deal with element lifetimes and to provide predictable performance.
yield_from rejects the std::vectors you are trying to pass since they are not views, but you could easily provide views by (1) taking the vectors by lvalue reference instead of by value in for_each, and (2) yielding view::all of those lvalues:
auto filtered = nums
| ranges::view::for_each([](std::vector<int>& num) {
return ranges::yield_from(ranges::view::all(num)); })
| ranges::view::remove_if([](int i) { return i % 2 == 1; })
| ranges::view::transform([](int i) { return std::to_string(i); });
But in this simple case, flattening a range of ranges of elements into a range of elements already has a purpose-specific view in range-v3: view::join. You may as well use that:
auto filtered = nums
| ranges::view::join
| ranges::view::remove_if([](int i) { return i % 2 == 1; })
| ranges::view::transform([](int i) { return std::to_string(i); });

Related

Is it okay to let a range view own a shared_ptr containing state it needs?

Basically I am trying to get around the fact that the following does not work with range-v3 ranges: (this is a toy example but illustrates the problem.)
namespace rv = ranges::views;
auto bad_foo() {
std::vector<int> local_vector = { 23, 45, 182, 3, 5, 16, 1 };
return local_vector |
rv::transform([](int n) {return 2 * n; });
}
int main() {
for (int n :bad_foo()) {
std::cout << n << " ";
}
std::cout << "\n";
return 0;
}
The above doesn't work because we are returning a view that references data that will go out of scope when bad_foo returns.
What I want to do is store the vector in a shared pointer in such a way that the range view will keep the shared pointer alive. Note that the obvious version of this idea does not work, i.e.
auto still_bad_foo() {
auto shared_vector = std::make_shared<std::vector<int>>(
std::initializer_list{ 23, 45, 182, 3, 5, 16, 1 }
);
return *shared_vector |
rv::transform([](int n) {return 2 * n; });
}
This still fails because shared_vector is not actually captured by the range view. What we want to do is force the range view to own the shared pointer.
The following seems to work for me.
auto good_foo() {
auto shared_vector = std::make_shared<std::vector<int>>(
std::initializer_list{ 23, 45, 182, 3, 5, 16, 1 }
);
return rv::single(shared_vector) |
rv::transform([](auto ptr) { return rv::all(*ptr); }) |
rv::join |
rv::transform([](int n) {return 2 * n; });
}
We use single to turn the shared pointer into a single item range view, turn the single item into a range of ints by dereferencing the pointer in transform and wrapping the vector in a range with all, yielding a composition of range views, which we then flatten via join.
My question is: is the above really safe, and if so, is there a less verbose version of the same idea?
Edit: updated the question to just be about range-v3 ranges, as @康桓瑋 has alerted me that this issue is not a problem with C++20 ranges as long as the owning range views are not required to be copyable.
Basically I am trying to get around the fact that the following does
not work: (this is a toy example but illustrates the problem. I'm
using range-v3 but this question applies to standard ranges too)
If you are using the standard <ranges>, then you don't have to worry about this: just move the vector into the pipe, which transfers ownership of the vector into an owning_view.
auto not_bad_foo() {
std::vector<int> local_vector = { 23, 45, 182, 3, 5, 16, 1 };
return std::move(local_vector) | // <- here
std::views::transform([](int n) {return 2 * n; });
}
How about capturing the vector in a lambda to extend its lifetime?
auto create_foo() {
std::vector<int> local_vector = { 23, 45, 182, 3, 5, 16, 1 };
return [captured_vector = std::move(local_vector)]() {
return captured_vector | rv::transform([](int n) {return 2 * n; });
};
}
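int main() {
// Keep the closure in a named variable so the captured vector outlives the loop;
// before C++23, a temporary closure would be destroyed before the loop body runs.
auto foo = create_foo();
for (int n : foo()) {
std::cout << n << " ";
}
std::cout << "\n";
}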

Is one-line initialization of a <set> possible with C++20 <ranges>?

I would like to know if with <ranges> in C++20 it is finally possible to select from a sequence and initialize a set in one line, as is possible in C# with IEnumerable. This would probably require converting a <ranges> object to a std::initializer_list.
C#:
int[] sequence = new int[] { 0,1,2,3,4 };
HashSet<int> set = new HashSet<int>(sequence.Where((int i) => i % 2 == 0));
C++:
std::vector<int> sequence { 0,1,2,3,4 };
auto matcher = sequence | std::ranges::views::filter([](int i) { return !(i % 2); });
std::set<int> myset(matcher.begin(), matcher.end());
I want to do something like:
std::vector<int> sequence { 0,1,2,3,4 };
std::set<int> myset { sequence | std::ranges::views::filter([](int i) { return !(i % 2); }) };
It is not possible in C++20, but C++23 adds std::ranges::to, which provides this functionality.

Can I use C++20 ranges to break when matched count is greater than some threshold?

Consider the following pre-ranges code:
std::vector<int> v(1000*1000);
bool count_gt_5_v1(int val){
return std::count(v.begin(), v.end(), val)>5;
}
It looks nicer than the raw loop, but it can be very inefficient if val is very common in v.
Is there any way to use C++20 ranges so that iteration stops after I encounter val 6 times?
In other words, I am looking for a way to introduce a break when my condition is satisfied.
I have this abomination that seems to work, but it is much, much uglier than a raw for loop.
bool count_gt_5_v2(int val){
int cnt=0;
auto span = std::ranges::views::take_while(v,[&cnt, &val]
(const auto elem)
{
cnt+=elem==val;
return cnt<6;
});
std::ranges::distance(span);
return cnt==6;
}
Link to full code: https://godbolt.org/z/86djdK
You could do this (assuming rv is an alias for std::views and ranges for std::ranges):
auto matches = v | rv::filter([=](int i){ return i == val; })
| rv::take(6);
return ranges::distance(matches) == 6;
Or, better:
auto matches = v | rv::filter([=](int i){ return i == val; });
return not ranges::empty(matches | rv::drop(5));
This attempt:
std::ranges::views::take_while(v, [&cnt, &val](const auto elem){
cnt+=elem==val;
return cnt<6;
});
doesn't meet the requirements of take_while. All of the predicates in ranges have to be equality-preserving - same inputs, same output. Here, that's not the case - if we call the predicate twice on a single element, we'd get different output. So that's undefined behavior.

Order a list depending on other list

Given:
struct Object {
int id;
...
};
list<Object> objectList;
list<int> idList;
What is the best way to order objectList according to the order of idList?
Example (pseudo code):
INPUT
objectList = {o1, o2, o3};
idList = {2, 3, 1};
ACTION
sort(objectList, idList);
OUTPUT
objectList = {o2, o3, o1};
I searched the documentation, but I only found methods that order elements by comparing them with each other.
You can store the objects in a std::map, with id as the key. Then traverse idList and look up each object in the map by its id.
std::map<int, Object> objectMap;
for (auto itr = objectList.begin(); itr != objectList.end(); itr++)
{
objectMap.insert(std::make_pair(itr->id, *itr));
}
std::list<Object> newObjectList;
for (auto itr = idList.begin(); itr != idList.end(); itr++)
{
// caution: if idList contains an id that does not appear in objectList,
// operator[] inserts and returns a default-constructed Object
newObjectList.push_back(objectMap[*itr]);
}
// now newObjectList is ordered according to idList
Here is another variant, which works in O(n log n). This is asymptotically optimal.
#include <list>
#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>
int main() {
struct O {
int id;
};
std::list<O> object_list{{1}, {2}, {3}, {4}};
std::list<int> index_list{4, 2, 3, 1};
assert(object_list.size() == index_list.size());
// this vector is optional. It is needed if sizeof(O) is quite large.
std::vector<std::pair<int, O*>> tmp_vector(object_list.size());
// this is O(n)
std::transform(begin(object_list), end(object_list), begin(tmp_vector),
[](auto& o) { return std::make_pair(o.id, &o); });
// this is O(n log n)
std::sort(begin(tmp_vector), end(tmp_vector),
[](const auto& o1, const auto& o2) {
return o1.first < o2.first;
});
// at this point, tmp_vector holds pairs in increasing index order.
// Note that this may not be a contiguous list.
std::list<O> tmp_list(object_list.size());
// this is again O(n log n), because each lower_bound call is O(log n)
// we then insert the objects into a new list (you may also use some
// move semantics here).
std::transform(begin(index_list), end(index_list), begin(tmp_list),
[&tmp_vector](const auto& i) {
return *std::lower_bound(begin(tmp_vector), end(tmp_vector),
std::make_pair(i, nullptr),
[](const auto& o1, const auto& o2) {
return o1.first < o2.first;
})->second;
});
// As we just created a new list, we swap the new list with the old one.
std::swap(object_list, tmp_list);
for (const auto& o : object_list)
std::cout << o.id << std::endl;
}
I assumed that O is quite large and not easily movable. Therefore I first create tmp_vector, which contains only pairs. Then I sort this vector.
Afterwards I can simply go through the index_list and find the matching indices using binary search.
Let me elaborate on why a map is not the best solution even though it gives you a quite small piece of code. If you use a map, you need to rebalance the tree after each insertion. This doesn't cost anything asymptotically (because n rebalancings cost the same as sorting once), but the constant is way larger. A "constant map" doesn't make much sense (except that accessing it may be easier).
I then timed the "simple" map approach against my "not-so-simple" vector approach. I created a randomly ordered index_list with N entries. This is what I get (in microseconds):
N          map       vector
1000       90        75
10000      1400      940
100000     24500     15000
1000000    660000    250000
NOTE: This test shows the worst case for the map: only index_list was randomly ordered, while object_list (which is inserted into the map in order) was sorted, so rebalancing shows its full effect. If object_list is more or less random, the two approaches will perform more similarly, even though the map will always be slower. The vector approach will behave even better when the object list is completely random.
So even with 1000 entries the difference is already quite large, and I would strongly vote for a vector-based approach.
Assuming the data is handed to you externally and you don't have a choice of containers:
assert( objectList.size() == idList.size() );
std::vector<std::pair<int,Object>> wrapper( idList.size() );
auto idList_it = std::begin( idList );
auto objectList_it = std::begin( objectList );
for( auto& e: wrapper )
e = std::make_pair( *idList_it++, *objectList_it++ );
std::sort(
std::begin(wrapper),
std::end(wrapper),
[]
(const std::pair<int,Object>& a, const std::pair<int,Object>& b) -> bool
{ return a.first<b.first; }
);
Then, copy back to original container.
{
auto objectList_it = std::begin( objectList );
for( const auto& e: wrapper )
*objectList_it++ = e.second; // copy back only the Object, not the whole pair
}
But this solution is not optimal; I'm sure somebody will come up with a better one.
Edit: The default comparison operator for pairs requires that it is defined for both the first and second members. Thus the easiest way is to provide a lambda.
Edit2: This doesn't build if you use a std::list for the wrapper, because std::sort requires random-access iterators. It is fine with a std::vector.
std::list has a sort member function you can use with a custom comparison functor.
That custom functor has to look up an object's id in the idList and can then use std::distance to calculate the position of the element in idList. It does so for both objects to be compared and returns true if the first position is smaller than the second.
Here is an example:
#include <iostream>
#include <list>
#include <algorithm>
#include <stdexcept>
struct Object
{
int id;
};
int main()
{
Object o1 = { 1 };
Object o2 = { 2 };
Object o3 = { 3 };
std::list<Object> objectList = { o1, o2, o3 };
std::list<int> const idList = { 2, 3, 1 };
objectList.sort([&](Object const& first, Object const& second)
{
auto const id_find_iter1 = std::find(begin(idList), end(idList), first.id);
auto const id_find_iter2 = std::find(begin(idList), end(idList), second.id);
if (id_find_iter1 == end(idList) || id_find_iter2 == end(idList))
{
throw std::runtime_error("ID not found");
}
auto const pos1 = std::distance(begin(idList), id_find_iter1);
auto const pos2 = std::distance(begin(idList), id_find_iter2);
return pos1 < pos2;
});
for (auto const& object : objectList)
{
std::cout << object.id << '\n';
}
}
It's probably not terribly efficient, but chances are you will never notice. If it still bothers you, you might want to look for a solution with std::vector, which unlike std::list provides random-access iterators. That turns std::distance from O(n) to O(1).
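If the repeated linear searches do become a bottleneck, one option is to precompute each id's position once and let the comparator do constant-time lookups. The following is a minimal sketch under that assumption. sort_by_id_order is a hypothetical name, and like the example above it throws if an object's id is missing from idList (here via std::unordered_map::at).

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>

struct Object
{
    int id;
};

// Hypothetical helper: sorts objectList so that its ids follow the order of idList.
void sort_by_id_order(std::list<Object>& objectList, std::list<int> const& idList)
{
    // Map each id to its position in idList (one O(n) pass).
    std::unordered_map<int, std::size_t> position;
    std::size_t index = 0;
    for (int id : idList)
    {
        position.emplace(id, index++); // keeps the first position if an id repeats
    }

    // std::list::sort performs O(n log n) comparisons; each lookup is O(1) on average.
    // at() throws std::out_of_range for an id that is not in idList.
    objectList.sort([&position](Object const& first, Object const& second)
    {
        return position.at(first.id) < position.at(second.id);
    });
}
```

The trade-off is one extra hash map of n entries in exchange for avoiding two linear scans of idList per comparison.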
I would find it strange to end up in this situation, as I would use pointers instead of ids. Still, there might be use cases for this.
Note that in all examples below, I assume that the id list contains every id exactly once.
Writing it yourself
The problem you want to solve is creating/sorting a list of objects based on the order of the ids in another list.
The naive way of doing this, is simply writing it yourself:
void sortByIdVector(std::list<Object> &list, const std::list<int> &ids)
{
auto oldList = std::move(list);
list = std::list<Object>{};
for (auto id : ids)
{
auto itElement = std::find_if(oldList.begin(), oldList.end(), [id](const Object &obj) { return id == obj.id; });
list.emplace_back(std::move(*itElement));
oldList.erase(itElement);
}
}
If you use a sorted vector as input, you can optimize this code to get the best performance out of it. I'm leaving it up to you to do so.
Using sort
For this implementation, I'm going to assume the ids are in a std::vector instead of a std::list, as this is the better container for requesting the index of an element. (With some more code, you can do the same for a list.)
size_t getIntendedIndex(const std::vector<int> &ids, const Object &obj)
{
auto itElement = std::find_if(ids.begin(), ids.end(), [obj](int id) { return id == obj.id; });
return itElement - ids.begin();
}
void sortByIdVector(std::list<Object> &list, const std::vector<int> &ids)
{
list.sort([&ids](const Object &lhs, const Object &rhs){ return getIntendedIndex(ids, lhs) < getIntendedIndex(ids, rhs); });
}
Insertion
Another approach, also more suitable for std::vector, is simply to place the elements at the right position, which should be more performant than the std::sort version.
void sortByIdVector(std::vector<Object> &list, const std::vector<int> &ids)
{
auto oldList = std::move(list);
list = std::vector<Object>{};
list.resize(oldList.size());
for (Object &obj : oldList)
{
auto &newLocation = list[getIntendedIndex(ids, obj)];
newLocation = std::move(obj);
}
}
objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
{ return std::find(++std::find(idList.begin(), idList.end(), o1.id),
idList.end(), o2.id)
!= idList.end();
});
The idea is to check if we find o1.id before o2.id in the idList.
We search o1.id, increment the found position then we search o2.id: if found, that implies o1 < o2.
Test
#include <iostream>
#include <string>
#include <list>
#include <algorithm>
struct Object {
int id;
std::string name;
};
int main()
{
std::list<Object> objectList {{1, "one_1"}, {2, "two_1"}, {3, "three_1"}, {2, "two_2"}, {1, "one_2"}, {4, "four_1"}, {3, "Three_2"}, {4, "four_2"}};
std::list<int> idList {3, 2, 4, 1};
objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
{ return std::find(++std::find(idList.begin(), idList.end(), o1.id), idList.end(), o2.id) != idList.end(); });
for(const auto& o: objectList) std::cout << o.id << " " << o.name << "\n";
}
/* OUTPUT:
3 three_1
3 Three_2
2 two_1
2 two_2
4 four_1
4 four_2
1 one_1
1 one_2
*/

A way to filter range by indices, to get min_element from filtered indices only?

In the comments to the question "Is there a way to iterate over at most N elements using range-based for loop?" there was an additional question: is it possible to have an "index view" on a container, i.e. a subrange with some indexes filtered out?
Additionally, I ran into the problem of finding the minimum value of a range with some indexes filtered out.
I.e. is it possible to replace code such as the following with std and/or boost algorithms and filters, to make it more readable and maintainable?
#include <boost/optional.hpp>
#include <cstddef>
template <typename Range, typename IndexPredicate>
auto findMin(const Range& range, IndexPredicate ipred)
-> boost::optional<typename Range::value_type>
{
bool found = false;
typename Range::value_type minValue{};
for (std::size_t i = 0; i < range.size(); ++i)
{
if (not ipred(i))
continue;
if (not found)
{
minValue = range[i];
found = true;
}
else if (minValue > range[i])
{
minValue = range[i];
}
}
if (found)
{
return minValue;
}
else
{
return boost::none;
}
}
It is meant to be used like this:
#include <iostream>
#include <vector>
int main() {
std::vector<float> ff = {1.2,-1.2,2.3,-2.3};
auto onlyEvenIndex = [](auto i){ return (i&1) == 0;};
auto minValue = findMin(ff, onlyEvenIndex);
std::cout << *minValue << std::endl;
}
Using range-v3, the library behind the recent Standard ranges proposal:
#include <range/v3/all.hpp>
#include <iostream>
#include <vector>
int main()
{
std::vector<float> rng = {1.2,-1.2,2.3,-2.3};
auto even_indices =
ranges::view::iota(0ul, rng.size()) |
ranges::view::filter([](auto i) { return !(i & 1); })
;
auto min_ind = *ranges::min_element(
even_indices, [&rng](auto L, auto R) {
return rng[L] < rng[R];
});
std::cout << rng[min_ind];
}
Note that the syntax is roughly similar to Boost.Range, but fully revamped to take advantage of C++14 (generalized lambdas, auto return type deduction etc.).
The solution is to think beyond the natural way of filtering ranges in C++: we need to filter the range of indexes, not the range of values. But where do we get the range of indexes from? There is a way to get one: boost::irange. So, see this:
#include <boost/range/irange.hpp>
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/algorithm/min_element.hpp>
#include <functional>
template <typename Range, typename IndexPredicate>
auto findMin(const Range& range, IndexPredicate ipred) -> boost::optional<typename Range::value_type>
{
using boost::adaptors::filtered;
using boost::irange;
auto filtered_indexes = irange<std::size_t>(0, range.size()) | filtered(std::function<bool(std::size_t)>{ipred});
One drawback of using boost ranges is that they have problems with raw lambdas, which is why I need to use std::function.
The next step is as easy as using boost::min_element. The only thing to remember is that you should compare values, not just indexes:
auto lessByValue = [&range] (auto lhs, auto rhs)
{
return range[lhs] < range[rhs];
};
And the final steps:
auto foundMin = boost::min_element(filtered_indexes, lessByValue);
if (foundMin != std::end(filtered_indexes))
return range[*foundMin];
else
return boost::none;
}
Start with this answer to the earlier question. Optionally make the augmentations described there.
Augment indexT to support a stride template argument size_t stride=1: Replace ++t; with std::advance(t,stride);
Add ItB base() const { return b+**a(); } to indexing_iterator (this is for later).
Add template<size_t stride> using index_stride_range=range<indexT<size_t, stride>>; This is an indexing range with a compile time stride.
Write intersect that works on two index_stride_ranges. The output is an index_stride_range of stride gcd(lhs_stride, rhs_stride). Working out where it starts is another bit of work, but only high-school math. Note that an index_range is a type of index_stride_range with stride=1, so this upgrade will work with index_ranges as well.
Upgrade index_filter to take an index_stride_range as the first argument. The implementation is the same (other than relying on upgraded intersect).
Write every_nth_index<N>(offset), which is an index_stride_range that goes from offset to size_t(-(1+(abs(offset))%stride) - (size_t(-(1+(abs(offset)%stride)))%stride) or some such (basically 0 to infinity in the simple case -- the extra math is to find the biggest number that is equal to offset+k*stride that fits in a size_t).
Now we get:
auto filtered = index_filter( every_nth_index<2>(), container );
auto it = std::min_element( filtered.begin(), filtered.end() );
and we have an iterator. it.base() will return the iterator into the container that contains the element if it!=filtered.end(); (not it.base()!=container.end(), which is different).