How to chain C++ `transform` and `inner_product` calls? - c++

I would like to do something like:
vector<int> v;
v.push_back(0);
v.push_back(1);
v
.transform([](auto i) { return i + 2; })
.transform([](auto i) { return i * 3; })
.inner_product(0);
In other words, just implicitly use begin() and end() for the first and last iterators and chain the results.
Is there anything (eg some library) that would allow this?

Just write your own class to augment std::vector.
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
#include <utility>
template < typename T >
class augmented_vector
{
std::vector < T > m_vec;
public:
augmented_vector() : m_vec() {}
explicit augmented_vector(std::vector<T> const& in) : m_vec(in) {}
void push_back ( T&& value )
{
m_vec.push_back(std::forward<T>(value));
}
template < typename F >
augmented_vector < T > transform(F const& f)
{
std::vector < T > new_vec(m_vec.size());
std::transform( m_vec.begin(), m_vec.end(), new_vec.begin(), f);
return augmented_vector < T > ( new_vec );
}
T inner_product(T value)
{
return std::inner_product( m_vec.begin(), m_vec.end(), m_vec.begin(), value);
}
};
int main()
{
augmented_vector<int> v;
v.push_back(0);
v.push_back(1);
auto val = v
.transform([](auto i) { return i + 2; })
.transform([](auto i) { return i * 3; })
.inner_product(0);
std::cout << val << '\n';
}
Or use the D programming language. It has universal function call syntax.
import std.algorithm : fold, map;
import std.stdio : writeln;
void main()
{
auto v = [0, 1];
auto x = v
.map!(a => a+2)
.map!(a => a*3)
.fold!((a,b) => a + b^^2)(0);
writeln(x);
}

Wrap the functions, providing the boilerplate in the wrapper functions:
template<typename Container, typename Transform>
void transform_container(Container & container, Transform transform) {
std::transform(std::begin(container), std::end(container),
std::begin(container), /* requires output iterator */
transform);
}
template<typename T, typename Container>
auto inner_product_self(Container&& container, T initial) {
return std::inner_product(std::begin(container), std::end(container),
std::begin(container),
initial);
}
Your code then becomes:
int main() {
std::vector<int> v(2);
std::itoa(std::begin(v), std::end(v), 0);
transform_container(v, [](auto i) { return i + 2; });
transform_container(v, [](auto i) { return i * 3; });
auto result = inner_product_self(container, 0);
std::cout << "result: " << result;
}
(Live on ideone)
You're not chained to object oriented programming!

Related

C++: function that works with container and container of pointers as well

I think I'm facing something that I imagine is a quite common problem here.
I'd like to write a function that would be able to accept both a container (let's say std::vector) of objects, and a container of pointers to those objects.
What would be the proper way to do so?
Right now, I'm thinking
int sum(std::vector<int *> v)
{
int s = 0;
for (int * i : v) s += *i;
return s;
}
int sum(std::vector<int> v)
{
std::vector<int *> vp;
for (size_t i = 0; i < v.size(); ++i)
vp[i] = &v[i];
return sum(vp);
}
But it doesn't seem quite right, does it?
Consider the standard algorithm library where the problem you see has a solution.
Most algorithms have some default behavior but often allow you to customize that behavior via functor parameters.
For your specific case the algorithm of choice is std::accumulate.
Because this algorithm already exists I can restrict to a rather simplified illustration here:
#include <iostream>
#include <functional>
template <typename T,typename R,typename F = std::plus<>>
R sum(const std::vector<T>& v,R init,F f = std::plus<>{})
{
for (auto& e : v) init = f(init,e);
return init;
}
int main() {
std::vector<int> x{1,2,3,4};
std::vector<int*> y;
for (auto& e : x ) y.push_back(&e);
std::cout << sum(x,0) << "\n";
std::cout << sum(y,0,[](auto a, auto b) {return a + *b;});
}
std::plus is a functor that adds two values. Because the return type may differ from the vectors element type an additional template parameter R is used. Similar to std::accumulate this is deduced from the initial value passed as parameter. When adding int the default std::plus<> is fine. When adding integers pointed to by pointers, the functor can add the accumulator with the dereferenced vector element. As already mentioned this is just a simple toy example. In the above link you can find a possible implementation of std::accumulate (which uses iterators rather than the container directly).
With C++20 (or another ranges library), you can easily add or remove pointerness
template <std::ranges::range R, typename T>
concept range_of = requires std::same<std::ranges::range_value_t<R>, T>;
template <range_of<int *> IntPointers>
int sum_pointers(IntPointers int_pointers)
{
int result = 0;
for (int * p : int_pointers) result += *p;
return result;
}
void call_adding_pointer()
{
std::vector<int> v;
sum_pointers(v | std::ranges::views::transform([](int & i){ return &i; });
}
Or
template <range_of<int> Ints>
int sum(Ints ints)
{
int result = 0;
for (int i : ints) result += i;
return result;
}
void call_removing_pointer()
{
std::vector<int *> v;
sum(v | std::ranges::views::transform([](int * p){ return *p; });
}
You can make a function template, which behaves differently for pointer and non-pointer:
#include <iostream>
#include <vector>
using namespace std;
template <class T>
auto sum(const std::vector<T> &vec)
{
if constexpr (std::is_pointer_v<T>)
{
typename std::remove_pointer<T>::type sum = 0;
for (const auto & value : vec) sum += *value;
return sum;
}
if constexpr (!std::is_pointer_v<T>)
{
T sum = 0;
for (const auto & value : vec) sum += value;
return sum;
}
}
int main(){
std::vector<int> a{3, 4, 5, 8, 10};
std::vector<int*> b{&a[0], &a[1], &a[2], &a[3], &a[4]};
cout << sum(a) << endl;
cout << sum(b) << endl;
}
https://godbolt.org/z/sch3KovaK
You can move almost everything out of the if constexpr to reduce code duplication:
template <class T>
auto sum(const std::vector<T> &vec)
{
typename std::remove_pointer<T>::type sum = 0;
for (const auto & value : vec)
{
if constexpr (std::is_pointer_v<T>)
sum += *value;
if constexpr (!std::is_pointer_v<T>)
sum += value;
}
return sum;
}
https://godbolt.org/z/rvqK89sEK
Based on #mch solution:
template<typename T>
std::array<double, 3> center(const std::vector<T> & particles)
{
if (particles.empty())
return {0, 0, 0};
std::array<double, 3> cumsum = {0, 0, 0};
if constexpr (std::is_pointer_v<T>)
{
for (const auto p : particles)
{
cumsum[0] += p->getX();
cumsum[1] += p->getY();
cumsum[2] += p->getZ();
}
}
if constexpr (not std::is_pointer_v<T>)
{
for (const auto p : particles)
{
cumsum[0] += p.getX();
cumsum[1] += p.getY();
cumsum[2] += p.getZ();
}
}
double f = 1.0 / particles.size();
cumsum[0] *= f;
cumsum[1] *= f;
cumsum[2] *= f;
return cumsum;
}
Much cleaner and more efficient solution using std::invoke:
std::array<double, 3> centroid(const std::vector<T> & particles)
{
if (particles.empty())
return {0, 0, 0};
std::array<double, 3> cumsum{0.0, 0.0, 0.0};
for (auto && p : particles)
{
cumsum[0] += std::invoke(&topology::Particle::getX, p);
cumsum[1] += std::invoke(&topology::Particle::getY, p);
cumsum[2] += std::invoke(&topology::Particle::getZ, p);
}
double f = 1.0 / particles.size();
cumsum[0] *= f;
cumsum[1] *= f;
cumsum[2] *= f;
return cumsum;
}

C++11 for each loop with more than one variable

I would like to translate the following traditional for loop into a C++11 for-each loop without extra looping constructs:
int a[] = { 5, 6, 7, 8, 9, 10 };
int b[] = { 50, 60, 70, 80, 90, 100 };
// Swap a and b array elements
for (int i = 0; i < sizeof(a)/sizeof(a[0]); i++)
{
a[i] ^= b[i]; b[i] ^= a[i]; a[i] ^= b[i];
}
Does there exist any way by which it is possible to provide more than one variable in the C++11 for-each loop like:
for (int i, int j : ...)
There is no built-in way to do this. If you can use Boost, boost::combine will work for iterating two (or more) ranges simultaneously (Does boost offer make_zip_range?, How can I iterate over two vectors simultaneously using BOOST_FOREACH?):
for (boost::tuple<int&, int&> ij : boost::combine(a, b)) {
int& i = boost::get<0>(ij);
int& j = boost::get<1>(ij);
// ...
}
Unfortunately accessing the elements within the tuple elements of the zipped range is highly verbose. C++17 will make this much more readable using structured binding:
for (auto [&i, &j] : boost::combine(a, b)) {
// ...
}
Since you don't need to break out of the loop or return from the enclosing function, you could use boost::range::for_each with the body of your loop as a lambda:
boost::range::for_each(a, b, [](int& i, int& j)
{
// ...
});
zip or combine ranges are common in many range libraries.
Writing one strong enough for a for(:) loop isn't hard however.
First we write a basic range type:
template<class It>
struct range_t {
It b,e;
It begin() const{ return b; }
It end() const{ return e; }
range_t without_front( std::size_t count = 1 ) const {
return {std::next(begin()), end()};
}
bool empty() const { return begin()==end(); }
};
template<class It>
range_t<It> range( It b, It e ) { return {b,e}; }
template<class C>
auto range( C& c ) {
using std::begin; using std::end;
return range( begin(c), end(c) );
};
Then we write an iterator that works with ranges (easier than with iterators):
template<class R1, class R2>
struct double_foreach_iterator {
R1 r1;
R2 r2;
void operator++() { r1 = r1.without_front(); r2 = r2.without_front(); }
bool is_end() const { return r1.empty() || r2.empty(); }
auto operator*()const {
return std::tie( *r1.begin(), *r2.begin() );
}
using self=double_foreach_iterator;
auto cur() const {
return std::make_tuple( r1.begin(), r2.begin() );
}
friend bool operator==( self const& lhs, self const& rhs ) {
if (lhs.is_end() || rhs.is_end())
return lhs.is_end() == rhs.is_end();
return lhs.cur() == rhs.cur();
}
friend bool operator!=( self const& lhs, self const& rhs ) {
return !(lhs==rhs);
}
};
now we double iterate:
template<class A, class B>
auto zip_iterate(
A& a, B& b
) {
auto r1 = range(a);
auto r2 = range(b);
auto r1end = range(r1.end(), r1.end());
auto r2end = range(r2.end(), r2.end());
using it = double_foreach_iterator<decltype(r1), decltype(r2)>;
return range( it{r1, r2}, it{r1end, r2end} );
}
which gives us:
for (auto tup : zip_iterate(a, b)) {
int& i = std::get<0>(tup);
int& j = std::get<1>(tup);
// ...
}
or in C++17:
for (auto&& [i, j] : zip_iterate(a, b)) {
// ...
}
My zip iterate does not assume the two containers are of the same length, and will iterate to the length of the shorter one.
live example.
Just for fun.
The following isn't intended to be a serious answer to the question but just an exercise to try to understand the potentiality of C++11 (so, please, be patient).
The following is an example of a class (a draft of a class) that receive a couple of container (with size() method), with the same size (exception otherwise), and of a custom iterator that return a std::pair of std::reference_wrapper to n-position elements.
With a simple use example that show that it's possible to change the value in the starting containers.
Doesn't work with old C-style arrays but works with std::array. We're talking about C++11 so I suppose we could impose the use of std::array.
#include <array>
#include <vector>
#include <iostream>
#include <functional>
template <typename T1, typename T2>
class pairWrapper
{
public:
using V1 = typename std::remove_reference<decltype((T1().at(0)))>::type;
using V2 = typename std::remove_reference<decltype((T2().at(0)))>::type;
using RW1 = std::reference_wrapper<V1>;
using RW2 = std::reference_wrapper<V2>;
class it
{
public:
it (pairWrapper & pw0, std::size_t p0): pos{p0}, pw{pw0}
{ }
it & operator++ ()
{ ++pos; return *this; }
bool operator!= (const it & it0)
{ return pos != it0.pos; }
std::pair<RW1, RW2> & operator* ()
{
static std::pair<RW1, RW2>
p{std::ref(pw.t1[0]), std::ref(pw.t2[0])};
p.first = std::ref(pw.t1[pos]);
p.second = std::ref(pw.t2[pos]);
return p;
}
private:
std::size_t pos;
pairWrapper & pw;
};
it begin()
{ return it(*this, 0U); }
it end()
{ return it(*this, len); }
pairWrapper (T1 & t10, T2 & t20) : len{t10.size()}, t1{t10}, t2{t20}
{ if ( t20.size() != len ) throw std::logic_error("no same len"); }
private:
const std::size_t len;
T1 & t1;
T2 & t2;
};
template <typename T1, typename T2>
pairWrapper<T1, T2> makePairWrapper (T1 & t1, T2 & t2)
{ return pairWrapper<T1, T2>(t1, t2); }
int main()
{
std::vector<int> v1 { 1, 2, 3, 4 };
std::array<long, 4> v2 { { 11L, 22L, 33L, 44L } };
for ( auto & p : makePairWrapper(v1, v2) )
{
std::cout << '{' << p.first << ", " << p.second << '}' << std::endl;
p.first += 3;
p.second += 55;
}
for ( const auto & i : v1 )
std::cout << '[' << i << ']' << std::endl;
for ( const auto & l : v2 )
std::cout << '[' << l << ']' << std::endl;
return 0;
}
p.s.: sorry for my bad English

Sorting vector based on tuple value

I have below data structure,
typedef vector< tuple<int,int,int> > vector_tuple;
In vector i am storing tuple<value,count,position>
I want to sort my vector based on count, If count is same then based on position sort the vector.
structure ordering
{
bool ordering()(....)
{
return /// ?
}
};
int main()
{
std::vector<int> v1{1,1,1,6,6,5,4,4,5,5,5};
std::vector<int> v2(v1);
vector_tuple vt;
std::tuple<int,int,int> t1;
std::vector<int>::iterator iter;
int sizev=v1.size();
for(int i=0; i < sizev ; i++)
{
auto countnu = count(begin(v2),end(v2),v1[i]);
if(countnu > 0)
{
v2.erase(std::remove(begin(v2),end(v2),v1[i]),end(v2));
auto t = std::make_tuple(v1[i], countnu, i);
vt.push_back(t);
}
}
sort(begin(vt),end(vt),ordering(get<1>vt); // I need to sort based on count and if count is same, sort based on position.
for (int i=0; i < vt.size(); i++)
{
cout << get<0>(vt[i]) << " " ;
cout << get<1>(vt[i]) << " " ;
cout << get<2>(vt[i]) << " \n" ;
}
}
Your compare method should look like:
auto ordering = [](const std::tuple<int, int, int>& lhs,
const std::tuple<int, int, int>& rhs) {
return std::tie(std::get<1>(lhs), std::get<2>(lhs))
< std::tie(std::get<1>(rhs), std::get<2>(rhs));
};
std::sort(std::begin(vt), std::end(vt), ordering);
All credit to Jarod42 for the std::tie answer.
Now let's make it generic by creating a variadic template predicate:
template<std::size_t...Is>
struct tuple_parts_ascending
{
template<class...Ts>
static auto sort_order(const std::tuple<Ts...>& t)
{
return std::tie(std::get<Is>(t)...);
}
template<class...Ts>
bool operator()(const std::tuple<Ts...>& l,
const std::tuple<Ts...>& r) const
{
return sort_order(l) < sort_order(r);
}
};
which we can invoke thus:
sort(begin(vt),end(vt),tuple_parts_ascending<1,2>());
Full Code:
#include <vector>
#include <algorithm>
#include <tuple>
#include <iostream>
using namespace std;
typedef std::vector< std::tuple<int,int,int> > vector_tuple;
template<std::size_t...Is>
struct tuple_parts_ascending
{
template<class...Ts>
static auto sort_order(const std::tuple<Ts...>& t)
{
return std::tie(std::get<Is>(t)...);
}
template<class...Ts>
bool operator()(const std::tuple<Ts...>& l,
const std::tuple<Ts...>& r) const
{
return sort_order(l) < sort_order(r);
}
};
int main()
{
std::vector<int> v1{1,1,1,6,6,5,4,4,5,5,5};
std::vector<int> v2(v1);
vector_tuple vt;
std::tuple<int,int,int> t1;
std::vector<int>::iterator iter;
int sizev=v1.size();
for(int i=0; i < sizev ; i++)
{
auto countnu = count(begin(v2),end(v2),v1[i]);
if(countnu > 0)
{
v2.erase(std::remove(begin(v2),end(v2),v1[i]),end(v2));
auto t = std::make_tuple(v1[i], countnu, i);
vt.push_back(t);
}
}
sort(begin(vt),end(vt),tuple_parts_ascending<1,2>());
for (int i=0; i < vt.size(); i++)
{
cout << get<0>(vt[i]) << " " ;
cout << get<1>(vt[i]) << " " ;
cout << get<2>(vt[i]) << " \n" ;
}
}
Expected results:
6 2 3
4 2 6
1 3 0
5 4 5
Going further, we could make this operation a little more generic and 'library-worthy' by allowing the ordering predicate and the indices to be passed as parameters (this solution required c++14):
namespace detail {
template<class Pred, std::size_t...Is>
struct order_by_parts
{
constexpr
order_by_parts(Pred&& pred)
: _pred(std::move(pred))
{}
template<class...Ts>
constexpr
static auto sort_order(const std::tuple<Ts...>& t)
{
return std::tie(std::get<Is>(t)...);
}
template<class...Ts>
constexpr
bool operator()(const std::tuple<Ts...>& l,
const std::tuple<Ts...>& r) const
{
return _pred(sort_order(l), sort_order(r));
}
private:
Pred _pred;
};
}
template<class Pred, size_t...Is>
constexpr
auto order_by_parts(Pred&& pred, std::index_sequence<Is...>)
{
using pred_type = std::decay_t<Pred>;
using functor_type = detail::order_by_parts<pred_type, Is...>;
return functor_type(std::forward<Pred>(pred));
}
Now we can sort like so:
sort(begin(vt),end(vt),
order_by_parts(std::less<>(), // use predicate less<void>
std::index_sequence<1, 2>())); // pack indices into a sequence
Using lambdas:
std::sort(std::begin(vt),std::end(vt),[](const auto& l,const auto& r){
if(std::get<1>(l)== std::get<1>(r)){
return std::get<2>(l) < std::get<2>(r);
}
return std::get<1>(l) < std::get<1>(r);
});

std::iota is very limited

Coming from a Python world, I find the function std::iota very limited. Why is the interface restricted to not take any UnaryFunction ?
For instance I can convert
>>> x = range(0, 10)
into
std::vector<int> x(10);
std::iota(std::begin(x), std::end(x), 0);
But how would one do:
>>> x = range(0,20,2)
or even
>>> x = range(10,0,-1)
I know this is trivial to write one such function or use Boost, but I figured that C++ committee must have picked this design with care. So clearly I am missing something from C++11.
how about std::generate?
int n = -2;
std::generate(x.begin(), x.end(), [&n]{ return n+=2; });
int n = 10;
std::generate(x.begin(), x.end(), [&n]{ return n--;});
But how would one do:
x = range(0,20,2)
Alternatively to std::generate() (see other answer), you can provide your own unary function to std::iota(), it just have to be called operator++():
#include <iostream>
#include <functional>
#include <numeric>
#include <vector>
template<class T>
struct IotaWrapper
{
typedef T type;
typedef std::function<type(const type&)> IncrFunction;
type value;
IncrFunction incrFunction;
IotaWrapper() = delete;
IotaWrapper(const type& n, const IncrFunction& incrFunction) : value(n), incrFunction(incrFunction) {};
operator type() { return value; }
IotaWrapper& operator++() { value = incrFunction(value); return *this; }
};
int main()
{
IotaWrapper<int> n(0, [](const int& n){ return n+2; });
std::vector<int> v(10);
std::iota(v.begin(), v.end(), n);
for (auto i : v)
std::cout << i << ' ';
std::cout << std::endl;
}
Output: 0 2 4 6 8 10 12 14 16 18
Demo
Here is an idea of how one could implement Range():
struct Range
{
template<class Value, class Incr>
std::vector<Value> operator()(const Value& first, const Value& last, const Incr& increment)
{
IotaWrapper<Value> iota(first, [=](const int& n){ return n+increment; });
std::vector<Value> result((last - first) / increment);
std::iota(result.begin(), result.end(), iota);
return result;
}
};
Demo
With C++20 ranges, you can write it like this:
static auto stepped_iota(int start, int step) {
return std::ranges::views::iota(0) |
std::ranges::views::transform([=](int x) { return x * step + start; });
}
void f() {
for (int x : stepped_iota(0, 2)) { ... }
}
https://godbolt.org/z/3G49rs
Or, if you want the range to be finite:
static auto stepped_iota(int start, int end, int step) {
return std::ranges::views::iota(0, (end - start + step - 1) / step) |
std::ranges::views::transform([=](int x) { return x * step + start; });
}

C++ algorithm like python's 'groupby'

Are there any C++ transformations which are similar to itertools.groupby()?
Of course I could easily write my own, but I'd prefer to leverage the idiomatic behavior or compose one from the features provided by the STL or boost.
#include <cstdlib>
#include <map>
#include <algorithm>
#include <string>
#include <vector>
struct foo
{
int x;
std::string y;
float z;
};
bool lt_by_x(const foo &a, const foo &b)
{
return a.x < b.x;
}
void list_by_x(const std::vector<foo> &foos, std::map<int, std::vector<foo> > &foos_by_x)
{
/* ideas..? */
}
int main(int argc, const char *argv[])
{
std::vector<foo> foos;
std::map<int, std::vector<foo> > foos_by_x;
std::vector<foo> sorted_foos;
std::sort(foos.begin(), foos.end(), lt_by_x);
list_by_x(sorted_foos, foos_by_x);
return EXIT_SUCCESS;
}
This doesn't really answer your question, but for the fun of it, I implemented a group_by iterator. Maybe someone will find it useful:
#include <assert.h>
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>
using std::cout;
using std::cerr;
using std::multiset;
using std::ostringstream;
using std::pair;
using std::vector;
struct Foo
{
int x;
std::string y;
float z;
};
struct FooX {
typedef int value_type;
value_type operator()(const Foo &f) const { return f.x; }
};
template <typename Iterator,typename KeyFunc>
struct GroupBy {
typedef typename KeyFunc::value_type KeyValue;
struct Range {
Range(Iterator begin,Iterator end)
: iter_pair(begin,end)
{
}
Iterator begin() const { return iter_pair.first; }
Iterator end() const { return iter_pair.second; }
private:
pair<Iterator,Iterator> iter_pair;
};
struct Group {
KeyValue value;
Range range;
Group(KeyValue value,Range range)
: value(value), range(range)
{
}
};
struct GroupIterator {
typedef Group value_type;
GroupIterator(Iterator iter,Iterator end,KeyFunc key_func)
: range_begin(iter), range_end(iter), end(end), key_func(key_func)
{
advance_range_end();
}
bool operator==(const GroupIterator &that) const
{
return range_begin==that.range_begin;
}
bool operator!=(const GroupIterator &that) const
{
return !(*this==that);
}
GroupIterator operator++()
{
range_begin = range_end;
advance_range_end();
return *this;
}
value_type operator*() const
{
return value_type(key_func(*range_begin),Range(range_begin,range_end));
}
private:
void advance_range_end()
{
if (range_end!=end) {
typename KeyFunc::value_type value = key_func(*range_end++);
while (range_end!=end && key_func(*range_end)==value) {
++range_end;
}
}
}
Iterator range_begin;
Iterator range_end;
Iterator end;
KeyFunc key_func;
};
GroupBy(Iterator begin_iter,Iterator end_iter,KeyFunc key_func)
: begin_iter(begin_iter),
end_iter(end_iter),
key_func(key_func)
{
}
GroupIterator begin() { return GroupIterator(begin_iter,end_iter,key_func); }
GroupIterator end() { return GroupIterator(end_iter,end_iter,key_func); }
private:
Iterator begin_iter;
Iterator end_iter;
KeyFunc key_func;
};
template <typename Iterator,typename KeyFunc>
inline GroupBy<Iterator,KeyFunc>
group_by(
Iterator begin,
Iterator end,
const KeyFunc &key_func = KeyFunc()
)
{
return GroupBy<Iterator,KeyFunc>(begin,end,key_func);
}
static void test()
{
vector<Foo> foos;
foos.push_back({5,"bill",2.1});
foos.push_back({5,"rick",3.7});
foos.push_back({3,"tom",2.5});
foos.push_back({7,"joe",3.4});
foos.push_back({5,"bob",7.2});
ostringstream out;
for (auto group : group_by(foos.begin(),foos.end(),FooX())) {
out << group.value << ":";
for (auto elem : group.range) {
out << " " << elem.y;
}
out << "\n";
}
assert(out.str()==
"5: bill rick\n"
"3: tom\n"
"7: joe\n"
"5: bob\n"
);
}
int main(int argc,char **argv)
{
test();
return 0;
}
Eric Niebler's ranges library provides a group_by view.
according to the docs it is a header only library and can be included easily.
It's supposed to go into the standard C++ space, but can be used with a recent C++11 compiler.
minimal working example:
#include <map>
#include <vector>
#include <range/v3/all.hpp>
using namespace std;
using namespace ranges;
int main(int argc, char **argv) {
vector<int> l { 0,1,2,3,6,5,4,7,8,9 };
ranges::v3::sort(l);
auto x = l | view::group_by([](int x, int y) { return x / 5 == y / 5; });
map<int, vector<int>> res;
auto i = x.begin();
auto e = x.end();
for (;i != e; ++i) {
auto first = *((*i).begin());
res[first / 5] = to_vector(*i);
}
// res = { 0 : [0,1,2,3,4], 1: [5,6,7,8,9] }
}
(I compiled this with clang 3.9.0. and --std=c++11)
I recently discovered cppitertools.
It fulfills this need exactly as described.
https://github.com/ryanhaining/cppitertools#groupby
What is the point of bloating standard C++ library with an algorithm that is one line of code?
for (const auto & foo : foos) foos_by_x[foo.x].push_back(foo);
Also, take a look at std::multimap, it might be just what you need.
UPDATE:
The one-liner I have provided is not well-optimized for the case when your vector is already sorted. A number of map lookups can be reduced if we remember the iterator of previously inserted object, so it the "key" of the next object and do a lookup only when the key is changing. For example:
#include <map>
#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
struct foo {
int x;
std::string y;
float z;
};
class optimized_inserter {
public:
typedef std::map<int, std::vector<foo> > map_type;
optimized_inserter(map_type & map) : map(&map), it(map.end()) {}
void operator()(const foo & obj) {
typedef map_type::value_type value_type;
if (it != map->end() && last_x == obj.x) {
it->second.push_back(obj);
return;
}
last_x = obj.x;
it = map->insert(value_type(obj.x, std::vector<foo>({ obj }))).first;
}
private:
map_type *map;
map_type::iterator it;
int last_x;
};
int main()
{
std::vector<foo> foos;
std::map<int, std::vector<foo>> foos_by_x;
foos.push_back({ 1, "one", 1.0 });
foos.push_back({ 3, "third", 2.5 });
foos.push_back({ 1, "one.. but third", 1.5 });
foos.push_back({ 2, "second", 1.8 });
foos.push_back({ 1, "one.. but second", 1.5 });
std::sort(foos.begin(), foos.end(), [](const foo & lhs, const foo & rhs) {
return lhs.x < rhs.x;
});
std::for_each(foos.begin(), foos.end(), optimized_inserter(foos_by_x));
for (const auto & p : foos_by_x) {
std::cout << "--- " << p.first << "---\n";
for (auto & f : p.second) {
std::cout << '\t' << f.x << " '" << f.y << "' / " << f.z << '\n';
}
}
}
How about this?
template <typename StructType, typename FieldSelectorUnaryFn>
auto GroupBy(const std::vector<StructType>& instances, const FieldSelectorUnaryFn& fieldChooser)
{
StructType _;
using FieldType = decltype(fieldChooser(_));
std::map<FieldType, std::vector<StructType>> instancesByField;
for (auto& instance : instances)
{
instancesByField[fieldChooser(instance)].push_back(instance);
}
return instancesByField;
}
and use it like this:
auto itemsByX = GroupBy(items, [](const auto& item){ return item.x; });
I wrote a C++ library to address this problem in an elegant way. Given your struct
struct foo
{
int x;
std::string y;
float z;
};
To group by y you simply do:
std::vector<foo> dataframe;
...
auto groups = group_by(dataframe, &foo::y);
You can also group by more than one variable:
auto groups = group_by(dataframe, &foo::y, &foo::x);
And then iterate through the groups normally:
for(auto& [key, group]: groups)
{
// do something
}
It also has other operations such as: subset, concat, and others.
I would simply use boolinq.h, which includes all of LINQ. No documentation, but very simple to use.