subset a vector and sort it - c++

I'm looking into using some C++ for simple parts of my R package using the Rcpp package. I'm a C++ novice (but keen to learn!). I've implemented a few simple cpp programs using the excellent Rcpp - in fact that package has motivated me to learn C++...
Anyway, I've got stuck with a simple problem, which if I can fix would help lots. I have a NumericVector I want to subset and then sort. The code below sorts the whole vector (and would also deal with NAs, which is what I need).
My question is, say I want to extract a part of this vector, sort and have it available for other processing - how can I do that? For example, for a vector of length 10, how do I extract and sort the elements 5:10?
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
RcppExport SEXP rollP(SEXP x) {
NumericVector A(x); // the data
A = sort_unique(A);
return A;
}
which I call from R:
sourceCpp( "rollP.cpp")
rollP(10:1)
# [1] 1 2 3 4 5 6 7 8 9 10

Here are 3 variants:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector rollP(NumericVector A, int start, int end) {
NumericVector B(end-start+1) ;
std::copy( A.begin() + start-1, A.begin() + end, B.begin() ) ;
return B.sort() ;
}
// [[Rcpp::export]]
NumericVector rollP2(NumericVector A, int start, int end) {
NumericVector B( A.begin() + start-1, A.begin() + end ) ;
return B.sort() ;
}
// [[Rcpp::export]]
NumericVector rollP3(NumericVector A, int start, int end) {
NumericVector B = A[seq(start-1, end-1)] ;
return B.sort() ;
}
start and end are meant as 1-based indices, as if you were passing A[start:end] from R.

You need to look into C++ indexing, iterators and the whole bit. At a minimum, you need to change your interface (vector, fromInd, toInd) and figure out what you want to return.
One interpretation of your question would be to copy the subset from [fromInd, toInd) into a new vector, sort it and return it. All that is standard C++ fare, and a good text like the excellent (and free!!) C++ Annotations will be of help. It has a pretty strong STL section too.

You can use std::slice on a std::valarray. But if you want to use std::vector specifically then you can use std::copy to extract a portion of the vector and then use std::sort to sort the extracted slice of the vector.
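The std::copy plus std::sort route mentioned above could look roughly like the sketch below. The helper name sorted_slice and the 0-based half-open range [first, last) are assumptions made for illustration, not part of the original answer. Note that std::sort with the default comparison is not well defined if the data contains NaN, so missing values would need separate handling.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns a sorted copy of v[first, last),
// using 0-based, half-open indices. The input vector is left untouched.
std::vector<double> sorted_slice(const std::vector<double>& v,
                                 std::size_t first, std::size_t last) {
    if (first > last || last > v.size()) {
        throw std::out_of_range("sorted_slice: invalid range");
    }
    const auto b = v.begin() + static_cast<std::ptrdiff_t>(first);
    const auto e = v.begin() + static_cast<std::ptrdiff_t>(last);
    std::vector<double> out(last - first);
    std::copy(b, e, out.begin());       // extract the slice
    std::sort(out.begin(), out.end());  // sort only the copy
    return out;
}
```

With R-style 1-based indices, the elements 5:10 of a length-10 vector correspond to sorted_slice(v, 4, 10).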

You can do this quite easily by using the std::sort implementation that receives two iterators:
#include <algorithm>
#include <cstddef>
#include <vector>
template <typename SeqContainer>
SeqContainer slicesort(SeqContainer const& sq, std::size_t begin, std::size_t end) {
    // Check the indices before doing any iterator arithmetic:
    // forming an iterator beyond end() is undefined behaviour.
    if (begin <= end && end <= sq.size()) {
        SeqContainer copy(std::begin(sq) + begin, std::begin(sq) + end);
        std::sort(copy.begin(), copy.end());
        return copy;
    }
    return SeqContainer();
}
Which can be invoked like
std::vector<int> v = {3,1,7,3,6,-2,-8,-7,-1,-4,2,3,9};
std::vector<int> v2 = slicesort(v,5,10);

Related

use std::accumulate to add an array to only a vector slice [closed]

I have following code
std::vector<float> d;
d.resize(800);
std::array<float, 8> adder;
int ind_slice = 5; // we want to add the array adder to v[40],v[41] ... v[47]
const auto it_begin = d.begin() + ind_slice *8;
const auto it_end = d.begin() + ind_slice *8 + ind_slice;
int index = 0;
std::accumulate(it_begin, it_end) [&] ( float* ind) { return ind = ind + (adder[index++])};
I am wondering if this is a safe way to do the accumulation, since I am capturing a reference from the outside and mutating it, so the function has side effects. Is there a better way to use accumulate to achieve my objective?
At least if I understand your intent correctly, the algorithm to use here would almost certainly be std::transform, not std::accumulate.
accumulate is intended for taking some collection, and simply adding them up, roughly equivalent to sum() in a spreadsheet (for one example).
transform allows you (among other things) to combine two collections, about the way you seem to want to.
#include <vector>
#include <array>
#include <algorithm>
#include <iterator>
#include <iostream>
int main() {
std::vector<float> d;
d.resize(800);
std::array<float, 8> adder { 1, 2, 3, 4, 5, 6, 7, 8};
size_t start = 40;
const auto it_begin = d.begin() + start;
const auto it_end = d.begin() + start + adder.size();
// do the addition:
std::transform(it_begin, it_end, adder.begin(), it_begin, [](float a, float b) { return a + b; });
// show the modified part of the array.
std::copy(it_begin, it_end, std::ostream_iterator<float>(std::cout, "\n"));
}
I've taken the liberty of simplifying a bit of the other code as well, but not in ways that are likely to matter much here.
Since you only need an iterator to the beginning of the second collection, you can simplify the code a bit further if you want, by using adder as the first collection, and the slice of d as the second:
#include <vector>
#include <array>
#include <algorithm>
#include <iterator>
#include <iostream>
int main() {
std::vector<float> d;
d.resize(800);
std::array<float, 8> adder { 1, 2, 3, 4, 5, 6, 7, 8};
size_t start = 40;
const auto it_begin = d.begin() + start;
std::transform(adder.begin(), adder.end(), it_begin, it_begin, [](float a, float b) { return a + b; });
std::copy(it_begin, it_begin+adder.size(), std::ostream_iterator<float>(std::cout, "\n"));
}
As it stands right now, we still use an iterator to the end of the affected portion of d when we print things out, but that was added just to make it clear that we'd actually done something, not to fulfill any real requirement.
Since C++11 the binary operator passed to std::accumulate is allowed to have side effects. The restrictions since C++11 are (from cppreference):
op must not invalidate any iterators, including the end iterators, nor modify any elements of the range involved, nor *last.
In general, standard algorithms are permitted to copy functors passed to them, but this isn't an issue with your operator either.
The problem is that the functor passed to std::accumulate must be a binary operator that specifies how each element is accumulated to the resulting value. Yours is not a binary operator and it will simply not compile. It is not clear why you want to use std::accumulate when you do not want to accumulate something. A simple loop will do:
for (size_t i=0;i<8;++i) d[i+40] += adder[i];
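For contrast, here is a minimal sketch of what a binary operator for std::accumulate looks like: it receives the running result and the next element, and returns the new running result. The helper name sum_of is made up for this illustration.

```cpp
#include <array>
#include <numeric>

// Hypothetical helper: reduces the eight values of adder to a single sum.
// The lambda is the binary operator: (accumulated value so far, next element).
float sum_of(const std::array<float, 8>& adder) {
    return std::accumulate(adder.begin(), adder.end(), 0.0f,
                           [](float acc, float x) { return acc + x; });
}
```

This produces one value from a range, which is why it does not fit the element-wise update of d that the question asks for.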

Order a vector according to weights stored in another vector using STL algorithms

I have a vector containing instances of a class, let's say std::vector<A> a. I want to order this vector according to weights stored in a std::vector<float> weights, with weights[i] being the weight associated to a[i]; after sorting, a elements must be ordered by increasing weight.
I know how to do this explicitly, but I'd like to use C++14 STL algorithms in order to benefit from an eventual optimal implementation. Up to now, I haven't been able to figure how to use weights in a lambda comparison expression for std::sort, nor how to keep a and weights aligned every time two elements of a are swapped by std::sort, so I'm beginning to think that it might be not possible.
Thanks in advance for any help.
Sort an index vector, then rearrange according to the result:
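#include <algorithm>
#include <numeric>
#include <type_traits>
#include <vector>
void my_sort(std::vector<A>& a, std::vector<float>& weights)
{
    // idx[k] ends up holding the original position of the k-th lightest element.
    std::vector<int> idx(a.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(),
              [&](int i, int j) { return weights[i] < weights[j]; });
    // Build a reordered copy of a vector by gathering its elements through idx.
    auto reorder = [&](const auto& o) {
        // decltype(o) alone is a const reference, so strip it with std::decay_t.
        std::decay_t<decltype(o)> n(o.size());
        std::transform(idx.begin(), idx.end(), n.begin(),
                       [&](int i) { return o[i]; });
        return n;
    };
    a = reorder(a);
    weights = reorder(weights);
}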
1. Transform the two vectors into a std::pair<A,float> vector and then sort based on the weight (second member of the pair). Recreate the two vectors afterwards.
2. Add a new member to the A class so that it contains the weight, and sort based on that weight.
3. Make a custom comparison function based on a global array containing the weights, like described here: std::sort and custom swap function
I would go for 3 as it is the most efficient. That is valid if you don't have multi-threading which would require some synchronization.
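The first option (zip into pairs, sort, unzip) could be sketched as follows. The function name sort_by_weight is hypothetical, the code assumes both vectors have the same length, and std::stable_sort is used here so that elements with equal weights keep their relative order, which is a choice of this sketch rather than something the answer requires.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: sorts a and weights together by increasing weight.
// Assumes a.size() == weights.size().
template <typename T>
void sort_by_weight(std::vector<T>& a, std::vector<float>& weights) {
    // Zip: move each element next to its weight.
    std::vector<std::pair<T, float>> zipped;
    zipped.reserve(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        zipped.emplace_back(std::move(a[i]), weights[i]);
    }
    // Sort on the weight (the second member of each pair).
    std::stable_sort(zipped.begin(), zipped.end(),
                     [](const auto& x, const auto& y) { return x.second < y.second; });
    // Unzip: write the sorted elements and weights back.
    for (std::size_t i = 0; i < zipped.size(); ++i) {
        a[i] = std::move(zipped[i].first);
        weights[i] = zipped[i].second;
    }
}
```

The cost is one temporary vector of pairs; in exchange the two input vectors stay aligned without any global state.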
With my comment I was alluding exactly to what @AndreaRossini summarised with their comment. Something like this:
#include <boost/hana/functional/on.hpp>
#include <functional>
#include <iostream>
#include <range/v3/algorithm/sort.hpp>
#include <range/v3/view/transform.hpp>
#include <range/v3/view/zip.hpp>
#include <string>
#include <vector>
using boost::hana::on;
using namespace ranges;
using namespace ranges::views;
// helpers to get first and second of a pair
auto /*C++17->*/constexpr/*<-C++17*/ fst = [](auto const& p){ return p.first; };
auto /*C++17->*/constexpr/*<-C++17*/ snd = [](auto const& p){ return p.second; };
int main(){
std::vector<std::string> v{"one", "two", "three"}; // values
std::vector<float> w{3,1,2}; // weights
// zipping the two sequences; each element of vw is a pair
auto vw = zip(v, w);
// sorting: using `std::less` on the `snd` element of the pairs
sort(vw, std::less<>{} ^on^ snd);
// extracting only the `fst` of each pair
auto res = vw | transform(fst);
// show result
for (auto i : res) { std::cout << i << std::endl; }
}
A few things about the libraries that I've used:
res is not a std::vector but just a view; if you want a vector, you can do
#include <range/v3/range/conversion.hpp>
auto res = vw | transform(fst) | to_vector;
std::less<>{} ^on^ snd is equivalent to the following f
auto f = [](auto const& x, auto const& y){
return std::less<>{}(snd(x), snd(y));
};
so you can think of it as a function that takes x and y and gives back snd(x) < snd(y).

c++ computing a new vector which has deltas from an existing vector

I'm working on learning C++ STL algorithms. I need help trying to find a function to create a vector of deltas from values in an existing vector. In other words:
delta0 = abs(original1 - original0)
delta1 = abs(original2 - original1)
and so on.
I'm looking for something concise, like R's "diff" function mentioned here:
computing a new vector which has deltas from an existing vector
I found the transform function but it seemed to operate on a single element at a time. It didn't seem to allow parameters of iterator in the function supplied to transform so I was limited to the current element only. I'm trying to learn STL algorithms so I don't really need any libraries that may have "diff" implemented. I would just like to see a way to use STL functions to solve this if there is a concise way I'm not aware of.
Here is an example with the section in question commented:
#include <iostream>
#include <vector>
using namespace std;
int main() {
vector<int> v = { 1, 2, 3, 4, 5 };
vector<int> delta;
//---------------------------------------
// way to do this with STL algorithms?
for (auto i = v.begin()+1; i != v.end(); i++) {
delta.push_back(abs(*i - *(i - 1)));
}
//---------------------------------------
for (int i : delta) {
cout << i << " ";
}
return 0;
}
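// delta starts out empty in the question, so append through std::back_inserter
// (needs <algorithm>, <iterator> and <cstdlib>).
std::transform(std::next(v.begin()), v.end(),
v.begin(), std::back_inserter(delta),
[](int a, int b){ return std::abs(a - b); });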
You simply missed the second version of transform, the overload that takes two input ranges and a binary operation.
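Wrapped into a self-contained function, the two-range transform approach might look like the sketch below. The name abs_deltas is an assumption for illustration; the guard for inputs shorter than two elements is added because std::next(v.begin()) would otherwise step past the end of an empty vector.

```cpp
#include <algorithm>
#include <cstdlib>
#include <iterator>
#include <vector>

// Hypothetical helper: returns |v[i+1] - v[i]| for each adjacent pair.
std::vector<int> abs_deltas(const std::vector<int>& v) {
    std::vector<int> delta;
    if (v.size() < 2) {
        return delta;  // no adjacent pairs to compare
    }
    delta.reserve(v.size() - 1);
    // First range: elements 1..n-1; second range: elements 0..n-2.
    std::transform(std::next(v.begin()), v.end(), v.begin(),
                   std::back_inserter(delta),
                   [](int next, int prev) { return std::abs(next - prev); });
    return delta;
}
```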

boost::numeric::ublas::vector<double> and double[]

I'm using boost for matrix and vector operations in a code and one of the libraries I am using (CGNS) has an array as an argument. How do I copy the vector into double[] in a boost 'way', or better yet, can I pass the data without creating a copy?
I'm a bit new to c++ and am just getting going with boost. Is there a guide I should read with this info?
Contents between any two input iterators can be copied to an output iterator using the copy algorithm. Since both ublas::vector and arrays have iterator interfaces, we could use:
#include <boost/numeric/ublas/vector.hpp>
#include <algorithm>
#include <cstdio>
int main () {
boost::numeric::ublas::vector<double> v (3);
v(0) = 2;
v(1) = 4.5;
v(2) = 3.15;
double p[3];
std::copy(v.begin(), v.end(), p); // <--
printf("%g %g %g\n", p[0], p[1], p[2]);
return 0;
}
Depends on the types involved. For std::vector you just make sure that it's non-empty and then you can pass &v[0]. Most likely the same holds for the Boost types you're using.
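For the std::vector case, passing the data without a copy could be sketched like this. The function sum_array is a made-up stand-in for a C-style library call that takes a raw array; it is not a CGNS function. Whether the Boost type offers the same contiguous access depends on its storage type and should be checked against the uBLAS documentation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a library function that expects a raw array.
double sum_array(const double* data, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// std::vector stores its elements contiguously, so data() can be handed
// to an array-taking function directly, with no copy.
double sum_vector(const std::vector<double>& v) {
    return sum_array(v.data(), v.size());
}
```

Since C++11, v.data() is also valid on an empty vector, which avoids the non-empty check needed for &v[0].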

sum of square of each elements in the vector using for_each

As the function accepted by for_each takes only one parameter (the element of the vector), I have to define a static int sum = 0 somewhere so that it can be accessed after calling for_each. I think this is awkward. Is there a better way to do this (still using for_each)?
#include <algorithm>
#include <vector>
#include <iostream>
using namespace std;
static int sum = 0;
void add_f(int i )
{
sum += i * i;
}
void test_using_for_each()
{
int arr[] = {1,2,3,4};
vector<int> a (arr ,arr + sizeof(arr)/sizeof(arr[0]));
for_each( a.begin(),a.end(), add_f);
cout << "sum of the square of the element is " << sum << endl;
}
In Ruby, We can do it this way:
sum = 0
[1,2,3,4].each { |i| sum += i*i} #local variable can be used in the callback function
puts sum #=> 30
Would you please show more examples of how for_each is typically used in practical programming (not just printing out each element)? Is it possible to use for_each to simulate a 'programming pattern' like map and inject in Ruby (or map/fold in Haskell)?
#map in ruby
>> [1,2,3,4].map {|i| i*i}
=> [1, 4, 9, 16]
#inject in ruby
[1, 4, 9, 16].inject(0) {|aac ,i| aac +=i} #=> 30
EDIT: Thank you all. I have learned so much from your replies. We have so many ways to do the same single thing in C++ , which makes it a little bit difficult to learn. But it's interesting :)
No, don't use std::accumulate() use std::inner_product(). No functor required.
#include <vector>
#include <numeric>
int main()
{
std::vector <int> v1;
v1.push_back(1);
v1.push_back(2);
v1.push_back(3);
v1.push_back(4);
int x = std::inner_product( v1.begin(), v1.end(), v1.begin(), 0 );
}
Use std::accumulate
#include <vector>
#include <numeric>
// functor for getting sum of previous result and square of current element
template<typename T>
struct square
{
T operator()(const T& Left, const T& Right) const
{
return (Left + Right*Right);
}
};
int main()
{
std::vector <int> v1;
v1.push_back(1);
v1.push_back(2);
v1.push_back(3);
v1.push_back(4);
int x = std::accumulate( v1.begin(), v1.end(), 0, square<int>() );
// 0 stands here for initial value to which each element is in turn combined with
// for our case must be 0.
}
You could emulate std::accumulate as in GMan's nice answer, but I believe that using std::accumulate will make your code more readable, because it was designed for such purposes.
for_each returns (a copy of) the functor that it was using. So, something like this:
#include <algorithm>
#include <vector>
#include <iostream>
template <typename T>
class square_accumulate
{
public:
square_accumulate(void) :
_sum(0)
{
}
const T& result(void) const
{
return _sum;
}
void operator()(const T& val)
{
_sum += val * val;
}
private:
T _sum;
};
int main(void)
{
int arr[] = {1,2,3,4};
std::vector<int> a (arr ,arr + sizeof(arr)/sizeof(arr[0]));
int sum = std::for_each(a.begin(), a.end(), square_accumulate<int>()).result();
std::cout << "sum of the square of the element is " << sum << std::endl;
}
As demonstrated by other answers, though, std::accumulate is the best way to go.
Don't use for_each() for this, use accumulate() from the <numeric> header:
#include <numeric>
#include <iostream>
using namespace std;
struct accum_sum_of_squares {
// x contains the sum-of-squares so far, y is the next value.
int operator()(int x, int y) const {
return x + y * y;
}
};
int main(int argc, char **argv) {
int a[] = { 4, 5, 6, 7 };
int ssq = accumulate(a, a + sizeof a / sizeof a[0], 0, accum_sum_of_squares());
cout << ssq << endl;
return 0;
}
The default behaviour of accumulate() is to sum elements, but you can provide your own function or functor as we do here, and the operation it performs need not be associative -- the 2nd argument is always the next element to be operated on. This operation is sometimes called reduce in other languages.
You could use a plain function instead of the accum_sum_of_squares functor, or for even more genericity, you could make accum_sum_of_squares a class template that accepts any numeric type.
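The class-template variant suggested above might be sketched as follows. The names sum_of_squares_op and sum_of_squares are hypothetical, and the sketch assumes T is a numeric type whose value-initialised form T() is zero.

```cpp
#include <numeric>
#include <vector>

// Hypothetical generic functor: acc is the sum of squares so far,
// next is the element being folded in.
template <typename T>
struct sum_of_squares_op {
    T operator()(const T& acc, const T& next) const {
        return acc + next * next;
    }
};

// Convenience wrapper so callers do not have to spell out the functor.
template <typename T>
T sum_of_squares(const std::vector<T>& v) {
    return std::accumulate(v.begin(), v.end(), T(), sum_of_squares_op<T>());
}
```

Passing T() as the initial value matters: the type of that argument decides the type accumulate uses for the running result.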
As a general solution to such issue with STL: instead of passing a function, you can pass a functor -- for example, an instance of any class implementing operator(). This is much better than relying on global variables, since said instance can keep and update its own state! You could think of it as a kind of "compile time duck typing": generic programming does not constrain you to pass a "function" in that place, anything that "behaves like a function" (i.e., has a proper operator()) will do as well!-)
std::for_each is for doing something with each element. If you want get a result from a calculation on all the elements, there's std::accumulate. If you are wanting Haskell's map behaviour, use std::transform.
You can abuse either of these three to do the same thing as any of the others, since ultimately they are just iterating over an iterator (except for transform's form that takes two iterators as input.) The point is that for_each is not a replacement for map/fold - that should be done by transform/accumulate - although C++ doesn't natively have something that expresses the map/fold concept as well as Haskell does - but both gcc and VC++ support OpenMP which has a much better analogue in #pragma omp parallel for.
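As a rough illustration of the transform/accumulate pairing, the Ruby map and inject examples from the question could be written like this. The helper names map_square and fold_sum are invented for the sketch.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// "map": build a new vector holding the square of each element.
std::vector<int> map_square(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int i) { return i * i; });
    return out;
}

// "fold" / "inject": reduce the vector to a single sum, starting from 0.
int fold_sum(const std::vector<int>& in) {
    return std::accumulate(in.begin(), in.end(), 0);
}
```

Calling fold_sum(map_square(v)) then mirrors the Ruby chain of map followed by inject.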
Inject in Ruby is a much closer match to calling for_each with a full-fledged functor, like GMan explained above. Lambda functions with variable capture in C++0X will make the behaviour between the two languages even more similar:
#include <algorithm>
#include <iostream>
#include <vector>
int main(void)
{
int arr[] = {1,2,3,4};
std::vector<int> a (arr ,arr + sizeof(arr)/sizeof(arr[0]));
int sum = 0;
std::for_each(a.begin(), a.end(), [&](int i) { sum += i*i;} );
std::cout << "sum of the square of the element is " << sum << std::endl;
}