C++ Using std::transform on a vector of structures

I have a C++ application with a struct which looks like:
struct test_struc {
std::string name;
int x;
int y;
std::vector<int> data;
};
I would like to combine two std::vector<test_struc> with std::transform, using the data member of each structure as the operand. So far, I have the code:
std::vector<test_struc> first = <from somewhere else>;
std::vector<test_struc> second = <from somewhere else>;
std::vector<int> result;
std::transform (second.begin(), second.end(), first.begin(), result.begin(), op_xor);
where op_xor is:
int op_xor (int &i,
int &j) {
return i^j;
}
This works well if first and second are vectors of int, but since they are not, I don't know how to tell the code to use the data element of test_struc as arguments to std::transform.
Am I barking up the wrong tree? Or is there a way to do this?

Note that at least with a modern compiler, you probably want to implement the operation as a lambda expression, so it would look something like:
std::transform(second.begin(), second.end(),
first.begin(),
std::back_inserter(result),
[](test_struc const &a, test_struc const &b) {
return a.y ^ b.y;
});
Minor aside: as written, your code had undefined behavior (UB), because it wrote through result.begin() while result was empty. You could either use back_inserter as above, or construct result with second.size() elements up front.
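For reference, here is a minimal self-contained sketch of the back_inserter variant. The function name xor_y and the choice of the y member are illustrative assumptions, not something the question specifies. The assert documents the requirement that first has at least as many elements as second.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

struct test_struc {
    std::string name;
    int x;
    int y;
    std::vector<int> data;
};

// Hypothetical helper: XORs the y members of corresponding elements.
std::vector<int> xor_y(const std::vector<test_struc>& first,
                       const std::vector<test_struc>& second) {
    // std::transform also reads second.size() elements from first.
    assert(first.size() >= second.size());
    std::vector<int> result;
    result.reserve(second.size());
    std::transform(second.begin(), second.end(), first.begin(),
                   std::back_inserter(result),
                   [](const test_struc& a, const test_struc& b) {
                       return a.y ^ b.y;
                   });
    return result;
}
```

Because the elements are appended through back_inserter, result never has to be sized by hand, and reserve only avoids reallocations.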

Your binary functor must take two test_struc objects:
int op_xor (const test_struc& i,
            const test_struc& j)
{
    return 42; // replace with your desired logic
}
It isn't clear what exactly you want the functor to do, but it should operate on two test_struc objects and return an int.
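If the goal is to XOR the data members element by element (an assumption; the question does not spell this out), the functor can return a std::vector<int> instead, and the result becomes a vector of vectors. A sketch, with the hypothetical names xor_data and xor_all and the arbitrary choice of truncating to the shorter of the two data vectors:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

struct test_struc {
    std::string name;
    int x;
    int y;
    std::vector<int> data;
};

// Hypothetical functor: element-wise XOR of the two data members,
// truncated to the length of the shorter one (an arbitrary choice).
std::vector<int> xor_data(const test_struc& a, const test_struc& b) {
    const std::size_t n = std::min(a.data.size(), b.data.size());
    std::vector<int> out;
    out.reserve(n);
    std::transform(a.data.begin(), a.data.begin() + n, b.data.begin(),
                   std::back_inserter(out),
                   [](int i, int j) { return i ^ j; });
    return out;
}

// Applies xor_data to corresponding elements of the two vectors.
std::vector<std::vector<int>> xor_all(const std::vector<test_struc>& first,
                                      const std::vector<test_struc>& second) {
    assert(first.size() >= second.size());
    std::vector<std::vector<int>> result;
    result.reserve(second.size());
    std::transform(second.begin(), second.end(), first.begin(),
                   std::back_inserter(result), xor_data);
    return result;
}
```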

Or use lambdas.
Remember that the other ranges must be at least as large as "second":
result.resize(second.size());
std::transform(second.begin(), second.end(), first.begin(), result.begin(),
    [](const test_struc& one, const test_struc& two){ return one.x ^ two.x; });
That said, Jerry's example with back_inserter is better.

Also you can do it using boost::transform and boost phoenix lambdas:
#include <boost/range/algorithm.hpp>
#include <boost/phoenix.hpp>
using boost::phoenix::arg_names::arg1;
using boost::phoenix::arg_names::arg2;
boost::transform(second, first, std::back_inserter(result), (&arg1)->*&test_struc::x ^ (&arg2)->*&test_struc::x);
First and second arguments of boost::transform are ranges, so you do not have to write second.begin(), second.end(), first.begin() here.

Related

Filling vector with emplace_back vs. std::transform

It's oversimplified code with a simple vector and class.
class OutputClass
{
public:
OutputClass(int x, int y);
};
std::vector<OutputClass> Convert(std::vector<int> const &input)
{
std::vector<OutputClass> res;
res.reserve(input.size());
//either (1)
for (auto const &in : input)
res.emplace_back(in, in*in);
return res;
//or something like (2)
std::transform(input.begin(),
input.end(),
std::back_inserter(res),
[](int in){return OutputClass(in, in*in);});
return res;
}
Is there a difference in performance between those two options? Static analyzers often have a rule for replacing all raw loops with algorithms, but in this case it seems to me that looping with emplace_back would be more efficient, as it needs neither a copy nor a move. Or am I wrong, and they are equal in terms of performance, with (2) preferable in terms of good style and readability?

To find out whether one is significantly faster than the other in a particular use case, you can measure.
I see no benefit in enforcing the creation of vectors. Avoiding that dynamic allocation when it isn't needed can be quite good for performance. Here is an example where vectors are used, but that's not necessary:
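#include <ranges>
#include <vector>
// OutputClass is the class declared in the question.
OutputClass
convert(int in)
{
    return {in, in*in};
}
auto
convert_range(const auto& input)
{
    // Lazy view: no vector is created here (requires C++20).
    return std::ranges::transform_view(input, convert);
}
int main()
{
    std::vector<int> input {1, 2, 3};
    auto r = convert_range(input);
    // A vector is only materialized where one is actually needed.
    std::vector<OutputClass> output(r.begin(), r.end());
}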
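To make such a measurement possible, the two variants from the question can be written as separate functions. This is a sketch only: OutputClass is given public members here (an assumption, since the question only declares the constructor), and the names convert_loop and convert_transform are made up for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Assumed definition: the question only declares the constructor.
class OutputClass {
public:
    OutputClass(int x, int y) : x(x), y(y) {}
    int x;
    int y;
};

// Variant (1): raw loop, constructing each element in place.
std::vector<OutputClass> convert_loop(const std::vector<int>& input) {
    std::vector<OutputClass> res;
    res.reserve(input.size());
    for (int in : input)
        res.emplace_back(in, in * in);
    return res;
}

// Variant (2): std::transform; back_inserter calls push_back with the
// temporary returned by the lambda, which is moved into the vector.
std::vector<OutputClass> convert_transform(const std::vector<int>& input) {
    std::vector<OutputClass> res;
    res.reserve(input.size());
    std::transform(input.begin(), input.end(), std::back_inserter(res),
                   [](int in) { return OutputClass(in, in * in); });
    return res;
}
```

Both functions produce the same elements. Which one is faster is something to measure with your own compiler and data rather than assume.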

how to sum up a vector of vector int in C++ without loops

I am trying to sum all the elements of a vector<vector<int>> without writing a loop.
I have checked some related questions, such as How to sum up elements of a C++ vector?.
I tried to use std::accumulate, but I find it hard to write the binary operator it needs for the nested case.
How can I implement this with std::accumulate, or is there a better way?
Thanks in advance.

You need to use std::accumulate twice, once for the outer vector with a binary operator that knows how to sum the inner vector using an additional call to std::accumulate:
int sum = std::accumulate(
vec.begin(), vec.end(), // iterators for the outer vector
0, // initial value for summation - 0
[](int init, const std::vector<int>& intvec){ // binaryOp that sums a single vector<int>
return std::accumulate(
intvec.begin(), intvec.end(), // iterators for the inner vector
init); // current sum
// use the default binaryOp here
}
);

In this case, I do not suggest using std::accumulate, as it would greatly impair readability. Moreover, this function uses loops internally, so you would not save anything. Just compare the following loop-based solution with the other answers that use std::accumulate:
int result = 0 ;
for (auto const & subvector : your_vector)
for (int element : subvector)
result += element;
Does using a combination of iterators, STL functions, and lambda functions make your code easier to understand and faster? For me, the answer is clear. Loops are not evil, especially for such a simple application.

According to https://en.cppreference.com/w/cpp/algorithm/accumulate , the binary operation receives the current sum as its left-hand argument and the next range element as its right-hand argument. So you should run std::accumulate on the right-hand argument, add the result to the left-hand argument, and return the sum. If you use C++14 or later,
auto binary_op = [&](auto cur_sum, const auto& el){
auto rhs_sum = std::accumulate(el.begin(), el.end(), 0);
return cur_sum + rhs_sum;
};
I didn't try to compile the code though :). If I messed up the order of the arguments, just swap them.
Edit: wrong terminology - you don't overload BinaryOp, you just pass it.
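Put together as a complete function, the idea looks roughly like this. The sketch spells out the parameter types instead of using auto so that it also compiles as C++11; the name sum_nested is an illustrative assumption.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sums every int in a vector of vectors.
int sum_nested(const std::vector<std::vector<int>>& vec) {
    // The left argument is the running total, the right argument is
    // the next inner vector taken from the outer range.
    auto binary_op = [](int cur_sum, const std::vector<int>& el) {
        int rhs_sum = std::accumulate(el.begin(), el.end(), 0);
        return cur_sum + rhs_sum;
    };
    return std::accumulate(vec.begin(), vec.end(), 0, binary_op);
}
```

For example, sum_nested({{1, 2}, {3, 4, 5}}) returns 15.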

Signature of std::accumulate is:
T accumulate( InputIt first, InputIt last, T init,
BinaryOperation op );
Note that the return type is the type of the init parameter (it is not necessarily the value_type of InputIt).
The binary operation is:
Ret binary_op(const Type1 &a, const Type2 &b);
where... (from cppreference)...
The type Type1 must be such that an object of type T can be implicitly converted to Type1. The type Type2 must be such that an object of type InputIt can be dereferenced and then implicitly converted to Type2. The type Ret must be such that an object of type T can be assigned a value of type Ret.
However, when T is the value_type of InputIt, the above is simpler and you have:
using value_type = std::iterator_traits<InputIt>::value_type;
T binary_op(T,value_type&).
Your final result is supposed to be an int, hence T is int. You need two calls to std::accumulate, one for the outer vector (where value_type == std::vector<int>) and one for the inner vectors (where value_type == int):
#include <iostream>
#include <numeric>
#include <iterator>
#include <vector>
template <typename IT, typename T>
T accumulate2d(IT outer_begin, IT outer_end,const T& init){
using value_type = typename std::iterator_traits<IT>::value_type;
return std::accumulate( outer_begin,outer_end,init,
[](T accu,const value_type& inner){
return std::accumulate( inner.begin(),inner.end(),accu);
});
}
int main() {
std::vector<std::vector<int>> x{ {1,2} , {1,2,3} };
std::cout << accumulate2d(x.begin(),x.end(),0);
}

Solutions based on nesting std::accumulate may be difficult to understand.
By using a 1D array of intermediate sums, the solution can be more straightforward (but possibly less efficient).
#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
using namespace std;
int main()
{
    // create a unary operator for 'std::transform'
    auto accumulate = []( vector<int> const & v ) -> int
    {
        return std::accumulate(v.begin(),v.end(),int{});
    };
    vector<vector<int>> data = {{1,2,3},{4,5},{6,7,8,9}}; // 2D array
    vector<int> temp; // 1D array of intermediate sums
    transform( data.begin(), data.end(), back_inserter(temp), accumulate );
    int result = accumulate(temp);
    cerr<<"result="<<result<<"\n";
}
The call to transform accumulates each of the inner arrays to initialize the 1D temp array.

To avoid loops, you'll have to specifically add each element:
std::vector<int> database = {1, 2, 3, 4};
int sum = 0;
int index = 0;
// Start the accumulation
sum += database[index++];
sum += database[index++];
sum += database[index++];
sum += database[index++];
There is no guarantee that std::accumulate is implemented without loops. If you need to avoid loops, then don't use it.
IMHO, there is nothing wrong with using loops: for, while or do-while. Processors that have specialized instructions for summing arrays use loops. Loops are a convenient method for conserving code space. However, there may be times when a loop should be unrolled (for performance reasons). You can have a loop with expanded or unrolled content in it.

With range-v3 (or, since C++20, with the standard <ranges> equivalent std::views::join), you might do
const std::vector<std::vector<int>> v{{1, 2}, {3, 4, 5, 6}};
auto flat = v | ranges::views::join;
std::cout << std::accumulate(begin(flat), end(flat), 0);

Trying to provide operator that divides integer-type std::vector by constant

I want to multiply and divide all the elements of a std::vector by a constant in the same way as C++ does for ordinary types: the result should be an integer when the input vector has an integer type, and floating-point otherwise.
I found code for multiplication based on std::multiplies and adapted it for division with std::divides. The resulting code runs, but the division does not behave the way I want:
#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>
// std::vector multiplication by constant
// http://codereview.stackexchange.com/questions/77546
template <class T, class Q>
std::vector <T> operator*(const Q c, const std::vector<T> &A) {
std::vector <T> R(A.size());
std::transform(A.begin(), A.end(), R.begin(),
std::bind1st(std::multiplies<T>(),c));
return R;
}
// My modification for division. There should be integer division
template <class T, class Q>
std::vector <T> operator/(const std::vector<T> &A, const Q c) {
std::vector <T> R(A.size());
std::transform(A.begin(), A.end(), R.begin(),
std::bind1st(std::divides<T>(),c));
return R;
}
int main() {
std::vector<size_t> vec;
vec.push_back(100);
int d = 50;
std::vector<size_t> vec2 = d*vec;
std::vector<size_t> vec3 = vec/d;
std::cout<<vec[0]<<" "<<vec2[0]<<" "<<vec3[0]<<std::endl;
// The result is:
// 100 5000 0
size_t check = vec[0]/50;
std::cout<<check<<std::endl;
// Here the result is 2
// But
std::vector<double> vec_d;
vec_d.push_back(100.0);
vec_d = vec_d/50;
std::cout<<vec_d[0]<<std::endl;
// And here the result is 0.5
return 0;
}
How can I write my operator correctly? I thought that std::bind1st would divide each element by c, but somehow it does the opposite.
EDIT: I understand that I can write a loop, but I want to do a lot of divisions for big numbers, so I wanted it to be faster...

Using std::transform with C++11, I'd suggest making a lambda instead of using bind:
std::transform(A.begin(), A.end(), R.begin(), [c](T val) {
return val / c;
});
In my opinion, lambdas are almost always more readable than binding, especially when (like in your case) you're not binding all of the function's parameters.
Although if you're worried about performance, a raw for loop might be slightly faster, as there's no overhead of the function call and creating the lambda object.
According to Dietmar Kühl:
std::transform() may do a bit of "magic" and actually perform better than a loop. For example, the implementation may choose to vectorize the loop when it notices that it is used on a contiguous sequence of integers. It is, however, rather unlikely to be slower than the loop.
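The unexpected results in the question come from std::bind1st, which fixes the first argument of the functor: std::divides<T>() is then called as c / element rather than element / c. For multiplication the order does not matter, which is why operator* looked correct. std::bind2nd would bind the divisor instead, but both binders were deprecated in C++11 and removed in C++17, so a lambda is the more durable fix. A sketch of the full operator using the lambda shown above:

```cpp
#include <algorithm>
#include <vector>

// Divides every element of A by c. Each quotient is stored as T, so a
// vector of integers divided by an integer uses integer division.
template <class T, class Q>
std::vector<T> operator/(const std::vector<T>& A, const Q c) {
    std::vector<T> R(A.size());
    std::transform(A.begin(), A.end(), R.begin(),
                   [c](T val) { return val / c; });
    return R;
}
```

With this version, a vector holding the integer 100 divided by 50 yields 2, and a vector holding 100.0 divided by 50 yields 2.0.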

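auto c_inverse = 1.0 / c;
std::transform(A.begin(), A.end(), R.begin(), [c_inverse](T val) {
    return val * c_inverse;
});
Similar to the other post, but it should be mentioned that for floating-point elements you will most likely see performance gains by multiplying by the inverse rather than dividing. This does not apply to integer elements: 1 / c is 0 in integer arithmetic whenever c is an integer greater than 1, and going through floating point would not reproduce integer division exactly. Even for floating-point values, the result may differ from true division in the last bit.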

Why make it only for vectors? Here's a way to make it more generic, so that it works with many types of containers:
template <class container, class Q>
container operator/(const container& A, const Q c) {
container R;
std::transform(std::cbegin(A), std::cend(A), std::back_inserter(R),
[c](const auto& val) {return val / c; });
return R;
}
Sure, it is expected to be a bit slower than with pre-allocation for a vector, since the back_inserter will allocate dynamically as it grows, but well, sometimes it might be appropriate to trade speed for genericity.

Fast way to do lexicographical comparing 2 numbers

I'm trying to sort a vector of unsigned int in lexicographical order.
The std::lexicographical_compare function only supports iterators so I'm not sure how to compare two numbers.
This is the code I'm trying to use:
std::sort(myVector->begin(),myVector->end(), [](const unsigned int& x, const unsigned int& y){
std::vector<unsigned int> tmp1(x);
std::vector<unsigned int> tmp2(y);
return lexicographical_compare(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end());
} );

C++11 introduces std::to_string.
You can use to_string as below:
std::sort(myVector->begin(),myVector->end(), [](const unsigned int& x, const unsigned int& y){
    std::string tmp1 = std::to_string(x);
    std::string tmp2 = std::to_string(y);
    return std::lexicographical_compare(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end());
} );
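As a complete sketch, the comparator can be wrapped in a small function. std::string's operator< already compares lexicographically, so the explicit std::lexicographical_compare call can be dropped. The function name sort_lexicographically is made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts numbers by their decimal string form.
void sort_lexicographically(std::vector<unsigned int>& values) {
    std::sort(values.begin(), values.end(),
              [](unsigned int x, unsigned int y) {
                  // operator< on std::string is a lexicographical comparison.
                  return std::to_string(x) < std::to_string(y);
              });
}
```

With this ordering, the input 21, 100, 2, 10, 1 becomes 1, 10, 100, 2, 21. Note that both numbers are converted to strings on every comparison, so this is simple rather than fast.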

I assume you have some good reasons, but allow me to ask: why are you sorting ints using std::lexicographical_compare? In which scenario is 0 not less than 1, for example?
For comparing scalars, I suggest using std::less, as the standard library itself does.
Your code (from the question) might contain a lambda that will use std::less and that will work perfectly. But let us go one step further and deliver some reusable code ready for pasting into your code. Here is one example:
/// sort a range in place
template< typename T>
inline void dbj_sort( T & range_ )
{
// the type of elements range contains
using ET = typename T::value_type;
// use of the std::less type
using LT = std::less<ET>;
// make its instance whose 'operator ()'
// we will use
LT less{};
std::sort(
range_.begin(),
range_.end(),
[&]( const ET & a, const ET & b) {
return less(a, b);
});
}
The above is using std::less<> internally. And it will sort anything that has begin() and end() and public type of the elements it contains. In other words implementation of the range concept.
Example usage:
std::vector<int> iv_ = { 13, 42, 2 };
dbj_sort(iv_);
std::array<int,3> ia_ = { 13, 42, 2 };
dbj_sort(ia_);
std:: generics in action ...
Why does std::less work here? Among other obvious things, because it compares two scalars. std::lexicographical_compare compares two ranges, element by element.
It can be used to compare two vectors, not two elements of one vector containing scalars.
HTH

Create a vector of int from a vector of points with C++11

I have a simple point structure
struct mypoint
{
int x;
int y;
};
and a vector of mypoints
vector<mypoint> myvector;
If I want to create a vector of int containing all the coordinates of my points (i.e. x1, y1, x2, y2, x3, y3, ...), I could easily do it in the following way
vector<mypoint>::iterator pt, ptend(myvector.end());
vector<int> newvector;
for(pt=myvector.begin(); pt!=ptend; ++pt)
{
newvector.push_back(pt->x);
newvector.push_back(pt->y);
}
Is there a way to obtain the same result in one (or two) line(s) of code using C++11?

std::vector<int> extractIntsFromPoints(const std::vector<mypoint>& pointVector)
{
std::vector<int> retVector;
for (const auto& element : pointVector)
{
retVector.push_back(element.x);
retVector.push_back(element.y);
}
return retVector;
}
Call this function where you need the int vector.
I threw in the range-based for loop to make it extra C++11.

Since you're using C++11, you can use the new for syntax.
vector<int> newvector;
for( const auto &pt : myvector)
{
newvector.push_back(pt.x);
newvector.push_back(pt.y);
}
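Since the final size is known in advance (two ints per point), either loop can reserve the storage up front and avoid repeated reallocations. A small sketch; the function name flatten_points is an illustrative assumption:

```cpp
#include <vector>

struct mypoint {
    int x;
    int y;
};

// Hypothetical helper: returns x1, y1, x2, y2, ... for the given points.
std::vector<int> flatten_points(const std::vector<mypoint>& points) {
    std::vector<int> out;
    out.reserve(points.size() * 2);  // two coordinates per point
    for (const auto& pt : points) {
        out.push_back(pt.x);
        out.push_back(pt.y);
    }
    return out;
}
```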

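Borrowing from the post "C++ std::transform vector of pairs->first to new vector" (note that std::transform yields one value per element, so this extracts only one coordinate, here p.first):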
vector<int> items;
std::transform(pairs.begin(),
pairs.end(),
std::back_inserter(items),
[](const std::pair<int, int>& p) { return p.first; });

Here's about 4 lines, using a lambda:
vector<mypoint> points;
vector<int> iv;
points.push_back(mypoint{1,2});
points.push_back(mypoint{3,4});
points.push_back(mypoint{5,6});
for_each(points.cbegin(), points.cend(),
[&iv](const mypoint &pt) {
iv.push_back(pt.x);
iv.push_back(pt.y);
});

You could use a std::pair<>: build it from the coordinates with std::make_pair and then push it into a vector, such as:
mypoint a_point;
std::pair<int, int> point = std::make_pair(a_point.x, a_point.y);
std::vector<std::pair<int, int>> vec;
vec.push_back(point);
Perhaps bulky, but it works in a couple of lines and keeps each point together rather than separating its coordinates and placing them individually in a std::vector.

As reima already noted, if you only want to reference the existing sequence, it is sufficient to cast myvector.data() to int* (assuming sizeof(mypoint) == 2 * sizeof(int) holds).
However, if you explicitly want a copy of the flattened sequence, you are probably better off creating a small utility function like this:
#include <iterator>
#include <type_traits>
#include <vector>
template <typename T, typename U>
std::vector<T> flatten(std::vector<U> const& other) {
static_assert(std::is_trivially_copyable<U>::value,
"source type must be trivially copyable!");
static_assert(std::is_trivially_copy_constructible<T>::value,
"destination type must be trivially copy constructible!");
static_assert((sizeof(U) / sizeof(T)) * sizeof(T) == sizeof(U),
"sizeof(U) must be a multiple of sizeof(T)!");
return std::vector<T>(reinterpret_cast<T const*>(other.data()),
reinterpret_cast<T const*>(std::next(other.data(), other.size())));
}
template <typename U>
std::vector<typename U::value_type> flatten(std::vector<U> const& other) {
return flatten<typename U::value_type>(other);
}
reducing your code to
auto newvector = flatten<int>(myvector);
or - if you equip your mypoint struct with a (STL-conforming) value_type member type - even to
auto newvector = flatten(myvector);
Note that this utility function is nothing more than a tweaked constructor using the inherently unsafe reinterpret_cast to convert mypoint pointers into int pointers.
To reduce the safety caveats that go along with the use of reinterpret_cast, the flatten function uses some static_assert parachutes. So, it's better to hide all this in a separate function.
Still, it uses a lot of C++11 features like auto, move construction, static_assert, type traits, std::next and vector::data() which pretty much strips down your call site code to a bare minimum.
Also, this is as efficient as it gets because the range constructor of vector will only perform the memory allocation and call uninitialized_copy, which will probably boil down to a call of memcpy for trivially copyable types.
