How to concatenate many std::vectors? - c++

There is already a question on how to concatenate two vectors: Concatenating two std::vectors. However, I found it appropriate to start a new one, as my question is a bit more specific....
I have two classes that look like this:
class AClass {
public:
std::vector<double> getCoeffs() {return coeffs;}
private:
std::vector<double> coeffs;
};
class BClass {
public:
std::vector<double> getCoeffs() {return ...;}
private:
std::vector<AClass> aVector;
};
What is the best way (i.e. avoiding unnecessary copying etc.) to concatenate the coefficients from each element in aVector?
My very first attempt was
std::vector<double> BClass::getCoeffs(){
std::vector<double> coeffs;
std::vector<double> fcoefs;
for (int i=0;i<aVector.size();i++){
fcoefs = aVector[i].getCoeffs();
for (int j=0;j<fcoefs.size();j++){
coeffs.push_back(fcoefs[j]);
}
}
return coeffs;
}
I already know how to avoid the inner for loop (thanks to the post mentioned above), but I am pretty sure that this could be done in a single line with the help of some std algorithm.
I cannot use C++11 at the moment. Nevertheless, I would also be interested how to do it in C++11 (if there is any advantage over "no C++11").
EDIT: I will try to rephrase the question a bit, to make it more clear.
Concatenating two vectors can be done via insert. For my example I would use this:
std::vector<double> BClass::getCoeffs(){
std::vector<double> coeffs;
std::vector<double> fcoefs;
for (int i=0;i<aVector.size();i++){
fcoefs = aVector[i].getCoeffs();
coeffs.insert(coeffs.end(),fcoefs.begin(),fcoefs.end());
}
return coeffs;
}
Is it possible to avoid the for loop?
I could imagine that it is possible to write something like
for_each(aVector.begin(),aVector.end(),coeffs.insert(coeffs.end(),....);

You can do this in C++11:
std::for_each(aVector.begin(), aVector.end(), [&](AClass& i){const auto& temp = i.getCoeffs(); coeffs.insert(coeffs.end(), temp.begin(), temp.end());});
C++03 is more difficult because it lacks lambdas and bind.
About as good as you can do is to use insert in your loop:
for(std::vector<AClass>::iterator it = aVector.begin(); it != aVector.end(); ++it){
const std::vector<double>& temp = it->getCoeffs();
coeffs.insert(coeffs.end(), temp.begin(), temp.end());
}
These are both essentially the same thing, though you could improve your runtime on both by returning a const std::vector<double>& from getCoeffs.
EDIT:
Arg, just saw you added insert to your question. I thought I was really going to help you out there. As a consolation tip, what you are really asking about here is flattening a std::vector of std::vectors. That has an answer here. But should you have access to boost you should look at: http://www.boost.org/doc/libs/1_57_0/libs/multi_array/doc/reference.html#synopsis

The first step is to avoid extra allocations. If you know that you won't be growing the return value, you can reserve to exactly the right size.
std::vector<double> BClass::getCoeffs(){
typedef std::vector<double> dvec;
dvec coeffs;
typedef std::vector<AClass> avec;
typedef std::vector<dvec> ddvec;
ddvec swap_space;
swap_space.reserve(aVector.size());
size_t capacity = 0;
for (avec::iterator it = aVector.begin(); it != aVector.end(); ++it) { // getCoeffs() is non-const, so a const_iterator would not compile
dvec v = it->getCoeffs(); // RVO elision!
capacity += v.size();
swap_space.push_back(dvec()); // append an empty vector to swap into
v.swap(swap_space.back());
}
dvec retval;
retval.reserve(capacity);
for (ddvec::iterator it = swap_space.begin(); it != swap_space.end(); ++it) {
retval.insert( retval.end(), it->begin(), it->end() );
}
return retval; // NRVO
}
This should avoid more than one allocation per AClass (as forced by their API! You should have a vector<?> const& accessor), plus one allocation for the return value.
Fixing AClass is advised.
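To illustrate what fixing AClass could look like, here is a minimal sketch that compiles as C++03. The names AClassRef and flattenCoeffs are hypothetical and only serve the example; the idea is that a const reference accessor lets the caller reserve once and insert without temporary copies.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical variant of AClass: the accessor returns a const reference,
// so reading the coefficients no longer copies the vector.
class AClassRef {
public:
    explicit AClassRef(const std::vector<double>& c) : coeffs(c) {}
    const std::vector<double>& getCoeffs() const { return coeffs; }
private:
    std::vector<double> coeffs;
};

// Concatenate all coefficient vectors with a single allocation for the result.
std::vector<double> flattenCoeffs(const std::vector<AClassRef>& items) {
    // First pass: add up the sizes so the result can be reserved exactly.
    std::size_t total = 0;
    for (std::size_t i = 0; i < items.size(); ++i)
        total += items[i].getCoeffs().size();

    // Second pass: append each range; no reallocation happens after reserve.
    std::vector<double> result;
    result.reserve(total);
    for (std::size_t i = 0; i < items.size(); ++i) {
        const std::vector<double>& c = items[i].getCoeffs();
        result.insert(result.end(), c.begin(), c.end());
    }
    return result;
}
```

With this accessor the swap_space step is no longer needed, because nothing is returned by value inside the loop.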

Related

Filling vector with emplace_back vs. std::transform

It's oversimplified code with a simple vector and class.
class OutputClass
{
public:
OutputClass(int x, int y);
};
std::vector<OutputClass> Convert(std::vector<int> const &input)
{
std::vector<OutputClass> res;
res.reserve(input.size());
//either (1)
for (auto const &in : input)
res.emplace_back(in, in*in);
return res;
//or something like (2)
std::transform(input.begin(),
input.end(),
std::back_inserter(res),
[](int in){return OutputClass(in, in*in);});
return res;
}
Is there a difference in performance between those two options? Static analyzers often have a rule for replacing all raw loops with algorithms, but in this case, it seems to me that looping with emplace_back would be more efficient, as we don't need either copy or move. Or I'm wrong and they are equal in terms of performance and (2) is preferable in terms of good style and readability?
To find out whether one is significantly faster than the other in a particular use case, you can measure.
I see no benefit in enforcing the creation of vectors. Avoiding that dynamic allocation when it isn't needed can be quite good for performance. Here is an example where vectors are used, but that's not necessary:
#include <ranges>
#include <vector>
// requires C++20; OutputClass is the class from the question
OutputClass
convert(int in)
{
return {in, in*in};
}
auto
convert_range(const auto& input)
{
return std::ranges::transform_view(input, convert);
}
int main()
{
std::vector<int> input {1, 2, 3};
auto r = convert_range(input);
std::vector<OutputClass> output(r.begin(), r.end());
}
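Since the advice is to measure, here is a rough sketch of how the two options from the question could be compared. The OutputClass definition with public members and the timeMicros helper are assumptions made for the example; a single timed call like this is noisy, so treat it as a starting point rather than a benchmark.

```cpp
#include <algorithm>
#include <chrono>
#include <iterator>
#include <vector>

// Stand-in for the question's OutputClass; the public members are an assumption.
struct OutputClass {
    OutputClass(int x, int y) : x(x), y(y) {}
    int x, y;
};

// Option (1): range-based loop with emplace_back.
std::vector<OutputClass> convertLoop(const std::vector<int>& input) {
    std::vector<OutputClass> res;
    res.reserve(input.size());
    for (int in : input)
        res.emplace_back(in, in * in);
    return res;
}

// Option (2): std::transform with back_inserter.
std::vector<OutputClass> convertTransform(const std::vector<int>& input) {
    std::vector<OutputClass> res;
    res.reserve(input.size());
    std::transform(input.begin(), input.end(), std::back_inserter(res),
                   [](int in) { return OutputClass(in, in * in); });
    return res;
}

// Hypothetical helper: time one call of f(input) in microseconds.
template <typename F>
long long timeMicros(F f, const std::vector<int>& input) {
    auto start = std::chrono::steady_clock::now();
    auto out = f(input);
    auto stop = std::chrono::steady_clock::now();
    (void)out; // result is only produced so the call has work to do
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeMicros(convertLoop, input) and timeMicros(convertTransform, input) on a large input, several times each and in an optimized build, gives a first impression of whether the difference matters for your case.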

Is there an efficient way to slice a C++ vector given a vector containing the indexes to be sliced

I am working to implement a code which was written in MATLAB into C++.
In MATLAB you can slice an Array with another array, like A(B), which results in a new array of the elements of A at the indexes specified by the values of the element in B.
I would like to do a similar thing in C++ using vectors. These vectors are of size 10000-40000 elements of type double.
I want to be able to slice these vectors using another vector of type int containing the indexes to be sliced.
For example, I have a vector v = <1.0, 3.0, 5.0, 2.0, 8.0> and a vector w = <0, 3, 2>. I want to slice v using w such that the outcome of the slice is a new vector (since the old vector must remain unchanged) x = <1.0, 2.0, 5.0>.
I came up with a function to do this:
template<typename T>
std::vector<T> slice(std::vector<T>& v, std::vector<int>& id) {
std::vector<T> tmp;
tmp.reserve(id.size());
for (auto& i : id) {
tmp.emplace_back(v[i]);
}
return tmp;
}
I was wondering if there was potentially a more efficient way to do such a task. Speed is the key here since this slice function will be in a for-loop which has approximately 300000 iterations. I heard the boost library might contain some valid solutions, but I have not had experience yet with it.
I used the chrono library to measure the time it takes to call this slice function, where the vector to be sliced was length 37520 and the vector containing the indexes was size 1550. For a single call of this function, the time elapsed = 0.0004284s. However, over ~300000 for-loop iterations, the total elapsed time was 134s.
Any advice would be much appreciated!
emplace_back has some overhead as it involves some internal accounting inside std::vector. Try this instead:
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {
std::vector<T> tmp;
tmp.resize (id.size ());
size_t n = 0;
for (auto i : id) {
tmp [n++] = v [i];
}
return tmp;
}
Also, I removed an unnecessary dereference in your inner loop.
Edit: I thought about this some more, and inspired by @jack's answer, I think that the inner loop (which is the one that counts) can be optimised further. The idea is to put everything used by the loop in local variables, which gives the compiler the best chance to optimise the code. So try this and see what timings you get. Make sure that you test a Release / optimised build:
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {
size_t id_size = id.size ();
std::vector<T> tmp (id_size);
T *tmp_data = tmp.data ();
const int *id_data = id.data ();
const T* v_data = v.data ();
for (size_t i = 0; i < id_size; ++i) {
tmp_data [i] = v_data [id_data [i]];
}
return tmp;
}
The performance seems a bit slow; are you building with compiler optimizations (e.g. g++ main.cpp -O3, or switching to release mode if you are using an IDE)? This alone sped up computation time around 10x.
If you are already using optimizations, switching to a basic indexed for loop (for (int i = 0; i < id.size(); i++)) sped up computation around 2-3x on my machine. Note that auto is deduced at compile time and costs nothing at run time, so any gain comes from how the compiler optimizes the loop; measure it in your own build.
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id){
// @Jan Schultke's suggestion
std::vector<T> tmp(id.size ());
size_t n = 0;
for (int i = 0; i < id.size(); i++) {
tmp [n++] = v [id [i]]; // look up the index stored in id, not the loop counter
}
return tmp;
}
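Because the slice is called roughly 300000 times, another option that none of the answers above mention is to reuse one output vector across calls, so that memory is not allocated on every iteration. A minimal sketch, assuming a hypothetical function slice_into and assuming that every index in id is valid for v:

```cpp
#include <cstddef>
#include <vector>

// Writes v[id[k]] into out[k], reusing out's existing capacity when possible.
// Assumes every index in id is a valid position in v (not checked here).
template <typename T>
void slice_into(const std::vector<T>& v, const std::vector<int>& id,
                std::vector<T>& out) {
    out.resize(id.size());
    for (std::size_t k = 0; k < id.size(); ++k) {
        out[k] = v[static_cast<std::size_t>(id[k])];
    }
}
```

Declare the output vector once outside the hot loop and pass it in on every iteration. Whether this helps in practice depends on the allocator and the sizes involved, so it is worth timing against the versions above.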

C++ vectors - Is there any way I can copy a vector inside a for loop and use it outside of the loop?

I have a do-while loop which copies a vector before it clears it, to make room for the for loop to run again. I have tried the following (with declaring the vector as a global one):
std::vector<int> z;
// bunch of code
do {
// bunch of code
double v_size = v.size();
for (int i = 0; i < v_size; i++) {
z.push_back(v[i]);
}
} while (true);
This does work, but my understanding is that it's bad programming practice and may give bizarre results on different compilers. I also get a warning (on the line where the push_back is):
Implicit conversion changes signedness: 'int' to
'std::__1::vector<int, std::__1::allocator<int> >::size_type' (aka
'unsigned long')
I am really new to C++ and coding in general. So, if there are any veterans here, willing to help or figure this out, it would be highly appreciated.
Here is some additional code in more context to clarify:
#include <iostream>
#include <vector>
int main(){
std::vector<int> v;
std::vector<int> z;
int option;
std::cout << "Enter option";
std::cin >> option;
do {
double v_size = v.size();
for (int i = 0; i < v_size; i++) {
z.push_back(v[i]);
}
// Clearing vector v to let the loop run again, and the vector
// need to be empty for the real code to work.
v.clear();
} while (option != 1);
// Use the vector Z for all the total values that
// circled through vector v.
}
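The message you are getting is a signedness warning: i is a signed int, but the vector's operator[] takes the unsigned size_type, so v[i] converts implicitly. Storing v.size() in a double is also unnecessary; use std::size_t for both the size and the index.
If you are trying to copy the contents of one vector into another, what you are doing is not the best way.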
There are a few methods
Source: Ways to copy a vector into another in C++
Using the = operator
std::vector<int> vec1{1,2,3,4,5};
std::vector<int> vec2 = vec1;
Passing vector by a constructor
std::vector<int> vec1{1,2,3,4,5};
std::vector<int> vec2(vec1);
Using std::copy
std::vector<int> vec1{1,2,3,4,5};
std::vector<int> vec2;
std::copy(vec1.begin(), vec1.end(), std::back_inserter(vec2)); // needs <algorithm> and <iterator>
assign
std::vector<int> vec1{1,2,3,4,5};
std::vector<int> vec2;
vec2.assign(vec1.begin(), vec1.end());
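Note that =, the copy constructor and assign all replace whatever the destination held before. In the question, z has to keep the values collected in earlier iterations, so appending may be the better fit. A minimal sketch, assuming a hypothetical helper named appendAll:

```cpp
#include <vector>

// Appends every element of src to the end of dst, keeping what dst already holds.
void appendAll(std::vector<int>& dst, const std::vector<int>& src) {
    dst.insert(dst.end(), src.begin(), src.end());
}
```

Inside the do-while loop, appendAll(z, v); followed by v.clear(); would take the place of the index loop, and it also avoids the signed index that triggers the warning.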
Iterating through a vector
Just some extra knowledge I believe can help you
Maybe you already know this, maybe you don't. But iterating through a vector the way you have mentioned is also not the best and there is a better way.
C++11 Introduced Range-based loops. A nicer and cleaner way to iterate over a range of values.
Syntax
It goes like this
for ( range_declaration : range_expression )
in context to our example
std::vector<int> vec{4,62,36,54};
for (auto i:vec){
std::cout << i;
}
If you don't want to modify the values, it is a good practice to use
const auto i:vec rather than auto i:vec. Check this out for more information

Smart way of assigning single member from vector A to vector B

This is a piece of code I'm currently using and I was wondering if there was a more elegant way of doing this in C++11 -
Essentially vector_a is copied to vector_b, then slightly modified, then vector_b is returned.
Vector elements are of class Point which is basically (leaving out constructors and a bunch of methods):
class Point {
double x,
y,
z;
};
Ideally I'd love to boil down the assignment of member z from vector_a to vector_b to something like a line or two but couldn't come up with an elegant way of doing it.
Any suggestions welcome!
auto localIter = vector_a.begin();
auto outIter = vector_b.begin();
while (localIter != vector_a.end() && outIter != vector_b.end())
{
outIter->z = localIter->z;
localIter++;
outIter++;
}
You may use transform().
std::transform (vector_a.begin(), vector_a.end(), vector_b.begin(), vector_b.begin(), [](const Elem& a, Elem b) { b.z = a.z; return b; });
Where Elem is a type of vector element.
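For reference, here is a self-contained sketch of the two-range form of std::transform. It assumes a Point with public members (the question's class hides them) and a hypothetical wrapper named copyZ; the destination must be at least as long as the source.

```cpp
#include <algorithm>
#include <vector>

// Assumed stand-in for the question's Point, with public members.
struct Point {
    double x, y, z;
};

// Copies only z from each element of a into the matching element of b.
// Requires b.size() >= a.size(); otherwise std::transform writes out of range.
void copyZ(const std::vector<Point>& a, std::vector<Point>& b) {
    std::transform(a.begin(), a.end(), b.begin(), b.begin(),
                   [](const Point& src, Point dst) {
                       dst.z = src.z; // keep dst.x and dst.y, take z from src
                       return dst;
                   });
}
```

The lambda receives one element from each vector and returns the updated element of b, which std::transform writes back into b. Elements of b beyond a.size() are left untouched.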
As the vector has random access iterators (so std::next is efficient), I would write the code the following way:
auto it = vector_a.begin();
std::for_each( vector_b.begin(),
std::next( vector_b.begin(),
std::min( vector_a.size(), vector_b.size() ) ),
[&it] ( Point &p ) { p.z = it++->z; } );
A partial copy is, actually, just a transformation of the elements (one of many), and therefore std::transform is a natural fit here.
Like many algorithms acting on multiple sequences, you have to be careful about the bounds of your containers; in this particular case, since vector_b just receives stuff, the easiest is to start empty and adjust its size as you go.
Thus, in the end, we get:
vector_b.clear();
std::transform(vector_a.begin(),
vector_a.end(),
std::back_inserter(vector_b),
[](Elem const& a) { Elem b; b.z = a.z; return b; });
transform is perhaps the most generic algorithm in the Standard Library (it could imitate copy, for example), so you should carefully consider whether a more specialized algorithm exists before reaching for it. In this case, however, it just fits.
I would be tempted to do something a bit like this:
#include <vector>
struct info
{
int z;
};
int main()
{
std::vector<info> a = {{1}, {2}, {3}};
std::vector<info> b = {{4}, {5}};
for(size_t i(0); i < a.size() && i < b.size(); ++i)
b[i].z = a[i].z;
}

C++ cast vector type in place

Is it possible to do this without creating new data structure?
Suppose we have
struct Span{
int from;
int to;
};
vector<Span> s;
We want to get an integer vector from s directly, by casting
vector<Span> s;
to
vector<int> s;
so we could remove/change some "from", "to" elements, then cast it back to
vector<Span> s;
This is not really a good idea, but I'll show you how.
You can get a raw pointer to the integer this way:
int * myPointer2 = (int*)&(s[0]);
but this is really bad practice because you can't guarantee that the span structure doesn't have any padding, so while it might work fine for me and you today we can't say much for other systems.
#include <iostream>
#include <vector>
struct Span{
int from;
int to;
};
int main()
{
std::vector<Span> s;
Span a = { 1, 2};
Span b = {2, 9};
Span c = {10, 14};
s.push_back(a);
s.push_back(b);
s.push_back(c);
int * myPointer = (int*)&(s[0]);
for(int k = 0; k < 6; k++)
{
std::cout << myPointer[k] << std::endl;
}
return 0;
}
As I said, that hard reinterpret cast will often work but is very dangerous and lacks the cross-platform guarantees you normally expect from C/C++.
The next worse thing is this, that will actually do what you asked but you should never do. This is the sort of code you could get fired for:
// Baaaad mojo here: turn a vector<span> into a vector<int>:
std::vector<int> * pis = (std::vector<int>*)&s;
for ( std::vector<int>::iterator It = pis->begin(); It != pis->end(); It++ )
std::cout << *It << std::endl;
Notice how I'm using a pointer to vector and pointing to the address of the vector object s. My hope is that the internals of both vectors are the same and I can use them just like that. For me, this works and while the standard templates may luckily require this to be the case, it is not generally so for templated classes (see such things as padding and template specialization).
Consider instead copying out an array (see ref 2 below) or just using s[1].from and s[2].to.
Related Reading:
Are std::vector elements guaranteed to be contiguous?
How to convert vector to array in C++
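The copy-out alternative could look like the sketch below. The helper names toInts and toSpans are hypothetical; the point is only that an explicit copy does not depend on padding or on the internals of std::vector.

```cpp
#include <cstddef>
#include <vector>

struct Span {
    int from;
    int to;
};

// Copy the endpoints out into a flat vector: from0, to0, from1, to1, ...
std::vector<int> toInts(const std::vector<Span>& spans) {
    std::vector<int> out;
    out.reserve(spans.size() * 2);
    for (std::size_t i = 0; i < spans.size(); ++i) {
        out.push_back(spans[i].from);
        out.push_back(spans[i].to);
    }
    return out;
}

// Rebuild spans from a flat vector; a trailing unpaired value is ignored.
std::vector<Span> toSpans(const std::vector<int>& ints) {
    std::vector<Span> out;
    out.reserve(ints.size() / 2);
    for (std::size_t i = 0; i + 1 < ints.size(); i += 2) {
        Span s = { ints[i], ints[i + 1] };
        out.push_back(s);
    }
    return out;
}
```

This costs two copies, but the intermediate vector<int> can be edited freely before it is converted back.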
If sizeof(Span) == sizeof(int) * 2 (that is, Span has no padding), then you can safely use reinterpret_cast<int*>(&v[0]) to get a pointer to array of int that you can iterate over. You can guarantee no-padding structures on a per-compiler basis, with __attribute__((__packed__)) in GCC and #pragma pack in Visual Studio.
However, there is a way that is guaranteed by the standard. Define Span like so:
struct Span {
int endpoints[2];
};
endpoints[0] and endpoints[1] are required to be contiguous. Add some from() and to() accessors for your convenience, if you like, but now you can use reinterpret_cast<int*>(&v[0]) to your heart’s content.
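A small sketch of that layout with the suggested accessors. The static_assert is an addition of mine that needs C++11; it turns the no-padding assumption into a compile-time check instead of a silent hope.

```cpp
// Span stored as a two-element array, with from()/to() accessors for convenience.
struct Span {
    int endpoints[2];
    int& from() { return endpoints[0]; }
    int& to() { return endpoints[1]; }
    int from() const { return endpoints[0]; }
    int to() const { return endpoints[1]; }
};

// Fails to compile if the compiler adds padding to Span (needs C++11).
static_assert(sizeof(Span) == 2 * sizeof(int), "Span must not contain padding");
```

Span is still an aggregate, so it can be initialized as Span s = {{1, 2}}; and read through s.from() and s.to().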
But if you’re going to be doing a lot of this pointer-munging, you might want to make your own vector-like data structure that is more amenable to this treatment—one that offers more safety guarantees so you can avoid shot feet.
Disclaimer: I have absolutely no idea about what you are trying to do. I am simply making educated guesses and showing possible solutions based on that. Hopefully I'll guess one right and you won't have to do crazy shenanigans with stupid casts.
If you want to remove a certain element from the vector, all you need to do is find it and remove it, using the erase function. You need an iterator to your element, and obtaining that iterator depends on what you know about the element in question. Given std::vector<Span> v;:
If you know its index:
v.erase(v.begin() + idx);
If you have an object that is equal to the one you're looking for:
Span doppelganger;
v.erase(std::find(v.begin(), v.end(), doppelganger));
If you have an object that is equal to what you're looking for but want to remove all equal elements, you need the erase-remove idiom:
Span doppelganger;
v.erase(std::remove(v.begin(), v.end(), doppelganger),
v.end());
If you have some criterion to select the element:
v.erase(std::find_if(v.begin(), v.end(),
[](Span const& s) { return s.from == 0; }));
// in C++03 you need a separate function for the criterion
bool starts_from_zero(Span const& s) { return s.from == 0; }
v.erase(std::find_if(v.begin(), v.end(), starts_from_zero));
If you have some criterion and want to remove all elements that fit that criterion, you need the erase-remove idiom again:
v.erase(std::remove_if(v.begin(), v.end(), starts_from_zero),
v.end());
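Two details are easy to miss when trying these snippets: std::find and std::remove need an operator== for Span, and erasing the iterator returned by a failed search (v.end()) is undefined behaviour. A compilable sketch, with hypothetical helper names eraseFirst and eraseAllFromZero:

```cpp
#include <algorithm>
#include <vector>

struct Span {
    int from;
    int to;
};

// std::find and std::remove compare with ==, so Span needs this operator.
bool operator==(const Span& a, const Span& b) {
    return a.from == b.from && a.to == b.to;
}

bool starts_from_zero(const Span& s) { return s.from == 0; }

// Erase the first element equal to target, if there is one.
void eraseFirst(std::vector<Span>& v, const Span& target) {
    std::vector<Span>::iterator it = std::find(v.begin(), v.end(), target);
    if (it != v.end()) // erasing v.end() would be undefined behaviour
        v.erase(it);
}

// Erase every element that satisfies the predicate (erase-remove idiom).
void eraseAllFromZero(std::vector<Span>& v) {
    v.erase(std::remove_if(v.begin(), v.end(), starts_from_zero), v.end());
}
```

The two-argument erase in the idiom is safe even when nothing matches, because remove_if then returns v.end() and the erased range is empty.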