C++ array operator overhead - c++

I remember reading a while back some code that allowed the compiler to do some work and simplify an expression like this one:
// edit: yes, the parameters were meant to be passed by reference
// and maintain constness, sorry!
template< typename T >
std::vector<T> operator+( const std::vector<T>& a, const std::vector<T>& b )
{
assert( a.size() == b.size() );
std::vector<T> o( a.size() ); // size the result; reserve() alone would leave o empty
for( typename std::vector<T>::size_type i = 0; i < a.size(); ++i )
o[i] = a[i] + b[i];
return o;
}
// same for operator* but a[i] * b[i] instead
std::vector<double> a, b, c, d, e;
// do some initialization of the vectors
e = a * b + c * d;
Normally each operator would create and allocate a new vector; with this technique, the compiler would instead create only one result vector and perform all the operations directly on it.
What is this technique?

As @Agnew mentioned very early, the technique you're describing is expression templates.
This is typically done with the mathematical concept of a vector [1], not a std::vector.
The broad strokes are:
Don't have math operations on your vectors return the result. Instead, have them return a proxy object that represents the operation that eventually needs to be done. a * b could return a "multiplication proxy" object that just holds const references to the two vectors that should be multiplied.
Write math operations for these proxies too, allowing them to be chained together, so a * b + c * d becomes (TempMulProxy) + (TempMulProxy) becomes (TempAddProxy), all without doing any of the math or copying any vectors.
Write an assignment operator that takes your proxy object for the right-side object. This operator can see the entire expression a * b + c * d and do that operation efficiently on your vector, while knowing the destination. All without creating multiple temporary vector objects.
[1] Or a matrix, quaternion, etc.
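The broad strokes above can be sketched roughly as follows. This is a minimal, hard-coded sketch rather than a general library: Vec, MulProxy and AddProxy are hypothetical names, and only the exact shape (a * b) + (c * d) is supported.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec;

// Proxies only remember their operands; no arithmetic happens here.
struct MulProxy { const Vec& lhs; const Vec& rhs; };
struct AddProxy { const MulProxy& lhs; const MulProxy& rhs; };

// Hypothetical math vector type (an assumption, not std::vector itself).
struct Vec {
    std::vector<double> data;

    // The assignment sees the whole expression and evaluates it in one
    // loop, writing straight into the destination.
    Vec& operator=(const AddProxy& e) {
        const Vec& a = e.lhs.lhs;
        const Vec& b = e.lhs.rhs;
        const Vec& c = e.rhs.lhs;
        const Vec& d = e.rhs.rhs;
        const std::size_t n = a.data.size();
        assert(b.data.size() == n && c.data.size() == n && d.data.size() == n);
        data.resize(n);
        for (std::size_t i = 0; i < n; ++i)
            data[i] = a.data[i] * b.data[i] + c.data[i] * d.data[i];
        return *this;
    }
};

// a * b returns a proxy instead of a new Vec.
inline MulProxy operator*(const Vec& a, const Vec& b) { return MulProxy{a, b}; }

// (proxy) + (proxy) returns another proxy, still without computing anything.
inline AddProxy operator+(const MulProxy& a, const MulProxy& b) { return AddProxy{a, b}; }
```

With this in place, e = a * b + c * d; runs a single loop and allocates no temporary vectors. The proxies hold references to temporaries, so they are only safe inside the full expression; storing one with auto and using it later would leave dangling references.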

I don't see a question here. However, my crystal ball tells me that you want to know the better method of two methods you came up with in order to perform component-wise arithmetic operations on vectors like a * b + c * d where a, b, c, d are vectors (std::vector<T>) having the same size:
For each operation to be done, loop over the elements, perform the calculation and return a resulting vector. Put these operations together in a formula on vectors.
For each element in the input vectors, calculate the whole expression and write it into one single final resulting vector.
There are two things to consider:
Performance: Here, the second option is ahead, since no unnecessary temporary vectors are allocated.
Re-usability: Clearly, it's nice to implement algorithmic operations for vectors and re-use them by simply expressing your target formula on vectors.
However, there is a nice option to implement the second option which looks very pretty:
std::vector<int> a, b, c, d, e;
// fill a, b, c, d with data
auto expression = [](int a, int b, int c, int d){ return a * b + c * d; };
assert (a.size() == b.size() && b.size() == c.size() && c.size() == d.size());
e.resize(a.size()); // resize, not reserve: the loop below writes through e.begin()
for(auto _a = a.begin(), _b = b.begin(), _c = c.begin(), _d = d.begin(), _e = e.begin();
_a != a.end();
++_a, ++_b, ++_c, ++_d, ++_e)
{
*_e = expression(*_a, *_b, *_c, *_d);
}
This way, you can separate the expression from the logic to evaluate it:
void componentWise4(std::function<int(int,int,int,int)> f,
const std::vector<int> & a,
const std::vector<int> & b,
const std::vector<int> & c,
const std::vector<int> & d,
std::vector<int> & result)
{
assert (a.size() == b.size() && b.size() == c.size() && c.size() == d.size());
result.resize(a.size()); // resize, not reserve: the loop below writes through result.begin()
for(auto _a = a.begin(), _b = b.begin(), _c = c.begin(), _d = d.begin(), _result = result.begin();
_a != a.end();
++_a, ++_b, ++_c, ++_d, ++_result)
{
*_result = f(*_a, *_b, *_c, *_d);
}
}
Which is then called like that:
std::vector<int> a, b, c, d, e;
// fill a, b, c, d with data
componentWise4([](int a, int b, int c, int d){ return a * b + c * d; },
a, b, c, d, e);
I'm sure this "expression evaluator" can be extended using C++11 new feature "variadic templates" to support arbitrary numbers of arguments within the expression as well as even different types. I couldn't manage to get it working (the variadic template thing), you can try to finish my attempt here: http://ideone.com/w88kuG (I'm new to variadic templates, so I don't know the syntax).
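For what it's worth, here is one way the variadic version could look. It is only a sketch under my own assumptions: the name componentWise and its signature are made up for illustration, it returns the result instead of taking an output parameter, and the element type of the result is deduced from whatever f returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: applies f element-wise across any number of
// equally sized vectors, which may have different element types.
template <typename F, typename First, typename... Rest>
auto componentWise(F f, const std::vector<First>& first,
                   const std::vector<Rest>&... rest)
    -> std::vector<decltype(f(first[0], rest[0]...))>
{
    using Result = decltype(f(first[0], rest[0]...));

    const std::size_t n = first.size();
    // Expand the pack into an array so every size can be checked (C++11 style).
    const bool sameSize[] = { true, (rest.size() == n)... };
    for (bool ok : sameSize) { assert(ok); (void)ok; }

    std::vector<Result> result;
    result.reserve(n);                        // reserve is fine here: we push_back
    for (std::size_t i = 0; i < n; ++i)
        result.push_back(f(first[i], rest[i]...));
    return result;
}
```

It would then be called like componentWise([](int p, int q, int r, int s){ return p * q + r * s; }, a, b, c, d), with no std::function involved, so the lambda can be inlined.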

What you want is described in "The C++ Programming Language", Third Edition, by Bjarne Stroustrup, in section 22.4.7 Temporaries, Copying, and Loops [num.matrix]. It is always a good idea to read the book.
If you don't have it, we basically have two options:
First: we write a set of functions that directly calculate the most commonly expected combinations (for example, mul_add_and_assign(&U,&M,&V,&W) to calculate U = M*V+W) and let the user choose whichever function is most convenient.
Second: we can introduce some auxiliary classes (for example VxV, VplusV, etc.) that only keep references to the arguments of each operation and define a conversion operator to vector. We then create overloads of the operators + and * that take two vectors by reference and just return an object of the corresponding type. We can create classes such as VxVplusVxV to represent more complex operations. Finally, we overload operator= to assign a VxVplusVxV to a vector. In this last overload we perform the whole calculation, using the references to the arguments kept in the auxiliary objects, with no or minimal temporary vectors created.

Related

Why does this happen at the end of the for loop with vectors c++

I want to erase repeated elements in a vector; I used a for loop to check if the next element in the vector is the same as the current element in the iteration and then delete it if true but, for some reason, it deletes the last element without being equal.
Here's my code:
#include <string>
#include <vector>
#include <iostream>
using namespace std;
template <typename T> vector<T> uniqueInOrder(const vector<T>& iterable){
vector<T> coolestVector = iterable;
for (int i = 0; i < coolestVector.size(); i++)
{
if (coolestVector[i] == coolestVector[i+1]){
coolestVector.erase(coolestVector.begin()+i);
i--;
}
/*for (int i = 0; i < coolestVector.size(); i++)
{
cout<<coolestVector[i]<<", ";
}
cout<<i<<", ";
cout<<coolestVector.size();
cout<<endl;*/
}
for (int i = 0; i < coolestVector.size(); i++)
{
cout<<coolestVector[i]<<endl;
}
return coolestVector;
}
vector<char> uniqueInOrder(const string& iterable){
vector<char> coolVector = {};
for (int i = 0; i < iterable.size(); i++)
{
coolVector.push_back(iterable[i]);
}
const vector<char> realVector = coolVector;
uniqueInOrder(realVector);
}
int main(){
const string test = "AAAABBBCCDAABBB";
uniqueInOrder(test);
}
output:
vector 0: A, A, A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 14
vector 0: A, A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 13
vector 0: A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 12
vector 1: A, B, B, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 12
vector 1: A, B, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 11
vector 1: A, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 10
vector 2: A, B, C, C, D, A, A, B, B, B, iterator value 1, vector size 10
vector 2: A, B, C, D, A, A, B, B, B, iterator value 1, vector size 9
vector 3: A, B, C, D, A, A, B, B, B, iterator value 2, vector size 9
vector 4: A, B, C, D, A, A, B, B, B, iterator value 3, vector size 9
vector 4: A, B, C, D, A, B, B, B, iterator value 3, vector size 8
vector 5: A, B, C, D, A, B, B, B, iterator value 4, vector size 8
vector 5: A, B, C, D, A, B, B, iterator value 4, vector size 7
vector 5: A, B, C, D, A, B, iterator value 4, vector size 6
vector 5: A, B, C, D, A, iterator value 4, vector size 5
A
B
C
D
A
Expected:
A
B
C
D
A
B
Why is the code incorrect?
Many people learn to iterate over an array or vector by memorizing the setup
for (int i = 0; i < X.size(); i++)
This is good for basic loops, but there are times when it is inadequate. Do you know why the conditional is i < X.size()? A basic understanding would say that this conditional ensures that the loop body is executed a number of times equal to the size of X. This is not wrong, but that rationale is more applicable when i is not used inside the loop body. (As an example, that rationale would apply equally well if i started at 1 and the loop continued as long as i <= X.size(), yet that is not a good way to iterate over an array/vector.)
A deeper understanding looks at how i is used in the loop body. A common example is printing the elements of X. (This is preliminary; we'll return to the question's situation later.) A loop that prints the elements of X might look like the following:
for (int i = 0; i < X.size(); i++)
std::cout << X[i] << ' ';
Note the index given to X; this is key to the loop's condition. The condition's deeper purpose is to ensure that the indices stay within the valid range. The indices given to X must not drop below 0 and they must remain less than X.size(). That is, index < X.size() where index gets replaced by whatever you have in the brackets. In this case, the thing in the brackets is i, so the condition becomes the familiar i < X.size().
Now let's look at the question's code.
for (int i = 0; i < coolestVector.size(); i++)
{
if (coolestVector[i] == coolestVector[i+1]){
// Code not using operator[]
}
// Diagnostics
}
There are two places where operator[] is used inside the loop. Apply the above "deeper understanding" to each of them, then combine the resulting conditionals with a logical "and".
The first index is i, so the goal index < X.size() becomes i < coolestVector.size() for this case.
The second index is i+1, so the goal index < X.size() becomes i+1 < coolestVector.size() for this case.
Combining these gives i < coolestVector.size() && i+1 < coolestVector.size(). This is what the loop's conditional should be to ensure that the indices stay within the valid range. Something logically equivalent would also work. Assuming that i+1 does not overflow (which would entail another class of problems), if i+1 is less than some value then so is i. It is enough to check that i+1 is in range, so we can simplify this conditional to i+1 < coolestVector.size().
for (int i = 0; i+1 < coolestVector.size(); i++) // <-- Fixed!
{
if (coolestVector[i] == coolestVector[i+1]){
// Code not using operator[]
}
// Diagnostics
}
(I know, that was a lot of writing to say "add one". The point is to give you – and future readers – the tools to get the next loop correct.)
Note that the same principle applies to the start of the loop. We start i at 0 so that i >= 0. This happens to imply i+1 >= 0 as well, so in this case there is nothing extra to be done. However, if one of the used indices was i-1, then you would need to ensure i-1 >= 0, which would be done by starting i at 1.
Look at your indices to determine where your loop control variable should start and stop.
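Putting this together, a corrected version of the question's function might look like the sketch below. The function names are mine, not the question's. The first version keeps the hand-written loop with the i+1 bound; the second swaps in the standard std::unique algorithm, which removes consecutive duplicates and is the usual idiom for this task.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written loop: both indices used in the body (i and i+1) are
// covered by the loop condition.
template <typename T>
std::vector<T> uniqueInOrderLoop(std::vector<T> v)
{
    for (std::size_t i = 0; i + 1 < v.size(); ) {
        if (v[i] == v[i + 1])
            v.erase(v.begin() + i);  // stay at i: a new neighbour now sits at i+1
        else
            ++i;
    }
    return v;
}

// Same result with the standard library: std::unique shifts the kept
// elements to the front, erase() then drops the leftover tail.
template <typename T>
std::vector<T> uniqueInOrderStd(std::vector<T> v)
{
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}
```

Both take the vector by value, so the caller's data is left untouched, as in the original code.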
I have separated this from my earlier answer because the earlier answer can stand on its own and I do not want it embroiled in the potential controversy that comes from explaining undefined behavior.
Why did the program consistently remove the last element?
Officially, we are in the realm of undefined behavior, so anything is possible. However, it is very likely that this behavior will be seen in all release builds, with two caveats.
An earlier element was removed. If no elements should be removed (a case worth adding to your test suite), then the behavior is unpredictable, possibly a crash but most likely the expected behavior.
Move construction leaves behind a copy. This is true for simple types like char. You likely will not see this behavior for a vector of std::string.
When an element in the middle of a std::vector is erased, all of the elements after that element are shifted down an index; they are copied (or moved) to the preceding element.
A B B C D
^
|-- erase this
A B B C D
^ ^ ^ <--- shift and copy (or move)
B C D
A B C D D
^
|-- Last element in the vector
Note that space is not released upon erasing. The vector still owns the memory where D used to be; it's just that accessing the element from outside the vector implementation is undefined behavior. Also, that memory is unlikely to have its bits changed by the vector in a release build. So it is very likely that past the end of the vector is a copy of the last element of the vector, unless the move constructor changed it.
Now comes your condition. When i is coolestVector.size()-1, you check to see if the last element of the vector (coolestVector[i]) equals the element past the end of the vector (coolestVector[i+1]). A release build will not verify that the index is valid, and the operating system does not care if that location in memory is accessed, so this comparison is likely to go through as one might naively expect. Does the last element of the vector equal the thing from which it was copied? Yes! OK, delete the last element.
Very likely in a release build, but don't rely on it.
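If you would rather have the out-of-range read fail loudly than rely on what a release build happens to do, one option is to index with at(), which throws std::out_of_range instead of invoking undefined behavior. The helper below is a hypothetical diagnostic, not part of the question's code; it runs the question's loop bounds and reports whether any comparison reached past the end.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical diagnostic: uses the question's loop condition (i < size)
// together with the i+1 access, but through the bounds-checked at().
template <typename T>
bool loopReadsPastEnd(const std::vector<T>& v)
{
    try {
        for (std::size_t i = 0; i < v.size(); ++i) {
            // at(i + 1) throws on the last iteration, when i + 1 == v.size()
            (void)(v.at(i) == v.at(i + 1));
        }
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For any non-empty vector this reports true, which is exactly the bug: the last iteration always compares against an element that does not exist.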
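You can use std::set to keep only unique elements. Each insertion costs O(log n), so the whole pass is O(n log n), not constant or linear time. Note that a set drops every repeated value (not just adjacent repeats) and iterates in sorted order, so the original order is not preserved.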
void Unique_Vector(vector<string>&v,int size)
{
std::set<string>s;
for(auto i : v)
{
s.insert(i);
}
std::cout<<"Vector after removing duplicate :";
for(auto i : s)
{
std::cout<<i<<" ";
}
}

Smart way of assigning single member from vector A to vector B

This is a piece of code I'm currently using and I was wondering if there was a more elegant way of doing this in C++11 -
Essentially vector_a is copied to vector_b, then slightly modified, then vector_b is returned.
Vector elements are of class Point which is basically (leaving out constructors and a bunch of methods):
class Point {
double x,
y,
z;
};
Ideally I'd love to boil down the assignment of member z from vector_a to vector_b to something like a line or two but couldn't come up with an elegant way of doing it.
Any suggestions welcome!
auto localIter = vector_a.begin();
auto outIter = vector_b.begin();
while (localIter != vector_a.end() && outIter != vector_b.end())
{
outIter->z = localIter->z;
localIter++;
outIter++;
}
You may use transform().
std::transform (vector_a.begin(), vector_a.end(), vector_b.begin(), vector_b.begin(), [](const Elem& a, Elem b) { b.z = a.z; return b; });
Where Elem is the type of the vector elements (Point in the question). Note that vector_b must be at least as long as vector_a.
As the vector has a random access iterator (so std::next is efficient), I would write the code the following way:
auto it = vector_a.begin();
std::for_each( vector_b.begin(),
std::next( vector_b.begin(),
std::min( vector_a.size(), vector_b.size() ) ),
[&it] ( Point &p ) { p.z = it++->z; } );
A partial copy is, actually, just a transformation of the elements (one of many), and therefore std::transform is a natural fit here.
Like many algorithms acting on multiple sequences, you have to be careful about the bounds of your containers; in this particular case, since vector_b just receives stuff, the easiest is to start empty and adjust its size as you go.
Thus, in the end, we get:
vector_b.clear();
std::transform(vector_a.begin(),
vector_a.end(),
std::back_inserter(vector_b),
[](Elem const& a) { Elem b; b.z = a.z; return b; });
transform is perhaps the most generic algorithm in the Standard Library (it could imitate copy, for example), so you should carefully consider whether a more specialized algorithm exists before reaching for it. In this case, however, it just fits.
I would be tempted to do something a bit like this:
#include <vector>
struct info
{
int z;
};
int main()
{
std::vector<info> a = {{1}, {2}, {3}};
std::vector<info> b = {{4}, {5}};
for(size_t i(0); i < a.size() && i < b.size(); ++i)
b[i].z = a[i].z;
}

dot product of vector < vector < int > > over the first dimension

I have
vector < vector < int > > data_mat ( 3, vector < int > (4) );
vector < int > data_vec ( 3 );
where data_mat can be thought of as a matrix and data_vec as a column vector, and I'm looking for a way to compute the inner product of every column of data_mat with data_vec, and store it in another vector < int > data_out (4).
The example http://liveworkspace.org/code/2bW3X5%241 using for_each and transform, can be used to compute column sums of a matrix:
sum=vector<int> (data_mat[0].size());
for_each(data_mat.begin(), data_mat.end(),
[&](const std::vector<int>& c) {
std::transform(c.begin(), c.end(), sum.begin(), sum.begin(),
[](int d1, double d2)
{ return d1 + d2; }
);
}
);
Is it possible, in a similar way (or in a slightly different way that uses STL functions), to compute column dot products of matrix columns with a vector?
The problem is that the 'd2 = d1 + d2' trick does not work here in the column inner product case -- if there is a way to include a d3 as well that would solve it ( d3 = d3 + d1 * d2 ) but ternary functions do not seem to exist in transform.
In fact you can use your existing column sum approach nearly one to one. You don't need a ternary std::transform as inner loop because the factor you scale the matrix rows with before summing them up is constant for each row, since it is the row value from the column vector and that iterates together with the matrix rows and thus the outer std::for_each.
So what we need to do is iterate over the rows of the matrix and multiply each complete row by the corresponding value in the column vector and add that scaled row to the sum vector. But unfortunately for this we would need a std::for_each function that simultaneously iterates over two ranges, the rows of the matrix and the rows of the column vector. To achieve this, we could use the usual unary std::for_each and just do the iteration over the column vector manually, using an additional iterator:
std::vector<int> sum(data_mat[0].size());
auto vec_iter = data_vec.begin();
std::for_each(data_mat.begin(), data_mat.end(),
[&](const std::vector<int>& row) {
int vec_value = *vec_iter++; //manually advance vector row
std::transform(row.begin(), row.end(), sum.begin(), sum.begin(),
[=](int a, int b) { return a*vec_value + b; });
});
The additional manual iteration inside the std::for_each isn't a very idiomatic use of the standard library algorithms, but unfortunately there is no binary std::for_each we could use.
Another option would be to use std::transform as outer loop (which can iterate over two ranges), but we don't really compute a single value in each outer iteration to return, so we would have to just return some dummy value from the outer lambda and throw it away by using some kind of dummy output iterator. That wouldn't be the cleanest solution either:
//output iterator that just discards any output
struct discard_iterator : std::iterator<std::output_iterator_tag,
void, void, void, void>
{
discard_iterator& operator*() { return *this; }
discard_iterator& operator++() { return *this; }
discard_iterator& operator++(int) { return *this; }
template<typename T> discard_iterator& operator=(T&&) { return *this; }
};
//iterate over rows of matrix and vector, misusing transform as binary for_each
std::vector<int> sum(data_mat[0].size());
std::transform(data_mat.begin(), data_mat.end(),
data_vec.begin(), discard_iterator(),
[&](const std::vector<int>& row, int vec_value) {
return std::transform(row.begin(), row.end(),
sum.begin(), sum.begin(),
[=](int a, int b) {
return a*vec_value + b;
});
});
EDIT: Although this has already been discussed in comments and I understand (and appreciate) the theoretic nature of the question, I will still include the suggestion that in practice a dynamic array of dynamic arrays is an awful way to represent such a structurally well-defined 2D array like a matrix. A proper matrix data structure (which stores its contents contiguously) with the appropriate operators is nearly always a better choice. But nevertheless, due to their genericity, you can still use the standard library algorithms for working with such a custom data structure (maybe even by letting the matrix type provide its own iterators).
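To illustrate that last point, here is a rough sketch of a contiguous matrix together with the same column dot product. Matrix and columnDots are hypothetical names invented for this example, and row-major storage is an assumption; the inner std::transform is the same one used above, just applied to a slice of the flat storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix: row-major, one contiguous allocation.
struct Matrix {
    std::size_t rows, cols;
    std::vector<int> data;  // element (r, c) lives at data[r * cols + c]

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
    int& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    int operator()(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// out[c] = sum over r of m(r, c) * v[r]
std::vector<int> columnDots(const Matrix& m, const std::vector<int>& v)
{
    assert(v.size() == m.rows);
    std::vector<int> out(m.cols, 0);
    for (std::size_t r = 0; r < m.rows; ++r) {
        const int scale = v[r];
        // Iterators delimiting row r inside the flat storage.
        const auto rowBegin = m.data.begin() + static_cast<std::ptrdiff_t>(r * m.cols);
        const auto rowEnd = rowBegin + static_cast<std::ptrdiff_t>(m.cols);
        // Add the scaled row to the running sums.
        std::transform(rowBegin, rowEnd, out.begin(), out.begin(),
                       [scale](int a, int b) { return a * scale + b; });
    }
    return out;
}
```

Because the row index is now an ordinary loop variable, there is no need for the extra manually advanced iterator or the discard iterator.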

Template trick to optimize out allocations

I have:
struct DoubleVec {
std::vector<double> data;
};
DoubleVec operator+(const DoubleVec& lhs, const DoubleVec& rhs) {
DoubleVec ans(lhs.size());
for(int i = 0; i < lhs.size(); ++i) {
ans[i] = lhs[i] + rhs[i]; // assume lhs.size() == rhs.size()
}
return ans;
}
DoubleVec someFunc(DoubleVec a, DoubleVec b, DoubleVec c, DoubleVec d) {
DoubleVec ans = a + b + c + d;
}
Now, in the above, the "a + b + c + d" will cause the creation of 3 temporary DoubleVec's -- is there a way to optimize this away with some type of template magic ... i.e. to optimize it down to something equivalent to:
DoubleVec ans(a.size());
for(int i = 0; i < ans.size(); i++) ans[i] = a[i] + b[i] + c[i] + d[i];
You can assume all DoubleVec's have the same # of elements.
The high-level idea is to do some type of templated magic on "+" which "delays the computation" until the =, at which point it looks into itself, goes hmm ... I'm just adding these numbers, and synthesizes a[i] + b[i] + c[i] + d[i] ... instead of all the temporaries.
Thanks!
Yep, that's exactly what expression templates (see http://www.drdobbs.com/184401627 or http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Expression-template for example) are for.
The idea is to make operator+ return some kind of proxy object which represents the expression tree to be evaluated. Then operator= is written to take such an expression tree and evaluate it all at once, avoiding the creation of temporaries, and applying any other optimizations that may be applicable.
Have a look at Boost.Proto, which is a library for writing EDSL (embedded domain specific languages) directly in C++. There is even an example showing exactly what you need.
http://codeidol.com/cpp/cpp-template-metaprogramming/Domain-Specific-Embedded-Languages/-10.5.-Blitz-and-Expression-Templates/
If we had to boil the problem solved by Blitz++ down to a single sentence, we'd say, "A naive implementation of array math is horribly inefficient for any interesting computation." To see what we mean, take the boring statement
x = a + b + c;
The problem here is that the operator+ signature above is just too greedy: It tries to evaluate a + b just as soon as it can, rather than waiting until the whole expression, including the addition of c, is available.
In the expression's parse tree, evaluation starts at the leaves and proceeds upwards to the root. What's needed here is some way of delaying evaluation until the library has all of the expression's parts: that is, until the assignment operator is executed. The stratagem taken by Blitz++ is to build a replica of the compiler's parse tree for the whole expression, allowing it to manage evaluation from the top down.
This can't be any ordinary parse tree, though: Since array expressions may involve other operations like multiplication, which require their own evaluation strategies, and since expressions can be arbitrarily large and nested, a parse tree built with nodes and pointers would have to be traversed at runtime by the Blitz++ evaluation engine to discover its structure, thereby limiting performance. Furthermore, Blitz++ would have to use some kind of runtime dispatching to handle the different combinations of operation types, again limiting performance.
Instead, Blitz++ builds a compile-time parse tree out of expression templates. Here's how it works in a nutshell: Instead of returning a newly computed Array, operators just package up references to their arguments in an Expression instance, labeled with the operation:
// operation tags
struct plus; struct minus;
// expression tree node
template <class L, class OpTag, class R>
struct Expression
{
Expression(L const& l, R const& r)
: l(l), r(r) {}
float operator[](unsigned index) const;
L const& l;
R const& r;
};
// addition operator
template <class L, class R>
Expression<L,plus,R> operator+(L const& l, R const& r)
{
return Expression<L,plus,R>(l, r);
}
Notice that when we write a + b, we still have all the information needed to do the computation: it's encoded in the type Expression<Array,plus,Array>, and the data is accessible through the expression's stored references. When we write a + b + c, we get a result of type:
Expression<Expression<Array,plus,Array>,plus,Array>
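The excerpt stops before showing how such a tree gets evaluated. The following is a guess at one way to complete it, not the actual Blitz++ code: the static apply() functions on the tags, the Array stand-in and its operator= are all assumptions added for illustration. Everything sits in a namespace because the unconstrained operator templates would otherwise match far too much.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>
#include <vector>

namespace sketch {

// Operation tags, given a static apply() here (an assumption).
struct plus  { static float apply(float a, float b) { return a + b; } };
struct minus { static float apply(float a, float b) { return a - b; } };

// Expression tree node, as in the excerpt, with operator[] filled in.
template <class L, class OpTag, class R>
struct Expression {
    Expression(L const& l, R const& r) : l(l), r(r) {}
    // Evaluate one element by recursing into both operands.
    float operator[](unsigned index) const { return OpTag::apply(l[index], r[index]); }
    L const& l;
    R const& r;
};

template <class L, class R>
Expression<L, plus, R> operator+(L const& l, R const& r) { return Expression<L, plus, R>(l, r); }

template <class L, class R>
Expression<L, minus, R> operator-(L const& l, R const& r) { return Expression<L, minus, R>(l, r); }

// Hypothetical stand-in for the library's Array type.
struct Array {
    Array(std::initializer_list<float> init) : data(init) {}
    float operator[](unsigned index) const { return data[index]; }
    std::size_t size() const { return data.size(); }

    // Assignment walks the whole tree once per element; the destination
    // is assumed to already have the right size.
    template <class L, class OpTag, class R>
    Array& operator=(Expression<L, OpTag, R> const& e) {
        for (unsigned i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }

    std::vector<float> data;
};

} // namespace sketch
```

Writing x = a + b + c; with this sketch builds the nested Expression type shown above at compile time and then runs a single loop, with each element computed as (a[i] + b[i]) + c[i].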

c++ how to make two vectors one with data the other points and reads only

In C++ I have two STL vectors, A and V. A holds the data and can change it; V only points to the data, so it can read it but cannot modify it. If these two vectors are inside a class, what is the syntax for:
Variable definitions
Assigning a reference from A into V
get_A() and get_V(): will they return a reference or a pointer?
Also, if I have other normal vectors like A, B, C, and D, am I able to "insert" their references into V so that V can see them all one by one? For clarity, V.size() would be equal to A.size() + B.size() + C.size().
Sorry for the confusion, I think I've asked the question in the wrong way.
The vectors will be declared as
vector<Data> A;
vector<const Data *> V;
(Note that V cannot be a vector of const Data & because references are not Assignable, and vector requires an Assignable template type.)
Assigning a reference from A into V would look like this:
V[i] = &A[i];
I'm not sure what you mean by get_A and get_V. My best guess is that you are referring to what the results of operator[] on A and V are. A[i] returns a reference to a Data, that is a Data&. V[i] technically returns a reference to a const Data pointer, i.e. const Data * &, but effectively you would use it as a pointer, i.e. a const Data *.
Regarding the question about A, B, and C: if they are all vectors of the same type, and they do not change size, you could set up V to contain pointers to the elements in each one of them. But if they do change size, then appending an element to, say, A, after having set V up would mean you would have to insert the pointer to the new element of A into the correct offset of V, which is possible but seems like a hassle. Growing a vector may also reallocate its storage, which invalidates every pointer previously taken into it, so V would have to be rebuilt.
A quick example of setting up such a V would look like this:
vector<Data const *> V;
for (size_t i = 0; i < A.size(); ++i) { V.push_back(&A[i]); }
for (size_t i = 0; i < B.size(); ++i) { V.push_back(&B[i]); }
for (size_t i = 0; i < C.size(); ++i) { V.push_back(&C[i]); }
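If get_A and get_V are meant to be member functions of the enclosing class, one possible shape is sketched below. Holder, Data's value member and rebuildView are hypothetical names chosen for the example; the idea is simply that both getters return references, and that V is rebuilt whenever A's size changes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Data { int value; };  // hypothetical element type

class Holder {               // hypothetical enclosing class
public:
    // Mutable access to the data itself.
    std::vector<Data>& get_A() { return A; }

    // Read-only view: the pointed-to Data cannot be modified through V.
    const std::vector<const Data*>& get_V() const { return V; }

    // Call after A changes size: growth may reallocate A's storage and
    // invalidate every pointer stored in V.
    void rebuildView() {
        V.clear();
        V.reserve(A.size());
        for (const Data& d : A) V.push_back(&d);
    }

private:
    std::vector<Data> A;
    std::vector<const Data*> V;
};
```

Changes made to existing elements through get_A() are visible through get_V() right away, since V stores pointers rather than copies.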
I believe what you are describing is something like a const alias for the data vector. What I am saying is that you need to work with a const reference to vector A.
An example (completely out of the blue, but describes my suggestion and understanding of the situation quite well):
class someClass
{
public:
const std::vector<int> & V() const
{
return A;
}
private:
std::vector<int> A;
};
From what I get from constness, this "protects" A through showing only the const version when someone "accesses" the vector through someClass::V().