I'm trying to implement an operator overload which will allow me to multiply a matrix and a vector. The matrix is itself a vector of vectors. I have produced code which achieves this using for-loops and simple vector indexing; however, I'd like to try to implement it with iterators.
My first step was to produce an operator overload which allows me to 'multiply' vectors. In this sense multiplying is an element-by-element operation. So for two vectors a = {a1,a2,a3} and b = {b1,b2,b3}, the result of a*b is {a1b1, a2b2, a3b3}. This was implemented as follows:
template<typename T>
vector<double> operator*(const vector<T> &v1, const vector<T> &v2) //vector * vector [broken down matrix] Not dot product
{
vector<double> output(v2.size());
int pos = 0;
for(auto &row : v1){
for(auto &col : v2){
output[pos] = col*row;
}
pos++;
}
return output;
}
My idea was to then implement pretty much the following code for the multiplication of a matrix * vector:
template<typename T>
vector<double> operator*(const vector<vector<T> > &v1, const vector<T> &v2) //matrix * vector
{
vector<double> output(v2.size());
int pos = 0;
for(auto &row : v1){
output[pos] = row*v2;
}
pos++;
return output;
}
I think I know my problem. row is not a vector, so when I try to do row*v2, it does not make sense, nor does it use the vector * vector operator overload. So my question is, how can I make the iterator operate over a vector of vectors, allowing the desired multiplication? Perhaps this approach is fundamentally flawed, in which case I welcome any additional help. Thanks
Follow-up question: As seen, these are template functions. Is there a way I can declare the vector 'output' in terms of the typename T, i.e. so that when integer vectors/matrices are passed into the function, integer vectors are returned, and when double vectors/matrices are passed in, double vectors are returned?
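A minimal sketch of one way both operators could be written with standard algorithms, not necessarily what the asker ended up with. It assumes the operands have matching sizes (checked with assert) and that a dot product per row is what matrix * vector should mean. Note that in the range-based for loop above, row is in fact a const vector<T>&, so row*v2 does select the vector * vector overload; the trouble is that it yields a vector, which cannot be stored in a single element of output, and that pos++ sits outside the loop. Declaring the result as std::vector<T> also covers the follow-up question.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Element-wise product {a1*b1, a2*b2, ...}; the result uses the element type T.
template <typename T>
std::vector<T> operator*(const std::vector<T>& a, const std::vector<T>& b)
{
    assert(a.size() == b.size());  // assumption: equal lengths
    std::vector<T> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                   std::multiplies<T>());
    return out;
}

// Matrix * vector: one dot product per matrix row.
template <typename T>
std::vector<T> operator*(const std::vector<std::vector<T>>& m,
                         const std::vector<T>& v)
{
    std::vector<T> out;
    out.reserve(m.size());
    for (const auto& row : m) {
        assert(row.size() == v.size());  // assumption: every row matches v
        out.push_back(std::inner_product(row.begin(), row.end(),
                                         v.begin(), T()));
    }
    return out;
}
```

With int inputs the results are std::vector<int>, and with double inputs they are std::vector<double>, because the return type is spelled in terms of T.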
I have written a function for matrix multiplication where matrices are defined using vectors of vectors containing double values.
vector<vector<double> > mat_mul(vector<vector<double> >A, vector<vector<double> >B){
vector<vector<double> >result(A.size(),vector<double>(B[0].size(),0));
if(A[0].size()==B.size()){
const int N=A[0].size();
for(int i=0;i<A.size();i++){
for(int j=0;j<B[0].size();j++){
for(int k=0;k<A[0].size();k++)
result[i][j]+=A[i][k]*B[k][j];
}
}
}
return result;
}
The code seems to work fine but is very slow for matrices as large as 400x400.
How can we speed this up? I have seen other answers, but they discuss matrix multiplication in general and not how to speed it up for a vector of vectors.
Any help is highly appreciated.
struct matrix:
vector<double>
{
using base = vector<double>;
using size_type=std::pair<std::size_t,std::size_t>;
// dims is {columns, rows}; new elements are initialized to v
void resize(size_type const& dims, double const v=0){
rows = dims.second;
base::resize(dims.first * rows, v);
}
size_type size() const { return { cols(), rows }; }
// idx is {column, row}; elements of one row are stored contiguously
double& operator[](size_type const& idx) {
base& vec=*this;
return vec[idx.first + idx.second * cols()];
}
double operator[](size_type const& idx) const {
base const& vec=*this;
return vec[idx.first + idx.second * cols()];
}
private:
// guard against division by zero while the matrix is still empty
std::size_t cols() const { return rows ? base::size() / rows : 0; }
std::size_t rows = 0;
};
///...
auto const size = std::tuple_cat(A.size(), std::make_tuple(B.size().second));
matrix result;
result.resize({get<0>(size), get<2>(size)});
for(auto i = 0; i < get<0>(size); ++i)
for(auto j = 0; j < get<1>(size); ++j)
for(auto k = 0; k < get<2>(size); ++k)
result[{i,k}] += A[{i,j}] * B[{j,k}];
I just skipped lots of details, such as the non-default constructors which are needed if you want a pretty initialization syntax. Moreover, as a matrix, this type will need lots of arithmetic operators.
Another approach would be a type-erased 2D raw array, but that would require defining an assignment operator, as well as copy and move constructors. So this std::vector-based solution seems to be the simplest implementation.
Also, if the dimensions are fixed at compile-time, a template alias can do:
template<typename T, std::size_t cols, std::size_t rows>
using array_2d = std::array<std::array<T, rows>, cols>;
array_2d<double, icount, kcount> result{};
You are using the naive algorithm for matrix multiplication. It’s extremely cache-unfriendly, and you are hit by the full memory latency.
First, split up the operations so that your inner loop repeatedly accesses data that fits into your cache.
Second, calculate four sums simultaneously to avoid penalties for latency.
Third, use FMA (fused multiply-add) instructions, which calculate a product and a sum in the same time as a product alone.
Fourth, use vector registers.
Fifth, use multiple threads.
Or just use a package like LINPACK that does these optimisations for you.
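As a small illustration of the first point only, here is a sketch that stores the matrices contiguously and swaps the two inner loops so that the innermost loop walks along rows of B and C instead of down a column of B. This is plain loop interchange, not the full cache blocking described above, and it does nothing about FMA, vector registers or threads. The function name mat_mul_flat and the dimension parameters n, m and p are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: C = A * B, where A is n x m and B is m x p,
// all stored row-major in flat vectors.
std::vector<double> mat_mul_flat(const std::vector<double>& A,
                                 const std::vector<double>& B,
                                 std::size_t n, std::size_t m, std::size_t p)
{
    assert(A.size() == n * m && B.size() == m * p);
    std::vector<double> C(n * p, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < m; ++k) {
            const double a = A[i * m + k];  // reused for the whole inner loop
            for (std::size_t j = 0; j < p; ++j) {
                // consecutive j touch consecutive memory in both B and C
                C[i * p + j] += a * B[k * p + j];
            }
        }
    }
    return C;
}
```

Whether this is fast enough for 400x400 matrices depends on the compiler and machine, so it is worth measuring against the original before going further.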
I want to multiply and divide all the elements of std::vector by constant in the same way as it is performed in C++ for ordinary types: at least the result should be integer when input vector has integer type and floating-point type otherwise.
I have found the code for multiplication based on std::multiplies and modified it by replacing that with std::divides. As a result, the code works, but not in the way I want:
#include <iostream>
#include <vector>
#include <algorithm>
// std::vector multiplication by constant
// http://codereview.stackexchange.com/questions/77546
template <class T, class Q>
std::vector <T> operator*(const Q c, const std::vector<T> &A) {
std::vector <T> R(A.size());
std::transform(A.begin(), A.end(), R.begin(),
std::bind1st(std::multiplies<T>(),c));
return R;
}
// My modification for division. There should be integer division
template <class T, class Q>
std::vector <T> operator/(const std::vector<T> &A, const Q c) {
std::vector <T> R(A.size());
std::transform(A.begin(), A.end(), R.begin(),
std::bind1st(std::divides<T>(),c));
return R;
}
int main() {
std::vector<size_t> vec;
vec.push_back(100);
int d = 50;
std::vector<size_t> vec2 = d*vec;
std::vector<size_t> vec3 = vec/d;
std::cout<<vec[0]<<" "<<vec2[0]<<" "<<vec3[0]<<std::endl;
// The result is:
// 100 5000 0
size_t check = vec[0]/50;
std::cout<<check<<std::endl;
// Here the result is 2
// But
std::vector<double> vec_d;
vec_d.push_back(100.0);
vec_d = vec_d/50;
std::cout<<vec_d[0]<<std::endl;
// And here the result is 0.5
return 0;
}
How can I write my operator correctly? I thought that std::bind1st would call division by c for each element, but it does the opposite somehow.
EDIT: I understand that I can write a loop, but I want to do a lot of divisions for big numbers, so I wanted it to be faster...
Using std::transform with C++11, I'd suggest making a lambda (see this tutorial) instead of using bind:
std::transform(A.begin(), A.end(), R.begin(), [c](T val) {
return val / c;
});
In my opinion, lambdas are almost always more readable than binding, especially when (like in your case) you're not binding all of the function's parameters.
Although if you're worried about performance, a raw for loop might be slightly faster, as there's no overhead of the function call and creating the lambda object.
According to Dietmar Kühl:
std::transform() may do a bit of "magic" and actually perform better than a loop. For example, the implementation may choose to vectorize the loop when it notices that it is used on a contiguous sequence of integers. It is, however, rather unlikely to be slower than the loop.
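To spell out why the original gave 0 and 0.5: std::bind1st fixes the first argument of the functor, so std::bind1st(std::divides<T>(), c) computes c / x for each element x (50 / 100 and 50 / 100.0), not x / c. std::bind2nd would fix the divisor instead, but both binders were deprecated in C++11 and removed in C++17. A sketch of the whole operator written with the lambda suggested above, keeping the question's template parameters T and Q:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Divides every element of A by c; the element type T is preserved,
// so integer vectors get integer division.
template <class T, class Q>
std::vector<T> operator/(const std::vector<T>& A, const Q c)
{
    std::vector<T> R(A.size());
    std::transform(A.begin(), A.end(), R.begin(),
                   [c](const T& val) { return val / c; });
    return R;
}
```

With this version a std::vector<size_t> holding 100 divided by 50 gives 2, and a std::vector<double> holding 100.0 divided by 50 gives 2.0.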
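auto c_inverse = 1.0 / c; // plain 1/c would be integer division (0 for an integer c with |c| > 1)
std::transform(A.begin(), A.end(), R.begin(), [c_inverse](T val) {
return val * c_inverse;
});
Similar to the other post, but it should be mentioned that for floating-point elements you may see performance gains by multiplying by the inverse rather than dividing. This does not reproduce integer division, so it is not suitable when T is an integer type, and even for floating-point values the result can differ from a true division in the last bit.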
Why make it only for vectors? Here's a way to make more generic, to work with many types of containers:
template <class container, class Q>
container operator/(const container& A, const Q c) {
container R;
std::transform(std::cbegin(A), std::cend(A), std::back_inserter(R),
[c](const auto& val) {return val / c; });
return R;
}
Sure, it is expected to be a bit slower than with pre-allocation for a vector, since the back_inserter will allocate dynamically as it grows, but well, sometimes it might be appropriate to trade speed for genericity.
I have
vector < vector < int > > data_mat ( 3, vector < int > (4) );
vector < int > data_vec ( 3 );
where data_mat can be thought of as a matrix and data_vec as a column vector, and I'm looking for a way to compute the inner product of every column of data_mat with data_vec, and store it in another vector < int > data_out (4).
The example http://liveworkspace.org/code/2bW3X5%241 using for_each and transform, can be used to compute column sums of a matrix:
sum=vector<int> (data_mat[0].size());
for_each(data_mat.begin(), data_mat.end(),
[&](const std::vector<int>& c) {
std::transform(c.begin(), c.end(), sum.begin(), sum.begin(),
[](int d1, double d2)
{ return d1 + d2; }
);
}
);
Is it possible, in a similar way (or in a slightly different way that uses STL functions), to compute column dot products of matrix columns with a vector?
The problem is that the 'd2 = d1 + d2' trick does not work here in the column inner product case -- if there is a way to include a d3 as well that would solve it ( d3 = d3 + d1 * d2 ) but ternary functions do not seem to exist in transform.
In fact you can use your existing column sum approach nearly one to one. You don't need a ternary std::transform as inner loop because the factor you scale the matrix rows with before summing them up is constant for each row, since it is the row value from the column vector and that iterates together with the matrix rows and thus the outer std::for_each.
So what we need to do is iterate over the rows of the matrix and multiply each complete row by the corresponding value in the column vector and add that scaled row to the sum vector. But unfortunately for this we would need a std::for_each function that simultaneously iterates over two ranges, the rows of the matrix and the rows of the column vector. To achieve this, we could use the usual unary std::for_each and just do the iteration over the column vector manually, using an additional iterator:
std::vector<int> sum(data_mat[0].size());
auto vec_iter = data_vec.begin();
std::for_each(data_mat.begin(), data_mat.end(),
[&](const std::vector<int>& row) {
int vec_value = *vec_iter++; //manually advance vector row
std::transform(row.begin(), row.end(), sum.begin(), sum.begin(),
[=](int a, int b) { return a*vec_value + b; });
});
The additional manual iteration inside the std::for_each isn't really that idiomatic use of the standard library algorithms, but unfortunately there is no binary std::for_each we could use.
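For reference, the same idea wrapped in a function, as a sketch. The function name column_dots is made up for this example, and the outer std::for_each is written as a range-based for loop, which does the same iteration. It assumes the matrix is non-empty, has one row per vector entry, and that all rows have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: dot product of every column of mat with vec.
std::vector<int> column_dots(const std::vector<std::vector<int>>& mat,
                             const std::vector<int>& vec)
{
    assert(!mat.empty() && mat.size() == vec.size());
    std::vector<int> sum(mat[0].size(), 0);
    auto vec_iter = vec.begin();
    for (const auto& row : mat) {
        const int vec_value = *vec_iter++;  // entry belonging to this row
        // sum[j] = row[j] * vec_value + sum[j] for every column j
        std::transform(row.begin(), row.end(), sum.begin(), sum.begin(),
                       [vec_value](int a, int b) { return a * vec_value + b; });
    }
    return sum;
}
```

For the 3x4 matrix {{1,2,3,4},{5,6,7,8},{9,10,11,12}} and the vector {1,2,3}, the first column gives 1*1 + 5*2 + 9*3 = 38, and the full result is {38, 44, 50, 56}.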
Another option would be to use std::transform as outer loop (which can iterate over two ranges), but we don't really compute a single value in each outer iteration to return, so we would have to just return some dummy value from the outer lambda and throw it away by using some kind of dummy output iterator. That wouldn't be the cleanest solution either:
//output iterator that just discards any output
struct discard_iterator : std::iterator<std::output_iterator_tag,
void, void, void, void>
{
discard_iterator& operator*() { return *this; }
discard_iterator& operator++() { return *this; }
discard_iterator& operator++(int) { return *this; }
template<typename T> discard_iterator& operator=(T&&) { return *this; }
};
//iterate over rows of matrix and vector, misusing transform as binary for_each
std::vector<int> sum(data_mat[0].size());
std::transform(data_mat.begin(), data_mat.end(),
data_vec.begin(), discard_iterator(),
[&](const std::vector<int>& row, int vec_value) {
return std::transform(row.begin(), row.end(),
sum.begin(), sum.begin(),
[=](int a, int b) {
return a*vec_value + b;
});
});
EDIT: Although this has already been discussed in the comments and I understand (and appreciate) the theoretical nature of the question, I will still include the suggestion that in practice a dynamic array of dynamic arrays is an awful way to represent such a structurally well-defined 2D array as a matrix. A proper matrix data structure (which stores its contents contiguously) with the appropriate operators is nearly always a better choice. Nevertheless, because the standard library algorithms are generic, you can still use them with such a custom data structure (maybe even by letting the matrix type provide its own iterators).
I am relatively new to C++ and still confused how to pass and return arrays as arguments. I would like to write a simple matrix-vector-product c = A * b function, with a signature like
times(A, b, c, m, n)
where A is a two-dimensional array, b is the input array, c is the result array, and m and n are the dimensions of A. I want to specify array dimensions through m and n, not through A.
The body of the (parallel) function is
int i, j;
double sum;
#pragma omp parallel for default(none) private(i, j, sum) shared(m, n, A, b, c)
for (i = 0; i < m; ++i) {
sum = 0.0;
for (j = 0; j < n; j++) {
sum += A[i][j] * b[j];
}
c[i] = sum;
}
What is the correct signature for a function like this?
Now suppose I want to create the result array c in the function and return it. How can I do this?
So instead of the "you should rather" answer (which I will leave up, because you really should rather!), here is the "what you asked for" answer.
I would use std::vector to hold your array data (because they have O(1) move capabilities) rather than a std::array (which saves you an indirection, but costs more to move around). std::vector is the C++ "improvement" of a malloc'd (and realloc'd) buffer, while std::array is the C++ "improvement" of a char foo[27]; style buffer.
std::vector<double> times(std::vector<double> const& A, std::vector<double> const& b, size_t m, size_t n)
{
std::vector<double> c;
Assert(A.size() == m*n); // comparison, not assignment
c.resize(m); // one result entry per row of A
// .. your code goes in here.
// Instead of A[x][y], do A[x*n+y] or A[y*m+x] depending on if you want row-major or
// column-major order in memory.
return c; // O(1): the std::vector is moved (or the copy elided) out of this function
}
You'll note I changed the signature slightly, so that it returns the std::vector instead of taking it as a parameter. I did this because I can, and it looks prettier!
If you really must pass c in to the function, pass it in as a std::vector<double>& -- a reference to a std::vector.
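Filled in, the function might look like the sketch below. It assumes A is stored row-major in a flat std::vector, so the element written A[i][j] in the question lives at A[i*n + j], and it uses the standard assert in place of the Assert used in this answer. The OpenMP pragma from the question is left out to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// c = A * b, where A has m rows and n columns (row-major, flat storage).
std::vector<double> times(const std::vector<double>& A,
                          const std::vector<double>& b,
                          std::size_t m, std::size_t n)
{
    assert(A.size() == m * n);
    assert(b.size() == n);
    std::vector<double> c(m);
    for (std::size_t i = 0; i < m; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            sum += A[i * n + j] * b[j];
        }
        c[i] = sum;
    }
    return c;  // moved or elided, no copy of the buffer
}
```

For example, with the 2x3 matrix {1,2,3,4,5,6} and b = {1,1,1}, the result is {6, 15}.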
This is the answer you should use... So a good way to solve this one involves creating a struct or class to wrap your array (well, buffer of data -- I'd use a std::vector). And instead of a signature like times(A, b, c, m, n), go with this kind of syntax:
Matrix<4,4> M;
ColumnMatrix<4> V;
ColumnMatrix<4> C = M*V;
where the width/height of M are in the <4,4> numbers.
A quick sketch of the Matrix class might be (somewhat incomplete -- no const access, for example)
template<size_t rows, size_t columns>
class Matrix
{
private:
std::vector<double> values;
public:
struct ColumnSlice
{
Matrix<rows,columns>* matrix;
size_t row_number;
double& operator[](size_t column) const
{
size_t index = row_number * columns + column;
Assert(matrix && index < matrix->values.size());
return matrix->values[index];
}
ColumnSlice( Matrix<rows,columns>* matrix_, size_t row_number_ ):
matrix(matrix_), row_number(row_number_)
{}
};
ColumnSlice operator[](size_t row)
{
Assert(row < rows); // note: zero based indexes
return ColumnSlice(this, row);
}
Matrix() {values.resize(rows*columns);}
template<size_t other_columns>
Matrix<rows, other_columns> operator*( Matrix<columns, other_columns> const& other ) const
{
Matrix<rows, other_columns> retval;
// TODO: matrix multiplication code goes here
return retval; // moved or elided automatically; std::move is not needed here
}
};
template<size_t rows>
using ColumnMatrix = Matrix< rows, 1 >;
template<size_t columns>
using RowMatrix = Matrix< 1, columns >;
The above uses C++11 features (such as template aliases) that your compiler might not have, and can be done without these features.
The point of all of this? You can have math that both looks like math and does the right thing in C++, while being really darn efficient, and that is the "proper" C++ way to do it.
You can also program in a C-like way using some features of C++ (like std::vector to handle array memory management) if you are more used to it. But that is a different answer to this question. :)
(Note: code above has not been compiled, nor is it a complete Matrix implementation. There are template based Matrix implementations in the wild you can find, however.)
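To give an idea of how the multiplication left as a TODO could be filled in, here is a much smaller sketch of the same compile-time-dimension idea. It is not the Matrix class above: the name SmallMatrix is made up, it stores its values in a std::array rather than a std::vector, and it exposes operator()(row, column) instead of the slice-based operator[].

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size matrix with row-major storage, zero-initialized.
template <std::size_t Rows, std::size_t Cols>
struct SmallMatrix {
    std::array<double, Rows * Cols> values{};
    double& operator()(std::size_t r, std::size_t c) { return values[r * Cols + c]; }
    double operator()(std::size_t r, std::size_t c) const { return values[r * Cols + c]; }
};

// (R x K) * (K x C) -> (R x C); mismatched inner dimensions fail to compile.
template <std::size_t R, std::size_t K, std::size_t C>
SmallMatrix<R, C> operator*(const SmallMatrix<R, K>& a, const SmallMatrix<K, C>& b)
{
    SmallMatrix<R, C> out;
    for (std::size_t i = 0; i < R; ++i)
        for (std::size_t k = 0; k < K; ++k)
            for (std::size_t j = 0; j < C; ++j)
                out(i, j) += a(i, k) * b(k, j);
    return out;
}
```

Multiplying a SmallMatrix<4,4> by a SmallMatrix<4,1> then yields a SmallMatrix<4,1>, mirroring the M*V example above, and an attempt to multiply incompatible shapes is rejected at compile time.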
Normal vector-matrix multiplication is as follows:
friend Vector operator*(const Vector &v, const Matrix &m);
But if you want to pass the dimensions separately, it's as follows:
friend Vector mul(const Vector &v, const Matrix &m, int size_x, int size_y);
Since the Vector and Matrix would be 1d and 2d arrays, they would look like this:
struct Vector { float *array; };
struct Matrix { float *matrix; };
I have four std::vector containers that all might (or might not) contain elements. I want to determine which of them has the most elements and use it subsequently.
I tried to create a std::map with their respective sizes as keys and references to those containers as values. Then I applied std::max on the size() of each vector to figure out the maximum and accessed it through the std::map.
Obviously, this gets me into trouble once there is the same number of elements in at least two vectors.
Can anyone think of an elegant solution?
You're severely overthinking this. You've only got four vectors. You can determine the largest vector using 3 comparisons. Just do that:
std::vector<blah>& max = vector1;
if (max.size() < vector2.size()) max = vector2;
if (max.size() < vector3.size()) max = vector3;
if (max.size() < vector4.size()) max = vector4;
EDIT:
Now with pointers!
EDIT (280Z28):
Now with references! :)
EDIT:
The version with references won't work. Pavel Minaev explains it nicely in the comments:
That's correct, the code uses references. The first line, which declares max, doesn't cause a copy. However, all following lines do cause a copy, because when you write max = vectorN, if max is a reference, it doesn't cause the reference to refer to a different vector (a reference cannot be changed to refer to a different object once initialized). Instead, it is the same as max.operator=(vectorN), which simply causes vector1 to be cleared and replaced by elements contained in vectorN, copying them.
The pointer version is likely your best bet: it's quick, low-cost, and simple.
std::vector<blah> * max = &vector1;
if (max->size() < vector2.size()) max = &vector2;
if (max->size() < vector3.size()) max = &vector3;
if (max->size() < vector4.size()) max = &vector4;
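The difference between the two versions can be checked with a small sketch. The function names below are made up for the example; the first shows the reference pitfall described in the comment, the second wraps the pointer version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pitfall: assigning through a reference overwrites the vector it refers to.
inline std::size_t size_of_first_after_reference_assign()
{
    std::vector<int> v1{1}, v2{1, 2, 3};
    std::vector<int>& max = v1;
    if (max.size() < v2.size()) max = v2;  // copies v2's elements into v1
    return v1.size();                      // v1 now has three elements
}

// Pointer version: only the pointer is reseated, nothing is copied.
template <typename T>
std::vector<T>* largest(std::vector<T>& a, std::vector<T>& b,
                        std::vector<T>& c, std::vector<T>& d)
{
    std::vector<T>* max = &a;
    if (max->size() < b.size()) max = &b;
    if (max->size() < c.size()) max = &c;
    if (max->size() < d.size()) max = &d;
    return max;
}
```

Because the comparisons use <, the earliest vector wins when several share the largest size.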
Here's one solution (aside from Pesto's far-too-straightforward approach) - I've avoided bind and C++0x lambdas for explanatory purposes, but you could use them to remove the need for a separate function. I'm also assuming that with two vectors with an equal number of elements, which one is picked is irrelevant.
template <typename T> bool size_less (const T* lhs, const T* rhs) {
return lhs->size() < rhs ->size();
}
void foo () {
vector<T>* vecs[] = {&vec1, &vec2, &vec3, &vec4};
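// max_element (not min_element) finds the largest; it returns an iterator to a pointer, hence the double dereference
vector<T>& vec = **std::max_element(vecs, vecs + 4, size_less<vector<T> >);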
}
Here is my very simple method. Its only merit is that you just need basic C++ to understand it.
vector<T>* v[] = {&v1, &v2, &v3, &v4}, *max=&v1;
for(int i=1; i < 4; ++i)
if (v[i]->size() > max->size()) max = v[i];
This is a modified version of coppro's answer using a std::vector to reference any number of vectors for comparison.
template <typename T> bool size_less (const T* lhs, const T* rhs) {
return lhs->size() < rhs ->size();
}
void foo () {
// Define vector holding pointers to the original vectors
typedef vector< vector<T>* > VectorPointers;
// Fill the list
VectorPointers vecs;
vecs.push_back(&vec1);
vecs.push_back(&vec2);
vecs.push_back(&vec3);
vecs.push_back(&vec4);
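// max_element (not min_element) finds the largest; it returns an iterator to a pointer, hence the double dereference
vector<T>& vec = **std::max_element(
vecs.begin(),
vecs.end(),
size_less<vector<T> >
);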
}
I'm all for over-thinking stuff :)
For the general problem of finding the highest/lowest element in a group, I would use a priority_queue with a comparator:
(copying shamelessly from coppro, and modifying...)
template <typename T> bool size_less (const T* lhs, const T* rhs)
{
return lhs->size() < rhs ->size();
}
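vector<T>* highest()
{
// the element type, the underlying container and the comparator type all have to be spelled out
priority_queue<vector<T>*, vector<vector<T>*>, bool (*)(const vector<T>*, const vector<T>*)> myQueue(size_less<vector<T> >);
...
...
return myQueue.top();
}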
You could use a std::multimap. That allows multiple entries with the same key.
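A sketch of what that could look like, keeping the asker's idea of sizes as keys. The function name largest_by_multimap is an assumption for the example. Since the map is ordered by key, the last entry holds a vector with the largest size.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper: keys are sizes, values point at the original vectors.
template <typename T>
std::vector<T>* largest_by_multimap(std::vector<T>& a, std::vector<T>& b,
                                    std::vector<T>& c, std::vector<T>& d)
{
    std::multimap<std::size_t, std::vector<T>*> by_size;
    by_size.emplace(a.size(), &a);  // equal sizes are kept as separate entries
    by_size.emplace(b.size(), &b);
    by_size.emplace(c.size(), &c);
    by_size.emplace(d.size(), &d);
    return by_size.rbegin()->second;  // entry with the largest key
}
```

For four vectors this allocates a node per entry, so the three comparisons shown earlier are cheaper; the multimap mainly helps if the sorted order of all sizes is needed as well.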