subtracting elements of a vector from each other in CUDA - c++

I have a vector (x_dev) in CUDA which has B elements and is of double type.
I am looking for the best way to subtract the next element from each element and overwrite the vector.
(I do not care about the last element.)
Essentially, the equivalent of this C++ code:
for(int i = 0; i < B-1; i++)
x_dev[i] = x_dev[i] - x_dev[i+1];

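You could use thrust::transform with the placeholders _1 and _2 from the thrust::placeholders namespace. Note that the second input range (starting at x.begin()+1) overlaps the output range when the result is written back into x, so in a parallel transform an element may be overwritten before its neighbour has read it. Writing into a separate vector of the same size (called y here, a name introduced for illustration) avoids this; copy the first B-1 elements back afterwards if the result has to live in x:
using namespace thrust::placeholders;
thrust::transform(x.begin(), x.end()-1, x.begin()+1, y.begin(), _1 - _2);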
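To check a device result, a host-side reference of the same loop can be handy. The sketch below is plain C++ rather than CUDA; the names shifted_difference and max_abs_error are hypothetical helpers and not part of Thrust. It assumes the device vector has already been copied back into a std::vector<double>.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference: out[i] = x[i] - x[i+1] for i in [0, B-1).
// The last element is copied through unchanged.
std::vector<double> shifted_difference(const std::vector<double>& x) {
    std::vector<double> out(x);  // start from a copy so the last element is kept
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        out[i] = x[i] - x[i + 1];
    }
    return out;
}

// Largest absolute difference over the first n elements of two vectors.
double max_abs_error(const std::vector<double>& a,
                     const std::vector<double>& b, std::size_t n) {
    assert(n <= a.size() && n <= b.size());
    double err = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        err = std::max(err, std::fabs(a[i] - b[i]));
    }
    return err;
}
```

Comparing only the first B-1 elements matches the question, which does not care about the last one.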

Related

Replace entire row of a 2D array with another 1d array

I am trying to replace an entire row of a 2d array with another vector.
My code is currently as follows:
#include <stdio.h>
int main(){
int imax = 5;
int jmax = 5;
double x[imax][jmax] = {0.0};
double a[imax] = {1,2,3,4,5};
}
In other words, my x is now a 5x5 matrix. How do I add/append/rewrite the 1st row of x with my a vector?
Thanks
One way to copy the row "without a loop" is the std::copy standard library algorithm.
std::copy(a, a + imax, x[0]); // x[0] is the first row
The algorithm contains the loop. Depending on the implementation this might emit a single call to memcpy or memmove instead.
imax and jmax should be const to make that code legal, because array bounds must be constant expressions in standard C++. Anyway, one obvious possibility is to copy the elements one by one like this:
for ( int j = 0; j < jmax; j++ ) {
x[row][j] = a[j];
}
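Another way is to use memcpy. That should be faster in normal circumstances; however, you rely on x being a plain built-in array, i.e. on the square bracket [] operator not being overloaded. sizeof(a) only gives the number of bytes to copy while a is a real array (not a pointer) with the same length as a row of x. Also, you can only overwrite a row this way, not a column, so be careful when and where you use it.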
memcpy( x[row], a, sizeof(a) );
('row' is your variable where you put the index of the row you want to replace)
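As a minimal sketch of the std::copy approach with fixed-size containers: the version below swaps the built-in arrays for std::array, which is an assumption on my part and not what the question uses. kRows, kCols, Matrix and replace_row are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kRows = 5;  // plays the role of imax
constexpr std::size_t kCols = 5;  // plays the role of jmax

using Matrix = std::array<std::array<double, kCols>, kRows>;

// Overwrite row `row` of x with the contents of a.
void replace_row(Matrix& x, std::size_t row,
                 const std::array<double, kCols>& a) {
    assert(row < kRows);  // guard against an out-of-range row index
    std::copy(a.begin(), a.end(), x[row].begin());
}
```

With std::array the row could also be assigned directly (x[row] = a), since std::array supports copy assignment; std::copy is shown because it carries over unchanged to built-in arrays.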

How to access elements in vectors in Lists more efficiently?

I have a problem where I want to combine a list of vectors, all of the same type, in a particular fashion. I want the first element of my resultant vector to be the first element of the first vector in my list, the second element should be the first element of the second vector, the third, the first of the third and so on until n where n is length of my list and then element n+1 should be the second element of the first vector. This repeats until finished.
Currently, I am doing it like this:
CharacterVector measure(nrows * expansion);
CharacterVector temp(nrows);
for(int i=0; i < measure.size(); i++){
temp = values[i % expansion];
measure[i] = temp[i / expansion];
}
return(measure);
Where values is the List of CharacterVectors. This seems incredibly inefficient, overwriting temp every single time but I don't know of a better way to access the elements in values. I don't know a lot of C++ but I assume there must be a better way.
Any and all help is greatly appreciated!
EDIT:
All vectors in values are of the same length nrows, and values has expansion elements in it.
What you need is the ListOf<CharacterVector> class. As the name implies, it represents an R list which only contains CharacterVector.
The code below uses it to extract the second element of each character vector from the list. Should not be hard to adapt it to your expansion algorithm, but your example was not reproducible without a bit more context.
#include <Rcpp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
CharacterVector second( ListOf<CharacterVector> values ){
int n = values.size() ;
CharacterVector res(n);
for(int i=0; i<n; i++){
res[i] = values[i][1] ;
}
return res ;
}
Then, you sourceCpp this and try it on some sample data:
> data <- list(letters, letters, LETTERS)
> second(data)
[1] "b" "b" "B"
Now about your assumption:
This seems incredibly inefficient, overwriting temp every single time
Creating a CharacterVector is pretty fast because there is no deep copy of the data, so this should not have been an issue in the first place.
You can preconstruct the vector, since you can easily work out at which positions the elements of the first vector should go, i.e. measure[0], measure[n], measure[n*2] etc., where n = mylist.size(). Here I assume, of course, that each vector in the list has equal size. Untested code:
CharacterVector measure(nrows * expansion);
for(int i=0; i < values.size(); ++i)
{
CharacterVector temp = values[i]; // values[i] is a proxy, so bind by value; no deep copy is made
int newPosition = i;
for( int j=0; j < temp.size(); ++j)
{
measure[newPosition ] = temp[j];
newPosition += expansion;
}
}
return(measure);
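The index arithmetic behind this interleaving can be sketched without Rcpp. The version below is plain C++ and uses std::vector<std::string> as a stand-in for CharacterVector and a vector of vectors as a stand-in for the R list; those substitutions and the function name interleave are assumptions for illustration, not Rcpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Interleave equally sized vectors so that
// result[j * expansion + i] == values[i][j], with expansion = values.size().
std::vector<std::string> interleave(
        const std::vector<std::vector<std::string>>& values) {
    if (values.empty()) return {};
    const std::size_t expansion = values.size();
    const std::size_t nrows = values[0].size();
    std::vector<std::string> measure(nrows * expansion);
    for (std::size_t i = 0; i < expansion; ++i) {
        assert(values[i].size() == nrows);  // all vectors must have equal length
        for (std::size_t j = 0; j < nrows; ++j) {
            measure[j * expansion + i] = values[i][j];
        }
    }
    return measure;
}
```

Each inner vector is looked up once per outer iteration, so no temporary is rebuilt for every output element.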

element-wise multiplication of two vectors in c++

I am trying to do the following mathematical operation with two vectors:
v1 = [a1][a2][a3][a4][a5]
v2 = [b1][b2][b3][b4][b5]
Want to compute:
v = [a2*b2][a3*b3][a4*b4][a5*b5]
Note that I did not want the first element in the new vector.
I was wondering if there is a more efficient (one-liner) way to multiply (element-wise) two vectors in c++ than a for-loop (using push back). My current approach is as follows,
for(long i=1;i < v1.size();++i){
v.push_back(v1[i]*v2[i]);
}
I also tried the following,
for (long i = 1; i < v1.size(); ++i){
v[i-1] = v1[i]*v2[i];
}
Any suggestions?
std::transform( v1.begin()+1, v1.end(),
v2.begin()+1, v.begin(), // assumes v1,v2 of same size > 1,
// v one element smaller
std::multiplies<int>() ); // assumes values are 'int'
If v is empty, you can replace v.begin() with std::back_inserter(v); in that case you should reserve() memory up front to avoid multiple allocations.
You could look into std::valarray. It's designed to allow mathematical operations on every element in the array.
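Putting the std::transform, std::back_inserter and reserve() suggestions together might look like the sketch below. It assumes the elements are double rather than int, and the function name multiply_skip_first is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Element-wise product of v1 and v2, skipping the first element of each.
std::vector<double> multiply_skip_first(const std::vector<double>& v1,
                                        const std::vector<double>& v2) {
    assert(v1.size() == v2.size());  // both inputs must have the same length
    std::vector<double> v;
    if (v1.empty()) return v;        // nothing to skip or multiply
    v.reserve(v1.size() - 1);        // one allocation instead of several
    std::transform(v1.begin() + 1, v1.end(), v2.begin() + 1,
                   std::back_inserter(v), std::multiplies<double>());
    return v;
}
```

This still loops internally; the gain over the hand-written loop is mainly brevity and the single up-front allocation.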

ArgMin for vector<double> in C++?

I'd like to find the index of the minimum value in a C++ std::vector<double>. Here's a somewhat verbose implementation of this:
//find index of smallest value in the vector
int argMin(std::vector<double> vec)
{
std::vector<double>::iterator mins = std::min_element(vec.begin(), vec.end()); //returns all mins
double min = mins[0]; //select the zeroth min if multiple mins exist
for(int i=0; i < vec.size(); i++)
{
//Note: could use fabs(min - vec[i]) < 0.01 if worried about floating-point precision
if(vec[i] == min)
return i;
}
return -1;
}
(Let me know if you notice any mistakes in the above implementation. I tested it, but my testing is not at all exhaustive.)
I think the above implementation is probably a wheel-reinvention; I'd like to use built-in code if possible. Is there a one-line call to an STL function for this? Or, can someone suggest a more concise implementation?
You could use the standard min_element function:
std::min_element( vec.begin(), vec.end() );
It returns an iterator to the minimum element in the iterator range (the first one, if several elements are equally small). Since you want an index and you are working with vectors, you can then subtract vec.begin() from the resulting iterator to get that index.
There is an additional overload for a function or function-object if you need a custom comparison.
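A minimal sketch of the resulting one-liner, wrapped in a function for reuse; the name arg_min is hypothetical, and the non-empty precondition is an assumption added here because an empty vector has no valid index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the smallest element; the first one wins if there are ties.
std::size_t arg_min(const std::vector<double>& vec) {
    assert(!vec.empty());  // an empty vector has no minimum
    return static_cast<std::size_t>(
        std::distance(vec.begin(), std::min_element(vec.begin(), vec.end())));
}
```

Taking the vector by const reference also avoids the copy made by the by-value parameter in the question's version.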

C++, fastest STL container for recursively doing { delete[begin], insert[end], and summing entire array contents}

I have some code and an array where each iteration I delete the first element, add an element at the end, and then sum the contents of the array. Naturally, the array stays the same size. I have tried using both vector and list, but both seem pretty slow.
int length = 400;
vector <int> v_int(length, z);
list <int> l_int(length, z);
for(int q=0; q < x; q++)
{
int sum =0;
if(y) //if using vector
{
v_int.erase(v_int.begin()); //takes `length` amount of time to shift memory
v_int.push_back(z);
for(int w=0; w < v_int.size(); w++)
sum += v_int[w];
}
else //if using list
{
l_int.pop_front(); //constant time
l_int.push_back(z);
list<int>::iterator it;
for ( it=l_int.begin() ; it != l_int.end(); it++ ) //seems to take much
sum += *it; //longer than vector does
}
}
The problem is that erasing the first element of the vector requires shifting every other element down, so each iteration costs time proportional to the size of the vector. Using a linked list avoids this (constant-time removal of elements) and should not sacrifice any time summing the array (linear-time traversal), except that in my program summing the list seems to take far longer than summing the vector (at least one order of magnitude longer).
Is there a better container to use here? or a different way to approach the problem?
Why not keep a running sum with sum -= l_int.front(); sum += z?
Also, the data structure you're looking for with that delete/insert performance is a queue.
Efficient additions and deletions of end elements in a container is what the deque was made for.
If you are just inserting at one end and deleting at the other, then you can use a queue.
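The running-sum and deque suggestions combine naturally. The sketch below is one possible way to do it, not the only one; the class name SlidingSum and its interface are hypothetical, and the sum is kept in a long long on the assumption that it might outgrow an int.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Fixed-length window that keeps its sum up to date in O(1) per step.
class SlidingSum {
public:
    SlidingSum(std::size_t length, int initial)
        : window_(length, initial),
          sum_(static_cast<long long>(length) * initial) {
        assert(length > 0);  // push() needs an element to drop
    }

    // Drop the oldest value, append a new one, and return the updated sum.
    long long push(int value) {
        sum_ -= window_.front();
        window_.pop_front();
        window_.push_back(value);
        sum_ += value;
        return sum_;
    }

    long long sum() const { return sum_; }

private:
    std::deque<int> window_;
    long long sum_;
};
```

With the sum maintained incrementally, the per-iteration traversal disappears entirely, so the choice between vector, list and deque matters only for the pop/push cost.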