I am trying to replace an entire row of a 2d array with another vector.
My code is currently as follows:
#include <stdio.h>
int main(){
int imax = 5;
int jmax = 5;
double x[imax][jmax] = {0.0};
double a[imax] = {1,2,3,4,5};
}
In other words, my x is now a 5x5 matrix. How do I add/append/rewrite the 1st row of x with my a vector?
Thanks
One way to copy the row "without a loop" is the std::copy standard library algorithm.
std::copy(a, a + imax, x[0]); // x[0] is the first row
The algorithm contains the loop. Depending on the implementation this might emit a single call to memcpy or memmove instead.
imax and jmax should be const to make that code legal. Anyways, one obvious possibility is to copy elements one by one like this:
for ( int j = 0; j < jmax; j++ ) {
x[row][j] = a[j];
}
Another way is to use memcpy. That should be faster in normal circumstances; however, you rely on the assumption that the square bracket [] operator was not overloaded, and sizeof(a) only gives the right byte count while a is a real array rather than a pointer. Also, you can only overwrite a row this way, not a column, so be careful when and where you use it.
memcpy( x[row], a, sizeof(a) );
('row' is your variable where you put the index of the row you want to replace)
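Putting the two answers together, here is a minimal sketch of a row replacement for a fixed-size array. The helper name replace_row is hypothetical (it is not from the question), and it assumes the dimensions are compile-time constants, which is also what makes the array declarations legal in standard C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: overwrite row `row` of a Rows x Cols array with the
// Cols elements of `a`. Both sizes are deduced from the array types, so a
// vector of the wrong length is rejected at compile time.
template <std::size_t Rows, std::size_t Cols>
void replace_row(double (&x)[Rows][Cols], std::size_t row, const double (&a)[Cols]) {
    assert(row < Rows);              // guard against writing past the matrix
    std::copy(a, a + Cols, x[row]);  // x[row] decays to a pointer to the row's first element
}
```

With `double x[5][5] = {};` and `const double a[5] = {1, 2, 3, 4, 5};`, the call `replace_row(x, 0, a);` overwrites the first row and leaves the other rows untouched.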
Say I have a vector containing only positive, real elements defined like this:
Eigen::VectorXd v(5);
v << 1.3876, 8.6983, 5.438, 3.9865, 4.5673;
I want to generate a new vector v2 that has repeated the elements in v some k times. Then I want to apply k different functions to each of the repeated elements in the vector.
For example, if v2 was v repeated 2 times and I applied floor() and ceil() as my two functions, the result based on the above vector would be a column vector with values: [1; 2; 8; 9; 5; 6; 3; 4; 4; 5]. Preserving the order of the original values is important here as well. These values are also a simplified example, in practice, I'm generating vectors v with ~100,000 or more elements and would like to make my code as vectorizable as possible.
Since I'm coming to Eigen and C++ from Matlab, the simplest approach I first took was to just convert this Nx1 vector into an Nx2 matrix, apply floor to the first column and ceil to the second column, take the transpose to get a 2xN matrix and then exploit the column-major nature of the matrix and reshape the 2xN matrix into a 2Nx1 vector, yielding the result I want. However, for large vectors, this would be very slow and inefficient.
This response by ggael effectively addresses how I could repeat the elements in the input vector by generating a sequence of indices and indexing the input vector. I could then generate more sequences of indices to apply my functions to the relevant elements of v2 and copy the results back to their respective places. However, is this really the most efficient approach? I don't fully grasp copy-on-write and move semantics, but I think the second set of indexing expressions would be in a sense redundant?
If that is true, then my guess is that a solution here would be some sort of nullary or unary expression where I could define an expression that accepts the vector, some index k and k expressions/functions to apply to each element and spits out the vector I'm looking for. I've read the Eigen documentation on the subject, but I'm struggling to build a functional example. Any help would be appreciated!
So, if I understand you correctly, you don't want to replicate (in terms of Eigen methods) the vector, you want to apply different methods to the same elements and store the result for each, correct?
In this case, computing it sequentially once per function is the easiest route. Most CPUs can only do one (vector) memory store per clock cycle, anyway. So for simple unary or binary operations, your gains have an upper bound.
Still, you are correct that one load is technically always better than two and it is a limitation of Eigen that there is no good way of achieving this.
Know that even if you manually write a loop that generates multiple outputs, you should limit the number of outputs. CPUs have a limited number of line-fill buffers. IIRC Intel recommended using fewer than 10 "output streams" in tight loops, otherwise you could stall the CPU on those.
Another aspect is that C++'s weak aliasing restrictions make it hard for compilers to vectorize code with multiple outputs. So it might even be detrimental.
How I would structure this code
Remember that Eigen is column-major, just like Matlab. Therefore use one column per output function. Or just use separate vectors to begin with.
Eigen::VectorXd v = ...;
Eigen::MatrixX2d out(v.size(), 2);
out.col(0) = v.array().floor();
out.col(1) = v.array().ceil();
Following the KISS principle, this is good enough. You will not gain much if anything by doing something more complicated. A bit of multithreading might gain you something (less than factor 2 I would guess) because a single CPU thread is not enough to max out memory bandwidth but that's about it.
Some benchmarking
This is my baseline:
int main()
{
int rows = 100013, repetitions = 100000;
Eigen::VectorXd v = Eigen::VectorXd::Random(rows);
Eigen::MatrixX2d out(rows, 2);
for(int i = 0; i < repetitions; ++i) {
out.col(0) = v.array().floor();
out.col(1) = v.array().ceil();
}
}
Compiled with gcc-11, -O3 -mavx2 -fno-math-errno I get ca. 5.7 seconds.
Inspecting the assembler code finds good vectorization.
Plain old C++ version:
double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
for(std::ptrdiff_t j = 0; j < rows; ++j) {
const double vj = inarr[j];
outfloor[j] = std::floor(vj);
outceil[j] = std::ceil(vj);
}
40 seconds instead of 5! This version actually does not vectorize because the compiler cannot prove that the arrays don't alias each other.
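If aliasing really is what blocks vectorization here, one commonly used remedy is to promise the compiler that the three arrays do not overlap. The sketch below uses __restrict, which is a compiler extension (accepted by GCC, Clang and MSVC) and not standard C++. The function name floor_ceil is made up for illustration, and I have not measured whether this recovers the vectorized speed, so treat it as something to try rather than a result.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Sketch: one pass over the input, two output streams.
// __restrict is a non-standard extension; it tells the compiler that the
// three pointers never refer to overlapping memory. Calling this with
// overlapping arrays is undefined behaviour.
void floor_ceil(const double* __restrict in,
                double* __restrict out_floor,
                double* __restrict out_ceil,
                std::size_t n) {
    assert(in != nullptr && out_floor != nullptr && out_ceil != nullptr);
    for (std::size_t j = 0; j < n; ++j) {
        const double vj = in[j];       // single load per element
        out_floor[j] = std::floor(vj);
        out_ceil[j] = std::ceil(vj);
    }
}
```

In the benchmark above this would be called with v.data() as input and the two column pointers of out as outputs.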
Next, let's use fixed size Eigen vectors to get the compiler to generate vectorized code:
double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
std::ptrdiff_t j;
for(j = 0; j + 4 <= rows; j += 4) {
const Eigen::Vector4d vj = Eigen::Vector4d::Map(inarr + j);
const auto floorval = vj.array().floor();
const auto ceilval = vj.array().ceil();
Eigen::Vector4d::Map(outfloor + j) = floorval;
Eigen::Vector4d::Map(outceil + j) = ceilval;
}
if(j + 2 <= rows) {
const Eigen::Vector2d vj = Eigen::Vector2d::MapAligned(inarr + j);
const auto floorval = vj.array().floor();
const auto ceilval = vj.array().ceil();
Eigen::Vector2d::Map(outfloor + j) = floorval;
Eigen::Vector2d::Map(outceil + j) = ceilval;
j += 2;
}
if(j < rows) {
const double vj = inarr[j];
outfloor[j] = std::floor(vj);
outceil[j] = std::ceil(vj);
}
7.5 seconds. The assembler looks fine, fully vectorized. I'm not sure why performance is lower. Maybe cache line aliasing?
Last attempt: We don't try to avoid re-reading the vector but we re-read it blockwise so that it will be in cache by the time we read it a second time.
const int blocksize = 64 * 1024 / sizeof(double);
std::ptrdiff_t j;
for(j = 0; j + blocksize <= rows; j += blocksize) {
const auto& vj = v.segment(j, blocksize);
auto outj = out.middleRows(j, blocksize);
outj.col(0) = vj.array().floor();
outj.col(1) = vj.array().ceil();
}
const auto& vj = v.tail(rows - j);
auto outj = out.bottomRows(rows - j);
outj.col(0) = vj.array().floor();
outj.col(1) = vj.array().ceil();
5.4 seconds. So there is some gain here but not nearly enough to justify the added complexity.
vector<vector<int>> matrixReshape(vector<vector<int>>& nums, int r, int c) {
int row = nums.size();
int col = nums[0].size();
vector<vector<int>> newNums;
if((row*col) < (r*c)){
return nums;
}
else{
deque<int> storage;
for(int i = 0; i < row; i++){
for(int k = 0; k < col; k++){
storage.push_back(nums[i][k]);
}
}
for(int j = 0; j < r; j++){
for(int l = 0; l < c; l++){
newNums[j][l] = storage.pop_front();
}
}
}
return newNums;
}
Hey guys, I am getting the error from the title, 'Void value not ignored as it ought to be'. When I looked up the error message, the tips stated "This is a GCC error message that means the return-value of a function is 'void', but that you are trying to assign it to a non-void variable. You aren't allowed to assign void to integers, or any other type." After reading this, I assumed my deque was not being populated; however, I cannot find out why my deque is not being populated. If you guys would like to know the problem I am trying to solve, I will be posting it below. Also, I cannot run this through a debugger since it will not compile :(. Thanks in advance.
In MATLAB, there is a very useful function called 'reshape', which can reshape a matrix into a new one with different size but keep its original data.
You're given a matrix represented by a two-dimensional array, and two positive integers r and c representing the row number and column number of the wanted reshaped matrix, respectively.
The reshaped matrix needs to be filled with all the elements of the original matrix in the same row-traversing order as they were.
If the 'reshape' operation with given parameters is possible and legal, output the new reshaped matrix; Otherwise, output the original matrix.
Example 1:
Input:
nums =
[[1,2],
[3,4]]
r = 1, c = 4
Output:
[[1,2,3,4]]
Explanation:
The row-traversing of nums is [1,2,3,4]. The new reshaped matrix is a 1 * 4 matrix, fill it row by row by using the previous list.
This line has two problems:
newNums[j][l] = storage.pop_front();
First, pop_front() returns void, so it doesn't give you the element that was popped. To get the first element of the deque, use storage[0] (or storage.front()). Then call pop_front() to remove it.
You also can't assign to newNums[j][l], because you haven't allocated those elements of the vectors. You can pre-allocate all the memory by declaring it like this:
vector<vector<int>> newNums(r, vector<int>(c));
So the above line should be replaced with:
newNums[j][l] = storage[0];
storage.pop_front();
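For comparison, here is a sketch of a complete version that skips the deque and computes the target position from a running index instead. It is a variant, not the asker's code: it takes the input by const reference, assumes every row of nums has the same length, and treats the reshape as legal only when the element counts match exactly (the posted code tests with <, which I read as a slip given the problem statement).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: reshape nums into r x c in row-traversing order.
// Returns a copy of nums unchanged when the element counts differ.
std::vector<std::vector<int>> matrixReshape(const std::vector<std::vector<int>>& nums,
                                            int r, int c) {
    assert(r > 0 && c > 0);  // the problem statement promises positive sizes
    const std::size_t rows = nums.size();
    const std::size_t cols = rows == 0 ? 0 : nums[0].size();
    const std::size_t newRows = static_cast<std::size_t>(r);
    const std::size_t newCols = static_cast<std::size_t>(c);
    if (rows * cols != newRows * newCols) {
        return nums;
    }
    // Pre-allocate every element so that out[i][j] is valid to assign to.
    std::vector<std::vector<int>> out(newRows, std::vector<int>(newCols));
    std::size_t k = 0;  // position in the row-traversing order
    for (const auto& row : nums) {
        for (int value : row) {
            out[k / newCols][k % newCols] = value;
            ++k;
        }
    }
    return out;
}
```

The k / newCols and k % newCols pair plays the role of the deque: it maps the k-th element in reading order to its row and column in the new shape.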
How do you fill with 0 a dynamic matrix, in C++? I mean, without:
for(int i=0;i<n;i++)for(int j=0;j<n;j++)a[i][j]=0;
I need it in O(n), not O(n*m) or O(n^2).
Thanks.
For the specific case where your array is going to be large and sparse and you want to zero it at allocation time, you can get some benefit from using calloc. On most platforms this will result in lazy allocation with zero pages, e.g.
int **a = malloc(n * sizeof(a[0])); // allocate row pointers
int *b = calloc(n * n, sizeof(b[0])); // allocate n x n array (zeroed)
a[0] = b; // initialise row pointers
for (int i = 1; i < n; ++i)
{
a[i] = a[i - 1] + n;
}
Note that this is, of course, premature optimisation. It is also C-style coding rather than C++. You should only use this optimisation if you have established that performance is a bottleneck in your application and there is no better solution.
From your code:
for(int i=0;i<n;i++)for(int j=0;j<n;j++)a[i][j]=0;
I assume that your matrix is a two-dimensional array declared as either
int matrix[a][b];
or
int** matrix;
In first case, change this for loop to a single call to memset():
memset(matrix, 0, sizeof(int) * a * b);
In the second case, you will need to do it this way:
for(int n = 0; n < a; ++n)
memset(matrix[n], 0, sizeof(int) * b);
On most platforms, a call to memset() will be replaced with proper compiler intrinsic.
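In C++, a common alternative to both layouts is to keep the whole matrix in one contiguous std::vector and index it manually, so that zeroing becomes a single std::fill call. The FlatMatrix type below is a hypothetical sketch for illustration. Note that this still touches every element, so the work stays proportional to the number of cells; it only removes the per-row loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat matrix: rows * cols ints stored row by row in one block.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;

    // std::vector value-initializes its elements, so the matrix starts zeroed.
    FlatMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}

    int& at(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }

    // Reset every cell with a single call over the contiguous storage.
    void zero() { std::fill(data.begin(), data.end(), 0); }
};
```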
Not every nested loop is O(n^2) in the size of the data.
If n stands for the total number of cells in the matrix, the following code is O(n):
No 1
for(int i=0;i<n;i++)for(int j=0;j<n;j++)a[i][j]=0;
Imagine that you had all of the cells of matrix a copied into a one-dimensional flat array and set all of its elements to zero with just one loop. What would the order be then? Of course you would say that is O(n).
No 2
for(int i=0;i<n*m;i++) b[i]=0;
Now let's compare No 2 with No 1. Ask yourself the following questions:
Does this code traverse matrix a cells more than once?
If I can measure the time will there be a difference?
Both answers are NO.
Both pieces of code are O(n) in the number of cells. A nested loop over a multi-dimensional array visits each cell exactly once, so it is linear in the number of cells.
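The counting argument can be checked directly. The sketch below zeroes a nested matrix and a flat array and returns how many writes each one performed; the function names are made up for illustration, and the nested version assumes all rows have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// No 1: nested loops over a rows x cols matrix. Returns the number of writes.
std::size_t zero_nested(std::vector<std::vector<int>>& a) {
    std::size_t writes = 0;
    for (auto& row : a) {
        for (int& cell : row) {
            cell = 0;
            ++writes;
        }
    }
    // Each cell is written exactly once (assumes a rectangular matrix).
    assert(writes == a.size() * (a.empty() ? 0 : a[0].size()));
    return writes;
}

// No 2: one loop over a flat array holding the same number of cells.
std::size_t zero_flat(std::vector<int>& b) {
    std::size_t writes = 0;
    for (int& cell : b) {
        cell = 0;
        ++writes;
    }
    return writes;
}
```

For a 3 x 4 matrix and a flat array of 12 elements, both functions report 12 writes.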
I have a problem where I want to combine a list of vectors, all of the same type, in a particular fashion. I want the first element of my resultant vector to be the first element of the first vector in my list, the second element should be the first element of the second vector, the third, the first of the third and so on until n where n is length of my list and then element n+1 should be the second element of the first vector. This repeats until finished.
Currently, I am doing it like this:
CharacterVector measure(nrows * expansion);
CharacterVector temp(nrows);
for(int i=0; i < measure.size(); i++){
temp = values[i % expansion];
measure[i] = temp[i / expansion];
}
return(measure);
Where values is the List of CharacterVectors. This seems incredibly inefficient, overwriting temp every single time but I don't know of a better way to access the elements in values. I don't know a lot of C++ but I assume there must be a better way.
Any and all help is greatly appreciated!
EDIT:
All vectors in values are of the same length nrows, and values has expansion elements in it.
What you need is the ListOf<CharacterVector> class. As the name implies, it represents an R list which only contains CharacterVector.
The code below uses it to extract the second element of each character vector from the list. Should not be hard to adapt it to your expansion algorithm, but your example was not reproducible without a bit more context.
#include <Rcpp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
CharacterVector second( ListOf<CharacterVector> values ){
int n = values.size() ;
CharacterVector res(n);
for(int i=0; i<n; i++){
res[i] = values[i][1] ;
}
return res ;
}
Then, you sourceCpp this and try it on some sample data:
> data <- list(letters, letters, LETTERS)
> second(data)
[1] "b" "b" "B"
Now about your assumption:
This seems incredibly inefficient, overwriting temp every single time
Creating a CharacterVector is pretty fast, there is no deep copy of data, so this should not have been an issue in the first place.
You can preconstruct the vector, and you can easily work out at which positions the elements of the first vector should go, i.e. measure[0], measure[n], measure[n*2] etc., where n = mylist.size(). Here I assume of course that each vector in the list has equal size. Untested code:
CharacterVector measure(nrows * expansion);
for(int i=0; i < values.size(); ++i)
{
CharacterVector temp = values[i]; // values[i] is a proxy, so take it by value (no deep copy)
int newPosition = i;
for( int j=0; j < temp.size(); ++j)
{
measure[newPosition ] = temp[j];
newPosition += expansion;
}
}
return(measure);
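The index arithmetic is easier to check outside of Rcpp. The following is a plain standard-library analogue, not Rcpp code: std::vector<std::string> stands in for CharacterVector, and the function name interleave is hypothetical. It assumes, as the question states, that all vectors have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch: element j of vector i ends up at position j * expansion + i,
// so the output cycles through the vectors before moving to the next element.
std::vector<std::string> interleave(const std::vector<std::vector<std::string>>& values) {
    const std::size_t expansion = values.size();
    if (expansion == 0) {
        return {};
    }
    const std::size_t nrows = values[0].size();
    std::vector<std::string> measure(nrows * expansion);
    for (std::size_t i = 0; i < expansion; ++i) {
        assert(values[i].size() == nrows);  // all vectors must have equal length
        for (std::size_t j = 0; j < nrows; ++j) {
            measure[j * expansion + i] = values[i][j];
        }
    }
    return measure;
}
```

Each input vector is visited once and no temporary is reassigned inside the inner loop, which is the same access pattern as the loop above.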
I just got this task of finding out how this code works.
int array[rows][coloums];
int *pointerArray[rows];
for (int i = 0; i < rows; i++) {
pointerArray[i] = array[i];
for (int j = 0; j < coloums; j++) {
*(pointerArray[i] + j) = 0;
}
}
The thing I'm curious about is the *(pointerArray[i] + j). I think it's the same thing as pointerArray[i][j], since you can access the element both ways, but can anyone tell me what is actually happening with the *()? Like, how does the compiler know that I'm asking for the same thing as pointerArray[i][j]?
Thanks for the answers!
When you do pointerArray[i] + j, you request the element pointerArray[i], which is an int*, and advance that pointer by j elements (the result is also an int*). The *(...) simply dereferences the pointer and returns the int at that position. * is called the dereference operator (in this case). So yes, it's equivalent to pointerArray[i][j].
In this context, the * operator is the dereference operator. The pointer it is applied to is the location in memory from which it reads a value.
The parentheses group the addition so that the compiler knows that the result of this addition will be used for the dereference. It's simply a case of order of operations.
Keep in mind that the [] operator does the same thing as the dereference operator: p[j] is defined as *(p + j), and an array decays to a pointer to its first element in most expressions. If you imagine a two-dimensional array as a 2D grid of values with rows and columns, in memory the data is laid out such that each row is strung one after the next in sequential order. The first index in the array (i) along with the type of the array (int) tells the compiler at what offset to look for the first location in the row. The second index in the array (j) tells it at what offset within that row to look.
*(pointerArray[i] + j) basically means: "Find the beginning of the ith row of data in pointerArray, and then pick the jth element of that row, and give me that value."
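The equivalence can be checked by comparing addresses. This sketch rebuilds the code from the question with fixed sizes (kRows and kCols are names chosen here, not from the question) and asserts that both spellings refer to the same int.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 4;

// Fills `array` through *(pointerArray[i] + j) and checks that
// pointerArray[i][j] and array[i][j] name the very same object.
void fill_and_check(int (&array)[kRows][kCols]) {
    int* pointerArray[kRows];
    for (std::size_t i = 0; i < kRows; ++i) {
        pointerArray[i] = array[i];  // array[i] decays to a pointer to row i
        for (std::size_t j = 0; j < kCols; ++j) {
            *(pointerArray[i] + j) = static_cast<int>(i * kCols + j);
            // p[j] is defined as *(p + j), so the addresses must match.
            assert(&pointerArray[i][j] == pointerArray[i] + j);
            assert(&pointerArray[i][j] == &array[i][j]);
        }
    }
}
```

After the call, array[i][j] holds i * kCols + j, which shows that the writes made through the pointer landed in the original 2D array.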