Sparse matrix without knowing size - C++

Does Eigen support element insertion into a sparse matrix if the size is not known?
I have a stream of data coming in, and I am attempting to store it sparsely, but I don't know the maximum value of the indices (row/column) of the data ahead of time (I can guess, but not guarantee). Looking at Eigen's insert code, there is an assertion (line 1130 of SparseMatrix.h) that the index you wish to insert at lies within rows() and cols().
Do I really need to wait until I have all the data before I can start using Eigen's sparse matrix code? The design I would have to go for then would require me to wait for all the data, then scan to find the maximum index, which is not ideal for my application. I currently don't need the full matrix to start working - a limited one with the currently available data would be fine.
Please don't close this question unless you have an answer, the linked answer was for dense matrices, not sparse ones, which have different internal storage...
I'm also looking for information on the case where the matrix size is not immediately available at run time (rather than at compile time), and only for sparse matrices.

The recommendation is still to store the values into an intermediate triplet container and build the sparse matrix at the end. If you don't want to read the whole stream, then just read the first N triplets, until your desired condition is met, and use setFromTriplets() with the partial list of triplets.
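For reference, the triplet route looks roughly like this (a minimal sketch; i, j, v, rows and cols stand for values taken from your stream):

#include <Eigen/Sparse>
#include <vector>

std::vector<Eigen::Triplet<double>> triplets;
// ... for each (i, j, v) read from the stream, until your stop condition:
triplets.emplace_back(i, j, v);
// build the matrix once; rows/cols must cover the largest indices seen
Eigen::SparseMatrix<double> mat(rows, cols);
mat.setFromTriplets(triplets.begin(), triplets.end());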
But if you still don't want to read the full stream before starting work, you can guess a size for your matrix and grow it with conservativeResize() whenever you read a value that does not fit within the current size:
#include <Eigen/Sparse>
#include <algorithm>
#include <fstream>
#include <iostream>

int main() {
    Eigen::SparseMatrix<double> mat;
    mat.resize(100, 100); // Initial size guess. Could be 1, 10, 1000, etc...
    std::ifstream inputStream("filename.txt");
    Eigen::Index i, j; // position read from the stream
    double v;          // value read from the stream
    while (inputStream >> i >> j >> v) {
        // check the current size of the matrix and make it grow if necessary
        if (i >= mat.rows() || j >= mat.cols())
            mat.conservativeResize(std::max(i + 1, mat.rows()),
                                   std::max(j + 1, mat.cols()));
        // store the value in the matrix
        mat.coeffRef(i, j) = v;
        // insert here your condition to break before reading the whole stream
        if (mat.nonZeros() > 150)
            break;
    }
    // do some clean-up in case you think it is necessary
    mat.makeCompressed();
}

Related

How to create dictionary of set size and append to vector value in loop in C++

I have a video, and I am dividing each frame into equally sized squares. Each video has fixed frame dimensions, so the number of squares per video will not change (but different videos with different frame sizes will, so the code must be dynamic).
I am trying to loop through each frame, then loop through each square, and insert the square index and the square matrix into a dictionary. After the first frame, I want to append each square matrix to the vector value at its corresponding key. My code so far:
// let's assume list_of_squares is a vector that holds the Mat values for each square per frame
// also assuming unordered_map<int, vector<Mat>> fixedSquaredict; is declared in a .hpp file and already exists
for (int i = 0; i < list_of_squares.size(); i++) {
    if (i == 0) {
        fixedSquaredict.insert({i, list_of_squares[i]});
    }
    else {
        fixedSquaredict[i].push_back(list_of_squares[i]);
    }
}
What I am confused about is this line:
fixedSquaredict.insert({i, list_of_squares[i]});
This line initializes the proper number of keys, but how do I insert the Mat value into the vector<Mat> structure for the first time, so I can then push_back to it in subsequent iterations?
I want the result to be something like this:
// assuming list_of_squares.size() == 2 in this example and it loops through 2 frames
fixedSquaredict = ((0, [mat1, mat2]),
                   (1, [mat1, mat2]))
You don't need to do anything, and you don't need the insert call at all. The following will do everything you need:
for (size_t i = 0; i < list_of_squares.size(); ++i) {
    fixedSquaredict[i].push_back(list_of_squares[i]);
}
std::unordered_map::operator[] will default-construct a new value if none with that key exists, so the first time a new value of i is encountered it will default-construct a new std::vector<Mat>, which you can then append the first value to.
Side note, using std::unordered_map<int, SomeType> for a contiguous sequence of keys is a bit odd. You've essentially created a less efficient version of std::vector at that point.
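With contiguous indices, the same logic with std::vector is just this (a sketch under the same assumptions about list_of_squares and Mat):

std::vector<std::vector<Mat>> squares(list_of_squares.size());
for (size_t i = 0; i < list_of_squares.size(); ++i) {
    squares[i].push_back(list_of_squares[i]);
}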

How to read some elements of a matrix from disk C++

I have code where I'm reading a 1024x1024 float matrix from disk, then getting some elements of it and doing some processing on the new matrix, as follows.
// mask is a 1Kx1K matrix in which 1/64 of the elements are 1 and the rest are 0;
// it is a mask for **Mat data**
string filename = "filepath";
Mat data(1024,1024,CV_32F);
readMatrix(filename, data);
Mat smallMat(128,128,CV_32F);
getSmallerMat(data, mask, smallMat);
I read the float Mat from disk and fill smallMat using getSmallerMat(...), which is simply two for loops: wherever mask(i,j) == 1, the value is written to the next position in smallMat.
void readMatrix(string fpath, Mat& data) {
    FILE* fp = fopen(fpath.c_str(), "rb");
    if (!fp) { perror("fopen"); return; }
    const int size = 1024;
    data.create(size, size, CV_32F);
    float* buffer = new float[size];
    for (int i = 0; i < size; ++i) {
        fread(buffer, sizeof(float), size, fp);
        for (int j = 0; j < size; ++j) {
            data.at<float>(i, j) = buffer[j];
        }
    }
    fclose(fp);
    delete[] buffer; // memory from new[] must be released with delete[], not free()
}
What I want to do is read only the matrix elements whose corresponding value in mask equals 1. My problem is how to pick the (i,j)-th element from the disk.
Reading the whole matrix and squeezing it takes 15 ms; I want to make it faster, but I haven't managed to.
Consider this pic as my mask matrix: I want to read only the white pixels.
Thanks,
I am not sure that I understand the question correctly, but are you looking for a method to access data on the hard disk more quickly than via a stream? For finding a specific matrix element (i,j) in your stream you need to read the whole file in the worst case, i.e. the complexity is linear; this can't be helped.
However, if you actually know the position in the file exactly (i.e. if you use a fixed-length format for representing your values, etc.), then seekg
http://www.cplusplus.com/reference/istream/istream/seekg/
should be faster than actually reading all characters until the desired position.
EDIT:
Given the discussion in comments to other answers, I want to stress that using some seek in a file stream is O(N), hence multiple seeks for specific elements will be way slower than just reading the whole file. I am not aware of a method to access data stored on a hard disk in O(1). However, if all you ever need is matrices which are zero outside your mask, you should familiarize yourself with the concept of sparse matrices.
See e.g. https://en.wikipedia.org/wiki/Sparse_matrix and the documentation for your library, e.g. http://www.boost.org/doc/libs/1_39_0/libs/numeric/ublas/doc/matrix_sparse.htm
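For illustration, a minimal sketch with Boost uBLAS (the library linked above); i, j and v are placeholders:

#include <boost/numeric/ublas/matrix_sparse.hpp>

boost::numeric::ublas::compressed_matrix<float> m(1024, 1024);
m(i, j) = v;        // store only the masked (nonzero) entries
float x = m(i, j);  // entries never written read back as zero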
I am not sure if I have understood your problem, but if you want to read the (i,j)-th element from a file which contains only float elements, you should be able to get it like below:
float get(int i, int j, int rowsize, FILE* fp) {
    float retVal = -1.0f; // sentinel; -infinity, maybe?
    // save the current position in case you need to restore the stream pos
    long lastPos = ftell(fp);
    // seek to element (i, j): i * rowsize + j floats from the start
    fseek(fp, ((long)i * rowsize + j) * sizeof(float), SEEK_SET);
    fread(&retVal, sizeof(float), 1, fp);
    // restore the previous position
    fseek(fp, lastPos, SEEK_SET);
    return retVal;
}
You should be able to read any file which contains fixed-size elements very quickly using fseek and some arithmetic from the start, end, or current file position. Check the fseek documentation for more details.
From your code it appears your matrix is stored in binary as a memory image of the floats. What you want is to go directly to the offset on disk where the (i,j)-th float lives. You can compute this position with the formula index = i*colWidth + j (in units of floats), where colWidth is 1024 in your case. You can use fseek and ftell to move to and query your position in the file opened with fopen.

How to store an Image as a 1*n floating point vector?

I'm trying to obtain a single floating-point vector called testdata from images obtained via a webcam. Once an image is converted to a single floating-point vector, it is passed to a trained neural network. To test the network I use the function float CvANN_MLP::predict(const Mat& inputs, Mat& outputs). This function requires a testing sample in the following format:
Floating-point matrix of input vectors, one vector per row.
The testdata vector is defined as follows:
// define testing data storage matrices
//NumberOfTestingSamples is 1 and AttributesPerSample is number of rows *number of columns
Mat testing_data = Mat(NumberOfTestingSamples, AttributesPerSample, CV_32FC1);
To store each row of the image in CSV format, I do the following:
Formatted row0= format(Image.row(0),"CSV" ); //Get all rows to store in a single vector
Formatted row1= format(Image.row(1),"CSV" ); //Get all rows to store in a single vector
Formatted row2= format(Image.row(2),"CSV" ); //Get all rows to store in a single vector
Formatted row3= format(Image.row(3),"CSV" ); //Get all rows to store in a single vector
I then output all the formatted rows, stored in row0 to row3, into a text file:
store_in_file<<row0<<", "<<row1<<", "<<row2<<", "<<row3<<endl;
This stores the entire Mat on a single line.
The text file is then closed. I reopen the same text file to extract the data and store it into the vector testdata:
// if we can't read the input file then return 0
FILE* Loadpixel = fopen("txtFileValue.txt", "r");
if (!Loadpixel) // file didn't open
{
    cout << "ERROR: cannot read file \n";
    return 0; // all not OK;
}
float colour_value;
for (int attribute = 0; attribute < AttributesPerSample; attribute++)
{
    fscanf(Loadpixel, "%f,", &colour_value); // reads a single attribute and stores it in colour_value
    testdata.at<float>(0, attribute) = colour_value;
}
fclose(Loadpixel); // close the file; leaking the handle on every call can make fopen fail eventually
This works; however, after a period of time the file doesn't open and the program displays the error message "ERROR: cannot read file". There are a lot of limitations to this method, and unnecessary time is taken storing to a text file and then reopening and extracting. What is the best way to store an image (Mat) into a single floating-point vector similar to testdata.at<float>(0, attribute)? Or is there a simple way to ensure that the file always opens, i.e. to correct the problem?
The sane solution is of course to convert the values directly in memory. The whole file intermediate is an incredible kludge, as you suspected.
If you would use standard C++ types such as std::vector, we could provide actual code. The simple algorithm equivalent to your code is to just iterate through your 2D image one pixel at a time, and append each pixel's value to the back of the 1D vector.
However, that's a bad idea for neural network processing of webcam images anyway. If your input shifts down by a single pixel - entirely possible - the whole 1D vector changes. It's therefore advisable to normalize your input first. That may require translating, scaling and rotating the image first.
[edit]
Standard C++ example:
std::vector<std::vector<int>> Image2D;
std::vector<float> Vector1D;
for (auto const& row : Image2D) {
    for (auto pixel : row) {
        Vector1D.push_back(pixel);
    }
}
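Once the pixels are in Vector1D, the one-row test matrix from the question can be filled directly, with no file round-trip (assuming Vector1D.size() == AttributesPerSample):

for (int attribute = 0; attribute < AttributesPerSample; ++attribute) {
    testing_data.at<float>(0, attribute) = Vector1D[attribute];
}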

Shift vector in thrust

I'm looking at a project involving online (streaming) data. I want to work with a sliding window of that data. For example, say that I want to hold 10 values in my vector. When value 11 comes in, I want to drop value 1, shift everything over, and then place value 11 where value 10 was.
The long way would be something like the following:
int n = 9;
thrust::device_vector<float> val;
val.resize(n + 1, 0);
// Shift left: val[0..n-1] take the values of val[1..n]
for (int i = 0; i < n; i++) {
    val[i] = val[i + 1];
}
// add the new value to the last position
val[n] = newValue;
Is there a "fast" way to do this with thrust? The project I'm looking at will have around 500 vectors that will need this operation done simultaneously.
Thanks!
As I have said, a ring buffer is what you need. No shifting is required there, only one counter and a fixed-size array.
Let's think about how we might deal with 500 ring buffers.
If you want to have 500 (let it be 512) sliding windows and process them all on the GPU, then you might pack them into one big 2D texture, where each column is an array of samples for the same moment.
If you're getting new samples for each vector at once (I mean one new sample for each of the 512 buffers at each processing step), then this "ring texture" (like a cylinder) only needs to be updated once per step (upload the array of new samples) and you need just one counter.
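To make the idea concrete, here is a minimal host-side sketch in plain C++ (names are illustrative); the same fixed-storage-plus-counter scheme carries over to the GPU texture layout described above:

#include <cstddef>
#include <vector>

struct RingBuffer {
    std::vector<float> data;
    std::size_t head = 0; // index of the oldest sample
    explicit RingBuffer(std::size_t n) : data(n, 0.0f) {}
    void push(float v) {
        data[head] = v;                  // overwrite the oldest value in place
        head = (head + 1) % data.size(); // advance the counter; nothing shifts
    }
    // logical index: 0 is the oldest sample, size()-1 the newest
    float at(std::size_t i) const {
        return data[(head + i) % data.size()];
    }
};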
I highly recommend using a different, yet still free, library for this problem. In 4 lines of ArrayFire code, you can do all 500 vectors, as follows:
array val = array(window_width, num_vectors);
val = shift(val, 0, 1);
array newValue = array(1,num_vectors);
val(span,end) = newValue;
I benchmarked against Thrust code for the same and ArrayFire is getting about a 10X speedup over Thrust.
Downside is that ArrayFire is not open source, but it is still free for this sort of problem.
What you want is simply thrust::copy. You can't do a shift in place in parallel, because you can't guarantee a value is read before it is written.
int n = 9;
thrust::device_vector<float> val_in(n + 1);
thrust::device_vector<float> val_out(n + 1);
thrust::copy(val_in.begin() + 1, val_in.end(), val_out.begin());
// add the new value to the last position
val_out[n] = newValue;

Improving performance for graph connectedness computation

I am writing a program to generate a graph and check whether it is connected or not. Below is the code. Here is some explanation: I generate a number of points on the plane at random locations. I then connect the nodes, NOT based on proximity only. By that I mean that a node is more likely to be connected to nodes that are closer, and this is determined by a random variable that I use in the code (h_sq) and the distance. Hence, I generate all links (symmetric, i.e., if i can talk to j then vice versa is also true) and then check with a BFS to see if the graph is connected.
My problem is that the code seems to be working properly. However, when the number of nodes becomes greater than ~2000 it is terribly slow, and I need to run this function many times for simulation purposes. I even tried other graph libraries, but the performance is the same.
Does anybody know how I could possibly speed everything up?
Thanks,
int Graph::gen_links() {
    if (save == true) { // in case I want to store the structure of the graph
        links.clear();
        links.resize(xy.size());
    }
    double h_sq, d;
    vector<vector<luint>> neighbors(xy.size());
    // generate links
    double tmp = snr_lin / gamma_0_lin;
    // xy is a std vector of pairs containing the nodes' locations
    for (luint i = 0; i < xy.size(); i++) {
        for (luint j = i + 1; j < xy.size(); j++) {
            // generate |h|^2
            d = distance(i, j);
            if (d < d_crit) // for sim purposes
                d = 1.0;
            h_sq = pow(mrand.randNorm(0, 1), 2.0) + pow(mrand.randNorm(0, 1), 2.0);
            if (h_sq * tmp >= pow(d, alpha)) {
                // there exists a link between i and j
                neighbors[i].push_back(j);
                neighbors[j].push_back(i);
                // options
                if (save == true)
                    links.push_back(make_pair(i, j));
            }
        }
        if (neighbors[i].empty() && save == false) {
            // graph not connected; since save == false I don't need to store
            // the structure, hence I exit
            connected = 0;
            return 1;
        }
    }
    // here I do BFS to check whether the graph is connected or not, using neighbors
    // BFS code...
    return 1;
}
UPDATE:
The main problem seems to be the push_back calls within the inner for loops; that is the part that takes most of the time in this case. Should I use reserve() to increase efficiency?
Are you sure the slowness is caused by the generation and not by your search algorithm?
The graph generation is O(n^2) and there is not much you can do about that. However, you can trade memory for some of the time if the point locations are fixed for at least some of the experiments.
First, the distances of all node pairs, and pow(d, alpha), can be precomputed and saved in memory so that you don't need to compute them again and again. The extra memory cost for 10000 nodes will be about 800 MB for double and 400 MB for float.
In addition, the sum of squares of two standard normal variables follows a chi-squared distribution (with two degrees of freedom), if I remember correctly. Perhaps you could use a precomputed lookup table if the accuracy allows?
Finally, if the probability that two nodes are connected is very small once the distance exceeds some value, then you don't need O(n^2); perhaps you can only consider node pairs whose distance is below some limit?
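As a sketch of the precomputation idea (n, alpha, and distance() as in the question; the cache layout is just an assumption):

#include <cmath>
#include <cstddef>
#include <vector>

// cache pow(d, alpha) for every pair once; reuse it across experiments
std::vector<std::vector<double>> d_alpha(n, std::vector<double>(n, 0.0));
for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
        d_alpha[i][j] = std::pow(distance(i, j), alpha);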
As a first step you should try to use reserve for both the inner and outer vectors.
If this does not bring performance up to your expectations, I believe that is because memory allocations are still happening.
There is a handy class I've used in similar situations, llvm::SmallVector (find it on Google). It provides a vector with a few pre-allocated items, so you can decrease the number of allocations by one per vector.
It can still grow when it runs out of items in the pre-allocated space.
So:
1) Examine the number of items you have in your vectors on average during runs (I'm talking about both inner and outer vectors).
2) Put in llvm::SmallVector with a pre-allocation of that size (as the pre-allocated storage lives inside the object, on the stack for locals, you might need to increase the stack size, or reduce the pre-allocation if you are restricted in available stack memory).
Another good thing about SmallVector is that it has almost the same interface as std::vector (it can easily be dropped in instead of it). A minimal sketch of the reserve() suggestion follows.
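Here, expected_degree is a hypothetical estimate of links per node that you would take from step 1:

// pre-allocate the inner vectors so push_back rarely reallocates
std::vector<std::vector<luint>> neighbors(xy.size());
for (auto& nb : neighbors)
    nb.reserve(expected_degree); // assumed average links per node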