match elements between two std::vectors - c++

I am writing a module that estimates optical flow. At each time step it consumes a std::vector in which each element holds a current pixel location and a previous pixel location. The vector is not ordered. Pixels that were not seen before will be present, and flow locations that were not found will be missing. Is there a correct way to match elements in the new vector to the set of optical flow locations being estimated?
The vectors are on the order of 2000 elements.
These are the approaches I am considering:
naively iterate through the new vector for each estimated optical flow location
naively iterating through the new vector but removing each matched location so the search gets faster as it goes on
run std::sort on my list and the new list at every time step. Then iterate through the new vector starting at the last matched index +1
I suspect there is an accepted way to go about this, but I don't have any computer science training.
I'm using C++11, if that is relevant.
// Each element in the new vector is an int. I need to check whether
// there are matches between the new vec and the old vec.
void Matcher::matchOpticalFlowNaive(std::vector<int> new_vec)
{
    for (std::size_t i = 0; i < this->old_vec.size(); i++)
        for (std::size_t j = 0; j < new_vec.size(); j++)
            if (this->old_vec[i] == new_vec[j]) {
                do_stuff(this->old_vec[i], new_vec[j]);
                break; // stop scanning new_vec once this element is matched
            }
}
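The third approach listed above (sort both lists, then walk them together) can be sketched as follows. This is only an illustration, assuming each location is reduced to an int key as in the snippet above; matchSorted is a hypothetical name, not part of the original code.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the keys present in both vectors.
// Both arguments are taken by value so the caller's data is not reordered.
std::vector<int> matchSorted(std::vector<int> old_vec, std::vector<int> new_vec)
{
    std::sort(old_vec.begin(), old_vec.end());
    std::sort(new_vec.begin(), new_vec.end());

    std::vector<int> matches;
    // std::set_intersection walks both sorted ranges once (linear time).
    std::set_intersection(old_vec.begin(), old_vec.end(),
                          new_vec.begin(), new_vec.end(),
                          std::back_inserter(matches));
    return matches;
}
```

For vectors of about 2000 elements, the two sorts cost O(n log n) and the merge-style walk is linear, compared with up to 2000 * 2000 = 4,000,000 comparisons for the nested loops.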

I'm not sure I understand what you need, but suppose that your Matcher is constructed with a vector of integers, that the order is not important, and that you need to check this vector against other vectors (in the method matchOpticalFlowNaive()) to do something when there is a match. In that case you can write something like the following:
#include <set>
#include <vector>
struct Matcher
{
std::set<int> oldSet;
Matcher (std::vector<int> const & oldVect)
: oldSet{oldVect.cbegin(), oldVect.cend()}
{ }
void matchOpticalFlowNaive (std::vector<int> const & newVec)
{
for ( auto const & vi : newVec )
{
if ( oldSet.cend() != oldSet.find(vi) )
/* do something */ ;
}
}
};
where the Matcher object is constructed with a vector that is used to initialize a std::set (or a std::multiset, or an unordered set/multiset?) to simplify the work in matchOpticalFlowNaive().
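Since the answer mentions an unordered set as an alternative, here is a minimal sketch of that variant. HashMatcher and countMatches are hypothetical names chosen for illustration, and the sketch only counts matches where the real code would do its own per-match work.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical variant of Matcher backed by a hash set (C++11).
struct HashMatcher
{
    std::unordered_set<int> oldSet;

    explicit HashMatcher(std::vector<int> const & oldVect)
        : oldSet(oldVect.cbegin(), oldVect.cend())
    { }

    // Counts the elements of newVec that are also in oldSet.
    // Each lookup is O(1) on average, so one pass is O(newVec.size()) on average.
    std::size_t countMatches(std::vector<int> const & newVec) const
    {
        std::size_t matches = 0;
        for (auto const & vi : newVec)
            if (oldSet.count(vi) != 0)
                ++matches; // replace with the real per-match work
        return matches;
    }
};
```

Compared with std::set, this trades ordered iteration for faster average lookups, which should be enough when only membership matters.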

Related

C++ Making sure 2D vector is compact in memory

I'm writing a C++ program to perform calculations on a huge graph, so it has to be as fast as possible. I have a 100MB text file of unweighted edges and am reading them into a 2D vector of integers (first index = nodeID, then a sorted list of nodeIDs of nodes which have edges to that node). During the program, the edges are looked up exactly in the order in which they're stored in the list. So my expectation was that, apart from a few bigger gaps, the data would always be nicely preloaded into the cache. However, according to my profiler, iterating through the edges of a player is an issue. I therefore suspect that the 2D vector isn't placed compactly in memory.
How can I ensure that my 2D vector is as compact as possible and the subvectors in the order in which they should be?
(I thought for example about making a "2D array" from the 2D vector, first an array of pointers, then the lists.)
BTW: In case it wasn't clear: The nodes can have different numbers of edges, so a normal 2D array is no option. There are a couple ones with lots of edges, but most have very few.
EDIT:
I've solved the problem and my program is now more than twice as fast:
There was a first solution and then a slight improvement:
I put the lists of neighbour ids into a 1D integer array and had another array to know where a certain id's neighbour lists start
I got a noticeable speedup by replacing the pointer array (a pointer needs 64 bit) with a 32 bit integer array containing indices instead
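The layout described in this edit can be sketched roughly as below. This is an assumption about what the final structure looked like, not the asker's actual code; FlatGraph, buildFlatGraph, and the member names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat adjacency structure: all neighbour lists are stored back
// to back in `neighbours`, and offsets[v] .. offsets[v + 1] delimits node v.
struct FlatGraph
{
    std::vector<std::uint32_t> offsets;    // size = node count + 1
    std::vector<std::uint32_t> neighbours; // size = total edge count

    const std::uint32_t* begin(std::size_t v) const { return neighbours.data() + offsets[v]; }
    const std::uint32_t* end(std::size_t v) const { return neighbours.data() + offsets[v + 1]; }
};

// Flattens a vector-of-vectors adjacency list into one contiguous buffer.
// Assumes the total edge count fits in 32 bits.
FlatGraph buildFlatGraph(const std::vector<std::vector<std::uint32_t>>& adj)
{
    FlatGraph g;
    g.offsets.reserve(adj.size() + 1);
    g.offsets.push_back(0);
    for (const auto& list : adj)
    {
        g.neighbours.insert(g.neighbours.end(), list.begin(), list.end());
        g.offsets.push_back(static_cast<std::uint32_t>(g.neighbours.size()));
    }
    return g;
}
```

Storing 32-bit offsets instead of 64-bit pointers halves the size of the index array, which matches the speedup the edit reports.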
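What data structure are you using for the 2D vector? If you use std::vector, the elements of each individual vector are contiguous in memory, but in a std::vector<std::vector<int>> every inner vector owns a separate heap allocation, so the rows are not necessarily adjacent to each other.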
Next, if pointers are stored, then only the addresses benefit from the vector's spatial locality. Are you accessing the pointed-to objects when iterating over the edges? If so, this could be a bottleneck. To get around it, you could arrange your objects so that they also sit in contiguous memory and benefit from spatial locality.
Finally, the order in which you access the elements of a vector affects caching. Make sure you access them in an order that suits the container (e.g. vary the column index fastest when iterating).
Here are some helpful links:
Cache Blocking Techniques
SO on cache friendly code
I have written a few of these type structures by having a 2D view onto a 1D vector and there are lots of different ways to do it. I have never made one that allows the internal arrays to vary in length before so this may contain bugs but should illustrate the general approach:
#include <cassert>
#include <iostream>
#include <vector>
template<typename T>
class array_of_arrays
{
public:
array_of_arrays() {}
template<typename Iter>
void push_back(Iter beg, Iter end)
{
m_idx.push_back(m_vec.size());
m_vec.insert(std::end(m_vec), beg, end);
}
T* operator[](std::size_t row) { assert(row < rows()); return &m_vec[m_idx[row]]; }
T const* operator[](std::size_t row) const { assert(row < rows()); return &m_vec[m_idx[row]]; }
std::size_t rows() const { return m_idx.size(); }
std::size_t cols(std::size_t row) const
{
assert(row < m_idx.size()); // m_idx[row] is read below, so row must be a valid index
auto b = m_idx[row];
auto e = row + 1 >= m_idx.size() ? m_vec.size() : m_idx[row + 1];
return std::size_t(e - b);
}
private:
std::vector<T> m_vec;
std::vector<std::size_t> m_idx;
};
int main()
{
array_of_arrays<int> aoa;
auto data = {2, 4, 3, 5, 7, 2, 8, 1, 3, 6, 1};
aoa.push_back(std::begin(data), std::begin(data) + 3);
aoa.push_back(std::begin(data) + 3, std::begin(data) + 8);
for(auto row = 0UL; row < aoa.rows(); ++row)
{
for(auto col = 0UL; col < aoa.cols(row); ++col)
{
std::cout << aoa[row][col] << ' ';
}
std::cout << '\n';
}
}
Output:
2 4 3
5 7 2 8 1

Sorting large std::vector of custom objects

I am creating a sparse matrix in CSR format, for which I start with a vector of matrix element structures. It needs to be std::vector at the beginning because I don't know ahead of time how many non-zeros my matrix is going to have. Then, to fill up the appropriate arrays for the CSR matrix, I need to first sort this array of non-zeros, in the order they appear in the matrix if one goes through it line-by-line. But above a certain matrix size (roughly 1 500 000 non-zeros), the sorted vector does not start from the beginning of the matrix. It is still sorted, but starts around row 44000.
// Matrix element struct:
struct mel
{
int Ncols;
int row,col;
MKL_Complex16 val;
void print();
};
// Custom function for sorting:
struct less_than_MElem
{
inline bool operator() (const mel& ME1, const mel& ME2)
{
return ( ( ME1.row*ME1.Ncols+ME1.col ) < ( ME2.row*ME2.Ncols+ME2.col ) );
}
};
int main()
{
std::vector<mel> mevec;
/* long piece of code that fills up mevec */
std::sort( mevec.begin(), mevec.end(), less_than_MElem() );
return 0;
}
I thought maybe as the vector was grown dynamically it wound up in separate blocks in the memory and the iterator wasn't pointing at the genuine beginning/end anymore. So I have tried creating a new vector and started with resizing it to the size that is known by that time. Then copied the elements one-by-one into this new vector and sorted it, but the result was the same.
Nelements = mevec.size();
std::vector<mel> nzeros;
nzeros.resize(Nelements);
for( int i = 0; i < Nelements; i++ )
{
nzeros[i].Ncols = mevec[i].Ncols;
nzeros[i].row = mevec[i].row;
nzeros[i].col = mevec[i].col;
nzeros[i].val = mevec[i].val;
}
std::sort( nzeros.begin(), nzeros.end(), less_than_MElem() );
Can anyone think of a solution?
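No answer is included here, so the following is only a guess at the cause: row*Ncols+col is computed in int, and for large matrices that product can exceed the range of int, which is undefined behaviour and typically produces wrapped, out-of-order keys. A comparator that compares row and then col directly avoids the multiplication. The sketch below uses a simplified stand-in struct (Elem, with a double value instead of MKL_Complex16) so that it compiles without MKL; Elem, RowMajorLess, and sortRowMajor are hypothetical names.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Simplified stand-in for the question's `mel` struct (hypothetical).
struct Elem
{
    int row, col;
    double val;
};

// Orders elements row by row, then by column, without forming row*Ncols+col,
// so no intermediate value can overflow.
struct RowMajorLess
{
    bool operator()(const Elem& a, const Elem& b) const
    {
        return std::tie(a.row, a.col) < std::tie(b.row, b.col);
    }
};

void sortRowMajor(std::vector<Elem>& elems)
{
    std::sort(elems.begin(), elems.end(), RowMajorLess());
}
```

If the linear index is needed elsewhere, computing it in a 64-bit type, for example static_cast<long long>(row) * Ncols + col, would be another way to sidestep the overflow.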

How to initialize an empty global vector in C++

I have a general question. Hopefully, one of you has a good approach to solve my problem. How can I initialize an empty vector?
As far as I have read, the size of an array has to be known at compile time, whereas vectors are different: their elements are stored on the heap (e.g. see: std::vector versus std::array in C++).
In my program I want to let the client decide how accurate interpolation is going to be done. That's why I want to use vectors.
The problem is this: to keep the code clearly arranged, I want to write two methods:
one method for calculating the coefficients of a vector and
one method which provides the coefficients to other functions.
Thus, I want to declare my vector as global and empty like
vector<vector<double>> vector1;
vector<vector<double>> vector2;
However, in the method where I determine the coefficients I cannot use
//vector containing coefficients for interpolation
/*vector<vector<double>>*/ vector1 (4, vector<double>(nn - 1));
for (int ii = 0; ii < nn - 1; ii++) {vector1[ii][0] = ...;
}
"nn" will be given by the client when running the program. So my question is how can I initialize an empty vector? Any ideas are appreciated!
Note please, if I call another function which by its definition gives back a vector as a return value I can write
vector2= OneClass.OneMethod(SomeInputVector);
where OneClass is an object of a class and OneMethod is a method in the class OneClass.
Note also, when I remove the comment /**/ in front of the vector, it is not global any more and throws me an error when trying to get access to the coefficients.
Use resize:
vector1.resize(4, vector<double>(nn - 1));
Use the resize() function as follows:
vector<vector<double>> v;
void f(int nn){ // nothing is returned, so the return type is void
v.resize(4);
for(int i = 0; i < 4; i++){
v[i].resize(nn - 1);
}
}
It looks to me that you're actually asking how to add items to your global vector. If so, this might help:
//vector containing coefficients for interpolation
for (int i = 0; i < 4; ++i)
vector1.push_back(vector<double>(nn - 1));
for (int ii = 0; ii < nn - 1; ii++)
{
vector1[0][ii] = ...; // vector1 has 4 rows of nn - 1 entries, so ii must be the second index
}
I'm not sure this is what you want, but assign could be interesting:
vector<vector<double>> vector1; // initialises an empty vector
// later in the code :
vector<double> v(nn - 1, 0.); // creates a local vector of size nn - 1 initialized with 0.
vector1.assign(4, v); // vector1 is now a vector of 4 vectors of nn - 1 doubles (currently all 0.)
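Putting the suggestions together, the two-function arrangement from the question might look like the sketch below. The names coefficients, computeCoefficients, and getCoefficients are hypothetical, and the stored value is a placeholder because the real formula is elided in the question.

```cpp
#include <cstddef>
#include <vector>

// Global vector: default-constructed, i.e. empty, until it is sized at run time.
std::vector<std::vector<double>> coefficients;

// Sizes the global to 4 rows of (nn - 1) entries and fills it.
// Assumes nn >= 1; the stored value is a placeholder, not a real coefficient.
void computeCoefficients(int nn)
{
    coefficients.assign(4, std::vector<double>(nn - 1, 0.0));
    for (std::size_t k = 0; k < coefficients.size(); ++k)
        for (std::size_t ii = 0; ii < coefficients[k].size(); ++ii)
            coefficients[k][ii] = static_cast<double>(k); // placeholder value
}

// Gives other functions read-only access without copying.
const std::vector<std::vector<double>>& getCoefficients()
{
    return coefficients;
}
```

Declaring the global without arguments is what makes it empty; no special initialization syntax is needed, and the size can be set later once nn is known.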

How can I check if a vector is pointing to null vector in a 2d-vector in c++?

So, I have the following case:
I declared a vector of vector of integers as vector < vector<int> > edges. Basically, I am trying to implement a graph using the above where graph is defined as follows:
class graph
{
public:
int vertices;
vector < vector<int> > edges;
};
Now, during the insertion of an edge, I take input as the starting vertex and ending vertex. Now, I want to do something like this:
void insert(int a, int b, graph *mygraph) // a is starting vertex and b is ending vertex
{
auto it = mygraph->edges.begin();
//int v = 1;
vector<int> foo;
foo.push_back(b);
if (mygraph->edges[a].size() != 0) // Question here?
mygraph->edges[a].push_back(b);
else
mygraph->edges.push_back(foo);
return;
}
Now, in the line marked with Question here, I want to check whether the vector for that particular entry exists. Calling size() is actually wrong, because I would be calling it on a vector which doesn't exist. In other words, I want to check whether a vector exists at a particular location in the vector of vectors. How can I do it? Something like mygraph->edges[a] != NULL?
Simply check that a is a valid index into the outer vector. If it is not, resize the outer vector.
void insert(int a, int b, graph &mygraph) { // a is starting vertex and b is ending vertex
if (a >= mygraph.edges.size())
mygraph.edges.resize(a+1);
mygraph.edges[a].push_back(b);
}
You can approach your problem in two different ways:
Initialize edges to the number of vertices, and don't allow other vertices to be inserted after that. Why is that?
std::vector< std::vector<int> > v = { {1}, {2} };
// now you need to add an edge between vertex 4 and vertex 5
std::vector<int> edges3;
v.push_back(edges3); // v = { {1}, {2}, {} }
std::vector<int> edges4 = {5};
v.push_back(edges4); // v = { {1}, {2}, {}, {5} }
If you don't want to do it like that, you'd have to do something like this first:
std::vector< std::vector<int> > v;
for (int i = 0; i < maxVertices; i++)
{
std::vector<int> w;
v.push_back(w);
}
// now you need to add an edge between vertex 4 and vertex 5
v[4].push_back(5);
Change the structure used for edges, probably to something better suited for sparse matrices (which looks like your case here, since probably not every vertex is connected to every other vertex). Try:
std::map< int, std::vector<int> > edges;
That way you can match a single vertex with a list of other vertices without the need to initialize edges to the maximum possible number of vertices.
std::vector<int> vertices = {5};
edges[4] = vertices;
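With the map-based layout, the existence check the question asks about becomes a lookup with find. A minimal sketch, in which MapGraph, hasVertex, and insertEdge are hypothetical names:

```cpp
#include <map>
#include <vector>

// Hypothetical graph type using the map layout suggested above.
struct MapGraph
{
    std::map<int, std::vector<int>> edges;
};

// True if vertex a already has an adjacency list.
// find() does not create an entry, unlike operator[].
bool hasVertex(const MapGraph& g, int a)
{
    return g.edges.find(a) != g.edges.end();
}

// operator[] default-constructs an empty list on first use, so no explicit
// existence check is needed before push_back.
void insertEdge(MapGraph& g, int a, int b)
{
    g.edges[a].push_back(b);
}
```

Note that reading g.edges[a] just to test for existence would insert an empty list as a side effect, which is why the check uses find.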
Edges is a vector of vectors. Vectors are stored contiguously in memory. You insert elements into a vector from the end. If the size of vector is 10, all 10 members are contiguous and their indexes are going to range from 0-9. If you delete a middle vector, say 5th, all vectors from index 6-9 get shifted up by 1.
The point of saying all this is that you can't have a situation where edges would have an index that doesn't hold a vector. To answer your question, a vector for index a would exist if
a < mygraph->edges.size ();

Sorting an array of structs in C++

I'm using a particle physics library written in c++ for a game.
In order to draw the particles I must get an array of all their positions like so..
b2Vec2* particlePositionBuffer = world->GetParticlePositionBuffer();
This returns an array of b2Vec2 objects (which represent 2 dimensional vectors in the physics engine).
Also I can get and set their colour using
b2ParticleColor* particleColourBuffer = world->GetParticleColorBuffer();
I would like to get the 10% of the particles with the highest Y values (and then change their colour)
My idea is..
1. Make an array of structs the same size as the particlePositionBuffer array; the struct just contains an int (the particle's index in the particlePositionBuffer array) and a float (the particle's y position).
2. Then I sort the array by the y position.
3. Then I use the int in the struct from the top 10% of structs in my struct array to do stuff to their colour in the particleColourBuffer array.
Could someone show me how to sort an array of structs like that in C++?
Also, do you think this is a decent way of going about this? I only need to do it once (not every frame).
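The three steps in the question can be sketched as follows. To keep the example free of the physics library, it takes the y values as a plain std::vector<float>; ParticleRef and topTenPercentByY are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Step 1: pair each particle's index with its y position (hypothetical struct).
struct ParticleRef
{
    std::size_t index;
    float y;
};

// Returns the buffer indices of the 10% of particles with the highest y.
std::vector<std::size_t> topTenPercentByY(const std::vector<float>& ys)
{
    std::vector<ParticleRef> refs(ys.size());
    for (std::size_t i = 0; i != ys.size(); ++i)
    {
        refs[i].index = i;
        refs[i].y = ys[i];
    }

    // Step 2: sort by y, highest first.
    std::sort(refs.begin(), refs.end(),
              [](const ParticleRef& a, const ParticleRef& b) { return a.y > b.y; });

    // Step 3: keep the indices of the first 10% (rounded down).
    std::vector<std::size_t> top;
    for (std::size_t i = 0; i != ys.size() / 10; ++i)
        top.push_back(refs[i].index);
    return top;
}
```

The returned indices can then be used to index the colour buffer, since both buffers are ordered by particle index.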
The following may help:
// Functor to compare indices according to Y value.
struct comp
{
explicit comp(b2Vec2* particlePositionBuffer) :
particlePositionBuffer(particlePositionBuffer)
{}
bool operator()(std::size_t lhs, std::size_t rhs) const
{
// b2Vec2 exposes its coordinates as the public members x and y.
// Note that I compare rhs < lhs to have higher values first.
return particlePositionBuffer[rhs].y < particlePositionBuffer[lhs].y;
}
b2Vec2* particlePositionBuffer;
};
void foo()
{
const std::size_t size = world->GetParticleCount(); // How do you get Count ?
const std::size_t subsize = size / 10; // check for not zero ?
std::vector<std::size_t> indices(size);
for (std::size_t i = 0; i != size; ++i) {
indices[i] = i;
}
std::nth_element(indices.begin(), indices.begin() + subsize, indices.end(),
comp(world->GetParticlePositionBuffer()));
b2ParticleColor* particleColourBuffer = world->GetParticleColorBuffer();
for (std::size_t i = 0; i != subsize; ++i) {
changeColor(particleColourBuffer[indices[i]]); // look up the selected particle, not simply the i-th one
}
}
If your particle count is low, it won't matter much either way, and sorting them all first with a simple stl sort routine would be fine.
If the number were large though, I'd create a binary search tree whose maximum size was 10% of the number of your particles. Then I'd maintain the minY actually stored in the tree for quick rejection purposes. Then this algorithm should do it:
Walk through your original array and add items to the tree until it is full (10%)
Update your minY
For remaining items in original array
    If item.y is less than minY, go to next item (quick rejection)
    Otherwise
        Remove the currently smallest Y value from the tree
        Add the larger Y item to the tree
        Update minY
A binary search tree has a nice advantage of quick insert, quick search, and maintained ordering. If you want to be FAST, this is better than a complete sort on the entire array.
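The steps above can be sketched with std::multiset, which is typically implemented as a balanced binary search tree. This is an illustration, not the answerer's code; largestK is a hypothetical name, and plain floats stand in for the particle y values.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Returns the indices of the k largest values in ys, in ascending order of value.
std::vector<std::size_t> largestK(const std::vector<float>& ys, std::size_t k)
{
    // Ordered by (y, index); *tree.begin() is always the current minimum,
    // so minY does not need to be tracked separately.
    std::multiset<std::pair<float, std::size_t>> tree;
    for (std::size_t i = 0; i != ys.size() && k != 0; ++i)
    {
        if (tree.size() < k)
        {
            tree.insert(std::make_pair(ys[i], i)); // fill phase
        }
        else if (ys[i] > tree.begin()->first)      // otherwise: quick rejection
        {
            tree.erase(tree.begin());               // drop the smallest y
            tree.insert(std::make_pair(ys[i], i));  // add the larger one
        }
    }

    std::vector<std::size_t> result;
    for (const auto& entry : tree)
        result.push_back(entry.second);
    return result;
}
```

Each insert or erase costs O(log k), so the whole pass is O(n log k) rather than the O(n log n) of a full sort.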