Data analysis - memory bug in C++

I am a data scientist, currently working on some C++ code to extract triplet particles from a rather large text file containing 2D coordinate data of particles in ~10⁵ consecutive frames. I am struggling with a strange memory error that I don't seem to understand.
I have a vector of structs, which can be divided into snippets defined by their frame. For each frame, I build an array with unique IDs for each individual coordinate pair, and if at any point a coordinate pair is repeated, it is given the ID of the earlier coordinate pair. I then use this later to determine whether the particle triplet is indeed a trimer.
I loop over all particles and search forward for any corresponding coordinate pair. After I'm done, and no particles were found, I define this triplet to be unique and push the coordinates into a vector that corresponds to particle IDs.
The problem is: after the 18th iteration, at the line trimerIDs[i][0] = particleCounter; , the variable trimerCands (my big vector array) suddenly becomes unreadable. Could it be that the vector's internal pointer is being overwritten? I put this vector fully on the heap, but even if I put it on the stack, the error persists.
Do any of you have an idea of what I might be overlooking? Please note that I am rather new at C++, coming from other, less close to the metal, languages. While I think I understand how stack/heap allocations work, especially with respect to vectors/vector structs, I might be very wrong!
The error that Eclipse gives me in the variables tab is:
Failed to execute MI command:
-data-evaluate-expression trimerCands
Error message from debugger back end:
Cannot access memory at address 0x7fff0000000a
The function is as follows.
struct trimerCoords{
float x1,y1,x2,y2,x3,y3;
int frame;
int tLength1, tLength2, tLength3;
};
void removeNonTrimers(std::vector<trimerCoords> trimerCands, int *trCandLUT){
// trimerCands is a vector containing possible trimers, tLengthx is an attribute of the particle;
// trCandLUT is a look up table array with indices;
for (int currentFrame = 1; currentFrame <=framesTBA; currentFrame++){ // for each individual frame
int nTrimers = trCandLUT[currentFrame] - trCandLUT[currentFrame-1]; // get the number of trimers for this specific frame
int trimerIDs[nTrimers][3] = {0}; // preallocate an array for each of the individual particles in each triplet;
int firstTrim = trCandLUT[currentFrame-1]; // first index for this particular frame
int lastTrim = trCandLUT[currentFrame] - 1; // last index for this particular frame
bool found;
std::vector<int> traceLengths;
traceLengths.reserve(nTrimers*3);
// Block of code to create a unique ID array for this particular frame
std::vector<Particle> currentFound;
Particle tempEntry;
int particleCounter = 0;
for (int i = firstTrim; i <= lastTrim; i++){
// first triplet particle. In the real code, this is repeated three times, for x2/y2 and x3/y3, corresponding to the
tempEntry.x = trimerCands[i].x1;
tempEntry.y = trimerCands[i].y1;
found = false;
for (long unsigned int j = 0; j < currentFound.size(); j++){
if (fabs(tempEntry.x - currentFound[j].x) + fabs(tempEntry.y - currentFound[j].y) < 0.001){
trimerIDs[i][0] = j; found = true; break;
}
}
if (found == false) {
currentFound.push_back(tempEntry);
traceLengths.push_back(trimerCands[i].tLength1);
trimerIDs[i][0] = particleCounter;
particleCounter++;
}
}
// end block of create unique ID code block
compareTrips(nTrimers, trimerIDs, traceLengths, trimerfile_out);
}
}
If anything's unclear, let me know!
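One thing that stands out in the code as posted (an observation from the snippet only, not a confirmed diagnosis): trimerIDs has nTrimers rows, but it is indexed with i, which runs from firstTrim to lastTrim over the whole trimerCands vector. For any frame where firstTrim > 0, that writes past the end of a stack array, which could plausibly clobber neighbouring locals such as the vector's internal pointers. Below is a minimal sketch of frame-local indexing, with a std::vector of std::array replacing the variable-length array. The names Coord and assignIDs are hypothetical, and only the first particle of each triplet is handled.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

struct Coord { float x, y; };  // hypothetical stand-in for Particle

// Assigns a frame-local ID to the first particle of each candidate in
// [firstTrim, lastTrim]. Rows are indexed by (i - firstTrim), so every write
// stays inside the nTrimers rows allocated for this frame.
std::vector<std::array<int, 3>> assignIDs(const std::vector<Coord>& cands,
                                          int firstTrim, int lastTrim) {
    if (lastTrim < firstTrim) return {};  // empty frame
    const std::size_t nTrimers = static_cast<std::size_t>(lastTrim - firstTrim + 1);
    std::vector<std::array<int, 3>> ids(nTrimers, std::array<int, 3>{{0, 0, 0}});
    std::vector<Coord> found;  // unique coordinates seen so far in this frame
    for (int i = firstTrim; i <= lastTrim; ++i) {
        const std::size_t row = static_cast<std::size_t>(i - firstTrim);
        int id = -1;
        for (std::size_t j = 0; j < found.size(); ++j) {
            if (std::fabs(cands[i].x - found[j].x) +
                std::fabs(cands[i].y - found[j].y) < 0.001f) {
                id = static_cast<int>(j);  // repeated coordinate: reuse the old ID
                break;
            }
        }
        if (id < 0) {  // new coordinate: give it the next free ID
            id = static_cast<int>(found.size());
            found.push_back(cands[i]);
        }
        ids[row][0] = id;
    }
    return ids;
}
```

Passing the candidates by const reference, as done here, also avoids copying the whole vector on every call, which the by-value parameter in removeNonTrimers would do.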


C++: std::vector - std::find & std::distance query

To give context to my question, I will describe what I am ultimately trying to achieve. I am developing a game with a .obj model as my terrain, and I must update the player's height as they traverse the terrain (because the terrain is far from flat). Currently I achieve this as follows: when I load the mesh (terrain.obj) I store all its vertices (each vertex is a Vector3f object that has an x, y and z value) in a std::vector<Vector3f> meshVertices. Then, every second in the "main game loop", I loop through every Vector3f object in meshVertices and check its x and z values against the player's x and z values; if they are colliding, I set the player's height to the y value of the matched Vector3f object.
This system actually works and updates my player's height as they traverse the terrain. The only issue is that checking every single mesh vertex against the player's position every second kills my frame rate, so I need a much better system.
I will now describe my new, optimized approach, which is an attempt to save my frame rate. Firstly, when creating the mesh I don't just store each Vector3f vertex in a single std::vector<Vector3f>; I also store the x, y and z values of each vertex in three separate std::vectors of int, named meshVerticesX, meshVerticesY and meshVerticesZ. These std::vectors are filled by the following code:
for (Vector3f currentVertex : meshVertices) {
meshVerticesX.push_back((int)currentVertex.GetX());
}
for (Vector3f currentVertex : meshVertices) {
meshVerticesZ.push_back((int)currentVertex.GetZ());
}
for (Vector3f currentVertex : meshVertices) {
meshVerticesY.push_back((int)currentVertex.GetY());
}
Now every second I get the x and z values of the player's position (cast to int, because I feel that making this system work with int values rather than float values will make comparisons much easier later) and send them to functions that check whether they exist in the aforementioned meshVerticesX and meshVerticesZ, returning a bool. The code responsible for this is as follows:
int playerPosX = (int) freeMoveObjects[0]->GetParent()->GetTransform()->GetPos()->GetX();
int playerPosZ = (int) freeMoveObjects[0]->GetParent()->GetTransform()->GetPos()->GetZ();
bool x = meshObjects[0]->checkMeshVerticesX(playerPosX);
bool z = meshObjects[0]->checkMeshVerticesZ(playerPosZ);
The functions checkMeshVerticesX and checkMeshVerticesZ are as follows:
bool Mesh::checkMeshVerticesX(int playerPosX)
{
return std::find(meshVerticesX.begin(), meshVerticesX.end(), playerPosX) != meshVerticesX.end();
}
bool Mesh::checkMeshVerticesZ(int playerPosZ)
{
return std::find(meshVerticesZ.begin(), meshVerticesZ.end(), playerPosZ) != meshVerticesZ.end();
}
Using the returned boolean values (true if the player's position was in the respective std::vector, false if it was not), I then call another function (getMeshYHeight), which is also passed the player's x and z position. It finds the indices in the respective std::vectors (meshVerticesX and meshVerticesZ) where the match was found, checks whether these indices are equal, and if so returns the int at that index in the meshVerticesY std::vector mentioned earlier. This code can be seen in the following:
if (x && z) { // boolean values returned by checkMeshVerticesX & checkMeshVerticesZ
int terrainVertexYHeight = meshObjects[0]->getMeshYHeight(playerPosX, playerPosZ);
freeMoveObjects[0]->GetParent()->GetTransform()->GetPos()->SetY(terrainVertexYHeight);
}
The function getMeshYHeight is as follows:
int Mesh::getMeshYHeight(int playerXPos, int playerZPos) {//15/2/20
auto iterX = std::find(meshVerticesX.begin(), meshVerticesX.end(), playerXPos) != meshVerticesX.end();
auto iterZ = std::find(meshVerticesZ.begin(), meshVerticesZ.end(), playerZPos) != meshVerticesZ.end();
int indexX = std::distance(meshVerticesX.begin(), iterX);
int indexZ = std::distance(meshVerticesZ.begin(), iterZ);
if (indexX == indexZ)
{
return meshVerticesY[indexX];
}
}
The idea here is that if the indices found in the meshVerticesX and meshVerticesZ std::vectors match, then they must be the x and z values of one original Vector3f object from when I first made the mesh as described earlier, and so that same index in meshVerticesY must contain that same Vector3f object's y value; therefore I return it and use it to set the player's height.
The issue is that I can't even test whether this works, because the line of code int indexX = std::distance(meshVerticesX.begin(), iterX); gives an error saying the arguments supplied to std::distance are wrong (it says iterX is a bool instead of an int, which is what I thought it would be).
So my question is: firstly, if I didn't have this error, would my approach even work? And if so, how can I fix the error?
I kind of lost track of your logic somewhere in the middle there, but to address the issue at hand: iterX is a bool!
auto iterX = std::find(...) != meshVerticesX.end();
In this statement, find returns an iterator, which you compare to another iterator, meshVerticesX.end(). The result of that expression (the comparison operator) is a bool, which is then assigned to iterX, so auto deduces that iterX needs to be of type bool.
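To make that concrete, here is a minimal sketch that keeps the iterator returned by std::find and tests it against end() separately before calling std::distance. It uses free-standing vectors instead of the Mesh members and a hypothetical function name, findHeight. It also keeps searching until x and z match at the same index, because two independent std::find calls each return only the first match, so their indices would rarely coincide.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Looks for an index where xs[i] == x and zs[i] == z. On success, writes
// ys[i] to height and returns true; otherwise returns false.
// Assumes xs, ys and zs all have the same length.
bool findHeight(const std::vector<int>& xs, const std::vector<int>& ys,
                const std::vector<int>& zs, int x, int z, int& height) {
    auto it = xs.begin();
    // 'it' stays an iterator; the comparison with end() is a separate test.
    while ((it = std::find(it, xs.end(), x)) != xs.end()) {
        const std::size_t index =
            static_cast<std::size_t>(std::distance(xs.begin(), it));
        if (zs[index] == z) {
            height = ys[index];
            return true;
        }
        ++it;  // same x but different z: continue after this element
    }
    return false;
}
```

This is still a linear scan over the vertices, so it addresses the compile error and the index mismatch rather than the frame-rate problem.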
Could you convert your terrain's x,y coordinates to ints? You may need to scale them. Then, when you load the mesh, you could just store the z value for every x,y point (you may have to take some sort of average over each 1x1 square). Now you don't have to look for collisions; instead, for each object in the game you can just look up its z value by its (scaled) x,y coordinates.
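A rough sketch of that lookup-table idea, using this answer's axis naming (x,y for the ground plane, z for the height). The HeightGrid type and its members are hypothetical, and the scaling and averaging steps are left out: it only shows the constant-time lookup once the grid has been filled at load time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical height grid: one height per integer (x, y) cell.
struct HeightGrid {
    int width;
    int depth;
    std::vector<float> heights;  // row-major: heights[y * width + x]

    HeightGrid(int w, int d)
        : width(w), depth(d),
          heights(static_cast<std::size_t>(w) * static_cast<std::size_t>(d), 0.0f) {}

    bool inside(int x, int y) const {
        return x >= 0 && x < width && y >= 0 && y < depth;
    }

    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
               static_cast<std::size_t>(x);
    }

    // Called once per vertex while loading the mesh; ignores points off the grid.
    void set(int x, int y, float h) {
        if (inside(x, y)) heights[index(x, y)] = h;
    }

    // Constant-time lookup; returns fallback for positions outside the grid.
    float at(int x, int y, float fallback = 0.0f) const {
        return inside(x, y) ? heights[index(x, y)] : fallback;
    }
};
```

Each object then needs a single at(x, y) call per update instead of a scan over every vertex.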

Trouble writing to 4D vector in C++ (no viable overloaded '=')

The issue I am facing is that via the openCV library I am reading in a series of images as their own "Mat" format: an image matrix.
Basically I need to write any pixel value that's > 0 as "true" to a 4D vector and any that == 0 as "false".
Why 4 dimensions?
vector<vector<vector<bool>>>pointVector;
The 3 vector levels refer to X,Y,Z axes. Bool is just the true/false. The images are Y by Z and are stacked in 3D along axis X.
Basically we have a series of images representing points that are stacked in 3D.
(Poor explanation? Probably)
Anyway, the issue comes in my function to read the points in a single photo then write them out to the 4D vector.
Note: xVal is a global storing the ID number of the photo addressed. It's used for the X dimension (layers of images).
int lineTo3DVector (Mat image)
{
// Takes in matrix and converts to 4D vector.
// This will be exported and all vectors added together into a point cloud
vector<vector<vector<bool>>>pointVector;
for (int x=0; x<image.rows; x++)
{
for (int y = 0; y<image.cols; y++)
{
if((image.at<int>(x,y)) > 0)
{
pointVector[xVal*image.cols*image.rows + x*image.cols + y] = true;
}
}
}
}
I haven't finished writing all of the function because the if statement intended to write pointVector at address xVal, x, y with bool 'true' throws up an error saying:
No viable overloaded '='
Any idea what's going wrong? I've scoured the web and given myself a headache trying to dig up info (yeah, noob in the deep-end again) so any suggestions would be appreciated.
You're accessing only the first vector (the outer one), without actually accessing any of the vectors inside.
The syntax would be pointVector[x][y][z] = true, where x, y and z are the values you want to use to access the three nested vectors.
What you want is:
pointVector[xVal][x][y] = true
What you were using is a way to access a 3D array that is laid out in memory as a 1D array, but it's not what you want in your case.
Make sure you don't go out of bounds
Make sure the elements you are accessing actually exist! If the code you have shown is the actual code, pointVector will have no elements when you try to use it.
To fix that, you have to resize all the vectors (outer and inner). This can get cumbersome, and you may want to go with the 1D approach, i.e. allocate a large, one-dimensional bool array and access it using the method you were using (largeArray[xVal*image.cols*image.rows + x*image.cols + y]).
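If you do want to keep the nested layout, the resizing mentioned above can be done in one step at construction time. This is only a sketch: Grid3D and makeGrid are hypothetical names, and nx, ny and nz stand for the number of images, image.rows and image.cols respectively.

```cpp
#include <cstddef>
#include <vector>

using Grid3D = std::vector<std::vector<std::vector<bool>>>;

// Builds an nx * ny * nz grid in which every element exists and is false,
// so grid[x][y][z] is valid for all indices inside those bounds.
Grid3D makeGrid(std::size_t nx, std::size_t ny, std::size_t nz) {
    return Grid3D(nx, std::vector<std::vector<bool>>(ny, std::vector<bool>(nz, false)));
}
```

After that, pointVector[xVal][x][y] = true is safe as long as each index is below its dimension.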
1D approach
In the following code numberOfValues is the maximum number of elements you will access.
int lineTo3DVector (Mat image)
{
// Takes in matrix and converts to 4D vector.
// This will be exported and all vectors added together into a point cloud
std::vector<bool> pointVector;
pointVector.resize(numberOfValues);
for (int x=0; x < image.rows; x++)
{
for (int y = 0; y < image.cols; y++)
{
if(image.at<int>(x,y) > 0)
{
pointVector[xVal*image.cols*image.rows + x*image.cols + y] = true;
}
}
}
// Return whatever.
}

How to use multi-threading within a loop that iterates through a point cloud in C++?

I have made a function that estimates the normal vectors of a 3D Point Cloud and it takes a lot of time to run on a cloud of size 2 million. I want to multi-thread by calling the same function on two different points at the same time but it didn't work (it was creating hundreds of threads). Here is what I tried:
// kd-tree used for finding neighbours
pcl::KdTreeFLANN<pcl::PointXYZRGB> kdt;
// cloud iterators
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it = pt_cl->points.begin();
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it1;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it2;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it3;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it4;
// initializing tree
kdt.setInputCloud(pt_cl);
// loop exit condition
bool it_completed = false;
while (!it_completed)
{
// initializing cloud iterators
cloud_it1 = cloud_it;
cloud_it2 = cloud_it++;
cloud_it3 = cloud_it++;
if (cloud_it3 != pt_cl->points.end())
{
// attaching threads
boost::thread thread_1 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it1, kdt, radius, max_neighbs);
boost::thread thread_2 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it2, kdt, radius, max_neighbs);
boost::thread thread_3 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it3, kdt, radius, max_neighbs);
// joining threads
thread_1.join();
thread_2.join();
thread_3.join();
cloud_it++;
}
else
{
it_completed = true;
}
}
As you can see I am trying to call the same function on 3 different points at the same time. Any suggestions for how to make this work? Sorry for the poor code, I'm tired and thank you in advance.
EDIT: here is the find_normal function
Here are the parameters:
@param pt_cl is a pointer to the point cloud to be treated (pcl::PointCloud<PointXYZRGB>::Ptr)
@param cloud_it is an iterator of this cloud (pcl::PointCloud<PointXYZRGB>::iterator)
@param kdt is the kd_tree used to find the closest neighbours of a point
@param radius defines the range in which to search for the neighbours of a point
@param max_neighbs is the maximum number of neighbours to be returned by the radius search
// auxiliary vectors for the kd-tree radius search
std::vector<int> pointIdxRadiusSearch; // neighbours ids
std::vector<float> pointRadiusSquaredDistance; // distances from the source to the neighbours
// the vectors of which the cross product calculates the normal
geom::vectors::vector3 *vect1;
geom::vectors::vector3 *vect2;
geom::vectors::vector3 *cross_prod;
geom::vectors::vector3 *abs_cross_prod;
geom::vectors::vector3 *normal;
geom::vectors::vector3 *normalized_normal;
// vectors to average
std::vector<geom::vectors::vector3> vct_toavg;
// if there are neighbours left
if (kdt.radiusSearch(*cloud_it, radius, pointIdxRadiusSearch, pointRadiusSquaredDistance, max_neighbs) > 0)
{
for (int pt_index = 0; pt_index < (pointIdxRadiusSearch.size() - 1); pt_index++)
{
// defining the first vector
vect1 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[pt_index + 1]]);
// defining the second vector; making sure there is no 'out of bounds' error
if (pt_index == pointIdxRadiusSearch.size() - 2)
vect2 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[1]]);
else
vect2 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[pt_index + 2]]);
// adding the cross product of the two previous vectors to our list
cross_prod = geom::vectors::cross_product(*vect1, *vect2);
abs_cross_prod = geom::aux::abs_vector(*cross_prod);
vct_toavg.push_back(*abs_cross_prod);
// freeing memory
delete vect1;
delete vect2;
delete cross_prod;
delete abs_cross_prod;
}
// calculating the normal
normal = geom::vectors::vect_avg(vct_toavg);
// calculating the normalized normal
normalized_normal = geom::vectors::normalize_normal(*normal);
// coloring the point
geom::aux::norm_toPtRGB(&(*cloud_it), *normalized_normal);
// freeing memory
delete normal;
delete normalized_normal;
// clearing vectors
vct_toavg.clear();
pointIdxRadiusSearch.clear();
pointRadiusSquaredDistance.clear();
// shrinking vectors
vct_toavg.shrink_to_fit();
pointIdxRadiusSearch.shrink_to_fit();
pointRadiusSquaredDistance.shrink_to_fit();
}
Since I don't quite get how the result data is being stored, I'm going to suggest a solution based on OpenMP that matches the code you've posted.
// kd-tree used for finding neighbours
pcl::KdTreeFLANN<pcl::PointXYZRGB> kdt;
kdt.setInputCloud(pt_cl); // initialize the tree, as in the question
#pragma omp parallel for schedule(static)
for (pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it = pt_cl->points.begin();
cloud_it < pt_cl->points.end();
++cloud_it) {
geom::vectors::find_normal(pt_cl, cloud_it, kdt, radius, max_neighbs);
}
Note that you should be using the < comparison, and not the != one; that's how OpenMP works (it wants random access iterators). I'm using the static schedule since every element should take more or less identical time to process. If that's not the case, try using schedule(dynamic) instead.
This solution uses OpenMP, and you may investigate e.g. TBB as well, though it has a higher entrance barrier than OpenMP and uses an OOP-style API.
Also, repeating what I've said in the comments already: OpenMP as well as TBB are going to handle thread management and load distribution for you. You only pass them hints (such as schedule(static)) on how to do it so as to better suit your needs.
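For reference, the same pattern in a form that compiles without PCL. This is a sketch under assumptions: processPoint is a hypothetical stand-in for find_normal, and a plain index loop over a std::vector replaces the cloud iterators. The point is that each iteration writes only to its own output slot, so OpenMP can split the range across threads without any manual thread handling.

```cpp
#include <vector>

// Hypothetical per-point work standing in for find_normal.
inline float processPoint(float value) { return value * 2.0f; }

std::vector<float> processAll(const std::vector<float>& input) {
    std::vector<float> output(input.size());
    const long n = static_cast<long>(input.size());
    // Needs OpenMP enabled at compile time (e.g. -fopenmp with GCC or Clang).
    // Without it the pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        output[i] = processPoint(input[i]);  // independent: touches only slot i
    }
    return output;
}
```

The result is the same with or without OpenMP enabled; only the run time changes.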
Other than that, please, do get in the habit of repeating as little code as you can; ideally, no code should be duplicated. E.g. when you declare many variables of the same type, or call a certain function a few times in a row with a similar pattern, etc. I also see excessive commenting in the code, with an unclear reason behind it.

Loop to check apple position against snake body positions

I'm trying to figure out how to write a loop to check the position of a circle against a variable number of rectangles so that the apple is not placed on top of the snake, but I'm having a bit of trouble thinking it through. I tried:
do
apple.setPosition(randX()*20+10, randY()*20+10); // apple is a CircleShape
while (apple.getPosition() == snakeBody[i].getPosition());
Although, in this case, if it detects a collision with one rectangle of the snake's body, it could end up just placing the apple at a previous position of the body. How do I make it check all positions at the same time, so it can't correct itself only to have a chance of repeating the same problem again?
There are three ways (I could think of) of generating a random number meeting a requirement:
The first way, and the simplest, is what you're trying to do: retry if it doesn't.
However, you should change the condition so that it checks all the forbidden cells at once:
bool collides_with_snake(const sf::Vector2f& pos, //not sure if it's 2i or 2f
const /*type of snakeBody*/& snakeBody,
std::size_t partsNumber) {
bool noCollision = true;
for( std::size_t i = 0 ; i < partsNumber && noCollision ; ++i )
noCollision = pos != snakeBody[i].getPosition();
return !noCollision;
}
//...
do
apple.setPosition(randX()*20+10, randY()*20+10);
while (collides_with_snake(apple.getPosition(), snakeBody,
/* snakeBody.size() ? */));
The second way is to generate fewer numbers and find a function which maps these numbers to the set you want. For instance, if your grid has N cells, you could generate a number X between 0 and N - [number of parts of your snake], then map X to the smallest number Y such that Y doesn't refer to a cell occupied by a snake part and Y = X + S, where S is the number of cells occupied by a snake part that are referred to by a number smaller than Y.
It's more complicated though.
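A sketch of that mapping, assuming cells are numbered 0 to N - 1 and the snake occupies distinct cells. mapToFreeCell is a hypothetical name; k is the number drawn from the range 0 to N - S - 1, where S is the number of occupied cells.

```cpp
#include <algorithm>
#include <vector>

// Maps k, the rank among free cells, to the actual cell number by skipping
// every occupied cell at or below the running position.
// 'occupied' is taken by value so that it can be sorted locally.
int mapToFreeCell(int k, std::vector<int> occupied) {
    std::sort(occupied.begin(), occupied.end());
    for (int cell : occupied) {
        if (cell <= k) {
            ++k;    // one more occupied cell lies at or before k: shift right
        } else {
            break;  // remaining occupied cells are all beyond k
        }
    }
    return k;
}
```

With 6 cells and the snake on cells 1 and 3, the free cells are 0, 2, 4 and 5, and k = 0, 1, 2, 3 map to exactly those. A single draw from a uniform integer distribution over 0 to N - S - 1 then always yields a valid apple position without retries.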
The third way is to "cheat" and choose a stronger requirement which is easier to enforce. For instance, if you know that the snake's body is N cells long, then only spawn the apple on a cell which is N + 1 cells away from the snake's head (you can do that by generating the angle).
The question is very broad, but assuming that snakeBody is a vector of Rectangles (or derived from Rectangle), and that you have a checkoverlap() function:
do {
// assuming that randX() and randY() always return different random variables
apple.setPosition(randX()*20+10, randY()*20+10); // set the apple
} while (any_of(snakeBody.begin(), snakeBody.end(), [&](Rectangle &r)->bool { return checkoverlap(r,apple); } ));
This relies on standard algorithm any_of() to check in one simple expression if any of the snake body elements overlaps the apple. If there's an overlap, we just iterate once more and get a new random position until it's fine.
If snakeBody is an array and not a standard container, just use snakeBody, snakeBody+snakesize instead of snakeBody.begin(), snakeBody.end() in the code above.
If the overlap check is as simple as comparing the positions, you can replace return checkoverlap(r,apple); in the code above with return r.getPosition()==apple.getPosition();
The "naive" approach would be generating apples and testing their positions against the whole snake until we find a free spot:
bool applePlaced = false;
while(!applePlaced) { //As long as we haven't found a valid place for the apple
apple.setPosition(randX()*20+10, randY()*20+10);
applePlaced = true; //We assume, that we can place the apple
for(int i=0; i<snakeBody.length; i++) { //Check the apple position with all snake body parts
if(apple.getPosition() == snakeBody[i].getPosition()) {
applePlaced=false; //Our prediction was wrong, we could not place the apple
break; //No further testing necessary
}
}
}
The better way would be to store all free positions in an array and then pick a position out of this array (and delete it from the array), so that no random testing is necessary. It also requires updating the array when the snake moves.
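A minimal sketch of the picking step, assuming free cells are kept as integer cell numbers in a std::vector (takeRandomFreeCell is a hypothetical name). Removing by swapping with the last element keeps the deletion O(1), at the cost of not preserving the order of the array.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Picks a uniformly random entry of freeCells, removes it and returns it.
// Precondition: freeCells is not empty.
int takeRandomFreeCell(std::vector<int>& freeCells, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, freeCells.size() - 1);
    const std::size_t i = pick(rng);
    const int cell = freeCells[i];
    freeCells[i] = freeCells.back();  // overwrite the chosen slot with the last entry
    freeCells.pop_back();             // then drop the now-duplicated last entry
    return cell;
}
```

Keeping the array in sync as the snake moves (adding the cell the tail leaves, removing the cell the head enters) is not shown here.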

Why Is My Spatial Hash So Slow?

Why is my spatial hash so slow? I am working on a code that uses smooth particle hydrodynamics to model the movement of landslides. In smooth particle hydrodynamics each particle influences the particles that are within a distance of 3 "smoothing lengths". I am trying to implement a spatial hash function in order to have a fast look up of the neighboring particles.
For my implementation I made use of the "set" datatype from the stl. At each time step the particles are hashed into their bucket using the function below. "bucket" is a vector of sets, with one set for each grid cell (the spatial domain is limited). Each particle is identified by an integer.
To look for collisions the function below entitled "getSurroundingParticles" is used which takes an integer (corresponding to a particle) and returns a set that contains all the grid cells that are within 3 support lengths of the particle.
The problem is that this implementation is really slow, slower even than just checking each particle against every other particles, when the number of particles is 2000. I was hoping that someone could spot a glaring problem in my implementation that I'm not seeing.
//put each particle into its bucket(s)
void hashParticles()
{
int grid_cell0;
cellsWithParticles.clear();
for(int i = 0; i<N; i++)
{
//determine the four grid cells that surround the particle, as well as the grid cell that the particle occupies
//using the hash function int grid_cell = ( floor(x/cell size) ) + ( floor(y/cell size) )*width
grid_cell0 = ( floor( (Xnew[i])/cell_spacing) ) + ( floor(Ynew[i]/cell_spacing) )*cell_width;
//keep track of cells with particles, cellsWithParticles is an unordered set so duplicates will automatically be deleted
cellsWithParticles.insert(grid_cell0);
//since each of the hash buckets is an unordered set any duplicates will be deleted
buckets[grid_cell0].insert(i);
}
}
set<int> getSurroundingParticles(int particleOfInterest)
{
set<int> surroundingParticles;
int numSurrounding;
float divisor = (support_length/cell_spacing);
numSurrounding = ceil( divisor );
int grid_cell;
for(int i = -numSurrounding; i <= numSurrounding; i++)
{
for(int j = -numSurrounding; j <= numSurrounding; j++)
{
grid_cell = (int)( floor( ((Xnew[particleOfInterest])+j*cell_spacing)/cell_spacing) ) + ( floor((Ynew[particleOfInterest]+i*cell_spacing)/cell_spacing) )*cell_width;
surroundingParticles.insert(buckets[grid_cell].begin(),buckets[grid_cell].end());
}
}
return surroundingParticles;
}
The code that calls getSurroundingParticles:
set<int> nearbyParticles;
//for each bucket with particles in it
for ( int i = 0; i < N; i++ )
{
nearbyParticles = getSurroundingParticles(i);
//for each particle in the bucket
for ( std::set<int>::iterator ipoint = nearbyParticles.begin(); ipoint != nearbyParticles.end(); ++ipoint )
{
//do stuff to check if the smaller subset of particles collide
}
}
Thanks a lot!
Your performance is getting eaten alive by the sheer amount of STL heap allocations caused by repeatedly creating and populating all those sets. If you profiled the code (say with a quick and easy non-instrumenting tool like Sleepy), I'm certain you'd find that to be the case. You're using sets to avoid having a given particle added to a bucket more than once - I get that. If Duck's suggestion doesn't give you what you need, I think you could dramatically improve performance by using preallocated arrays or vectors, and getting uniqueness in those containers by adding an "added" flag to the particle that gets set when the item is added. Then just check that flag before adding, and be sure to clear the flags before the next cycle. (If the number of particles is constant, you can do this extremely efficiently with a preallocated array dedicated to storing the flags, then memsetting to 0 at the end of the frame.)
That's the basic idea. If you decide to go this route and get stuck somewhere, I'll help you work out the details.
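To give a rough picture of the flag idea, here is a sketch under a few assumptions: buckets are plain std::vector<int> instead of sets, the caller owns and reuses the output vector and the flag array, and gatherUnique is a hypothetical name. As a small variation on the memset suggestion, it resets only the flags it touched, so the flag array is clean again after every call.

```cpp
#include <vector>

// Collects the unique particle ids found in the given cells into 'out'.
// 'added' must hold one entry per particle, all zero on entry; it is all
// zero again on return. 'out' is reused between calls, so after the first
// few calls no heap allocation should be needed.
void gatherUnique(const std::vector<std::vector<int>>& buckets,
                  const std::vector<int>& cells,
                  std::vector<char>& added,
                  std::vector<int>& out) {
    out.clear();  // keeps the capacity from earlier calls
    for (int cell : cells) {
        for (int particle : buckets[cell]) {
            if (!added[particle]) {  // flag check replaces the set lookup
                added[particle] = 1;
                out.push_back(particle);
            }
        }
    }
    for (int particle : out) {
        added[particle] = 0;  // reset only the flags that were set
    }
}
```

The cells argument would be the grid cells within the support radius of the particle of interest, computed the same way as in getSurroundingParticles.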