I have a stream of (x,y) data that I want to determine velocity and acceleration from. The data is pretty typical and can be thought to represent say a car driving around.
A new data point comes every 2ms and I would prefer to not accumulate/store unnecessary values so I thought to use a boost::accumulator.
Is there a simpler way to handle this type of task, or an existing library that already does this? Or am I on the right track with my thinking? I'm not yet sure which tags I'm going to use, but I like the idea that the container keeps an updated value for a given property and doesn't store the old positional data.
Another idea is to use a circular buffer (e.g size 200) and calculate the acceleration based off the last 50 values and velocity based off all the values in the buffer. However if the buffer stores raw positional data this would require looping over all elements every time to calculate the acceleration and velocity. This could be improved by instead keeping some sort of rolling acceleration and velocity value which recalculates by removing the value from the end element and adding the value from the new element to insert (with weight 1/elements in buffer). However this to me seems like some sort of boost rolling weighted accumulator.
You probably want to apply some sort of a Kalman filter to the data. Old data needs to be there to help reduce the impact of noise, new data needs to be there, and weighted higher so that the answer is sensitive to the latest information.
A fairly simple approach for position, let's call it X, where each new sample is x, is:
X = (1-w) * X + w * x
applied as each new value comes in. This is exponential smoothing, where the weight w adjusts how sensitive you are to new information vs old information. w = 1 means you don't care about history, w = 0 means you don't care at all about new information (and obviously means that you'll never store anything).
The instantaneous velocity can be calculated by computing the difference between successive points and dividing this difference by the time interval. This can in turn be filtered with a Kalman filter.
Acceleration is the difference in sequential velocities, again divided by the time interval. You can filter these as well.
The divided differences will be more sensitive to noise than the position. For example, if the object whose position you're monitoring stops, you will continue to get position measurements, and because of the noise the velocity vectors for successive measurements will point in random directions.
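As a rough illustration of the approach above, here is a minimal sketch that keeps only the current smoothed position, velocity and acceleration, so no history is stored. The names `Vec2` and `MotionEstimator` are made up for this example, the same weight w is assumed for all three quantities, and the plain exponential smoothing shown here is a stand-in for a proper Kalman filter.

```cpp
#include <cassert>

// Hypothetical types for illustration; not part of Boost or any other library.
struct Vec2 { double x, y; };

class MotionEstimator {
public:
    // w: smoothing weight in (0, 1]; dt: sample interval in seconds (0.002 for 2 ms data).
    MotionEstimator(double w, double dt) : w_(w), dt_(dt) {
        assert(w > 0.0 && w <= 1.0 && dt > 0.0);
    }

    // Feed one new (x, y) sample; only the current estimates are kept.
    void update(Vec2 sample) {
        if (seen_ == 0) {
            pos_ = sample;                         // seed with the first sample
        } else {
            const Vec2 prevPos = pos_;
            const Vec2 prevVel = vel_;
            pos_ = blend(pos_, sample);            // X = (1-w)*X + w*x
            const Vec2 rawVel{(pos_.x - prevPos.x) / dt_, (pos_.y - prevPos.y) / dt_};
            if (seen_ == 1) {
                vel_ = rawVel;                     // first velocity estimate
            } else {
                vel_ = blend(vel_, rawVel);
                const Vec2 rawAcc{(vel_.x - prevVel.x) / dt_, (vel_.y - prevVel.y) / dt_};
                acc_ = (seen_ == 2) ? rawAcc : blend(acc_, rawAcc);
            }
        }
        if (seen_ < 3) ++seen_;                    // only need to know 0, 1, 2 or "many"
    }

    Vec2 position() const { return pos_; }
    Vec2 velocity() const { return vel_; }
    Vec2 acceleration() const { return acc_; }

private:
    Vec2 blend(Vec2 old, Vec2 fresh) const {
        return Vec2{(1.0 - w_) * old.x + w_ * fresh.x,
                    (1.0 - w_) * old.y + w_ * fresh.y};
    }

    double w_, dt_;
    int seen_ = 0;
    Vec2 pos_{0.0, 0.0}, vel_{0.0, 0.0}, acc_{0.0, 0.0};
};
```

With 2 ms data you would construct it as `MotionEstimator est(w, 0.002)` and call `est.update({x, y})` for each sample. A suitable w has to be found by experiment. Smaller values suppress more noise but react more slowly.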
Boost accumulator doesn't appear to do what you want.
I'm fairly new to processing data like this; I have two curves that I'm not sure how to process, but I know what I'd like to have as an outcome. The original plots of the two datasets are shown below (left); the rough fit that I think I would like to have for them is shown below (right), with the overlaid fit in red.
First example:
The sudden drops in amplitude are an artifact of how the data was taken. This means they are inherently unpredictable, and I would ideally like to find a method that is robust to this behavior.
In the first case, I could try to eliminate the sharp drops in amplitude by using a threshold, but that would not help me in the second case, where I still get strong oscillation, but the minima are no longer at 0.
Edit: After writing a short script to use @JamesPhillips's suggestion, the fitting results are shown below; I can confirm this is what I was looking for, and it works better/faster than other fitting algorithms.
A possible algorithm: filter the data something like this...
Start with the smallest X-valued point shown on the graph, iterating from smallest X value to largest X value. For each point:
1) If the next point's Y value is greater than or equal to this point's Y value, include it.
2) If the next point's Y value is less than [cutoff] percent of this point's Y value, exclude it.
3) Go to next point.
Run the filter and test different values for [cutoff], each time graphing the result to see if the value of [cutoff] meets your requirements. You may need an additional filter condition or two, but that should be a good start to filtering the data as you describe.
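One possible reading of the steps above, as a sketch: the function name `filterDrops` and the `Point` struct are made up for this example, [cutoff] is passed as a fraction (0.5 for 50 percent), "this point" is taken to be the last point that was kept, and Y values are assumed to be non-negative. Points that fall between the two rules (lower than the previous point but above the cutoff) are kept.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type for illustration.
struct Point { double x, y; };

// Assumes pts is sorted by increasing x and all y values are >= 0.
// cutoff is a fraction in [0, 1], e.g. 0.5 for "50 percent".
std::vector<Point> filterDrops(const std::vector<Point>& pts, double cutoff) {
    assert(cutoff >= 0.0 && cutoff <= 1.0);
    std::vector<Point> kept;
    for (const Point& p : pts) {
        // Rule 1: y >= last kept y           -> keep (implied by the test below).
        // Rule 2: y <  cutoff * last kept y  -> treat as a sudden drop and skip.
        if (kept.empty() || p.y >= cutoff * kept.back().y) {
            kept.push_back(p);
        }
    }
    return kept;
}
```

Calling it with a few different cutoff values and plotting each result is then a quick way to see which value removes the drops without eating into the real signal.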
I am currently working on a MD simulation. It stores the molecule positions in a vector. For each time step, that vector is stored for display in a second vector, resulting in
std::vector<std::vector<molecule> > data;
The size of data is time steps*<number of molecules>*sizeof(molecule), where sizeof(molecule) is (already reduced) 3*sizeof(double), as the position vector. Still I get memory problems for larger amounts of time steps and number of molecules.
Thus, is there an additional possibility to decrease the amount of data? My current workflow is that I calculate all molecules first, store them, and then render them by using the data of each molecule for each step, the rendering is done with Irrlicht. (Maybe later with blender).
If the trajectories are smooth, you can consider compressing the data by storing only every Nth step and restoring the intermediate positions by interpolation.
If the time step is small, linear interpolation can do. Top quality can be provided by cubic splines. However, the computation of the spline coefficients is a global operation that you can only perform at the end and that requires extra storage (!), so you might prefer Cardinal splines, which can be built locally, from four consecutive positions.
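To illustrate the local construction, here is a sketch using the Catmull-Rom spline, a commonly used member of the cardinal spline family. The names `Vec3`, `catmullRom` and `interpolate` are made up for this example; p0..p3 are four consecutive stored positions and t in [0, 1] moves from p1 to p2.

```cpp
#include <cassert>

// Hypothetical position type (three doubles, as in the question).
struct Vec3 { double x, y, z; };

// Catmull-Rom interpolation of one coordinate between p1 (t = 0) and p2 (t = 1).
double catmullRom(double p0, double p1, double p2, double p3, double t) {
    const double t2 = t * t;
    const double t3 = t2 * t;
    return 0.5 * (2.0 * p1
                  + (-p0 + p2) * t
                  + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
                  + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
}

// Restores an intermediate position from four consecutive stored positions.
Vec3 interpolate(const Vec3& p0, const Vec3& p1, const Vec3& p2, const Vec3& p3, double t) {
    assert(t >= 0.0 && t <= 1.0);
    return Vec3{catmullRom(p0.x, p1.x, p2.x, p3.x, t),
                catmullRom(p0.y, p1.y, p2.y, p3.y, t),
                catmullRom(p0.z, p1.z, p2.z, p3.z, t)};
}
```

If every Nth step is stored, the frame k steps after a stored one would use t = k / N. The first and last segments need special handling (for example by repeating the end positions), which is left out here.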
You could gain a factor of 2 improvement by storing the positions in single precision rather than double - it will be sufficient for rendering, if not for the simulation.
But ultimately you will need to store the results in a file and render offline.
I am working on a project to simulate a hard sphere model of a gas. (Similar to the ideal gas model.)
I have written my entire project, and it is working. To give you an idea of what I have done, there is a loop which does the following: (Pseudo code)
Get_Next_Collision(); // Figure out when the next collision will occur
Step_Time_Forwards(); // Step to time of collision
Process_Collision(); // Process collision between 2 particles
(Repeat)
For a large number of particles (say N particles), O(N*N) checks must be made to figure out when the next collision occurs. It is clearly inefficient to follow the above procedure, because in the vast majority of cases, collisions between pairs of particles are unaffected by the processing of a collision elsewhere. Therefore it is desirable to have some form of priority queue which stores the next event for each particle. (Actually, since a collision involves 2 particles, only half that number of events will be stored, because if A collides with B then B also collides with A, and at exactly the same time.)
I am finding it difficult to write such an event/collision priority queue.
I would like to know if there are any Molecular Dynamics simulators which have been written and which I can go and look at the source code in order to understand how I might implement such a priority queue.
Having done a google search, it is clear to me that there are many MD programs which have been written, however many of them are either vastly too complex or not suitable.
This may be because they have huge functionality, including the ability to produce visualizations or ability to compute the simulation for particles which have interacting forces acting between them, etc.
Some simulators are not suitable because they do calculations for a different model, ie: something other than the energy conserving, hard sphere model with elastic collisions. For example, particles interacting with potentials or non-spherical particles.
I have tried looking at the source code for LAMMPS, but it's vast and I struggle to make any sense of it.
I hope that is enough information about what I am trying to do. If not I can probably add some more info.
A basic version of a locality-aware system could look like this:
1) Divide the universe into a cubic grid (where each cube has side A, and volume A^3), where each cube is sufficiently large, but sufficiently smaller than the total volume of the system. Each grid cube is further divided into 4 sub-cubes whose particles it can theoretically give to its neighboring cubes (and lend for calculations).
2) Each grid cube registers particles that are contained within it and is aware of its neighboring grid cubes' contained particles.
3) Define a particle's observable universe to have a radius of (grid dimension/2). Define timestep=(griddim/2) / max_speed. This postulates that particles from a maximum of four adjacent grid cubes can theoretically interact in that time period.
4) For every particle in every grid cube, run your traditional collision detection algorithm (with mini_timestep < timestep), where each particle is checked for possible collisions with other particles in its observable universe. Store the collisions in any structure sorted by time of collision, even just an array.
5) The first collision that happens within a mini_timestep resets your universe (and universe clock) to (last_time + time_to_collide), where time_to_collide < mini_timestep. I suppose that does not differ from your current algorithm. Important note: particles' absolute coordinates are updated, but which grid cube and sub-cube they belong to are not updated.
6) Repeat step 5 until the large timestep has passed. Then update the ownership of particles by each grid cube.
The advantage of this system is that for each time window, we have (assuming uniform distribution of particles) O(universe_particles * grid_size) instead of O(universe_particles * universe_size) checks for collision. In good conditions (depending on universe size, speed and density of particles), you could improve the computation efficiency by orders of magnitude.
I didn't understand how the 'priority queue' approach would work, but I have an alternative approach that may help you. It is what I think @Boyko Perfanov meant with 'make use of locality'.
You can sort the particles into 'buckets', such that you don't have to check each particle against each other ( O(n²) ). This uses the fact that particles can only collide if they are already quite close to each other. Create buckets that represent a small area/volume, and fill in all particles that are currently in the area/volume of the bucket ( O(n) worst case ). Then check all particles inside a bucket against the other particles in the bucket ( O(m*(n/m)²) average case, m = number of buckets ). The buckets need to be overlapping for this to work, or else you could also check the particles from neighboring buckets.
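A minimal sketch of the bucket idea, in 2D for brevity and using the "check neighboring buckets" variant rather than overlapping buckets. The names `Particle` and `closePairs` are made up for this example, and the bucket size is assumed to equal the interaction range, so any two particles closer than `range` are in the same or in adjacent buckets.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical type for illustration.
struct Particle { double x, y; };

using PairList = std::vector<std::pair<std::size_t, std::size_t>>;

// Returns index pairs (i < j) of particles closer than `range`.
PairList closePairs(const std::vector<Particle>& ps, double range) {
    assert(range > 0.0);
    using Cell = std::pair<long long, long long>;
    auto cellOf = [range](double v) { return static_cast<long long>(std::floor(v / range)); };

    // Fill the buckets: one pass over all particles.
    std::map<Cell, std::vector<std::size_t>> grid;
    for (std::size_t i = 0; i < ps.size(); ++i) {
        grid[Cell(cellOf(ps[i].x), cellOf(ps[i].y))].push_back(i);
    }

    PairList result;
    for (const auto& entry : grid) {
        const Cell& cell = entry.first;
        // Compare against the bucket itself and its 8 neighbors.
        for (long long dx = -1; dx <= 1; ++dx) {
            for (long long dy = -1; dy <= 1; ++dy) {
                auto it = grid.find(Cell(cell.first + dx, cell.second + dy));
                if (it == grid.end()) continue;
                for (std::size_t i : entry.second) {
                    for (std::size_t j : it->second) {
                        if (i >= j) continue;  // report each unordered pair once
                        const double ex = ps[i].x - ps[j].x;
                        const double ey = ps[i].y - ps[j].y;
                        if (ex * ex + ey * ey < range * range) result.push_back({i, j});
                    }
                }
            }
        }
    }
    return result;
}
```

A `std::map` keeps the sketch short. A flat array of buckets or a hash map would likely be faster for a bounded simulation box.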
Update: If the particles can travel for a longer distance than the bucket size, an obvious 'solution' is to decrease the time-step. However this will increase the running time of the algorithm again, and it works only if there is a maximum speed.
Another solution applicable even when there is no maximum speed, would be to create an additional 'high velocity' bucket. Since the velocity distribution is usually a gaussian curve, not many particles would have to be placed into that bucket, so the 'bucket approach' would still be more efficient than O(n²).
I have a collection of objects, each with a position (x, y).
These objects move randomly.
There could be thousands of them.
At any moment I want to be able to get the list of objects within a (constant) radius RAD of a position POS.
Edit - Context: It's for a game server, which would (utopically) have thousands of players. When a player moves/[makes an action], I want to send the update to the other players within the radius.
The easy way, every time I need the list :
near_objects;
foreach( objects o ) {
if( o.distance( POS ) < RAD )
near_objects.add( o )
}
I guess there are better/faster methods, but I don't know what to search.
Here are two suggestions.
Usually you compute distance using sqrt( (a.x-b.x)^2 + (a.y-b.y)^2 ), and the expensive part is computing sqrt(). If you compute RAD^2 once outside the loop and compare it to the expression inside the sqrt(), you can avoid computing sqrt() in the loop.
If most of the objects are far away, you can eliminate them by using
if( abs(a.x-b.x) > RAD ) continue;
if( abs(a.y-b.y) > RAD ) continue;
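Both suggestions combined might look like the sketch below. `Object` and `nearObjects` are made-up names for this example, and the function returns indices into the input vector rather than the objects themselves.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration.
struct Object { double x, y; };

// Returns the indices of all objects strictly closer than rad to pos.
std::vector<std::size_t> nearObjects(const std::vector<Object>& objects,
                                     const Object& pos, double rad) {
    assert(rad >= 0.0);
    const double rad2 = rad * rad;              // computed once, outside the loop
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        const double dx = objects[i].x - pos.x;
        const double dy = objects[i].y - pos.y;
        if (std::fabs(dx) > rad) continue;      // cheap rejection on x
        if (std::fabs(dy) > rad) continue;      // cheap rejection on y
        if (dx * dx + dy * dy < rad2) {         // squared comparison, no sqrt()
            result.push_back(i);
        }
    }
    return result;
}
```

This is still a linear scan over all objects per query. It only makes each step cheaper.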
I assume this is for some kind of MMO - I can't imagine 'thousands' of players in any other scenario. So your problem is actually more complex - you need to determine which players should receive the update about each player, so it turns into an O(n^2) problem and we're dealing with millions of checks. The first thing to consider is whether you really want to send updates based only on distance. You could divide your world into zones, keep a separate list of players for each zone, and check only within these lists, so for m zones we have O(m * (n/m)^2) = O(n^2/m). Obviously you also want to send updates to players in the same party and allow players near zone transition spots to know about each other (but make sure to keep that area small and unattractive for players so they don't just stand there). Also, considering a huge world and relatively slow player speed, you don't have to update that info all that often.
Also keep in mind that memory/cache usage is extremely important for performance and I was referring to list as an abstract term - you should keep data accessed in tight loops in arrays, but make sure elements aren't too big. In this case consider making a simple class containing basic player data for those intensive loops and keep a pointer to a bigger class containing other data.
And on a total side note - your question seems to be quite basic, yet you are trying to build an MMO, which is not only technically complicated, but also requires a ton of work. I believe, that pursuing a smaller, less ambitious project, that you will be actually able to complete would be more beneficial.
You could put your objects into an ordered data structure, indexed by their distance from POS. This is similar to a priority queue, but you don't want to push/pop items all the time.
You'd have to update an object's key whenever it moves to the new position. To iterate over the items within a given radius RAD, you'd simply iterate over the items of this ordered data structure as long as the distance (the key) is less than RAD.
I'm writing a mobile robotics application in C/C++ in Ubuntu and, at the moment, I'm using a laser sensor to scan the environment and detect collisions with objects when the robot moves.
This laser has a scan area of 270° and a maximum radius of 4000mm.
It is able to detect objects within this range and to report their distance from the sensor.
Each distance is in polar coordinates, so to get more readable data, I convert them from polar to Cartesian coordinates, print them to a text file and then plot them in MatLab to see what the laser has detected.
This picture shows a typical detection on cartesian coordinates.
Values are in meters, so 0.75 is 75 centimeters and 2 is two meters. Contiguous blue points are all the detected objects, while the points near (0,0) refer to the laser position and must be discarded. Blue points with y < 0 are produced because the laser scan area is 270°; I added the red rectangle (1.5 x 2 meters) to mark the region within which I want to implement the collision check.
So, I would like to detect in realtime if there are points (objects) inside that area and, if yes, call some functions. This is a little bit tricky, because, this check should be able to detect also if there are contiguous points to determine if the object is real or not (i.e. if it detects a point, then it should search the nearest point to determine if they compose an object or if it's only a point which may be a detection error).
This is the function I use to perform a single scan:
struct point pt[limit*URG_POINTS];
//..
for(i = 0; i < limit; i++){
for(j = 0; j < URG_POINTS; j++){
ang2 = kDeg2Rad*((j*240/(double)URG_POINTS)-120);
offset = 0.03; //it depends on sensor module [m]
dis = (double) dist[cnt] / 1000.0;
//THRESHOLD of RANGE
// if(dis > MAX_RANGE) dis = 0; //MAX RANGE = 4[m]
// if(dis < MIN_RANGE) dis = 0;
pt[cnt].x = dis * cos(ang2) * cos(ang1) + (offset*sin(ang1)); // <-- X POINTS
pt[cnt].y = dis * sin(ang2); // <-- Y POINTS
// pt[cnt].z = dis * cos(ang2) * sin(ang1) - (offset*cos(ang1)); <- I disabled 3D mapping at the moment
cnt++;
}
ang1 += diff;
}
After each single scan, pt contains all the detected points in x-y coordinates.
I'd like to do something like this:
perform a single scan, then at the end,
apply collisions check on each pt.x and pt.y
if you find a point in the inner region, then check for other near points, if yes, stop the robot;
if not or if no other near points are found, start another scan
I'd like to know how to easily check for objects (composed of more than one single point) inside the previously defined region.
Can you help me, please?
It seems very difficult for me :(
I don't think I can give a complete answer, but a few thoughts on where it might be possible to go.
What do you mean by real time? How long may any given algorithm take to run? And what processor does your program run on?
Filtering the points that are within your area of detection should be quite easy: just check whether abs(x) < 0.75 && y > 0 && y < 2. Furthermore, you should only consider points that are far enough away from the origin (the sensor position), so x^2 + y^2 > d^2 for some minimum distance d.
But that should be the trivial part.
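For completeness, the region test could be sketched like this. `Point2D`, `inRegion` and `minDist` are made-up names for this example; the 0.75 m half-width and 2 m depth come from the red rectangle in the question, and the minimum distance from the sensor has to be chosen by you.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical type for illustration (x, y in meters, as in the question's plot).
struct Point2D { double x, y; };

// True if p lies inside the 1.5 m x 2 m region in front of the sensor
// and is at least minDist away from the sensor at (0, 0).
bool inRegion(const Point2D& p, double minDist) {
    assert(minDist >= 0.0);
    const bool insideBox = std::fabs(p.x) < 0.75 && p.y > 0.0 && p.y < 2.0;
    const bool awayFromSensor = p.x * p.x + p.y * p.y > minDist * minDist;
    return insideBox && awayFromSensor;
}
```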
Detecting groups of points is the more interesting part. DBSCAN has proven to be a fairly good clustering algorithm for detecting 2-dimensional groups of points. The critical question here is whether DBSCAN is fast enough for real-time applications.
If not, you might have to think about optimizing the algorithm (you can bring its complexity down to O(n*log(n)) using some clever indexing structures).
Furthermore, it might be worth thinking about how you can incorporate the knowledge you have from your last iteration (assuming a high scan frequency, the data points should not change too much).
It might be worth looking at other robotics projects - I could imagine the problem of interpreting sensor data to construct information of the surroundings is a rather common one.
UPDATE
It is fairly difficult to give you good advice without knowing where you stumble on applying DBSCAN on your problem. But let me try to give you a step-by-step-guide how an algorithm may work:
For each datapoint you receive you check whether it is in the region you want to have observed. (The conditions I have given above should work).
If the datapoint is within the region you save it to some sort of list
After reading all data points you check if the list is empty. If so, everything is good. Otherwise we have to check if there are bigger groups of data points that you have to navigate around.
Now comes the more difficult part. You run DBSCAN on those points and try to find groups of points. Which parameters will work for the algorithm I do not know - that has to be tried. After that you should have some clusters of points. I'm not totally sure what you will do with the groups - one idea would be to detect the points of each group that have the minimum and maximum angle in polar coordinates. That way you could decide how far you have to turn your vehicle. Special care would have to be taken if two groups are so close that it will not be possible to navigate through the gap between them.
For the implementation of DBSCAN you could look at an existing implementation or just ask Google for help. It is a fairly common algorithm that has been coded thousands of times. For further optimizations concerning speed it might be helpful to create your own implementation. However, if one of the implementations you find seems to be usable, I would try that first before going all the way and implementing it on my own.
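To give an idea of the clustering step, here is a minimal brute-force DBSCAN sketch, O(n^2) and without any indexing structure. All names (`Point2D`, `regionQuery`, `dbscan`) are made up for this example, and the parameters `eps` (neighborhood radius) and `minPts` (minimum neighbors, counting the point itself) are the ones you would have to tune. Points labeled -1 are noise, i.e. isolated detections that are probably measurement errors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration.
struct Point2D { double x, y; };

const int kUnvisited = -2;
const int kNoise = -1;

// Indices of all points within eps of pts[i], including i itself.
std::vector<std::size_t> regionQuery(const std::vector<Point2D>& pts, std::size_t i, double eps) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < pts.size(); ++j) {
        const double dx = pts[i].x - pts[j].x;
        const double dy = pts[i].y - pts[j].y;
        if (dx * dx + dy * dy <= eps * eps) out.push_back(j);
    }
    return out;
}

// Returns one label per point: -1 for noise, otherwise a cluster id 0, 1, 2, ...
std::vector<int> dbscan(const std::vector<Point2D>& pts, double eps, std::size_t minPts) {
    assert(eps > 0.0 && minPts >= 1);
    std::vector<int> label(pts.size(), kUnvisited);
    int cluster = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (label[i] != kUnvisited) continue;
        std::vector<std::size_t> seeds = regionQuery(pts, i, eps);
        if (seeds.size() < minPts) { label[i] = kNoise; continue; }  // may become a border point later
        label[i] = cluster;
        for (std::size_t k = 0; k < seeds.size(); ++k) {             // seeds grows while we iterate
            const std::size_t q = seeds[k];
            if (label[q] == kNoise) label[q] = cluster;              // noise reached from a core point: border
            if (label[q] != kUnvisited) continue;
            label[q] = cluster;
            const std::vector<std::size_t> nb = regionQuery(pts, q, eps);
            if (nb.size() >= minPts) seeds.insert(seeds.end(), nb.begin(), nb.end());  // q is a core point
        }
        ++cluster;
    }
    return label;
}
```

You would pass in only the points that survived the region check. If any label is 0 or higher, there is at least one real object in the region and the robot should stop.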
If you stumble on specific problems while implementing the algorithm I would suggest creating a new question, as it is far away from this one and you might get more people that are willing to help you.
I hope things are a bit clearer now. If not please give the exact point that you have doubts about.