Algorithm for smoothing - c++

I wrote this code to smooth a curve.
It takes the 5 points next to a point, adds them up, and averages them.
/* Smoothing */
void smoothing(vector<Point2D> &a)
{
//How many neighbours to smooth
int NO_OF_NEIGHBOURS=10;
vector<Point2D> tmp=a;
for(int i=0;i<a.size();i++)
{
if(i+NO_OF_NEIGHBOURS+1<a.size())
{
for(int j=1;j<NO_OF_NEIGHBOURS;j++)
{
a.at(i).x+=a.at(i+j).x;
a.at(i).y+=a.at(i+j).y;
}
a.at(i).x/=NO_OF_NEIGHBOURS;
a.at(i).y/=NO_OF_NEIGHBOURS;
}
else
{
for(int j=1;j<NO_OF_NEIGHBOURS;j++)
{
a.at(i).x+=tmp.at(i-j).x;
a.at(i).y+=tmp.at(i-j).y;
}
a.at(i).x/=NO_OF_NEIGHBOURS;
a.at(i).y/=NO_OF_NEIGHBOURS;
}
}
}
But I get very high values for each point instead of values similar to the previous point. The shape is blown up a lot. What is going wrong in this algorithm?

What it looks like you have here is a bass-ackwards implementation of a finite impulse response (FIR) filter that implements a boxcar window function. Thinking about the problem in terms of DSP, you need to filter your incoming vector with NO_OF_NEIGHBOURS equal FIR coefficients that each have a value of 1/NO_OF_NEIGHBOURS. It is normally best to use an established algorithm rather than reinvent the wheel.
Here is a pretty scruffy implementation that I hammered out quickly that filters doubles. You can easily modify it to filter your data type. The demo filters a few cycles of a rising saw function (0, .25, .5, .75, 1) just for demonstration purposes. It compiles, so you can play with it.
#include <iostream>
#include <vector>
using namespace std;
class boxFIR
{
int numCoeffs; //MUST be > 0
vector<double> b; //Filter coefficients
vector<double> m; //Filter memories
public:
boxFIR(int _numCoeffs) :
numCoeffs(_numCoeffs)
{
if (numCoeffs<1)
numCoeffs = 1; //Must be > 0 or bad stuff happens
double val = 1./numCoeffs;
for (int ii=0; ii<numCoeffs; ++ii) {
b.push_back(val);
m.push_back(0.);
}
}
void filter(vector<double> &a)
{
double output;
for (int nn=0; nn<a.size(); ++nn)
{
//Apply smoothing filter to signal
output = 0;
m[0] = a[nn];
for (int ii=0; ii<numCoeffs; ++ii) {
output+=b[ii]*m[ii];
}
//Reshuffle memories
for (int ii = numCoeffs-1; ii!=0; --ii) {
m[ii] = m[ii-1];
}
a[nn] = output;
}
}
};
int main(int argc, const char * argv[])
{
boxFIR box(1); //If this is 1, then no filtering happens, use bigger ints for more smoothing
//Make a rising saw function for demo
vector<double> a;
a.push_back(0.); a.push_back(0.25); a.push_back(0.5); a.push_back(0.75); a.push_back(1.);
a.push_back(0.); a.push_back(0.25); a.push_back(0.5); a.push_back(0.75); a.push_back(1.);
a.push_back(0.); a.push_back(0.25); a.push_back(0.5); a.push_back(0.75); a.push_back(1.);
a.push_back(0.); a.push_back(0.25); a.push_back(0.5); a.push_back(0.75); a.push_back(1.);
box.filter(a);
for (int nn=0; nn<a.size(); ++nn)
{
cout << a[nn] << endl;
}
}
Up the number of filter coefficients using this line to see a progressively more smoothed output. With just 1 filter coefficient, there is no smoothing.
boxFIR box(1);
The code is flexible enough that you can even change the window shape if you like. Do this by modifying the coefficients defined in the constructor.
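As a rough sketch of what changing the window shape could look like, the snippet below builds triangular coefficients (normalised to sum to 1) and applies them with a plain causal FIR loop. The names `triangularCoeffs` and `firFilter` are made up for this illustration and are not part of the class above; samples before the start of the signal are assumed to be 0, as with the zero-initialised memories.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: triangular coefficients, e.g. 1,2,3,2,1 (scaled) for n = 5.
std::vector<double> triangularCoeffs(int n)
{
    assert(n > 0);
    std::vector<double> b(n);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        b[i] = std::min(i + 1, n - i);  // rises to the middle, then falls
        sum += b[i];
    }
    for (double &c : b) c /= sum;       // normalise so a constant signal is unchanged
    return b;
}

// Causal FIR filter with arbitrary coefficients b; input before index 0 counts as 0.
std::vector<double> firFilter(const std::vector<double> &x, const std::vector<double> &b)
{
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t k = 0; k < b.size() && k <= n; ++k)
            y[n] += b[k] * x[n - k];
    return y;
}
```

Passing a vector of equal values 1/N as `b` reproduces the boxcar case, so the window shape is the only thing that changes.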
Note: This will give a slightly different output to your implementation, as this is a causal filter (it only depends on the current sample and previous samples). Your implementation is not causal, as it looks ahead in time at future samples to make the average, and that is why you need the conditional statements for the situation where you are near the end of your vector. If you want output like what you are attempting with your filter, run your vector through this algorithm in reverse (this works fine so long as the window function is symmetrical). That way you can get similar output without the nasty conditional part of the algorithm.

In the following block:
for(int j=0;j<NO_OF_NEIGHBOURS;j++)
{
a.at(i).x=a.at(i).x+a.at(i+j).x;
a.at(i).y=a.at(i).y+a.at(i+j).y;
}
for each neighbour you add a.at(i)'s x and y to the neighbour's values, so when j is 0 the point is added to itself.
If I understand correctly, it should be something like this:
for(int j=0;j<NO_OF_NEIGHBOURS;j++)
{
a.at(i).x += a.at(i+j+1).x;
a.at(i).y += a.at(i+j+1).y;
}

Filtering is good for 'memory' smoothing. This is the reverse pass for learnvst's answer, to prevent phase distortion:
for (int i = a.size(); i > 0; --i)
{
// Apply smoothing filter to signal
output = 0;
m[m.size() - 1] = a[i - 1];
for (int j = numCoeffs; j > 0; --j)
output += b[j - 1] * m[j - 1];
// Reshuffle memories
for (int j = 0; j != numCoeffs - 1; ++j) // stop one short so m[j + 1] stays in bounds
m[j] = m[j + 1];
a[i - 1] = output;
}
More about zero-phase distortion FIR filter in MATLAB: http://www.mathworks.com/help/signal/ref/filtfilt.html
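For reference, a minimal sketch of the forward-then-backward idea with a boxcar window. It is a simplified stand-in rather than a port of filtfilt: `boxcarCausal` and `boxcarZeroPhase` are names invented here, samples outside the signal are assumed to be 0, and no edge compensation is attempted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Causal boxcar average over the current and the previous (taps - 1) samples.
// Samples before the start are treated as 0.
void boxcarCausal(std::vector<double> &a, int taps)
{
    assert(taps > 0);
    const std::vector<double> in = a;  // read from a copy, write into a
    for (std::size_t n = 0; n < in.size(); ++n) {
        double sum = 0.0;
        for (int k = 0; k < taps && static_cast<std::size_t>(k) <= n; ++k)
            sum += in[n - k];
        a[n] = sum / taps;
    }
}

// Forward pass, then the same filter over the reversed signal, then reverse back.
// The delay of the first pass is cancelled by the second one.
void boxcarZeroPhase(std::vector<double> &a, int taps)
{
    boxcarCausal(a, taps);
    std::reverse(a.begin(), a.end());
    boxcarCausal(a, taps);
    std::reverse(a.begin(), a.end());
}
```

Feeding in a single spike is an easy way to check the result: the smoothed bump should come out symmetric around the spike's original position.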

The current value of the point is used twice: once because you use += and once when j==0. So you are building the sum of, e.g., 6 points but only dividing by 5. This problem is in both the IF and the ELSE case. Also, you should check that the vector is long enough; otherwise your ELSE case will read at negative indices.
The following is not a problem in itself, just a thought: have you considered using an algorithm that touches every point only twice? You can store a temporary x-y value (initialized to be identical to the first point); then, as you visit each point, you add the new point in and subtract the oldest point if it is further than your NEIGHBOURS back. You keep this "running sum" updated for every point and store its value divided by the NEIGHBOURS number into the new point.
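A sketch of the running-sum idea, with two deviations that are my own assumptions: the sum starts at zero instead of at the first point, and near the start it divides by the number of points actually in the window. `Point2D` is not shown in the question, so a plain struct with `x` and `y` is assumed, and `smoothRunningSum` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2D { double x; double y; };  // assumed layout; the question does not show it

// Trailing moving average using a running sum: every point is added once and
// subtracted once, so the cost does not grow with the window size.
std::vector<Point2D> smoothRunningSum(const std::vector<Point2D> &in, std::size_t window)
{
    assert(window > 0);
    std::vector<Point2D> out(in.size());
    double sumX = 0.0, sumY = 0.0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        sumX += in[i].x;
        sumY += in[i].y;
        if (i >= window) {  // drop the point that just left the window
            sumX -= in[i - window].x;
            sumY -= in[i - window].y;
        }
        // Near the start fewer than `window` points exist, so divide by the real count
        const double count = static_cast<double>(i < window ? i + 1 : window);
        out[i].x = sumX / count;
        out[i].y = sumY / count;
    }
    return out;
}
```

Writing into a separate output vector also avoids the original problem of averaging values that have already been smoothed.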

You add the point to itself when you should be taking the neighbouring points; just offset the index by 1:
for(int j=0;j<NO_OF_NEIGHBOURS;j++)
{
a.at(i).x += a.at(i+j+1).x;
a.at(i).y += a.at(i+j+1).y;
}

This works fine for me:
for (i = 0; i < lenInput; i++)
{
float x = 0;
for (int j = -neighbours; j <= neighbours; j++)
{
x += input[(i + j < 0) || (i + j >= lenInput) ? i : i + j]; // out-of-range neighbours fall back to the current sample
}
output[i] = x / (neighbours * 2 + 1);
}

Related

How to access a vector inside a vector?

So I have a vector of vectors of type double. I basically need to be able to set 360 numbers in cosY and then put those 360 numbers into cosineY[0], then get another 360 numbers that are calculated with a different a, and put them into cosineY[1]. Technically my vector is going to be cosineY[a]. I then need to be able to take out just the cosY for an a that I specify...
My code is saying this:
for (int a = 0; a < 8; a++)
{
for (int n=0; n <= 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
which I hope is the correct way of actually setting it.
But then I need to take the cosY for an a that I specify and calculate another vector of 360 values, which will again be stored in a vector of vectors.
Right now I've got:
for (int a = 0; a < 8; a++)
{
for (int n = 0; n <= 360; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
The vectorOfY is basically the amplitude of an input wave. What I am doing is trying to create a cosine wave with different frequencies (a). I am then calculating the product of the input and the cosine wave at each frequency. I need to be able to access these 360 points for each frequency later on in the program, and right now I also need to calculate the sum of all elements in cosProductPt for every frequency (stored in CosProductY) and store it in a vector dotProductCos[a].
I've been trying to work it out, but I don't know how to access all the elements in a vector of vectors to add them. I've been trying to do this for the whole day without any results. Right now I know so little that I don't even know how I would display or access a vector inside a vector, but I need that access for the addition.
Thank you for your help.
for (int a = 0; a < 8; a++)
{
for (int n=0; n < 360; n++) // note traded in <= for <. I think you had an off by one
// error here.
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
This is sound so long as cosY has been pre-allocated to contain at least 360 elements. You could write
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(360); // strongly consider replacing the 360 with a well-named
// constant
for (int a = 0; a < 8; a++) // same with that 8
{
for (int n=0; n < 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
for example, but this hangs on to cosY longer than you need to and could cause problems later, so I'd probably scope cosY by throwing the above code into a function.
const int NumDegrees = 360; // named constants replacing the magic numbers
const int NumVectors = 8;
std::vector<std::vector<double>> buildStageOne(std::vector<double> &vectorOfY)
{
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(NumDegrees);
for (int a = 0; a < NumVectors; a++)
{
for (int n=0; n < NumDegrees; n++)
{
cosY[n] = cos(a*vectorOfY[n]); // take radians into account if needed.
}
cosineY.push_back(cosY);
}
return cosineY;
}
This looks horrible, returning the vector by value, but the vast majority of compilers will take advantage of Copy Elision or some other sneaky optimization to eliminate the copying.
Then I'd do almost the exact same thing for the second step.
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch row, copied in by push_back
for (int a = 0; a < NumVectors; a++)
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosineY[a][n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
But we can make a couple of optimizations:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch row, copied in by push_back
for (int a = 0; a < NumVectors; a++)
{
// why risk constantly looking up cosineY[a]? grab it once and cache it
std::vector<double> & cosY = cosineY[a]; // note the reference
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
And the next is kind of an extension of the first:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees);
for (std::vector<double> & cosY: cosineY) // range-based for: no index a, no cosineY[a] lookup
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
We could do the same range-based for trick for the for (int n = 0; n < NumDegrees; n++), but since we are iterating multiple arrays here it's not all that helpful.
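The question also asks how to add up all elements of each inner vector. One possible sketch uses `std::accumulate` per row; `sumEachRow` and `elementAt` are illustrative names that do not appear in the question, and `elementAt` only exists to show the two-index access pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum every inner vector of a vector of vectors: totals[a] is the sum of rows[a].
std::vector<double> sumEachRow(const std::vector<std::vector<double>> &rows)
{
    std::vector<double> totals;
    totals.reserve(rows.size());
    for (const std::vector<double> &row : rows)
        totals.push_back(std::accumulate(row.begin(), row.end(), 0.0));
    return totals;
}

// Read one element: the first index picks the inner vector, the second the value in it.
double elementAt(const std::vector<std::vector<double>> &rows, std::size_t a, std::size_t n)
{
    assert(a < rows.size() && n < rows[a].size());
    return rows[a][n];
}
```

With the names from the question, this would be used along the lines of `std::vector<double> dotProductCos = sumEachRow(CosProductY);`.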

Evaluating variable if statement in C++

I am trying to evaluate a statistics problem via a Monte Carlo method. In this problem I am generating a random number and comparing it to a fixed probability number stored in a vector array titled comms_reliability. Assuming there is only one value in the vector array, I compare the random number with the probability and tally the result if the random number is greater than the reliability number. However, the vector array could also have two values, in which case I produce two random numbers and compare them to the two reliability numbers. If both random numbers are bigger than the reliability numbers, I tally the scenario. Theoretically this could continue for as many values in the vector array as I want. However, through a failure of imagination I only know how to code this where the for statement is contained in multiple if statements, one for each possible scenario. In this implementation I have to copy the same lines of code multiple times, which also limits the comms_reliability array sizes that can be evaluated to however many times I have copied those lines. How can I do this with only one if statement? An example of how I currently have it coded is shown below.
int main(int argc, const char * argv[]) {
int sample_size = 1000000;
std::vector<float> comms_reliability = {0.6,0.6};
float tally = 0.0;
// rang() = random number generator
// if statement for comms_reliability array of size 1
if (comms_reliability.size() == 1) {
for (int i = 0; i < sample_size; i++){
if (rang() > comms_reliability[0]) tally = tally + 1.0;
}
}
// if statement 2 for comms_reliability array of size 2
if (comms_reliability.size() == 2) {
for (int i = 0; i < sample_size; i++){
if (rang() > comms_reliability[0] && rang() > comms_reliability[1]) tally = tally + 1.0;
}
}
// if statement 3 for comms_reliability array of size 3
if (comms_reliability.size() == 3) {
for (int i = 0; i < sample_size; i++){
if (rang() > comms_reliability[0] && rang() > comms_reliability[1] &&
rang() > comms_reliability[2]) tally = tally + 1.0;
}
}
} // end of main
If I understand you correctly you want to make sure that all elements of comms_reliability satisfy some criterion (namely being less than rang()) for each sample.
So make a loop over all elements and test each, or just use std::all_of:
// Lambda function used to test a single comm_reliability
auto is_reliable = [] (float r) { return rang() > r; };
// Iterate over your samples
for (int i = 0; i < sample_size; ++i) {
// If all elements satisfy your criterion ...
if (std::all_of(std::begin(comms_reliability),
std::end(comms_reliability),
is_reliable)) {
// .. perform your action
tally += 1.0;
}
}
Instead of the lambda function you could also use a normal function defined somewhere before:
bool is_reliable(float r) {
return rang() > r;
}
Note: Try to improve your variable/function naming.
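To make the std::all_of approach concrete, here is a sketch that can be compiled on its own. Since the question does not show `rang()`, a `std::mt19937` with a uniform distribution on [0, 1) is assumed in its place, and `estimateAllExceed` is a made-up function name.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Fraction of samples in which every random draw exceeds its reliability value.
// std::mt19937 plus uniform_real_distribution stands in for the question's rang().
double estimateAllExceed(const std::vector<float> &comms_reliability,
                         int sample_size, std::mt19937 &gen)
{
    assert(sample_size > 0);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    int tally = 0;
    for (int i = 0; i < sample_size; ++i) {
        // One fresh random number per element; all_of stops at the first failure
        const bool all_exceed = std::all_of(
            comms_reliability.begin(), comms_reliability.end(),
            [&](float r) { return dist(gen) > r; });
        if (all_exceed) ++tally;
    }
    return static_cast<double>(tally) / sample_size;
}
```

For two values of 0.6 the estimate should land near 0.4 * 0.4 = 0.16, and the same function works unchanged for any number of entries in the vector.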
Use a flag to keep track of the result:
int main(int argc, const char * argv[]) {
int sample_size = 1000000;
std::vector<float> comms_reliability = {0.6,0.6};
float tally = 0.0;
// rang() = random number generator
for (int i = 0; i < sample_size; i++){
bool flag = true;
for(int j = 0; j < comms_reliability.size(); j++)
{
if (rang() <= comms_reliability[j])
{
flag = false;
break;
}
}
tally = flag ? tally + 1.0 : tally;
}
} // end of main

Find similar distances between all values in vector and subset them

Given is a vector of double values. I want to find which pairs of elements of this vector are separated by similar distances. In the best case, the result is a vector of subsets of the original values, where each subset should have at least n members.
//given
vector<double> values = {1,2,3,4,8,10,12}; //with simple values as example
//some algorithm
//desired result as:
vector<vector<double> > subset;
//in case of above example I would expect some result like:
//subset[0] = {1,2,3,4}; //distance 1
//subset[1] = {8,10,12}; //distance 2
//subset[2] = {4,8,12}; // distance 4
//subset[3] = {2,4}; //also distance 2 but not connected with subset[1]
//subset[4] = {1,3}; //also distance 2 but not connected with subset[1] or subset[3]
//many others if n is just 2. If n is 3 (normally the minimum) these small subsets should be excluded.
This example is simplified as the distances of integer numbers could be iterated and tested for the vector which is not the case for double or float.
My idea so far
I thought of something like calculating the distances and storing them in a vector. Creating a difference distance matrix and thresholding this matrix for some tolerance for similar distances.
//Calculate distances: result is a vector
vector<double> distances;
for (int i = 0; i < values.size(); i++)
for (int j = 0; j < values.size(); j++)
{
if (i >= j)
continue;
distances.push_back(abs(values[i] - values[j]));
}
//Calculate difference of these distances: result is a matrix
Mat DiffDistances = Mat::zeros(Size(distances.size(), distances.size()), CV_32FC1);
for (int i = 0; i < distances.size(); i++)
for (int j = 0; j < distances.size(); j++)
{
if (i >= j)
continue;
DiffDistances.at<float>(i,j) = abs(distances[i] - distances[j]);
}
//threshold this matrix with some tolerance in difference distances
threshold(DiffDistances, DiffDistances, maxDistTol, 255, CV_THRESH_BINARY_INV);
//get points with similar distances
vector<Point> DiffDistancePoints;
findNonZero(DiffDistances, DiffDistancePoints);
At this point I get stuck with finding the original values corresponding to my similar distances. It should be possible to find them, but it seems very complicated to trace back the indices and I wonder if there isn't an easier way to solve the problem.
Here is a solution that works as long as there are no branches, meaning that there are no values closer together than 2*threshold. That is the valid neighbor region, because neighboring bonds should differ by less than the threshold, if I understood @Phann correctly.
The solution is definitely neither the fastest nor the nicest possible solution, but you might use it as a starting point:
#include <iostream>
#include <vector>
#include <algorithm>
int main(){
std::vector< double > values = {1,2,3,4,8,10,12};
const unsigned int nValues = values.size();
std::vector< std::vector< double > > distanceMatrix(nValues - 1);
// The distanceMatrix has a triangular shape
// First vector contains all distances to value zero
// Second row all distances to value one for larger values
// nth row all distances to value n-1 except those already covered
std::vector< std::vector< double > > similarDistanceSubsets;
double threshold = 0.05;
std::sort(values.begin(), values.end());
for (unsigned int i = 0; i < nValues-1; ++i) {
distanceMatrix.at(i).resize(nValues-i-1);
for (unsigned j = i+1; j < nValues; ++j){
distanceMatrix.at(i).at(j-i-1) = values.at(j) - values.at(i);
}
}
for (unsigned int i = 0; i < nValues-1; ++i) {
for (unsigned int j = i+1; j < nValues; ++j) {
std::vector< double > thisSubset;
double thisDist = distanceMatrix.at(i).at(j-i-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
double minDist = thisDist - threshold;
double maxDist = thisDist + threshold;
thisSubset.push_back(values.at(i));
thisSubset.push_back(values.at(j));
//Indicate that this is already clustered
distanceMatrix.at(i).at(j-i-1) = -1;
unsigned int lastIndex = j;
for (unsigned int k = j+1; k < nValues; ++k) {
thisDist = distanceMatrix.at(lastIndex).at(k-lastIndex-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
// Check if you found a new valid pair
if ((thisDist > minDist) && (thisDist < maxDist)){
// Update the valid distance interval
minDist = thisDist - threshold;
maxDist = thisDist + threshold;
// Add the newly found point
thisSubset.push_back(values.at(k));
// Indicate that this is already clustered
distanceMatrix.at(lastIndex).at(k-lastIndex-1) = -1;
// Continue the search from here
lastIndex = k;
}
}
if (thisSubset.size() > 2) {
similarDistanceSubsets.push_back(thisSubset);
}
}
}
for (unsigned int i = 0; i < similarDistanceSubsets.size(); ++i) {
for (unsigned int j = 0; j < similarDistanceSubsets.at(i).size(); ++j) {
std::cout << similarDistanceSubsets.at(i).at(j);
if (j != similarDistanceSubsets.at(i).size()-1) {
std::cout << " ";
}
else {
std::cout << std::endl;
}
}
}
}
The idea is to precompute the distances and then, for every pair of values, starting from the smallest value and its larger neighbors, check whether there is another valid pair above it. If so, these are all collected in a subset, which is added to the subset vector. For every new value the valid neighbor region has to be updated to ensure that neighboring distances differ by less than the threshold. Afterwards, the program continues with the next smallest value and its larger neighbors, and so on.
Here is an algorithm which is slightly different from yours, which is O(n^3) in the length n of the vector - not very efficient.
It is based on the premise that you want to have subsets of at least size 2. So what you can do is consider all the two-element subsets of the vector, then find all other elements that also match.
So given a function
std::vector<int> findSubset(std::vector<int> v, int baseValue, int distance) {
// Find the subset of all elements in v that differ by a multiple of
// distance from the base value
}
you can do
std::vector<std::vector<int>> findSubsets(std::vector<int> v) {
std::vector<std::vector<int>> subsets; // collects one candidate subset per pair
for(int i = 0; i < v.size(); i++) {
for(int j = i + 1; j < v.size(); j++) {
subsets.push_back(findSubset(v, v[i], abs(v[i] - v[j])));
}
}
return subsets;
}
The only remaining problem is keeping track of the duplicates; maybe you can keep a hashed list of (baseValue % distance, distance) pairs for all the subsets you have already found.
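One way the stub could be filled in, as a sketch only: it follows the integer "multiple of distance" rule from the comment, adds a `minSize` parameter that is my own assumption, and does not remove duplicate subsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Elements of v whose difference from baseValue is a whole multiple of distance.
std::vector<int> findSubset(const std::vector<int> &v, int baseValue, int distance)
{
    assert(distance > 0);
    std::vector<int> subset;
    for (int x : v)
        if ((x - baseValue) % distance == 0)
            subset.push_back(x);
    return subset;
}

// Try every pair as (base value, spacing) and keep subsets with at least minSize members.
// Duplicates are NOT filtered out here.
std::vector<std::vector<int>> findSubsets(const std::vector<int> &v, std::size_t minSize)
{
    std::vector<std::vector<int>> subsets;
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j) {
            const int distance = std::abs(v[i] - v[j]);
            if (distance == 0) continue;  // equal values give no usable spacing
            std::vector<int> s = findSubset(v, v[i], distance);
            if (s.size() >= minSize) subsets.push_back(s);
        }
    return subsets;
}
```

Note that the multiple-of-distance rule also groups values that are not adjacent in the chain (base 8 with distance 2 picks up 2 and 4 as well), and it does not carry over to double values without a tolerance.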

creating matrix using 2-d vector c++

I'll try to explain the problem I have.
I need a 2-D matrix with 233 rows and 233 columns.
for(int i = 0; i < dimension;i++)
{
for(int j = 0 ; j < dimension;j++)
{
distance3 = sqrt(pow((apointCollection2[j].x - apointCollection[i].x1), 2) + pow((apointCollection2[j].y - apointCollection[i].y1), 2));
if (distance3 < Min)
{
Min = distance3;
station = busStation;
}
distance2 = sqrt(pow((apointCollection2[j].x - apointCollection[i].x2), 2) + pow((apointCollection2[j].y - apointCollection[i].y2), 2));
if (distance2 < Min2)
{
Min2 = distance2;
station1 = busStation;
}
}
}
So I find the minimum distance and the two stations with the minimum distance. The first station (station) corresponds to the row and the second one (station1) corresponds to the column. Then I need to increment the number of people this pair (it can be called a route) has.
Then I need to find station and station1 again after the second iteration, and if they are the same I just need to increment the people count and not add the same stations to the vector.
Here is another variant I thought of:
I create a 233x233 2-D vector with 0 in each cell.
vector< vector<int> > m;
cout << "Filling matrix with test numbers.";
m.resize(233);
for (int i = 0; i < 233; i++)
{
m[i].resize(233);
for (int j = 0; j < 233; j++)
{
}
}
After the loop above i decided to create the following where i find the min distance :
Here i want to increment somehow:
m[station][station1] = person;
if (find(m.begin(), m.end(), station, station1))
{
person++;
}
else
{
m[station][station1] = person;
}
I get an error at "find" because there is no matching instance of the function template. Another problem is that I am not adding values to the vector, and I also get an error when I try to add them.
This should be very easy to do; I just need to find out the logic I should follow.
Thanks in advance
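A possible direction, sketched under the assumption that station and station1 are zero-based indices below 233: if every cell starts at 0, no search with find is needed, because incrementing the cell covers both the first visit and any repeat. `makeRouteCounts` and `addPerson` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dimension x dimension table of counters, every cell starting at 0.
std::vector<std::vector<int>> makeRouteCounts(std::size_t dimension)
{
    return std::vector<std::vector<int>>(dimension, std::vector<int>(dimension, 0));
}

// Record one person travelling from `station` (row) to `station1` (column).
// A route seen for the first time goes from 0 to 1; a repeated route just counts up.
void addPerson(std::vector<std::vector<int>> &m, std::size_t station, std::size_t station1)
{
    assert(station < m.size() && station1 < m[station].size());
    ++m[station][station1];
}
```

After the loop over all people, `m[station][station1]` holds the number of people on that route, and cells that are still 0 are routes nobody used.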

Fast access to Rcpp::List elements

I have a data set that I really want to work with as a 3D array. Rather than deal with an attempt to get an R array into a RcppArmadillo Cube, which I'm not sure would work (?), I'm sending in a list of matrices. My problem, however, is that the list is of large matrices and I want to be able to loop over the 3rd dimension in the middle of loops over rows or columns. With medium size matrices (list of 20 matrices of size 50,000x5), flattening the list into one long array gets me my result in less than a second.
I'd prefer to avoid copying the data in order to accommodate larger matrices. But using as< NumericMatrix >(list_obj[t]) inside a loop over the rows makes the function take several minutes at least. An example of my code using as<>, which is incredibly slow, is below. dat is the list sent into the function. steps is an int sent into the function.
int T = dat.size();
int N = as<NumericMatrix>(dat[0]).nrow();
int M = as<NumericMatrix>(dat[0]).ncol();
// Temp vals
double top, bot;
// Output vector
NumericVector out(M);
// Loop through each signal
for (int j=0; j<M; j++) {
// Reset numerator and denominator
top = 0;
bot = 0;
// Loop through each time dimension
for (int tm = 0; tm < (T - steps); tm++) {
// Loop through each row
for (int i = 0; i < N; i++) {
// Check if entry is positive
if (as<NumericMatrix>(dat[tm])(i, j) > 0) {
// Increment denominator
bot += 1.0;
// Compute future product
top = 1.0;
for (int k = 1; k <= steps; k++) {
if (as<NumericMatrix>(dat[tm + k])(i, j) == 0) {
top = 0.0;
break;
}
}
}
}
out(j) = top / bot;
}
}
Is there a fast way to do this without flattening the matrix and requiring a full copy of the potentially large data?