How to calculate local outlier detection (LOF) - data-mining

I want to have the correct calculation formula for the local outlier factor (LOF) according to the publication of Breunig & Sander.
I have found this formula:
LOF = (average of the lrd values of the objects located in the MinPts neighborhood) divided by the lrd of the suspected outlier, that is, the centroid of the MinPts neighborhood.
I doubt that this is correct. Somebody said that the LOF is calculated as follows:
LOF = (average of the lrd values of the objects located in the MinPts neighborhood, including the lrd of the centroid, i.e. the suspected outlier) divided by the lrd of the suspected outlier, that is, the centroid of the MinPts neighborhood.
What is the correct answer?

When in doubt, read the original source!
Because you want to use the definition of LOF by Breunig & Sander themselves, not by "somebody". And if I'd now say, e.g., "LOF is e^(I*pi)", would you next post the question "somebody on stack exchange said this, and someone else said something else"?
Go, and read the original source, if you need to verify a definition.

Thank you for your answer. I have read the original publication of Breunig et al., but I must confess that I did not understand which formula to use. Meanwhile I have got the answer. To calculate the LOF of an object, you take the average of the lrd of all objects contained in its MinPts neighborhood (without the suspected outlier object, the centroid) and divide it by the lrd of the object suspected to be an outlier (the centroid). The original publication has, by the way, a somewhat confusing definition of MinPts resp. k.
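To make the calculation concrete, here is a minimal sketch of that last step. It assumes the lrd values have already been computed; the function name localOutlierFactor and its parameters are made up for illustration and are not taken from any library. The neighbor list does not contain the suspected outlier itself, matching the definition of the MinPts neighborhood in the paper.

```cpp
#include <cassert>
#include <vector>

// LOF of one object: the mean lrd of the objects in its MinPts neighborhood
// divided by the object's own lrd. The neighborhood excludes the object itself.
double localOutlierFactor(double lrdOfObject,
                          const std::vector<double>& lrdOfNeighbors) {
    assert(lrdOfObject > 0.0 && !lrdOfNeighbors.empty());
    double sum = 0.0;
    for (double lrd : lrdOfNeighbors) {
        sum += lrd;
    }
    const double meanNeighborLrd = sum / static_cast<double>(lrdOfNeighbors.size());
    return meanNeighborLrd / lrdOfObject;
}
```

A value close to 1 means the object is about as dense as its neighbors; for example, an object with lrd 0.5 whose three neighbors all have lrd 1.0 gets LOF 2.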

Where to alter reference code to extract motion vectors from HEVC encoded video

So this question has been asked a few times, but I think my C++ skills are too deficient to really appreciate the answers. What I need is a way to start with an HEVC encoded video and end with CSV that has all the motion vectors. So far, I've compiled and run the reference decoder, everything seems to be working fine. I'm not sure if this matters, but I'm interested in the motion vectors as a convenient way to analyze motion in a video. My plan at first is to average the MVs in each frame to just get a value expressing something about the average amount of movement in that frame.
The discussion here tells me about the TComDataCU class methods I need to interact with to get the MVs and talks about how to iterate over CTUs. But I still don't really understand the following:
1) what information is returned by these MV methods and in what format? With my limited knowledge, I assume that there are going to be something like 7 values associated with the MV: the frame number, an index identifying a macroblock in that frame, the size of the macroblock, the x coordinate of the macroblock (probably the top left corner?), the y coordinate of the macroblock, the x coordinate of the vector, and the y coordinate of the vector.
2) where in the code do I need to put new statements that save the data? I thought there must be some spot in TComDataCU.cpp where I can put lines in that print the data I want to a file, but I'm confused when the values are actually determined and what they are. The variable declarations look like this:
// create motion vector fields
m_pCtuAboveLeft = NULL;
m_pCtuAboveRight = NULL;
m_pCtuAbove = NULL;
m_pCtuLeft = NULL;
But I can't make much sense of those names. AboveLeft, AboveRight, Above, and Left seem like an asymmetric mix of directions?
Any help would be great! I think I would most benefit from seeing some example code. An explanation of the variables I need to pay attention to would also be very helpful.
In TEncSlice.cpp, you can access every CTU in the loop
for( UInt ctuTsAddr = startCtuTsAddr; ctuTsAddr < boundingCtuTsAddr; ++ctuTsAddr )
and then pick out a specific CTU by using its address:
pCtu(TComDataCU class)->getCtuRsAddr().
After that,
pCtu->getCUMvField()
will return the CTU's motion vector field. You can extract the MVs of the CTU from that object.
For example,
TComMvField->getMv(g_auiRasterToZscan[y * 16 + x])->getHor()
returns the horizontal component of the MV of a specific 4x4 block.
You can save this data after m_pcCuEncoder->compressCtu( pCtu ), because compressCtu determines all the data of the CTU, such as the CU partitioning and the motion estimation.
I hope this information helps you and other people!
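Once the vectors have been dumped, the per-frame averaging mentioned in the question could look roughly like the sketch below. The MvRecord struct and the function name are hypothetical and are not part of the HM reference software; they only stand in for whatever you write out per 4x4 block. Keep in mind that HEVC luma motion vectors are expressed in quarter-sample units, so the averages come out in those units as well.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <vector>

// Hypothetical record for one 4x4 block, filled from the values dumped per CTU.
struct MvRecord {
    int frame;  // frame (picture) number
    int x;      // block position in pixels
    int y;
    int mvHor;  // horizontal MV component as returned by getHor()
    int mvVer;  // vertical MV component as returned by getVer()
};

// Mean motion vector magnitude per frame, keyed by frame number.
std::map<int, double> meanMagnitudePerFrame(const std::vector<MvRecord>& records) {
    std::map<int, double> sum;
    std::map<int, int> count;
    for (const MvRecord& r : records) {
        assert(r.frame >= 0);
        sum[r.frame] += std::hypot(static_cast<double>(r.mvHor),
                                   static_cast<double>(r.mvVer));
        ++count[r.frame];
    }
    for (auto& entry : sum) {
        entry.second /= count[entry.first];  // turn each sum into a mean
    }
    return sum;
}
```

Writing the resulting map out as CSV is then a matter of printing one "frame,mean" line per entry.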

About generalized hough transform code

I was looking for an implementation of the Generalized Hough Transform, and then I found this website, which showed me a complete implementation of GHT.
I can totally understand how the algorithm processes except this:
Vec2i referenceP = Vec2i(id_max[0]*rangeXY+(rangeXY+1)/2, id_max[1]*rangeXY+(rangeXY+1)/2);
which calculates the reference point of the object from the position of the maximum value in the Hough space, then multiplies by rangeXY to get back to the corresponding position in the original image. (rangeXY is the dimension in pixels of the squares into which the image is divided.)
I edited the code to
Vec2i referenceP = Vec2i(id_max[0]*rangeXY, id_max[1]*rangeXY);
and I got a different reference point. When I then show all the edge points in the image, they apparently do not fit the shape.
I just cannot figure out what the term (rangeXY+1)/2 means.
Can anyone who has implemented this code, or who is familiar with the rationale of GHT, tell me what this term means? Thanks~
I am familiar with the classic Hough Transform, though not with the generalised one. However, I believe you give enough information in your question for me to answer it without being familiar with the algorithm in question.
(rangeXY+1)/2 is simply integer division by 2 with rounding. For instance (4+1)/2 gives 2 while (5+1)/2 gives 3 (2.5 rounds up). Now, since rangeXY is the side of a square block of pixels and id_max is the position (index) of such a block, then id_max[dim]*rangeXY+(rangeXY+1)/2 gives the position of the central pixel in that block.
On the other hand, when you simplified the expression to id_max[dim]*rangeXY, you were getting the position of the top-left rather than the central pixel.
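As a small illustration of the two expressions, here is a sketch for a single axis. The helper names cellCentre and cellOrigin are made up for this example and do not appear in the code from the question.

```cpp
#include <cassert>

// Pixel coordinate of the (rounded) centre of accumulator cell `index`
// along one axis, where every cell spans `rangeXY` pixels.
int cellCentre(int index, int rangeXY) {
    assert(index >= 0 && rangeXY > 0);
    return index * rangeXY + (rangeXY + 1) / 2;  // integer division
}

// Pixel coordinate of the first (top or left) pixel of the same cell.
int cellOrigin(int index, int rangeXY) {
    assert(index >= 0 && rangeXY > 0);
    return index * rangeXY;
}
```

With rangeXY = 4, cell 3 covers pixels 12 to 15: cellOrigin(3, 4) is 12 and cellCentre(3, 4) is 14. Dropping the extra term therefore shifts the reference point by about half a cell in each direction.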

Image Stitching details with OpenCV

I am trying to get deep into stitching. I am using cv::detail.
I am trying to follow this example:
I roughly understand the stitching pipeline.
There is a function matchesGraphAsString() which returns a graph. I am wondering how it computes this graph. Further, what is the definition of confidence in this case?
The output is in DOT format and a sample graph looks like
graph matches_graph{
"15.jpg" -- "13.jpg"[label="Nm=75, Ni=50, C=1.63934"];
"15.jpg" -- "12.jpg"[label="Nm=47, Ni=28, C=1.26697"];
"15.jpg" -- "14.jpg"[label="Nm=149, Ni=117, C=2.22011"];
"11.jpg" -- "13.jpg"[label="Nm=71, Ni=52, C=1.77474"];
"11.jpg" -- "9.jpg"[label="Nm=46, Ni=37, C=1.69725"];
"11.jpg" -- "10.jpg"[label="Nm=87, Ni=73, C=2.14076"];
"9.jpg" -- "8.jpg"[label="Nm=122, Ni=99, C=2.21973"];
}
What do label, Nm, and Ni mean here? The official documentation seems to be lacking these details.
This is a very interesting question indeed. As @hatboyzero pointed out, the meaning of the variables is reasonably straightforward:
Nm is the number of matches (in the overlapping region, so obvious outliers have been removed already).
Ni is the number of inliers after finding a homography with Ransac.
C is the confidence that the two images are a match.
Background to matching
Building a panorama is done by finding interest points in all images and computing descriptors for them. These descriptors, like SIFT, SURF and ORB, were developed so that the same parts of an image could be detected. They are just a medium-dimensional vector (64 or 128 dimensions are typical). By computing the L2 or some other distance between two descriptors, matches can be found. How many matches in a pair of images are found is described by the term Nm.
Notice that so far, the matching has only been done through appearance of image regions around interest points. Very typically, many of these matches are plain wrong. This can be because the descriptor looks the same (think: repetitive object like window sills on a multi-window building, or leaves on a tree) or because the descriptor is just a bit too uninformative.
The common solution is to add geometric constraints: The image pair was taken from the same position with the same camera, therefore points that are close in one image must be close in the other image, too. More specifically, all the points must have undergone the same transformation. In the panorama case where the camera was rotated around the nodal point of the camera-lens system this transformation must have been a 2D homography.
Ransac is the gold standard algorithm to find the best transformation and all the matches that are consistent with this transformation. The number of these consistent matches is called Ni. Ransac works by randomly selecting, in this case, 4 matches (see paper sect 3.1) and fitting a homography to these four matches. Then, count how many matches from all possible matches would agree with this homography. Repeat 500 times (see paper) and at the end take the model that had the most inliers. Then re-compute the model with all inliers. The name of the algorithm comes from RANdom SAmple Consensus: RanSaC.
Confidence-Term
The question for me was, about this mysterious confidence. I quickly found where it was calculated.
From stitching/sources/matches.cpp:
// These coeffs are from paper M. Brown and D. Lowe. "Automatic Panoramic Image Stitching
// using Invariant Features"
matches_info.confidence = matches_info.num_inliers / (8 + 0.3 * matches_info.matches.size());
// Set zero confidence to remove matches between too close images, as they don't provide
// additional information anyway. The threshold was set experimentally.
matches_info.confidence = matches_info.confidence > 3. ? 0. : matches_info.confidence;
The mentioned paper has, in section 3.2 ("Probabilistic Model for Image Match Verification"), some more details on what this means.
Reading this section a few things stood out.
There are a lot of variables (mostly probabilities) in their model. These values are defined in the paper without any justification. Below is the key sentence:
Though in practice we have chosen values for p0, p1, p(m = 0), p(m = 1) and pmin, they could in principle be learnt from the data.
So, this is just a theoretical exercise, as the parameters have been plucked out of thin air. Notice the "could in principle be learnt".
The paper has in equation 13 the confidence calculation. If read correctly, it means that matches_info.confidence indicates a proper match between two images iff its value is above 1.
I don't see any justification for the removal of a match (setting confidence to 0) when the confidence is above 3. It just means that there are very few outliers. I think the programmers thought that a high number of matches that turn out to be inliers means that the images overlap a great deal, but this isn't supported by the algorithms behind this. (Simply, the matchings are based on the appearance of features.)
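To check the formula against the graph in the question, here is a small stand-alone sketch of the two lines quoted from the OpenCV source. The function name matchConfidence is made up for this example; only the arithmetic is taken from the snippet above.

```cpp
#include <cassert>
#include <cmath>

// Confidence as computed in the quoted snippet:
// Ni / (8 + 0.3 * Nm), reset to 0 when the result exceeds 3.
double matchConfidence(int numInliers, int numMatches) {
    assert(numInliers >= 0 && numMatches >= numInliers);
    const double confidence = numInliers / (8.0 + 0.3 * numMatches);
    return confidence > 3.0 ? 0.0 : confidence;
}
```

For the first edge of the sample graph (Nm=75, Ni=50) this gives 50 / 30.5 = 1.63934, which is the C value printed in the label. The second edge (Nm=47, Ni=28) gives 28 / 22.1 = 1.26697, again matching.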
Glancing at the OpenCV source code available online, I gather that they mean the following:
Nm - Number of pairwise matches
Ni - Number of geometrically consistent matches
C - Confidence two images are from the same panorama
I'm basing my assumptions on a snippet from the body of matchesGraphAsString in modules/stitching/src/motion_estimators.cpp from version 2.4.2 of the OpenCV source code. I.e.
str << "\"" << name_src << "\" -- \"" << name_dst << "\""
<< "[label=\"Nm=" << pairwise_matches[pos].matches.size()
<< ", Ni=" << pairwise_matches[pos].num_inliers
<< ", C=" << pairwise_matches[pos].confidence << "\"];\n";
Additionally, I'm also looking at the documentation for detail::MatchesInfo for information about the Ni and C terms.

Parameter of BackgroundSubtractorMOG2

I have a problem understanding all the parameters of BackgroundSubtractorMOG2.
I looked in the code (located in bgfg_gaussmix2.cpp), but I don't see the connection to the mentioned paper. For example, Tb = varThreshold, but what is Tb called in the paper?
I am especially interested in the parameters marked in bold.
Let's start with the easy parameters [my remarks]:
int nmixtures
Maximum allowed number of mixture components. Actual number is determined dynamically per pixel.
[set 0 for GMG]
uchar nShadowDetection
The value for marking shadow pixels in the output foreground mask. Default value is 127.
float fTau
Shadow threshold. The shadow is detected if the pixel is a darker version of the background. Tau is a threshold defining how much darker the shadow can be. Tau= 0.5 means that if a pixel is more than twice darker then it is not shadow.
Now to the ones I don't understand:
float backgroundRatio
Threshold defining whether the component is significant enough to be included into the background model ( corresponds to TB=1-cf from the paper??which paper??). cf=0.1 => TB=0.9 is default. For alpha=0.001, it means that the mode should exist for approximately 105 frames before it is considered foreground.
float varThresholdGen
Threshold for the squared Mahalanobis distance that helps decide when a sample is close to the existing components (corresponds to Tg). If it is not close to any component, a new component is generated. 3 sigma => Tg=3*3=9 is default. A smaller Tg value generates more components. A higher Tg value may result in a small number of components but they can grow too large. [I don't understand a word of this]
In the Constructor the variable varThreshold is used. Is it the same as varThresholdGen?
Threshold on the squared Mahalanobis distance to decide whether it is well described by the background model (see Cthr??). This parameter does not affect the background update. A typical value could be 4 sigma, that is, varThreshold=4*4=16; (see Tb??).
float fVarInit
Initial variance for the newly generated components. It affects the speed of adaptation. The parameter value is based on your estimate of the typical standard deviation from the images. OpenCV uses 15 as a reasonable value.
float fVarMin
Parameter used to further control the variance.
float fVarMax
Parameter used to further control the variance.
float fCT
Complexity reduction parameter. This parameter defines the number of samples needed to accept to prove the component exists. CT=0.05 is a default value for all the samples. By setting CT=0 you get an algorithm very similar to the standard Stauffer&Grimson algorithm.
Someone asked pretty much the same question on the OpenCV website, but without an answer.
Well, I don't think anyone could tell you which parameter is what if you don't know the details of the algorithm that you are using. Besides, you should not need anyone to tell you which parameter is what if you know the details of the algorithm. I'm saying this about the detailed parameters (fCT, fVarMax, etc.), not about the straightforward ones (nmixtures, nShadowDetection, etc.).
So, I think you should read the papers referenced in the documentation. Here are the links for the papers 1, 2, 3.
And also you should read this paper as well, which is the beginning of background estimation.
After reading these papers and checking out the code, I'm sure you will understand what those parameters are.
Good luck!
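As one worked example of how these descriptions connect to the math: as far as I can tell, the "approximately 105 frames" in the backgroundRatio description follows from the weight update. Assuming a simplified update w <- w + alpha*(1 - w) for a newly created component (ignoring the complexity reduction term), its weight after n frames is 1 - (1 - alpha)^n, and it reaches cf after log(1 - cf) / log(1 - alpha) frames. The function name below is made up for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Number of frames after which a new, persistent mode reaches weight cf,
// assuming the simplified update w <- w + alpha * (1 - w) starting from w = 0.
double framesUntilWeightReaches(double cf, double alpha) {
    assert(cf > 0.0 && cf < 1.0 && alpha > 0.0 && alpha < 1.0);
    return std::log(1.0 - cf) / std::log(1.0 - alpha);
}
```

With cf = 0.1 (TB = 0.9) and alpha = 0.001 this evaluates to roughly 105.3, in line with the number quoted in the parameter description.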

Measure variation of data points from a line; To Catch a Dip

How can I measure this area in C++?
(update: I posted the solution and code as an answer rather than edit the question again)
The ideal line (dashed red) is the plot from starting point with the average rise added with each angle of measurement; this I obtain via average. I measured the test data in black. How can I quantify the area of the dip in blue? X-axis is unitized, so slopes and math are simplified.
I could determine a cutoff for the size of areas like this and then flag this part for retesting or failure. Rarely, there is another dip that appears closer to the right, but setting a cutoff value for standard deviation usually fails those parts.
Update
Diego's answer helped me visualize this. Now that I can see what I'm trying to do, I'll work on the algorithm to implement the "homemade dip detector". :)
Why?
I created a test bench to test throttle position sensors I'm selling. I'm trying to programatically quantify how straight the plot is by analyzing the data collected. This one particular model is vexing me.
Sample plot of a part I prefer not to sell:
The X axis is evenly spaced angles of throttle opening. The stepper motor turns the input shaft, stopping every 0.75° to measure the output on a 10 bit ADC, which gets translated to the Y axis. The plot is the translation of data[idx] to idx,value mapped to (x,y) bitmap coordinates. Then I draw lines between the points within the bitmap using Bresenham's algorithm.
My other TPS products produce amazingly linear output.
The lower (left) portion of the plot is crucial to normal usage of any motor vehicle; it's when you're driving around town, entering parking lots, etc. This particular part has a tendency to develop a dip around 15° opening and I wish to use the program to quantify this "dip" in the curve and rely less upon the tester's intuition. In the above example, the plot dips but doesn't return to what an ideal line might be.
Even though this is an embedded application, printing the report takes 10 seconds, thus I do not consider stepping through an array of 120 points of data multiple times a waste of cycles. Also, since I'm using a uC32 PIC32 microcontroller, there's plenty of memory, so I have the luxury of being able to ponder this problem within the controller.
What I'm trying already
Array of rise between test points: I dismiss the X-axis entirely, considering it unitized, and then make an array of change from one reading to the next. This array is what contributes to the report's "Min rise between points: 0 Max: 14". I call this array deltas.
I've tried using standard deviation on deltas, however, during testing I have found that a low Std Dev is not a reliable measure for this part. If the dip quickly returns to the original line implied by early data points, the Std Dev can be deceptively low (observed to be as low as 2.3) but the part is still something I wouldn't want to use. I tried setting a cutoff at 2.6, but it failed too many parts with great plots. The other, more linear part linked to above can reliably count on Std Dev for quality.
Kurtosis seems not to apply for this situation at all. I learned of Kurtosis today and found a Statistics Library which includes Kurtosis and Skewness. During continued testing, I found that of these two measures, there was not a trend of positive, negative, or amplitude which would correspond to either passing or failing. That same gentleman has shared a linear regression library, but I believe Lin Reg is unrelated to my situation, as I am comfortable with the assumption of the AVG of deltas being my ideal line. Linear Regression and R^2 are more for finding a line from less ideal data or much larger sets.
Comparing each delta to AVG and Std Dev: I set up a monitor to check each delta against the final average of the deltas. Here, too, I couldn't find a reliable metric. Too many good parts would not pass a test restricting any delta to within 2x Std Dev away from the Average. Ultimately, the only variation from AVG I could settle on is to be within AVG+Std Dev difference from the AVG itself. Anything more restrictive would fail otherwise good parts. And the elusive dip around 15° opening can sneak through this test.
Homemade dip detector: When feeding deltas to the serial monitor of the computer, I observed consecutive deltas below AVG (negative differences) during the dip, so I programmed in a dip detector, but it feels very crude to me. If there are 5 or more negative differences in a row, I sum them. I have seen that if I take that sum of the dip's differences from AVG and then divide by the number of negative differences, a magnitude over 2.9 or 3 could mean a fail. I have observed dips lasting from 6 to 15 deltas. Readily observable dips would have their differences from AVG sum up to -35.
Trending accumulated variation from the AVG: The above made me think that watching the summation of deltas as it wanders away from AVG could be the answer. Meaning, I step through the array and sum the differences of each delta from AVG. I thought I was on to something until a good part blew this theory. I was seeing a trend: the fewer times the running sum strayed by more than 2x AVG, the straighter the line appeared. Many ideal parts would only show 8 or fewer delta points where the sumOfDiffs would stray from the AVG very far.
float sumOfDiffs=0.0;
for( int idx=0; idx<stop; idx++ ){
    float spread = deltas[idx] - line->AdcAvgRise;
    sumOfDiffs = sumOfDiffs + spread;
    ...
    testVal = 2*line->AdcAvgRise;
    if( sumOfDiffs > testVal || sumOfDiffs < -testVal ){
        flag = 'S';
    }
    ...
}
And then a part with a fantastic linear plot came through with 58 data points where sumOfDiffs was more than twice the AVG! I find this amazing, as at the end of the ~120 data points, sumOfDiffs value is -0.000057.
During testing, the final sumOfDiffs result would often register as 0.000000 and only on exceptionally bad parts would it be greater than .000100. I found this quite surprising, actually: how a "bad part" can have accumulated great accuracy.
Sample output from monitoring sumOfDiffs: The output below shows a dip happening. The test flags each point at which the running sumOfDiffs has strayed by more than 2x the AVG. This dip lasts from deltas idx of 23 through 49; it starts at 17.25° and lasts for 19.5°.
Avg rise: 6.75  Std dev: 2.577
idx  delta  diff from avg  sumOfDiffs  Flag
 23      5          -1.75      -14.05  S
 24      6          -0.75      -14.80  S
 25      7           0.25      -14.55  S
 26      5          -1.75      -16.30  S
 27      3          -3.75      -20.06  S
 28      3          -3.75      -23.81  S
 29      7           0.25      -23.56  S
 30      4          -2.75      -26.31  S
 31      2          -4.75      -31.06  S
 32      8           1.25      -29.82  S
 33      6          -0.75      -30.57  S
 34      9           2.25      -28.32  S
 35      8           1.25      -27.07  S
 36      5          -1.75      -28.82  S
 37     15           8.25      -20.58  S
 38      7           0.25      -20.33  S
 39      5          -1.75      -22.08  S
 40      9           2.25      -19.83  S
 41     10           3.25      -16.58  S
 42      9           2.25      -14.34  S
 43      3          -3.75      -18.09  S
 44      6          -0.75      -18.84  S
 45     11           4.25      -14.59  S
 47      3          -3.75      -16.10  S
 48      8           1.25      -14.85  S
 49      8           1.25      -13.60  S
Final Sum of diffs: 0.000030
RunningStats analysis:
NumDataValues= 125
Mean= 6.752
StandardDeviation= 2.577
Skewness= 0.251
Kurtosis= -0.277
Sobering note about quality: what started me on this journey was learning how major automotive OEM suppliers consider a 4 point test to be the standard measure for these parts. My first test bench used an Arduino with 8k of RAM, didn't have a TFT display nor a printer, and a mechanical resolution of only 3°! Back then I simply tested deltas being within arbitrary total bounds and choosing a limit of how big any single delta could be. My 120+ point test feels high class compared to that 30 point test from before, but that test had no idea about these dips.
Premises
The mean of a set of data has the mathematical property that the sum of the deviations from the mean is 0.
This explains why both bad and good datasets always give almost 0.
When the result differs from zero, it is essentially an accumulation of rounding errors in the diffs, which is why, unfortunately, it cannot hold useful information.
The thing that most clearly defines what you're looking for is your image: you're looking for an AREA, and this is why you're not finding the solution in these ways:
Looking at a metric on single points is too local to extract that information.
Looking at global accumulations or parameters (global standard deviation) is too global, and you lose the data among too much information and too many sources of variation.
Kurtosis (you've already said so, I know, but this is for completeness) is outside its field of application, since this is not a probability distribution.
In the end, the most suitable of the approaches you have already tried is the "Homemade dip detector", because it thinks in a way that is local, but not too local.
Last but not least:
Any algorithm you choose has tacit assumptions on which it stands.
At one extreme, one looks for a super clever algorithm that, with no parametrization or tuning, automatically adapts to the problem and defines its own thresholds and so on.
At the other extreme, there is an algorithm that relies on the writer's knowledge of the typical data behavior (good and bad) and that is itself stupid, in the sense that if a different and unexpected behavior shows up, the results are unpredictable.
The right way is one of these two, or somewhere in between, depending on the application. So if it works, the "Homemade dip detector" can also be a solution. There is no reason to call it crude; it may turn out to be insufficient for the application's needs, but that's another matter.
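The first premise can be written out explicitly. The notation is an assumption for this note: d_i stands for the entries of the deltas array, n for their count, and \bar{d} for their average (AdcAvgRise in the question's code).

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i
\quad\Longrightarrow\quad
\sum_{i=1}^{n} \left(d_i - \bar{d}\right)
  = \sum_{i=1}^{n} d_i - n\,\bar{d}
  = n\,\bar{d} - n\,\bar{d}
  = 0
```

So the final sumOfDiffs is zero for every part, good or bad, and values such as -0.000057 or 0.000030 only reflect floating-point rounding.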
How to find the area
Once you have the data, the first thing is to clearly define the "theoretical straight line". I give some options:
use the RANSAC algorithm (formally the best option, IMHO)
this gives you the best fit to the aligned points, disregarding the non-aligned ones
it is quite difficult and maybe oversized for this job (IMHO)
consider the line defined by the first and last points
you said that the dip is almost always in the same position, which is not near the boundaries, so the first and last points can be considered reliable
very easy to implement
this is an example of using knowledge about expected behavior, as I said before, so you need to think about whether and how much you trust this assumption
consider a linear fit to the first 10 points and the last 10 points
this is just a more reliable version of the previous option: by using more points, you need to worry less that the first or the last point alone was affected by a measurement problem, making everything fail
also quite easy to implement
if I were you I would use this, or something inspired by it
calculate the Y value given by the straight line for each X
calculate the area between the two curves (or the area under the function Y_dev = Y_data - Y_straight, which is mathematically the same) with this procedure:
PositiveMax = 0; NegativeMax = 0;
start from the first point (its value can be positive or negative) and put it in a temporary area accumulator tmp_Area
for each next point
  if the sign is the same, accumulate the value
  if it is different
    stop accumulating
    check whether the accumulated value is greater than PositiveMax or below NegativeMax and, if it is, store it as the new PositiveMax or NegativeMax
    in any case, reset the accumulator to the current value with tmp_Area = Y_dev; thereby starting a new accumulation
in the end you will have the values of the maximum overvalued contiguous area and the maximum undervalued contiguous area, which I think are the scores you're looking for.
if you want, you can track only NegativeMax, based on observed and expected data behavior
you may find it useful to add a threshold, so that if the magnitude of a value Y_dev is lower than the threshold you do not accumulate it.
this avoids large accumulations from many points close to the straight line, which can look similar to the accumulation of a few points far from the line
whether this is needed, and the proper threshold, have to be evaluated on some sample data
you need to find an appropriate threshold for this contiguous area, and you can get it only from observing sample data.
again: it can be you observing and deciding the threshold, or you can build a repository of good and bad samples and write a program that automatically learns which threshold to use. But this is not the algorithm; this is how to find its operating parameters, and there is nothing wrong with doing that by human brain. It only depends on whether we're looking for a method to separate bad things from good ones, or for a self-adapting algorithm that does this. You decide the target.
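A minimal sketch of the accumulation procedure described above, assuming the simplest line option (the straight line through the first and last points) and unit spacing on the X axis. The struct and function names are made up for this example, and the optional threshold is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct AreaExtremes {
    double positiveMax = 0.0;  // largest contiguous area above the line
    double negativeMax = 0.0;  // most negative contiguous area below the line
};

// Accumulates Y_dev = Y_data - Y_straight over runs of equal sign and keeps
// the extreme run sums. The reference line joins the first and last samples.
AreaExtremes contiguousAreaExtremes(const std::vector<double>& y) {
    const std::size_t n = y.size();
    assert(n >= 2);  // two points are needed to define the line
    AreaExtremes result;
    const double slope = (y[n - 1] - y[0]) / static_cast<double>(n - 1);
    double area = 0.0;  // tmp_Area in the description above
    for (std::size_t i = 0; i < n; ++i) {
        const double dev = y[i] - (y[0] + slope * static_cast<double>(i));
        // A sign change closes the current run and starts a new one.
        if ((dev > 0.0 && area < 0.0) || (dev < 0.0 && area > 0.0)) {
            area = 0.0;
        }
        area += dev;
        // Within a run the sum only grows in magnitude, so checking at every
        // step gives the same extremes as checking when the run ends.
        if (area > result.positiveMax) result.positiveMax = area;
        if (area < result.negativeMax) result.negativeMax = area;
    }
    return result;
}
```

For the samples 0, 1, 1, 1, 4, 5, 6 the line is y = x, the deviations are 0, 0, -1, -2, 0, 0, 0, and negativeMax comes out as -3. That number is what a pass/fail cutoff would then be compared against.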
It turns out the result of my gut feeling and Diego's method is an average of the integral. I still don't like that name, so I have described the algorithm and have asked on Math.SE what to call this, which got migrated to "Cross Validated", Stats.SE .
I updated the graphs after a massive edit of my Math.SE question. It turns out I'm taking the average of a closed integral of the derivative of the data. :P First, we gather the data:
Next is the "derivative": step through the original data array to form the deltas array which is the rise of ADC values from one 0.75° step to the next. "Rise" or "slope" is what the derivative is: dy/dx.
With the "slope" or average leveled out, I can find multiple negative deltas in a row, sum them, then divide by the count at the end of the dip. The sum is an integral of the area between the average and the deltas, and when the dip goes back positive, I can divide the sum by the number of deltas in the dip.
During testing, I came up with a cutoff value for this average of the integral at 2.6. That was a great measure of my "gut instinct" looking at the plot thinking a part was good or bad.
In case someone else finds themselves trying to quantify this, here's the code I implemented. Note that it is only looking for negative dips. Also, dipCountLimit is defined elsewhere as 5. In addition to the dip detector/accumulator (ie Numerical Integrator) I also have a spike detector that arbitrarily flags the test as bad if any data points stray from the average by the amount of average + standard deviation. AVG+STD DEV as a spike limit was chosen arbitrarily based on the observed plots of the parts it would fail.
int dipdx=0;
// inDipFlag also counts the length of this dip
int inDipFlag=0;
float dips[140] = { 0.0 };
const int _stop = stop-1;
for( int idx=0; idx<stop; idx++ ){
    const float diffFromAvg = deltas[idx] - line->AdcAvgRise;
    // state machine to monitor dips
    if( diffFromAvg < 0 && idx < _stop ) {
        // check NEXT data point for negative diff & set dipFlag to put state in dip
        const float nextDiff = deltas[idx+1] - line->AdcAvgRise;
        if( nextDiff < 0 && inDipFlag == 0 )
            inDipFlag = 1;
        // already IN a dip, and next diff is negative
        if( nextDiff < 0 && inDipFlag > 0 ) {
            inDipFlag++;
        }
        // accumulate this dip
        dips[dipdx]+= diffFromAvg;
        // next diff isn't negative, so this run of negative diffs ends here
        if( nextDiff >= 0 ) {
            if( inDipFlag < dipCountLimit ){
                // too short to count as a dip (this includes a lone negative diff):
                // reset the accumulator, do not advance dipdx to next entry
                dips[dipdx]=0.0;
            } else {
                // change this entry's value from dip sum to its ratio
                dips[dipdx] = -dips[dipdx]/inDipFlag;
                // advance dipdx to next entry
                dipdx++;
            }
            // the dip is done
            inDipFlag = 0;
        }
    }
}