I've got a graph that displays the temperature from my wood pellet stove. What I would like is to get the time the temperature is rising vs. cooling down.
Anyone know how to get something like the slope of the curve in RRDTool or something similar?
You can do this in two different ways.
First of all, you could use a "DERIVE" data type. This will log the derivative -- i.e., the slope -- of the data instead of the actual data. However, this will not store the actual temperatures, which is probably not what you want.
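For illustration, the difference would show up at creation time, something like this (a minimal sketch; the step, heartbeat, and RRA settings here are placeholder values to adapt). Leaving the DERIVE minimum unbounded (U) lets it store negative rates, i.e. cooling:

rrdtool create temp_gauge.rrd --step 60 \
    DS:ds0:GAUGE:120:U:U \
    RRA:AVERAGE:0.5:1:1440

rrdtool create temp_slope.rrd --step 60 \
    DS:ds0:DERIVE:120:U:U \
    RRA:AVERAGE:0.5:1:1440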
The next way to do it is to calculate the slope on the fly from the actual data, as we build the graph. You've already stored your temperature using a GAUGE data type. Now, you can use a calculated value to work out the slope.
DEF:temp=myrrdfile.rrd:ds0:AVERAGE
CDEF:slope=temp,PREV(temp),-,STEPWIDTH,/
This calculates slope to be the difference between the current and previous value, divided by the time interval.
However, since all you seem to be interested in is if the temperature is going up or down, you could instead use something like:
CDEF:cooling=temp,PREV(temp),LT,INF,0,IF
CDEF:warming=temp,PREV(temp),GT,INF,0,IF
AREA:cooling#0000cc::skipscale
AREA:warming#cc0000::skipscale
LINE:temp#00cc00:Temperature
This will graph the temperature as a green line, with a background of red if warming, and blue if cooling.
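Putting the pieces together, a complete graph command might look something like this (an untested sketch; the output file name and time range are placeholders):

rrdtool graph temp.png --start -1d \
    DEF:temp=myrrdfile.rrd:ds0:AVERAGE \
    CDEF:cooling=temp,PREV(temp),LT,INF,0,IF \
    CDEF:warming=temp,PREV(temp),GT,INF,0,IF \
    AREA:cooling#0000cc::skipscale \
    AREA:warming#cc0000::skipscale \
    LINE:temp#00cc00:Temperature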
Fairly new to processing data like this; I have two curves that I'm not sure how to process, but I know what I'd like to have as an outcome. The original plots of the two datasets are shown below (left); the rough fit that I think I would like to have for them is shown below (right), with the overlaid fit in red.
First example:
The sudden drops in amplitude are an artifact of how the data was taken. This means it's inherently unpredictable, and I would ideally like to find a method that is robust to this behavior.
In the first case, I could try to eliminate the sharp drops in amplitude by using a threshold, but that would not help me in the second case, where I still get strong oscillation but the minima are no longer at 0.
Edit: After writing a short script to use @JamesPhillips' suggestion, fitting results are shown below; I can confirm this is what I was looking for, and it works better/faster than other fitting algorithms.
A possible algorithm: filter the data something like this...
Start with the smallest X-valued point shown on the graph, iterating from smallest X value to largest X value. For each point:
1) If the next point's Y value is greater than or equal to this point's Y value, include it.
2) If the next point's Y value is less than [cutoff] percent of this point's Y value, exclude it.
3) Go to next point.
Run the filter and test different values for [cutoff], each time graphing the result to see if the value of [cutoff] meets your requirements. You may need an additional filter condition or two, but that should be a good start to filtering the data as you describe.
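A rough Python sketch of this loop, under one reading of the steps (each point is compared against the last point kept, and [cutoff] is a fraction; all names here are made up):

def filter_sharp_drops(points, cutoff):
    # points: list of (x, y) pairs; cutoff: fraction, e.g. 0.5 for 50 percent
    points = sorted(points)            # iterate from smallest to largest x
    kept = [points[0]]                 # always keep the first point
    for x, y in points[1:]:
        last_y = kept[-1][1]
        if y >= last_y:                # rising or flat: include (step 1)
            kept.append((x, y))
        elif y >= cutoff * last_y:     # modest drop: include
            kept.append((x, y))
        # a drop below cutoff * last_y is excluded (step 2)
    return kept

# run with several cutoff values and graph each result to pick one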
I have attached images of an input signal (shown in blue), which is actually a continuous input stream and whose trend I do not know, and the signal smoothed using a moving average filter of span 5 (shown in red).
Raw input and smoothed input signal
First derivative of the raw input and smoothed input signal
My aim is to calculate the ratio of this signal to its first derivative. However, the first derivative is clearly noisy and does not give good results. I realize I must change the filter from a moving average to a more robust one. I have looked up the Savitzky-Golay filter, but I read on another site that it is more effective at retaining the shape of the signal than at reducing noise. http://terpconnect.umd.edu/~toh/spectrum/Smoothing.html
A Kalman filter would be my next guess, but it needs an initial state estimate, which I cannot know for this type of signal.
Any other suggestions on how to smooth the first derivative of a noisy input?
First of all, don't expect any miracles from any of these filters. Numerical differentiation of noisy data is inherently problematic because the differentiation operation itself acts as a high-pass filter and thus amplifies noise.
Yes, there are differences between moving average, Savitzky-Golay, and Kalman filters, but they are subtle. The main advantage of Savitzky-Golay is its adaptive window size.
Looking at your data, it seems like you should use a much larger window size resulting in a lower cut-off frequency. But then, I don't know if your data sets always look like this.
Another hint: as long as your filters are effectively linear, it does not matter whether you first apply the filter and then calculate the derivative, or calculate the derivative from the original signal and then apply the filtering.
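For what it's worth, a Savitzky-Golay filter can return the smoothed first derivative directly. A minimal SciPy sketch, where the window length and polynomial order are placeholder values you would tune (a larger window means a lower cut-off frequency, per the advice above):

import numpy as np
from scipy.signal import savgol_filter

dt = 0.01                              # assumed sample interval
t = np.arange(0.0, 10.0, dt)
x = np.sin(2 * np.pi * 0.5 * t) + 0.05 * np.random.randn(t.size)

x_smooth = savgol_filter(x, window_length=51, polyorder=3)
# deriv=1 differentiates inside the local polynomial fit;
# delta scales the derivative to the sample spacing
dx_dt = savgol_filter(x, window_length=51, polyorder=3, deriv=1, delta=dt)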
I have a sinusoidal-shaped signal and I would like to compute its frequency.
I tried to implement something, but it looks very difficult; any ideas?
So far I have a vector with timesteps and values; how can I get the frequency from this?
Thank you.
If the input signal is a perfect sinusoid, you can calculate the frequency using the time between positive zero crossings. Find two consecutive instances where the signal goes from negative to positive and measure the time between them, then invert this number to convert from period to frequency. Note this is only as accurate as your sample interval, and it does not account for any potential aliasing.
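A small NumPy sketch of this idea; it averages the period over all positive-going crossings rather than just one pair, but the principle is the same (the names and inputs are assumptions):

import numpy as np

def freq_from_zero_crossings(t, x):
    neg = x < 0
    # indices where the signal goes from negative to non-negative
    rising = np.where(neg[:-1] & ~neg[1:])[0]
    if rising.size < 2:
        return None                    # not enough crossings to estimate
    periods = np.diff(t[rising])       # time between successive crossings
    return 1.0 / periods.mean()        # invert the mean period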
You could try autocorrelating the signal. An autocorrelation can be rapidly calculated by following these steps:
1) Perform an FFT of the signal.
2) Multiply each complex value by its complex conjugate.
3) Perform the inverse FFT of the result.
The leftmost peak (at zero lag) will always be the highest, as the signal always correlates best with itself. The second-highest peak, however, can be used to calculate the sinusoid's frequency.
For example, if the second peak occurs at an offset (lag) of 50 points, the sample rate is 16 kHz, and the window is 1 second, then the resulting frequency is 16000 / 50, or 320 Hz. You can even use interpolation to get a more accurate estimate of the peak position and thus a more accurate sinusoid frequency. This method is quite compute-intensive but is very good at estimating the frequency even after significant amounts of noise have been added!
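A rough NumPy sketch of those three steps plus a naive peak pick (interpolating around the peak, as mentioned, would sharpen the estimate; the function name is made up):

import numpy as np

def freq_from_autocorr(x, sample_rate):
    spectrum = np.fft.rfft(x)                   # step 1: FFT
    power = spectrum * np.conj(spectrum)        # step 2: value times conjugate
    autocorr = np.fft.irfft(power)              # step 3: inverse FFT
    autocorr = autocorr[: x.size // 2]          # keep non-negative lags
    # skip the zero-lag peak: search from the first local minimum onward
    first_min = int(np.argmax(np.diff(autocorr) > 0))
    lag = first_min + int(np.argmax(autocorr[first_min:]))
    return sample_rate / lag                    # e.g. 16000 / 50 = 320 Hz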
I have been researching and trying to figure this one out to no avail. I have found many ways not to solve this...
The gist of the problem: I am looking for a method to calculate the deviance from an original path traveled by way of GPS coordinates. I have multiple CSV files that contain latitude, longitude, and UTC time. I have created KML files from this information for a visual view of the deviance, and now I would like to put a value on this deviation. I have chosen a route as a reference and would like to measure the other routes against it. There are multiple routes, each having its own reference route, and each of which has many runs. No two runs are the same, and some routes deviate more than others. I cannot use time, only latitude and longitude, since the runs were completed over many weeks of data collection.
What I have tried thus far:
Haversine and Equirectangular formulas (looping through and measuring point to point).
Outcome: The coordinates only line up for a short period of time and the difference in the number of points varies greatly.
Area under each curve: I was going to find the difference between the two routes by this method.
Outcome: Really unsure how to proceed, and unable to find equations suitable for this calculation.
There were a couple more feeble attempts, but I have been working on this for a few weeks now with not much to show for it, and I am still unsure how to proceed.
Any help or ideas would be greatly appreciated.
Possible solution 1: Instead of calculating the "sideways" deviation between the two routes, just compare the respective arc lengths (Matlab: arclength).
Possible solution 2: To compare two routes, each going from the same start point A to the same end point B: draw a straight line between A and B, place a number of equidistant points along AB, and then average the perpendicular distance from these points on AB to each path you want to compare. The absolute difference between the cumulative deviations from the straight-line reference is your deviation (see the sketch after this list).
Possible solution 3: Calculate the arc length of each route. Place a number of equidistant points along each route. Average the distance between these points.
Both solutions 2 and 3 depend on the number of points you place, but with a higher number of points, the average deviation will converge. Note that these solutions are both related to calculating the area under each curve.
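A planar Python sketch of solution 2 (for GPS tracks of modest extent you would first project latitude/longitude to meters, e.g. with an equirectangular projection; the function names here are made up):

import numpy as np

def point_to_path(p, path):
    # minimum distance from point p to a polyline given as an (n, 2) array
    best = np.inf
    for a, b in zip(path[:-1], path[1:]):
        ab = b - a
        t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        best = min(best, np.linalg.norm(p - (a + t * ab)))
    return best

def mean_deviation(path, n_points=100):
    # average distance from equidistant points on the line A-B to the path
    a, b = path[0], path[-1]
    frac = np.linspace(0.0, 1.0, n_points)[:, None]
    line_points = a + frac * (b - a)
    return np.mean([point_to_path(p, path) for p in line_points])

# deviation between a reference route and a run:
# abs(mean_deviation(reference) - mean_deviation(run))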
100 periods have been collected from a 3 dimensional periodic signal. The wavelength slightly varies. The noise of the wavelength follows Gaussian distribution with zero mean. A good estimate of the wavelength is known, that is not an issue here. The noise of the amplitude may not be Gaussian and may be contaminated with outliers.
How can I compute a single period that approximates 'best' all of the collected 100 periods?
Time-series, ARMA, ARIMA, Kalman Filter, autoregression and autocorrelation seem to be keywords here.
UPDATE 1: I have no idea how time-series models work. Can they accommodate varying wavelengths? Can they handle non-smooth true signals? If a time-series model is fitted, can I compute a 'best estimate' for a single period? How?
UPDATE 2: A related question is this. Speed is not an issue in my case. Processing is done off-line, after all periods have been collected.
Origin of the problem: I am measuring acceleration during human steps at 200 Hz. After that, I am trying to double integrate the data to get the vertical displacement of the center of gravity. Of course, the noise introduces a HUGE error when you integrate twice. I would like to exploit periodicity to reduce this noise. A crude graph of the actual data (y: acceleration in g, x: time in seconds) shows 6 steps corresponding to 3 periods (1 left and 1 right step is a period).
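To see how fast the error grows, here is a small synthetic sketch of the double integration (the signal and noise level are made up):

import numpy as np
from scipy.integrate import cumulative_trapezoid

fs = 200.0                                     # sample rate from the question
t = np.arange(0.0, 10.0, 1.0 / fs)
acc = np.sin(2 * np.pi * t) + 0.01 * np.random.randn(t.size)   # noisy stand-in

vel = cumulative_trapezoid(acc, t, initial=0)  # first integration
pos = cumulative_trapezoid(vel, t, initial=0)  # second integration
# even 1 percent white noise produces drift in pos growing roughly like t**1.5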
My interest is now purely theoretical, as http://jap.physiology.org/content/39/1/174.abstract gives a pretty good recipe for what to do.
We have used wavelets for noise suppression on a similar signal measured from cows during walking.
I don't think the noise is so much of a problem here; the biggest peaks represent actual changes in the acceleration during walking.
I suppose that the angle of the leg, and thus of the accelerometer, changes during your experiment, and you need to account for that in order to calculate the distance, i.e. you need to know the orientation of the accelerometer at each time step. See, e.g., this technical note for one way to account for the angle.
If you need to get accurate measures of the position, the best solution would be to get an accelerometer with a magnetometer, which also measures orientation. Something like this should work: http://www.sparkfun.com/products/10321.
EDIT: I have looked into this a bit more in the last few days, because a similar project is on my to-do list as well... We have not used gyros in the past, but we are doing so in the next project.
The inaccuracy in the positioning doesn't come from the white noise, but from the inaccuracy and drift of the gyro, and the error then accumulates very quickly due to the double integration. Intersense has a product called Navshoe that addresses this problem by zeroing the error after each step (see this paper). And this is a good introduction to inertial navigation.
A periodic signal without noise has the following property:
f(a) = f(a+k), where k is the wavelength.
The next bit of information needed is that your signal is composed of discrete samples. Everything you've collected is based on samples, which are values of the function f(). From 100 samples, you can get the mean value:
1/n * sum(s_i), where i is in range [0..n-1] and n = 100.
This needs to be done for every dimension of your data; with 3D data, it is applied 3 times. The result would be (x, y, z) points. You can find the value of s_i from the periodic signal equation simply by doing
s_i(a).x = f(a+k*i).x
s_i(a).y = f(a+k*i).y
s_i(a).z = f(a+k*i).z
If the wavelength is not accurate, this will give you an additional source of error, or you'll need to adjust it to match the real wavelength of each period. Since
k*i = k+k+...+k
if the wavelength varies, you'll need to use
k_1+k_2+k_3+...+k_i
instead of k*i.
Unfortunately, with errors in the wavelength, there will be big problems keeping this k_1..k_i chain in sync with the actual data. You'd actually need to know how to recognize the starting position of each period from your actual data; possibly you will need to mark them by hand.
Now, all the mean values you calculated would be functions like this:
m(a) :: R->(x,y,z)
Now this is a curve in 3D space. More complex error models will be left as an exercise for the reader.
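A NumPy sketch of this averaging for (n, 3) data, assuming the start index of each period has been identified (possibly by hand, as noted); resampling each period to a common length handles the varying wavelength:

import numpy as np

def average_period(signal, starts, n_samples=200):
    # signal: (n, 3) array; starts: indices where each period begins
    periods = []
    for s, e in zip(starts[:-1], starts[1:]):
        seg = signal[s:e]                      # one period, length varies
        old = np.linspace(0.0, 1.0, len(seg))
        new = np.linspace(0.0, 1.0, n_samples)
        periods.append(np.column_stack(
            [np.interp(new, old, seg[:, d]) for d in range(3)]))
    # pointwise mean over all periods; np.median here would be more
    # robust to the amplitude outliers mentioned in the question
    return np.mean(periods, axis=0)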
If you have a copy of Curve Fitting Toolbox, localized regression might be a good choice.
Curve Fitting Toolbox supports both lowess and loess localized regression models for curve and surface fitting.
There is an option for robust localized regression.
The following blog post shows how to use cross-validation to estimate an optimal span parameter for a localized regression model, as well as techniques to estimate confidence intervals using a bootstrap.
http://blogs.mathworks.com/loren/2011/01/13/data-driven-fitting/
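Outside Matlab, a comparable robust lowess fit is available as well; for instance, a Python sketch using statsmodels (frac plays the role of the span parameter, and it controls the robustness iterations that down-weight outliers):

import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

x = np.linspace(0.0, 10.0, 500)
y = np.sin(x) + 0.2 * np.random.randn(x.size)

# frac = fraction of points used in each local fit (the span)
fitted = lowess(y, x, frac=0.3, it=3, return_sorted=True)
x_fit, y_fit = fitted[:, 0], fitted[:, 1]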