Is the rrdtool RRA XFF value useless if RRA step is 1? - rrdtool

According to rrdcreate manual, the RRA definition has the following syntax:
RRA:{AVERAGE | MIN | MAX | LAST}:xff:steps:rows
XFF(xfiles factor) defines what part of a consolidation interval may be made up from UNKNOWN data while the consolidated value is still regarded as known. Am I correct, that if the step is 1, then XFF value doesn't matter because single PDP is used to build the RRA entry, i.e this single PDP is either present or not? For example, both RRA:AVERAGE:0:1:12 and RRA:AVERAGE:0.9999:1:12 are equal?

Short answer is, yes, sort of. When you have 1cdp=1pdp then the amount of data present in a CDP can logically only be 100% or 0%, and so all xff values in the range 100>xff>=0 are identical. Note that we exclude 100 from this range! An xff of 100 would not make sense unless your computer has a usb-attached crystal ball...
However this parameter becomes much more important once lower-resolution RRAs start to consolidate data over large windows, which is the primary function of the RRA.

Related

RRDTool data values (e.g. max value) are different in different time resolutions

currently I'm experimenting a bit with RRDTool. I'm aware that the accuracy gets lower the longer the time periods are selected. But I thought I could bypass this with my datasource settings.
For example temperature and humidity from my house, resoultion 1h:
And now with the resolution of 1d:
As you could see, there is a great difference for the max. value of the blue line.
I created my datasources and archives with this values:
"rrdtool create temp.rrd --step 30",
"DS:temp:GAUGE:60:U:U",
"DS:humidity:GAUGE:60:U:U",
"RRA:AVERAGE:0.5:1:1051200",
"RRA:MAX:0.5:1:1051200",
"RRA:MIN:0.5:1:1051200",
I thought that 1051200 (1 year = 31536000 / 30 s (resoulution) = 1051200) is correct for saving every value for a year and that there should be no need for interpolating.
Is it possible to get the exact values displayed even if the resolution changes (for example the max humidity (Luftfeuchtigkeit) at 99.9%)?
Here are my values for image creation:
"--start" => "-1h", (-1d etc-)
"--title" => "Haustemperatur",
"--vertical-label" => "°C / % RLF",
"--width" => 800,
"--height" => 600,
"--lower-limit" => "-5",
"DEF:temperatur=$rrdFile:temperatur:LAST",
"DEF:humidity=$rrdFile:humidity:LAST",
"LINE1:temperatur#33CC33:Temperatur",
"GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
"GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
"GPRINT:temperatur:MAX:Maximum\: %4.2lf °C\j",
"LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
"GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
"GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
"GPRINT:humidity:MAX:Maximum\: %4.2lf %%\j",
Thanks for your help and any suggestions.
P.S. I'm using a library to generate the graphs and the database, please do not be surprised about possible syntax errors.
Your problem is that you are causing the values to be rolled-up on the fly at graph time, but have not correctly specified which rollup function to use. Your second graph is showing the MAXIMUM of the LAST in the interval, not the true Maximum.
There are a few issues to explain with this configuration:
Firstly, your RRD is defined using 3 RRAs with 1cdp=1pdp and different consolidation functions (AVG, MIN, MAX). This means they are functionally identical, but they do not save you any time at graphing as they have not done any pre-rollup for you! You should definitely consider having just one of these (probably AVG) and adding others at lower resolution to help speed up graphing when you have a bigger time window.
Secondly, you need to specify the on-the-fly rollup function. When graphing, RRDTool will work out the best RRA to use based on your DEF lines, and will perform any additional consolidation required on the fly. This can take a long time if the only available RRA is too high-granularity.
Your graph request uses DEF:temperatur=$rrdFile:temperatur:LAST but you do not actually have a LAST type RRA, so RRDTool will grab the last average. Your RRA data points are at 30s interval, but your second graph has (approx) 5min per pixel, meaning that RRDTool needs to grab the 10 entries from the RRA, and print the last. Looking at the data in the top graph, it seems that the last in that interval was the 66 value, though previous ones were 100.
So you have a choice. Do you want the graph to show the average for the time period, the maximum, or both? Do you want the figures at the bottom to show the maximum of the average, or the maximum of everything?
For example
"DEF:temperatur=$rrdFile:temperatur:AVERAGE",
"DEF:humidity=$rrdFile:humidity:AVERAGE",
"DEF:temperaturmax=$rrdFile:temperatur:MAX;reduce=MAX",
"DEF:humiditymax=$rrdFile:humidity:MAX;reduce=MAX",
"LINE1:temperatur#33CC33:Temperatur",
"LINE1:temperaturmax#66EE66:Maximum Temperatur",
"GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
"GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
"GPRINT:temperaturmax:MAX:Maximum\: %4.2lf °C\j",
"LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
"LINE1:humiditymax#3333FF:Maximum Luftfeuchtigkeit",
"GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
"GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
"GPRINT:humiditymax:MAX:Maximum\: %4.2lf %%\j",
In this case, we define a separate DEF for the maximum data set, so that we can always obtain the highest value even after consolidation. This is also used in the GPRINT so that we get the MAX of the MAX rather than the MAX of the AVERAGE. The Maximum line is now drawn separately to the average line, so that we can see the effect of any rollup of data - the lines will be together at high-resolution but get further apart as the time window widens and resolution decreases.
TheDEF is set to force any rollup function used for the maxima to be MAX rather than AVG, so we can be sure to get the maximum rather than average of maxima.
We are also using AVERAGE rather than LAST in order to get more meaningful data after rollup. Note that we could also use a separate DEF for the LAST as well if we wanted to though it is of less usefulness.
Note that, if you ever expect to be generating graphs over more than a few days, you should definitely consider adding some lower-resolution RRAs for AVERAGE and MAX or else the graphs will generate very slowly. RRDTool is designed with the intention that data will be rolled up over time, rather than (as in a traditional database) every sample kept as-is. So, unless you really need to have 30s resolution data kept for an entire year, you may prefer to keep this high resolution data for only a week, and then have separate RRAs that roll up to 1 hour resolution and keep for longer. Many people keep the 30s for 2 days, then 30min-summary for 2 weeks, 2h-summary for 2 months, and then 1day-summary for 2 years.
For more information, see the RRDTool manual pages.

Differences of using the rrdtool RRD PDP or RRA consolidation function to calculate the average reading?

I'm updating a rrdtool round-robin database with 1 minute intervals. I want to store the average value of five updates as one RRA entry in rrdtool RRD. One way to do this is like this:
$ rrdtool create foo.rrd --start 1000000500 --step 60 \
> DS:ping:GAUGE:120:0:1000 RRA:AVERAGE:0.5:5:12; \
> rrdtool update foo.rrd 1000000560:10 1000000620:20 \
> 1000000680:30 1000000740:40 1000000800:50
It accumulates five readings and stores the average of those as an entry in RRA. However, one could achieve the same with this:
$ rrdtool create bar.rrd --start 1000000500 --step 300 \
> DS:ping:GAUGE:600:0:1000 RRA:AVERAGE:0.5:1:12; \
> rrdtool update bar.rrd 1000000560:10 1000000620:20 \
> 1000000680:30 1000000740:40 1000000800:50
As seen above, step is 300 seconds, but as RRD PDP accepts values between the intervals and calculates the average, then both examples store 30((10+20+30+40+50)/5) in RRA. One difference, which I can tell, is that first example requires at least three updates to store an entry to RRA while in case of the second example, a single update within 300 second step is enough. Are there any other differences?
These two examples are not really the same thing under the covers, though they can appear the same in some circumstances.
In the first, you have a 60s step, and your RRA stores the average of 5 PDPs in each CDP.
In the second, you have a 300s step, and your RRA stores each PDP as a CDP.
Here are some differences:
In the first, you will need at least one sample (PDP) every 2 minutes; so three to cover each CDP in the RRA. In the second, you need a single sample every CDP.
In the first, data normalisation of each sample happens over a 60s window. In the second, it happens over a 300s window. This will make things look different when the samples arrive irregularly.
In the first, you can have up to 120s with no data before you get an Unknown; in the second, up to 600s.
While the RRA outcome is pretty much the same in both, which you choose would depend on the nature of your incoming data (how often you get samples, how irregular these are), and your storage and display requirements (if you need higher granularity stored or displayed). The first option is more accurate if you have frequent samples; the second is less storage and less work but may sacrifice some data if you have updates more frequent than the step.
Note that, if you have other RRA types than just AVG, having a smaller step will make calculations more accurate.
In general, I would recommend that you set the step to be close to the expected average sample frequency,with a latency setting according to how regular the data are. Set your RRA consolodation depending on how you need to view and display the data, and how long you are required to hold history for. Create RRAs corresponding to normal display rollups, in order to minimise the amount of on-the-fly calculations.

rrdtool info ds unknown_sec meaning

The answer to this question is probably obvious, but I couldn't figure out what it means nor find the documentation for it. When I run rrdtool info, I get ds[source].unknown_sec = 0, and I am not sure what it exactly means...Any help or pointers would be appreciated!
The unknown_sec is the number of seconds for which the value of the DS is Unknown. This could be because the supplied value was Unknown, or was outside the specified range, or because the time since the last sample exceeds the heartbeat time (which marks everything since then as Unknown).
The amount of Unknown time in an interval is then used in combination with the xff fraction in the RRA definitions to determine if a consolodated data point stored in the RRA is Unknown.
Actually, I think I figured out what it means. If two data samples exceed the heartbeat value, then the entire interval between the two data samples are marked as unknown. So unknown_sec = 0 means there hasn't been two data samples that exceed the heartbeat value.

RRDTool Counter increment lower than time

I create a standard RRDTool database with a default step of 5mn (300s).
I have different types of values in it, some gauges which are easily processed, but I have other values I would have in COUNTER but here is my problem :
I read the data in a program, and get the difference between values over two steps is good but the counter increment less than time (It can increment by less than 300 during a step), so my out value is wrong.
Is it possible to change the COUNTER for not be a number by second but by step or something like that, if it's not I suppose I have to calculate the difference in my program ?
Thank you for helping.
RRDTool is capable of handling fractional values, so there is no problem if the counter increments by less than the seconds interval since the last update.
RRDTool stores everything as a Rate. If your DS is of type GAUGE, then RRDTool assumes that the incoming value is alreayd a rate, and only applies Data Normalisation (more on this later). If the type is COUNTER or DERIVE, then the value/timepoint you are updating with is compared to the previous value/timepoint to obtain a rate thus: r=(x2 - x1)/(t2 - t1). The rate obtained is then Normalised. The other DS type is ABSOLUTE, which assumes the counter was reset on the last read, giving r=x2/(t2 - t1).
The Normalisation step adjusts the data point based on assuming a linear progression from the last data point so that it lies exactly on an interval boundary. For example, if your step is 5min, and you update at 12:06, the data point is adjusted back to what it would have been at 12:05, and stored against 12:05. However the last unadjusted DP is still preserved for use at the next update, so that overall rates are correct.
So, if you have a 300s (5min) interval, and the value increased by 150, the rate stored will be 0.5.
If the value you are graphing is something small, e.g. 'number of pages printed', this might seem counterintuitive, but it works well for large rates such as network traffic counters (which is what RRDTool was designed for).
If you really do not want to display fractional values in the generated graphs or output, then you can use a format string such as %.0f to enforce no decimal places and the displayed number will be rounded to the nearest integer.

rrdtool Holt-Winters feature

I mainly write because I'm using the rrdtool holt-winters feature, but sadly it does not work as I would, starting I'll write for you the rrd file command line creation:
`/usr/bin/rrdtool create /home/spread/httphw/rrd/httpr.rrd --start $_[7]-60 --step 60 DS:200:GAUGE:120:U:U RRA:AVERAGE:0.5:1:1440 RRA:HWPREDICT:1440:0.1:0.0035:288 RRA:AVERAGE:0.5:6:700 RRA:AVERAGE:0.5:24:775 RRA:AVERAGE:0.5:288:797`;
After that I basically insert data and then I draw the graph like that:
`/usr/bin/rrdtool graph file.png --start $start --end $time --width 600 --height 200 --imgformat PNG DEF:doscents=$rrd:200:AVERAGE DEF:pred=$rrd:200:HWPREDICT DEF:dev=$rrd:200:DEVPREDICT DEF:fail=$rrd:200:FAILURES TICK:fail#ffffa0:1.0:"Failures Average" CDEF:scale200=doscents,8,* CDEF:upper=pred,dev,2,*,+ CDEF:lower=pred,dev,2,*,- CDEF:scaledupper=upper,8,* CDEF:scaledlower=lower,8,* LINE1:scale200#0000ff:"Average" LINE1:scaledupper#ff0000:"Upper Bound Average" LINE1:scaledlower#ff0000:"Lower Bound Average"`;
Here's the image RRDTOOL IMAGE
The I get a graph like that, but as you can see there's yellow lines that indicates that there has been an error when that's not true, I mean, the activity line at that point is slightly out from the red area but it does not an error, I basically need to understand the values I gotta set up and based on what, I tried it out but I don't really understand the system really well.
Any sugestion from an rrdtool expert?
Many thanks in advance
Being outside the expected range is an error, as far as Holt-Winters is concerned.
The Holt-Winters FAILURES RRA is a slightly more complex than just 'outside the range HWPREDICT+-2*DEVPREDICT'. In fact, there are the additional threshold and window parameters, which (if not specified, as in your case) default to 7 and 9 respectively.
These cause a smoothing of the samples over window samples before comparison, and only trigger a FAILURE flag when there is a sequence of threshold consecutive errors.
As a result, you see a FAILURE trigger where you do, and not in the larger area to the left (which averages down within the range). This results in a better indicator of consistently our of range behaviour, rather than a slope slightly too early or a temporary spike.
If you want to avoid this, and have a FAILURE flag every time the data goes outside of the predicted bounds, then set the FAILURE parameters to 1 and 1. To do this, you would need to explicitly define the additional HW RRAs rather than having them defined implicitly as you do now.
On a separate note, is is bad practice to have a DS with a purely numerical name. It can cause confusion in the RPN calculations. Always have a DS name start with a lowercase letter.