Can anybody explain to me why I get different values when I fetch from my RRD database than what I filled it with?
Here are the commands:
a. Create database
rrdtool create temperature.rrd --step 300 -b 1374150100 \
DS:temp:GAUGE:300:U:U \
RRA:AVERAGE:0:1:5
b. Fill with data
rrdtool update temperature.rrd \
1374150400:6 \
1374150700:8 \
1374151000:4 \
1374151300:4
c. Fetch data
rrdtool fetch temperature.rrd AVERAGE --start 1374150099 --end 1374151301
Output:
temp
1374150300: 6.0000000000e+00
1374150600: 7.3333333333e+00
1374150900: 5.3333333333e+00
1374151200: 4.0000000000e+00
1374151500: -nan
I filled the database with data in exactly that period. I have no idea why it displays 7.3 and 5.3?!
Did I miss something?
OK, I got it. The problem is that the start time does not fit the step interval. As you can see in the output, the steps begin at ...300, then ...600 and so on, while I filled in the data at ...400, ...700 etc.
So the solution is to supply the values on the correct step interval, and then it works.
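For reference, a minimal sketch of the aligned variant (this sketch is mine, not part of the original post): the timestamps are shifted back by 100 seconds so that both the start time and every sample land exactly on a multiple of 300, and the fetch should then return the values unchanged.
rrdtool create temperature.rrd --step 300 -b 1374150000 \
DS:temp:GAUGE:300:U:U \
RRA:AVERAGE:0:1:5
rrdtool update temperature.rrd \
1374150300:6 \
1374150600:8 \
1374150900:4 \
1374151200:4
rrdtool fetch temperature.rrd AVERAGE --start 1374150000 --end 1374151200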
Related
RRD does not update the second datasource correctly.
First, I create the RRD file with two datasources (c1 and c2):
rrdtool create test.rrd --start N --step 60 DS:c1:GAUGE:120:0:100 DS:c2:GAUGE:120:0:100 RRA:AVERAGE:0.5:1:1440
Then I update the two datasources in two commands:
rrdtool update test.rrd -t c1 N:10 && rrdtool update test.rrd -t c2 N:10
Wait for 60 seconds....
Then do another update:
rrdtool update test.rrd -t c1 N:20 && rrdtool update test.rrd -t c2 N:20
Then let's see what we have:
rrdtool fetch test.rrd AVERAGE | tail -5
1468409580: -nan -nan
1468409640: -nan -nan
1468409700: -nan -nan
1468409760: 1,5988575517e+01 1,9266620475e-01
1468409820: -nan -nan
The first datasource c1 works as expected, but the second one, c2, shows a value lower than 1, whereas I would expect a value close to 15 there as well.
Yes, I know I could also update both datasources in ONE update command, but in my case I have a lot of datasources in one RRD file and it is easier to read and follow the mass of values this way.
RRD version used: 1.6.0
This is partly Data Normalisation, of course; but it is also caused by your updating the two datasources in two separate calls.
If you instead use:
rrdtool update test.rrd -t c1:c2 N:10:10
rrdtool update test.rrd -t c1:c2 N:20:20
then you will be updating both DSs at the same time. You see, when you do it in separate updates, what you are actually doing is implicitly updating the other DS with 'unknown' and then relying on the automatic interpolation to fill things in. RRDTool is not a relational database, and you cannot update values in a time window independently without affecting the other values.
The other issue is Data Normalisation, where values are adjusted temporally to fit into the exact time boundaries and in doing so, the values are adjusted to be linearly equivalent... the practical upshot when using network traffic (big numbers) is pretty much the same, and the overall totals and averages are consistent, but smaller point-in-time values end up as decimals like this.
So, two things:
Update your DSs all together, not in separate calls
Try to update exactly on the time boundary (instead of using 'N', use an exact time, rounded to the nearest minute), as in the sketch below
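For the second point, a small sketch assuming the 60-second step from the example above (the shell arithmetic is mine, not from the original answer): compute a timestamp on the minute boundary and use it instead of N.
NOW=$(( ( $(date +%s) + 30 ) / 60 * 60 ))   # round the current time to the nearest minute
rrdtool update test.rrd -t c1:c2 ${NOW}:10:10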
Am I correct that the value of the --step option is used solely for pre-calculating the data slots in the RRD? Or does RRD somehow expect updates at the interval specified with --step?
RRDtool will 're-sample' the data you provide onto the --step interval before continuing to process it. You can deliver as many updates as you wish; RRDtool will take them all into account when building each --step interval.
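A minimal sketch of that re-sampling (the file and DS names are hypothetical, not from the question): three gauge updates inside a single 300-second step are combined, time-weighted, into the one value stored for that step, which here should work out to 6.
rrdtool create resample.rrd --start 1200000000 --step 300 \
DS:v:GAUGE:600:U:U \
RRA:AVERAGE:0.5:1:10
rrdtool update resample.rrd 1200000100:3 1200000200:6 1200000300:9
rrdtool fetch resample.rrd AVERAGE --start 1200000000 --end 1200000300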
First, I created an RRD database:
$ rrdtool create test.rrd --start 1200000000 --step 300 DS:test_1:GAUGE:600:0:100 RRA:AVERAGE:0.5:1:12
Second, do some updates:
$ rrdtool update test.rrd 1200000100:1
$ rrdtool update test.rrd 1200000400:3
$ rrdtool update test.rrd 1200000700:4
$ rrdtool update test.rrd 1200001000:5
Third, fetch data from test.rrd:
$ rrdtool fetch test.rrd -r 300 -s 1200000000 -e 1200001000 AVERAGE
Why is 1200000300 2.333?
This is caused by Data Normalisation. RRDTool will automatically adjust data to fit exactly on the time boundary of the defined Interval.
Although your data are spaced exactly at 300s intervals, the same as your defined Interval (step), unfortunately they are not on the actual boundaries.
The boundary is when time modulo step is equal to zero. In your case, that would be at time 1200000000 and not at 1200000100. Thus, the sample needs to be adjusted (one third of it allocated to the earlier interval, and two thirds to the later). This is further complicated because you are operating in Gauge mode, whereas RRDTool interpolates assuming a linear rate of change.
If you started your samples at time 1200000300 or at 1200000000, then you would see them stored exactly as given, because the normalisation step would become a null operation. Since you provide Gauge samples at 1200000100 and 1200000400, the stored value for 1200000300 will be two-thirds of the way along a line joining the two samples: 1 + (3 - 1) x 2/3 = 2.333, which is what you are getting.
The tutorial by Alexander van den Bogaerdt will explain it all to you.
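As a check on the explanation above, a sketch of the boundary-aligned variant (test2.rrd is just a scratch file name I've introduced): the same data shifted onto exact multiples of 300 should come back exactly as entered, because the normalisation becomes a null operation.
rrdtool create test2.rrd --start 1200000000 --step 300 DS:test_1:GAUGE:600:0:100 RRA:AVERAGE:0.5:1:12
rrdtool update test2.rrd 1200000300:1
rrdtool update test2.rrd 1200000600:3
rrdtool update test2.rrd 1200000900:4
rrdtool update test2.rrd 1200001200:5
rrdtool fetch test2.rrd -r 300 -s 1200000000 -e 1200001200 AVERAGE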
Here is what I am trying to achieve:
I read my data once a day (the exact time of the day is not very important).
I want to archive the values for this DS for two years back.
I need to be able to look back for 2 years and I need the value for every day
and I also need to see the weekly average
If I miss a reading for two consecutive days the data should be declared unknown
Here is what I am using for this:
rrdtool create Carsforsale.rrd --start 20130217 --step 86400 ^
DS:MidsizeCars:GAUGE:172800:U:U ^
DS:FullSizeCars:GAUGE:172800:U:U ^
RRA:AVERAGE:0:7:104 ^
RRA:LAST:0:1:720
I updated the above database with
rrdtool update Carsforsale.rrd 1361203200:554:791
rrdtool update Carsforsale.rrd 1361289600:556:795
The updates correspond to yesterday and the day before yesterday (18 and 19 Feb).
I tried to plot the graphs for the above using this
rrdtool graph "Inventory.png" \
--start "20130217" \
--imgformat PNG --width 850 --height 400 \
DEF:MidsizeCars=Carsforsale.rrd:MidsizeCars:AVERAGE \
DEF:FullSizeCars=Carsforsale.rrd:FullSizeCars:AVERAGE \
AREA:MidsizeCars#0000FF:"MidsizeCars" \
AREA:FullSizeCars#FF004D:"FullSizeCars":STACK
And now here are my questions:
Are the step and the heartbeat defined correctly for what I want to do?
Why are my graphs empty?
Looking into the database with the freeware utility called RRD Editor, I could see that the last values are stored in MidsizeCars and FullSizeCars, but the only archive that contains a history of what has been loaded into the database is the LAST RRA. Am I supposed to plot LAST or AVERAGE to see the current values?
Thanks
C
Since you want to keep the data for two years at 1-day resolution, you have to set up an appropriate RRA for this purpose ... since this will only be about 730 values, I would not bother with setting up an extra consolidated RRA for the week; that will get calculated on the fly ...
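A minimal sketch of what such a create could look like, keeping the poster's Windows-style ^ continuations (the xff of 0.5 is my assumption, not taken from the question): one AVERAGE RRA at 1-day resolution holding roughly two years (730 rows) of data.
rrdtool create Carsforsale.rrd --start 20130217 --step 86400 ^
DS:MidsizeCars:GAUGE:172800:U:U ^
DS:FullSizeCars:GAUGE:172800:U:U ^
RRA:AVERAGE:0.5:1:730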
I'm just getting started using RRDtool to collect climate data. I don't use the graph functionality, but rather use "fetch" to retrieve data. I then use another graphing solution (flot) to display the data, and that seems to work somewhat. But I had some small problems and decided to check the details of the update and fetching and was suddenly not so sure that things worked as I expected.
So I've created a tiny shell script that creates a database, put a single value in it and then print the contents:
#!/bin/sh
RRD=test.rrd
STEP=300
HB=600
# Remove previous database to be sure that
# old data does not affect the test
rm -f $RRD
# Create database
rrdtool create $RRD \
--start 2999999999 --step $STEP \
DS:a:GAUGE:$HB:U:U \
RRA:AVERAGE:0.5:1:1000
# Do a single update
rrdtool update $RRD \
3000000400:123
# Fetch data and print to stdout
rrdtool fetch $RRD \
--start 3000000000 --end 3000000900 AVERAGE
I would expect this to print three (or perhaps four, not sure about the last one) values like this:
3000000000: -nan
3000000300: 123
3000000600: -nan
3000000900: -nan
But this is what I get:
3000000300: -nan
3000000600: -nan
3000000900: -nan
3000001200: -nan
So I have three questions:
Why does the fetch command start at 300, instead of 0?
Why does the fetch command include not only the last step (900) but also one more (1200)?
Why was not the updated value accepted?
The timeslot b contains information valid from b-step up to b, EXCLUDING b itself. Hence, when asking for data from 3000000000 to 3000000900, the first entry you get is 3000000300.
Since you are asking for data to end at 3000000900, you also get the entry for 3000001200, as 3000000900 itself is the start of that entry.
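A small illustration of that boundary rule, following the convention just described (the values themselves depend on what is in the file, so only the row timestamps matter here): asking for data from 3000000000 to 3000000600 should return the rows labelled 3000000300, 3000000600 and 3000000900, since 3000000600 is the start of the interval belonging to the 3000000900 row.
rrdtool fetch test.rrd --start 3000000000 --end 3000000600 AVERAGE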
At the moment, even in gauge mode, you have to have a known value to start off ... so your first known update simply brings you back into a known state; it does not yet establish anything else. One might argue that in GAUGE mode this could be done differently.