I'm using an RRD to monitor a data source. On many occasions the RRD stores NaN even though we know data was received, because we also append the received data to a file for testing. Comparing the two, we see the following:
I tried to paste the data as two columns, but the formatting was lost. In essence, what you see below is two columns of a spreadsheet: the left column is the rrdtool dump output and the right column is the actual data (timestamp:value) that arrived at that time.
" <!-- 2017-09-28 06:00:00 UTC / 1506578400 --> <row><v>1.1999200000e+06</v></row>" 1506578412:1202000
" <!-- 2017-09-28 06:05:00 UTC / 1506578700 --> <row><v>1.2538400000e+06</v></row>" 1506578712:1256000
" <!-- 2017-09-28 06:10:00 UTC / 1506579000 --> <row><v>1.2310400000e+06</v></row>" 1506579012:1230000
" <!-- 2017-09-28 06:15:00 UTC / 1506579300 --> <row><v>1.2415200000e+06</v></row>" 1506579312:1242000
" <!-- 2017-09-28 06:20:00 UTC / 1506579600 --> <row><v>1.2304800000e+06</v></row>" 1506579612:1230000
" <!-- 2017-09-28 06:25:00 UTC / 1506579900 --> <row><v>1.2357600000e+06</v></row>" 1506579912:1236000
" <!-- 2017-09-28 06:30:00 UTC / 1506580200 --> <row><v>1.1284800000e+06</v></row>" 1506580212:1124000
" <!-- 2017-09-28 06:35:00 UTC / 1506580500 --> <row><v>1.2238400000e+06</v></row>" 1506580512:1228000
" <!-- 2017-09-28 06:40:00 UTC / 1506580800 --> <row><v>NaN</v></row>" 1506580813:1222000
" <!-- 2017-09-28 06:45:00 UTC / 1506581100 --> <row><v>1.2400000000e+06</v></row>" 1506581112:1240000
" <!-- 2017-09-28 06:50:00 UTC / 1506581400 --> <row><v>1.2284800000e+06</v></row>" 1506581412:1228000
" <!-- 2017-09-28 06:55:00 UTC / 1506581700 --> <row><v>8.9392000000e+05</v></row>" 1506581712:880000
" <!-- 2017-09-28 07:00:00 UTC / 1506582000 --> <row><v>NaN</v></row>" 1506582014:1000000
" <!-- 2017-09-28 07:05:00 UTC / 1506582300 --> <row><v>NaN</v></row>" 1506582315:738000
" <!-- 2017-09-28 07:10:00 UTC / 1506582600 --> <row><v>1.1760000000e+06</v></row>" 1506582613:1176000
" <!-- 2017-09-28 07:15:00 UTC / 1506582900 --> <row><v>1.1874800000e+06</v></row>" 1506582912:1188000
" <!-- 2017-09-28 07:20:00 UTC / 1506583200 --> <row><v>1.2033600000e+06</v></row>" 1506583212:1204000
" <!-- 2017-09-28 07:25:00 UTC / 1506583500 --> <row><v>1.2097600000e+06</v></row>" 1506583512:1210000
" <!-- 2017-09-28 07:30:00 UTC / 1506583800 --> <row><v>1.0717600000e+06</v></row>" 1506583811:1066000
" <!-- 2017-09-28 07:35:00 UTC / 1506584100 --> <row><v>NaN</v></row>" 1506584112:1222000
" <!-- 2017-09-28 07:40:00 UTC / 1506584400 --> <row><v>1.1760000000e+06</v></row>" 1506584412:1176000
" <!-- 2017-09-28 07:45:00 UTC / 1506584700 --> <row><v>1.2048000000e+06</v></row>" 1506584712:1206000
" <!-- 2017-09-28 07:50:00 UTC / 1506585000 --> <row><v>1.0255200000e+06</v></row>" 1506585012:1018000
" <!-- 2017-09-28 07:55:00 UTC / 1506585300 --> <row><v>1.2004000000e+06</v></row>" 1506585312:1208000
" <!-- 2017-09-28 08:00:00 UTC / 1506585600 --> <row><v>1.1676800000e+06</v></row>" 1506585612:1166000
" <!-- 2017-09-28 08:05:00 UTC / 1506585900 --> <row><v>1.2024800000e+06</v></row>" 1506585912:1204000
" <!-- 2017-09-28 08:10:00 UTC / 1506586200 --> <row><v>1.2116800000e+06</v></row>" 1506586212:1212000
" <!-- 2017-09-28 08:15:00 UTC / 1506586500 --> <row><v>NaN</v></row>" 1506586513:886000
" <!-- 2017-09-28 08:20:00 UTC / 1506586800 --> <row><v>1.1940000000e+06</v></row>" 1506586812:1194000
" <!-- 2017-09-28 08:25:00 UTC / 1506587100 --> <row><v>1.1959200000e+06</v></row>" 1506587112:1196000
" <!-- 2017-09-28 08:30:00 UTC / 1506587400 --> <row><v>NaN</v></row>" 1506587413:1206000
" <!-- 2017-09-28 08:35:00 UTC / 1506587700 --> <row><v>1.1440000000e+06</v></row>" 1506587712:1144000
" <!-- 2017-09-28 08:40:00 UTC / 1506588000 --> <row><v>NaN</v></row>" 1506588013:668000
" <!-- 2017-09-28 08:45:00 UTC / 1506588300 --> <row><v>1.2080000000e+06</v></row>" 1506588312:1208000
" <!-- 2017-09-28 08:50:00 UTC / 1506588600 --> <row><v>NaN</v></row>" 1506588613:1156000
" <!-- 2017-09-28 08:55:00 UTC / 1506588900 --> <row><v>1.2080000000e+06</v></row>" 1506588912:1208000
" <!-- 2017-09-28 09:00:00 UTC / 1506589200 --> <row><v>1.1945600000e+06</v></row>" 1506589212:1194000
" <!-- 2017-09-28 09:05:00 UTC / 1506589500 --> <row><v>1.1786400000e+06</v></row>" 1506589512:1178000
" <!-- 2017-09-28 09:10:00 UTC / 1506589800 --> <row><v>1.1396000000e+06</v></row>" 1506589811:1138000
" <!-- 2017-09-28 09:15:00 UTC / 1506590100 --> <row><v>NaN</v></row>" 1506590113:1006000
" <!-- 2017-09-28 09:20:00 UTC / 1506590400 --> <row><v>1.1780000000e+06</v></row>" 1506590412:1178000
" <!-- 2017-09-28 09:25:00 UTC / 1506590700 --> <row><v>1.1799200000e+06</v></row>" 1506590712:1180000
" <!-- 2017-09-28 09:30:00 UTC / 1506591000 --> <row><v>1.1953600000e+06</v></row>" 1506591012:1196000
" <!-- 2017-09-28 09:35:00 UTC / 1506591300 --> <row><v>1.1806400000e+06</v></row>" 1506591312:1180000
" <!-- 2017-09-28 09:40:00 UTC / 1506591600 --> <row><v>1.1588800000e+06</v></row>" 1506591612:1158000
" <!-- 2017-09-28 09:45:00 UTC / 1506591900 --> <row><v>1.2002400000e+06</v></row>" 1506591912:1202000
" <!-- 2017-09-28 09:50:00 UTC / 1506592200 --> <row><v>1.0656800000e+06</v></row>" 1506592212:1060000
" <!-- 2017-09-28 09:55:00 UTC / 1506592500 --> <row><v>1.2078400000e+06</v></row>" 1506592512:1214000
" <!-- 2017-09-28 10:00:00 UTC / 1506592800 --> <row><v>1.1640800000e+06</v></row>" 1506592812:1162000
" <!-- 2017-09-28 10:05:00 UTC / 1506593100 --> <row><v>1.1754400000e+06</v></row>" 1506593112:1176000
We can see that the occasions where the data is not accepted are almost always those where its arrival time is slightly off the usual pattern.
How can we go about widening the acceptance criteria so that all of these datapoints are accepted?
RRD info for the RRD in question is shown below:
root#ra:/var/www/genie/public_html# rrdtool info /an/data/SI1.rrd
filename = "/an/data/SI1.rrd"
rrd_version = "0003"
step = 300
last_update = 1506594312
header_size = 1000
ds[probe1-temp].index = 0
ds[probe1-temp].type = "GAUGE"
ds[probe1-temp].minimal_heartbeat = 300
ds[probe1-temp].min = 0.0000000000e+00
ds[probe1-temp].max = 5.0000000000e+06
ds[probe1-temp].last_ds = "1226000"
ds[probe1-temp].value = NaN
ds[probe1-temp].unknown_sec = 12
rra[0].cf = "MIN"
rra[0].rows = 1440
rra[0].cur_row = 238
rra[0].pdp_per_row = 12
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = 1.1754400000e+06
rra[0].cdp_prep[0].unknown_datapoints = 2
rra[1].cf = "MAX"
rra[1].rows = 1440
rra[1].cur_row = 1220
rra[1].pdp_per_row = 12
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 1.2140000000e+06
rra[1].cdp_prep[0].unknown_datapoints = 2
rra[2].cf = "AVERAGE"
rra[2].rows = 1440
rra[2].cur_row = 1205
rra[2].pdp_per_row = 1
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = NaN
rra[2].cdp_prep[0].unknown_datapoints = 0
root#ra:#
You have set the DS heartbeat to 300, which is the same as the step.
This means that if two updates arrive more than 300s apart, the interval is stored as NaN, which is what you are seeing. In the data you give, every NaN row follows an update that arrived 301 or 302 seconds after the previous one; this exceeds the heartbeat, so the result is NaN.
You should normally set the heartbeat to twice the expected data interval, i.e. twice the step, to handle this case.
Try setting the heartbeat to 600; this should solve the problem.
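The gaps can be checked directly from the update timestamps logged in the question. A minimal Python sketch, assuming only a short excerpt of those timestamps (the names `timestamps` and `late_updates` are illustrative, not part of RRDtool), that flags every update arriving more than the heartbeat after its predecessor:

```python
# Update timestamps copied from the right-hand column of the question
# (a short excerpt, roughly 06:30 to 07:10 UTC).
timestamps = [1506580212, 1506580512, 1506580813, 1506581112,
              1506581412, 1506581712, 1506582014, 1506582315, 1506582613]

HEARTBEAT = 300  # ds[probe1-temp].minimal_heartbeat from `rrdtool info`

def late_updates(stamps, heartbeat):
    """Return (timestamp, gap) pairs whose gap to the previous update exceeds heartbeat."""
    return [(cur, cur - prev)
            for prev, cur in zip(stamps, stamps[1:])
            if cur - prev > heartbeat]

for ts, gap in late_updates(timestamps, HEARTBEAT):
    print(ts, gap)
# prints:
# 1506580813 301
# 1506582014 302
# 1506582315 301
```

The three flagged timestamps are exactly the updates next to the NaN rows in that part of the dump, and none of them would be flagged with a heartbeat of 600. On an existing file the heartbeat can be changed in place with rrdtool tune /an/data/SI1.rrd --heartbeat probe1-temp:600.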
I have an .xml file that contains lines like these:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
<version>0003</version>
<step>60</step> <!-- Seconds -->
<lastupdate>1674125860</lastupdate> <!-- 2023-01-19 10:57:40 UTC -->
<ds>
<name> 1 </name>
<type> GAUGE </type>
<minimal_heartbeat>8460</minimal_heartbeat>
<min>NaN</min>
<max>NaN</max>
<!-- PDP Status -->
<last_ds>954298368</last_ds>
<value>3.8171934720e+10</value>
<unknown_sec> 0 </unknown_sec>
</ds>
<!-- Round Robin Archives -->
<rra>
<cf>AVERAGE</cf>
<pdp_per_row>1</pdp_per_row> <!-- 60 seconds -->
<params>
<xff>5.0000000000e-01</xff>
</params>
<cdp_prep>
<ds>
<primary_value>8.5981579947e+08</primary_value>
<secondary_value>0.0000000000e+00</secondary_value>
<value>NaN</value>
<unknown_datapoints>0</unknown_datapoints>
</ds>
</cdp_prep>
<database>
<!-- 2023-01-17 10:58:00 UTC / 1673953080 --> <row><v>NaN</v></row>
<!-- 2023-01-17 10:59:00 UTC / 1673953140 --> <row><v>NaN</v></row>
<!-- 2023-01-17 11:00:00 UTC / 1673953200 --> <row><v>NaN</v></row>
<!-- 2023-01-17 11:01:00 UTC / 1673953260 --> <row><v>NaN</v></row>
<!-- 2023-01-17 11:02:00 UTC / 1673953320 --> <row><v>NaN</v></row>
<!-- 2023-01-17 11:03:00 UTC / 1673953380 --> <row><v>NaN</v></row>
<!-- 2023-01-18 12:00:00 UTC / 1674043200 --> <row><v>NaN</v></row>
<!-- 2023-01-18 18:00:00 UTC / 1674064800 --> <row><v>7.9644330667e+08</v></row>
<!-- 2023-01-19 00:00:00 UTC / 1674086400 --> <row><v>7.9696554667e+08</v></row>
<!-- 2023-01-19 06:00:00 UTC / 1674108000 --> <row><v>5.8408509440e+08</v></row>
</database>
</rra>
I am trying to convert the values in scientific notation (which are in bytes) to megabytes and write them back in scientific notation, using a Linux bash shell or script.
So far I have this line, but I am stuck and don't know how to put the values back into the file after dividing them twice by 1024:
cat Memory_mem_used.xml | grep -Eo '[0-9]+\.[0-9]+e\+[0-9]+' | perl -ne 'printf "%d\n", $_;'
The output should look like this:
output=796443306; output2=$((output / 1024 / 1024)); perl -e 'printf "%.11e\n", $ARGV[0]' "$output2"
7.59000000000e+02
Try:
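#!/bin/bash
# Rewrite every <v>...</v> value from bytes to whole megabytes (1048576 bytes).
while IFS= read -r line || [ -n "$line" ]; do
    left=${line%%<v>*}       # text before the first <v>
    rest=${line#*<v>}        # text after the first <v>
    value=${rest%%</v>*}     # content of the <v> element
    right=${rest#*</v>}      # text after </v>
    # Lines without <v> leave $value equal to $line; NaN rows are kept as-is.
    if [ "$value" != "$line" ] && [ "$value" != "NaN" ]; then
        num_value=$(LC_ALL=C printf '%.0f' "$value")    # scientific notation -> whole bytes
        new_value=$(LC_ALL=C printf '%.11e' "$((num_value / 1048576))")
        line="$left<v>$new_value</v>$right"
    fi
    printf '%s\n' "$line"
done < input.xml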
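For comparison, the same per-line transformation can be sketched in Python. This is only an illustration and not part of the answer above: the function name `convert_line` is made up here, and it mirrors the script by rounding to whole bytes and then using integer division by 1048576. Unlike the shell loop, it rewrites every `<v>` element on a line rather than only the first.

```python
import re

BYTES_PER_MB = 1024 * 1024

def convert_line(line):
    """Rewrite each numeric <v>...</v> value from bytes to whole megabytes."""
    def repl(match):
        text = match.group(1).strip()
        if text == "NaN":
            return match.group(0)          # leave unknown values untouched
        megabytes = round(float(text)) // BYTES_PER_MB
        return "<v>%.11e</v>" % megabytes
    return re.sub(r"<v>([^<]*)</v>", repl, line)

print(convert_line("<row><v>7.9644330667e+08</v></row>"))
# prints: <row><v>7.59000000000e+02</v></row>
```

Lines without a `<v>` element, such as the header fields of the dump, pass through unchanged.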
How can I get the start of the next minute from current_timestamp in Presto/Athena?
E.g.
2021-07-27 12:29:52.951 UTC -> 2021-07-27 12:30:00.000 UTC
2021-07-27 12:29:25.951 UTC -> 2021-07-27 12:30:00.000 UTC
There's no built-in function for that, but you can do it by adding 1 minute to the timestamp and then using date_trunc to round down to the nearest minute:
WITH data(ts) AS (
VALUES
TIMESTAMP '2021-07-27 12:29:52.951 UTC',
TIMESTAMP '2021-07-27 12:29:25.951 UTC'
)
SELECT ts, date_trunc('minute', ts + INTERVAL '1' MINUTE)
FROM data
=>
ts | _col1
-----------------------------+-----------------------------
2021-07-27 12:29:52.951 UTC | 2021-07-27 12:30:00.000 UTC
2021-07-27 12:29:25.951 UTC | 2021-07-27 12:30:00.000 UTC
(2 rows)
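The same add-then-truncate logic, sketched in Python purely as an illustration (the helper name `next_minute` is hypothetical and has nothing to do with Presto itself). It also makes one edge case easy to see: a timestamp that is already exactly on a minute boundary moves to the following minute.

```python
from datetime import datetime, timedelta, timezone

def next_minute(ts):
    """Add one minute, then drop seconds and microseconds (like date_trunc('minute', ...))."""
    return (ts + timedelta(minutes=1)).replace(second=0, microsecond=0)

ts = datetime(2021, 7, 27, 12, 29, 52, 951000, tzinfo=timezone.utc)
print(next_minute(ts).isoformat())
# prints: 2021-07-27T12:30:00+00:00

on_boundary = datetime(2021, 7, 27, 12, 30, tzinfo=timezone.utc)
print(next_minute(on_boundary).isoformat())
# prints: 2021-07-27T12:31:00+00:00
```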
I created an RRD file with a specific start time, but when I convert it to XML, the start time is inconsistent with the one I specified.
The version of rrdtool is 1.5.5.
The command is:
> rrdtool create abc.rrd \
> --step 15 --start 1554122342 \
> DS:sum:GAUGE:120:U:U \
> RRA:AVERAGE:0.5:1:5856 \
> RRA:AVERAGE:0.5:4:20160 \
> RRA:AVERAGE:0.5:40:52704
The first few rows look like this:
> <!-- 2019-03-31 20:15:15 CST / 1554034515 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:15:30 CST / 1554034530 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:15:45 CST / 1554034545 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:16:00 CST / 1554034560 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:16:15 CST / 1554034575 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:16:30 CST / 1554034590 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:16:45 CST / 1554034605 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:17:00 CST / 1554034620 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:17:15 CST / 1554034635 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:17:30 CST / 1554034650 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:17:45 CST / 1554034665 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:18:00 CST / 1554034680 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:18:15 CST / 1554034695 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:18:30 CST / 1554034710 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:18:45 CST / 1554034725 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:19:00 CST / 1554034740 --> <row><v>NaN</v></row>
> <!-- 2019-03-31 20:19:15 CST / 1554034755 --> <row><v>NaN</v></row>
I tried other start values, such as the default (now-10s), but the first row is always about one day earlier.
(My example below tested with RRDTool 1.5.5)
Your largest RRA is approximately 1 year long, at 10-minute intervals, in an RRD with a 15s step.
When you create an RRD, the start time is the time of the most recent data point or last update; in other words, you cannot add any data for a time earlier than this. The RRA will be initialised with unknown throughout.
So, when you create your RRD with:
rrdtool create abc.rrd --step 15 --start 1554122342 \
DS:sum:GAUGE:120:U:U RRA:AVERAGE:0.5:40:52704
you can see this using rrdtool info (output trimmed for clarity):
$ rrdtool info abc.rrd
filename = "abc.rrd"
...
last_update = 1554122342
When you then use rrdtool dump to immediately view the content of the RRA, you can see that it starts about a year earlier:
$ rrdtool dump abc.rrd
...
<lastupdate>1554122342</lastupdate> <!-- 2019-04-02 01:39:02 NZDT -->
...
<database>
<!-- 2018-04-01 01:40:00 NZDT / 1522500000 --> <row><v>NaN</v></row>
<!-- 2018-04-01 01:50:00 NZDT / 1522500600 --> <row><v>NaN</v></row>
...
<!-- 2019-04-02 01:20:00 NZDT / 1554121200 --> <row><v>NaN</v></row>
<!-- 2019-04-02 01:30:00 NZDT / 1554121800 --> <row><v>NaN</v></row>
</database>
But wait a minute! This ends on 1554121800, but our last update (start time) was 1554122342! This is a difference of 542. Why would this be?
The reason is that although your step is 15s, the RRA interval is 40 steps, i.e. 600s. The next entry cannot be added until there is 600s of data, and we only have 542. Therefore, the last entry in the RRA is as shown. Note that all intervals are normalised relative to UTC, so the timestamps of your RRA's CDPs (consolidated data points) will always be a multiple of the interval size (600 in this case), regardless of what you set 'start' to. RRDTool will simply pick the closest. This behaviour becomes much more obvious when you are rolling up to a large time period, e.g. 1 day, and you live in a more extreme time zone, e.g. Auckland at UTC+13.
Of course, once you write anything to the RRD, then lastupdate will change, and the RRA will add however many new points are required (and drop the old ones of course).
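The arithmetic can be sketched in a few lines of Python. This is a simplification rather than RRDtool's actual implementation: the helper `rra_bounds` is hypothetical and assumes rows are aligned by rounding the last update down to a multiple of the row interval, which is consistent with both dumps shown in this thread.

```python
def rra_bounds(last_update, step, pdp_per_row, rows):
    """Return (first_row_time, last_row_time) for a freshly created RRA.

    Assumption: row timestamps are multiples of step * pdp_per_row,
    counted in seconds since the Unix epoch (UTC).
    """
    interval = step * pdp_per_row                      # seconds per consolidated row
    last_row = last_update - last_update % interval    # round down to the interval
    first_row = last_row - (rows - 1) * interval
    return first_row, last_row

# RRA:AVERAGE:0.5:40:52704 from the answer: 600s rows, about one year.
print(rra_bounds(1554122342, 15, 40, 52704))
# prints: (1522500000, 1554121800)

# RRA:AVERAGE:0.5:1:5856 from the question: 15s rows.
print(rra_bounds(1554122342, 15, 1, 5856))
# prints: (1554034515, 1554122340)
```

The second result matches the first row in the question's dump (1554034515). That RRA covers 5856 × 15 s = 87840 s, or roughly 24.4 hours, which is why the first row appears about one day before the requested start time.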
How do I convert epoch time to date & time based on time zone in xslt 2.0 ?
For example, epoch time 1212497304 converts to
GMT: Tue, 03 Jun 2008 12:48:24 GMT
time zone: martes, 03 de junio de 2008 14:48:24 GMT+2
The difference comes from Daylight Saving Time (DST): many countries add one hour in summer, between dates that vary each year.
For example, it is supposed that this instruction:
<xsl:value-of select=" format-dateTime(xs:dateTime('2013-07-07T23:08:00+00:00'), '[D] [MNn] [Y] [h]:[m01][PN,*-2] [Z] ([C])', 'en', 'AD', 'IST') "/>
would calculate the given GMT date event into Indian Standard Time (IST) with Gregorian Calendar (AD) but it just prints:
7 July 2013 11:08PM +00:00 (Gregorian)
So it does not shift the time zone.
To shift the time zone we must use:
adjust-dateTime-to-timezone
But this function accepts only a duration in number of hours/minutes, not a TimeZone so that the processor determines if there is DST or not.
Any advice, please?
Edit: This is not really an answer to the actual question, which was how to get the correct local time for a given time zone, but I'll leave it here since it might be useful to someone.
Unix time is the number of seconds elapsed since the epoch, midnight UTC on 1970-01-01, so you can simply add it to that date as a duration:
<xsl:value-of select="xs:dateTime('1970-01-01T00:00:00') + xs:dayTimeDuration('PT1212497304S')"/>
This will give you the correct xs:dateTime of 2008-06-03T12:48:24
Put into a function:
<xsl:function name="fn:epochToDate">
<xsl:param name="epoch"/>
<xsl:variable name="dayTimeDuration" select="concat('PT',$epoch,'S')"/>
<xsl:value-of select="xs:dateTime('1970-01-01T00:00:00') + xs:dayTimeDuration($dayTimeDuration)"/>
</xsl:function>
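The same epoch arithmetic, sketched in Python as a cross-check of the value above (the function name `epoch_to_datetime` and its `utc_offset_hours` parameter are illustrative). Like adjust-dateTime-to-timezone with a duration, it only applies a fixed offset that the caller supplies; it does not work out whether DST is in effect, which would need a time zone database.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_to_datetime(seconds, utc_offset_hours=0):
    """Convert Unix epoch seconds to a datetime at a fixed UTC offset."""
    moment = EPOCH + timedelta(seconds=seconds)
    return moment.astimezone(timezone(timedelta(hours=utc_offset_hours)))

print(epoch_to_datetime(1212497304).isoformat())
# prints: 2008-06-03T12:48:24+00:00
print(epoch_to_datetime(1212497304, 2).isoformat())
# prints: 2008-06-03T14:48:24+02:00
```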
Hello, is there a way to prune an RRD file by date? It seems possible, as rrdtool dump produces:
<!-- 2012-05-07 19:15:00 UTC / 1336418100 --> <row><v> 0.0000000000e+00 </v></row>
<!-- 2012-05-07 19:20:00 UTC / 1336418400 --> <row><v> 9.6589767000e-01 </v></row>
<!-- 2012-05-07 19:25:00 UTC / 1336418700 --> <row><v> 3.4568563333e-02 </v></row>
<!-- 2012-05-07 19:30:00 UTC / 1336419000 --> <row><v> 9.6402870667e-01 </v></row>
Thanks
You can edit the dump file prior to restore ... not sure what you mean by pruning, since RRD files always stay the same size.
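One possible reading of "pruning" is to discard old values while keeping the file layout. A minimal Python sketch under that assumption (the helper `blank_rows_before` is hypothetical): it sets every value to NaN in rows whose timestamp is earlier than a cutoff and leaves the number of rows unchanged.

```python
import re

# Matches a dump row such as:
# <!-- 2012-05-07 19:15:00 UTC / 1336418100 --> <row><v> 0.0e+00 </v></row>
ROW = re.compile(r"<!-- .*? / (\d+) -->\s*<row>.*</row>")

def blank_rows_before(dump_text, cutoff):
    """Replace every <v> value with NaN in rows whose timestamp is before cutoff."""
    out = []
    for line in dump_text.splitlines():
        match = ROW.search(line)
        if match and int(match.group(1)) < cutoff:
            line = re.sub(r"<v>[^<]*</v>", "<v>NaN</v>", line)
        out.append(line)
    return "\n".join(out)

sample = (
    "<!-- 2012-05-07 19:15:00 UTC / 1336418100 --> <row><v> 0.0000000000e+00 </v></row>\n"
    "<!-- 2012-05-07 19:20:00 UTC / 1336418400 --> <row><v> 9.6589767000e-01 </v></row>"
)
print(blank_rows_before(sample, 1336418400))
```

The edited text would then go through the usual round trip: rrdtool dump file.rrd > file.xml before editing, and rrdtool restore file.xml new.rrd afterwards.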