I want to generate fake values in an RRD database covering a period of one month, with 5 seconds as the data-collection interval. Is there any tool that would fill an RRD database with fake data for a given time duration?
I Googled a lot but did not find any such tool.
Please help.
I would recommend the following one liner:
perl -e 'my $start = time - 30 * 24 * 3600; print join " ","update","my.rrd",(map { ($start+$_*5).":".rand} 0..(30*24*3600/5))' | rrdtool -
This assumes you have an rrd file called my.rrd and that it contains just one data source expecting GAUGE type data.
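If you would rather do it from Python, a rough equivalent using the rrdtool Python bindings might look like this (a sketch: it assumes the same my.rrd with a single GAUGE data source, and a very long update list may need to be split into chunks):

import random
import time
import rrdtool

start = int(time.time()) - 30 * 24 * 3600              # one month ago
updates = ["%d:%f" % (start + i * 5, random.random())  # timestamp:value pairs, 5 s apart
           for i in range(30 * 24 * 3600 // 5 + 1)]
rrdtool.update("my.rrd", *updates)                     # same as: rrdtool update my.rrd ts:v ts:v ...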
I am using rrdtool to record data off of a Morningstar Solar Charge controller. The data is obtained through SNMP. One of the datapoints being recorded is the "Charge Current" produced by the Solar array. I'm using PHP's "rrd_graph" to generate the graph. I have no problem generating a graph to show the power generated by the solar array over time, but I also want a summary of the AmpHours generated for the past "x" time displayed by the graph. The data is recorded into the rrd database every 60 seconds. This is the PHP code to display the desired graph:
// Solar Array
$opts = array(
    "--start", $start, "--end", $timestamp,
    "-h", "250",
    "-w", "800",
    "-E",
    "-v", "Watts",
    "--title", "Array Power for $node",
    "DEF:arraymaxpower=/home/anr/data/solar/$node.rrd:arraymaxpower:AVERAGE",
    "DEF:arraypower=/home/anr/data/solar/$node.rrd:arraypower:AVERAGE",
    "DEF:chargecurrent=/home/anr/data/solar/$node.rrd:chargecurrent:AVERAGE",
    "CDEF:amphours=chargecurrent,60,/",
    "VDEF:amphourstot=amphours,TOTAL",
    "AREA:arraypower#ffaf8f:Array Power",
    "LINE1:arraypower#852600",
    "LINE2:arraymaxpower#336600:Array Max Power",
    "GPRINT:amphourstot:Amp Hours\: %5.2lf "
);
$ret = rrd_graph("/var/www/html/admin/solar/graphs/$node-arraypower.png", $opts);
if (! $ret) {
    echo "<b>Graph error: </b>" . rrd_error() . "\n";
}
echo "<img src='/admin/solar/graphs/$node-arraypower.png' alt='Generated RRD image'><br />";
While I'm not displaying the charge current in the graph, the datapoint generated by the DEF statement is needed to calculate the AmpHours value that I need. Since the data is stored every 60 seconds, I assumed that if I simply divided the data by 60 in a CDEF, it would change the stored value from amp-minutes to amp-hours, and then I could use a VDEF to total that value over the range displayed in the graph. However, I am ending up with numbers that are way too high. Any idea what I am doing wrong?
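A quick units check may point at the factor involved. rrdtool's VDEF TOTAL is documented to sum each value multiplied by the step length in seconds, which changes the arithmetic. A sketch with a hypothetical 5 A reading, just to trace the units through the CDEF above:

step = 60                      # seconds between stored samples
amps = 5.0                     # hypothetical average charge current for one step
per_step = (amps / 60) * step  # what TOTAL adds for this step: (A/60) * s
# per_step == 5.0 amp-minutes (one minute at 5 A), so the summed TOTAL comes
# out in amp-minutes, 60x larger than amp-hours; dividing by 3600 instead of
# 60 in the CDEF would make TOTAL come out in amp-hours.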
I have a water flowmeter connected to an RPi which is writing data to a simple RRD:
RRDs::create ($rrdfile, "--start", 1572829200,
"--step", 60,
"DS:FLOW1:GAUGE:90:U:U",
"RRA:MAX:0.5:1:10512000",);
From this I generate a graph for the last 24 hours and some statistics for the last few days. A simplified version follows:
RRDs::graph "temps.png",
    "--start=now-1d",
    "--end=now",
    "--width=1000",
    "--base=1000",
    "--height=240",
    "--title=Flow Data - ",
    "--slope-mode",
    "--vertical-label=Volume of Water",
    "DEF:flow-now=flow.rrd:FLOW1:AVERAGE",                          # used to generate the graph
    "DEF:flow-1d=flow.rrd:FLOW1:AVERAGE:end=midnight:start=end-1d", # data for yesterday
    "CDEF:flow-1d-1=flow-1d,25440,/",                               # convert raw data to litres
    "VDEF:flow-1dtotal=flow-1d-1,TOTAL",                            # get total litres
    "GPRINT:flow-1dtotal:Total Volume last 1 day = %.2lf L",        # print total for yesterday
I would like to add an arbitrary value to flow-1dtotal but can't work out how. Something along the lines of the pseudocode below is what I need:
flow-1dtotal = flow-1dtotal + 1000
Thanks for reading and for any suggestions
My first post here so I hope I have not been too verbose.
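Since rrdtool does not allow arithmetic on a VDEF result in another CDEF or VDEF, one possible workaround is to fold the offset into the CDEF before TOTAL: because TOTAL sums value × step-seconds, adding 1000/86400 to every step of the one-day window raises the total by 1000 (assuming no unknown steps). A sketch using the Python bindings, reusing the names and the 25440 divisor from the post, untested against the poster's data:

import rrdtool

rrdtool.graph(
    "temps.png",
    "--start=now-1d", "--end=now",
    "DEF:flow1d=flow.rrd:FLOW1:AVERAGE:end=midnight:start=end-1d",
    "CDEF:litres=flow1d,25440,/",            # convert raw data to litres
    "CDEF:litresadj=litres,1000,86400,/,+",  # spread +1000 L over the 86400 s window
    "VDEF:total=litresadj,TOTAL",
    "GPRINT:total:Total Volume last 1 day = %.2lf L")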
I found I was losing datapoints due to only having 10 rows in my rrdtool config and wanted to update from a backup source file with older data.
After fixing the row count, the database was created with:
rrdtool create dailySolax.rrd \
--start 1451606400 \
--step 21600 \
DS:toGrid:GAUGE:172800:0:100000 \
DS:fromGrid:GAUGE:172800:0:100000 \
DS:totalEnerg:GAUGE:172800:0:100000 \
DS:BattNow:GAUGE:1200:0:300 \
RRA:LAST:0.5:1d:1010 \
RRA:MAX:0.5:1d:1010 \
RRA:MAX:0.5:1M:1010
and the update line in python is
newline = ToGrid + ':' + FromGrid + ':' + TotalEnergy + ':' + battNow
UpdateE = 'N:' + newline
print UpdateE
try:
    rrdtool.update(
        "%s/dailySolax.rrd" % (os.path.dirname(os.path.abspath(__file__))),
        UpdateE)
except Exception as e:  # the except clause was not shown in the original post
    print e
This all worked fine for inputting the original data (from a crontabbed website scrape) but as I said I lost data and wanted to add back the earlier datapoints.
From my backup source I had a plain text file with lines looking like
1509386401:10876.9:3446.22:18489.2:19.0
1509408001:10879.76:3446.99:18495.7:100.0
where the first field is the timestamp. And then used this code to read in the lines for the updates:
with open("rrdRecovery.txt", "r") as fp:
    for line in fp:
        print line
        ## newline = ToGrid + ':' + FromGrid + ':' + TotalEnergy + ':' + battNow
        UpdateE = line
        try:
            rrdtool.updatev(
                "%s/dailySolax.rrd" % (os.path.dirname(os.path.abspath(__file__))),
                UpdateE)
        except Exception as e:  # the except clause was not shown in the original post
            print e
When it did not work correctly with a copy of the current version of the database I tried again on an empty database created using the same config.
In each case the update results only in the timestamp data in the database and no data from the other fields.
Python is not complaining and I expected
1509386401:10876.9:3446.22:18489.2:19.0
would update the same as does
N:10876.9:3446.22:18489.2:19.0
The dump shows the lastupdate data for all fields, but then this for the RRA data:
<!-- 2017-10-31 11:00:00 AEDT / 1509408000 --> <row><v>NaN</v><v>NaN</v><v>NaN</v><v>NaN</v></row>
Not sure if I have a Python issue - more likely an rrdtool understanding problem. Thanks for any pointers.
The problem you have is that RRDTool timestamps must be increasing. This means that, if you increase the length of your RRAs (back into the past), you cannot put data directly into these points - only add new data onto the end as time increases. Also, when you create a new RRD, the 'last update' time defaults to NOW.
If you have a log of your previous timestamp, then you should be able to add this history, as long as you don't do any 'now' updates before you finish doing so.
First, create the RRD, with a 'start' time earlier than the first historical update.
Then, process all of the historical updates in chronological order, with the appropriate timestamps.
Finally, you can start doing your regular 'now' updates.
I suspect what has happened is that your regular cronjob added new data before you had run all of your historical data input - or else you created the RRD with a start time after your historical timestamps.
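Putting the three steps together, a minimal sketch in Python (assuming the dailySolax.rrd layout from the question; rrdRecovery.txt is the backup file mentioned there):

import rrdtool

# 1. Create the RRD with --start earlier than the oldest historical sample.
rrdtool.create(
    "dailySolax.rrd",
    "--start", "1451606400",
    "--step", "21600",
    "DS:toGrid:GAUGE:172800:0:100000",
    "DS:fromGrid:GAUGE:172800:0:100000",
    "DS:totalEnerg:GAUGE:172800:0:100000",
    "DS:BattNow:GAUGE:1200:0:300",
    "RRA:LAST:0.5:1d:1010",
    "RRA:MAX:0.5:1d:1010",
    "RRA:MAX:0.5:1M:1010")

# 2. Replay the backup strictly in chronological order, one update per line.
with open("rrdRecovery.txt") as fp:
    lines = [l.strip() for l in fp if l.strip()]
for line in sorted(lines, key=lambda l: int(l.split(":", 1)[0])):
    rrdtool.update("dailySolax.rrd", line)

# 3. Only after the replay is finished, resume the regular "N:..." updates.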
I am trying to create a MonetDB database that shall hold 100k columns and approximately 2M rows of type SMALLINT.
To generate the 100k columns I am using a C program, i.e., a loop that performs the following SQL request:
ALTER TABLE test ADD COLUMN s%d SMALLINT;
where %d is a number from 1 till 100000.
I observed that after 80,000 SQL requests each transaction takes about 15 s, meaning that I need a lot of time to complete the table creation.
Could you tell me if there is a simple way of creating 100k columns?
Also, do you know what exactly is going on with MonetDB?
You should use only one CREATE TABLE statement, generated here with a shell script (bash):
#!/bin/bash
fic="/tmp/100k.sql"
col=1
echo "CREATE TABLE bigcol (" > $fic
while [[ $col -lt 100000 ]]
do
    echo "field$col SMALLINT," >> $fic
    col=$(($col + 1))
done
echo "field$col SMALLINT);" >> $fic
And on the command line:
sh 100k.sh
mclient yourbdd < /tmp/100k.sql
wait about 2 minutes :D
mclient yourbdd
> \d bigcol
[ ... ... ...]
"field99997" SMALLINT,
"field99998" SMALLINT,
"field99999" SMALLINT,
"field100000" SMALLINT
);
DROP TABLE bigcol, on the other hand, takes a very, very long time. I do not know why.
I also think it is not a good idea, but it answers your question.
Pierre
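If you prefer Python, the same one-shot SQL file can be generated with a few lines (a sketch, equivalent to Pierre's shell script above; feed the result to mclient in the same way):

# Write a single CREATE TABLE with 100000 SMALLINT columns into /tmp/100k.sql
cols = ",\n".join("field%d SMALLINT" % i for i in range(1, 100001))
with open("/tmp/100k.sql", "w") as f:
    f.write("CREATE TABLE bigcol (\n%s\n);\n" % cols)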
I have written a Python 2.7 script to retrieve all my historical data from Xively.
Originally I wrote it in C#, and it works perfectly.
I am limiting the request to 6 hour blocks, to retrieve all stored data.
My version in Python is as follows:
requestString = 'http://api.xively.com/v2/feeds/41189/datastreams/0001.csv?key=YcfzZVxtXxxxxxxxxxxORnVu_dMQ&start=' + requestDate + '&duration=6hours&interval=0&per_page=1000'
response = urllib2.urlopen(requestString).read()
The request date is in the correct format; I compared the full C# requestString and the Python one.
Using the above request, I only get 101 lines of data, which equates to a few minutes of results.
My suspicion is that it is the .read() function: it returns about 34k characters, which is far less than the C# version returns. I tried adding 100000 as an argument to the read function, but there was no change in the result.
Here is another solution, also written in Python 2.7.
In my case, I fetched the data in 30-minute blocks, because many sensors sent values every minute and the Xively API limits a request to half an hour of data at that sending frequency.
This is the general loop:
import csv
import time
import urllib2

# feed, apikey_xively, interval, duration, deltatime, id and f are defined in the full script linked below
for day in datespan(start_datetime, end_datetime, deltatime):  # step from start_datetime to end_datetime in increments of deltatime
    while True:  # keep retrying until the request succeeds
        try:
            response = urllib2.urlopen('https://api.xively.com/v2/feeds/'+str(feed)+'.csv?key='+apikey_xively+'&start='+day.strftime("%Y-%m-%dT%H:%M:%SZ")+'&interval='+str(interval)+'&duration='+duration)  # get data
            break
        except Exception:
            time.sleep(0.3)  # back off briefly, then try again
    cr = csv.reader(response)  # iterate over the CSV rows in the response
    print '.'
    for row in cr:
        if row[0] in id:  # keep only the desired datastreams
            f.write(row[0]+","+row[1]+","+row[2]+"\n")  # write "id,timestamp,value"
The full script you can find it here: https://github.com/CarlosRufo/scripts/blob/master/python/retrievalDataXively.py
Hope this helps; delighted to answer any questions :)
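The datespan() generator used above is defined in the linked script; a minimal version of such a helper (an assumption here, check the repository for the original) could look like:

import datetime

def datespan(start, end, delta=datetime.timedelta(minutes=30)):
    # Yield datetimes from start (inclusive) to end (exclusive) in steps of delta
    current = start
    while current < end:
        yield current
        current += delta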