How could I print the time of a max value in a rrdtool graph?

I've searched for this so many times that I'm tired of it.
I have an rrdtool database from which I usually print the MAX, MIN and AVERAGE values.
Now I would like to print the time of the max value stored in the rrd database.
Here is the definition of my rrd (CPU monitoring):
rrdtool create CPU.rrd --start $Date \
DS:CPU_ALL:GAUGE:600:U:U \
DS:User:GAUGE:600:U:U \
DS:Sys:GAUGE:600:U:U \
DS:Wait:GAUGE:600:U:U \
DS:Idle:GAUGE:600:U:U \
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:1:20000
Here is my graph script:
rrdtool graph CPUUsed.png --start -1w \
DEF:CPUTOTAL=CPU.rrd:CPU_ALL:AVERAGE AREA:CPUTOTAL#FF0000:"CPU Used" LINE2:CPUTOTAL#FF0000 \
--vertical-label "CPU" \
--title "CPU " \
--width 530 \
--height 380 \
GPRINT:CPUTOTAL:MAX:"MAX\:%6.2lf %s" \
GPRINT:CPUTOTAL:MIN:"MIN\:%6.2lf %s" \
GPRINT:CPUTOTAL:AVERAGE:"MOY\:%6.2lf %s" \
GPRINT:CPUTOTAL:LAST:"LAST\:%6.2lf %s"
How can I generate this graph with the time of the max CPU value added?

OK, you have a couple of problems here.
Firstly, your RRD is misconfigured. You have 5 identical RRAs defined, which does not make sense: a single RRA holds values at the specified resolution for all the defined DSs. However, you may also want additional RRAs at coarser resolution (to speed up monthly or yearly graphs), and MIN and MAX type RRAs so that your MIN and MAX figures are more accurate.
For example, this set defines both MAX and MIN RRAs as well as the average, and will also have 4 rollups that roughly correspond to daily, weekly, monthly and yearly graphs.
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:6:2000 \
RRA:AVERAGE:0.5:24:2000 \
RRA:AVERAGE:0.5:288:2000 \
RRA:MAX:0.5:1:20000 \
RRA:MAX:0.5:6:2000 \
RRA:MAX:0.5:24:2000 \
RRA:MAX:0.5:288:2000 \
RRA:MIN:0.5:1:20000 \
RRA:MIN:0.5:6:2000 \
RRA:MIN:0.5:24:2000 \
RRA:MIN:0.5:288:2000
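Plugged into your original create command, that set would look roughly like this (a sketch; the DS lines are copied from your definition, and --step 300 only makes the default step explicit):
rrdtool create CPU.rrd --start $Date --step 300 \
DS:CPU_ALL:GAUGE:600:U:U \
DS:User:GAUGE:600:U:U \
DS:Sys:GAUGE:600:U:U \
DS:Wait:GAUGE:600:U:U \
DS:Idle:GAUGE:600:U:U \
RRA:AVERAGE:0.5:1:20000 \
RRA:AVERAGE:0.5:6:2000 \
RRA:AVERAGE:0.5:24:2000 \
RRA:AVERAGE:0.5:288:2000 \
RRA:MAX:0.5:1:20000 \
RRA:MAX:0.5:6:2000 \
RRA:MAX:0.5:24:2000 \
RRA:MAX:0.5:288:2000 \
RRA:MIN:0.5:1:20000 \
RRA:MIN:0.5:6:2000 \
RRA:MIN:0.5:24:2000 \
RRA:MIN:0.5:288:2000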
Secondly, when you want to print a single figure in a GPRINT line, you need to use a VDEF to convert your time-series data (from the DEF or CDEF) into a single value, using a consolidation function.
For example, this set of commands pulls from the MAX and MIN type RRAs defined above via DEFs, then calculates single summary values over them using VDEFs. Of course, you could just use CPUTOTAL instead of defining CPUTOTALMAX and CPUTOTALMIN (saving yourself the additional RRAs), but as the graph falls back to the lower-granularity rollups, accuracy will drop. If you keep only the full-resolution RRA you stay accurate, but graph creation will be slower and use a lot more CPU at graph time; different-resolution RRAs exist precisely to speed up graph creation.
DEF:CPUTOTAL=CPU.rrd:CPU_ALL:AVERAGE \
DEF:CPUTOTALMAX=CPU.rrd:CPU_ALL:MAX \
DEF:CPUTOTALMIN=CPU.rrd:CPU_ALL:MIN \
VDEF:overallmax=CPUTOTALMAX,MAXIMUM \
VDEF:overallmin=CPUTOTALMIN,MINIMUM \
VDEF:overallavg=CPUTOTAL,AVERAGE \
VDEF:overalllst=CPUTOTAL,LAST \
AREA:CPUTOTAL#FF0000:"CPU Used" \
LINE2:CPUTOTAL#FF0000 \
GPRINT:overallmax:"MAX\:%6.2lf %s" \
GPRINT:overallmin:"MIN\:%6.2lf %s" \
GPRINT:overallavg:"MOY\:%6.2lf %s" \
GPRINT:overalllst:"LAST\:%6.2lf %s" \
GPRINT:overallmax:"Max was at %c":strftime
The last line will print the time of the maximum rather than its value. When a VDEF calculates a MAX or MIN, it actually returns two components: the value, and the point in time at which it occurs. Usually you use the value, but by appending :strftime to the GPRINT directive you can use the time component instead.
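Putting it together with your original graph options, the full command might look roughly like this (a sketch, assuming CPU.rrd has been rebuilt with the MAX RRA set shown above):
rrdtool graph CPUUsed.png --start -1w \
--vertical-label "CPU" \
--title "CPU" \
--width 530 \
--height 380 \
DEF:CPUTOTAL=CPU.rrd:CPU_ALL:AVERAGE \
DEF:CPUTOTALMAX=CPU.rrd:CPU_ALL:MAX \
VDEF:overallmax=CPUTOTALMAX,MAXIMUM \
AREA:CPUTOTAL#FF0000:"CPU Used" \
LINE2:CPUTOTAL#FF0000 \
GPRINT:overallmax:"MAX\:%6.2lf %s" \
GPRINT:overallmax:"Max was at %c":strftime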
I suggest you spend a bit more time working through the tutorials and examples on the RRDTool Website, which should help you gain a better understanding of how RRDTool works.

Related

Retraining Inception and specifying label_count = 2 but receiving three scores instead of two

I have modified the flower retraining code to set label_count = 2, as shown here:
gcloud beta ml jobs submit training "$JOB_ID" \
--module-name trainer.task \
--package-path trainer \
--staging-bucket "$BUCKET" \
--region us-central1 \
-- \
--output_path "${GCS_PATH}/training" \
--eval_data_paths "${GCS_PATH}/preproc/eval*" \
--train_data_paths "${GCS_PATH}/preproc/train*" \
--label_count 2 \
--max_steps 4000
And I have modified dict.txt to have only two labels.
But the retrained model outputs three scores instead of two as expected. The unexpected third score is always very small as shown in this example:
KEY PREDICTION SCORES
Key123 0 [0.7956143617630005, 0.2043769806623459, 8.625334885437042e-06]
Why are there three scores and is there a change one can make so the model outputs only two scores?
Note: I have read the answers from Slaven Bilac and JoshGC to the question “cloudml retraining inception - received a label value outside the valid range” but these answers do not address my question above.
It's the "label" we apply to images that had no label in the training set. The behavior is discussed in this comment in model.py, line 221:
# Some images may have no labels. For those, we assume a default
# label. So the number of labels is label_count+1 for the default
# label.
I agree it's not a very intuitive behavior, but it makes the code a little more robust against datasets that are not fully cleaned up. Hope this helps.

What's the command line for rrdtool to create a graph using the last update time as the end time?

Going off of this question: Print time of recording for LAST value
It appears possible to have rrdtool compute the timestamp of the last update in a rrd. How do you use this in a command as the "end" time?
i.e. I want to do something like this:
rrdtool graph img.png -a PNG -s e-600 -e LASTUPDATETIME -v "CPU Usage" \
--title "CPU Utilization" DEF:ds0a=node.rrd:ds0:AVERAGE \
DEF:ds1a=node.rrd:ds1:AVERAGE AREA:ds0a#35b73d:"User" \
LINE1:ds1a#0400ff:"System"
I tried mucking about with DEF, CDEF and VDEF to no avail:
rrdtool graph img.png -a PNG -v "CPU Usage" --title "CPU Utilization" \
DEF:data=node.rrd:x:AVERAGE CDEF:count=data,UN,UNKN,COUNT,IF \
VDEF:last=count,MAXIMUM \
DEF:ds0a=node.rrd:ds0:AVERAGE:start=end-600:end=last \
DEF:ds1a=node.rrd:ds1:AVERAGE:start=end-600:end=last \
AREA:ds0a#35b73d:"User" LINE1:ds1a#0400ff:"System"
This results in:
ERROR: end time: unparsable time: last
Any ideas?
On the command line, you could do:
rrdtool graph img.png -a PNG -s e-600 -e `rrdtool last node.rrd` -v "CPU Usage" \
--title "CPU Utilization" DEF:ds0a=node.rrd:ds0:AVERAGE \
DEF:ds1a=node.rrd:ds1:AVERAGE AREA:ds0a#35b73d:User \
LINE1:ds1a#0400ff:System
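Here rrdtool last prints the UNIX timestamp of the most recent update in node.rrd, and the backticks substitute that value as the graph's end time. Equivalently, you could capture it in a shell variable first (a sketch of the same idea):
END=`rrdtool last node.rrd`
rrdtool graph img.png -a PNG -s e-600 -e $END -v "CPU Usage" \
--title "CPU Utilization" DEF:ds0a=node.rrd:ds0:AVERAGE \
DEF:ds1a=node.rrd:ds1:AVERAGE AREA:ds0a#35b73d:User \
LINE1:ds1a#0400ff:System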

Convert Chile Map with Insets

I have used the Natural Earth states/provinces data set to generate a map of Chile using this command:
python converter.py \
--width 900 \
--country_name_index 12 \
--country_code_index 31 \
--where "iso_a2 = 'CL'" \
--projection mill \
--name "cl" \
--language en \
ne_10m_admin_1_states_provinces_shp.shp output/jquery-jvectormap-cl-mill-en.js
It generates a map like this (minus the red circles).
The three circled islands are all labeled Valparaíso, which corresponds to the province circled on the main land mass.
Looking at the documentation provided on how to do insets (which uses Alaska and Hawaii as examples), I attempted to move these islands closer, so that my map was more centered.
python converter.py \
--width 900 \
--country_name_index 12 \
--country_code_index 31 \
--where "iso_a2 = 'CL'" \
--projection mill \
--name "cl" \
--language en \
--insets [{"codes": ["CL-VS"], "width": 200, "left": 10, "top": 370}]' \
ne_10m_admin_1_states_provinces_shp.shp output/jquery-jvectormap-cl-mill-en.js
Unfortunately, this fails with
converter.py: error: unrecognized arguments: 200, left: 10, top: 370},]' ne_10m_admin_1_states_provinces_shp.shp output/jquery-jvectormap-cl-mill-en.js
My questions:
How do I resolve the errors in that error message? The parameters are mentioned in both the documentation and in the code so I am unsure what should be used instead.
How can I move the three circled islands to be insets without affecting the mainland Valparaíso?
Your insets argument is failing because it isn't quoted properly. You can use the following:
--insets "[{\"codes\": [\"CL-VS\"], \"width\": 200, \"left\": 10, \"top\": 370}]"
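With that quoting in place, the full invocation would look roughly like this (a sketch based on your original command):
python converter.py \
--width 900 \
--country_name_index 12 \
--country_code_index 31 \
--where "iso_a2 = 'CL'" \
--projection mill \
--name "cl" \
--language en \
--insets "[{\"codes\": [\"CL-VS\"], \"width\": 200, \"left\": 10, \"top\": 370}]" \
ne_10m_admin_1_states_provinces_shp.shp output/jquery-jvectormap-cl-mill-en.js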

RRDtool bad format problem

I'm using RRDtool for graphing some monitoring information. One problem I face is with the GPRINT directive. I use the following command to graph network Rx/Tx data:
rrdtool graph out.png -v bytes -a PNG --start "-6 hour" --title "WLAN traffic" \
--vertical-label="Bit/s" \
'DEF:_rx=/root/ppp0.rrd:rx:AVERAGE' \
'DEF:_tx=/root/ppp0.rrd:tx:AVERAGE' \
'CDEF:tx=_tx,-8,*' \
'CDEF:rx=_rx,8,*' \
'COMMENT:WLAN traffic\j' \
"AREA:rx#333333:WLAN Rx" \
"AREA:tx#990000:WLAN Tx" \
'GPRINT:rx:AVERAGE:"Rx average - %d"' \
'GPRINT:tx:AVERAGE:"Tx average - %d"'
I've got:
ERROR: bad format for GPRINT in 'Rx average - %d'
I tried to simplify format, but when I've got:
ERROR: bad format for GPRINT in '%d'
I understand that I'm doing something completely wrong. What's the problem?
Your %d format is for integers. GPRINT (and PRINT) only support double formats ... see man sprintf for inspiration. Try %.0lf for starters.
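For example, the last two lines of your command could become something like this (a sketch; since each whole argument is already single-quoted, the inner double quotes can be dropped so they don't appear in the output):
'GPRINT:rx:AVERAGE:Rx average - %.0lf' \
'GPRINT:tx:AVERAGE:Tx average - %.0lf'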

What Data Source (DS) and Round Robin Archive (RRA) should I choose for displaying information about multiple website registrations per period?

I want an RRA where one measurement covers a 5-minute period, with the Y axis displaying the total number (not the average) of registrations during the last 5 minutes. I also want a graph showing the total number of registrations (Y axis) per day (X axis).
What DS type and RRA "algorithm" should I choose to implement this?
You should look at the rrd-beginners guide. Hope this helps you.
Off the top of my head, this should create the DB with the correct DS type:
rrdtool create file_name.rrd \
--start N --step 300 \
DS:registrations:COUNTER:600:U:U \
RRA:MAX:0.5:1:288
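With a COUNTER data source you would feed rrdtool the ever-increasing running total of registrations, for example (a sketch with an illustrative value):
rrdtool update file_name.rrd N:12345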
And this should produce the wanted readout:
rrdtool graph file_name.png --start -86400 \
--x-grid HOUR:1:HOUR:8:HOUR:2:0:%Hh \
--vertical-label "num" \
TEXTALIGN:center \
DEF:num_regs=file_name.rrd:registrations:MAX \
LINE:num_regs#1F77B4:"Number of registrations"
Let me know if it works :)