Counting riemann events in rate function - clojure

Hi I have a use case where I have to aggregate my application response time for a time interval of 10 i.e (rate 10 and then calculate the average . The real problem is there is no way to calculate the number of events in riemann rate function for the time interval of 10. Is there any way to do that other than using (fixed-time-window .

Rate is unlikely to be the function you want for this. If I understand it you would like your code to:
gather all the events that happen in ten minutes
once the events are all available, calculate the average of the :metric key
emit one event with the service name, and that average as the value of it's metric.
If I'm not understanding then this answer won't fit, so let me know.
Rate takes in a reporting-interval and any number of streams to forward the average rate to. That first parameter to rate only determines how often it report the current rate and has no effect on the period over which it's aggregated. the built in rate function only has one available agrigation interval, it always reports in "metric per second". so it accumulates events for one second, and averages the mertic over that second, and it properly handles edge cases like reporting intervals with no events as a zero metric a reasonable number of times, though not forever. You should use rate where it fits, and not use it where you need explicit control over the aggregation period.
I often want events per minute, so I set the reporting period to 60 seconds, then multiply the output by 60 before forwarding it. This saves me handling all the edge cases in custom aggregation functions. keep in mind that this looses some accuracy in the rounding.
(rate 60
(adjust [:metric * 60]
index datadog))
You may also want to do something like:
(fixed-time-window 10
(smap folds/median
... your stuff ...

Related

How to determine the time an actuator has been enabled during the past x minutes?

I need to find the most efficient approach to the following. If someone can point me in the right direction, I can write the code myself.
Environment
I am using an ESP32 and working in Arduino C++.
What I want to achieve
I want to track the amount of time an actuator has been on over the past x minutes. This is to prevent the actuator from over-heating.
My idea
Storing current times in an array every time the actuator goes on (it is on for a fixed amount of time). When the oldest measurement is older than x minutes, it is removed from the array. If the array exceeds a certain size (e.g. certain amount of minutes the actuator has been on), a cool down period is started.
However, I feel there must be a more efficient / easy way to achieve this. How would you go about this?
Thanks in advance.
If possible, has temperature sensor is the easiest way.
With array, there will be problem with the size, especially, if you want to count in minutes. For counting, there is also way for easier as following:
T is the total time ON in last xx minutes as you expected. During initialization, it will be 0.
If actuator is ON, so every check cycle (may be every s or smaller depend on your required), T will be increase value = cycle time
If actuator is OFF: if T>0 then decrease value = cycle time, if T= 0, nothin to subtract more.

RRDTool data values (e.g. max value) are different in different time resolutions

currently I'm experimenting a bit with RRDTool. I'm aware that the accuracy gets lower the longer the time periods are selected. But I thought I could bypass this with my datasource settings.
For example temperature and humidity from my house, resoultion 1h:
And now with the resolution of 1d:
As you could see, there is a great difference for the max. value of the blue line.
I created my datasources and archives with this values:
"rrdtool create temp.rrd --step 30",
"DS:temp:GAUGE:60:U:U",
"DS:humidity:GAUGE:60:U:U",
"RRA:AVERAGE:0.5:1:1051200",
"RRA:MAX:0.5:1:1051200",
"RRA:MIN:0.5:1:1051200",
I thought that 1051200 (1 year = 31536000 / 30 s (resoulution) = 1051200) is correct for saving every value for a year and that there should be no need for interpolating.
Is it possible to get the exact values displayed even if the resolution changes (for example the max humidity (Luftfeuchtigkeit) at 99.9%)?
Here are my values for image creation:
"--start" => "-1h", (-1d etc-)
"--title" => "Haustemperatur",
"--vertical-label" => "°C / % RLF",
"--width" => 800,
"--height" => 600,
"--lower-limit" => "-5",
"DEF:temperatur=$rrdFile:temperatur:LAST",
"DEF:humidity=$rrdFile:humidity:LAST",
"LINE1:temperatur#33CC33:Temperatur",
"GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
"GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
"GPRINT:temperatur:MAX:Maximum\: %4.2lf °C\j",
"LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
"GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
"GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
"GPRINT:humidity:MAX:Maximum\: %4.2lf %%\j",
Thanks for your help and any suggestions.
P.S. I'm using a library to generate the graphs and the database, please do not be surprised about possible syntax errors.
Your problem is that you are causing the values to be rolled-up on the fly at graph time, but have not correctly specified which rollup function to use. Your second graph is showing the MAXIMUM of the LAST in the interval, not the true Maximum.
There are a few issues to explain with this configuration:
Firstly, your RRD is defined using 3 RRAs with 1cdp=1pdp and different consolidation functions (AVG, MIN, MAX). This means they are functionally identical, but they do not save you any time at graphing as they have not done any pre-rollup for you! You should definitely consider having just one of these (probably AVG) and adding others at lower resolution to help speed up graphing when you have a bigger time window.
Secondly, you need to specify the on-the-fly rollup function. When graphing, RRDTool will work out the best RRA to use based on your DEF lines, and will perform any additional consolidation required on the fly. This can take a long time if the only available RRA is too high-granularity.
Your graph request uses DEF:temperatur=$rrdFile:temperatur:LAST but you do not actually have a LAST type RRA, so RRDTool will grab the last average. Your RRA data points are at 30s interval, but your second graph has (approx) 5min per pixel, meaning that RRDTool needs to grab the 10 entries from the RRA, and print the last. Looking at the data in the top graph, it seems that the last in that interval was the 66 value, though previous ones were 100.
So you have a choice. Do you want the graph to show the average for the time period, the maximum, or both? Do you want the figures at the bottom to show the maximum of the average, or the maximum of everything?
For example
"DEF:temperatur=$rrdFile:temperatur:AVERAGE",
"DEF:humidity=$rrdFile:humidity:AVERAGE",
"DEF:temperaturmax=$rrdFile:temperatur:MAX;reduce=MAX",
"DEF:humiditymax=$rrdFile:humidity:MAX;reduce=MAX",
"LINE1:temperatur#33CC33:Temperatur",
"LINE1:temperaturmax#66EE66:Maximum Temperatur",
"GPRINT:temperatur:LAST:\t\tAktuell\: %4.2lf °C",
"GPRINT:temperatur:AVERAGE:Schnitt\: %4.2lf °C",
"GPRINT:temperaturmax:MAX:Maximum\: %4.2lf °C\j",
"LINE1:humidity#0000FF:Relative Luftfeuchtigkeit",
"LINE1:humiditymax#3333FF:Maximum Luftfeuchtigkeit",
"GPRINT:humidity:LAST:Aktuell\: %4.2lf %%",
"GPRINT:humidity:AVERAGE:Schnitt\: %4.2lf %%",
"GPRINT:humiditymax:MAX:Maximum\: %4.2lf %%\j",
In this case, we define a separate DEF for the maximum data set, so that we can always obtain the highest value even after consolidation. This is also used in the GPRINT so that we get the MAX of the MAX rather than the MAX of the AVERAGE. The Maximum line is now drawn separately to the average line, so that we can see the effect of any rollup of data - the lines will be together at high-resolution but get further apart as the time window widens and resolution decreases.
TheDEF is set to force any rollup function used for the maxima to be MAX rather than AVG, so we can be sure to get the maximum rather than average of maxima.
We are also using AVERAGE rather than LAST in order to get more meaningful data after rollup. Note that we could also use a separate DEF for the LAST as well if we wanted to though it is of less usefulness.
Note that, if you ever expect to be generating graphs over more than a few days, you should definitely consider adding some lower-resolution RRAs for AVERAGE and MAX or else the graphs will generate very slowly. RRDTool is designed with the intention that data will be rolled up over time, rather than (as in a traditional database) every sample kept as-is. So, unless you really need to have 30s resolution data kept for an entire year, you may prefer to keep this high resolution data for only a week, and then have separate RRAs that roll up to 1 hour resolution and keep for longer. Many people keep the 30s for 2 days, then 30min-summary for 2 weeks, 2h-summary for 2 months, and then 1day-summary for 2 years.
For more information, see the RRDTool manual pages.

Interpreting Stackdriver Metrics meaning "Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds"

We are in process of identifying Stackdriver metrics
I am specifically looking at GCP predefined metric subscription/ack_message_count with description Cumulative count of messages acknowledged by Acknowledge requests, grouped by delivery type. Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds.
Can any one help me understand highlighted part, what does Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds. mean
once i check this metric will it not able available for next 240 seconds.
Thanks
"Sampled every" refers to granularity. In this case, you'll get a data point for every minute.
"not visible" refers to freshness. In this case, the newest data point will describe the system as it was 4 minutes ago. Put another way, if you do something and watch the graphs you won't see the metric reflect the change for 4 minutes.
From my understanding, the data is polled every 60 seconds but at the metrics creation the time until the data is polled would take up to 240 seconds. The BigQuery section makes this a bit clearer. Because the numbers are as such that it would not be feasible in an other context
Example: Scanned bytes. Sampled every 60 seconds. After sampling, data is not visible for up to 21720 seconds.

AWS CloudWatch metric math with a cumulative metric's value 30 minutes ago to show rate of change

I have a AWS CloudWatch custom metric that represents a cumulative value which continues to increase overtime. I will add that metric to a dashboard, but I also want to show the rate of change of this metric over the last 30 minutes. Ideally I would like a function to return the metric's value from 30 minutes ago and subtract that from the current value. The "Rate()" function does not seem to help.
I could submit the metrics value a second time with a timestamp that is 30 minutes in the future and subtract these two metrics, but I am hoping for a solution that uses metric math and does not force me to submit another metric. I can think of other use cases where I might want to do math with metrics from different time periods.
Hope I am just missing something here!
You can use some arithmetic to obtain the previous value and then you're able to calculate the percentage of change as you want.
The value you want is: (value_now - value_before) / value_before
Breaking this into 2 parts:
Obtain value_now - value_before. This is the absolute delta of the values.
Obtain value_before. This is the value of the metric in the last datapoint.
Assuming that your metric in Cloudwatch is m.
Step 1: The absolute delta
The absolute_delta can be obtained with: absolute_delta = RATE(m) * PERIOD(m).
Step 2: The previous value
With some arithmetic it is possible to obtain previous_value. Given the definition of absolute delta:
absolute_delta = value_now - value_before
Since we have value_now = m and absolute_delta, then it's a matter of inverting the equation:
value_before = value_now - absolute_delta
Final equation
Just plug everything together and you have your final metric:
change_percentage = 100 * absolute_delta / value_before
In CloudWatch terms:
Metric math function RATE() calculates the rate of change per second.
Returns the rate of change of the metric, per second. This is calculated as the difference between the latest data point value and the previous data point value, divided by the time difference in seconds between the two values.
From https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html
So to get the rate of change for your period you could do this:
RATE(m1)*PERIOD(m1)
and set the period of the dashboard to the wanted value.
Problem in your case is that you need it for a period of 30 min, I don't think you can set 30 min as period on the CloudWatch dashboard. Closest values would be 15 min or 1 hour.

Is there a 'value in time' concept in metrics, or how do I create one?

I am using metrics-clojure http://metrics-clojure.readthedocs.io/en/latest/ lists gauges, counters, meters, timers and histograms.
What I want is instead to report a number.
Very much like counter, but with a set! operation instead of just inc!/dec! or a meter that accepts a value.
One use case is processing batches of events. I can create a meter to watch the batches, but I would prefer to include the batch size such that the reporting end can use the correct units (so I can plot the number of events processed instead of the number of batches).
Another use case is wanting to produce a plot of some number that changes over time. Say again I was processing events, and I wanted to plot per event how many unique combinations of events I'd seen so far, how can I do this?
I can fake this a little bit with a gauge. I can create an atom, and have the gauge report the atom value, and set the atom value in the code... but I can't control when the gauge will report the value. So the value will only be plotted at points whenever the gauge happened to be queried, but I might want to record the values at more specific points (like the end of a batch, at intervals in a batch, or on every event).
And it seems convoluted.
Any suggestions?