How to compare an event count with the previous time interval's count - Clojure

I would like to compare the total event count for the current one-hour interval with the total event count for the previous one-hour interval, and if the current hour's count is less than the previous hour's count, trigger an email from Riemann.
I am not sure whether we can store a value and compare it with the current event's value, because I understand that events expire due to Riemann's TTL option.
Please correct me if I am wrong, and suggest reference code to achieve this in Riemann.
Thanks in advance.

It sounds like you want the rate of change of the count over an hour, and then to decide whether that rate is negative? One way to do this is just as you describe:
(fold-interval-metric 3600 folds/count
  (fixed-event-window 2
    (smap folds/difference
      (where (neg? (:metric event))
        email))))
and this makes sense. You may find that if you use the built-in derivative-over-time function ddt and graph it, you can spot these problems over much shorter timescales. If your success rate falls to zero in minute three of an hour, 57 minutes is a long time for the computer to wait before it calls a human for help. If the rate of change over a 15-minute period approaches negative infinity, it's very likely that your service just stopped.
I'm fond of wrapping ddt in the exponentially weighted moving average ewma so spikes don't set off the alarms, and I have had an extremely low false-positive rate with this pattern:
(ewma 30 (ddt ...your stuff here...))
I often want to compare the rate of requests to a service with the rate of responses; this pattern uses ewma, ddt, and project:
(pipe ↲
  (splitp = service
    "service:input"  (ewma 30 ↲)
    "service:output" (ewma 30 ↲)
    bit-bucket)  ;; throw out other services here
  (project [(service "service:input")
            (service "service:output")]
    (smap folds/quotient-sloppy
      (with :service "service-ratio-rate-of-change"
        (ddt ...your streams here...)))))
If requests are infrequent, you will need to play with the interval in all these examples to ensure that the alarms don't go off between events. If your events are infrequent, you may also need to set the :ttl on the events high enough that they don't expire while you are aggregating them.
p.s.: the ↲ can be any symbol(s) you want; I just chose that Unicode character.
p.p.s.: a false-positive rate of one alarm per quarter should be reasonable if you consider these things carefully.


How can I check if a message is about to pass the MessageRetentionPeriod?

I have an app that uses SQS to queue jobs. Ideally I want every job to be completed, but some are going to fail. Sometimes re-running them will work, and sometimes they will just keep failing until the retention period is reached. I want to keep failing jobs in the queue as long as possible, to give them the maximum possible chance of success, so I don't want to set a maxReceiveCount. But I do want to detect when a job reaches the MessageRetentionPeriod limit, as I need to send an alert when a job fails completely. Currently I have the max retention at 14 days, but some jobs will still not be completed by then.
Is there a way to detect when a job is about to expire, and from there send it to a deadletter queue for additional processing?
Before you follow my advice below, and assuming I've done the math for the periods correctly: you will be better off enabling a redrive policy on the queue if you check for messages less often than every 20 minutes and 9 seconds.
SQS's "redrive policy" allows you to migrates messages to a dead letter queue after a threshold number of receives. The maximum receives that AWS allows for this is 1000, and over 14 days that works out to about 20 minutes per receive. (For simplicity, that is assuming that your job never misses an attempt to read queue messages. You can tweak the numbers to build in a tolerance for failure.)
If you check more often than that, you'll want to implement the solution below.
You can check for this "cutoff date" (when the job is about to expire) as you process the messages, and send messages to the dead-letter queue if they've passed the time at which you've given up on them.
Pseudocode to add to your current routine:
1. Call GetQueueAttributes to get the count, in seconds, of your queue's message retention period.
2. Call ReceiveMessage to pull messages off of the queue. Make sure to explicitly request that the SentTimestamp attribute is visible.
3. For each message:
   - Find the message's expiration time by adding the message retention period to the sent timestamp.
   - Create your cutoff date by subtracting your desired amount of time from the expiration time.
   - Compare the cutoff date with the current time. If the cutoff date has passed:
     - Call SendMessage to send the message to the dead-letter queue.
     - Call DeleteMessage to remove the message from the queue you are processing.
   - If the cutoff date has not passed, process the job as normal.
Here's an example implementation in PowerShell:
$queueUrl = "https://sqs.amazonaws.com/0000/my-queue"
$deadLetterQueueUrl = "https://sqs.amazonaws.com/0000/deadletter"
# Get the message retention period in seconds.
$messageRetentionPeriod = (Get-SQSQueueAttribute -AttributeNames "MessageRetentionPeriod" -QueueUrl $queueUrl).Attributes.MessageRetentionPeriod
# Receive messages from our queue.
$queueMessages = @(Receive-SQSMessage -QueueUrl $queueUrl -WaitTimeSeconds 5 -AttributeNames SentTimestamp)
foreach ($message in $queueMessages)
{
    # The sent timestamp is in epoch time (milliseconds since 1970-01-01 UTC).
    $sentTimestampUnix = $message.Attributes.SentTimestamp
    # For PowerShell, we need to do some quick conversion to get a DateTime.
    $sentTimestamp = ([datetime]'1970-01-01 00:00:00').AddMilliseconds($sentTimestampUnix)
    # Get the expiration time by adding the retention period to the sent time.
    $expirationTime = $sentTimestamp.AddSeconds([double]$messageRetentionPeriod)
    # I want my cutoff date to be one hour before the expiration time.
    $cutoffDate = $expirationTime.AddHours(-1)
    # Check if the cutoff date has passed (compare in UTC, since the epoch is UTC).
    if ((Get-Date).ToUniversalTime() -ge $cutoffDate)
    {
        # The cutoff date has passed; move the message to the dead-letter queue.
        Send-SQSMessage -QueueUrl $deadLetterQueueUrl -MessageBody $message.Body
        Remove-SQSMessage -QueueUrl $queueUrl -ReceiptHandle $message.ReceiptHandle -Force
    }
    else
    {
        # The cutoff date has not passed. Retry the job?
    }
}
This will add some overhead to every message you process. It also assumes that your message handler will receive the message in between the cutoff time and the expiration time, so make sure that your application is polling often enough to receive the message.

Measuring concurrent loop times in Erlang

I create a ring of processes in Erlang and wish to measure the time it takes for the first message to pass through the network, and for the entire message series; each time the first node gets the message back, it sends another one.
Right now, in the first node, I have the following code:
receive
    stop ->
        io:format("all processes stopped!~n"),
        true;
    start ->
        statistics(runtime),
        Son ! {number, 1},
        msg(PID, Son, M, 1);
    {_, M} ->
        {Time1, _} = statistics(runtime),
        io:format("The last message has arrived after ~p! ~n", [Time1 * 1000]),
        Son ! stop;
Of course, I start the statistics when sending the first message.
As you can see, I use the Time_Since_Last_Call for the first message loop and wish to use the Total_Run_Time for the entire run; the problem is that Total_Run_Time is cumulative since I started the statistics for the first time.
The second thought I had in mind was using another process with two receive loops, getting the times for each one, adding them, and printing, but I'm sure Erlang can do better than this.
I guess the best method to solve this is to somehow flush the Total_Run_Time, but I couldn't find how this could be done. Any ideas how this can be tackled?
One way to measure round-trip times would be to send a timestamp along with each message. When the first node receives a message back, it can then measure the round-trip time by subtracting the message's timestamp from the current time.
To calculate the total run time, I would memorize the first timestamp in the process state (or the process dictionary) and calculate the total run time when stopping the test.
Besides, given that you mention the network, are you sure that CPU time (which is what statistics(runtime) calculates) is what you're after? Perhaps wall-clock time would be more appropriate.
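The timestamp approach is language-agnostic; here is a minimal sketch (in Python rather than Erlang, with a hypothetical message shape) of carrying the send time in each message and deriving both figures from it:

import time

def make_message(n):
    # Attach the send time to the message itself.
    return {"number": n, "sent_at": time.monotonic()}

def on_message_returned(msg, first_sent_at):
    now = time.monotonic()
    round_trip = now - msg["sent_at"]      # per-message round-trip time
    total_run_time = now - first_sent_at   # total so far; no "flushing" needed
    print(f"round trip: {round_trip:.6f}s, total: {total_run_time:.6f}s")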

Scheduling reset every 24 hours at midnight

I have a counter "numberOrders" and I want to reset it every day at midnight so I know how many orders I get in one day. What I have right now is this:
val system = akka.actor.ActorSystem("system")
system.scheduler.schedule(86400000 milliseconds, 0 milliseconds){(numberOrders = 0)}
This piece of code is inside a def which is called every time I get a new order, so what it does is reset numberOrders 24 hours after the first order, or after every order; I'm not really sure whether every new order restarts the 24-hour timer, which is not what I want. I want to reset the variable every day at midnight. Any ideas? Thanks!
To build on pushy's answer: since you might not always be sure when the app started, and if you want to be sure it runs exactly at midnight, you can do the following:
val system = akka.actor.ActorSystem("system")
// Milliseconds until the next midnight (UTC): a full day minus the time elapsed since the last one.
val untilMidnight = (24 hours).toMillis - (System.currentTimeMillis % (24 hours).toMillis)
system.scheduler.schedule(Duration(untilMidnight, MILLISECONDS), 24 hours, orderActor, ResetCounterMessage)
Might not be the tidiest of solutions, but it does the job.
As schedule supports repeated executions, you could just set the interval parameter to 24 hours, set the initial delay to the amount of time between now and midnight, and initiate the code at startup. You also seem to be creating a new ActorSystem every time you get an order right now; that does not seem quite right, and you would be rid of that as well.
Also, I would suggest using the schedule variant which sends messages to actors instead. This way, the actor that processes the orders could keep count, and when it receives a ResetCounter message it would simply reset the counter. You could simply write:
system.scheduler.schedule(x seconds, 24 hours, orderActor, ResetCounterMessage)
when you start up your actor system initially, and be done with it.
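The delay-until-midnight computation is the only fiddly part; here is a minimal, language-agnostic sketch (Python, with a hypothetical scheduler call) of computing it against local time:

from datetime import datetime, timedelta

def millis_until_midnight():
    # Initial delay: from now until the next local midnight.
    now = datetime.now()
    next_midnight = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    return int((next_midnight - now).total_seconds() * 1000)

# Hypothetical scheduler call, mirroring Akka's (initialDelay, interval, ...) shape:
# scheduler.schedule(millis_until_midnight(), 24 * 60 * 60 * 1000, order_actor, ResetCounterMessage)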

Log Slow Pages Taking Longer Than [n] Seconds In ColdFusion with Details

(ACF9)
Unless there's an option I'm missing, the "Log Slow Pages Taking Longer Than [n] Seconds" setting isn't useful for front-controller based sites (e.g., Model-Glue, FW/1, Fusebox, Mach-II, etc.).
For instance, in a Mura/Framework-One site, I just end up with:
"Warning","jrpp-186","04/25/13","15:26:36",,"Thread: jrpp-186, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 11 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-196","04/25/13","15:27:11",,"Thread: jrpp-196, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 59 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-214","04/25/13","15:28:56",,"Thread: jrpp-214, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 32 seconds, exceeding the 10 second warning limit"
"Warning","jrpp-134","04/25/13","15:31:53",,"Thread: jrpp-134, processing template: /home/mysite/public_html_cms/wwwroot/index.cfm, completed in 11 seconds, exceeding the 10 second warning limit"
Is there some way to get query string or post details in there, or is there another way to get what I'm after?
You can easily add some logging to your application for any requests that take longer than 10 seconds.
In onRequestStart():
request.startTime = getTickCount();
In onRequestEnd():
request.endTime = getTickCount();
if (request.endTime - request.startTime > 10000) {
    writeLog(cgi.QUERY_STRING);
}
If you're writing a Mach-II, FW/1, or ColdBox application, it's trivial to write a "plugin" that runs on every request, captures the URL or FORM variables passed in the request, and stores them in a simple database table or log file. (You can even capture session.userID, the IP address, or whatever else you may need.) If you're capturing to a database table, you'll probably want to skip indexes to optimize insert performance, and you'll need to rotate that table so you're not trying to do high-speed inserts on a table with tens of millions of rows.
In Mach-II, you'd write a plugin.
In FW/1, you'd put a call to a controller which handles this into setupRequest() in your application.cfc.
In ColdBox, you'd write an interceptor.
The idea is that the log just tells you which pages are consistently slow so you can do your own performance tuning.
For a start, turn on debugging for further details.

Limit iterations per time unit

Is there a way to limit iterations per time unit? For example, I have a loop like this:
for (int i = 0; i < 100000; i++)
{
// do stuff
}
I want to limit the loop above so there is a maximum of 30 iterations per second.
I would also like the iterations to be evenly spaced in time, so not something like 30 iterations in the first 0.4 s and then a 0.6 s wait.
Is that possible? It does not have to be completely precise (though the more precise it is, the better).
@FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
Then you should probably have the program you're sending data to send an acknowledgment when it has finished receiving the last chunk of data, and only then send the next chunk. Anything else will just cause you frustration down the line as circumstances change.
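A minimal sketch of that acknowledgment scheme, assuming a plain TCP connection and a receiver that sends back one byte when it has processed a chunk (host and port are hypothetical):

import socket

def send_with_acks(chunks, host="127.0.0.1", port=9000):
    with socket.create_connection((host, port)) as sock:
        for chunk in chunks:
            sock.sendall(chunk)
            ack = sock.recv(1)  # block until the receiver confirms it kept up
            if not ack:
                raise ConnectionError("receiver closed the connection")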
Suppose you have a good Now() function (GetTickCount() is a bad example: it's OS-specific and has poor precision):
for (int i = 0; i < 1000; i++) {
    DWORD sleep_until = GetTickCount() + EXPECTED_ITERATION_TIME_MS;
    // do stuff
    int remaining = (int)(sleep_until - GetTickCount()); // signed, so a long iteration can't wrap around
    Sleep(remaining > 0 ? remaining : 0);
}
You can check the elapsed time inside the loop, but it may not be the usual solution, because computation time depends entirely on the performance of the machine and the algorithm; people optimize it during development (e.g., many game programmers require at least 25-30 frames per second for properly smooth animation).
The easiest way (on Windows) is to use QueryPerformanceCounter(). Some pseudo-code below.
QueryPerformanceFrequency(&freq)
timeWantedMs = 1000.0 / 30.0   // time per iteration, in ms, for 30 iterations/sec
for i
    QueryPerf(count1)
    do stuff
    QueryPerf(count2)
    timeElapsedMs = (double)(count2 - count1) * 1e3 / (double)freq   // time in milliseconds
    timeDiffMs = timeWantedMs - timeElapsedMs
    if (timeDiffMs > 0)
        QueryPerf(count3)
        QueryPerf(count4)
        while ((double)(count4 - count3) * 1e3 / (double)freq < timeDiffMs)
            QueryPerf(count4)
end for
EDIT: You must make sure that the 'do stuff' section takes less time than one frame period, or else the timing doesn't matter. Also, instead of 1e3 for milliseconds, you can go all the way to nanoseconds if you use 1e9 (if you want that much accuracy).
WARNING: this will eat your CPU, but give you good 'software' timing. Do it in a separate thread (and only if you have more than one processor) so that any GUIs won't lock. You can put a conditional in there to stop the loop if this is a multi-threaded app, too.
@FredOverflow My program is running very fast. It is sending data over wifi to another program which is not fast enough to handle them at the current rate. – Richard Knop
What you might need is a buffer or queue on the receiver side. The thread that receives messages from the client (e.g., through a socket) gets each message and puts it in the queue, and the actual consumer of the messages reads/pops from the queue. Of course, you need concurrency control for your queue.
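A minimal sketch of that idea, assuming Python threads; queue.Queue is thread-safe, which provides the concurrency control mentioned, and the handler is a placeholder:

import queue
import threading

inbox = queue.Queue(maxsize=10_000)  # bounded, so memory cannot grow without limit

def reader(sock):
    # Receiver thread: drain the socket as fast as possible into the queue.
    while True:
        data = sock.recv(4096)
        if not data:
            break
        inbox.put(data)  # blocks if the consumer falls too far behind

def consumer(handle):
    # Worker thread: process messages at its own pace; `handle` is your handler.
    while True:
        handle(inbox.get())
        inbox.task_done()

# Start the consumer with a placeholder handler; the reader starts once a socket exists.
threading.Thread(target=consumer, args=(print,), daemon=True).start()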
Besides the flow-control methods mentioned, if you also need to maintain an accurate, specific data sending rate in your sender, it can usually be done like this.
E.g., if you want to send at 10 Mbps, create a timer with an interval of 1 ms so it calls a predefined function every 1 ms. Then, in the timer handler, by keeping track of two static variables, 1) the time elapsed since the beginning of sending data and 2) how much data in bytes has been sent up to the last call, you can easily calculate how much data needs to be sent in the current call (or just sleep and wait for the next call).
This way, you can "stream" data in a very stable way with very little jitter; this is the approach usually adopted in video streaming. Of course, it also depends on how accurate the timer is.
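For illustration, a minimal sketch of that bookkeeping (Python, with a hypothetical send_chunk callback; the 10 Mbps figure and 1 ms tick mirror the example above):

import time

TARGET_BYTES_PER_SEC = 10_000_000 / 8  # 10 Mbps expressed as bytes per second

def paced_send(send_chunk, total_bytes, chunk_size=4096):
    # Each tick, send only as many bytes as the elapsed time entitles us to.
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        allowed = (time.monotonic() - start) * TARGET_BYTES_PER_SEC
        if sent < allowed:
            send_chunk(chunk_size)   # hypothetical callback that sends the bytes
            sent += chunk_size
        else:
            time.sleep(0.001)        # wait for the next 1 ms tick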