Riemann project with collectd - clojure

I am trying to do an apparently simple custom configuration using Riemann and collectd. Basically, I'd like to calculate the ratio between two streams. To do that, I tried something like the following (as in the Riemann API suggestion for project here):
(project [(service "cache-miss")
          (service "cache-all")]
  (smap folds/quotient
    (with :service "ratio"
      index)))
This apparently works, but after a while I noticed some of the results were miscalculated. After some log debugging I ended up with the following configuration, in order to see what's happening and print the values:
(project [(service "cache-miss")
          (service "cache-all")]
  (fn [[miss all]]
    (if (or (nil? miss) (nil? all))
      nil
      (where (= (:time miss) (:time all))
        ; to print time marks
        (println (:time all))
        (println (:time miss))
        ; to distinguish easily each event
        (println "NEW LINE")))))
My surprise is that each time I get new data from collectd (every 10 seconds), the function I created is executed twice, as if it were reusing previous unused data. Moreover, it looks like it doesn't care at all about my time-equality constraint in the (where (= :time ...)) clause. The problem is that I am dividing metrics with different timestamps. Below is some output of the previous code:
1445606294
1445606294
NEW LINE -- First time I get data
1445606304
1445606294
NEW LINE
1445606304
1445606304
NEW LINE -- Second time I get data
1445606314
1445606304
NEW LINE
1445606314
1445606314
NEW LINE -- Third time I get data
Can anyone give me a hint on how to get the data formatted as I expected? I assume there is something I am not understanding about the project function, or something related to how incoming data is processed in Riemann.
Thanks in advance!
Update
I managed to solve my problem, though I still don't have a clear idea of how it works. Right now I am receiving two different streams from the collectd tail plugin (reading nginx logs), and I managed to compute the quotient between them as follows:
(where (or (service "nginx/counter-cacheHit") (service "nginx/counter-cacheAll"))
  (coalesce
    (smap folds/quotient
      (with :service "cacheHit"
        (scale (* 1 100) index)))))
I have tested it extensively, and so far it produces the right results. However, I still don't understand several things. First, how is it that coalesce only passes data on after both events have been processed? collectd sends the events of both streams every two seconds with the same time mark. Using project instead of coalesce resulted in two separate executions of smap every two seconds (one per event), whereas coalesce results in a single execution of smap with the two events carrying the same time mark, which is exactly what I wanted.
Finally, I don't know what the criteria are for choosing which stream is the numerator and which is the denominator. Is it because of the order of the "or" clauses in the "where" clause?
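In case the ordering ever becomes a problem, here is a minimal sketch (untested; it assumes Riemann's smap drops nil results) that picks the numerator and denominator explicitly instead of relying on the order in which coalesce emits its events:
(coalesce
  (smap (fn [events]
          (let [by-service (group-by :service events)
                hit (first (get by-service "nginx/counter-cacheHit"))
                all (first (get by-service "nginx/counter-cacheAll"))]
            ; return nil (dropped by smap) until both sides are present
            (when (and hit all (number? (:metric all)) (pos? (:metric all)))
              (assoc hit
                     :service "cacheHit"
                     :metric (* 100.0 (/ (:metric hit) (:metric all)))))))
        index))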
Anyway, there is still some black magic behind it, but I managed to solve my problem ;^)
Thank you all!

Taking the ratio between streams that were moving at different rates didn't work out for me. I have since settled on calculating ratios and rates within a fixed time interval or a moving time interval. This way you capture a consistent snapshot of events in a time block and calculate over that. Here is some elided code that compares the rate at which a service receives events to the rate at which it forwards them:
(moving-time-window 30 ;; seconds
  (smap (fn [events]
          (let [in (or (->> events
                            (filter #(= (:service %) "event-received"))
                            count)
                       0)
                out (or (->> events
                             (filter #(= (:service %) "event-sent"))
                             count)
                        0)
                flow-rate (float (if (> in 0) (/ out in) 0))]
            {:service "flow rate"
             :metric flow-rate
             :host "All"
             :state (if (< flow-rate 0.99) "WARNING" "OK")
             :time (:time (last events))
             :ttl default-interval}))
        (tag ["some" "tags" "here"] index)
        (where (and
                (< (:metric event) 0.9)
                (= (:environment event) "production"))
          (throttle 1 3600 send-to-slack))))
This takes in a fixed window of events, calculates the ratio for that block, and emits an event containing that ratio as its metric. Then, if the metric is bad, it pings me on Slack.

Related

Sleeping Barber Problem in Clojure using core.async and go

Hello, and thank you for any help. I am just starting to learn Clojure and think it's amazing. Below is my code for the sleeping barber problem. I thought that dropping-buffer from core.async would be perfect for this problem, and while it seems to work, it never stops.
The haircuts and dropping buffer seem to work right.
---Edited
It does stop now. But I get an error trying to check customer-num for nil (I've noted the line in the code below). It seems like it can't do an if on a nil because it's nil!
(if (not (nil? customer-num)) ;; throws error => Cannot invoke "clojure.lang.IFn.invoke()" because the return value of "clojure.lang.IFn.invoke(Object)" is null
---End of edit
Also, how to get the return value of the number of haircuts to the calling operate-shop?
The sleeping barber problem, as written up in Seven Languages in Seven Weeks (originally posed by Edsger Dijkstra in 1965):
A barber shop takes customers.
Customers arrive at random intervals, from ten to thirty milliseconds.
The barber shop has three chairs in the waiting room.
The barber shop has one barber and one barber chair.
When the barber’s chair is empty, a customer sits in the chair, wakes up the barber, and gets a haircut.
If the chairs are occupied, all new customers will turn away.
Haircuts take twenty milliseconds.
After a customer receives a haircut, he gets up and leaves.
Determine how many haircuts a barber can give in ten seconds.
(ns sleepbarber
  (:require [clojure.core.async
             :as a
             :refer [>! <! >!! <!! go go-loop chan dropping-buffer close! thread
                     alt! alts! alts!! timeout]]))

(def barber-shop (chan (dropping-buffer 3))) ;; no more than 3 customers waiting

(defn cut-hair []
  (go-loop [haircuts 0]
    (let [customer-num (<! barber-shop)]
      (if (not (nil? customer-num)) ;; throws error => Cannot invoke "clojure.lang.IFn.invoke()" because the return value of "clojure.lang.IFn.invoke(Object)" is null
        (do (<! (timeout 20)) ;; wait for haircut to finish
            (println haircuts "haircuts!" (- customer-num haircuts) "customers turned away!!")
            (recur (inc haircuts)))
        haircuts))))

(defn operate-shop [open-time]
  ((let [[_ opening] (alts!! [(timeout open-time)
                              (go-loop [customer 0]
                                (<! (timeout (+ 10 (rand-int 20)))) ;; wait for random arrival of customers
                                (>! barber-shop customer)
                                (recur (+ customer 1)))])]
     (close! barber-shop)
     (close! opening))))

(cut-hair)
(operate-shop 2000)
Without running your code to confirm my suspicions, I see two problems with your implementation.
The first is that the body of operate-shop starts with ((, which you appear to intend as a grouping mechanism. But of course, in Clojure, (f x y) is how you call the function f with arguments x and y. So your implementation evaluates the let, calling alts!! and then close! twice (all intended so far), but then invokes the result of the let, i.e. the return value of the final close!, which is nil and certainly not a function. Invoking nil gives exactly the "Cannot invoke clojure.lang.IFn.invoke()" error you noted, and it fires once your shop closes. Normally I would recommend just removing the outer parens. (If you instead wrap the body in a go block, use the parking alts! there; the blocking alts!! is only correct outside of go.)
The second is that your first go-loop had no termination condition. You treat customer-num as if it were a number, but when the channel is closed, it will be nil: that's how you can tell a channel is closed. Involving nil in subtraction throws an exception. Instead, you should check whether the result is nil, and if so, exit the loop, as the shop is closed; your edited version's nil check does exactly this.
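For concreteness, here is a minimal sketch (untested) of operate-shop with the stray outer parens removed. The customer loop stops on its own because >! returns false once the channel is closed:
(defn operate-shop [open-time]
  ;; customer generator; exits when the shop channel closes
  (go-loop [customer 0]
    (<! (timeout (+ 10 (rand-int 20)))) ;; random arrivals every 10-30 ms
    (when (>! barber-shop customer)     ;; >! returns false on a closed channel
      (recur (inc customer))))
  (<!! (timeout open-time)) ;; keep the shop open for open-time ms
  (close! barber-shop))
As for returning the number of haircuts: cut-hair's go-loop already ends by evaluating haircuts, so the channel it returns delivers the total, e.g. (let [result (cut-hair)] (operate-shop 2000) (<!! result)).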

core.async loop blocked waiting to read from channel

Let's say I've got a channel out (chan). I need to take values that are put onto the channel and add them. The number of values is undetermined (so I cannot use a traditional loop with an end case around (<! out)), and they come from an external IO source. I'm using a fixed timeout with alts!, but that doesn't seem like the best way to approach the problem. So far, I've got the following (which I got from https://gist.github.com/schaueho/5726a96641693dce3e47):
(go-loop [[v ch] (alts! [out (timeout 1000)])
          acc 0]
  (if-not v
    (do (close! out)
        (deliver p acc))
    (do
      (>! task-ch (as/progress-tick))
      (recur (alts! [out (timeout 1000)]) (+ acc v)))))
The problem I've got is that a timeout of 1000 ms is sometimes not enough, which causes the go-loop to exit prematurely (it may take more than 1000 ms for the IO operation to complete and put a value onto the out channel). I don't think increasing the timeout value is a good idea either, as it may cause me to wait longer than necessary.
What is the best way to guarantee that all reads from the out channel happen and that the loop exits correctly?
Update:
Why am I using timeout?
Because the number of values being put onto the channel is not fixed, which means I cannot create an exit case. Without an exit case, the go-loop will park indefinitely on (<! out), waiting for values to be put onto the channel. If you have a solution without the timeout, that'd be really awesome.
How do I know I've read the last value?
I don't. That's the problem, and that's why I'm using timeout and alts! to exit the go-loop.
What do you want to do with the result?
Simple addition for now; however, that's not the important bit.
Final update:
I figured out a way to get the number of values I'd be dealing with (n in the code below), so I modified my logic to make use of that. I'm still using the timeout and alts! to prevent any indefinite blocking.
(go-loop [[v _] (alts! [out (timeout 1000)])
          i 0
          acc 0]
  (if (and v (not= n i))
    (do
      (>! task-ch (as/progress-tick))
      (recur (alts! [out (timeout 1000)]) (inc i) (+ acc v)))
    (do (close! out)
        (deliver p* (if (= n i) acc nil)))))
I think your problem is a bit higher up in your design, and not a core.async-specific one:
On one hand, you have an undetermined number of values coming in on a channel: there could be 0, there could be 10, there could be 1,000,000.
On the other hand, you want to read all of them, do some calculation, and then return. This is impossible to do unless there is some other signal you can use to say "I think I'm done now".
If that signal is the timing of the values, then your approach of using alts! is the correct one, although I believe the code can be simplified a bit.
Update: Do you have access to the "upstream" IO? Can you put a sentinel value (e.g. something like ::closed) onto the channel when the IO operation finishes?
The 'best' way is to wait either for a special batch-ending message from out, or for out to be closed by the sender to mark the end of the inputs.
Either way, the solution rests with the sender communicating something about the inputs.
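A minimal sketch of that sentinel approach (untested; the helper name and channels are assumptions): the sender puts ::closed when the IO work finishes, so the consumer needs neither a timeout nor a known count:
(require '[clojure.core.async :as async :refer [go go-loop chan >! <! <!!]])

(defn sum-until-closed [out]
  (go-loop [acc 0]
    (let [v (<! out)]
      (if (or (nil? v) (= ::closed v)) ; nil means the channel itself was closed
        acc                            ; the go-loop's channel delivers the sum
        (recur (+ acc v))))))

;; usage sketch
(let [out (chan)]
  (go (doseq [x [1 2 3]] (>! out x))
      (>! out ::closed)) ; sender marks the end of the inputs
  (<!! (sum-until-closed out))) ; => 6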

Closing a channel at the producer end when all the jobs are finished

For my Mandelbrot explorer project, I need to run several expensive jobs, ideally in parallel. I decided to try chunking the jobs and running each chunk in its own thread, and ended up with something like:
(defn point-calculator [chunk-size points]
  (let [out-chan (chan (count points))
        chunked (partition chunk-size points)]
    (doseq [chunk chunked]
      (thread
        (let [processed-chunk (expensive-calculation chunk)]
          (>!! out-chan processed-chunk))))
    out-chan))
Where points is a list of [real, imaginary] coordinates to be tested, and expensive-calculation is a function that takes a chunk and tests each point in it. Each chunk can take a long time to finish (potentially a minute or more, depending on the chunk size and the number of jobs).
On my consumer end, I'm using
(loop []
  (when-let [proc-chunk (<!! result-chan)]
    ; Do stuff with chunk
    (recur)))
To consume each processed chunk. Right now, this blocks after the last chunk is consumed, since the channel is still open.
I need a way of closing the channel when all the jobs are done. This is proving difficult because of the asynchronicity of the producer loop: I can't simply put a close! after the doseq, since the loop doesn't block, and I can't close when the last-indexed job is done, since the completion order is indeterminate.
The best idea I could come up with was maintaining an (atom #{}) of jobs and disjing each job as it finishes. Then I could either check the set size in the loop and close! when it reaches 0, or attach a watch to the atom and check there.
This seems very hackish, though. Is there a more idiomatic way of dealing with this? Does this scenario suggest I'm using async incorrectly?
I would take a look at the take function from core.async. This is what its documentation says:
"Returns a channel that will return, at most, n items from ch. After n items have been returned, or ch has been closed, the return channel will close."
So it leads to a simple fix: instead of returning out-chan, you can just wrap it in take:
(clojure.core.async/take (count chunked) out-chan)
That should work.
I would also recommend rewriting your example from the blocking put/take (>!!, <!!) to the parking variants (>!, <!), and from thread to go / go-loop, which is more idiomatic core.async usage.
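For reference, a sketch of point-calculator with the take fix applied (untested; expensive-calculation as in the question, and thread kept since the work is CPU-bound):
(require '[clojure.core.async :as async :refer [chan thread >!!]])

(defn point-calculator [chunk-size points]
  (let [chunked (partition chunk-size points)
        out-chan (chan (count chunked))]
    (doseq [chunk chunked]
      (thread
        (>!! out-chan (expensive-calculation chunk))))
    ;; closes itself after one result per chunk has been returned,
    ;; so the consumer's when-let loop terminates naturally
    (async/take (count chunked) out-chan)))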
You may want to use async/pipeline(-blocking) to control the parallelism, and async/onto-chan to close the input channel automatically after all the chunks have been copied onto it.
E.g. the example below shows a 16x improvement in elapsed time when the parallelism is set to 16.
(defn expensive-calculation [pts]
  (Thread/sleep 100)
  (reduce + pts))

(time
 (let [points (take 10000 (repeatedly #(rand 100)))
       chunk-size 500
       inp-chan (chan)
       out-chan (chan)]
   (go-loop []
     (when-let [res (<! out-chan)]
       ;; do stuff with chunk
       (recur)))
   (pipeline-blocking 16 out-chan (map expensive-calculation) inp-chan)
   (<!! (onto-chan inp-chan (partition-all chunk-size points)))))

flushing the content of a core.async channel

Consider a core.async channel which is created like so:
(def c (chan))
Let's assume values are put onto and taken from this channel in different places (e.g. in go-loops).
How would one flush all the items on the channel at a certain point in time?
For instance, one could hold the channel in an atom and then have an event like this:
(def c (atom (chan)))
(defn reset []
  (close! @c)
  (reset! c (chan)))
Is there another way to do so?
Read everything into a vector with async/into and simply discard the result:
(go (async/into [] c))
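Note that async/into only delivers its result once the source channel closes, so a flush along these lines looks something like this (a sketch, using the c defined at the top):
(close! c)              ; no further puts will succeed
(<!! (async/into [] c)) ; drain whatever is still buffered, then discard it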
Let's define a little more clearly what you seem to want to do: you have code running in several go-loops, each of them putting data on the same channel. You want to be able to tell them all: "the channel you're putting values on is no good anymore; from now on, put your values on some other channel." If that's not what you want to do, then your original question doesn't make much sense, as there's no "flushing" to be done: you either take the values being put on the channel, or you don't.
First, understand the reason your approach won't work, which the comments on your question touch on: if you deref the atom c, you get a channel, and that value is always the same channel. You have code in go-loops that have called >! and are currently parked, waiting for takers. When you close @c, those parked threads stay parked (anyone parked taking from a channel with <! will immediately get the value nil when the channel closes, but parked >!s simply stay parked). You can reset! c all day long, but the parked threads are still parked on a previous value they got from derefing.
So, how do you do it? Here's one approach.
(require '[clojure.core.async :as a
           :refer [>! <! >!! <!! alt! take! go-loop chan close! mult tap]])
(def rand-int-chan (chan))
(def control-chan (chan))
(def control-chan-mult (mult control-chan))

(defn create-worker
  [put-chan control-chan worker-num]
  (go-loop [put-chan put-chan]
    (alt!
      [[put-chan (rand-int 10)]]
      ([_ _] (println (str "Worker" worker-num " generated value."))
             (recur put-chan))

      control-chan
      ([new-chan] (recur new-chan)))))

(defn create-workers
  [n c cc]
  (dotimes [n n]
    (let [tap-chan (chan)]
      (a/tap cc tap-chan)
      (create-worker c tap-chan n))))

(create-workers 5 rand-int-chan control-chan-mult)
So we are going to create 5 worker loops that will put their results onto rand-int-chan, and we will give each of them a "control channel". I will let you explore mult and tap on your own, but in short, we create a single channel onto which we can put values, and each value put there is then broadcast to all the channels that tap it.
In our worker loop, we do one of two things: put a value onto the rand-int-chan we were given at creation time, or take a value off of this control channel. We can cleverly let the worker thread know that the channel to put its values on has changed by actually handing it the new channel, which it will then bind the next time through the loop. So, to see it in action:
(<!! rand-int-chan)
=> 6
Worker2 generated value.
This takes random ints from the channel, and the worker thread prints that it has generated a value, showing that multiple threads are indeed participating here.
Now, let's say we want to change the channel the random integers are put on. No problem, we do:
(def new-rand-int-chan (chan))
(>!! control-chan new-rand-int-chan)
(close! rand-int-chan) ;; for good measure, may not be necessary
We create the channel, and then we put that channel onto our control-chan. When we do this, every worker thread will have the second branch of its alt! executed, which simply loops back to the top of the go-loop, except this time put-chan will be bound to the new-rand-int-chan we just received. So now:
(<!! new-rand-int-chan)
=> 3
Worker1 generated value.
This gives us our integers, which is exactly what we want. Any attempt to <!! from the old channel will give nil, since we closed the channel:
(<!! rand-int-chan)
; nil

Infinite lazy-sequence of events from external feed

Say I have a function, (get-events "feed"), that returns a vector of events in chronological order, taken from an external source.
Now, at any given moment, that function returns a list of events up to that point in time. Called a few seconds later, it will return a few more events, etc, as the feed continually grows.
If I want to create a lazy seq that forever pulls new events from the feed, making sure it doesn't repeat those that have already been seen, how would I write it? I'm running into a stack overflow error when I don't use recur, but I can't use recur here because the call doesn't appear in a tail position.
(defn continually-list-events
  ([feed] (continually-list-events feed (hash-set)))
  ([feed seen]
   (let [events-now (get-events feed)]
     (into (remove seen events-now)
           (lazy-seq
            (continually-list-events feed
                                     (into seen events-now)))))))
You can see I'm trying to use an accumulator to track events already seen (in a set), and I'm making sure to always filter out the ones I've seen.
If each step keeps track of how many events have been received so far, then each iteration can return the sequence of new events by dropping the old ones:
user> (->> (iterate (fn [[events-so-far _]]
                      (let [events (get-events "feed")
                            new-events (drop events-so-far events)]
                        [(count events) new-events]))
                    [0 nil])
           (mapcat second))
Then you can drop the counts from the sequence and flatten the chunks of events into a sequence of single events.
In your example, the stack overflow happens because there is no call to cons inside the lazy-seq: the recursive call is the entire body, and the eager into forces it immediately, so the whole (unbounded) list has to be computed just to produce the first item of the sequence.
user> (defn example [x] (lazy-seq (cons x (example (inc x)))))
#'user/example
user> (take 5 (example 4))
(4 5 6 7 8)
user> (defn example [x] (lazy-seq (example (inc x))))
#'user/example
user> (take 5 (example 4))
... long pause then out of memory ...
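Putting that together, a sketch (untested) of the original function with the laziness fixed: the lazy-seq now wraps the whole step, so forcing the sequence realizes only the current batch of new events while the recursion stays deferred:
(defn continually-list-events
  ([feed] (continually-list-events feed #{}))
  ([feed seen]
   (lazy-seq
    (let [events-now (get-events feed)
          new-events (remove seen events-now)]
      ;; concat keeps the recursive call unrealized until it's needed
      (concat new-events
              (continually-list-events feed (into seen events-now)))))))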
PS: using lazy-seq directly is somewhat uncommon, though it's important to know how it works.