Is there an elegant way to stop a running go block?
(without introducing a flag and polluting the code with checks/branches)
(ns example
(:require-macros [cljs.core.async.macros :refer [go]])
(:require [cljs.core.async :refer [<! timeout]]))
(defn some-long-task []
(go
(println "entering")
; some complex long-running task (e.g. fetching something via network)
(<! (timeout 1000))
(<! (timeout 1000))
(<! (timeout 1000))
(<! (timeout 1000))
(println "leaving")))
; run the task
(def task (some-long-task))
; later, realize we no longer need the result and want to cancel it
; (stop! task)
Sorry, this is not possible with core.async today. What you get back from creating a go block is a normal channel that the result of the block will be put on; it does not give you any handle to the block itself.
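Because there is no handle to the block, any cancellation has to be cooperative. Here is a minimal sketch, assuming the JVM namespace clojure.core.async (in ClojureScript the same functions live in cljs.core.async). The cancellable-task helper is hypothetical and not part of the library. Every park races against a cancel channel, so closing that channel ends the block at its next park:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical helper, not part of core.async.
(defn cancellable-task
  "Parks `steps` times for `ms` milliseconds each. Returns a map with the
  go block's :result channel and a :cancel channel."
  [steps ms]
  (let [cancel (async/chan)
        result (async/go
                 (loop [i 0]
                   (if (= i steps)
                     :done
                     ;; :priority true makes alts! check cancel first.
                     (let [[_ port] (async/alts! [cancel (async/timeout ms)]
                                                 :priority true)]
                       (if (= port cancel)
                         :cancelled
                         (recur (inc i)))))))]
    {:result result :cancel cancel}))
```

Calling (async/close! (:cancel task)) makes the block finish with :cancelled instead of :done. This still adds a check at every park, so it does not avoid the branching the question hoped to avoid.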
As stated in Arthur's answer, you cannot terminate a go block immediately, but since your example indicates a multi-phase task (made of sub-tasks), an approach like this might work:
(defn task-processor
"Takes an initial state value and number of tasks (fns). Puts tasks
on a work queue channel and then executes them in a go-loop, with
each task passed the current state. A task's return value is used as
input for next task. When all tasks are processed or queue has been
closed, places current result/state onto a result channel. To allow
nil values, result is wrapped in a map:
{:value state :complete? true/false}
This fn returns a map of {:queue queue-chan :result result-chan}"
[init & tasks]
(assert (pos? (count tasks)))
(let [queue (chan)
result (chan)]
(async/onto-chan queue tasks)
(go-loop [state init, i 0]
(if-let [task (<! queue)]
(recur (task state) (inc i))
(do (prn "task queue finished/terminated")
(>! result {:value state :complete? (== i (count tasks))}))))
{:queue queue
:result result}))
(defn dummy-task [x] (prn :task x) (Thread/sleep 1000) (inc x))
;; kick off tasks
(def proc (apply task-processor 0 (repeat 100 dummy-task)))
;; result handler
(go
(let [res (<! (:result proc))]
(prn :final-result res)))
;; to stop the queue after current task is complete
;; in this example it might take up to an additional second
;; for the terminated result to be delivered
(close! (:queue proc))
You may want to use future and future-cancel for such a task.
(def f (future (while (not (Thread/interrupted)) (your-function ... ))))
(future-cancel f)
Why do cancelled Clojure futures continue using CPU?
I have a channel that I put values onto inside a doseq loop.
This code reads from a list of isbns and for each isbn, does an amazon search to return contents of a book, and then calls another function to get the title and rank
(def book_channel (chan 10))
Make sure you use clojure.core.async/into rather than clojure.core/into. Here is an example of a round trip from collection to channel and back to collection:
user> (require '[clojure.core.async :as async :refer [<! <!! >!! >! chan go]])
nil
user> (def book-chan (async/to-chan [:book1 :book2 :book3]))
#'user/book-chan
user> (<!! (clojure.core.async/into [] book-chan))
[:book1 :book2 :book3]
clojure.core.async/into returns a channel that will have exactly one item written to it. That item is written once its input channel closes. This keeps the whole thing asynchronous, but it does require that the code putting things into the book channel close it to signal that all the books are there.
You need to do some type of coordination to determine when all of your work is finished. You can pull that coordination out into the main thread fairly easily:
(def book_channel (chan 10))
(defn concurrency_test
[list_of_isbns]
(doseq [isbn list_of_isbns]
(go (>! book_channel
(get_title_and_rank_for_one_isbn
(amazon_search isbn)))))
(prn (loop [results []]
(if (= (count results) (count list_of_isbns))
results
(recur (conj results (<!! book_channel)))))))
Here, I used a loop that keeps waiting for results and adding them to the vector until we have as many results as we do isbns. You'll want to make sure that get_title_and_rank_for_one_isbn always generates a result that can be put on a channel, otherwise the loop will wait forever.
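The counting loop above can also be replaced by letting the channels signal completion themselves. This is a sketch under assumptions: collect-results is a hypothetical name, and work stands in for the composition of amazon_search and get_title_and_rank_for_one_isbn. Each go block returns its own channel, which closes when the block finishes, so merging those channels and reading them with async/into terminates even when a lookup yields nil:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical helper; `work` is a stand-in for the per-ISBN lookup.
(defn collect-results
  "Starts one go block per item and blocks the calling thread until all of
  them have finished. Results arrive in completion order, not input order."
  [work items]
  (let [chans (mapv (fn [item] (async/go (work item))) items)]
    ;; merge closes its output once every input channel has closed,
    ;; which is what lets into deliver the collected vector.
    (async/<!! (async/into [] (async/merge chans)))))
```

A go block whose body returns nil simply closes its channel without a value, so that item is missing from the result instead of making the caller wait forever.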
You should close! the book_channel after you finish pushing stuff into it. Per async/into documentation - "ch must close before into produces a result."
(let [book> (chan)]
(go
(doseq [e (range 8)]
(>! book> e))
(close! book>))
(<!! (async/into [] book>)))
Alternatively, you can use async/onto-chan which will close the channel for you:
(let [book> (chan)]
(async/onto-chan book> (range 8))
(<!! (async/into [] book>)))
I have 100 workers (agents) that share one ref containing a collection of tasks. While this collection has tasks, each worker gets one task from it (in a dosync block), prints it, and sometimes puts it back in the collection (in a dosync block):
(defn have-tasks?
[tasks]
(not (empty? @tasks)))
(defn get-task
[tasks]
(dosync
(let [task (first @tasks)]
(alter tasks rest)
task)))
(defn put-task
[tasks task]
(dosync (alter tasks conj task))
nil)
(defn worker
[& {:keys [tasks]}]
(agent {:tasks tasks}))
(defn worker-loop
[{:keys [tasks] :as state}]
(while (have-tasks? tasks)
(let [task (get-task tasks)]
(println "Task: " task)
(when (< (rand) 0.1)
(put-task tasks task))))
state)
(defn create-workers
[count & options]
(->> (range 0 count)
(map (fn [_] (apply worker options)))
(into [])))
(defn start-workers
[workers]
(doseq [worker workers] (send-off worker worker-loop)))
(def tasks (ref (range 1 10000000)))
(def workers (create-workers 100 :tasks tasks))
(start-workers workers)
(apply await workers)
When I run this code, the last value printed by the agents is (after several tries):
435445,
4556294,
1322061,
3950017.
But never 9999999, which is what I expect.
And every time the collection is really empty at the end.
What am I doing wrong?
Edit:
I rewrote worker-loop as simple as possible:
(defn worker-loop
[{:keys [tasks] :as state}]
(loop []
(when-let [task (get-task tasks)]
(println "Task: " task)
(recur)))
state)
But the problem is still there.
This code behaves as expected when I create one and only one worker.
The problem here has nothing to do with agents and barely anything to do with laziness. Here's a somewhat reduced version of the original code that still exhibits the problem:
(defn f [init]
(let [state (ref init)
task (fn []
(loop [last-n nil]
(if-let [n (dosync
(let [n (first @state)]
(alter state rest)
n))]
(recur n)
(locking :out
(println "Last seen:" last-n)))))
workers (->> (range 0 5)
(mapv (fn [_] (Thread. task))))]
(doseq [w workers] (.start w))
(doseq [w workers] (.join w))))
(defn r []
(f (range 1 100000)))
(defn i [] (f (->> (iterate inc 1)
(take 100000))))
(defn t []
(f (->> (range 1 100000)
(take Integer/MAX_VALUE))))
Running this code shows that both i and t, both lazy, reliably work, whereas r reliably doesn't. The problem is in fact a concurrency bug in the class returned by the range call. Indeed, that bug is documented in this Clojure ticket and is fixed as of Clojure version 1.9.0-alpha11.
A quick summary of the bug in case the ticket is not accessible for some reason: in the internals of the rest call on the result of range, there was a small opportunity for a race condition: the "flag" that says "the next value has already been computed" was set before the actual value itself, which meant that a second thread could see that flag as true even though the "next value" is still nil. The call to alter would then fix that nil value on the ref. It's been fixed by swapping the two assignment lines.
In cases where the result of range was either forcibly realized in a single thread or wrapped in another lazy seq, that bug would not appear.
I asked this question on the Clojure Google Group and it helped me to find the answer.
The problem is that I used a lazy sequence within the STM transaction.
When I replaced this code:
(def tasks (ref (range 1 10000000)))
by this:
(def tasks (ref (into [] (range 1 10000000))))
it worked as expected!
In my production code where the problem occurred, I used the Korma framework that also returns a lazy collection of tuples, as in my example.
Conclusion: Avoid the use of lazy data structures within the STM transaction.
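To check that advice on a small scale, here is a sketch that drains an eagerly built vector from several plain threads and returns every task that was handed out. The names take-task! and drain-with-workers are hypothetical, and tasks are assumed to be non-nil, because nil is used as the "queue is empty" signal:

```clojure
;; Hypothetical helper names; a sketch, not the original poster's code.
(defn take-task!
  "Atomically removes and returns the first task, or nil when none remain."
  [tasks]
  (dosync
    (let [ts @tasks]
      (when (seq ts)
        (alter tasks rest)
        (first ts)))))

(defn drain-with-workers
  "Starts n-workers threads that drain task-coll and returns, in the order
  they were recorded, every task the workers saw."
  [n-workers task-coll]
  (let [tasks   (ref (vec task-coll))   ; realize eagerly before sharing
        seen    (atom [])
        worker  (fn []
                  (loop []
                    (when-some [t (take-task! tasks)]
                      (swap! seen conj t)
                      (recur))))
        threads (mapv (fn [_] (Thread. ^Runnable worker)) (range n-workers))]
    (run! #(.start ^Thread %) threads)
    (run! #(.join ^Thread %) threads)
    @seen))
```

Sorting the result of (drain-with-workers 4 (range 1 1000)) and comparing it with (range 1 1000) is a quick way to confirm that no task was lost or handed out twice.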
When the last number in the range is reached, there are still older numbers being held by the workers. Some of these will be returned to the queue, to be processed again.
In order to better see what is happening, you can change worker-loop to print the last task handled by each worker:
(defn worker-loop
[{:keys [tasks] :as state}]
(loop [last-task nil]
(if (have-tasks? tasks)
(let [task (get-task tasks)]
;; (when (< (rand) 0.1)
;; (put-task tasks task)
(recur task))
(when last-task
(println "Last task:" last-task))))
state)
This also shows the race condition in the code: near the end of processing, a task seen by have-tasks? has often been taken by another worker by the time get-task is called.
The race condition can be solved by removing have-tasks? and instead using the return value of nil from get-task as a signal that no more tasks are available (at the moment).
Updated:
As observed, this race condition does not explain the problem.
Neither is the problem solved by removing a possible race condition in get-task like this:
(defn get-task [tasks]
(dosync
(first (alter tasks rest))))
However changing get-task to use an explicit lock seems to solve the problem:
(defn get-task [tasks]
(locking :lock
(dosync
(let [task (first @tasks)]
(alter tasks rest)
task))))
What's the best way in Clojure to implement something like an actor or agent (asynchronously updated, uncoordinated reference) that does the following?
gets sent messages/data
executes some function on that data to obtain new state; something like (fn [state new-msgs] ...)
continues to receive messages/data during that update
once done with that update, runs the same update function against all messages that have been sent in the interim
An agent doesn't seem quite right here. One must simultaneously send function and data to agents, which doesn't leave room for a function which operates on all data that has come in during the last update. The goal implicitly requires a decoupling of function and data.
The actor model seems generally better suited in that there is a decoupling of function and data. However, all actor frameworks I'm aware of seem to assume each message sent will be processed separately. It's not clear how one would turn this on its head without adding extra machinery. I know Pulsar's actors accept a :lifecycle-handle function which can be used to make actors do "special tricks", but there isn't a lot of documentation around this, so it's unclear whether the functionality would be helpful.
I do have a solution to this problem using agents, core.async channels, and watch functions, but it's a bit messy, and I'm hoping there is a better solution. I'll post it as a solution in case others find it helpful, but I'd like to see what others come up with.
Here's the solution I came up with using agents, core.async channels, and watch functions. Again, it's a bit messy, but it does what I need it to for now. Here it is, in broad strokes:
(require '[clojure.core.async :as async :refer [>!! <!! >! <! chan go]])
; We'll call this thing a queued-agent
(defprotocol IQueuedAgent
(enqueue [this message])
(ping [this]))
(defrecord QueuedAgent [agent queue]
IQueuedAgent
(enqueue [_ message]
(go (>! queue message)))
(ping [_]
(send agent identity)))
; Need a function for draining a core async channel of all messages
(defn drain! [c]
(let [cc (chan 1)]
(go (>! cc ::queue-empty))
(letfn
; This fn does all the hard work, but closes over cc to avoid reconstruction
[(drainer! [c]
(let [[v _] (<!! (go (async/alts! [c cc] :priority true)))]
(if (= v ::queue-empty)
(lazy-seq [])
(lazy-seq (cons v (drainer! c))))))]
(drainer! c))))
; Constructor function
(defn queued-agent [& {:keys [buffer update-fn init-fn error-handler-builder] :or {buffer 100}}]
(let [q (chan buffer)
a (agent (if init-fn (init-fn) {}))
error-handler-fn (error-handler-builder q a)]
; Set up the queue, and watcher which runs the update function when there is new data
(add-watch
a
:update-conv
(fn [k r o n]
(let [queued (drain! q)]
(when-not (empty? queued)
(send a update-fn queued error-handler-fn)))))
(QueuedAgent. a q)))
; Now we can use these like this
(def a (queued-agent
:init-fn (fn [] {:some "initial value"})
:update-fn (fn [a queued-data error-handler-fn]
(println "Receiving data" queued-data)
; Simulate some work/load on data
(Thread/sleep 2000)
(println "Done with work; ready to queue more up!"))
; This is a little warty at the moment, but closing over the queue and agent lets you requeue work on
; failure so you can try again.
:error-handler-builder
(fn [q a] (println "do something with errors"))))
(defn -main []
(doseq [i (range 10)]
(enqueue a (str "data" i))
(Thread/sleep 500) ; simulate things happening
; This part stinks... have to manually let the queued agent know that we've queued some things up for it
(ping a)))
As you'll notice, having to ping the queued-agent here every time new data is added is pretty warty. It definitely feels like things are being twisted out of typical usage.
Agents are the inverse of what you want here: they are a value that gets sent updating functions. This is easiest with a queue and a thread. For convenience I am using future to construct the thread.
user> (def q (java.util.concurrent.LinkedBlockingDeque.))
#'user/q
user> (defn accumulate
[summary input]
(let [{vowels true consonents false}
(group-by #(contains? (set "aeiouAEIOU") %) input)]
(-> summary
(update-in [:vowels] + (count vowels))
(update-in [:consonents] + (count consonents)))))
#'user/accumulate
user> (def worker
(future (loop [summary {:vowels 0 :consonents 0} in-string (.take q)]
(if (not in-string)
summary
(recur (accumulate summary in-string)
(.take q))))))
#'user/worker
user> (.add q "hello")
true
user> (.add q "goodbye")
true
user> (.add q false)
true
user> @worker
{:vowels 5, :consonents 7}
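The worker above handles one message per iteration. To process everything that arrived during the previous update in one go, as the question asks, the queue can be drained in batches. A minimal sketch, assuming a java.util.concurrent.LinkedBlockingQueue and a hypothetical take-batch! helper:

```clojure
(import '(java.util.concurrent LinkedBlockingQueue)
        '(java.util ArrayList))

;; Hypothetical helper name.
(defn take-batch!
  "Blocks until at least one message is available, then returns it together
  with everything else queued at that moment, as a vector."
  [^LinkedBlockingQueue q]
  (let [batch (ArrayList.)]
    (.add batch (.take q))   ; blocks while the queue is empty
    (.drainTo q batch)       ; moves the rest over without blocking
    (vec batch)))
```

A worker loop would then call something like (recur (update-fn state (take-batch! q))), where update-fn is the (fn [state new-msgs] ...) from the question.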
I came up with something closer to an actor, inspired by Tim Baldridge's cast on actors (Episode 16). I think this addresses the problem much more cleanly.
(defmacro take-all! [c]
`(loop [acc# []]
(let [[v# port#] (alts! [~c] :default nil)]
(if (not= port# :default)
(recur (conj acc# v#))
acc#))))
(defn eager-actor [f]
(let [msgbox (chan 1024)]
(go (loop [f f]
(let [first-msg (<! msgbox) ; do this so we park efficiently, and only
; run when there are actually messages
msgs (take-all! msgbox)
msgs (concat [first-msg] msgs)]
(recur (f msgs)))))
msgbox))
(let [a (eager-actor (fn f [ms]
(Thread/sleep 1000) ; simulate work
(println "doing something with" ms)
f))]
(doseq [i (range 20)]
(Thread/sleep 300)
(put! a i)))
;; =>
;; doing something with (0)
;; doing something with (1 2 3)
;; doing something with (4 5 6)
;; doing something with (7 8 9 10)
;; doing something with (11 12 13)
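The draining step does not have to be a macro. Here is a sketch of the same idea as a plain function, assuming a core.async version that provides poll! (a non-blocking take). The name drain-now is hypothetical:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical helper name.
(defn drain-now
  "Returns a vector of everything that can be taken from ch without waiting."
  [ch]
  (loop [acc []]
    ;; poll! returns nil when nothing is immediately available
    ;; (or when the channel is closed and empty), which ends the loop.
    (if-some [v (async/poll! ch)]
      (recur (conj acc v))
      acc)))
```

Because poll! never parks, drain-now can be called as an ordinary function from inside the go loop in place of take-all!.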
I'd like to use memoize for a function that uses core.async and <! e.g
(defn foo [x]
(go
(<! (timeout 2000))
(* 2 x)))
(In real life, this could be useful for caching the results of server calls.)
I was able to achieve that by writing a core.async version of memoize (almost the same code as memoize):
(defn memoize-async [f]
(let [mem (atom {})]
(fn [& args]
(go
(if-let [e (find @mem args)]
(val e)
(let [ret (<! (apply f args))]; this line differs from memoize [ret (apply f args)]
(swap! mem assoc args ret)
ret))))))
Example of usage:
(def foo-memo (memoize-async foo))
(go (println (<! (foo-memo 3)))); delay because of (<! (timeout 2000))
(go (println (<! (foo-memo 3)))); subsequent calls are memoized => no delay
I am wondering if there are simpler ways to achieve the same result.
Remark: I need a solution that works with <!. For <!!, see this question: How to memoize a function that uses core.async and blocking channel read?
You can use the built-in memoize function for this. Start by defining a function that reads from a channel and returns the value:
(defn wait-for [ch]
(<!! ch))
Note that we use <!! and not <! because we want this function to block until there is data on the channel in all cases. <! only exhibits this behavior when used inside a go block.
You can then construct your memoized function by composing this function with foo, like so:
(def foo-memo (memoize (comp wait-for foo)))
foo returns a channel, so wait-for will block until that channel has a value (i.e. until the operation inside foo finished).
foo-memo can be used similarly to your example above, except that you do not need the call to <! because wait-for will block for you:
(go (println (foo-memo 3)))
You can also call this outside of a go block, and it will behave like you expect (i.e. block the calling thread until foo returns).
This was a little trickier than I expected. Your solution isn't correct: if you call your memoized function again with the same arguments before the first call has finished running its go block, the cache is still empty, so you get a miss and trigger the work a second time. This is often the case when you process lists with core.async.
The one below uses core.async's pub/sub to solve this (tested in CLJS only):
(def lookup-sentinel #?(:clj ::not-found :cljs (js-obj)))
(def pending-sentinel #?(:clj ::pending :cljs (js-obj)))
(defn memoize-async
[f]
(let [>in (chan)
pending (pub >in :args)
mem (atom {})]
(letfn
[(memoized [& args]
(go
(let [v (get @mem args lookup-sentinel)]
(condp identical? v
lookup-sentinel
(do
(swap! mem assoc args pending-sentinel)
(go
(let [ret (<! (apply f args))]
(swap! mem assoc args ret)
(put! >in {:args args :ret ret})))
(<! (apply memoized args)))
pending-sentinel
(let [<out (chan 1)]
(sub pending args <out)
(:ret (<! <out)))
v))))]
memoized)))
NOTE: it probably leaks memory, because subscriptions and <out channels are never closed.
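The concurrent-miss problem can also be sketched without pub/sub by caching a channel per argument list instead of the value. This assumes core.async's promise-chan, which hands the same value to every taker once it has been delivered. The name memoize-async-shared is hypothetical, and the sketch was written against the JVM API:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical name; a sketch, not a drop-in replacement for the code above.
(defn memoize-async-shared
  "Wraps f, which must return a channel. Every call with the same args
  returns the same promise-chan, so f runs at most once per args."
  [f]
  (let [mem (atom {})]
    (fn [& args]
      (let [fresh  (async/promise-chan)
            ;; Atomically claim the cache slot unless another caller has it.
            cached (get (swap! mem #(if (contains? % args) % (assoc % args fresh)))
                        args)]
        ;; Only the caller whose channel was stored actually invokes f.
        (when (identical? cached fresh)
          (async/pipe (apply f args) fresh))
        cached))))
```

Callers take from the returned channel with <! as before. If the channel returned by f closes without a value, the cached channel closes too, and every caller receives nil for those arguments.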
I have used this function in one of my projects to cache HTTP calls. The function caches results for a given amount of time and uses a barrier to prevent executing the function multiple times when the cache is "cold" (due to the context switch inside the go block).
(defn memoize-af-until
[af ms clock]
(let [barrier (async/chan 1)
last-return (volatile! nil)
last-return-ms (volatile! nil)]
(fn [& args]
(async/go
(>! barrier :token)
(let [now-ms (.now clock)]
(when (or (not @last-return-ms) (< @last-return-ms (- now-ms ms)))
(vreset! last-return (<! (apply af args)))
(vreset! last-return-ms now-ms))
(<! barrier)
@last-return)))))
You can test that it works properly by setting the cache time to 0 and observing that the two function calls take approximately 10 seconds in total. Without the barrier the two calls would finish at the same time:
(def memo (memoize-af-until #(async/timeout 5000) 0 js/Date))
(async/take! (memo) #(println "[:a] Finished"))
(async/take! (memo) #(println "[:b] Finished"))
I'm looking for a very simple way to call a function periodically in Clojure.
JavaScript's setInterval has the kind of API I'd like. If I reimagined it in Clojure, it'd look something like this:
(def job (set-interval my-callback 1000))
; some time later...
(clear-interval job)
For my purposes I don't mind if this creates a new thread, runs in a thread pool or something else. It's not critical that the timing is exact either. In fact, the period provided (in milliseconds) can just be a delay between the end of one call completing and the commencement of the next.
If you want something very simple:
(defn set-interval [callback ms]
(future (while true (do (Thread/sleep ms) (callback)))))
(def job (set-interval #(println "hello") 1000))
=>hello
hello
...
(future-cancel job)
=>true
Good-bye.
There are also quite a few scheduling libraries for Clojure:
(from simple to very advanced)
at-at
chime (core.async integration)
monotony
quartzite
Straight from the examples of the github homepage of at-at:
(use 'overtone.at-at)
(def my-pool (mk-pool))
(let [schedule (every 1000 #(println "I am cool!") my-pool)]
;; ... do stuff while schedule runs ...
(stop schedule))
Use (every 1000 #(println "I am cool!") my-pool :fixed-delay true) if you want a delay of a second between end of task and start of next, instead of between two starts.
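If you would rather not add a library, the same fixed-delay behaviour is available from the JDK. Here is a sketch of the set-interval / clear-interval API from the question on top of java.util.concurrent.ScheduledExecutorService. The scheduler var and both function names are assumptions for illustration:

```clojure
(import '(java.util.concurrent Executors ScheduledExecutorService
                               ScheduledFuture TimeUnit))

;; One shared scheduler thread. It is not a daemon thread, so call
;; (.shutdown scheduler) when the application exits.
(defonce ^ScheduledExecutorService scheduler
  (Executors/newSingleThreadScheduledExecutor))

(defn set-interval
  "Runs callback every ms milliseconds, measured from the end of one run to
  the start of the next. Returns a handle for clear-interval."
  [callback ms]
  (.scheduleWithFixedDelay scheduler ^Runnable callback
                           (long ms) (long ms) TimeUnit/MILLISECONDS))

(defn clear-interval
  "Stops future runs without interrupting one that is in progress."
  [^ScheduledFuture job]
  (.cancel job false))
```

Note that the executor silently stops rescheduling a task whose callback throws, so wrap the callback in try/catch if it can fail.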
This is how I would do the core.async version, with a stop channel.
(defn set-interval
[f time-in-ms]
(let [stop (chan)]
(go-loop []
(alt!
(timeout time-in-ms) (do (<! (thread (f)))
(recur))
stop :stop))
stop))
And the usage:
(def job (set-interval #(println "Howdy") 2000))
; Howdy
; Howdy
(close! job)
The simplest approach would be to just have a loop in a separate thread.
(defn periodically
[f interval]
(doto (Thread.
#(try
(while (not (.isInterrupted (Thread/currentThread)))
(Thread/sleep interval)
(f))
(catch InterruptedException _)))
(.start)))
You can cancel execution using Thread.interrupt():
(def t (periodically #(println "Hello!") 1000))
;; prints "Hello!" every second
(.interrupt t)
You could even just use future to wrap the loop and future-cancel to stop it.
I took a stab at coding this up, with a slightly different interface from the one specified in the original question. Here's what I came up with.
(defn periodically
  "Calls f every millis milliseconds. Returns a function that stops the loop."
  [f millis]
  (let [p (promise)]
    (future
      ;; deref with a timeout doubles as the sleep between calls
      (while (= (deref p millis "timeout") "timeout")
        (f)))
    #(deliver p "cancel")))
Feedback welcomed.
Another option would be to use java.util.Timer's scheduleAtFixedRate method
edit - multiplex tasks on a single timer, and stop a single task rather than the entire timer
(defn ->timer [] (java.util.Timer.))
(defn fixed-rate
([f per] (fixed-rate f (->timer) 0 per))
([f timer per] (fixed-rate f timer 0 per))
([f timer dlay per]
(let [tt (proxy [java.util.TimerTask] [] (run [] (f)))]
(.scheduleAtFixedRate timer tt dlay per)
#(.cancel tt))))
;; Example
(let [t (->timer)
job1 (fixed-rate #(println "A") t 1000)
job2 (fixed-rate #(println "B") t 2000)
job3 (fixed-rate #(println "C") t 3000)]
(Thread/sleep 10000)
(job3) ;; stop printing C
(Thread/sleep 10000)
(job2) ;; stop printing B
(Thread/sleep 10000)
(job1))
Using core.async
(ns your-namespace
  (:require [clojure.core.async :as async :refer [<! timeout chan go]]))
(def milisecs-to-wait 1000)
(defn what-you-want-to-do []
  (println "working"))
(def the-condition (atom true))
(defn evaluate-condition []
  @the-condition)
(defn stop-periodic-function []
  (reset! the-condition false))
(go
(while (evaluate-condition)
(<! (timeout milisecs-to-wait))
(what-you-want-to-do)))