rxjava and clojure asynchrony mystery: futures, promises, and agents, oh my

I apologize in advance for the length of this note. I spent considerable time making it shorter, and this was as small as I could get it.
I have a mystery and would be grateful for your help. This mystery comes from the behavior of an rxjava observer I wrote in Clojure over a couple of straightforward observables cribbed from online samples.
One observable synchronously sends messages to the onNext handlers of its observers, and my supposedly principled observer behaves as expected.
The other observable asynchronously does the same, on another thread, via a Clojure future. The exact same observer does not capture all events posted to its onNext; it just seems to lose a random number of messages at the tail.
There is an intentional race in the following between the expiration of a wait for the promised onCompleted and the expiration of a wait for all events sent to an agent collector. If the promise wins, I expect to see false for onCompleted and a possibly short queue in the agent. If the agent wins, I expect to see true for onCompleted and all messages from the agent's queue. The one result I DO NOT expect is true for onCompleted AND a short queue from the agent. But, Murphy doesn't sleep, and that's exactly what I see. I don't know whether garbage-collection is at fault, or some internal queuing to Clojure's STM, or my stupidity, or something else altogether.
I present the source here in its self-contained form, in order, so that it can be run directly via lein repl. There are three ceremonials to get out of the way: first, the Leiningen project file, project.clj, which declares a dependency on version 0.9.0 of Netflix's rxjava:
(defproject expt2 "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [com.netflix.rxjava/rxjava-clojure "0.9.0"]]
  :main expt2.core)
Now, the namespace and a Clojure requirement and the Java imports:
(ns expt2.core
  (:require clojure.pprint)
  (:refer-clojure :exclude [distinct])
  (:import [rx Observable subscriptions.Subscriptions]))
Finally, a macro for output to the console:
(defmacro pdump [x]
  `(let [x# ~x]
     (do (println "----------------")
         (clojure.pprint/pprint '~x)
         (println "~~>")
         (clojure.pprint/pprint x#)
         (println "----------------")
         x#)))
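As a quick sanity check (not from the original post), evaluating a small expression through pdump prints the quoted form, an arrow, and its value, then returns the value:

(pdump (* 6 7))
;; prints:
;; ----------------
;; (* 6 7)
;; ~~>
;; 42
;; ----------------
;; and returns 42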
Now, to my observer. I use an agent to collect the messages sent by any observable's onNext. I use an atom to collect a potential onError. I use a promise for the onCompleted so that consumers external to the observer can wait on it.
(defn- subscribe-collectors [obl]
  (let [;; Keep a sequence of all values sent:
        onNextCollector (agent [])
        ;; Only need one value if the observable errors out:
        onErrorCollector (atom nil)
        ;; Use a promise for 'completed' so we can wait for it on
        ;; another thread:
        onCompletedCollector (promise)]
    (letfn [;; When observable sends a value, relay it to our agent:
            (collect-next [item] (send onNextCollector (fn [state] (conj state item))))
            ;; If observable errors out, just set our exception:
            (collect-error [excp] (reset! onErrorCollector excp))
            ;; When observable completes, deliver on the promise:
            (collect-completed [ ] (deliver onCompletedCollector true))
            ;; In all cases, report out the back end with this:
            (report-collectors [ ]
              (pdump
               ;; Wait for everything that has been sent to the agent
               ;; to drain (presumably internal message queues):
               {:onNext (do (await-for 1000 onNextCollector)
                            ;; Then produce the results:
                            @onNextCollector)
                ;; If we ever saw an error, here it is:
                :onError @onErrorCollector
                ;; Wait at most 1 second for the promise to complete;
                ;; if it does not complete, then produce 'false'.
                ;; I expect if this times out before the agent
                ;; times out to see an 'onCompleted' of 'false'.
                :onCompleted (deref onCompletedCollector 1000 false)
                }))]
      ;; Recognize that the observable 'obl' may run on another thread:
      (-> obl
          (.subscribe collect-next collect-error collect-completed))
      ;; Therefore, produce results that wait, with timeouts, on both
      ;; the completion event and on the draining of the (presumed)
      ;; message queue to the agent.
      (report-collectors))))
Now, here is a synchronous observable. It pumps 25 messages down the onNext throats of its observers, then calls their onCompleteds.
(defn- customObservableBlocking []
  (Observable/create
   (fn [observer] ; This is the 'subscribe' method.
     ;; Send 25 strings to the observer's onNext:
     (doseq [x (range 25)]
       (-> observer (.onNext (str "SynchedValue_" x))))
     ; After sending all values, complete the sequence:
     (-> observer .onCompleted)
     ; Return a NoOpSubscription since this blocks and thus
     ; can't be unsubscribed (disposed):
     (Subscriptions/empty))))
We subscribe our observer to this observable:
;;; The value of the following is the list of all 25 events:
(-> (customObservableBlocking)
    (subscribe-collectors))
It works as expected, and we see the following results on the console:
{:onNext (do (await-for 1000 onNextCollector) @onNextCollector),
 :onError @onErrorCollector,
 :onCompleted (deref onCompletedCollector 1000 false)}
~~>
{:onNext
["SynchedValue_0"
"SynchedValue_1"
"SynchedValue_2"
"SynchedValue_3"
"SynchedValue_4"
"SynchedValue_5"
"SynchedValue_6"
"SynchedValue_7"
"SynchedValue_8"
"SynchedValue_9"
"SynchedValue_10"
"SynchedValue_11"
"SynchedValue_12"
"SynchedValue_13"
"SynchedValue_14"
"SynchedValue_15"
"SynchedValue_16"
"SynchedValue_17"
"SynchedValue_18"
"SynchedValue_19"
"SynchedValue_20"
"SynchedValue_21"
"SynchedValue_22"
"SynchedValue_23"
"SynchedValue_24"],
:onError nil,
:onCompleted true}
----------------
Here is an asynchronous observable that does exactly the same thing, only on a future's thread:
(defn- customObservableNonBlocking []
  (Observable/create
   (fn [observer] ; This is the 'subscribe' method
     (let [f (future
               ;; On another thread, send 25 strings:
               (doseq [x (range 25)]
                 (-> observer (.onNext (str "AsynchValue_" x))))
               ; After sending all values, complete the sequence:
               (-> observer .onCompleted))]
       ; Return a disposable (unsubscribe) that cancels the future:
       (Subscriptions/create #(future-cancel f))))))
;;; For unknown reasons, the following does not produce all 25 events:
(-> (customObservableNonBlocking)
    (subscribe-collectors))
But, surprise, here is what we see on the console: true for onCompleted, implying that the promise DID NOT time out, but only some of the async messages. The actual number of messages we see varies from run to run, implying that there is some concurrency phenomenon at play. Clues appreciated.
----------------
{:onNext (do (await-for 1000 onNextCollector) @onNextCollector),
 :onError @onErrorCollector,
 :onCompleted (deref onCompletedCollector 1000 false)}
~~>
{:onNext
["AsynchValue_0"
"AsynchValue_1"
"AsynchValue_2"
"AsynchValue_3"
"AsynchValue_4"
"AsynchValue_5"
"AsynchValue_6"],
:onError nil,
:onCompleted true}
----------------

await-for on an agent "Blocks the current thread until all actions dispatched thus far (from this thread or agent) to the agents have occurred." That means another thread can still send messages to the agent after your await returns, and that is what is happening in your case. After your await on the agent is over and you have deref'd its value into the :onNext key of the map, you then wait on the onCompleted promise. That wait succeeds (hence true), but in the meantime more messages were dispatched to the agent and collected into its vector, too late to appear in the snapshot you already took.
You can solve this by making :onCompleted the first key in the map, which basically means: wait for the completion first, and only then await and deref the agent. By that time no more send calls on the agent can happen, because onCompleted has already been received.
{:onCompleted (deref onCompletedCollector 1000 false)
 :onNext (do (await-for 1000 onNextCollector)
             @onNextCollector)
 :onError @onErrorCollector
 }
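In context, report-collectors then reads roughly like this (a sketch of the original function with only the ordering changed):

(report-collectors [ ]
  (pdump
   ;; Wait (with timeout) for onCompleted first; once it has been
   ;; delivered, the observable will dispatch no further sends:
   {:onCompleted (deref onCompletedCollector 1000 false)
    ;; Now drain whatever has already been dispatched to the agent:
    :onNext (do (await-for 1000 onNextCollector)
                @onNextCollector)
    ;; If we ever saw an error, here it is:
    :onError @onErrorCollector}))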

Related

Clojure: Polling database periodically - core.async w/ timeout channel VS vanilla recursive Thread w/ sleep?

I have a Ring-based server that keeps application state in an atom, periodically refreshed from the database: every 10 seconds for frequently changing info and every 60 seconds for the rest.
(defn set-world-update-interval
  [f time-in-ms]
  (let [stop (async/chan)]
    (async/go-loop []
      (async/alt!
        (async/timeout time-in-ms) (do (async/<! (async/thread (f)))
                                       (recur))
        stop :stop))
    stop))

(mount/defstate world-listener
  :start (set-world-update-interval #(do (println "Checking data in db") (reset! world-atom (fetch-world-data))) 10000)
  :stop (async/close! world-listener))
It works pretty well. RAM usage is pretty stable. But I'm wondering if this is an improper use of core.async?
Perhaps it should be a regular Thread instead, like this?
(doto (Thread. (fn []
                 (loop []
                   (Thread/sleep 1000)
                   (println "Checking data in db")
                   (reset! world-atom (fetch-world-data))
                   (recur))))
  (.setUncaughtExceptionHandler
   (reify Thread$UncaughtExceptionHandler
     (uncaughtException [this thread exception]
       (println "Cleaning up!"))))
  (.start))
While there's nothing wrong with your core.async implementation of this pattern, I'd suggest using a java.util.concurrent.ScheduledExecutorService for this. It gives you precise control over the thread pool and the scheduling.
Try something like this:
(ns your-app.world
  (:require [clojure.tools.logging :as log]
            [mount.core :as mount])
  (:import
   (java.util.concurrent Executors ScheduledExecutorService ThreadFactory TimeUnit)))

(defn ^ThreadFactory create-thread-factory
  [thread-name-prefix]
  (let [thread-number (atom 0)]
    (reify ThreadFactory
      (newThread [_ runnable]
        (Thread. runnable (str thread-name-prefix "-" (swap! thread-number inc)))))))

(defn ^ScheduledExecutorService create-single-thread-scheduled-executor
  [thread-name-prefix]
  (let [thread-factory (create-thread-factory thread-name-prefix)]
    (Executors/newSingleThreadScheduledExecutor thread-factory)))

(defn schedule
  [executor runnable interval unit]
  (.scheduleWithFixedDelay executor runnable 0 interval unit))

(defn shutdown-executor
  "Industrial-strength executor shutdown, modify/simplify according to need."
  [^ScheduledExecutorService executor]
  (if (.isShutdown executor)
    (log/info "Executor already shut down")
    (do
      (log/info "Shutting down executor")
      (.shutdown executor) ;; Disable new tasks from being scheduled
      (try
        ;; Wait a while for currently running tasks to finish
        (if-not (.awaitTermination executor 10 TimeUnit/SECONDS)
          (do
            (.shutdownNow executor) ;; Cancel currently running tasks
            (log/info "Still waiting to shut down executor. Sending interrupt to tasks.")
            ;; Wait a while for tasks to respond to being cancelled
            (when-not (.awaitTermination executor 10 TimeUnit/SECONDS)
              (throw (ex-info "Executor could not be shut down" {}))))
          (log/info "Executor shutdown completed"))
        (catch InterruptedException _
          (log/info "Interrupted while shutting down. Sending interrupt to tasks.")
          ;; Re-cancel if current thread also interrupted
          (.shutdownNow executor)
          ;; Preserve interrupt status
          (.interrupt (Thread/currentThread)))))))

(defn world-updating-fn
  []
  (log/info "Updating world atom")
  ;; Do your thing here
  )

(mount/defstate world-listener
  :start (doto (create-single-thread-scheduled-executor "world-listener")
           (schedule world-updating-fn 10 TimeUnit/MINUTES))
  :stop (shutdown-executor world-listener))
It seems a little silly to me to use go-loop to create a goroutine that parks on a timeout, when all the work you actually want to do is IO-intensive, and therefore (rightly) done in a separate thread. The result of this is that you're going through threads every cycle. These threads are pooled by core.async, so you're not doing the expensive work of creating new threads from nothing, but there's still some overhead to getting them out of the pool and interacting with core.async for the timeout. I'd just keep it simple by leaving core.async out of this operation.
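If you do drop core.async for this, a minimal sketch of a stoppable plain-Thread poller (start-world-poller is an illustrative name, not from the original code; it reuses world-atom and fetch-world-data from the question and stops via interrupt):

(defn start-world-poller
  [f interval-ms]
  (doto (Thread.
         (fn []
           (try
             (loop []
               (Thread/sleep interval-ms)
               (f)
               (recur))
             ;; an interrupt during sleep ends the loop cleanly
             (catch InterruptedException _
               (println "Poller stopped")))))
    (.setDaemon true)
    (.start)))

(mount/defstate world-listener
  :start (start-world-poller
          #(do (println "Checking data in db")
               (reset! world-atom (fetch-world-data)))
          10000)
  :stop (.interrupt world-listener))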

Proper way to ensure clj-http's connection manager is closed after all requests are done

I have a code that is a combination of clj-http, core.async facilities and an atom. It creates some threads to fetch and parse a bunch of pages:
(defn fetch-page
  ([url] (fetch-page url nil))
  ([url conn-manager]
   (-> (http.client/get url {:connection-manager conn-manager})
       :body hickory/parse hickory/as-hickory)))
(defn- create-worker
  [url-chan result conn-manager]
  (async/thread
    (loop [url (async/<!! url-chan)]
      (when url
        (swap! result assoc url (fetch-page url conn-manager))
        (recur (async/<!! url-chan))))))
(defn fetch-pages
  [urls]
  (let [url-chan (async/to-chan urls)
        pages (atom (reduce (fn [m u] (assoc m u nil)) {} urls))
        conn-manager (http.conn-mgr/make-reusable-conn-manager {})
        workers (mapv (fn [_] (create-worker url-chan pages conn-manager))
                      (range n-cpus))]
    ; wait for workers to finish and shut conn-manager down
    (dotimes [_ n-cpus] (async/alts!! workers))
    (http.conn-mgr/shutdown-manager conn-manager)
    (mapv #(get @pages %) urls)))
The idea is to use multiple threads to reduce the time to fetch and parse the pages, but I'd like not to overload the server by sending a lot of requests at once - that is why a connection manager is used. I don't know if my approach is correct; suggestions are welcome. Currently the problem is that the last requests fail because the connection manager is shut down before they terminate: Exception in thread "async-thread-macro-15" java.lang.IllegalStateException: Connection pool shut down.
The main questions: how do I close the connection manager at the right moment (and why does my current code fail to do so)? The side quest: is my approach right? If not, what could I do to fetch and parse multiple pages at once without overloading the server?
Thanks!
The problem is that async/alts!! returns on the first result (and will keep doing so, since workers never changes). I think using async/merge to build a channel and then repeatedly reading off of it should work.
(defn fetch-pages
  [urls]
  (let [url-chan (async/to-chan urls)
        pages (atom (reduce (fn [m u] (assoc m u nil)) {} urls))
        conn-manager (http.conn-mgr/make-reusable-conn-manager {})
        workers (mapv (fn [_] (create-worker url-chan pages conn-manager))
                      (range n-cpus))
        all-workers (async/merge workers)]
    ; wait for workers to finish and shut conn-manager down
    (dotimes [_ n-cpus] (async/<!! all-workers))
    (http.conn-mgr/shutdown-manager conn-manager)
    (mapv #(get @pages %) urls)))
Alternatively, you could recur and keep shrinking workers instead so that you're only waiting on previously unfinished workers.
(defn fetch-pages
  [urls]
  (let [url-chan (async/to-chan urls)
        pages (atom (reduce (fn [m u] (assoc m u nil)) {} urls))
        conn-manager (http.conn-mgr/make-reusable-conn-manager {})
        workers (mapv (fn [_] (create-worker url-chan pages conn-manager))
                      (range n-cpus))]
    ; wait for workers to finish and shut conn-manager down
    (loop [workers workers]
      (when (seq workers)
        (let [[_ finished-worker] (async/alts!! workers)]
          (recur (filterv #(not= finished-worker %) workers)))))
    (http.conn-mgr/shutdown-manager conn-manager)
    (mapv #(get @pages %) urls)))
I believe Alejandro is correct about the reason for your error, and it is logical: the exception indicates that you shut down the connection manager before all requests had completed, so it's likely that not all the workers had finished when you shut it down.
Another solution I'll propose stems from the fact that you aren't actually doing anything in your create-worker thread that requires it to be a channel, which is implicitly created by async/thread. So, you can replace it with a future, like so:
(defn- create-worker
  [url-chan result conn-manager]
  (future
    (loop [url (a/<!! url-chan)]
      (when url
        (swap! result assoc url (fetch-page url conn-manager))
        (recur (a/<!! url-chan))))))
And in your fetch-pages function, "join" by derefing:
(doseq [worker workers]
  @worker) ; alternatively, use deref to specify timeout
This eliminates a good deal of core.async interference in what is not a core.async issue to begin with. This of course depends on you keeping your method of collecting the data as-is, that is, using swap! on an atom to keep track of page data. If you were to send the result of fetch-page out onto a return channel, or something similar, then you'd want to keep your current thread approach.
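Putting the two pieces together, fetch-pages might then look roughly like this (a sketch with the same bindings as the original; only the join changes):

(defn fetch-pages
  [urls]
  (let [url-chan (async/to-chan urls)
        pages (atom (reduce (fn [m u] (assoc m u nil)) {} urls))
        conn-manager (http.conn-mgr/make-reusable-conn-manager {})
        workers (mapv (fn [_] (create-worker url-chan pages conn-manager))
                      (range n-cpus))]
    ;; join: block until every worker future has finished
    (doseq [worker workers]
      @worker)
    ;; only now is it safe to shut the connection manager down
    (http.conn-mgr/shutdown-manager conn-manager)
    (mapv #(get @pages %) urls)))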
Regarding your concern about overloading the server -- you have not yet defined what it means to "overload" the server. There are two dimensions of this: one is the rate of requests (number of requests per second, for example), and the other is the number of concurrent requests. Your current app has n worker threads, and that is the effective concurrency (along with the settings in the connection manager). But this does nothing to address the rate of requests per second.
This is a little more complicated than it might seem, though it is possible. You have to consider the total of all requests done by all threads per unit of time, and managing that is not something to tackle in one answer here. I suggest you do some research about throttling and rate limiting, and give it a go, and then go from there with questions.
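As a crude starting point for the rate dimension, each worker can simply pause between requests. A sketch on top of the future-based worker above; request-delay-ms and its 500 ms value are illustrative, not a recommendation:

(def request-delay-ms 500)

(defn- create-worker
  [url-chan result conn-manager]
  (future
    (loop [url (a/<!! url-chan)]
      (when url
        (swap! result assoc url (fetch-page url conn-manager))
        ;; each worker pauses before taking its next URL, so the total rate
        ;; is bounded by roughly (n-cpus * 1000 / request-delay-ms) requests/sec
        (Thread/sleep request-delay-ms)
        (recur (a/<!! url-chan))))))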

Ring: is there a way to manage http requests from the same client ip asynchronously?

so that the following request will be put into a queue and won't be processed until the response to its predecessor has been sent?
I am trying to write to the database and then, quickly after, allow the user to query and update his/her recently inserted document.
You should take a look at core.async and cljs-http.
You need to create a confirmation channel with (chan) and write your database-write function with a callback that puts a signal onto that channel (using put!, since >! can only be used directly inside a go block's body).
(defn database-write [data db-params callback]
  ;Do stuff
  (callback status-of-the-computation))

(def confirmation-chan (chan))

;First operation
(go
  (database-write ;data ;db-params
   (fn [x] (put! confirmation-chan x))))

;Second operation
(go
  (when (<! confirmation-chan)
    (http/post "http://example.com" {:form-params {:key1 [1 2 3] :key2 "value2"}})))
The second operation won't start until the first one completes.

messages publish before subscribe in core.async

In the following example I can see that published messages arrive on the subscribed channel, even though they are published before the subscription is made.
(let [in (async/chan)
      out (async/chan)
      pub (async/pub in :key)]
  (async/go
    (>! in {:id 1 :key :k1})
    (>! in {:id 2 :key :k1})
    (>! in {:id 3 :key :k1}))
  (async/sub pub :k1 out)
  (async/go-loop []
    (println (<! out))
    (recur)))
Is this expected behavior? As far as I can see in the documentation, it clearly states:
Items received when there are no matching subs get dropped.
I get same results in both Clojure and ClojureScript.
Added:
with mult/tap I see similar behavior
You don't know that the messages are published before the subscription is made. Because go is asynchronous, it's very possible that the subscription happens before the first message has been put into the channel. It's a race condition, really.
Try putting a (Thread/sleep [some value]) before the subscription and see what happens.
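For example, on the JVM (a sketch; in ClojureScript you would need a different delay mechanism, since Thread/sleep is JVM-only):

(let [in (async/chan)
      out (async/chan)
      pub (async/pub in :key)]
  (async/go
    (>! in {:id 1 :key :k1})
    (>! in {:id 2 :key :k1})
    (>! in {:id 3 :key :k1}))
  ;; give the go block a head start so the puts really happen first
  (Thread/sleep 100)
  (async/sub pub :k1 out)
  (async/go-loop []
    (println (<! out))
    (recur)))

With the sleep in place you should see nothing printed: the pub drops items for topics that have no matching subs, exactly as the documentation says.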

Catch exception from async process in Clojure?

Assume I'm sending an email, asynchronously, and want my program to continue its execution. So far, I've been doing this with futures, but unfortunately, when the email fails to send, no exception is raised.
I understand that dereferencing the future will raise an ExecutionException, but derefing would defeat the point.
Is there a better way to "fire and go", without losing exception information?
I'd go with agents for this, using handler functions to handle exceptions thrown by agent actions:
(agent initial-state :error-handler handler-fn)
See (doc agent), (doc set-error-handler!), (doc set-error-mode!) for details. initial-state here might simply be nil, or perhaps a structure holding some logging data.
To make this convenient, you'd want to have an email function usable with send (send-off, send-via):
(defn email [agent-state message] ...)
If the main thread needs to be notified that something is amiss, it will need to pay attention to communications over some channel. (Java queues are one possibility, the channels of core.async are another.) The handler function can then push messages over that channel.
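For instance, a minimal sketch along those lines, where send-email! stands in for whatever actually sends the mail (it is hypothetical and may throw):

(defn send-email! [message]
  ;; hypothetical: do the real SMTP/API call here; may throw on failure
  )

(defn email-error-handler [agt ex]
  ;; called on the agent's thread when an action throws
  (println "email failed:" (.getMessage ex)))

(def email-agent (agent [] :error-handler email-error-handler))

(defn email
  "Agent action: sends the message and logs it in the agent's state."
  [agent-state message]
  (send-email! message)
  (conj agent-state message))

;; fire and go:
(send-off email-agent email {:to "someone@example.com" :body "hello"})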
Without trying...
(def foo (future (Thread/sleep 5000)
                 (throw (Exception. "bananas!"))
                 (Thread/sleep 5000)))
(future-done? foo)
;=> false
5 seconds later...
(future-done? foo)
;=> true
@foo
;=> Exception bananas! user/fn...
With trying...
(def bar (future (try (Thread/sleep 5000)
                      (throw (Exception. "bananas!"))
                      (Thread/sleep 5000)
                      (catch Exception e (println "Oh, " (.getMessage e))))))
(future-done? bar)
;=> false
5 seconds later...
;=>Oh, bananas!
@bar
;=> nil