What's the meaning of :key parameter in add-watch function - clojure

My problem is that, even with the docs and examples provided, I can't understand the meaning of the :key parameter or its possible values.
This is the official doc page of the function I'm referring to:
http://clojuredocs.org/clojure_core/clojure.core/add-watch
add-watch clojure.core
(add-watch reference key fn)
Adds a watch function to an agent/atom/var/ref reference. The watch fn
must be a fn of 4 args: a key, the reference, its old-state, its
new-state. Whenever the reference's state might have been changed, any
registered watches will have their functions called. The watch fn will
be called synchronously, on the agent's thread if an agent, before any
pending sends if agent or ref. Note that an atom's or ref's state may
have changed again prior to the fn call, so use old/new-state rather
than derefing the reference. Note also that watch fns may be called
from multiple threads simultaneously. Var watchers are triggered only
by root binding changes, not thread-local set!s. Keys must be unique
per reference, and can be used to remove the watch with remove-watch,
but are otherwise considered opaque by the watch mechanism.
Thanks

It's basically just an identifier that you can use in calling code to identify the watch, in case you have more than one watch per reference. It should mean something to your application code; Clojure simply passes it through.
For instance:
user> (def a (atom 0))
#'user/a
user> (add-watch a
                 :count-to-3
                 (fn [k r old-state new-state]
                   (println "changed from" old-state "to" new-state)
                   (when (>= new-state 3)
                     (remove-watch a :count-to-3))))
#<Atom@3287a10: 0>
user> (dotimes [_ 5] (swap! a inc))
changed from 0 to 1
changed from 1 to 2
changed from 2 to 3
nil
user> @a
5

The answer's right there:
Keys must be unique per reference, and can be used to remove
the watch with remove-watch, but are otherwise considered opaque by
the watch mechanism.
In other words, the actual watch mechanism doesn't care what you set the key to (as long as it's unique among the handlers set on the given ref), but you'll need to hang on to it if you ever want to call remove-watch to get rid of your handler.
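To make the role of the key concrete, here is a minimal sketch. The names counter, events, :logger and :audit are made up for illustration. Two watches sit on the same atom; the key is what each watch fn receives as its first argument, and it is what remove-watch uses to pick one of them out.

```clojure
(def counter (atom 0))
(def events (atom #{}))   ; collects the [key old new] triples seen by the watches

;; Two independent watches on the same reference, told apart only by their key.
(add-watch counter :logger
           (fn [k _ref old-state new-state]
             (swap! events conj [k old-state new-state])))
(add-watch counter :audit
           (fn [k _ref old-state new-state]
             (swap! events conj [k old-state new-state])))

(swap! counter inc)             ; both watches fire
(remove-watch counter :audit)   ; only the :audit watch is removed
(swap! counter inc)             ; now only :logger fires

@events
;; => #{[:logger 0 1] [:audit 0 1] [:logger 1 2]}
```

Any value can serve as a key (keyword, string, number). Adding a watch under a key that is already registered on that reference replaces the earlier watch.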

Related

Streaming data to the caller in JVM

I have a function which fetches data periodically and eventually stops. It has to return that data to its caller in one of two ways:
1. As and when it arrives
2. All in one shot
The 2nd one is easy to implement: you block the caller, fetch all the data, and then send it in one shot.
But I want to implement the 1st one (and I want to avoid callbacks). Are streams the right thing to use here? If so, how? If not, how do I return something that the caller can query for data and that signals when there is no more data?
Note: I am on the JVM ecosystem, Clojure to be specific. I have had a look at the Clojure library core.async, which solves this kind of problem with channels. But I was wondering whether there is another way, which would probably look like this (assuming streams can be used).
Java snippet
//Function which will periodically fetch MyData until there is no data
public Stream<MyData> myFunction() {
...
}
myFunction().filter(myData -> myData.text.equals("foo"))
Maybe you can just use a seq, which is lazy by default (like Stream), so the caller can decide when to pull the data in. When there is no more data, myFunction can simply end the sequence. Inside myFunction you can also encapsulate optimisations, e.g. fetching data in batches to minimise round trips, or fetching data periodically as in your original requirement.
Here is one naive implementation:
(defn my-function []
  (let [batch 100]
    (->> (range)
         (map #(let [from (* batch %)
                     to   (+ from batch)]
                 (db-get from to)))
         ;; take while we have data from db-get
         (take-while identity)
         ;; returns as one single seq/Stream
         (apply concat))))
;; use it as a normal seq/Stream
(->> (my-function)
     (filter odd?))
where db-get would be something like:
(defn db-get [from to]
  ;; return first 1000 records only, i.e. returns nil to signal completion
  (when (< from 1000)
    ;; returns a range of records
    (range from to)))
You might want to check https://github.com/ReactiveX/RxJava and https://github.com/ReactiveX/RxClojure (seems no longer maintained?)
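As a variation on the same idea, here is a sketch that builds the sequence with lazy-seq directly, so the next batch is requested only when the caller consumes that far. fetch-batch is a hypothetical stand-in for the real data source and simply pretends there are 10 records in total.

```clojure
;; Hypothetical data source: returns a batch of records starting at offset,
;; or nil once there is nothing left (here: 10 records in total).
(defn fetch-batch [offset]
  (when (< offset 10)
    (vec (range offset (min 10 (+ offset 4))))))

(defn record-stream
  "Lazy seq of all records from offset on; ends when fetch-batch returns nothing."
  [offset]
  (lazy-seq
    (when-let [batch (seq (fetch-batch offset))]
      (concat batch (record-stream (+ offset (count batch)))))))

(take 3 (record-stream 0))
;; => (0 1 2)

(filter odd? (record-stream 0))
;; => (1 3 5 7 9)
```

The caller treats the result like any other seq and never sees an explicit end-of-data signal; the sequence just ends.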

Pedestal component not updating itself after start method

I have the code below for my Pedestal component. When Stuart Sierra's component library starts my system-map, the start method implemented in the Pedestal defrecord gets called and returns an updated version of my component with :pedestal-server associated. Shouldn't the lifecycle manager propagate the updated component so that the stop method can use it? Whenever I try to stop the server by calling (component/stop (system)) in the REPL, nothing happens because the :pedestal-server key is set to nil.
(defrecord Pedestal [service-map pedestal-server]
  component/Lifecycle
  (start [this]
    (if pedestal-server
      this
      (assoc this :pedestal-server
             (-> service-map
                 http/create-server
                 http/start))))
  (stop [this]
    (when pedestal-server
      (http/stop pedestal-server))
    (assoc this :pedestal-server nil)))
(defn new-pedestal []
  (map->Pedestal {}))
Note that calling com.stuartsierra.component/start on a component returns a started copy of the component; it does not modify the component itself. Likewise, calling com.stuartsierra.component/stop returns a stopped copy of the component.
I think the value of the :pedestal-server key is nil because you did not store the value returned by the start call, and then called stop on the original (unstarted) component.
You need to keep the state of the application in some storage, like an atom or a var. Then you can update that storage with start and stop.
For example:
;; first we create a new component and store it in system.
(def system (new-pedestal))
;; this function starts the state and saves it:
(defn start-pedestal! [] (alter-var-root #'system component/start))
;; this function stops the running state:
(defn stop-pedestal! [] (alter-var-root #'system component/stop))
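The underlying point (start returns a new value and leaves the original alone) can be seen without any dependencies. The sketch below uses a locally defined Lifecycle protocol and a FakeServer record as stand-ins; they are not the component library's own definitions, and it holds the system in an atom rather than a var.

```clojure
;; Stand-in for component/Lifecycle so the sketch runs without dependencies.
(defprotocol Lifecycle
  (start [this])
  (stop [this]))

(defrecord FakeServer [handle]
  Lifecycle
  (start [this] (assoc this :handle :running))
  (stop [this] (assoc this :handle nil)))

(def unstarted (->FakeServer nil))
(def started (start unstarted))   ; start returns a NEW record

(:handle unstarted) ;; => nil (the original is untouched)
(:handle started)   ;; => :running

;; Keeping the current value in an atom, so stop sees what start produced.
(def system (atom (->FakeServer nil)))
(swap! system start)
(:handle @system)   ;; => :running
(swap! system stop)
(:handle @system)   ;; => nil
```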

Idiomatic error/exception handling with threading macros

I'm fetching thousands of entities from an API one at a time using http requests. As the next step in the pipeline I want to shovel all of them into a database.
(->> ids
(pmap fetch-entity)
(pmap store-entity)
(doall))
fetch-entity expects a String id and tries to retrieve an entity using an http request and either returns a Map or throws an exception (e.g. because of a timeout).
store-entity expects a Map and tries to store it in a database. It possibly throws an exception (e.g. if the Map doesn't match the database schema or if it didn't receive a Map at all).
Inelegant Error Handling
My first "solution" was to write wrapper functions fetch-entity' and store-entity' to catch exceptions of their respective original functions.
fetch-entity' returns its input on failure, basically passing along a String id if the http request failed. This ensures that the whole pipeline keeps on trucking.
store-entity' checks the type of its argument. If the argument is a Map (fetch entity was successful and returned a Map) it attempts to store it in the database.
If the attempt of storing to the database throws an exception or if store-entity' got passed a String (id) instead of a Map it will conj to an external Vector of error_ids.
This way I can later use error_ids to figure out how often there was a failure and which ids were affected.
It doesn't feel like the above is a sensible way to achieve what I'm trying to do. For example the way I wrote store-entity' complects the function with the previous pipeline step (fetch-entity') because it behaves differently based on whether the previous pipeline step was successful or not.
Also having store-entity' be aware of an external Vector called error_ids does not feel right at all.
Is there an idiomatic way to handle situations like this, where a pipeline has multiple steps, some of which can throw exceptions (e.g. because they do I/O), where I can't easily use predicates to make sure each function behaves predictably, and where I don't want to interrupt the pipeline but only check afterwards which cases went wrong?
It is possible to use a type of Try monad, for example from the cats library:
It represents a computation that may either result in an exception or return a successfully computed value. Is very similar to the Either monad, but is semantically different. It consists of two types: Success and Failure. The Success type is a simple wrapper, like Right of the Either monad. But the Failure type is slightly different from Left, because it always wraps an instance of Throwable (or any value in cljs since you can throw arbitrary values in the JavaScript host). (...) It is an analogue of the try-catch block: it replaces try-catch’s stack-based error handling with heap-based error handling. Instead of having an exception thrown and having to deal with it immediately in the same thread, it disconnects the error handling and recovery.
Heap-based error-handling is what you want.
Below I made an example of fetch-entity and store-entity. I made fetch-entity throw an ExceptionInfo on the first id (1), and store-entity throws an ArithmeticException ("Divide by zero") on the second id (0).
(ns your-project.core
  (:require [cats.core :as cats]
            [cats.monad.exception :as exc]))
(def ids [1 0 2]) ;; `fetch-entity` throws on 1, `store-entity` on 0, 2 works
(defn fetch-entity
  "Throws an exception when the id is 1..."
  [id]
  (if (= id 1)
    (throw (ex-info "id is 1, help!" {:id id}))
    id))
(defn store-entity
  "Unfortunately this function still needs to be aware that it receives a Try.
  It throws an `ArithmeticException` (divide by zero) when the id is 0"
  [id-try]
  (if (exc/success? id-try)                   ; was the previous step a success?
    (exc/try-on (/ 1 (exc/extract id-try)))   ; if so: extract, apply fn, and rewrap
    id-try))                                  ; else return original for later processing
(def results
  (->> ids
       (pmap #(exc/try-on (fetch-entity %)))
       (pmap store-entity)))
Now you can filter results on successes or failures with success? or failure? respectively, and retrieve the values via cats/extract:
(def successful-results
  (->> results
       (filter exc/success?)
       (mapv cats/extract)))
successful-results ;; => [1/2]
(def error-messages
  (->> results
       (filter exc/failure?)
       (mapv cats/extract) ; gets exceptions without raising them
       (mapv #(.getMessage %))))
error-messages ;; => ["id is 1, help!" "Divide by zero"]
Note that if you want to only loop over the errors or successful-results once you can use a transducer as follows:
(transduce (comp
            (filter exc/success?)
            (map cats/extract))
           conj
           results)
;; => [1/2]
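If pulling in a library is not an option, the same heap-based idea can be sketched with plain maps. This is not the cats API: attempt, then, fetch and store are hypothetical names, and a result is simply {:ok value} or {:error exception}. The sketch uses map so it runs sequentially; pmap could be substituted in the pipeline.

```clojure
(defn attempt
  "Calls (f x); returns {:ok value} or {:error exception} instead of throwing."
  [f x]
  (try
    {:ok (f x)}
    (catch Exception e
      {:error e})))

(defn then
  "Applies f to a successful result; passes a failed result through untouched."
  [result f]
  (if (contains? result :ok)
    (attempt f (:ok result))
    result))

;; Hypothetical stand-ins for the two pipeline steps.
(defn fetch [id]
  (if (= id 1)
    (throw (ex-info "id is 1, help!" {:id id}))
    id))
(defn store [id]
  (/ 1 id))   ; fails with "Divide by zero" when id is 0

(def outcomes
  (->> [1 0 2]
       (map #(attempt fetch %))
       (map #(then % store))
       (doall)))

(keep :ok outcomes)
;; => (1/2)

(map #(.getMessage (:error %)) (filter :error outcomes))
;; => ("id is 1, help!" "Divide by zero")
```

The second step no longer needs to know about failures itself; then handles the pass-through, which addresses the coupling between steps that the question describes.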
My first thought is to combine fetch-entity and store-entity into a single operation:
(defn fetch-and-store [id]
  (try
    (store-entity (fetch-entity id))
    (catch ... <log error msg> )))
(doall (pmap fetch-and-store ids))
Would something like this work?
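A runnable sketch of this combined approach might look as follows. The fetch-entity and store-entity definitions here are hypothetical stand-ins that fail on specific ids, and instead of logging, each call returns a small result map so the failed ids can be collected afterwards without an external vector. mapv is used for simplicity; pmap plus doall would fit in the same place.

```clojure
;; Hypothetical stand-ins: fetching "bad-fetch" fails, storing "bad-store" fails.
(defn fetch-entity [id]
  (if (= id "bad-fetch")
    (throw (ex-info "fetch failed" {:id id}))
    {:id id}))

(defn store-entity [entity]
  (if (= (:id entity) "bad-store")
    (throw (ex-info "store failed" entity))
    :stored))

(defn fetch-and-store
  "Runs both steps for one id; returns a result map instead of throwing."
  [id]
  (try
    {:id id :result (store-entity (fetch-entity id))}
    (catch Exception e
      {:id id :error (.getMessage e)})))

(def results
  (mapv fetch-and-store ["a" "bad-fetch" "bad-store" "b"]))

(def error-ids
  (mapv :id (filter :error results)))

error-ids
;; => ["bad-fetch" "bad-store"]
```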

How does the binding function work with core.async?

Is it ok to use binding with core.async? I'm using ClojureScript so core.async is very different.
(def ^:dynamic token "no-token")
(defn call
  [path body]
  (http-post (str host path) (merge {:headers {"X-token" token}} body))) ; returns a core.async channel
(defn user-settings
  [req]
  (call "/api/user/settings" req))
; elsewhere after I've logged in
(let [token (async/<! (api/login {:user "me" :pass "pass"}))]
  (binding
    [token token]
    (user-settings {:all-settings true})))
In ClojureScript [1], binding is basically with-redefs plus an extra check that the Vars involved are marked :dynamic. On the other hand, go blocks get scheduled for execution [2] in chunks (that is, they may be "parked" and later resumed, and interleaving between go blocks is arbitrary). These models don't mesh very well at all.
In short, no, please use explicitly-passed arguments instead.
[1] The details are different in Clojure, but the conclusion remains the same.
[2] Using the fastest mechanism possible, setTimeout with a time of 0 if nothing better is available.
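The mismatch can be illustrated by analogy in plain Clojure, without core.async: work that runs after the binding form has exited no longer sees the bound value. The names *token*, deferred-call and deferred-call-with are made up for this sketch, and the closure only stands in for code that is parked and resumed later.

```clojure
(def ^:dynamic *token* "no-token")

;; Returns a function to be run later, standing in for deferred work.
(defn deferred-call []
  (fn [] (str "X-token: " *token*)))

(def pending
  (binding [*token* "secret"]
    (deferred-call)))

(pending)
;; => "X-token: no-token"  (the binding is gone by the time this runs)

;; Passing the token explicitly does not depend on when the work runs.
(defn deferred-call-with [token]
  (fn [] (str "X-token: " token)))

((deferred-call-with "secret"))
;; => "X-token: secret"
```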

var versus atom for runtime constants

Based on command-line input, I need to set some run-time constants which a number of downstream functions are going to use. The code in those functions may execute in other threads so I am not considering the "declare var and use binding macro" combination. What are the pros/cons of using a var (with alter-var-root) for this versus using an atom? That is,
(declare *dry-run*) ; one of my constants
(defn -main [& args]
  ; fetch command line option
  ;(cli args ...)
  (alter-var-root #'*dry-run* (constantly ...))
  (do-stuff-in-thread-pool))
versus
(def *dry-run* (atom true))
(defn -main [& args]
  ; fetch command line option
  ;(cli args ...)
  (reset! *dry-run* ...)
  (do-stuff-in-thread-pool))
If there is another option besides these two which I should consider, would love to know.
Also, ideally I would've preferred not to provide an initial val to the atom because I want to set defaults elsewhere (with the cli invocation), but I can live with it, especially if using the atom offers advantages compared to the alternative(s).
Write-once values are exactly the use case promises are designed for:
(def dry-run (promise))
(defn -main []
  (deliver dry-run true))
(defn whatever [f]
  (if @dry-run
    ...))
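A short REPL-style sketch of the write-once behaviour, using only clojure.core: the first deliver sets the value, later ones are ignored, and deref with a timeout avoids blocking forever on a promise that was never delivered.

```clojure
(def dry-run (promise))

(realized? dry-run)      ;; => false
(deliver dry-run true)   ; first delivery sets the value
(deliver dry-run false)  ; later deliveries are ignored
(realized? dry-run)      ;; => true
@dry-run                 ;; => true

;; deref with a timeout returns a fallback if nothing was delivered in time.
(deref (promise) 10 :not-set)
;; => :not-set
```

Note that a plain @dry-run blocks until a value has been delivered, so the promise should be delivered before worker threads start reading it.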
AFAIK alter-var-root only guarantees that changes to the var's value are synchronized; it doesn't guarantee safe reading while the change is in progress. An atom, on the other hand, really does change the state of an identity atomically.
If you don't want to provide an initial value you can just set it to nil:
(def *dry-run* (atom nil))
What is wrong with just using a var and alter-var-root? You set the new value in your startup function, before you really kick off the workers, so there is no race in reading. And you can save the @ everywhere you need the value.
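A sketch of that last suggestion, assuming the value is set once before any worker starts. The names dry-run?, configure! and seen are hypothetical, and the options map stands in for parsed command-line arguments.

```clojure
(def dry-run? false)   ; default, replaced once at startup

(defn configure! [opts]
  (alter-var-root #'dry-run? (constantly (boolean (:dry-run opts)))))

(configure! {:dry-run true})   ; done before any worker starts

;; A worker thread started afterwards reads the plain var, no @ needed.
(def seen (promise))
(doto (Thread. #(deliver seen dry-run?))
  (.start)
  (.join))

@seen
;; => true
```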