'Repeatedly' in Core.Async - clojure

Consider the following snippet:
(require '[clojure.core.async :refer :all])

(def my-chan (chan (buffer 10)))

(go (while true
      (>! my-chan (rand))))
This basically provides a buffered channel that always holds up to 10 random numbers. When values are taken from the channel, the buffer is filled back up.
Is there an abstraction for this in core.async? As there are transducers for manipulating the consumption of channels, there might be something for the production of them as well:
For sequences one would go for something like this:
(def my-seq
  (map (fn [_] (rand)) (range)))
or, just:
(def my-seq (repeatedly rand))
Which of course is not buffered, but it might give an idea of what I'm looking for.

Transducers don't manipulate the consumption of channels -- they affect the values, but they don't affect the consumption of the data on the channel.
You seem to be asking for a way to abstract the creation of a channel, and then get values off of it as a sequence. Here are some ideas, though I'm not convinced that core.async really offers anything above normal clojure.core functionality in this case.
Abstraction is done here the way it usually is done -- with functions. This will call f and put its result on the channel. The implication here is of course that f will be side-effecting, and impure, otherwise it would be quite a boring channel to consume from, with every value being identical.
(defn chan-factory
  [f buf]
  (let [c (chan buf)]
    (go-loop []
      (>! c (f))
      (recur))
    c))
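For example, you can pull a few values straight off such a channel with the blocking take <!! (from an ordinary thread, not inside a go block); rand-chan here is just an illustrative name:

(def rand-chan (chan-factory rand 10))

(<!! rand-chan) ;; => some random double
(<!! rand-chan) ;; => another one; the go-loop keeps the buffer topped up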
If you then wanted to create a lazy sequence from this, you could do:
(defn chan-to-seq [c]
  (lazy-seq
    (cons (<!! c) (chan-to-seq c))))
Define your seq:
(def rand-from-chan (chan-to-seq (chan-factory rand 10)))

(take 5 rand-from-chan)
;; => (0.6873518531956767
;;     0.6940302424998631
;;     0.07293052906941855
;;     0.7264083273536271
;;     0.4670275072317531)
However, you can accomplish this same thing by doing:
(def rand-nums (repeatedly rand))
So, while what you're doing is a great thought experiment, it may be more helpful to find some concrete use cases, and then maybe you will receive more specific ideas. Good luck!

Related

Functional alternative to "let"

I find myself writing a lot of clojure in this manner:
(defn my-fun [input]
  (let [result1 (some-complicated-procedure input)
        result2 (some-other-procedure result1)]
    (do-something-with-results result1 result2)))
This let statement seems very... imperative, which I don't like. In principle, I could be writing the same function like this:
(defn my-fun [input]
  (do-something-with-results (some-complicated-procedure input)
                             (some-other-procedure (some-complicated-procedure input))))
The problem with this is that it involves recomputation of some-complicated-procedure, which may be arbitrarily expensive. Also you can imagine that some-complicated-procedure is actually a series of nested function calls, and then I either have to write a whole new function, or risk that changes in the first invocation don't get applied to the second:
E.g. this works, but I have to have an extra shallow, top-level function that makes it hard to do a mental stack trace:
(defn some-complicated-procedure [input] (lots (of (nested (operations input)))))

(defn my-fun [input]
  (do-something-with-results (some-complicated-procedure input)
                             (some-other-procedure (some-complicated-procedure input))))
E.g. this is dangerous because refactoring is hard:
(defn my-fun [input]
  (do-something-with-results (lots (of (nested (operations (mistake input))))) ; oops, made a change here that wasn't applied to the other nested calls
                             (some-other-procedure (lots (of (nested (operations input)))))))
Given these tradeoffs, I feel like I don't have any alternatives to writing long, imperative let statements, but when I do, I can't shake the feeling that I'm not writing idiomatic clojure. Is there a way I can address the computation and code-cleanliness problems raised above and write idiomatic clojure? Are imperative-ish let statements idiomatic?
The kind of let statements you describe might remind you of imperative code, but there is nothing imperative about them. Haskell has similar constructs (let ... in, and where clauses) for binding names to values within expressions, too.
If your situation really needs a bigger hammer, there are some bigger hammers that you can either use or take for inspiration. The following two libraries offer some kind of binding form (akin to let) with a localized memoization of results, so as to perform only the necessary steps and reuse their results if needed again: Plumatic Plumbing, specifically the Graph part; and Zach Tellman's Manifold, whose let-flow form furthermore orchestrates asynchronous steps to wait for the necessary inputs to become available, and to run in parallel when possible. Even if you decide to maintain your present course, their docs make good reading, and the code of Manifold itself is educational.
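To give a feel for the Manifold option, here is a rough sketch (an approximation of the API, not a drop-in recipe) of how let-flow reads for the function from the question, assuming the steps are worth running asynchronously via d/future:

(require '[manifold.deferred :as d])

(defn my-fun [input]
  (d/let-flow [result1 (d/future (some-complicated-procedure input))
               result2 (d/future (some-other-procedure result1))]
    (do-something-with-results result1 result2)))

my-fun now returns a deferred; dereference it (or chain on it) to get the final value. Each step runs once, result1 is reused, and bindings that don't depend on each other are free to run in parallel.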
I recently had this same question when I looked at this code I wrote
(let [user-symbols (map :symbol states)
      duplicates (for [[id freq] (frequencies user-symbols) :when (> freq 1)] id)]
  (do-something-with duplicates))
You'll note that map and for are lazy and will not be executed until do-something-with is executed. It's also possible that not all (or even none) of the states will be mapped or the frequencies calculated; it depends on what do-something-with actually requests of the sequence returned by for. This is very much idiomatic functional programming.
i guess the simplest approach to keep it functional would be to have a pass-through state to accumulate the intermediate results. something like this:
(defn with-state [res-key f state]
  (assoc state res-key (f state)))

user> (with-state :res (comp inc :init) {:init 10})
;;=> {:init 10, :res 11}
so you can move on to something like this:
(->> {:init 100}
     (with-state :inc'd (comp inc :init))
     (with-state :inc-doubled (comp (partial * 2) :inc'd))
     (with-state :inc-doubled-squared (comp #(* % %) :inc-doubled))
     (with-state :summarized (fn [st] (apply + (vals st)))))
;;=> {:init 100,
;;    :inc'd 101,
;;    :inc-doubled 202,
;;    :inc-doubled-squared 40804,
;;    :summarized 41207}
The let form is a perfectly functional construct and can be seen as syntactic sugar for calls to anonymous functions. We can easily write a recursive macro to implement our own version of let:
(defmacro my-let [bindings body]
  (if (empty? bindings)
    body
    `((fn [~(first bindings)]
        (my-let ~(rest (rest bindings)) ~body))
      ~(second bindings))))
Here is an example of calling it:
(my-let [a 3
         b (+ a 1)]
  (* a b))
;; => 12
And here is clojure.walk/macroexpand-all called on the above expression, which reveals how my-let is implemented with anonymous functions:
(clojure.walk/macroexpand-all '(my-let [a 3
                                        b (+ a 1)]
                                 (* a b)))
;; => ((fn* ([a] ((fn* ([b] (* a b))) (+ a 1)))) 3)
Note that the expansion doesn't rely on let and that the bound symbols become parameter names in the anonymous functions.
As others write, let is actually perfectly functional, but at times it can feel imperative. It's better to become fully comfortable with it.
You might, however, want to kick the tires of my little library tl;dr, which lets you write code like, for example:
(compute
  (+ a b c)
  where
  a (f b)
  c (+ 100 b))

Using let style destructuring for def

Is there a reasonable way to have multiple def statements happen with destructuring, the same way that let does it? For example:
(let [[rtgs pcts] (->> (sort-by second row)
                       (apply map vector))]
  .....)
What I want is something like:
(defs [rtgs pcts] (->> (sort-by second row)
                       (apply map vector)))
This comes up a lot in the REPL, notebooks and when debugging. Seriously feels like a missing feature so I'd like guidance on one of:
This exists already and I'm missing it
This is a bad idea because... (variable capture?, un-idiomatic?, Rich said so?)
It's just un-needed and I must be suffering from withdrawals from an evil language. (same as: don't mess up our language with your macros)
A super short experiment gives me something like:
(defmacro def2 [[name1 name2] form]
  `(let [[ret1# ret2#] ~form]
     (do (def ~name1 ret1#)
         (def ~name2 ret2#))))
And this works as in:
(def2 [three five] ((juxt dec inc) 4))
three ;; => 3
five ;; => 5
Of course an "industrial strength" version of that macro might include:
checking that the number of names matches the number of values returned from form
a recursive call to handle more names (can I do that in a macro like this? see the sketch just below)
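A slightly more general sketch in that direction handles any number of sequential names without recursion (still no arity checking; the spec-based answer below generalizes this to arbitrary binding forms):

(defmacro defs [names form]
  `(let [~names ~form]
     ~@(map (fn [n] `(def ~n ~n)) names)))

(defs [three four five] ((juxt dec identity inc) 4))
three ;; => 3
four  ;; => 4
five  ;; => 5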
While I agree with Josh that you probably shouldn't have this running in production, I don't see any harm in having it as a convenience at the repl (in fact I think I'll copy this into my debug-repl kitchen-sink library).
I enjoy writing macros (although they're usually not needed) so I whipped up an implementation. It accepts any binding form, like in let.
(I wrote this specs-first, but if you're on clojure < 1.9.0-alpha17, you can just remove the spec stuff and it'll work the same.)
(ns macro-fun
  (:require
   [clojure.spec.alpha :as s]
   [clojure.core.specs.alpha :as core-specs]))

(s/fdef syms-in-binding
  :args (s/cat :b ::core-specs/binding-form)
  :ret (s/coll-of simple-symbol? :kind vector?))
(defn syms-in-binding
  "Returns a vector of all symbols in a binding form."
  [b]
  (letfn [(step [acc coll]
            (reduce (fn [acc x]
                      (cond (coll? x) (step acc x)
                            (symbol? x) (conj acc x)
                            :else acc))
                    acc, coll))]
    (if (symbol? b) [b] (step [] b))))
(s/fdef defs
  :args (s/cat :binding ::core-specs/binding-form, :body any?))

(defmacro defs
  "Like def, but can take a binding form instead of a symbol to
  destructure the results of the body.
  Doesn't support docstrings or other metadata."
  [binding body]
  `(let [~binding ~body]
     ~@(for [sym (syms-in-binding binding)]
         `(def ~sym ~sym))))
;; Usage
(defs {:keys [foo bar]} {:foo 42 :bar 36})
foo ;=> 42
bar ;=> 36
(defs [a b [c d]] [1 2 [3 4]])
[a b c d] ;=> [1 2 3 4]
(defs baz 42)
baz ;=> 42
About your REPL-driven development comment:
I don't have any experience with Ipython, but I'll give a brief explanation of my REPL workflow and you can maybe comment about any comparisons/contrasts with Ipython.
I never use my repl like a terminal, inputting a command and waiting for a reply. My editor supports (emacs, but any clojure editor should do) putting the cursor at the end of any s-expression and sending that to the repl, "printing" the result after the cursor.
I usually have a comment block in the file where I start working, just typing whatever and evaluating it. Then, when I'm reasonably happy with a result, I pull it out of the "repl-area" and into the "real-code".
(ns stuff.core)

;; Real code is here.
;; I make sure that this part always basically works,
;; ie. doesn't blow up when I evaluate the whole file
(defn foo-fn [x]
  ,,,)

(comment

  ;; Random experiments.
  ;; I usually delete this when I'm done with a coding session,
  ;; but I copy some forms into tests.
  ;; Sometimes I leave it for posterity though,
  ;; if I think it explains something well.

  (def some-data [,,,])

  ;; Trying out foo-fn, maybe copy this into a test when I'm done.
  (foo-fn some-data)

  ;; Half-finished other stuff.
  (defn bar-fn [x] ,,,)

  (keys 42) ; I wonder what happens if...
  )
You can see an example of this in the clojure core source code.
The number of defs that any piece of clojure will have will vary per project, but I'd say that in general, defs are not often the result of some computation, let alone the result of a computation that needs to be destructured. More often defs are the starting point for some later computation that will depend on this value.
Usually functions are better for computing a value; and if the computation is expensive, then you can memoize the function. If you feel you really need this functionality, then by all means, use your macro -- that's one of the selling points of clojure, namely, extensibility! But in general, if you feel you need this construct, consider the possibility that you're relying too much on global state.
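For instance, instead of def-ing the destructured results at the top level, the computation from the question could sit behind a memoized function (rtgs-pcts is just a hypothetical name):

(def rtgs-pcts
  "Memoized: computes the [rtgs pcts] pair once per distinct row."
  (memoize
   (fn [row]
     (->> (sort-by second row)
          (apply map vector)))))

;; at each use site:
;; (let [[rtgs pcts] (rtgs-pcts row)] ...)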
Just to give some real examples, I just referenced my main project at work, which is probably 2K-3K lines of clojure, in about 20 namespaces. We have about 20 defs, most of which are marked private and among them, none are actually computing anything. We have things like:
(def path-prefix "/some-path")
(def zk-conn (atom nil))
(def success? #{200})
(def compile* (clojure.core.memoize/ttl compiler {} ...))
(def ^:private nashorn-factory (NashornScriptEngineFactory.))
(def ^:private read-json (comp json/read-str ... ))
Defining functions (using comp and memoize), enumerations, state via atom -- but no real computation.
So I'd say, based on your bullet points above, this falls somewhere between 2 and 3: it's definitely not a common use case (you're the first person I've ever heard of who wants this, so it's uncommon to me anyway); and the reason it's uncommon is because of what I said above, i.e., it may be a code smell that indicates reliance on too much global state, and hence would not be very idiomatic.
One litmus test I have for much of my code is: if I pull this function out of this namespace and paste it into another, does it still work? Removing dependencies on external vars allows for easier testing and more modular code. Sometimes we need it though, so see what your requirements are and proceed accordingly. Best of luck!

Using doseq and atoms rather than loop/recur

I have been rewriting Land of Lisp's orc-battle game in Clojure. During the process I am using a more functional style. I have come up with two methods for writing part of the higher-level game loop, one involving loop/recur and the other using doseq and an atom. Here are the two functions:
(defn monster-round [player monsters]
  (loop [n 0 p player]
    (if (>= n (count monsters))
      p
      (recur (inc n)
             (if (monster-dead? (nth monsters n))
               p
               (let [r (monster-attack (nth monsters n) p)]
                 (print (:attack r))
                 (:player r)))))))
(defn monster-round-2 [player monsters]
  (let [p (atom player)]
    (doseq [m monsters]
      (if (not (monster-dead? m))
        (let [r (monster-attack m @p)]
          (print (:attack r))
          (reset! p (:player r)))))
    @p))
I like the second method better because the code is more concise and is easier to follow. Is there any reason why the first approach is better? Or am I missing a different way to do this?
is this equivalent? if so, i prefer it - it's compact, clearer than your solutions (imho!), and functional
(defn monster-round [monsters player]
  (if-let [[monster & monsters] monsters]
    (recur monsters
           (if (monster-dead? monster)
             player
             (let [r (monster-attack monster player)]
               (print (:attack r))
               (:player r))))
    player))
(note: i changed the argument order to monster-round so that the recur looked nicer)
more generally, you should not have introduced n in your "functional" version (it's not really really functional if you've got an index...). indexing into a sequence is very, very rarely needed. if you had fought the temptation to do that a little harder, i think you would have written the routine above...
but, after writing that, i thought: "hmmm. that's just iterating over monsters. why can't we use a standard form? it's not a for loop because player changes. so it must be a fold (ie a reduce), which carries the player forwards". and then it was easy to write:
(defn- fight [player monster]
  (if (monster-dead? monster)
    player
    (let [r (monster-attack monster player)]
      (print (:attack r))
      (:player r))))

(defn monster-round [player monsters]
  (reduce fight player monsters))
which (if it does what you want) is the Correct Answer(tm).
(maybe i am not answering the question? i think you missed the better way, as above. in general, you should be able to thread the computation around the data structure, which normally does not require mutation; often you can - and should - use the standard forms like map and reduce because they help document the process for others).

How many threads does Clojure's pmap function spawn for URL-fetching operations?

The documentation on the pmap function leaves me wondering how efficient it would be for something like fetching a collection of XML feeds over the web. I have no idea how many concurrent fetch operations pmap would spawn and what the maximum would be.
If you check the source you see:
> (use 'clojure.repl)
> (source pmap)

(defn pmap
  "Like map, except f is applied in parallel. Semi-lazy in that the
  parallel computation stays ahead of the consumption, but doesn't
  realize the entire result unless required. Only useful for
  computationally intensive functions where the time of f dominates
  the coordination overhead."
  {:added "1.0"}
  ([f coll]
   (let [n (+ 2 (.. Runtime getRuntime availableProcessors))
         rets (map #(future (f %)) coll)
         step (fn step [[x & xs :as vs] fs]
                (lazy-seq
                 (if-let [s (seq fs)]
                   (cons (deref x) (step xs (rest s)))
                   (map deref vs))))]
     (step rets (drop n rets))))
  ([f coll & colls]
   (let [step (fn step [cs]
                (lazy-seq
                 (let [ss (map seq cs)]
                   (when (every? identity ss)
                     (cons (map first ss) (step (map rest ss)))))))]
     (pmap #(apply f %) (step (cons coll colls))))))
The (+ 2 (.. Runtime getRuntime availableProcessors)) is a big clue there. pmap will grab the first (+ 2 processors) pieces of work and run them asynchronously via future. So if you have 2 cores, it's going to launch 4 pieces of work at a time, trying to keep a bit ahead of you but the max should be 2+n.
future ultimately uses the agent I/O thread pool which supports an unbounded number of threads. It will grow as work is thrown at it and shrink if threads are unused.
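In other words, the look-ahead window is just this expression, evaluated once per pmap call:

(+ 2 (.. Runtime getRuntime availableProcessors))
;; => 6 on a 4-core machine, 10 on an 8-core machine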
Building on Alex's excellent answer that explains how pmap works, here's my suggestion for your situation:
(doall
  (map
    #(future (my-web-fetch-function %))
    list-of-xml-feeds-to-fetch))
Rationale:
You want as many pieces of work in-flight as you can, since most will block on network IO.
Future will fire off an asynchronous piece of work for each request, to be handled in a thread pool. You can let Clojure take care of that intelligently.
The doall on the map will force the evaluation of the full sequence (i.e. the launch of all the requests).
Your main thread can start dereferencing the futures right away, and can therefore continue making progress as the individual results come back
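To make that last point concrete, here is a minimal sketch of consuming those futures (my-web-fetch-function and list-of-xml-feeds-to-fetch are the placeholders from the snippet above):

(def fetches
  (doall
    (map #(future (my-web-fetch-function %)) list-of-xml-feeds-to-fetch)))

;; deref blocks only on results that aren't ready yet;
;; the rest have already completed in the background.
(def feed-bodies (mapv deref fetches))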
No time to write a long response, but there's a clojure.contrib http-agent which creates each get/post request as its own agent. So you can fire off a thousand requests and they'll all run in parallel and complete as the results come in.
Looking at the operation of pmap, it seems to run 32 threads at a time no matter how many processors you have. The reason is that map realizes chunked sequences (such as range) 32 elements at a time, so the futures get started in batches of 32. (Sample below.)
(defn samplef [n]
  (println "starting " n)
  (Thread/sleep 10000)
  n)

(def result (pmap samplef (range 0 100)))
; you will wait for 10 seconds and see 32 prints; then, when you take the
; 33rd element, another 32 prints. This means you are running 32 concurrent
; threads at a time; to me this is not perfect
; Regards, Felipe
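If that chunking explanation is right, feeding pmap an input that isn't chunked (a list, unlike range) should bring the batch size back down to roughly two more than the number of processors. A quick, informal way to check:

;; range is chunked and gets realized 32 elements at a time;
;; a list is not, so far fewer "starting" lines should print per batch.
(def result-unchunked (pmap samplef (apply list (range 0 100))))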

Is there a way to control the number of threads used with pmap?

I am just doing some performance testing with clojure using pmap and I would like to be able to control the number of threads being used with pmap. I know when using something like OpenMP one can set the number of threads using omp_set_num_threads(). I was wondering if there would be anything similar in clojure.
Here's code for pmap:
(defn pmap
  "Like map, except f is applied in parallel. Semi-lazy in that the
  parallel computation stays ahead of the consumption, but doesn't
  realize the entire result unless required. Only useful for
  computationally intensive functions where the time of f dominates
  the coordination overhead."
  ([f coll]
   (let [n (+ 2 (.. Runtime getRuntime availableProcessors))
         rets (map #(future (f %)) coll)
         step (fn step [[x & xs :as vs] fs]
                (lazy-seq
                 (if-let [s (seq fs)]
                   (cons (deref x) (step xs (rest s)))
                   (map deref vs))))]
     (step rets (drop n rets))))
As you can see, pmap bases its parallelism on the number of available processors plus two, and there is no parameter to change that. So, no, there's no built-in way to set the number of threads... but you can always write your own pmap that provides such functionality, as sketched below.
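For example, here is a minimal sketch of such a replacement (bounded-pmap is a made-up name; it gives up pmap's semi-lazy behaviour and simply realizes the whole result on a fixed-size pool):

(import '(java.util.concurrent Callable ExecutorService Executors Future))

(defn bounded-pmap
  "Like pmap, but runs f on an explicit fixed pool of n threads.
  Eager: realizes the whole result, in input order."
  [n f coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n)]
    (try
      (->> coll
           (map (fn [x] (.submit pool ^Callable (fn [] (f x)))))
           doall                                  ; submit every task up front
           (map (fn [^Future fut] (.get fut)))
           doall)
      (finally
        (.shutdown pool)))))

;; e.g. (bounded-pmap 8 slow-fetch urls) uses at most 8 worker threads.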