Clojure vector -> vector function using # and % syntax sugar?

Is it possible to code using #, %1, %2 for the below?
(defn fib-step [[a b]]
[b (+ a b)])
(defn fib-seq []
(map first (iterate fib-step [0 1])))
user> (take 20 (fib-seq))
(0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181)
In short, I'd like to know how to write a vector -> vector function using the # and % syntax sugar.
Thanks

You can easily produce a vector using the #() reader form with the -> threading macro. For example, the following two functions are equivalent:
(fn [a b] [b a])
#(-> [%2 %])
However, if you need to do destructuring, as in your case, you're best off just sticking with one of the fn forms with an explicit parameter list. The best you'd get with #() is something like this:
#(-> [(% 1) (+ (% 0) (% 1))])
or
#(-> [(% 1) (apply + %)])
Using the higher-order juxt function is another nice way to create vectors, but unfortunately in this case it doesn't buy you much either:
(def fib-step (juxt second #(apply + %)))
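For what it's worth, this juxt version checks out at the REPL the way I'd expect:
(fib-step [0 1])
;=> [1 1]
(fib-step [1 1])
;=> [1 2]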
I think out of all the options, using fn is still the best fit because of its easy support for destructuring:
(fn [[a b]] [b (+ a b)])

I would suggest making fib-step take two parameters rather than a vector: that makes it clearer that the function needs exactly two values, whereas a single vector parameter suggests it can take any number of values (packed into the vector).
(def fib-step #(-> [%2 (+ %1 %2)]))
(defn fib-seq []
(map first (iterate (partial apply fib-step) [0 1])))
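With those definitions in place, a quick REPL check (output as I'd expect it):
(take 10 (fib-seq))
;=> (0 1 1 2 3 5 8 13 21 34)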

I think using the vector function is clearer than the (-> [...]) "trick":
#(vector (% 1) (apply + %))
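A quick REPL check of this version (result as I'd expect):
(#(vector (% 1) (apply + %)) [0 1])
;=> [1 1]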
Though in this instance, with destructuring, I'd just use a named function or (fn [...] ...) anyway.

Here is the code:
(def step #(-> [(% 1) (+ (% 0) (% 1))]))
(def fib #(map first (iterate step [0 1])))
(println
(take 20 (fib))
)
or
(def step #(-> [(% 1) (+ (% 0) (% 1))]))
(def fib (->> [0 1]
(iterate step)
(map first)))
(println
(->> fib
(take 20))
)

Related

Clojure: map function with updatable state

What is the best way of implementing a map-like function with a state that is updated between applications of the function to each element of the sequence? To illustrate the issue, let's suppose we have the following problem:
I have a vector of numbers. I want a new sequence where each element is multiplied by 2 and then has the number of 10s seen so far (up to and including the current element) added to it. For example, from:
[20 30 40 10 20 10 30]
I want to generate:
[40 60 80 21 41 22 62]
Without the count of 10s to add, the solution can be formulated at a high level of abstraction:
(map #(* 2 %) [20 30 40 10 20 10 30])
Having a count to update forced me to go back to basics, and the solution I came up with is:
(defn my-update-state [x state]
  (if (= x 10) (+ state 1) state))

(defn my-combine-with-state [x state]
  (+ x state))

(defn map-and-update-state [vec fun state update-state combine-with-state]
  (when-not (empty? vec)
    (let [[f & other] vec
          g (fun f)
          new-state (update-state f state)]
      (cons (combine-with-state g new-state)
            (map-and-update-state other fun new-state update-state combine-with-state)))))
(map-and-update-state [20 30 40 50 10 20 10 30 ] #(* 2 %) 0 my-update-state my-combine-with-state )
My question: is this an appropriate/canonical way to solve the problem, or have I overlooked some important concepts/functions?
PS:
The original problem is walking an AST (abstract syntax tree) and generating a new AST while updating a symbol table, so please keep that in mind when proposing a solution to the problem above.
I am not worried about blowing the stack, so replacing the recursion with loop/recur is not my concern here.
Is using global Vars or Refs instead of passing state as an argument a definite no-no?
You can use reduce to accumulate a pair of the number of 10s seen so far and the current vector of results:
(defn map-update [v]
  (letfn [(update [[ntens v] i]
            (let [ntens (+ ntens (if (= 10 i) 1 0))]
              [ntens (conj v (+ ntens (* 2 i)))]))]
    (second (reduce update [0 []] v))))
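Used on the sample input, this should give the requested result:
(map-update [20 30 40 10 20 10 30])
;=> [40 60 80 21 41 22 62]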
To count the number of 10s you can do:
(defn count-10[col]
(reductions + (map #(if (= % 10) 1 0) col)))
Example:
user=> (count-10 [1 2 10 20 30 10 1])
(0 0 1 1 1 2 2)
And then a simple map for the final result:
(map + col col (count-10 col))
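Putting the two pieces together (the wrapper name map-with-tens is mine, purely for illustration):
(defn map-with-tens [col]
  (map + col col (count-10 col)))
(map-with-tens [20 30 40 10 20 10 30])
;=> (40 60 80 21 41 22 62)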
Reduce and reductions are good ways to traverse a sequence while keeping state. If you feel your code is not clear, you can always use explicit recursion with loop/recur or a lazy-seq, like this:
(defn twice-plus-ntens
  ([coll] (twice-plus-ntens coll 0))
  ([coll ntens]
   (lazy-seq
     (when-let [s (seq coll)]
       (let [item (first s)
             new-ntens (if (= 10 item) (inc ntens) ntens)]
         (cons (+ (* 2 item) new-ntens)
               (twice-plus-ntens (rest s) new-ntens)))))))
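A quick check against the example input (output as I'd expect):
(twice-plus-ntens [20 30 40 10 20 10 30])
;=> (40 60 80 21 41 22 62)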
Have a look at map's source code by evaluating this at your REPL:
(source map)
I've skipped chunked optimization and multiple collection support.
You can make it a higher-order function this way
(defn map-update
  ([mf uf coll] (map-update mf uf (uf) coll))
  ([mf uf val coll]
   (lazy-seq
     (when-let [s (seq coll)]
       (let [item (first s)
             new-status (uf item val)]
         (cons (mf item new-status)
               (map-update mf uf new-status (rest s))))))))

(defn update-f
  ([] 0)
  ([item status]
   (if (= item 10) (inc status) status)))

(defn map-f [item status]
  (+ (* 2 item) status))
(map-update map-f update-f in)
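For example, with the question's input vector in place of in, I'd expect:
(map-update map-f update-f [20 30 40 10 20 10 30])
;=> (40 60 80 21 41 22 62)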
The most appropriate way is to use a function with state:
(into []
      (map (let [mem (atom 0)]
             (fn [val]
               (when (== val 10) (swap! mem inc))
               (+ @mem (* val 2)))))
      [20 30 40 10 20 10 30])
Also see the standard memoize function.

How to define a general recurrence function in Clojure

I had an idea for a general function for recurrence relations in Clojure:
(defn recurrence [f inits]
(let [answer (lazy-seq (recurrence f inits))
windows (partition (count inits) 1 answer)]
(concat inits (lazy-seq (map f windows)))))
Then, for example, we can define the Fibonacci sequence as
(def fibs (recurrence (partial apply +) [0 1N]))
This works well enough for small numbers:
(take 10 fibs)
;(0 1N 1N 2N 3N 5N 8N 13N 21N 34N)
But it blows the stack if asked to realise a long sequence:
(first (drop 10000 fibs))
;StackOverflowError ...
Is there any way to overcome this?
The issue here is that you are nesting a call to concat on every iteration, and those concat calls build up a big pile of unevaluated thunks that blows the stack when you finally ask for a value. By using cons and only passing forward the needed count of values (there is still a concat, but not a recursive, stack-blowing one), we get a better-behaved lazy sequence:
user>
(defn recurrence
  [f seed]
  (let [step (apply f seed)
        new-state (concat (rest seed) (list step))]
    (lazy-seq (cons step (recurrence f new-state)))))
#'user/recurrence
user> (def fibs (recurrence +' [0 1]))
#'user/fibs
user> (take 10 fibs)
(1 2 3 5 8 13 21 34 55 89)
user> (first (drop 1000 fibs))
113796925398360272257523782552224175572745930353730513145086634176691092536145985470146129334641866902783673042322088625863396052888690096969577173696370562180400527049497109023054114771394568040040412172632376N
Starting from the accepted answer, two refinements: we want the sequence to start with the seed, and, as the author suggests, we use a queue for efficiency. There's no need for a deque: Clojure's PersistentQueue is all we need.
The adapted recurrence might look like this:
(defn recurrence
  [f seed]
  (let [init-window (into (clojure.lang.PersistentQueue/EMPTY) seed)
        unroll (fn unroll [w]
                 (lazy-seq
                   (cons (peek w)
                         (unroll (-> w
                                     pop
                                     (conj (apply f w)))))))]
    (unroll init-window)))
... and, as before ...
(def fibs (recurrence +' [0 1]))
Then
(take 12 fibs)
;(0 1 1 2 3 5 8 13 21 34 55 89)
and
(first (drop 10002 fibs))
;88083137989997064605355872998857923445691333015376030932812485815888664307789011385238647061572694566755888008658862476758094375234981509702215595106015601812940878487465890539696395631360292400123725490667987980947195761919733084221263262792135552511961663188744083262743015393903228035182529922900769207624088879893951554938584166812233127685528968882435827903110743620870056104022290494963321073406865860606579792362403866826411642270661211435590340090149458419810817251120025713501918959350654895682804718752319215892119222907223279849851227166387954139546662644064653804466345416102543306712688251378793506564112970620367672131344559199027717813404940431009754143637417645359401155245658646088296578097547699141284451819782703782878668237441026255023475279003880007450550868002409533068098127495095667313120369142331519140185017719214501847645741030739351025342932514280625453085775191996236343792432215700850773568257988920265539647922172315902209901079830195949058505943508013044450503826167880993094540503572266189964694973263576375908606977788395730196227274629745722872833622300472769312273603346624292690875697438264265712313123637644491367875538847442013130532147345613099333195400845560466085176375175045485046787815133225349388996334014329318304865656815129208586686515835880811316065788759195646547703631454040090435955879604123186007481842117640574158367996845627012099571008761776991075470991386301988104753915798231741447012236434261594666985397841758348337030914623617101746431922708522824868155612811426016775968762121429282582582088871795463467796927317452368633552346819405423359738696980252707545944266042764236577381721803749442538053900196250284054406347238606575093877669323501452512412179883698552204038865069179867773579705703841178650618818357366165649529547898801198617541432893443650952033983923542592952070864044249738338089778163986683069566736505126466886304227253105034231716761535350441178724210841830855527586882822093246545813120624113290391593897765219320931179697869997243770533719319530526369830529543842405655495229382251039116426750156771132964376N
Another way, based on an idea stolen - I think - from Joy of Clojure, is ...
(defn recurrence
  [f seed]
  (let [init-window (into (clojure.lang.PersistentQueue/EMPTY) seed)
        windows (iterate
                  (fn [w] (-> w, pop, (conj (apply f w))))
                  init-window)]
    (map peek windows)))
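As a quick sanity check, with the same (def fibs (recurrence +' [0 1])) this version should also keep the seed and realise deep elements without blowing the stack:
(take 12 fibs)
;=> (0 1 1 2 3 5 8 13 21 34 55 89)
(first (drop 10000 fibs)) ; a very large number, but no StackOverflowError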

Efficient side-effect-only analogue of Clojure's map function

What if map and doseq had a baby? I'm trying to write a function or macro like Common Lisp's mapc, but in Clojure. This does essentially what map does, but only for side-effects, so it doesn't need to generate a sequence of results, and wouldn't be lazy. I know that one can iterate over a single sequence using doseq, but map can iterate over multiple sequences, applying a function to each element in turn of all of the sequences. I also know that one can wrap map in dorun. (Note: This question has been extensively edited after many comments and a very thorough answer. The original question focused on macros, but those macro issues turned out to be peripheral.)
This is fast (according to criterium):
(defn domap2
[f coll]
(dotimes [i (count coll)]
(f (nth coll i))))
but it only accepts one collection. This accepts arbitrary collections:
(defn domap3
[f & colls]
(dotimes [i (apply min (map count colls))]
(apply f (map #(nth % i) colls))))
but it's very slow by comparison. I could also write a version like the first, but with different parameter cases [f c1 c2], [f c1 c2 c3], etc., but in the end, I'll need a case that handles arbitrary numbers of collections, like the last example, which is simpler anyway. I've tried many other solutions as well.
Since the second example is very much like the first except for the use of apply and the map inside the loop, I suspect that getting rid of them would speed things up a lot. I have tried to do this by writing domap2 as a macro, but the way that the catch-all variable after & is handled keeps tripping me up, as illustrated above.
Other examples (out of 15 or 20 different versions), benchmark code, and times on a Macbook Pro that's a few years old (full source here):
(defn domap1
[f coll]
(doseq [e coll]
(f e)))
(defn domap7
[f coll]
(dorun (map f coll)))
(defn domap18
[f & colls]
(dorun (apply map f colls)))
(defn domap15
[f coll]
(when (seq coll)
(f (first coll))
(recur f (rest coll))))
(defn domap17
[f & colls]
(let [argvecs (apply (partial map vector) colls)] ; seq of ntuples of interleaved vals
(doseq [args argvecs]
(apply f args))))
I'm working on an application that uses core.matrix matrices and vectors, but feel free to substitute your own side-effecting functions below.
(ns tst
(:use criterium.core
[clojure.core.matrix :as mx]))
(def howmany 1000)
(def a-coll (vec (range howmany)))
(def maskvec (zero-vector :vectorz howmany))
(defn unmaskit!
[idx]
(mx/mset! maskvec idx 1.0)) ; sets element idx of maskvec to 1.0
(defn runbench
[domapfn label]
(print (str "\n" label ":\n"))
(bench (def _ (domapfn unmaskit! a-coll))))
Mean execution times according to Criterium, in microseconds:
domap1: 12.317551 [doseq]
domap2: 19.065317 [dotimes]
domap3: 265.983779 [dotimes with apply, map]
domap7: 53.263230 [map with dorun]
domap18: 54.456801 [map with dorun, multiple collections]
domap15: 32.034993 [recur]
domap17: 95.259984 [doseq, multiple collections interleaved using map]
EDIT: It may be that dorun+map is the best way to implement domap for multiple large lazy-sequence arguments, but doseq is still king when it comes to single lazy sequences. Performing the same operation as unmaskit! above, but running the index through (mod idx 1000) and iterating over (range 100000000), doseq is about twice as fast as dorun+map in my tests (i.e. (def domap25 (comp dorun map))).
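For reference, the dorun+map version I mean is just this (the sample collections below are only for illustration):
(def domap25 (comp dorun map))
;; walks the collections in lockstep, purely for side effects
(domap25 println [1 2 3] [:a :b :c])
;; prints "1 :a", "2 :b", "3 :c" and returns nil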
You don't need a macro, and I don't see why a macro would be helpful here.
user> (defn do-map [f & lists] (apply mapv f lists) nil)
#'user/do-map
user> (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
Note that do-map here is eager (thanks to mapv) and runs only for its side effects.
Macros can use varargs lists, as the (useless!) macro version of do-map demonstrates:
user> (defmacro do-map-macro [f & lists] `(do (mapv ~f ~@lists) nil))
#'user/do-map-macro
user> (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
user> (macroexpand-1 '(do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40)))
(do (clojure.core/mapv (comp println +) (range 2 6) (range 8 11) (range 22 40)) nil)
Addendum: addressing the efficiency / garbage-creation concerns.
Note that below I truncate the output of the criterium bench function for conciseness:
(defn do-map-loop
  [f & lists]
  (loop [heads lists]
    (when (every? seq heads)
      (apply f (map first heads))
      (recur (map rest heads)))))
user> (crit/bench (with-out-str (do-map-loop (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 11.367804 µs
...
This looks promising because it doesn't build up a result data structure that we aren't going to use anyway (unlike mapv above). But it turns out to be slower than the previous versions (maybe because of the two map calls on every iteration?).
user> (crit/bench (with-out-str (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 7.427182 µs
...
user> (crit/bench (with-out-str (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 8.355587 µs
...
Since the loop still wasn't faster, let's try a version which specializes on arity, so that we don't need to call map twice on every iteration:
(defn do-map-loop-3
  [f a b c]
  (loop [[a & as] a
         [b & bs] b
         [c & cs] c]
    (when (and a b c)
      (f a b c)
      (recur as bs cs))))
Remarkably, though this is faster, it is still slower than the version that just used mapv:
user> (crit/bench (with-out-str (do-map-loop-3 (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 9.450108 µs
...
Next I wondered if the size of the input was a factor. With larger inputs...
user> (def test-input (repeatedly 3 #(range (rand-int 100) (rand-int 1000))))
#'user/test-input
user> (map count test-input)
(475 531 511)
user> (crit/bench (with-out-str (apply do-map-loop-3 (comp println +) test-input)))
...
Execution time mean : 1.005073 ms
...
user> (crit/bench (with-out-str (apply do-map (comp println +) test-input)))
...
Execution time mean : 756.955238 µs
...
Finally, for completeness, the timing of do-map-loop (which as expected is slightly slower than do-map-loop-3)
user> (crit/bench (with-out-str (apply do-map-loop (comp println +) test-input)))
...
Execution time mean : 1.553932 ms
As we see, even with larger input sizes, mapv is faster.
(I should note for completeness here that map is slightly faster than mapv, but not by a large degree).

Use of `->` (the "thread-first" macro) in Clojure

http://clojuredocs.org/clojure_core/clojure.core/-%3E
(def step #(-> [(% 1) (+ (% 0) (% 1))]))
(def fib #(map first (iterate step [0 1])))
The code above generates the Fibonacci sequence, and I want to rewrite the second line as below:
(def fib #(-> (iterate step [0 1]) (map first)))
or
(def fib #(-> [0 1] (iterate step) (map first)))
However, both versions fail when
(println
(take 10 (fib))
)
with an error
java.lang.IllegalArgumentException: Don't know how to create ISeq from: clojure.core$first
Is it impossible to rewrite it like this, or is there a proper way?
Thanks.
You want the ->> thread-last macro.
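For example (a minimal sketch reusing step from the question):
(def fib #(->> [0 1]
               (iterate step)
               (map first)))
(take 10 (fib))
;=> (0 1 1 2 3 5 8 13 21 34)
With ->, the second line expands to (map (iterate step [0 1]) first), i.e. map is handed the lazy sequence as its mapping function and first as its collection, hence the ISeq error; ->> threads the value in as the last argument, which is what both iterate and map expect here.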

Applying multiple filters to a collection in a thrush in Clojure

The following code
(let [coll [1 2 3 4 5]
      filters [#(> % 1) #(< % 5)]]
  (->> coll
       (filter (first filters))
       (filter (second filters))))
Gives me
(2 3 4)
Which is great, but how do I apply all of the filters in filters to coll without having to name each one explicitly?
There may be totally better ways of doing this, but ideally I'd like to know an expression that can replace (filter (first filters)) (filter (second filters)) above.
Thanks!
Clojure 1.3 has a new every-pred function, which you could use thusly:
(filter (apply every-pred filters) coll)
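For example, with the bindings from the question I'd expect:
(let [coll [1 2 3 4 5]
      filters [#(> % 1) #(< % 5)]]
  (filter (apply every-pred filters) coll))
;=> (2 3 4)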
This should work:
(let [coll [1 2 3 4 5]
      filters [#(> % 1) #(< % 5)]]
  (filter (fn [x] (every? #(% x) filters)) coll))
I can't say I'm very proud of the following, but at least it works and handles any number of filters:
(seq
  (reduce #(clojure.set/intersection (set %1) (set %2))
          (map #(filter % coll) filters)))
If you can use sets in place of seqs it would simplify the above code as follows:
(reduce clojure.set/intersection (map #(filter % coll) filters))
(let [coll [1 2 3 4 5]
      filters [#(> % 1) #(< % 5)]]
  (reduce (fn [c f] (filter f c)) coll filters))