multi-arity defn in Clojure -- first match first serve?

To be concrete, what is supposed to happen in the following situation:
(defn avg
  ([] 0)
  ([& args] (/ (reduce + args) (count args))))
(avg)
i.e., can I rely on Clojure to always return 0 rather than divide by zero?

You can rely on Clojure to return 0 rather than divide-by-zero. But it isn't first match, first served:
(defn avg
  ([& args] (/ (reduce + args) (count args)))
  ([] 0))
(avg)
; 0
The specific arities take precedence over the rest argument, as described here.
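For example, assuming the defn from the question has been evaluated, a quick REPL check (a sketch) looks like this:
(avg)        ; => 0   the fixed zero-arity wins
(avg 1 2 3)  ; => 2   anything else falls through to the variadic arity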


How do I use "mean" as the final reducing function in a transducer?

I'm trying to estimate the mean distance of all pairs of points in a unit square.
This transducer returns a vector of the distances of x randomly selected pairs of points, but the final step would be to take the mean of all values in that vector. Is there a way to use mean as the final reducing function (or to include it in the composition)?
(defn square [x] (* x x))
(defn mean [x] (/ (reduce + x) (count x)))
(defn xform [iterations]
  (comp
   (partition-all 4)
   (map #(Math/sqrt (+ (square (- (first %) (nth % 1)))
                       (square (- (nth % 2) (nth % 3))))))
   (take iterations)))
(transduce (xform 5) conj (repeatedly #(rand)))
[0.5544757422041136
0.4170515673848907
0.7457675423415904
0.5560901974277822
0.6053573945754688]
(transduce (xform 5) mean (repeatedly #(rand)))
Execution error (ArityException) at test.core/eval19667 (form-init9118116578029918666.clj:562).
Wrong number of args (0) passed to: test.core/mean
If you implement your mean function differently, you won't have to collect all the values before computing the mean. Here is how you can implement it, based on this Java code:
(defn mean
  ([] [0 1])       ;; <-- Construct an empty accumulator
  ([[mu n]] mu)    ;; <-- Get the mean (final step)
  ([[mu n] x]      ;; <-- Accumulate a value to the mean
   [(+ mu (/ (- x mu) n)) (inc n)]))
And you use it like this:
(transduce identity mean [1 2 3 4])
;; => 5/2
or like this:
(transduce (xform 5) mean (repeatedly #(rand)))
;; => 0.582883812837961
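To see how the accumulator evolves, here is a hand trace of (transduce identity mean [1 2 3 4]) -- not REPL output, just the successive [mu n] states:
;; [0 1] --1--> [1 2] --2--> [3/2 3] --3--> [2 4] --4--> [5/2 5]
;; completion step ([[mu n]] mu) then returns 5/2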
From the docs of transduce:
If init is not supplied, (f) will be called to produce it. f should be
a reducing step function that accepts both 1 and 2 arguments, if it
accepts only 2 you can add the arity-1 with 'completing'.
To dissect this:
Your function needs a 0-arity to produce an initial value -- so conj is fine (it produces an empty vector).
You need to provide a 2-arity function to do the actual reducing -- again conj is fine here.
You need to provide a 1-arity function to finalize -- here you want your mean.
So as the docs suggest, you can use completing to just provide that:
(transduce (xform 5) (completing conj mean) (repeatedly #(rand)))
; → 0.4723186070904141
If you look at the source of completing you will see how it produces
all of this:
(defn completing
  "Takes a reducing function f of 2 args and returns a fn suitable for
  transduce by adding an arity-1 signature that calls cf (default -
  identity) on the result argument."
  {:added "1.7"}
  ([f] (completing f identity))
  ([f cf]
   (fn
     ([] (f))
     ([x] (cf x))
     ([x y] (f x y)))))
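Putting that together, (completing conj mean) -- with mean being the original collecting version from the question -- builds a reducing function equivalent to this hand-written sketch:
(fn
  ([] (conj))             ;; 0-arity: initial value, (conj) => []
  ([acc] (mean acc))      ;; 1-arity: completion step, compute the mean
  ([acc x] (conj acc x))) ;; 2-arity: the actual reducing step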

mapcat breaking the laziness

I have a function called a-function that produces lazy sequences.
If I run the code:
(map a-function a-sequence-of-values)
it returns a lazy sequence as expected.
But when I run the code:
(mapcat a-function a-sequence-of-values)
it breaks the laziness of my function. In fact, it turns that code into
(apply concat (map a-function a-sequence-of-values))
So it needs to realize all the values from the map before concatenating those values.
What I need is a function that concatenates the result of a map function on demand without realizing all the map beforehand.
I can hack a function for this:
(defn my-mapcat
  [f coll]
  (lazy-seq
   (if (not-empty coll)
     (concat
      (f (first coll))
      (my-mapcat f (rest coll))))))
But I can't believe that Clojure doesn't already have something for this. Do you know if Clojure has such a feature, or is it really just a few of us who have this problem?
I also found a blog that deals with the same issue: http://clojurian.blogspot.com.br/2012/11/beware-of-mapcat.html
Lazy-sequence production and consumption is different from lazy evaluation.
Clojure functions do strict/eager evaluation of their arguments. Evaluation of an argument that is or that yields a lazy sequence does not force realization of the yielded lazy sequence in and of itself. However, any side effects caused by evaluation of the argument will occur.
The ordinary use case for mapcat is to concatenate sequences yielded without side effects. Therefore, it hardly matters that some of the arguments are eagerly evaluated because no side effects are expected.
Your function my-mapcat imposes additional laziness on the evaluation of its arguments by wrapping them in thunks (other lazy-seqs). This can be useful when significant side effects (IO, significant memory consumption, state updates) are expected. However, warning bells should probably be going off in your head if your function both performs side effects and produces a sequence to be concatenated: that code probably needs refactoring.
Here is something similar from algo.monads:
(defn- flatten*
  "Like #(apply concat %), but fully lazy: it evaluates each sublist
  only when it is needed."
  [ss]
  (lazy-seq
   (when-let [s (seq ss)]
     (concat (first s) (flatten* (rest s))))))
Another way to write my-mapcat:
(defn my-mapcat [f coll] (for [x coll, fx (f x)] fx))
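A quick way to see the difference is to map a side-effecting function over a list; noisy-range below is a made-up helper just for this demonstration, and exactly how many inputs mapcat realizes eagerly is explored below:
(defn noisy-range [n] (println "realizing" n) (range n))

(first (mapcat noisy-range (list 1 2 3 4 5 6)))
;; prints "realizing" for the first few inputs up front, then returns 0

(first (my-mapcat noisy-range (list 1 2 3 4 5 6)))
;; prints only "realizing 1", then returns 0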
Applying a function to a lazy sequence will force realization of a portion of that lazy sequence necessary to satisfy the arguments of the function. If that function itself produces lazy sequences as a result, those are not realized as a matter of course.
Consider this function to count the realized portion of a sequence:
(defn count-realized [s]
  (loop [s s, n 0]
    (if (instance? clojure.lang.IPending s)
      (if (and (realized? s) (seq s))
        (recur (rest s) (inc n))
        n)
      (if (seq s)
        (recur (rest s) (inc n))
        n))))
Now let's see what's being realized:
(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      concat-seq (apply concat seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "concat-seq: " (count-realized concat-seq))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))
;=> seq-of-seqs:  4
;   concat-seq:  0
;   seqs-in-seq:  [0 0 0 0 0 0]
So, 4 elements of the seq-of-seqs got realized, but none of its component sequences were realized, nor was there any realization in the concatenated sequence.
Why 4? Because the applicable overload of concat takes 4 arguments, [x y & zs] (count the &).
Compare to:
(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [& more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))
;=> seq-of-seqs:  2
;   seqs-in-seq:  [0 0 0 0 0 0]

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [a b c & more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))
;=> seq-of-seqs:  5
;   seqs-in-seq:  [0 0 0 0 0 0]
Clojure has two solutions to making the evaluation of arguments lazy.
One is macros. Unlike functions, macros do not evaluate their arguments.
Here's a function with a side effect
(defn f [n] (println "foo!") (repeat n n))
Side effects are produced even though the sequence is not realized
user=> (def x (concat (f 1) (f 2)))
foo!
foo!
#'user/x
user=> (count-realized x)
0
Clojure has a lazy-cat macro to prevent this
user=> (def y (lazy-cat (f 1) (f 2)))
#'user/y
user=> (count-realized y)
0
user=> (dorun y)
foo!
foo!
nil
user=> (count-realized y)
3
user=> y
(1 2 2)
Unfortunately, you cannot apply a macro.
The other solution for delaying evaluation is to wrap the arguments in thunks, which is exactly what you've done.
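For instance, a thunk-based concatenation might look like this sketch (lazy-concat is a hypothetical helper, not a core function; f is the side-effecting function above):
(defn lazy-concat
  "Concatenates the sequences produced by calling each thunk, on demand."
  [thunks]
  (lazy-seq
   (when-let [[t & more] (seq thunks)]
     (concat (t) (lazy-concat more)))))

(def z (lazy-concat [#(f 1) #(f 2)]))  ; no "foo!" printed yet
(dorun z)                              ; now prints foo! twice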
Your premise is wrong. Concat is lazy, apply is lazy if its first argument is, and mapcat is lazy.
user> (class (mapcat (fn [x y] (println x y) (list x y)) (range) (range)))
0 0
1 1
2 2
3 3
clojure.lang.LazySeq
Note that some of the initial values are evaluated (more on this below), but clearly the whole thing is still lazy (otherwise the call would never have returned: (range) produces an endless sequence and would never finish if consumed eagerly).
The blog you link to is about the danger of recursively using mapcat on a lazy tree, because it is eager on the first few elements (which can add up in a recursive application).

Clojure: Implementing the comp function

4Clojure Problem 58 is stated as:
Write a function which allows you to create function compositions. The parameter list should take a variable number of functions, and create a function that applies them from right-to-left.
(= [3 2 1] ((__ rest reverse) [1 2 3 4]))
(= 5 ((__ (partial + 3) second) [1 2 3 4]))
(= true ((__ zero? #(mod % 8) +) 3 5 7 9))
(= "HELLO" ((__ #(.toUpperCase %) #(apply str %) take) 5 "hello world"))
Here __ should be replaced by the solution.
In this problem the function comp should not be employed.
A solution I found is:
(fn [& xs]
  (fn [& ys]
    (reduce #(%2 %1)
            (apply (last xs) ys) (rest (reverse xs)))))
It works. But I don't really understand how the reduce works here. How does it represent (apply f_1 (apply f_2 ... (apply f_n-1 (apply f_n args)) ...))?
Let's try modifying that solution in 3 stages. Stay with each for a while and see if you get it. Stop if and when you do lest I confuse you more.
First, let's have more descriptive names:
(defn my-comp [& fns]
  (fn [& args]
    (reduce (fn [result-so-far next-fn] (next-fn result-so-far))
            (apply (last fns) args) (rest (reverse fns)))))
Then factor out some of the steps:
(defn my-comp [& fns]
  (fn [& args]
    (let [ordered-fns (reverse fns)
          first-result (apply (first ordered-fns) args)
          remaining-fns (rest ordered-fns)]
      (reduce
       (fn [result-so-far next-fn] (next-fn result-so-far))
       first-result
       remaining-fns))))
Next, replace reduce with a loop that does the same thing:
(defn my-comp [& fns]
  (fn [& args]
    (let [ordered-fns (reverse fns)
          first-result (apply (first ordered-fns) args)]
      (loop [result-so-far first-result, remaining-fns (rest ordered-fns)]
        (if (empty? remaining-fns)
          result-so-far
          (let [next-fn (first remaining-fns)]
            (recur (next-fn result-so-far), (rest remaining-fns))))))))
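A couple of quick REPL checks of my-comp against the 4Clojure test cases (sketches):
((my-comp rest reverse) [1 2 3 4])
;; => (3 2 1)
((my-comp #(.toUpperCase %) #(apply str %) take) 5 "hello world")
;; => "HELLO"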
My solution was:
(fn [& fs]
  (reduce (fn [f g]
            #(f (apply g %&))) fs))
Let's try that for:
((
  (fn [& fs]
    (reduce (fn [f g]
              #(f (apply g %&))) fs))
  #(.toUpperCase %)
  #(apply str %)
  take)
 5 "hello world")
fs is a list of the functions:
#(.toUpperCase %)
#(apply str %)
take
The first time through the reduce, we set
f <--- #(.toUpperCase %)
g <--- #(apply str %)
We create an anonymous function, and assign this to the reduce function's accumulator.
#(f (apply g %&)) <---- uppercase the result of apply str
Next time through the reduce, we set
f <--- uppercase the result of apply str
g <--- take
Again we create a new anonymous function, and assign this to the reduce function's accumulator.
#(f (apply g %&)) <---- uppercase composed with apply str composed with take
fs is now empty, so this anonymous function is returned from reduce.
This function is passed 5 and "hello world"
The anonymous function then:
Does take 5 "hello world" to become (\h \e \l \l \o)
Does apply str to become "hello"
Does .toUpperCase to become "HELLO"
Here's an elegant (in my opinion) definition of comp:
(defn comp [& fs]
  (reduce (fn [result f]
            (fn [& args]
              (result (apply f args))))
          identity
          fs))
The nested anonymous functions might make it hard to read at first, so let's try to address that by pulling them out and giving them a name.
(defn chain [f g]
  (fn [& args]
    (f (apply g args))))
This function chain is just like comp except that it only accepts two arguments.
((chain inc inc) 1) ;=> 3
((chain rest reverse) [1 2 3 4]) ;=> (3 2 1)
((chain inc inc inc) 1) ;=> ArityException
The definition of comp atop chain is very simple and helps isolate what reduce is bringing to the show.
(defn comp [& fs]
  (reduce chain identity fs))
It chains together the first two functions, the result of which is a function. It then chains that function with the next, and so on.
So using your last example:
((comp #(.toUpperCase %) #(apply str %) take) 5 "hello world") ;=> "HELLO"
The equivalent only using chain (no reduce) is:
((chain identity
        (chain (chain #(.toUpperCase %)
                      #(apply str %))
               take))
 5 "hello world")
;=> "HELLO"
At a fundamental level, reduce is about iteration. Here's what an implementation in an imperative style might look like (ignoring the multiple arities that Clojure's version supports):
def reduce(f, init, seq):
    result = init
    for item in seq:
        result = f(result, item)
    return result
It's just capturing the pattern of iterating over a sequence and accumulating a result. I think reduce has a sort of mystique around it which can actually make it much harder to understand than it needs to be, but if you just break it down you'll definitely get it (and probably be surprised how often you find it useful).
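The same pattern in Clojure, spelled out with loop/recur (a simplified sketch that ignores the no-init arity and the reduced short-circuiting of the real clojure.core/reduce):
(defn my-reduce [f init coll]
  (loop [result init, s (seq coll)]
    (if s
      (recur (f result (first s)) (next s))
      result)))

(my-reduce + 0 [1 2 3 4])  ; => 10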
Here is my solution:
(defn my-comp
  ([] identity)
  ([f] f)
  ([f & r]
   (fn [& args]
     (f (apply (apply my-comp r) args)))))
I like A. Webb's solution better, though it does not behave exactly like comp because it does not return identity when called without any arguments. Simply adding a zero-arity body would fix that issue though.
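A few quick REPL checks of the my-comp above (sketches):
((my-comp) 42)              ; => 42, thanks to the zero-arity identity
((my-comp inc) 1)           ; => 2
((my-comp inc #(* 2 %)) 3)  ; => 7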
Consider this example:
(def c (comp f1 ... fn-1 fn))
(c p1 p2 ... pm)
When c is called:
first, comp's rightmost parameter fn is applied to the p* parameters;
then fn-1 is applied to the result of the previous step;
... and so on ...
then f1 is applied to the result of the previous step, and its result is returned.
Your sample solution does exactly the same thing.
first the rightmost parameter (last xs) is applied to the ys parameters:
(apply (last xs) ys)
the remaining parameters are reversed to be fed to reduce:
(rest (reverse xs))
reduce takes the provided initial result and list of functions and iteratively applies the functions to the result:
(reduce #(%2 %1) ..init.. ..functions..)
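Written out as a hand trace for xs = [f1 f2 f3] and arguments ys (placeholder names, not code from the question):
;; init:   (apply f3 ys)
;; step 1: (f2 (apply f3 ys))
;; step 2: (f1 (f2 (apply f3 ys)))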

Clojure function throws a null pointer exception

I've just begun to teach myself Clojure and I'm having fun. However, I ran into trouble when I executed this function I wrote.
It's a simple function that accepts a variable number of arguments and returns the difference between the last and the first argument.
(defn diff-last-first
  "gets the difference between the last & the first arguments"
  [& args]
  (- (get args (- (count args) 1)) (get args 0)))
I know that I can simply use the last function to get the last element of args, but I'm not able to understand why this is throwing a NullPointerException when I execute
(diff-last-first 1 2 3)
If you do want to access the nth value of a list, you can use nth:
(defn diff-last-first [& args]
(- (nth args (dec (count args)))
(nth args 0)))
But of course, as you pointed out in your question, it is more idiomatic to use first and last:
(defn diff-last-first [& args]
  (- (last args)
     (first args)))
(get (list :foo) 0) evaluates to nil, and subtracting nils is what throws the NullPointerException.
Lists are not supposed to be accessed by index: it is a deliberate design decision in Clojure to discourage such inefficiencies.
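A quick REPL comparison (a sketch) of get and nth on the two collection types:
(get [:foo :bar] 0)   ; => :foo   vectors are indexed, so get works
(get '(:foo :bar) 0)  ; => nil    lists are not, so get quietly returns nil
(nth '(:foo :bar) 0)  ; => :foo   nth walks the list instead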
OK, got it!
@vemv was right: [& args] gives you a list, and hence (get args 0) returns nil.
Whereas (defn get-l-f [args] (- (get args (- (count args) 1)) (get args 0))) works as expected, because args is a vector here.

Scheme -> Clojure: multimethods with predicates in the methods?

I'm converting some Scheme code to Clojure. The original uses a dispatching pattern that's very similar to multimethods, but with an inverted approach to the matching predicates. For example, there is a generic function "assign-operation". The precise implementation details aren't too important at the moment, but notice that it can take a list of argument predicates.
(define (assign-operation operator handler . argument-predicates)
  (let ((record
         (let ((record (get-operator-record operator))
               (arity (length argument-predicates)))
           (if record
               (begin
                 (if (not (fix:= arity (operator-record-arity record)))
                     (error "Incorrect operator arity:" operator))
                 record)
               (let ((record (make-operator-record arity)))
                 (hash-table/put! *generic-operator-table* operator record)
                 record)))))
    (set-operator-record-tree! record
                               (bind-in-tree argument-predicates
                                             handler
                                             (operator-record-tree record)))))
The dispatched functions supply these predicates, one per argument in the arity of the function.
(assign-operation 'merge
                  (lambda (content increment) content)
                  any? nothing?)

(assign-operation 'merge
                  (lambda (content increment) increment)
                  nothing? any?)

(assign-operation 'merge
                  (lambda (content increment)
                    (let ((new-range (intersect-intervals content increment)))
                      (cond ((interval-equal? new-range content) content)
                            ((interval-equal? new-range increment) increment)
                            ((empty-interval? new-range) the-contradiction)
                            (else new-range))))
                  interval? interval?)
Later, when the generic function "merge" is called, each handler is asked if it works on the operands.
As I understand multimethods, the dispatch function is defined across the set of implementations, with dispatch to a specific method based on the return value of the dispatch-fn. In the Scheme above, new assign-operation functions can define predicates arbitrarily.
What would be an equivalent, idiomatic construct in Clojure?
EDIT: The code above comes from "The Art of the Propagator", by Alexey Radul and Gerald Sussman.
You can do this with Clojure's multimethods fairly easily - the trick is to create a dispatch function that distinguishes between the different sets of predicates.
The easiest way to do this is probably just to maintain a vector of "composite predicates" that apply all of the individual predicates to the full argument list, and use the index of this vector as the dispatch value:
(def pred-list (ref []))

(defn dispatch-function [& args]
  (loop [i 0]
    (cond
      (>= i (count @pred-list)) (throw (Error. "No matching function!"))
      (apply (@pred-list i) args) i
      :else (recur (inc i)))))
(defmulti handler dispatch-function)
(defn assign-operation [function & preds]
  (dosync
   (let [i (count @pred-list)]
     (alter pred-list conj
            (fn [& args] (every? identity (map #(%1 %2) preds args))))
     (defmethod handler i [& args] (apply function args)))))
Then you can create operations to handle whatever predicates you like as follows:
(assign-operation (fn [x] (/ x 2)) even?)
(assign-operation (fn [x] (+ x 1)) odd?)
(take 15 (iterate handler 77))
=> (77 78 39 40 20 10 5 6 3 4 2 1 2 1 2)
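Closer to the original Scheme merge example, handlers of two arguments work the same way. The sketch below assumes a fresh registry (hence the ref-set) and a hypothetical nothing? predicate; any? is clojure.core/any? from Clojure 1.9+ (on older versions, (constantly true) would do):
(dosync (ref-set pred-list []))          ; start from an empty registry
(defn nothing? [x] (= x ::nothing))      ; hypothetical predicate for this sketch

(assign-operation (fn [content increment] content) any? nothing?)
(assign-operation (fn [content increment] increment) nothing? any?)

(handler 42 ::nothing)  ; => 42
(handler ::nothing 42)  ; => 42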