Clojure iteration over vector of functions - clojure

I am reading a book on Clojure that says:
"Another fun thing you can do with map is pass it a collection of functions. You could use this if you wanted to perform a set of calculations on different collections of numbers, like so:"
(def sum #(reduce + %))
(def avg #(/ (sum %) (count %)))
(defn stats
[numbers]
(map #(% numbers) [sum count avg]))
(stats [3 4 10])
; => (17 3 17/3)
(stats [80 1 44 13 6])
; => (144 5 144/5)
"In this example, the stats function iterates over a vector of functions, applying each function to numbers."
I find this very confusing and the book doesn't give any more explanation.
I know % represents the argument in an anonymous function, but I can't work out what value it represents in this example. What are the %'s?
And also how can stats iterate over count if count is nested within avg?
Many thanks.

It helps to think not in terms of "code being executed" but in terms of "expression trees being reduced". Expression trees are rewritten until the result appears. Symbols are replaced by "what they stand for", and functions are applied to their arguments when a "live function" appears in the first position of a list, as in (some-function a b c). This is done in top-down fashion from the top of the expression tree to the leaves, stopping when the quote symbol is encountered.
In the example below, we unfortunately cannot mark what has already been reduced and what not as there is no support for coloring. Note that the order of reduction is not necessarily the one corresponding to what the compiled code issued by the Clojure compiler actually would do.
Starting with:
(defn stats
[numbers]
(map #(% numbers) [sum count avg]))
...we shall call stats.
First difficulty is that stats can be called with a collection as a single thing:
(stats [a0 a1 a2 ... an])
or it could be called with a series of values:
(stats a0 a1 a2 ... an)
Which is it? Unfortunately the expected calling style can only be found by looking at the function definition. In this case, the definition says
(defn stats [numbers] ...
which means stats expects a single thing called numbers. Thus we call it like this:
(stats [3 4 10])
Now reduction starts! The vector of numbers that is the argument is reduced to itself because every element of a vector is reduced and a number reduces to itself. The symbol stats is reduced to the function declared earlier. The definition of stats is actually:
(fn [numbers] (map #(% numbers) [sum count avg]))
...which is a bit hidden by the defn shorthand. Thus
(stats [3 4 10])
becomes
((fn [numbers] (map #(% numbers) [sum count avg])) [3 4 10])
Next, reducing the fn expression yields a live function of one argument. Let's mark the live function with a ★ and let's use mathematical arrow notation:
(★(numbers ➜ (map #(% numbers) [sum count avg])) [3 4 10])
The live function is in the first position of the list, so a function call will follow. The function call consists of replacing the occurrence of numbers by the argument [3 4 10] in the live function's body and stripping the outer parentheses of the whole expression:
(map #(% [3 4 10]) [sum count avg])
Symbols map, sum, count, avg resolve to known, defined functions, where map and count come from the Clojure core library, and the rest has been defined earlier. Again, we mark them as live:
(★map #(% [3 4 10]) [★sum ★count ★avg])
Again, the #(... % ...) notation is shorthand for a function taking one argument and inserting it at the % position. Let's make this evident:
(★map (fn [x] (x [3 4 10])) [★sum ★count ★avg])
Reducing the fn expression yields a live function of one argument. Again, let's mark it with ★ and use mathematical arrow notation:
(★map ★(x ➜ (x [3 4 10])) [★sum ★count ★avg])
A live function ★map is in head position and thus the whole expression is reduced according to the specification of map: apply the first argument, a function, to every element of the 2nd argument, a collection. We can assume the collection is created first, and then the collection members are further evaluated, so:
[(★(x ➜ (x [3 4 10])) ★sum)
(★(x ➜ (x [3 4 10])) ★count)
(★(x ➜ (x [3 4 10])) ★avg)]
Every element of the collection can be further reduced as each has a live function of 1 argument in head position and one argument available. Thus in each case, x is appropriately substituted:
[(★sum [3 4 10])
(★count [3 4 10])
(★avg [3 4 10])]
Every element of the collection can be further reduced as each has a live function of 1 argument in head position. The exercise continues:
[ ((fn [x] (reduce + x)) [3 4 10])
(★count [3 4 10])
((fn [x] (/ (sum x) (count x))) [3 4 10])]
then
[ (★(x ➜ (reduce + x)) [3 4 10])
3
(★(x ➜ (/ (sum x) (count x))) [3 4 10])]
then
[ (reduce + [3 4 10])
3
(/ ((fn [x] (reduce + x)) [3 4 10]) (count [3 4 10]))]
then
[ (★reduce ★+ [3 4 10])
3
(/ (★(x ➜ (reduce + x)) [3 4 10]) (count [3 4 10]))]
then
[ (★+ (★+ 3 4) 10)
3
(/ (reduce + [3 4 10]) (count [3 4 10]))]
then
[ (★+ 7 10)
3
(★/ (★reduce ★+ [3 4 10]) (★count [3 4 10]))]
then
[ 17
3
(★/ 17 3)]
finally
[ 17
3
17/3]
You can also use function juxt. Try (doc juxt) on the REPL:
clojure.core/juxt
([f] [f g] [f g h] [f g h & fs])
Takes a set of functions and returns a fn that is the juxtaposition
of those fns. The returned fn takes a variable number of args, and
returns a vector containing the result of applying each fn to the
args (left-to-right).
((juxt a b c) x) => [(a x) (b x) (c x)]
Let's try that!
(def sum #(reduce + %))
(def avg #(/ (sum %) (count %)))
((juxt sum count avg) [3 4 10])
;=> [17 3 17/3]
((juxt sum count avg) [80 1 44 13 6])
;=> [144 5 144/5]
And thus we can define stats alternatively as
(defn stats [numbers] ((juxt sum count avg) numbers))
(stats [3 4 10])
;=> [17 3 17/3]
(stats [80 1 44 13 6])
;=> [144 5 144/5]
P.S.
Sometimes Clojure code is hard to read because you don't know what "stuff" you are dealing with. There is no special syntactic marker for scalars, collections, or functions, and indeed a collection can appear as a function, or a scalar can be a collection. Compare with Perl, which has the notation $scalar, @collection, %hashmap, function, but also $reference-to-stuff, $$scalarly-dereferenced-stuff, @$collectionly-dereferenced-stuff and %$hashmapply-dereferenced-stuff.

% stands for the first argument of the anonymous function.
(map #(% numbers) [sum count avg])
Is equivalent to the following:
(map (fn [f] (f numbers)) [sum count avg])
where I have used the regular version rather than the short-form version for anonymous functions and explicitly named the argument "f". See https://practicalli.github.io/clojure/defining-behaviour-with-functions/anonymous-functions.html for a fuller explanation of the short-form version.
In Clojure, functions are first-class citizens, so they can be treated as values and passed to other functions. A function that takes a function as an argument or returns one, as map does here, is called a higher-order function (see https://clojure.org/guides/higher_order_functions).
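As a small sketch of the above, the anonymous function can be given a name and called on each function by hand, which is what map does for you. The name apply-to-numbers is made up for this illustration; sum and avg are the definitions from the question.

```clojure
(def sum #(reduce + %))
(def avg #(/ (sum %) (count %)))

(def numbers [3 4 10])

;; Same function as #(% numbers), written out in full.
;; apply-to-numbers is a hypothetical name used only for this sketch.
(def apply-to-numbers (fn [f] (f numbers)))

(apply-to-numbers sum)   ; => 17
(apply-to-numbers count) ; => 3
(apply-to-numbers avg)   ; => 17/3

;; map makes the three calls above, one per element of the vector.
(map apply-to-numbers [sum count avg]) ; => (17 3 17/3)
```

Note that the count inside avg is just an ordinary call made while avg runs. map only ever sees the three elements of the vector, so % is sum, then count, then avg.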

Related

Applying a transducer directly and with "transduce" yield different results

As far as I understand, a transducer is a function that transforms a reducer function before reduce takes place. In other words, (transduce transducer reducer collection) is equivalent to (reduce (transducer reducer) collection). So these two expressions
(reduce ((map inc) -) 0 [3 4 5])
(transduce (map inc) - 0 [3 4 5])
must return the same value. Right?
Wrong
(reduce ((map inc) -) 0 [3 4 5]) ; => -15
(transduce (map inc) - 0 [3 4 5]) ; => 15
A bug or a feature? My version of Clojure is 1.8.0.
It turns out that (transduce) implements a slightly different algorithm.
(reduce) calls (reducer aggregate element) for every element in the collection. A total of n calls for a collection of n elements.
(transduce) calls (reducer aggregate element) for every element and then for some reason calls (reducer aggregate) again, making n+1 calls. As a result, (transduce) doesn't work as expected with (-).
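One way to read the extra call is as the completion step: after the last element, transduce calls the one-argument arity of the reducing function on the final result. With -, that unary call negates the result, which would explain the sign flip. A minimal sketch, assuming Clojure 1.7 or later, where completing is available to wrap a function so that its one-argument arity is identity:

```clojure
;; reduce never calls the 1-arity, so the result is 0 - 4 - 5 - 6.
(reduce ((map inc) -) 0 [3 4 5])               ; => -15

;; transduce finishes with (- -15), i.e. unary minus.
(transduce (map inc) - 0 [3 4 5])              ; => 15

;; completing gives - an identity 1-arity, so nothing is negated at the end.
(transduce (map inc) (completing -) 0 [3 4 5]) ; => -15
```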

How do I replicate items from a list in Clojure?

I've tried this for so many nights that I've finally given up on myself. Seems like an extremely simple problem, but I guess I'm just not fully understanding Clojure as well as I should be (I partially attribute that to my almost sole experience with imperative languages). The problem is from hackerrank.com
Here is the problem:
Problem Statement
Given a list repeat each element of the list n times. The input and output
portions will be handled automatically by the grader.
Input Format
First line has integer S where S is the number of times you need to repeat
elements. After this there are X lines, each containing an integer. These are the
X elements of the array.
Output Format
Repeat each element of the original list S times. So you have to return
list/vector/array of S*X integers. The relative positions of the values should be
same as the original list provided as input.
Constraints
0<=X<=10
1<=S<=100
So, given:
2
1
2
3
Output:
1
1
2
2
3
3
I've tried:
(fn list-replicate [num list]
(println (reduce
(fn [element seq] (dotimes [n num] (conj seq element)))
[]
list))
)
But that just gives me an exception. I've tried so many other solutions, and this probably isn't one of my better ones, but it was the quickest one I could come up with to post something here.
(defn list-replicate [num list]
(mapcat (partial repeat num) list))
(doseq [x (list-replicate 2 [1 2 3])]
(println x))
;; output:
1
1
2
2
3
3
The previous answer is short and it works, but it is very "compressed" and is not easy for new people to learn. I would do it in a simpler and more obvious way.
First, look at the repeat function:
user=> (doc repeat)
-------------------------
clojure.core/repeat
([x] [n x])
Returns a lazy (infinite!, or length n if supplied) sequence of xs.
user=> (repeat 3 5)
(5 5 5)
So we see how to easily repeat something N times.
What if we run (repeat n ...) on each element of the list?
(def N 2)
(def xvals [1 2 3] )
(for [curr-x xvals]
(repeat N curr-x))
;=> ((1 1) (2 2) (3 3))
So we are getting close, but we have a list-of-lists for output. How to fix? The simplest way is to just use the flatten function:
(flatten
(for [curr-x xvals]
(repeat N curr-x)))
;=> (1 1 2 2 3 3)
Note that both repeat and for produce lazy sequences, which I prefer to avoid unless I really need them. Also, I usually prefer to store my linear collections in a concrete vector, instead of a generic "seq" type. For these reasons, I include an extra step of forcing the results into a single (eager) vector for the final product:
(defn list-replicate [num-rep orig-list]
(into []
(flatten
(for [curr-elem orig-list]
(repeat num-rep curr-elem)))))
(list-replicate N xvals)
;=> [1 1 2 2 3 3]
I would suggest building on Alan's solution and using concat instead of flatten, as this will preserve the structure of the data in case your input is something like [[1 2] [3 4]].
((fn [coll] (apply concat (for [x coll] (repeat 2 x)))) [[1 2] [3 4]])
output: => ([1 2] [1 2] [3 4] [3 4])
unlike with flatten, which does the following
((fn [coll] (flatten (for [x coll] (repeat 2 x)))) [[1 2] [3 4]])
output: => (1 2 1 2 3 4 3 4)
as for simple lists e.g. '(1 2 3), it works the same:
((fn [coll] (apply concat (for [x coll] (repeat 2 x)))) '(1 2 3))
output => (1 1 2 2 3 3)
(reduce #(count (map println (repeat %1 %2))) num list)

What does %-mark mean in Clojure?

I've tried to find the answer but it's quite difficult to search just the %-mark. So I've seen %-mark sometimes but I can't understand what is its function. It would be very nice if somebody could tell the explanation.
I'm assuming this is inside an anonymous function, like #(first %). If so, it means the first parameter. If there is more than one, you can number them %1, %2, etc.
So for instance
(filter #(odd? %) [1 2 3 4 5 6]) => (1 3 5)
Note: In this example you would normally just do (filter odd? [1 2 3 4 5 6])
#(blah %) is shorthand for an anonymous function, with % standing for its argument. So if you're squaring each element in a list, instead of
(map (fn [x] (* x x)) [1 2 3])
you can write
(map #(* % %) [1 2 3])
i.e. substituting #(* % %) for (fn [x] (* x x)) as the anonymous function. Each will give (1 4 9)
% is just a placeholder for arguments in the #(...) reader macro, which rewrites to a (fn* ...) call. It means the first passed argument.
You can add a number after the % to indicate the index of the argument. Beware that the first argument has index 1, so % == %1.
You must provide as many arguments to the returned function as the highest index you use in the function definition.
#(str %4 %2)
gives
(fn* [p1__680# p2__679# p3__681# p4__678#] (str p4__678# p2__679#))
and needs 4 arguments.
Observe that %4 and %2 are handled first, in reading order, and the unused arguments are created afterwards by the macro to fill the gaps.
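A few calls you can try at the REPL to see the numbering rules in action. This is only a sketch; the argument values are arbitrary.

```clojure
;; % and %1 both refer to the first argument.
(#(* % %) 3)     ; => 9
(#(* %1 %1) 3)   ; => 9

;; %2 is the second argument.
(#(- %2 %1) 1 10) ; => 9

;; %& collects any remaining arguments into a sequence.
(#(apply + %1 %&) 1 2 3) ; => 6

;; The highest index used is %4, so the function takes four arguments.
(#(str %4 %2) "a" "b" "c" "d") ; => "db"
```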

Map with an accumulator in Clojure?

I want to map over a sequence in order but want to carry an accumulator value forward, like in a reduce.
Example use case: Take a vector and return a running total, each value multiplied by two.
(defn map-with-accumulator
"Map over input but with an accumulator. func accepts [value accumulator] and returns [new-value new-accumulator]."
[func accumulator collection]
(if (empty? collection)
nil
(let [[this-value new-accumulator] (func (first collection) accumulator)]
(cons this-value (map-with-accumulator func new-accumulator (rest collection))))))
(defn double-running-sum
[value accumulator]
[(* 2 (+ value accumulator)) (+ value accumulator)])
Which gives
(prn (pr-str (map-with-accumulator double-running-sum 0 [1 2 3 4 5])))
>>> (2 6 12 20 30)
Another example to illustrate the generality, print running sum as stars and the original number. A slightly convoluted example, but demonstrates that I need to keep the running accumulator in the map function:
(defn stars [n] (apply str (take n (repeat \*))))
(defn stars-sum [value accumulator]
[[(stars (+ value accumulator)) value] (+ value accumulator)])
(prn (pr-str (map-with-accumulator stars-sum 0 [1 2 3 4 5])))
>>> (["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
This works fine, but I would expect this to be a common pattern, and for some kind of map-with-accumulator to exist in core. Does it?
You should look into reductions. For this specific case:
(reductions #(+ % (* 2 %2)) 2 (range 2 6))
produces
(2 6 12 20 30)
The general solution
The common pattern of a mapping that can depend on both an item and the accumulating sum of a sequence is captured by the function
(defn map-sigma [f s] (map f s (sigma s)))
where
(def sigma (partial reductions +))
... returns the sequence of accumulating sums of a sequence:
(sigma (repeat 12 1))
; (1 2 3 4 5 6 7 8 9 10 11 12)
(sigma [1 2 3 4 5])
; (1 3 6 10 15)
In the definition of map-sigma, f is a function of two arguments, the item followed by the accumulator.
The examples
In these terms, the first example can be expressed
(map-sigma (fn [_ x] (* 2 x)) [1 2 3 4 5])
; (2 6 12 20 30)
In this case, the mapping function ignores the item and depends only on the accumulator.
The second can be expressed
(map-sigma #(vector (stars %2) %1) [1 2 3 4 5])
; (["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
... where the mapping function depends on both the item and the accumulator.
There is no standard function like map-sigma.
General conclusions
Just because a pattern of computation is common does not imply that
it merits or requires its own standard function.
Lazy sequences and the sequence library are powerful enough to tease
apart many problems into clear function compositions.
Rewritten to be specific to the question posed.
Edited to accommodate the changed second example.
Reductions is the way to go, as Diego mentioned. However, for your specific problem the following works:
(map #(* % (inc %)) [1 2 3 4 5])
As mentioned you could use reductions:
(defn map-with-accumulator [f init-value collection]
(map first (reductions (fn [[_ accumulator] next-elem]
(f next-elem accumulator))
(f (first collection) init-value)
(rest collection))))
=> (map-with-accumulator double-running-sum 0 [1 2 3 4 5])
(2 6 12 20 30)
=> (map-with-accumulator stars-sum 0 [1 2 3 4 5])
(["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
It's only in case you want to keep the original requirements. Otherwise I'd prefer to decompose f into two separate functions and use Thumbnail's approach.

clojure for sequence comprehension adding two elements at a time

The comprehension:
(for [i (range 5)] i)
... yields: (0 1 2 3 4)
Is there an idiomatic way to get (0 0 1 1 2 4 3 9 4 16) (i.e. the numbers and their squares) using mostly the for comprehension?
The only way I've found so far is doing a:
(apply concat (for [i (range 5)] (list i (* i i))))
Actually, using only for is pretty simple if you consider applying each function (identity and square) for each value.
(for [i (range 5), ; for every value
f [identity #(* % %)]] ; for every function
(f i)) ; apply the function to the value
; => (0 0 1 1 2 4 3 9 4 16)
Since for loops x times, it will return a collection of x values. Multiple nested loops (unless limited by while or when) will give x * y * z * ... results. That is why external concatenation will always be necessary.
A similar correlation between input and output exists with map. However, if multiple collections are given in map, the number of values in the returned collection is the size of the smallest collection parameter.
=> (map (juxt identity #(* % %)) (range 5))
([0 0] [1 1] [2 4] [3 9] [4 16])
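To illustrate the point about several collections, here is a quick sketch with made-up inputs showing that map stops at the end of the shortest one:

```clojure
;; Two collections of different length: only two results come back.
(map + [1 2 3] [10 20])        ; => (11 22)

;; The same holds when pairing values with vector.
(map vector (range 5) [:a :b]) ; => ([0 :a] [1 :b])
```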
Concatenating the results of map is so common mapcat was created. Because of that, one might argue mapcat is a more idiomatic way over for loops.
=> (mapcat (juxt identity #(* % %)) (range 5))
(0 0 1 1 2 4 3 9 4 16)
Although this is just shorthand for apply concat (map, and a forcat function or macro could be created just as easily.
However, if an accumulation over a collection is needed, reduce is usually considered the most idiomatic.
=> (reduce (fn [acc i] (conj acc i (* i i))) [] (range 5))
[0 0 1 1 2 4 3 9 4 16]
Both the for and map options would mean traversing a collection twice, once for the range, and once for concatenating the resulting collection. The reduce option only traverses the range.
Care to share why "using mostly the for comprehension" is a requirement ?
I think you are doing it right.
A slightly more compressed version may be achieved using flatten
(flatten (for [i (range 5)] [ i (* i i) ] ))
But I would get rid of the for comprehension and just use interleave
(let [x (range 5)
y (map #(* % %) x)]
(interleave x y))
Disclaimer: I am just an amateur clojurist ;)