(defn square [x]
(do
(println (str "Processing: " x))
(* x x)))
(println (map square '(1 2 3 4 5)))
Why is the output
(Processing: 1
Processing: 2
1 Processing: 3
4 Processing: 4
9 Processing: 5
16 25)
not
(Processing: 1
1 Processing: 2
4 Processing: 3
9 Processing: 4
16 Processing: 5
25)
?
Because map is lazy. It uses lazy-seq under the covers, which pre-caches the result of rest. So you see the two println statements appear when your code grabs the first value of the map sequence.
See also this blog post: Lazy Sequences
println uses [[x & xs] xs] destructuring form in its implementation. This is equivelent to [x (first xs), xs (next xs)] and next is less lazy than rest, so it realizes the two items before printing the first.
For example,
=> (defn fn1 [[x & xs]] nil)
#'user/fn1
=> (fn1 (map square '(1 2 3 4 5)))
Processing: 1
Processing: 2
nil
Are you like me to learn with code snippets? Here are some.
Let's have a look at the documentation of map.
user=> (doc map)
-------------------------
clojure.core/map
([f coll] [f c1 c2] [f c1 c2 c3] [f c1 c2 c3 & colls])
Returns a lazy sequence consisting of the result of applying f to the
set of first items of each coll, followed by applying f to the set
of second items in each coll, until any one of the colls is
exhausted. Any remaining items in other colls are ignored. Function
f should accept number-of-colls arguments.
nil
map returns a lazy sequence (you should already read the references given by #noahz). To fully realize the lazy sequence (it's often not a good practice as the lazy seq might be infinite and hence never end) you can use dorun or doall.
user=> (doc dorun)
-------------------------
clojure.core/dorun
([coll] [n coll])
When lazy sequences are produced via functions that have side
effects, any effects other than those needed to produce the first
element in the seq do not occur until the seq is consumed. dorun can
be used to force any effects. Walks through the successive nexts of
the seq, does not retain the head and returns nil.
nil
user=> (doc doall)
-------------------------
clojure.core/doall
([coll] [n coll])
When lazy sequences are produced via functions that have side
effects, any effects other than those needed to produce the first
element in the seq do not occur until the seq is consumed. doall can
be used to force any effects. Walks through the successive nexts of
the seq, retains the head and returns it, thus causing the entire
seq to reside in memory at one time.
nil
Although they seem similar they are not - mind the difference with how they treat the head of the realized sequence.
With the knowledge, you can influence the way the map lazy sequence behaves with doall.
user=> (defn square [x]
#_=> (println (str "Processing: " x))
#_=> (* x x))
#'user/square
user=> (doall (map square '(1 2 3 4 5)))
Processing: 1
Processing: 2
Processing: 3
Processing: 4
Processing: 5
(1 4 9 16 25)
As you might've noticed I also changed the definition of the square function as you don't need do inside the function (it's implicit with the defn macro).
In the Clojure Programming book, there's the sentence you may like for the case of '(1 2 3 4 5):
"Most people simply use a vector literal for such cases, within which member expressions
will always be evaluated."
Instead of copying the relevant sections to support this statement, I'd rather recommend the book as it's worth the time and money.
Related
It appears that apply forces the realization of four elements given a lazy sequence.
(take 1
(apply concat
(repeatedly #(do
(println "called")
(range 1 10)))))
=> "called"
=> "called"
=> "called"
=> "called"
Is there a way to do an apply which does not behave this way?
Thank You
Is there a way to do an apply which does not behave this way?
I think the short answer is: not without reimplementing some of Clojure's basic functionality. apply's implementation relies directly on Clojure's implementation of callable functions, and tries to discover the proper arity of the given function to .invoke by enumerating the input sequence of arguments.
It may be easier to factor your solution using functions over lazy, un-chunked sequences / reducers / transducers, rather than using variadic functions with apply. For example, here's your sample reimplemented with transducers and it only invokes the body function once (per length of range):
(sequence
(comp
(mapcat identity)
(take 1))
(repeatedly #(do
(println "called")
(range 1 10))))
;; called
;; => (1)
Digging into what's happening in your example with apply, concat, seq, LazySeq, etc.:
repeatedly returns a new LazySeq instance: (lazy-seq (cons (f) (repeatedly f))).
For the given 2-arity (apply concat <args>), apply calls RT.seq on its argument list, which for a LazySeq then invokes LazySeq.seq, which will invoke your function
apply then calls a Java impl. method applyToHelper which tries to get the length of the argument sequence. applyToHelper tries to determine the length of the argument list using RT.boundedLength, which internally calls next and in turn seq, so it can find the proper overload of IFn.invoke to call
concat itself adds another layer of lazy-seq behavior.
You can see the stack traces of these invocations like this:
(take 1
(repeatedly #(do
(clojure.stacktrace/print-stack-trace (Exception.))
(range 1 10))))
The first trace descends from the apply's initial call to seq, and the subsequent traces from RT.boundedLength.
in fact, your code doesn't realize any of the items from the concatenated collections (ranges in your case). So the resulting collection is truly lazy as far as elements are concerned. The prints you get are from the function calls, generating unrealized lazy seqs. This one could easily be checked this way:
(defn range-logged [a b]
(lazy-seq
(when (< a b)
(println "realizing item" a)
(cons a (range-logged (inc a) b)))))
user> (take 1
(apply concat
(repeatedly #(do
(println "called")
(range-logged 1 10)))))
;;=> called
;; called
;; called
;; called
;; realizing item 1
(1)
user> (take 10
(apply concat
(repeatedly #(do
(println "called")
(range-logged 1 10)))))
;; called
;; called
;; called
;; called
;; realizing item 1
;; realizing item 2
;; realizing item 3
;; realizing item 4
;; realizing item 5
;; realizing item 6
;; realizing item 7
;; realizing item 8
;; realizing item 9
;; realizing item 1
(1 2 3 4 5 6 7 8 9 1)
So my guess is that you have nothing to worry about, as long as the collection returned from repeatedly closure is lazy
From what I gather a transformer is the use of functions that change , alter , a collection of elements . Like if I did added 1 to each element in a collection of
[1 2 3 4 5]
and it became
[2 3 4 5 6]
but writing the code for this looks like
(map inc)
but I keep getting this sort of code confused with a reducer. Because it produces a new accumulated result .
The question I ask is , what is the difference between a transformer and a reducer ?
You are likely just confusing various nomenclature (as the comments above suggest), but I'll answer what I think is your question by taking some liberties in interpreting what you mean to be reducer and transformer.
Reducing:
A reducing function (what you probably think is a reducer), is a function that takes an accumulated value and a current value, and returns a new accumulated value.
(accumulated, current) => accumulated
These functions are passed to reduce, and they successively step through a sequence performing whatever the body of the reducing function says with it's two arguments (accumulated and current), and then returning a new accumulated value which will be used as the accumulated value (first argument) to the next call of the reducing function.
For example, plus can be viewed as a reducing function.
(reduce + [0 1 2]) => 3
First, the reducing function (plus in this example) is called with 0 and 1, which returns 1. On the next call, 1 is now the accumulated value, and 2 is the current value, so plus is called with 1 and 2, returning 3, which completes the reduction as there are no further elements in the collection to process.
It may help to look at a simplified version of a reduce implementation:
(defn reduce1
([f coll] ;; f is a reducing function
(let [[x y & xs] coll]
;; called with the accumulated value so far "x"
;; and cur value in input sequence "y"
(if y (reduce1 f (cons (f x y) xs))
x)))
([f start coll]
(reduce1 f (cons start coll))))
You can see that the function "f" , or the "reducing function" is called on each iteration with two arguments, the accumulated value so far, and the next value in the input sequence. The return value of this function is used as the first argument in the next call, etc. and thus has the type:
(x, y) => x
Transforming:
A transformation, the way I think you mean it, suggests the shape of the input does not change, but is simply modified according to an arbitrary function. This would be functions you pass to map, as they are applied to each element and build up a new collection of the same shape, but with that function applied to each element.
(map inc [0 1 2]) => '(1 2 3)
Notice the shape is the same, it's still a 3 element sequence, whereas in the reduction above, you input a 3 element sequence and get back an integer. Reductions can change the shape of the final result, map does not.
Note that I say the "shape" doesn't change, but the type of each element may change depending on what your "transforming" function does:
(map #(list (inc %)) [0 1 2]) => '((1) (2) (3))
It's still a 3 element sequence, but now each element is a list, not an integer.
Addendum:
There are two related concepts in Clojure, Reducers and Transducers, which I just wanted to mention since you asked about reducers (which have as specific meaning in Clojure) and transformers (which are the names Clojurists typically assign to a transducing function via the shorthand "xf"). It would turn this already long answer into a short-story if I tried to explain the details of both here, and it's been done better than I can do by others:
Transducers:
http://elbenshira.com/blog/understanding-transducers/
https://www.youtube.com/watch?v=6mTbuzafcII
Reducers and Transducers:
https://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/
It turns out that many transformations of collections can be expressed in terms of reduce. For instance map could be implemented as
(defn map [f coll] (reduce (fn [x y] (conj x (f y))) [] [0 1 2 3 4]))
and then you would call
(map inc [1 2 3 4 5])
to obtain
[2 3 4 5 6]
In our homemade implementation of map, the function that we pass to reduce is
(fn [x y] (conj x (f y))))
where f is the function that we would like to apply to every element. So we can write a function that produces such a function for us, passing the function that we would like to map.
(defn mapping-with-conj [f] (fn [x y] (conj x (f y))))
But we still see the presence of conj in the above function assuming we want to add elements to a collection. We can get even more flexibility by extra indirection:
(defn mapping [f] (fn [step] (fn [x y] (step x (f y)))))
Then we can use it like this:
(def increase-by-1 (mapping inc))
(reduce (increase-by-1 conj) [] [1 2 3])
The (map inc) you are referring does what our call to (mapping inc) does. Why would you want to do things this way? The answer is that it gives us a lot of flexibility to build things. For instance, instead of building up a collection, we can do
(reduce ((map inc) +) 0 [1 2 3 4 5])
Which will give us the sum of the mapped collection [2 3 4 5 6]. Or we can add extra processing steps just by simple function composition.
(reduce ((comp (filter odd?) (map inc)) conj) [] [1 2 3 4 5])
which will first remove even elements from the collection before we map. The transduce function in Clojure does essentially what the above line does, but takes care of another few extra details, too. So you would actually write
(transduce (comp (filter odd?) (map inc)) conj [] [1 2 3 4 5])
To sum up, the map function in Clojure has two arities. Calling it like (map inc [1 2 3 4 5]) will map every element of a collection so that you obtain [2 3 4 5 6]. Calling it just like (map inc) gives us a function that behaves pretty much like our mapping function in the above explanation.
If I have the following string containing a valid Clojure/ClojureScript form:
"(+ 1 (+ 2 (/ 6 3)))"
How would I evaluate the first "step" of this form? In other words, how would I turn the above form into this:
"(+ 1 (+ 2 2))"
and then turn that corresponding form into this:
"(+ 1 4)"
You use recursion.
You need to have a function that evaluates the numbers to themselves but if it's not a number you need to apply the operation on the evaluation of the arguments.. Thus
(evaluate '(+ 1 (+ 2 (/ 6 3))))
This should be treated as:
(+ (evaluate '1) (evaluate '(+ 2 (/ 6 3))))
When it starts doing your first step several steps are waiting for the results to be done as well.
Note I'm using list structure and not strings. With strings you would need to use some function to get it parsed.
The other answers are great if you want to execute code in steps, but I want to mention that this evaluation can also be visualized using a debugger. See below Cider's debugger in action:
By using cider-debug-defun-at-point we add a breakpoint on evaluate. Then when the evaluate definition is evaluated the breakpoint is hit, and we step through the code by pressing next repeatedly.
A debugger is very handy when you want to evaluate "steps" of forms.
Below is a very basic implementation that does what you're looking for. It would be more common to eval the entire form, but since you're wanting to just simplify the innermost expressions, this does it:
(defn leaf?
[x]
(and (list? x)
(symbol? (first x))
(not-any? list? (rest x))))
(defn eval-one
[expr]
(cond
(leaf? expr) (apply (-> (first expr) resolve var-get)
(rest expr))
(list? expr) (apply list (map eval-one expr))
:default expr
))
(read-string "(+ 1 (+ 2 (/ 6 3)))")
=> (+ 1 (+ 2 (/ 6 3)))
(eval-one *1)
=> (+ 1 (+ 2 2))
(eval-one *1)
=> (+ 1 4)
(eval-one *1)
=> 5
This is naive and for illustrative purposes only, so please don't be under the impression that a real eval would work this way.
We define a leaf as a list whose first element is a symbol and which does not contain any other lists which could be evaluated. We then process the form, evaluating leaf expressions, recursively evaluating non-leaf expressions which are lists, and for anything else, we just insert it into the resulting expression. The result is that all innermost expressions which can be evaluated, according to our definition, are evaluated.
To add to the other great answers, here is a simple function that should return the first form to evaluate in a given string:
(defn first-eval [form-str]
(let [form (read-string form-str)
tree-s (tree-seq sequential? identity form)]
(first (filter #(= % (flatten %)) tree-s))))
Usage:
(first-eval "(+ 1 (+ 2 (/ 6 3)))") ;; returns (/ 6 3)
tree-seq is fairly limited in it's ability to evalute ALL form, but it's a start.
A popular online tutorial gives this example to build natural numbers:
(def infseq (map inc (iterate inc 0)))
For instance, (take 5 infseq) gives:
1 2 3 4 5
I can see what (iterate inc 0) does, but, as a whole, I do not understand what exactly is going on. (For example, this is not a normal function definition.)
Could somebody please explain?
Let's break that expression down:
(map inc (iterate inc 0)))
Is a list (the datastructure) with this structure:
(function-to-call function-passed-as-first-srgument another-list-as-second-arg)
Now let's explore it by examining it from the inside out!
That inner list:
(iterate inc 0)
has this strucute
(function-to-call function-passed-as-first-argument number)
the function being called is iterate, which creates infinite sequences by keeping track of it's internal state, and every time a new value is required to make the make the sequence longer, it takes the function passed as the first argument and calls it on the current state.
the function passed as the first argument is inc which takes a numbr and ads one
the third argument to iterate is the initial-state. where it should start.
so when this inner expression is evaluated it will immediately return a data structure representing a list without actually building that list yet. When the first value is read from that list it will return the initial value 0 , when the second value is asked for it will use the inc function to come up with the number 1. If the first or second values are needed again they will just be used as is, not recalculated.
so the first argument represents a contract to produce as many numbers as are required. This is in it's self the third argument to the original expression.
The initial expression takes that lazy list and makes a new lazy list.
this new lazy list is returned by the map function.
(map inc (0 1 2 3 4 ... as many as you read ...))
which will apply inc to each of these just as it's read, and only at the moment it's read (actually it caches 20 or so items ahead to be a little faster) resulting in this sequence:
((inc 0) (inc 1) (inc 2) (inc 3) ... as much as you read from the sequence ...)
Which works out to:
(1 2 3 4 ... created lazily)
Which is the same result as these equivalent expressions,
(rest (range))
(iterate inc 1)
and many other forms.
. For the purposes of the example, we could define map and iterate as follows:
(defn iterate [f init]
(lazy-cons init (iterate f (f init))))
(defn map [f [x & xs]]
(lazy-cons (f x) (map f xs)))
... where lazy-cons is a version of cons that doesn't act until it has to. It used to be part of clojure.core, and might have been defined thus:
(defmacro lazy-cons [x xs]
`(lazy-seq (cons ~x ~xs)))
To understand these definitions, you'll need to get to grips with recursion, laziness, destructuring, and macros: quite a task! But do so and you'll really understand how clojure's sequence library, including iterate and map, works. It's how I learned.
The definition of iterate is valid. The one of map only works for a single endless sequence argument.
(take 5 (iterate inc 0)) => (0 1 2 3 4)
iterate repeatedly applys the inc function in a loop. You are starting with 0, so you get [0 1 2 3...]
(take 5 (map inc (iterate inc 0))) => (1 2 3 4 5)
(map inc <collection>) applies inc once time to each item in the collection, to the previous result is transformed to [1 2 3 4 ...]
(take 5 (range)) => (0 1 2 3 4)
range w/o any args starts at 0 and counts forever, same as the first example.
Since all of these collections are infinite in length, we need something like (take 5 <collection>) to limit the length of what is printed.
Reduce works fine but it is more like fold-left.
Is there any other form of reduce that lets me fold to right ?
The reason that the clojure standard library only has fold-left (reduce) is actually very subtle and it is because clojure isn't lazy enough to get the main benefit of fold-right.
The main benefit of fold-right in languages like haskell is that it can actually short-circuit.
If we do foldr (&&) True [False, True, True, True, True] the way that it actually gets evaluated is very enlightening. The only thing it needs to evaluate is the function and with 1 argument (the first False). Once it gets there it knows the answer and does not need to evaluate ANY of the Trues.
If you look very closely at the picture:
you will see that although conceptually fold-right starts and the end of the list and moves towards the front, in actuality, it starts evaluating at the FRONT of the list.
This is an example of where lazy/curried functions and tail recursion can give benefits that clojure can't.
Bonus Section (for those interested)
Based on a recommendation from vemv, I would like to mention that Clojure added a new function to the core namespace to get around this limitation that Clojure can't have the lazy right fold. There is a function called reduced in the core namespace which allows you to make Clojure's reduce lazier. It can be used to short-circuit reduce by telling it not to look at the rest of the list. For instance, if you wanted to multiply lists of numbers but had reason to suspect that the list would occasionally contain zero and wanted to handle that as a special case by not looking at the remainder of the list once you encountered a zero, you could write the following multiply-all function (note the use of reduced to indicate that the final answer is 0 regardless of what the rest of the list is).
(defn multiply-all [coll]
(reduce
(fn [accumulator next-value]
(if (zero? next-value)
(reduced 0)
(* accumulator next-value)))
1
coll))
And then to prove that it short-circuits you could multiply an infinite list of numbers which happens to contain a zero and see that it does indeed terminate with the answer of 0
(multiply-all
(cycle [1 2 3 4 0]))
Let's look at a possible definition of each:
(defn foldl [f val coll]
(if (empty? coll) val
(foldl f (f val (first coll)) (rest coll))))
(defn foldr [f val coll]
(if (empty? coll) val
(f (foldr f val (rest coll)) (first coll))))
Notice that only foldl is in tail position, and the recursive call can be replaced by recur. So with recur, foldl will not take up stack space, while foldr will. That's why reduce is like foldl. Now let's try them out:
(foldl + 0 [1 2 3]) ;6
(foldl - 0 [1 2 3]) ;-6
(foldl conj [] [1 2 3]) ;[1 2 3]
(foldl conj '() [1 2 3]) ;(3 2 1)
(foldr + 0 [1 2 3]) ;6
(foldr - 0 [1 2 3]) ;-6
(foldr conj [] [1 2 3]) ;[3 2 1]
(foldr conj '() [1 2 3]) ;(1 2 3)
Is there some reason you want to fold right? I think the most common usage of foldr is to put together a list from front to back. In Clojure we don't need that because we can just use a vector instead. Another choice to avoid stack overflow is to use a lazy sequence:
(defn make-list [coll]
(lazy-seq
(cons (first coll) (rest coll))))
So, if you want to fold right, some efficient alternatives are
Use a vector instead.
Use a lazy sequence.
Use reduced to short-circuit reduce.
If you really want to dive down a rabbit hole, use a transducer.