How to properly use "iterate" and "partial" in Clojure?

Most references to iterate that I've found are about operators, and the examples that apply it to functions are confusing enough that I still don't understand how to use iterate in my code, or what partial is.
I am doing a programming homework assignment: using Newton's method to compute the square root of a number n. That is, with guess as the initial approximation, keep computing new approximations by averaging the current approximation with n/approximation. Continue until the difference between the two most recent approximations is less than epsilon.
I am trying to do the approximation part first; I believe that is where I need iterate and partial. And later, is the epsilon part where I need take?
Here is the code I have for approximation without the epsilon:
(defn sqrt [n guess]
  (iterate (partial sqrt n) (/ (+ n (/ n guess)) 2)))
This code does not work properly, though: when I enter (sqrt 2 2), it gives me (3/2 followed by ClassCastException clojure.lang.Cons cannot be cast to java.lang.Number clojure.lang.Numbers.divide (Numbers.java:155).
I guess this is the part I need to iterate over and over again? Could someone please give me some hints? Again, this is a homework problem, so please do not provide a direct solution to the entire problem; I need ideas and explanations that I can learn from.

partial takes a function and some of that function's arguments, and returns a new function that expects the rest of the arguments.
(def take-five (partial take 5))
(take-five [1 2 3 4 5 6 7 8 9 10])
;=> (1 2 3 4 5)
iterate generates an infinite sequence from two parameters: a function and a seed value. The seed is the first element of the generated sequence; the second element is computed by applying the function to the seed; that second value is then used as the input to the function to get the third value, and so on.
(take-five (iterate inc 0))
;=> (0 1 2 3 4)
ClojureDocs offers good documentation on both functions: http://clojuredocs.org/clojure_core/clojure.core/iterate and http://clojuredocs.org/clojure_core/clojure.core/partial.

So, @ponzao explained quite well what iterate and partial do, and @yonki made the point that you don't really need them here. If you'd like to explore some more seq functions, it's probably a good idea to try it anyway (although the overhead of lazy sequences may make performance somewhat less than ideal).
Hints:
(iterate #(sqrt n %) initial-approximation) will give you a seq of approximations.
you can use partition to create pairs of subsequent approximations.
discard everything not fulfilling the epsilon condition using drop-while
get the result (a neutral sketch of these seq functions follows below).
It's probably quite rewarding to solve this using sequences since you get in contact with a lot of useful seq functions.
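To get a feel for partition and drop-while without giving the solution away, here is a neutral sketch on a made-up sequence of numbers (nothing to do with square roots):
(partition 2 1 [10 6 5.2 5.01 5.001])
;=> ((10 6) (6 5.2) (5.2 5.01) (5.01 5.001))
(drop-while (fn [[a b]] (> (- a b) 0.1))
            (partition 2 1 [10 6 5.2 5.01 5.001]))
;=> ((5.01 5.001))
;; "get the result" would then be e.g. (second (first ...)) on that seq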
Note: There is a full solution somewhere in the edit history of this answer. Sorry for that, didn't fully get the "homework" part.

I think you're missing the point. You don't need iterate, and you don't need partial either.
If you need to execute some computation until a condition is fulfilled, you can use the easy-to-understand loop/recur construct. loop/recur can be read as: do some computation; check whether the condition is fulfilled; if yes, return the computed value; if not, repeat the computation.
Since you don't want the entire solution, only advice on where to go, have a proper look at loop/recur (a neutral sketch follows) and everything is going to be all right.
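For instance, a loop/recur skeleton on an unrelated toy problem (keep doubling until the value exceeds 100), just to show the shape:
(loop [x 1]
  (if (> x 100)
    x
    (recur (* x 2))))
;=> 128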
@noisesmith made a good point: reduce is not for computing until a condition is fulfilled, but it may be useful when performing a computation with a limited number of steps.
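For example, a fixed number of steps maps naturally onto reduce (again a neutral sketch, not the square-root answer):
(reduce (fn [acc _] (* acc 2)) 1 (range 7))
;=> 128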

Related

Newbie Problem Understanding Clojure Lazy Sequences

I've just started learning Clojure and I'm puzzled by how lazy sequences work. In particular, I don't understand why these 2 expressions produce different results in the repl:
;; infinite range works OK
user=> (take 3 (map #(/ (- % 5)) (range)))
(-1/5 -1/4 -1/3)
;; finite range causes error
user=> (take 3 (map #(/ (- % 5)) (range 1000)))
Error printing return value (ArithmeticException) at clojure.lang.Numbers/divide (Numbers.java:188).
Divide by zero
I take the sequence of integers (0 1 2 3 ...) and apply a function that subtracts 5 and then takes the reciprocal. Obviously this causes a division-by-zero error if it's applied to 5. But since I'm only taking the first 3 values from a lazy sequence I wasn't expecting to see an exception.
The results are what I expected when I use all the integers, but I get an error if I use the first 1000 integers.
Why are the results different?
Clojure 1.1 introduced "chunked" sequences,
This can provide greater efficiency ... Consumption of chunked-seqs as
normal seqs should be completely transparent. However, note that some
sequence processing will occur up to 32 elements at a time. This could
matter to you if you are relying on full laziness to preclude the
generation of any non-consumed results. [Section 2.3 of "Changes to Clojure in Version 1.1"]
In your example, (range) produces a seq that realizes one element at a time, while (range 1000) produces a chunked seq. map consumes a chunked seq a chunk at a time, producing a chunked seq. So when take asks for the first element of the chunked seq, the function passed to map is called 32 times, on the values 0 through 31.
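A quick way to observe this at the REPL (the print side effect is there purely for illustration):
(first (map #(do (print % " ") %) (range 1000)))
;; prints 0 1 2 ... 31 -- the whole first chunk is realized
(first (map #(do (print % " ") %) (range)))
;; prints only 0 -- one element is realized at a time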
I believe it is wisest to code in such a way that the code will still work for any seq-producing function, even if that function produces a chunked seq with arbitrarily large chunks.
Nor do I know whether, if one writes a seq-producing function that is not chunked, one can rely on current and future versions of library functions like map and filter not to convert the seq into a chunked seq.
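A commonly seen workaround (a sketch, not part of clojure.core) is a small unchunk wrapper that re-wraps any seq one element at a time:
(defn unchunk [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))
(first (map #(do (print % " ") %) (unchunk (range 1000))))
;; prints only 0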
But why the difference? What are the implementation details such that (range) and (range 1000) produce different sorts of seqs?
Range is implemented in clojure.core.
(range) is defined as (iterate inc' 0).
Ultimately iterate's functionality is provided by the Iterate class in Iterate.java.
(range end) is defined, when end is a long, as (clojure.lang.LongRange/create end)
The LongRange class lives in LongRange.java.
Looking at the two Java files, it can be seen that the LongRange class implements IChunkedSeq and the Iterate class does not. (Exercise left for the reader.)
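You can also confirm this directly at the REPL with chunked-seq? from clojure.core:
(chunked-seq? (seq (range 1000)))
;=> true
(chunked-seq? (seq (range)))
;=> false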
Speculation
The implementation of clojure.lang.Iterate does not chunk because iterate can be given a function of arbitrary complexity, and the efficiency gained from chunking can easily be overwhelmed by computing more values than needed.
The implementation of (range) relies on iterate instead of a custom optimized Java class that does chunking because the (range) case is not believed to be common enough to warrant optimization.

Using a generative test library in clojure vs build your own using higher order functions

Clojure has a number of libraries for generative testing such as test.check, test.generative or data.generators.
It is possible to use higher order functions to create random data generators that are composable such as:
(defn gen [create-fn content-fn lazy]
  (fn [] (reduce #(create-fn %1 %2) (for [a lazy] (content-fn)))))
(def a (gen str #(rand-nth [\a \b \c]) (range 10)))
(a)
(def b (gen vector #(rand-int 10) (range 2)))
(b)
(def c (gen hash-set b (range (rand-int 10))))
(c)
This is just an example and could be modified with different parameters, filters, partials, etc to create data generating functions which are quite flexible.
Is there something that any of the generative libraries can do that isn't also just as (or more) succinctly achievable by composing some higher order functions?
As a side note to the stackoverflow gods: I don't believe this question is subjective. I'm not asking for an opinion on which library is better. I want to know what specific feature(s) or technique(s) of any/all data generative libraries differentiate them from composing vanilla higher order functions. An example answer should illustrate generating random data using any of the libraries with an explanation as to why this would be more complex to do by composing HOFs in the way I have illustrated above.
test.check does this way better. Most notably, suppose you generate a random list of 100 elements, and your test fails: something about the way you handled that list is wrong. What now? How do you find the basic bug? It surely doesn't depend on exactly those 100 inputs; you could probably reproduce it with a list of just a few elements, or even an empty list if something is wrong with your base case.
The feature that makes all this actually useful isn't the random generators, it is the "shrinking" of those generators. Once test.check finds an input that breaks your tests, it tries to simplify the input as much as possible while still making your tests break. For a list of integers, the shrinks are simple enough you could maybe do them yourself: remove any element, or decrease any element. Even that may not be true: choosing the order to do shrinks in is probably a harder problem than I realize. And for larger inputs, like a list of maps from vectors to a 3-tuple of [string, int, keyword], you'll find it totally unmanageable, whereas test.check has done all the hard work already.
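To make the shrinking concrete, here is a minimal test.check session (the property name is made up; quick-check, for-all and the generators are the library's standard API):
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; a deliberately broken property: "no vector of ints contains 42"
(def prop-no-42
  (prop/for-all [v (gen/vector gen/int)]
    (not (some #{42} v))))

(tc/quick-check 100 prop-no-42)
;; => {:result false, ..., :fail [[... some larger vector containing 42 ...]],
;;     :shrunk {:smallest [[42]], ...}}
;; (it may take more than 100 trials before 42 is generated)
Whatever messy input first triggered the failure, test.check shrinks it down to the minimal counterexample [42].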

Is there a more idiomatic way to get N random elements of a collection in Clojure?

I'm currently doing this: (repeatedly n #(rand-nth (seq coll))), but I suspect there might be a more idiomatic way, for two reasons:
I’ve found that there’s frequently a more concise and expressive alternative to using short anonymous functions, e.g. partial
the docstring for repeatedly says “presumably with side effects”, implying that it’s not intended to be used to produce values
I suppose I could figure out a way to use reduce but that seems like it would be tricky and less efficient, as it would have to process the entire collection, since reduce is not lazy.
An easy solution but not optimal for big collections could be:
(take n (shuffle coll))
Has the "advantage" of not repeating elements. Also you could implement a lazy-shuffle but it will involve more code.
I know it's not exactly what you're asking - but if you're doing a lot of sampling and statistical work, you might be interested in Incanter ([incanter "1.5.2"]).
Incanter provides the function sample, which provides options for sample size, and replacement.
(require '[incanter.stats :refer [sample]])
(sample [1 2 3 4 5 6 7] :size 5 :replacement false)
; => (1 5 6 2 7)

Clojure: reduce vs. apply

I understand the conceptual difference between reduce and apply:
(reduce + (list 1 2 3 4 5))
; translates to: (+ (+ (+ (+ 1 2) 3) 4) 5)
(apply + (list 1 2 3 4 5))
; translates to: (+ 1 2 3 4 5)
However, which one is more idiomatic clojure? Does it make much difference one way or the other? From my (limited) performance testing, it seems reduce is a bit faster.
reduce and apply are of course only equivalent (in terms of the ultimate result returned) for associative functions which need to see all their arguments in the variable-arity case. When they are result-wise equivalent, I'd say that apply is always perfectly idiomatic, while reduce is equivalent -- and might shave off a fraction of a blink of an eye -- in a lot of the common cases. What follows is my rationale for believing this.
+ is itself implemented in terms of reduce for the variable-arity case (more than 2 arguments). Indeed, this seems like an immensely sensible "default" way to go for any variable-arity, associative function: reduce has the potential to perform some optimisations to speed things up -- perhaps through something like internal-reduce, a 1.2 novelty recently disabled in master, but hopefully to be reintroduced in the future -- which it would be silly to replicate in every function which might benefit from them in the vararg case. In such common cases, apply will just add a little overhead. (Note it's nothing to be really worried about.)
On the other hand, a complex function might take advantage of some optimisation opportunities which aren't general enough to be built into reduce; then apply would let you take advantage of those while reduce might actually slow you down. A good example of the latter scenario occurring in practice is provided by str: it uses a StringBuilder internally and will benefit significantly from the use of apply rather than reduce.
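A minimal sketch of the two call shapes for str (same result either way; the difference is in the intermediate work):
(def words (repeat 1000 "x"))
(apply str words)   ; one pass through a single StringBuilder
(reduce str words)  ; allocates a fresh intermediate String at every step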
So, I'd say use apply when in doubt; and if you happen to know that it's not buying you anything over reduce (and that this is unlikely to change very soon), feel free to use reduce to shave off that diminutive unnecessary overhead if you feel like it.
For newbies looking at this answer,
be careful, they are not the same:
(apply hash-map [:a 5 :b 6])
;= {:a 5, :b 6}
(reduce hash-map [:a 5 :b 6])
;= {{{:a 5} :b} 6}
It doesn't make a difference in this case, because + is a special case that can apply to any number of arguments. Reduce is a way to apply a function that expects a fixed number of arguments (2) to an arbitrarily long list of arguments.
Opinions vary. In the greater Lisp world, reduce is definitely considered more idiomatic. First, there are the variadic issues already discussed. Also, some Common Lisp compilers will actually fail when apply is applied to very long lists, because of how they handle argument lists.
Amongst Clojurists in my circle, though, using apply in this case seems more common. I find it easier to grok and prefer it also.
I normally find myself preferring reduce when acting on any kind of collection - it performs well, and is a pretty useful function in general.
The main reason I would use apply is if the parameters mean different things in different positions, or if you have a couple of initial parameters but want to get the rest from a collection, e.g.
(apply + 1 2 other-number-list)
In this specific case I prefer reduce because it's more readable: when I read
(reduce + some-numbers)
I know immediately that you're turning a sequence into a value.
With apply I have to consider which function is being applied: "ah, it's the + function, so I'm getting... a single number". Slightly less straightforward.
When using a simple function like +, it really doesn't matter which one you use.
In general, the idea is that reduce is an accumulating operation. You present the current accumulated value and one new value to your accumulating function; the result of the function is the accumulated value for the next iteration. So, your iterations look like:
cum-val[i+1] = F( cum-val[i], input-val[i] ) ; please forgive the java-like syntax!
For apply, the idea is that you are attempting to call a function expecting a number of scalar arguments, but they are currently in a collection and need to be pulled out. So, instead of saying:
(def vals [val1 val2 val3])
(some-fn (vals 0) (vals 1) (vals 2))
we can say:
(apply some-fn vals)
and it is converted to be equivalent to:
(some-fn val1 val2 val3)
So, using "apply" is like "removing the parentheses" around the sequence.
A bit late to the topic, but I did a simple experiment after reading this. Here are the results from my REPL; I can't deduce much from them, but it seems some sort of caching kicks in between reduce and apply.
user=> (time (reduce + (range 1e3)))
"Elapsed time: 5.543 msecs"
499500
user=> (time (apply + (range 1e3)))
"Elapsed time: 5.263 msecs"
499500
user=> (time (apply + (range 1e4)))
"Elapsed time: 19.721 msecs"
49995000
user=> (time (reduce + (range 1e4)))
"Elapsed time: 1.409 msecs"
49995000
user=> (time (reduce + (range 1e5)))
"Elapsed time: 17.524 msecs"
4999950000
user=> (time (apply + (range 1e5)))
"Elapsed time: 11.548 msecs"
4999950000
Looking at the source code of Clojure's reduce, it's pretty clean recursion with internal-reduce; I didn't find anything on the implementation of apply, though. Clojure's implementation of + for apply internally invokes reduce, which gets cached by the REPL, which seems to explain the fourth call. Can someone clarify what's really happening here?
The beauty of apply is that the given function (+ in this case) can be applied to an argument list formed by prepending any intervening arguments to a final collection. reduce is an abstraction for processing the items of a collection by applying the function to each one, and it doesn't work with the variable-arguments case:
(apply + 1 2 3 [3 4])
=> 13
(reduce + 1 2 3 [3 4])
ArityException Wrong number of args (5) passed to: core/reduce clojure.lang.AFn.throwArity (AFn.java:429)
A bit late, but...
In this case, there is not a big difference. But in general they are not equivalent. Furthermore, reduce can be more performant. Why?
reduce checks whether a collection or type implements the IReduce interface. That means the type knows how to provide its values to the reducing function in the most performant way.
reduce can be stopped prematurely by returning a Reduced value (see the sketch just below).
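For instance, early termination via reduced works even on an infinite seq (a quick REPL sketch):
(reduce (fn [acc x]
          (if (> acc 100)
            (reduced acc)
            (+ acc x)))
        (range))
;=> 105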
apply, on the other hand, goes through applyToHelper, which dispatches to the right arity by counting the args, unpacking the values from the collection.
Is it a big performance impact? Probably not.
My opinion is as others already pointed out. Use reduce if you want to semantically "reduce" a collection to a single value. Otherwise use apply.

Clojure: Avoiding stack overflow in Sieve of Eratosthenes?

Here's my implementation of the Sieve of Eratosthenes in Clojure (based on the SICP lesson on streams):
(defn nats-from [n]
  (iterate inc n))

(defn divide? [p q]
  (zero? (rem q p)))

(defn sieve [stream]
  (lazy-seq (cons (first stream)
                  (sieve (remove #(divide? (first stream) %)
                                 (rest stream))))))

(def primes (sieve (nats-from 2)))
Now, it's all OK when I take the first 100 primes:
(take 100 primes)
But if I try to take the first 1000 primes, the program breaks with a stack overflow.
I'm wondering: is it possible to somehow change the sieve function to become tail-recursive and still preserve the "streaminess" of the algorithm?
Any help?
Firstly, this is not the Sieve of Eratosthenes... see my comment for details.
Secondly, apologies for the close vote, as your question is not an actual duplicate of the one I pointed to... My bad.
Explanation of what is happening
The difference lies of course in the fact that you are trying to build an incremental sieve, where the range over which the remove call works is infinite and thus it's impossible to just wrap a doall around it. The solution is to implement one of the "real" incremental SoEs from the paper I seem to link to pretty frequently these days -- Melissa E. O'Neill's The Genuine Sieve of Eratosthenes.
A particularly beautiful Clojure sieve implementation of this sort has been written by Christophe Grand and is available here for the admiration of all who might be interested. Highly recommended reading.
As for the source of the issue, the questions I originally thought yours was a duplicate of contain explanations which should be useful to you: see here and here. Once again, sorry for the rash vote to close.
Why tail recursion won't help
Since the question specifically mentions making the sieving function tail-recursive as a possible solution, I thought I would address that here: functions which transform lazy sequences should not, in general, be tail recursive.
This is quite an important point to keep in mind and one which trips up many an inexperienced Clojure (or Haskell) programmer. The reason is that a tail-recursive function of necessity only returns its value once it is "ready" -- at the very end of the computation. (An iterative process can, at the end of any particular iteration, either return a value or continue on to the next iteration.) In contrast, a function which generates a lazy sequence should immediately return a lazy sequence object which encapsulates bits of code which can be asked to produce the head or tail of the sequence whenever that's desired.
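A toy contrast (nothing to do with the sieve itself; the names are made up): the loop/recur version below returns nothing until the whole result is built, and so cannot describe an infinite result, while the lazy version returns at once and computes elements on demand.
;; tail-recursive: the whole result is built before anything is returned
(defn squares-up-to [n]
  (loop [i 1, acc []]
    (if (> i n)
      acc
      (recur (inc i) (conj acc (* i i))))))

;; lazy: returns a seq object immediately; elements are computed as requested
(defn squares []
  (map #(* % %) (iterate inc 1)))

(take 5 (squares))
;=> (1 4 9 16 25)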
Thus the answer to the problem of stacking lazy transformations is not to make anything tail recursive, but to merge the transformations. In this particular case, the best performance can be obtained by using a custom scheme to fuse the filtering operations, based on priority queues or maps (see the aforementioned article for details).