How to create a lazy sequence of random numbers in clojure

How to create a lazy sequence of random numbers in clojure - clojure

How can I create a lazy sequence of random numbers?
My current code:
(import '(java.util Random))
(def r (new Random))
(defn rnd [_]
(.nextInt r 10))
(defn random-numbers [max]
(iterate #(.nextInt r max) (.nextInt r max)))
(println (take 5 (random-numbers 10)))
executing it throws an exception:
(Exception in thread "main" clojure.lang.ArityException: Wrong number of args (1) passed to: user$random-numbers$fn
at clojure.lang.AFn.throwArity(AFn.java:437)
at clojure.lang.AFn.invoke(AFn.java:39)
at clojure.core$iterate$fn__3870.invoke(core.clj:2596)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:466)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$take$fn__3836.invoke(core.clj:2499)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.Cons.next(Cons.java:39)
at clojure.lang.RT.next(RT.java:580)
at clojure.core$next.invoke(core.clj:64)
at clojure.core$nthnext.invoke(core.clj:2752)
at clojure.core$print_sequential.invoke(core_print.clj:57)
at clojure.core$fn__4990.invoke(core_print.clj:140)
at clojure.lang.MultiFn.invoke(MultiFn.java:167)
at clojure.core$pr_on.invoke(core.clj:3264)
at clojure.core$pr.invoke(core.clj:3276)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.core$apply.invoke(core.clj:600)
at clojure.core$prn.doInvoke(core.clj:3309)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at clojure.core$apply.invoke(core.clj:600)
at clojure.core$println.doInvoke(core.clj:3329)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at user$eval7.invoke(testing.clj:12)
at clojure.lang.Compiler.eval(Compiler.java:6465)
at clojure.lang.Compiler.load(Compiler.java:6902)
at clojure.lang.Compiler.loadFile(Compiler.java:6863)
at clojure.main$load_script.invoke(main.clj:282)
at clojure.main$script_opt.invoke(main.clj:342)
at clojure.main$main.doInvoke(main.clj:426)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.lang.Var.invoke(Var.java:401)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.Var.applyTo(Var.java:518)
at clojure.main.main(main.java:37)
[Finished in 3.8s with exit code 1]
Is this a completey wrong approach, because I am using state, namely r is an instance of java.util.Random, or is it just a nooby syntax error?
I just studing clojure on myself, so please bear with me :) .

repeatedly is great for repeatedly running a function and gathering the results in a seq
user> (take 10 (repeatedly #(rand-int 42)))
(14 0 38 14 37 6 37 32 38 22)
as for your original approach: iterate takes an argument, feeds it to a function and then takes the result of that and passes it back to the same function. I't not quite what you want here because the function you are using doesn't need any arguments. You can of course give it a placeholder for that argument and get it working, though repeatedly is likely a better fit.
(defn random-numbers [max]
(iterate (fn [ignored-arg] (.nextInt r max)) (.nextInt r max)))
#'user/random-numbers
user> (println (take 5 (random-numbers 10)))
(3 0 0 2 0)

As a general guide, do not start with a class/function from Java. Look first at Clojure's core functions and namespaces in clojure.* (and then at the contributed namespaces which are now in modular repos: see http://dev.clojure.org/display/doc/Clojure+Contrib); rand-int itself is readily available in clojure-core. So, how would one start the search for a random number helper?
With Clojure 1.3 onwards, you can "use" the clojure-repl namespace to have access to the handy apropos function (used in the same spirit as the apropos command in Unix/linux); apropos returns all matching defs in namespaces loaded so far.
user> (use 'clojure.repl)
nil
user> (apropos "rand")
(rand rand-int rand-nth)
The find-doc function in clojure.repl is also another alternative.
The other pointer is to search at www.clojuredocs.org which includes usage-examples for the funcs in clojure core and clojure.*.

For testing purposes at least, it's good to be able to repeat a "random" sequence by seeding the generator. The new spec library reports the results of its tests this way.
The native Clojure functions don't let you seed a random sequence, so we have to resort to the underlying Java functions:
(defn random-int-seq
"Generates a reproducible sequence of 'random' integers (actually longs)
from an integer (long) seed. Supplies its own random seed if need be."
([] (random-int-seq (rand-int Integer/MAX_VALUE)))
([seed]
(let [gen (java.util.Random. seed)]
(repeatedly #(.nextLong gen)))))

Lets use a transducer shall we?
(def xf (map (fn [x] (* 10 (rand)))))
We can also use rand-int as:
(def xf (map (fn [x] (* 10 (rand-int 10)))))
To use this to generate lazy sequence, we'll use sequence
(sequence xf (range))
This returns a lazy sequence of random numbers. To get a complete sequence of n numbers we can use take as:
(take n (sequence xf (range)))

Related

Infinitely recursive lazy sequence appears as empty sequence in Clojure

Suppose I wrote:
(def stuff
(lazy-seq stuff))
When I ask for the value of stuff in REPL, I would expect it to be stuck in an infinite loop, since I'm defining stuff as itself(which pretty much says nothing about this sequence at all).
However, I got an empty sequence instead.
> stuff
()
Why?
Edit: By "recursive" I meant recursive data, not recursive functions.
I'm still confused about why the sequence terminated. As a comparison, the following code is stuck in infinite loop(and blows the stack).
(def stuff
(lazy-seq (cons (first stuff) [])))
Some background: This question arises from me trying to implement a prime number generator using the sieve of Eratosthenes. My first attempt was:
(def primes
(lazy-seq (cons 2
(remove (fn [x]
(let [ps (take-while #(< % x) primes)]
(some #(zero? (mod x %)) ps)))
(range 3 inf))))) ;; My customized range function that returns an infinite sequence
I figured that it would never work, since take-while would keep asking for more primes even if they could not be calculated yet. So it surprised me when it worked pretty well.
> (take 20 primes)
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)

First, each lazy seq can only be realized once. Second, your definition of stuff doesn't use recursion — stuff isn't a function. If you look at the definition of lazy-seq, you can see that your definition of stuff expands to
(def stuff (new clojure.lang.LazySeq (fn* [] stuff)))
When the fn arg to the clojure.lang.LazySeq constructor is invoked, it returns the same lazy seq that has already been realized. So, when you attempt to print the lazy seq to the REPL, iteration immediately terminates and returns nil.
You can verify that the type of stuff is clojure.lang.LazySeq
user=> (type stuff)
clojure.lang.LazySeq
and that after printing stuff to the REPL, stuff has been realized
user=> (realized? stuff)
false
user=> stuff
()
user=> (realized? stuff)
true
You can use recursion to get the effect that you expected
user=> (defn stuff
[]
(lazy-seq (stuff)))
#'user/stuff
user=> (stuff) ;; Hangs forever.

Why does clojure attempt to resolve this symbol?

I am working through the Armstrong Numbers exercise on Exercism's Clojure track. An armstrong number is a number equal to the sum of its digits raised to the power of the number of digits. 153 is an Armstrong number, because: 153 = 1^3 + 5^3 + 3^3 = 1 + 125 + 27 = 153. 154 is not an Armstrong number, because: 154 != 1^3 + 5^3 + 4^3 = 1 + 125 + 64 = 190.
The test file for this exercise will call the armstrong? function, pass in a number, and expects true if the number is an Armstrong number. I have already solved the problem with this code:
(ns armstrong-numbers)
(defn pow [a b]
(reduce * 1 (repeat b a)))
(defn armstrong? [num]
(->> num
(iterate #(quot % 10))
(take-while pos?)
(map #(mod % 10))
((fn [sq]
(map #(pow % (count sq))
sq)))
(apply +)
(= num)))
but now I am trying to refactor the code. This is what I would like the code to look like:
(ns armstrong-numbers
(:require [swiss.arrows :refer :all]))
(defn pow [a b]
(reduce * 1 (repeat b a)))
(defn armstrong? [num]
(-<>> num
(iterate #(quot % 10))
(take-while pos?)
(map #(mod % 10))
(map #(pow % (count <>))
<>)
(apply +)
(= num)))
A link to the package required above: https://github.com/rplevy/swiss-arrows.
In the first code section, I create an implicit function within the thread-last macro because the sequence returned from the map form is needed in two different places in the second map form. That implicit function works just fine, but I just wanted to make the code sleeker. But when I test the second code block, I get the following error: java.lang.RuntimeException: Unable to resolve symbol: <> in this context.
I get this error whether I use #(), partial , or fn inside the second map form. I have figured out that because all of the preceding are macros (or a special form in fns case), they cannot resolve <> because it's only meaningful to the -<>> macro, which is called at a later point in macroexpansion. But why do #(), partial, and fn attempt to resolve that character at all? As far as I can see, they have no reason to know what the symbol is, or what it's purpose is. All they need to do is return that symbol rearranged into the proper s-expressions. So why does clojure attempt to resolve this (<>) symbol?

The <> symbol is only valid in the topmost level of a clause (plus literals for set, map, vector directly therein). -<> and -<>> do not establish bindings (as in let) for <>, but do code insertion at macro expansion time.
This code insertion is done only at toplevel, because making it work deeper is not only much more complex (you need a so-called code walker), but also raises interesting questions regarding the nesting of arrow forms. This complexity is likely not worth it, for such a simple effect.
If you want a real binding, you can use as-> (from clojure.core).

The documentation for -<>> is quite clear that it doesn't behave the way you wish it did:
"the 'diamond spear': top-level insertion of x in place of single
positional '<>' symbol within the threaded form if present, otherwise
mostly behave as the thread-last macro. Also works with hash literals
and vectors."
It performs replacement:
Of a single symbol
At the top level of the threaded form
Your example wishing to use it for multiple symbols, nested within subforms, will thus not work.

You have made a mistake leaving off the <> symbol in most of your forms in the failing case. Here is a working version (using the similar it-> macro in place of the Swiss Arrows). I also cleaned up the pow a bit.
(defn pow-int [a b] (Math/round (Math/pow a b)))
(defn armstrong? [num]
(it-> num
(iterate #(quot % 10) it)
(take-while pos? it)
(map #(mod % 10) it)
(map #(pow-int % (count it)) it)
(apply + it)
(= it num)))
(armstrong? 153) => true
(armstrong? 154) => false
You can find the docs on it-> here.
If you leave off the (collection) arg to a function like map, it returns a transducer; from the docs:
(map f)(map f coll)(map f c1 c2)(map f c1 c2 c3)(map f c1 c2 c3 & colls)
Returns a lazy sequence consisting of the result of applying f to
the set of first items of each coll, followed by applying f to the
set of second items in each coll, until any one of the colls is
exhausted. Any remaining items in other colls are ignored. Function
f should accept number-of-colls arguments. Returns a transducer when
no collection is provided.
And, always refer to the Clojure CheatSheet!

Reducing memory usage in a simple Clojure program

I'm trying to solve the Fibonacci problem on codeeval. At first I wrote it in the usual recursive way and, although I got the right output, I failed the test since it used ~70MB of memory and the usage limit is 20MB. I found an approximation formula and rewrote it to use that thinking it was the heavy stack usage that was causing me to exceed the limit. However, there doesn't seem to be any reduction.
(ns fibonacci.core
(:gen-class))
(use 'clojure.java.io)
(def phi (/ (+ 1 (Math/sqrt 5)) 2))
(defn parse-int
"Find the first integer in the string"
[int-str]
(Integer. (re-find #"\d+" int-str)))
(defn readfile
"Read in a file of integers separated by newlines and return them as a list"
[filename]
(with-open [rdr (reader filename)]
(reverse (map parse-int (into '() (line-seq rdr))))))
(defn fibonacci
"Calculate the Fibonacci number using an approximatation of Binet's Formula. From
http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormula.html"
[term]
(Math/round (/ (Math/pow phi term) (Math/sqrt 5))))
(defn -main
[& args]
(let [filename (first args)
terms (readfile filename)]
(loop [terms terms]
((comp println fibonacci) (first terms))
(if (not (empty? (rest terms)))
(recur (rest terms))))))
(-main (first *command-line-args*))
Sample input:
0
1
2
3
4
5
6
7
8
9
10
11
12
13
50
Sample output:
0
1
1
2
3
5
8
13
21
34
55
89
144
233
12586269025
Their input is clearly much larger than this and I don't get to see it. How can I modify this code to use dramatically less memory?
Edit: It's impossible. Codeeval knows about the problem and is working on it. See here.

Codeeval is broken for Clojure. There are no accepted Clojure solutions listed and there is a message from the company two months ago saying they are still working on the problem.
Darn.

Presumably the problem is that the input file is very large, and your readfile function is creating the entire list of lines in memory.
The way to avoid that is to process a single line at a time, and not hold on to the whole sequence, something like:
(defn -main [& args]
(let [filename (first args)]
(with-open [rdr (reader filename)]
(doseq [line (line-seq rdr)]
(-> line parse-int fibonacci println)))))

Are you getting a stack overflow error? If so, Clojure supports arbitrary precision, so it might help to try using /' instead of /. The apostrophe versions of +, -, * and / are designed to coerce numbers into BigInts to prevent stack overflow.
See this question.

(Another) Stack overflow on loop-recur in Clojure

Similar questions: One, Two, Three.
I am thoroughly flummoxed here. I'm using the loop-recur form, I'm using doall, and still I get a stack overflow for large loops. My Clojure version is 1.5.1.
Context: I'm training a neural net to mimic XOR. The function xor is the feed-forward function, taking weights and input and returning the result; the function b-xor is the back-propagation function that returns updated weights given the results of the last call to xor.
The following loop runs just fine, runs very fast, and returns a result, and based off of the results it returns, it is training the weights perfectly:
(loop [res 1 ; <- initial value doesn't matter
weights xorw ; <- initial pseudo-random weights
k 0] ; <- count
(if (= k 1000000)
res
(let [n (rand-int 4)
r (doall (xor weights (first (nth xorset n))))]
(recur (doall r)
(doall (b-xor weights r (second (nth xorset n))))
(inc k)))))
But of course, that only gives me the result of the very last run. Obviously I want to know what weights have been trained to get that result! The following loop, with nothing but the return value changed, overflows:
(loop [res 1
weights xorw
k 0]
(if (= k 1000000)
weights ; <- new return value
(let [n (rand-int 4)
r (doall (xor weights (first (nth xorset n))))]
(recur (doall r)
(doall (b-xor weights r (second (nth xorset n))))
(inc k)))))
This doesn't make sense to me. The entirety of weights gets used in each call to xor. So why could I use weights internally but not print it to the REPL?
And as you can see, I've stuck doall in all manner of places, more than I think I should need. XOR is a toy example, so weights and xorset are both very small. I believe the overflow occurs not from the execution of xor and b-xor, but when the REPL tries to print weights, for these two reasons:
(1) this loop can go up to 1500 without overflowing the stack.
(2) the time the loop runs is consistent with the length of the loop; that is, if I loop to 5000, it runs for half a second and then prints a stack overflow; if I loop to 1000000, it runs for ten seconds and then prints a stack overflow -- again, only if I print weights and not res at the end.
(3) EDIT: Also, if I just wrap the loop in (def w ... ), then there is no stack overflow. Attempting to peek at the resulting variable does, though.
user=> (clojure.stacktrace/e)
java.lang.StackOverflowError: null
at clojure.core$seq.invoke (core.clj:133)
clojure.core$map$fn__4211.invoke (core.clj:2490)
clojure.lang.LazySeq.sval (LazySeq.java:42)
clojure.lang.LazySeq.seq (LazySeq.java:60)
clojure.lang.RT.seq (RT.java:484)
clojure.core$seq.invoke (core.clj:133)
clojure.core$map$fn__4211.invoke (core.clj:2490)
clojure.lang.LazySeq.sval (LazySeq.java:42)
nil
Where is the lazy sequence?
If you have suggestions for better ways to do this (this is just my on-the-fly REPL code), that'd be great, but I'm really looking for an explanation as to what is happening in this case.
EDIT 2: Definitely (?) a problem with the REPL.
This is bizarre. weights is a list containing six lists, four of which are empty. So far, so good. But trying to print one of these empty lists to the screen results in a stack overflow, but only the first time. The second time it prints without throwing any errors. Printing the non-empty lists produces no stack overflow. Now I can move on with my project, but...what on earth is going on here? Any ideas? (Please pardon the following ugliness, but I thought it might be helpful)
user=> (def ww (loop etc. etc. ))
#'user/ww
user=> (def x (first ww))
#'user/x
user=> x
StackOverflowError clojure.lang.RT.seq (RT.java:484)
user=> x
()
user=> (def x (nth ww 3))
#'user/x
user=> x
(8.47089879874061 -8.742792338501289 -4.661609290853221)
user=> (def ww (loop etc. etc. ))
#'user/ww
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
(() () () (8.471553034351501 -8.741870954507117 -4.661171802683782) () (-8.861958958234174 8.828933147027938 18.43649480263751 -4.532462509591159))

If you call doall on a sequence that contains more lazy sequences, doall does not recursively iterate through the subsequences. In this particular case, the return value of b-xor contained empty lists that were defined lazily from previous empty lists defined lazily from previous empty lists, and so on. All I had to do was add a single doall to the map that produced the empty lists (in b-xor), and the problem disappeared. This loop (with all of the doall's removed) never overflows:
(loop [res 1
weights xorw
k 0]
(if (= k 1000000)
weights
(let [n (rand-int 4)
r (xor weights (first (nth xorset n)))]
(recur r
(b-xor weights r (second (nth xorset n)))
(inc k)))))
Okay. So I have an answer. I hope this is helpful to some other poor soul who thought he'd solved his lazy sequencing issues with a badly-placed doall.
This still leaves me with a question about the REPL, but it should probably go under a different question so it won't have all of the baggage of this problem with it. You can see in my question above that the empty lists were evaluated correctly. Why did printing them the first time throw an exception? I'm going to experiment a bit with this, and if I can't figure it out...new question!

I have two versions of a function to count leading hash(#) characters, which is better?

I wrote a piece of code to count the leading hash(#) character of a line, which is much like a heading line in Markdown
### Line one -> return 3
######## Line two -> return 6 (Only care about the first 6 characters.
Version 1
(defn
count-leading-hash
[line]
(let [cnt (count (take-while #(= % \#) line))]
(if (> cnt 6) 6 cnt)))
Version 2
(defn
count-leading-hash
[line]
(loop [cnt 0]
(if (and (= (.charAt line cnt) \#) (< cnt 6))
(recur (inc cnt))
cnt)))
I used time to measure both tow implementations, found that the first version based on take-while is 2x faster than version 2. Taken "###### Line one" as input, version 1 took 0.09 msecs, version 2 took about 0.19 msecs.
Question 1. Is it recur that slows down the second implementation?
Question 2. Version 1 is closer to functional programming paradigm , is it?
Question 3. Which one do you prefer? Why? (You're welcome to write your own implementation.)
--Update--
After reading the doc of cloujure, I came up with a new version of this function, and I think it's much clear.
(defn
count-leading-hash
[line]
(->> line (take 6) (take-while #(= \# %)) count))

IMO it isn't useful to take time measurements for small pieces of code
Yes, version 1 is more functional
I prefer version 1 because it is easier to spot errors
I prefer version 1 because it is less code, thus less cost to maintain.
I would write the function like this:
(defn count-leading-hash [line]
(count (take-while #{\#} (take 6 line))))

No, it's the reflection used to invoke .charAt. Call (set! *warn-on-reflection* true) before creating the function, and you'll see the warning.
Insofar as it uses HOFs, sure.
The first, though (if (> cnt 6) 6 cnt) is better written as (min 6 cnt).

1: No. recur is pretty fast. For every function you call, there is a bit of overhead and "noise" from the VM: the REPL needs to parse and evaluate your call for example, or some garbage collection might happen. That's why benchmarks on such tiny bits of code don't mean anything.
Compare with:
(defn
count-leading-hash
[line]
(let [cnt (count (take-while #(= % \#) line))]
(if (> cnt 6) 6 cnt)))
(defn
count-leading-hash2
[line]
(loop [cnt 0]
(if (and (= (.charAt line cnt) \#) (< cnt 6))
(recur (inc cnt))
cnt)))
(def lines ["### Line one" "######## Line two"])
(time (dorun (repeatedly 10000 #(dorun (map count-leading-hash lines)))))
;; "Elapsed time: 620.628 msecs"
;; => nil
(time (dorun (repeatedly 10000 #(dorun (map count-leading-hash2 lines)))))
;; "Elapsed time: 592.721 msecs"
;; => nil
No significant difference.
2: Using loop/recur is not idiomatic in this instance; it's best to use it only when you really need it and use other available functions when you can. There are many useful functions that operate on collections/sequences; check ClojureDocs for a reference and examples. In my experience, people with imperative programming skills who are new to functional programming use loop/recur a lot more than those who have a lot of Clojure experience; loop/recur can be a code smell.
3: I like the first version better. There are lots of different approaches:
;; more expensive, because it iterates n times, where n is the number of #'s
(defn count-leading-hash [line]
(min 6 (count (take-while #(= \# %) line))))
;; takes only at most 6 characters from line, so less expensive
(defn count-leading-hash [line]
(count (take-while #(= \# %) (take 6 line))))
;; instead of an anonymous function, you can use `partial`
(defn count-leading-hash [line]
(count (take-while (partial = \#) (take 6 line))))
edit:
How to decide when to use partial vs an anonymous function?
In terms of performance it doesn't matter, because (partial = \#) evaluates to (fn [& args] (apply = \# args)). #(= \# %) translates to (fn [arg] (= \# arg)). Both are very similar, but partial gives you a function that accepts an arbitrary number of arguments, so in situations where you need it, that's the way to go. partial is the λ (lambda) in lambda calculus. I'd say, use what's easier to read, or partial if you need a function with an arbitrary number of arguments.

Micro-benchmarks on the JVM are almost always misleading, unless you really know what you're doing. So, I wouldn't put too much weight on the relative performance of your two solutions.
The first solution is more idiomatic. You only really see explicit loop/recur in Clojure code when it's the only reasonable alternative. In this case, clearly, there is a reasonable alternative.
Another option, if you're comfortable with regular expressions:
(defn count-leading-hash [line]
(count (or (re-find #"^#{1,6}" line) "")))

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to create a lazy sequence of random numbers in clojure - clojure

Related

Infinitely recursive lazy sequence appears as empty sequence in Clojure

Why does clojure attempt to resolve this symbol?

Reducing memory usage in a simple Clojure program

(Another) Stack overflow on loop-recur in Clojure

I have two versions of a function to count leading hash(#) characters, which is better?

Categories

Resources