With Clojure, how do I generate a random long number? I know Clojure has a rand-int function, but it only works for values in the int range. When the given number is a long, I get this REPL error:
IllegalArgumentException Value out of range for int: 528029243649 clojure.lang.RT.intCast (RT.java:1205)
If you take a look at the source of rand-int
(defn rand-int
  "Returns a random integer between 0 (inclusive) and n (exclusive)."
  [n] (int (rand n)))
You can do a similar thing
(long (rand n))
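As a minimal sketch of that idea, the expression can be wrapped in a helper. The name rand-long is made up for illustration; it is not part of clojure.core. Keep in mind that rand returns a double, so for n above 2^53 not every long below n can be produced.

```clojure
;; Hypothetical helper: mirrors rand-int but coerces to long instead of int.
(defn rand-long
  "Returns a random long between 0 (inclusive) and n (exclusive)."
  [n]
  (long (rand n)))

;; The value from the error message above is accepted without an exception.
(rand-long 528029243649)
```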
Clojure's rand and rand-int use java.util.Random as the underlying random number generator. If your application depends heavily on random numbers, you might want to consider a higher-quality random number generator written in Java, such as MersenneTwisterFast. It has a nextLong() method, and it's very easy to use from Clojure. Java's standard class SecureRandom might be worth considering, too; it's designed for different purposes than the Mersenne Twister. There are other good Java random number generators available; which one fits depends on what you're using the random numbers for. For occasional use of random numbers, java.util.Random might be just fine. There are additional options mentioned in comments by others.
I'll describe use of MersenneTwisterFast. Using the other classes I mentioned would be essentially the same, but without the initial steps.
With Leiningen, add something like this to project.clj:
:java-source-paths ["src/java"]
and then put the Java source for MersenneTwisterFast.java in src/java/ec/util. Then you can do this:
(ns my.namespace
  (:import [ec.util MersenneTwisterFast]))
(def rng (MersenneTwisterFast. 42)) ; Specify a different seed, e.g. from system time.
(defn next-long [] (.nextLong rng))
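For the JDK classes mentioned above, no extra source files are needed. The following is a rough sketch, assuming only java.util.Random and java.security.SecureRandom from the standard library; the names seeded-rng, secure-rng and next-long-from are invented for this example.

```clojure
(import '[java.util Random]
        '[java.security SecureRandom])

;; java.util.Random with a fixed seed gives a reproducible stream.
(def seeded-rng (Random. 42))

;; SecureRandom seeds itself; it is a subclass of java.util.Random.
(def secure-rng (SecureRandom.))

(defn next-long-from
  "Returns the next long from any java.util.Random instance."
  [^Random rng]
  (.nextLong rng))

(next-long-from seeded-rng)
(next-long-from secure-rng)
```

The type hint on rng avoids a reflective call, which matters if next-long-from is called in a tight loop.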
I'm new to Clojure, and as quick practice I wrote a function that is supposed to go through the Fibonacci sequence until it exceeds 999999999, and to do that 1 billion times (it does some extra math too, but that's not very important). I've written something that does the same in Java, and while I understand that by nature Clojure is slower than Java, the Java program took 35 seconds to complete while the Clojure one took 27 minutes, which I found very surprising (considering Node.js was able to complete it in about 8 minutes). I compiled the class with the REPL and ran it with this Java command: java -cp `clj -Spath` fib. I'm really unsure as to why this was so slow.
(defn fib
  []
  (def iter (atom (long 0)))
  (def tester (atom (long 0)))
  (dotimes [n 1000000000]
    (loop [prev (long 0)
           curr (long 1)]
      (when (<= prev 999999999)
        (swap! iter inc)
        (if (even? @iter)
          (swap! tester + prev)
          (swap! tester - prev))
        (recur curr (+ prev curr)))))
  (println (str "Done in: " @iter " Test: " @tester)))
Here is my Java method for reference
public static void main(String[] args) {
    long iteration = 0;
    int test = 0;
    for (int n = 0; n < 1000000000; n++) {
        int x = 0, y = 1;
        while (true) {
            iteration += 1;
            if (iteration % 2 == 0) {
                test += x;
            }
            else {
                test -= x;
            }
            int i = x + y;
            x = y;
            y = i;
            if (x > 999999999) { break; }
        }
    }
    System.out.println("iter: " + iteration + " " + test);
}
One thing a lot of newcomers to Clojure don't realize is that Clojure is a higher-level language by default. That means it will force you into implementations that will handle overflow on arithmetic, will treat numbers as objects you can extend, will prevent you from mutating any variable, will force you to have thread-safe code, and will push you towards functional solutions that rely on recursion for looping.
It also doesn't force you to type everything by default, which is convenient: you don't have to think about the type of everything or make sure all your types are compatible. For example, Clojure doesn't care whether your vector contains only Integers; it lets you put both Integers and Longs in it.
All this stuff is great for writing fast-enough correct, evolvable, and maintainable applications, but it is not so great for high-performance algorithms.
That means by default Clojure is optimized for implementing applications and not for implementing high-performance algorithms.
Unfortunately, it seems most people that "try" a new language, and thus newcomers to Clojure will tend to first use the language to try and implement high-performance algorithms. This is an obvious mismatch in what Clojure defaults to be good at, and lots of newcomers are immediately faced with the added friction Clojure causes here. Clojure assumed you were going to implement an app, not some high-performance one billion N sized Fibonacci-like algorithm.
But don't lose hope, Clojure can also be used to implement high-performance algorithms, but it isn't the default, so you generally need to be a more experienced Clojure developer to know how to do so, as it is less obvious.
Here's your algorithm in Clojure, which performs as fast as your Java implementation. It's a recursive rewrite of your exact Java code:
(ns playground
  (:gen-class)
  (:require [fastmath.core :as fm]))
(defn -main []
  (loop [n (long 0) iteration (long 0) test (long 0)]
    (if (fm/< n 1000000000)
      (let [^longs inner
            (loop [x (long 0) y (long 1) iteration iteration test test]
              (let [iteration (fm/inc iteration)
                    test (if (fm/== (fm/mod iteration 2) 0) (fm/+ test x) (fm/- test x))
                    i (fm/+ x y)
                    x y
                    y i]
                (if (fm/> x 999999999)
                  (doto (long-array 2) (aset 0 iteration) (aset 1 test))
                  (recur x y iteration test))))]
        (recur (fm/inc n) (aget inner 0) (aget inner 1)))
      (str "iter: " iteration " " test))))
(println (time (-main)))
"Elapsed time: 47370.544514 msecs"
;;=> iter: 45000000000 0
Using deps:
:deps {generateme/fastmath {:mvn/version "2.1.8"}}
As you can see, on my laptop, it completes in ~47 seconds. I also ran your Java version on my laptop to compare on my exact hardware, and for Java I got: 46947.343671 ms.
So on my laptop, the Clojure and the Java versions are basically equally fast, both clocking in at around 47 seconds.
The difference is that in Java, the style of programming is always conducive to implementing high-performance algorithms. You can directly use primitive types and primitive arithmetic, no boxing, no overflow checks, mutable variables with no synchronization or atomicity or volatility protections, etc.
A few things were thus required to get similar performance in Clojure:
Use primitive types
Use primitive math
Avoid the use of higher-level managed mutable containers like atom
And obviously, we needed to run the same algorithm too, so similar implementation. I wasn't trying to compare if another algorithm exists that can be faster for the same problem, but how to implement the same algo in Clojure so it runs just as fast.
To use primitive types in Clojure, you have to know that you can only do so inside local contexts using let and loop, and that every function call will undo the primitive type unless the function is itself typed to primitive long or double (the only primitive types that can cross function boundaries in Clojure).
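To make the point about function boundaries concrete, here is a small sketch that uses only clojure.core. The function name sum-below is hypothetical. The ^long hints on the argument vector and on n keep both the parameter and the return value as primitive longs, and the unchecked-* functions skip overflow checks.

```clojure
;; ^long before the argument vector marks a primitive long return value;
;; ^long on n marks a primitive long parameter.
(defn sum-below
  ^long [^long n]
  (loop [i 0      ; loop locals initialised with long literals stay primitive
         acc 0]
    (if (< i n)
      (recur (unchecked-inc i) (unchecked-add acc i))
      acc)))

(sum-below 10) ;=> 45
```

Evaluating (set! *unchecked-math* :warn-on-boxed) at the top of a file is one way to have the compiler report places where arithmetic falls back to boxed numbers.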
That's the first thing I did then, just re-write your same loops using Clojure's loop/recur and declare the same variables as you did, but using let shadowing instead, so we don't need a managed mutable container.
Finally, I made use of Fastmath, a library that provides a lot of primitive versions of arithmetic functions so that we can do primitive math. Clojure core has some of its own, but it doesn't have mod for example, so I needed to pull in Fastmath.
That's it.
Generally, this is what you need to know, keep to primitive types, keep to primitive math (using fastmath), type hint to avoid reflection, leverage let shadowing, keep to primitive arrays, and you'll get Clojure high-performance implementations.
There's a good set of info about it here: https://clojure.org/reference/java_interop#primitives
One last thing: the philosophy of Clojure is that it is meant for implementing fast-enough, correct, evolvable and maintainable apps that can scale. That's why the language is the way it is. While you can, as I've shown, implement high-performance algos, Clojure's philosophy is also not to reinvent a syntax for things that Java is already great at. Clojure can use Java, so for algorithms that need very imperative, mutable, primitive logic, it expects you to just fall back to Java, write the algorithm as a static method, and then use it from Clojure. Or it expects you to delegate to something even more performant than Java, and use BLAS, or a GPU to perform super-fast matrix math, or something of that sort. That's why it doesn't bother to provide its own imperative constructs, or raw memory access and all that, since it doesn't think it can do anything better than the hosts it runs on.
Your code might seem like a "basic function", but there are two main problems:
You used atom. An atom isn't a variable as you know it from Java; it's a construct for managing synchronous state, free of race conditions. So reset! and swap! are atomic operations, and they're slow. Look at this example:
(let [counter (atom 0)]
  (dotimes [x 1000]
    (-> (Thread. (fn [] (swap! counter inc)))
        .start))
  (Thread/sleep 2000)
  @counter)
=> 1000
1000 threads are started, the value of counter is incremented 1000 times, and the result is 1000, no surprise. But compare that with volatile!, which isn't thread-safe:
(let [counter (volatile! 0)]
  (dotimes [x 1000]
    (-> (Thread. (fn [] (vswap! counter inc)))
        .start))
  (Thread/sleep 2000)
  @counter)
=> 989
See also Clojure Reference about Atoms.
Unless you really need atoms and volatiles, you shouldn't use them. Usage of loop is also discouraged, because there is usually some better function that does exactly what you want. You tried to literally rewrite your Java function in Clojure. Clojure requires a different approach to problems, and your code definitely isn't idiomatic. I suggest that you not rewrite Java code into Clojure line by line, but find some easy problems and learn how to solve them the Clojure way, without atom, volatile! and loop.
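As one possible illustration of working without atoms, the counter and the running total from the question can be threaded through reduce as an accumulator. This is only a sketch; the name alternating-sum is invented here, and it mirrors the question's rule of adding on even steps and subtracting on odd ones.

```clojure
(defn alternating-sum
  "Walks xs with a 1-based step counter; adds x on even steps, subtracts on odd."
  [xs]
  (second
    (reduce (fn [[i acc] x]
              (let [i (inc i)]
                [i (if (even? i) (+ acc x) (- acc x))]))
            [0 0]   ; initial [step-counter total]
            xs)))

(alternating-sum [1 2 3 4]) ;=> 2, i.e. -1 + 2 - 3 + 4
```

No state is shared or mutated, so nothing needs the synchronization that swap! provides.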
By the way, there is memoize, which can be useful in examples like yours.
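A minimal sketch of what memoize does, using a recursive Fibonacci function (indexed from 0 here, so (fib-memo 0) is 0 and (fib-memo 1) is 1). The name fib-memo is made up for this example. Because the recursive calls go through the memoized var, each value is computed only once.

```clojure
(def fib-memo
  (memoize
    (fn [n]
      (if (< n 2)
        n
        ;; +' promotes to BigInt instead of throwing on overflow
        (+' (fib-memo (- n 1)) (fib-memo (- n 2)))))))

(fib-memo 45) ;=> 1134903170
```

memoize pays off when the same arguments recur; it would not by itself speed up a tight primitive loop like the one in the question.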
If you are a beginner at programming, I suggest you always assume your code is wrong before assuming the language/lib/framework/platform is wrong.
Take a look at Fibonacci sequence various implementations in Java and Clojure, you may learn something.
As others have noted, a straightforward translation of the Java code to Clojure runs rather slowly. However, if we write a Fibonacci number generator which takes advantage of Clojure's strengths we can get something which is short and does its job more idiomatically.
To start, let's say we want a function which will compute the n'th number of the Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...). To do that we could use:
(defn fib [n]
  (loop [a 1
         b 0
         cnt n]
    (if (= cnt 1)
      a
      (recur (+' a b) a (dec cnt)))))
which iteratively recomputes the "next" Fibonacci value until it gets to the one which is desired.
Given this function we can develop one which creates a collection of the Fibonacci sequence values by mapping this function across a range of index values:
(defn fib-seq [n]
  (map #(fib %) (range 1 (inc n))))
But this is of course a stunningly inefficient way of computing a sequence of Fibonacci values, since for each value we have to compute all of the preceding values and then we only save the last one. If we want a more efficient way to compute the entire sequence we can loop through the possibilities and gather the results in a collection:
(defn fib-seq [n]
  (loop [curr 1
         prev 0
         c '(1)]
    (if (= n (count c))
      (reverse c)
      (let [new-curr (+' curr prev)]
        (recur new-curr curr (cons new-curr c))))))
This gives us a reasonably efficient way to collect the values of the Fibonacci sequence. For your test of a billion loops through the sequence up to (fib 45) (the 45th term being the first one which exceeds 999,999,999) I used:
(time (dotimes [n 1000000000] (fib-seq 45)))
which completed in 17.5 seconds on my hardware and OS (Windows 10, dual-processor Intel i5 @ 2.6 GHz).
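Another common formulation, offered here only as a sketch and not as part of the timing above, builds the sequence lazily with iterate. The name fibs is an assumption of this example.

```clojure
;; Each step maps the pair [a b] to [b (+ a b)]; taking the first of
;; every pair yields 1, 1, 2, 3, 5, ...
(def fibs
  (map first (iterate (fn [[a b]] [b (+' a b)]) [1 1])))

(take 10 fibs) ;=> (1 1 2 3 5 8 13 21 34 55)
(nth fibs 44)  ;=> 1134903170, the 45th term
```

Since fibs is held in a var, every realized element stays in memory, which is fine for a few dozen terms but worth keeping in mind for long runs.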
I recently watched Rich Hickey's talk at Clojure/conj 2016, and although it was very interesting, I didn't really understand the point of clojure.spec or when you'd use it. It seemed like most of the ideas, such as conform, valid? etc., had similar functions in Clojure already.
I have only been learning clojure for around 3 months now so maybe this is due to lack of programming/Clojure experience.
Do clojure.spec and cljs.spec work in similar ways to Clojure and Cljs, in that, although they are not 100% the same, they are based on the same underlying principles?
Are you tired of documenting your programs?
Does the prospect of making up yet more tests cause procrastination?
When the boss says "test coverage", do you cower with fear?
Do you forget what your data names mean?
For smooth expression of hard specifications, you need Clojure.Spec!
Clojure.spec gives you a uniform method of documenting, specifying, and automatically testing your programs, and of validating your live data.
It steals virtually every one of its ideas. And it does nothing you can't do for yourself.
But in my (barely informed) opinion, it changes the economy of specification, making it worthwhile to do properly. A game-changer? Quite possibly.
At the clojure/conj conference last week, probably half of the presentations featured spec in some way, and it's not even out of alpha yet. spec is a major feature of clojure; it is here to stay, and it is powerful.
As an example of its power, take static type checking, hailed as a kind of safety net by so many, and a defining characteristic of so many programming languages. It is incredibly limited in that it's only good at compile time, and it only checks types. spec, on the other hand, validates and conforms any predicate (not just type) for the args, the return, and can also validate relationships between the two. All of this is external to the function's code, separating the logic of the function from being commingled with validation and documentation about the code.
Regarding WORKFLOW:
One archetypal example of the benefits of relationship-checking, versus only type-checking, is a function which computes the substring of a string. Type checking ensures that in (subs s start end) the s is a string and start and end are integers. However, additional checking must be done within the function to ensure that start and end are positive integers, that end is greater than start, and that the resulting substring is no larger than the original string. All of these things can be spec'd out, for example (forgive me if some of this is a bit redundant or maybe even inaccurate):
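;; Aliases used below. These are the namespace names as of Clojure 1.9;
;; early alphas used clojure.spec and clojure.spec.test instead.
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])
(s/fdef clojure.core/subs
  :args (s/and (s/cat :s string? :start nat-int? :end (s/? nat-int?))
               (fn [{:keys [s start end]}]
                 (if end
                   (<= 0 start end (count s))
                   (<= 0 start (count s)))))
  :ret string?
  :fn (fn [{{:keys [s start end]} :args, substring :ret}]
        (and (if end
               (= (- end start) (count substring))
               (= (- (count s) start) (count substring)))
             (<= (count substring) (count s)))))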
Call the function with sample data meeting the above args spec:
(s/exercise-fn `subs)
Or run 1000 tests (this may fail a few times, but keep running and it will work--this is due to the built-in generator not being able to satisfy the second part of the :args predicate; a custom generator can be written if needed):
(stest/check `subs)
Or, want to see if your app makes calls to subs that are invalid while it's running in real time? Just run this, and you'll get a spec exception if the function is called and the specs are not met:
(stest/instrument `subs)
We have not integrated this into our work flow yet, and can't in production since it's still alpha, but the first goal is to write specs. I'm putting them in the same namespace but in separate files currently.
I foresee our work flow being to run the tests for spec'd functions using this (found in the clojure spec guide):
(-> (stest/enumerate-namespace 'user) stest/check)
Then, it would be advantageous to turn on instrumenting for all functions, and run the app under load as we normally would test it, and ensure that "real world" data works.
You can also use s/conform to destructure complex data in functions themselves, or use s/valid? in pre- and post-conditions for functions. I'm not too keen on this, as it's overhead in a production system, but it is a possibility.
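A rough sketch of both uses, assuming Clojure 1.9 or later, where the namespace is clojure.spec.alpha. The spec name :example/point and the function norm-squared are hypothetical and exist only for this illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: a point is a sequence of exactly two numbers.
(s/def :example/point (s/cat :x number? :y number?))

;; conform returns the data destructured according to the spec.
(s/conform :example/point [3 4]) ;=> {:x 3, :y 4}

(defn norm-squared
  "Squared length of a point; the :pre condition rejects malformed input."
  [p]
  {:pre [(s/valid? :example/point p)]}
  (let [{:keys [x y]} (s/conform :example/point p)]
    (+ (* x x) (* y y))))

(norm-squared [3 4]) ;=> 25
```

Calling (norm-squared [3]) throws an AssertionError because the precondition fails, which is the runtime overhead mentioned above.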
The sky's the limit, and we've just scratched the surface! Cool things coming in the next months and years with spec!
Clojure has a number of libraries for generative testing such as test.check, test.generative or data.generators.
It is possible to use higher order functions to create random data generators that are composable such as:
(defn gen [create-fn content-fn lazy]
(fn [] (reduce #(create-fn %1 %2) (for [a lazy] (content-fn)))))
(def a (gen str #(rand-nth [\a \b \c]) (range 10)))
(a)
(def b (gen vector #(rand-int 10) (range 2)))
(b)
(def c (gen hash-set b (range (rand-int 10))))
(c)
This is just an example and could be modified with different parameters, filters, partials, etc to create data generating functions which are quite flexible.
Is there something that any of the generative libraries can do that isn't also just as (or more) succinctly achievable by composing some higher order functions?
As a side note to the stackoverflow gods: I don't believe this question is subjective. I'm not asking for an opinion on which library is better. I want to know what specific feature(s) or technique(s) of any/all data generative libraries differentiate them from composing vanilla higher order functions. An example answer should illustrate generating random data using any of the libraries with an explanation as to why this would be more complex to do by composing HOFs in the way I have illustrated above.
test.check does this way better. Most notably, suppose you generate a random list of 100 elements, and your test fails: something about the way you handled that list is wrong. What now? How do you find the basic bug? It surely doesn't depend on exactly those 100 inputs; you could probably reproduce it with a list of just a few elements, or even an empty list if something is wrong with your base case.
The feature that makes all this actually useful isn't the random generators, it is the "shrinking" of those generators. Once test.check finds an input that breaks your tests, it tries to simplify the input as much as possible while still making your tests break. For a list of integers, the shrinks are simple enough you could maybe do them yourself: remove any element, or decrease any element. Even that may not be true: choosing the order to do shrinks in is probably a harder problem than I realize. And for larger inputs, like a list of maps from vectors to a 3-tuple of [string, int, keyword], you'll find it totally unmanageable, whereas test.check has done all the hard work already.
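To see shrinking in action, here is a small sketch. It assumes the org.clojure/test.check library is on the classpath; the property name always-sorted is invented and the property is deliberately false so that a failure is found.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Deliberately false property: "every vector of naturals is already sorted".
(def always-sorted
  (prop/for-all [v (gen/vector gen/nat)]
    (= v (sort v))))

(def result (tc/quick-check 100 always-sorted))

(:fail result)                      ; the first failing input, possibly long
(get-in result [:shrunk :smallest]) ; the shrunk input, typically [[1 0]]
```

The shrunk counterexample is what makes the failure easy to read: two elements in the wrong order, rather than whatever random vector happened to fail first.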
Most references on iterate are for operators, and all the applications to functions are so confusing that I still don't get how to use iterate in my code, or what partial is.
I am doing a programming homework, trying to use Newton's method to get square root for a number n. That is, with guess as the initial approximation, keep computing new approximations by computing the average of the approximation and n/approximation. Continue until the difference between the two most recent approximations is less than epsilon.
I am trying to do the approximation part first; I believe that is where I need to use iterate and partial. And later, for the epsilon, is that where I need to use "take"?
Here is the code I have for approximation without the epsilon:
(defn sqrt [n guess]
  (iterate (partial sqrt n) (/ (+ n (/ n guess)) 2)))
This code does not work properly though, when I enter (sqrt 2 2), it gives me (3/2 user=> ClassCastException clojure.lang.Cons cannot be cast to java.lang.Number clojure.lang.Numbers.divide (Numbers.java:155).
I guess this is the part I need to iterate over and over again? Could someone please give me some hints? Again, this is a homework problem, so please do not provide me direct solution to the entire problem, I need some ideas and explanations that I can learn from.
partial takes a function and at least one parameter for that function and returns a new function that expects the rest of the parameters.
(def take-five (partial take 5))
(take-five [1 2 3 4 5 6 7 8 9 10])
;=> (1 2 3 4 5)
iterate generates an infinite sequence from two parameters: a function and a seed value. The seed value is the first element of the generated sequence; the second element is computed by applying the function to the seed, the third by applying the function to the second, and so on.
(take-five (iterate inc 0))
;=> (0 1 2 3 4)
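The two functions combine naturally. As a small sketch on an unrelated problem (doubling, not the square root homework), partial builds the one-argument step function and iterate applies it repeatedly; the name double-it is made up here.

```clojure
;; (partial * 2) is a one-argument function that doubles its input.
(def double-it (partial * 2))

(double-it 21)                 ;=> 42
(take 5 (iterate double-it 1)) ;=> (1 2 4 8 16)
```

The function handed to iterate must accept one value and return the next value of the same kind, not a sequence.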
ClojureDocs offers good documentation on both functions: http://clojuredocs.org/clojure_core/clojure.core/iterate and http://clojuredocs.org/clojure_core/clojure.core/partial.
So, @ponzao explained quite well what iterate and partial do, and @yonki made the point that you don't really need them. If you'd like to explore some more seq functions, it's probably a good idea to try it anyway (although the overhead from lazy sequences might result in less than ideal performance).
Hints:
(iterate #(sqrt n %) initial-approximation) will give you a seq of approximations.
you can use partition to create pairs of subsequent approximations.
discard everything not fulfilling the epsilon condition using drop-while
get result.
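The shape of that pipeline can be sketched on a different sequence, repeated halving, so the Newton step itself is still left to you. The names epsilon, halvings, pairs and far-apart? are all invented for this illustration.

```clojure
(def epsilon 0.1)

;; An infinite lazy sequence: 1.0, 0.5, 0.25, 0.125, ...
(def halvings (iterate #(/ % 2.0) 1.0))

;; Overlapping pairs of neighbours: (1.0 0.5) (0.5 0.25) (0.25 0.125) ...
(def pairs (partition 2 1 halvings))

(defn far-apart?
  "True while two neighbouring values differ by at least epsilon."
  [[a b]]
  (>= (Math/abs (double (- a b))) epsilon))

;; Drop pairs until two neighbours are close, then take the newer value.
(second (first (drop-while far-apart? pairs))) ;=> 0.0625
```

Note the step argument 1 in (partition 2 1 ...); without it the pairs would not overlap.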
It's probably quite rewarding to solve this using sequences since you get in contact with a lot of useful seq functions.
Note: There is a full solution somewhere in the edit history of this answer. Sorry for that, didn't fully get the "homework" part.
I think you're missing the point. You need neither iterate nor partial.
If you need to execute some computation until a condition is fulfilled, you can use the easy-to-understand loop/recur construct. loop/recur can be understood as: do some computation, check if the condition is fulfilled; if yes, return the computed value; if not, repeat the computation.
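A minimal sketch of that pattern on an unrelated problem: counting how many doublings it takes for a value to exceed a limit. The function name doublings-until is hypothetical.

```clojure
(defn doublings-until
  "Counts how many times 1 must be doubled before it exceeds limit."
  [limit]
  (loop [x 1
         steps 0]
    (if (> x limit)                   ; condition fulfilled?
      steps                           ; yes: return the computed value
      (recur (* 2 x) (inc steps)))))  ; no: repeat with updated values

(doublings-until 100) ;=> 7, since 2^7 = 128 is the first power above 100
```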
Since you don't want the entire solution, only advice on where to go, have a proper look at loop/recur and everything will be all right.
@noisesmith made a good point: reduce is not for computing until a condition is fulfilled, but it may be useful when performing some computation with a limited number of steps.
I’m currently doing this: (repeatedly n #(rand-nth (seq coll))) but I suspect there might be a more idiomatic way, for 2 reasons:
I’ve found that there’s frequently a more concise and expressive alternative to using short anonymous functions, e.g. partial
the docstring for repeatedly says “presumably with side effects”, implying that it’s not intended to be used to produce values
I suppose I could figure out a way to use reduce but that seems like it would be tricky and less efficient, as it would have to process the entire collection, since reduce is not lazy.
An easy solution but not optimal for big collections could be:
(take n (shuffle coll))
Has the "advantage" of not repeating elements. Also you could implement a lazy-shuffle but it will involve more code.
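The two behaviours can be put side by side. This is only a sketch; both function names are invented for the comparison and neither is part of clojure.core.

```clojure
(defn sample-with-replacement
  "n random picks from coll; the same element may appear more than once."
  [n coll]
  (let [v (vec coll)]        ; vector gives rand-nth fast indexed access
    (repeatedly n #(rand-nth v))))

(defn sample-without-replacement
  "Up to n distinct positions from coll; shuffles the whole collection first."
  [n coll]
  (take n (shuffle coll)))

(sample-with-replacement 5 [:a :b :c])
(sample-without-replacement 3 (range 10))
```

The second version can never return more elements than coll contains, whereas the first can return any number of picks.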
I know it's not exactly what you're asking - but if you're doing a lot of sampling and statistical work, you might be interested in Incanter ([incanter "1.5.2"]).
Incanter provides the function sample, which provides options for sample size, and replacement.
(require '[incanter.stats :refer [sample]])
(sample [1 2 3 4 5 6 7] :size 5 :replacement false)
; => (1 5 6 2 7)