Standard Deviation in Clojure

I am trying to write a function in Clojure to find the standard deviation of a sequence (vector). So far I have defined a function to find the average of a set of numbers, but I am stuck on a couple of things.
First, I am confused about how to use square roots and powers in Clojure. Second, I am trying to figure out how to pull each element out of the vector, subtract the mean from it, and then square the result.
So far this is my function:
(defn mean [a] (/ (reduce + a) (count a)))
(defn standarddev [a] (Math/sqrt (/ (reduce + (map square #(- % (mean a) a))) (- (count a) 1 ))))

You can use Java's Math class (https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html) to perform calculations like sqrt. Its methods take doubles, and Clojure coerces other numeric arguments (such as longs and ratios) to double for you. You don't need to do anything special to access the Math class, because Clojure makes all java.lang classes available without an import.
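As a quick illustration of the interop call, here is a minimal sketch. The input values are arbitrary examples, not taken from the question. Math/sqrt returns a double whatever numeric type goes in:

```clojure
;; Static Java methods are called as Class/method; java.lang needs no import.
(Math/sqrt 16)    ; => 4.0  (a long is coerced to double)
(Math/sqrt 2.25)  ; => 1.5
(Math/sqrt 1/4)   ; => 0.5  (a Clojure ratio is coerced as well)
```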

You are pretty close.
Assuming you already have the following functions
(defn square [n] (* n n))
(defn mean [a] (/ (reduce + a) (count a)))
There are two problems with your standarddev function:
(defn standarddev [a] (Math/sqrt (/ (map square (map - a (mean a))) (- (count a) 1 ))))
1) (map - a (mean a))
Doesn't work because map expects every argument after the function to be a collection, and (mean a) is a single number.
To fix it, repeat (mean a) as many times as there are elements in a.
The easiest, though by no means the most efficient, solution would be
(map - a (repeat (mean a)))
2) (map square (map - a (mean a))) doesn't work because of #1 above and because map returns a sequence, which cannot be divided by a number.
To fix it, sum the elements of the sequence:
(reduce + (map square (map - a (repeat (mean a)))))
Your standard deviation function should now be
(defn standarddev [a]
  (Math/sqrt (/
               (reduce + (map square (map - a (repeat (mean a)))))
               (- (count a) 1))))
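To see why the repeat fix works, it may help to run the pieces on a small input. The sketch below restates square and mean so that it is self-contained; the vector [1 3 5] is only an example chosen to give round numbers. map stops at the end of its shortest argument, so the infinite (repeat 3) is safe here.

```clojure
(defn square [n] (* n n))
(defn mean [a] (/ (reduce + a) (count a)))

(def a [1 3 5])  ; example data, not from the question

(mean a)                                             ; => 3
(map - a (repeat (mean a)))                          ; => (-2 0 2)
(map square (map - a (repeat (mean a))))             ; => (4 0 4)
(reduce + (map square (map - a (repeat (mean a)))))  ; => 8

;; Sample variance is 8 / (3 - 1) = 4, so the standard deviation is 2.0.
(Math/sqrt (/ 8 (- (count a) 1)))                    ; => 2.0
```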

You can slightly improve performance by getting rid of the map altogether and computing the mean only once:
(def square #(* % %))
(defn standard-deviation
  [a]
  (let [mn (mean a)]
    (Math/sqrt
      (/ (reduce #(+ %1 (square (- %2 mn))) 0 a)
         (dec (count a))))))

First I am confused over how to use a square root and powers in clojure.
To square something, just multiply it by itself:
(defn square [n]
(* n n))
If you want a power higher than 2, you could also use an exponentiation function:
(defn exp [x n]
(reduce * (repeat n x)))
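The reduce-based exp above only makes sense when n is a non-negative integer. For fractional or negative exponents, one option is Java's Math/pow, which always returns a double. A small sketch comparing the two (exp is restated so the snippet stands alone):

```clojure
(defn exp [x n]
  (reduce * (repeat n x)))

(exp 2 10)        ; => 1024   (stays an exact integer)
(exp 2 0)         ; => 1      (reduce * over an empty seq returns (*), which is 1)
(Math/pow 2 10)   ; => 1024.0 (always a double)
(Math/pow 2 0.5)  ; the square root of 2, as a double
```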
Second I am trying to figure out how to pull out each element individually out the vector and subtract the mean from it and then square it.
The Clojure (functional) way of iterating through a seq is to use map. Map takes a function and a collection and returns the result of applying that function to each element of the collection.
(defn squares [avg coll] (map #(square (- % avg)) coll))
Final standard-deviation function, using the above 2 functions and your mean:
(defn standard-deviation [coll]
  (let [avg (mean coll)
        squares (squares avg coll)
        total (count coll)]
    (Math/sqrt (/ (reduce + squares) (- total 1)))))
inspiration from: https://github.com/clojure-cookbook/clojure-cookbook/blob/master/01_primitive-data/1-20_simple-statistics.asciidoc

Corrected sample standard deviation, the same as R's sd function:
(defn sd [abc]
  (let [mean (/ (reduce + abc) (count abc))] ; compute the mean once, not once per element
    (Math/sqrt
      (/ (reduce + (map #(* % %) (map #(- % mean) abc)))
         (dec (count abc))))))

Related

Clojure function to Replace Count

I need help with an assignment that uses Clojure. It is very small, but the language is a bit confusing to understand. I need to create a function that behaves like count without actually using the count function. I know a loop can be involved somehow, but I am at a loss because nothing I have tried even gets my code to work. I expect it to output the number of elements in a list. For example:
(defn functionname []
...
...)
(println(functionname '(1 4 8)))
Output:3
Here is what I have so far:
(defn functionname [n]
(def n 0)
(def x 0)
(while (< x n)
do
()
)
)
(println(functionname '(1 4 8)))
It's not much but I think it goes something like this.
This implementation walks down the list, incrementing a counter for each element until the list is empty, and then returns the counter.
(defn recount [list-to-count]
  (loop [xs list-to-count sum 0]
    ;; (seq xs) is nil only when xs is empty, so nil or false elements are still counted
    (if (seq xs)
      (recur (rest xs) (inc sum))
      sum)))
user=> (recount '(3 4 5 9))
4
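One pitfall worth knowing when writing this kind of loop is the choice of termination test. The sketch below contrasts two hypothetical variants (the names count-with-first and count-with-seq are made up for illustration): testing (first xs) stops early at a nil or false element, whereas (seq xs) is falsey only for an empty collection.

```clojure
;; Stops as soon as the current element is nil or false.
(defn count-with-first [coll]
  (loop [xs coll n 0]
    (if (first xs)
      (recur (rest xs) (inc n))
      n)))

;; Stops only when no elements remain.
(defn count-with-seq [coll]
  (loop [xs coll n 0]
    (if (seq xs)
      (recur (rest xs) (inc n))
      n)))

(count-with-first '(1 nil 3))  ; => 1
(count-with-seq   '(1 nil 3))  ; => 3
```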
A couple more example implementations:
(defn not-count [coll]
(reduce + (map (constantly 1) coll)))
or:
(defn not-count [coll]
(reduce (fn [a _] (inc a)) 0 coll))
or:
(defn not-count [coll]
(apply + (map (fn [_] 1) coll)))
result:
(not-count '(5 7 8 1))
=> 4
I personally like the first one with reduce and constantly.

clojure performance on badly performing code

I have completed this problem on hackerrank and my solution passes most test cases but it is not fast enough for 4 out of the 11 test cases.
My solution looks like this:
(ns scratch.core
  (:require [clojure.string :as str]))
(defn ascii [char]
(int (.charAt (str char) 0)))
(defn process [text]
(let [parts (split-at (int (Math/floor (/ (count text) 2))) text)
left (first parts)
right (if (> (count (last parts)) (count (first parts)))
(rest (last parts))
(last parts))]
(reduce (fn [acc i]
(let [a (ascii (nth left i))
b (ascii (nth (reverse right) i))]
(if (> a b)
(+ acc (- a b))
(+ acc (- b a))))
) 0 (range (count left)))))
(defn print-result [[x & xs]]
(prn x)
(if (seq xs)
(recur xs)))
(let [input (slurp "/Users/paulcowan/Downloads/input10.txt")
inputs (str/split-lines input)
length (read-string (first inputs))
texts (rest inputs)]
(time (print-result (map process texts))))
Can anyone give me any advice about what I should look at to make this faster?
Would using recursion instead of reduce be faster or maybe this line is expensive:
right (if (> (count (last parts)) (count (first parts)))
(rest (last parts))
(last parts))
Because I am getting a count twice.
You are redundantly calling reverse on every iteration of the reduce:
user=> (let [c [1 2 3]
noisey-reverse #(doto (reverse %) println)]
(reduce (fn [acc e] (conj acc (noisey-reverse c) e))
[]
[:a :b :c]))
(3 2 1)
(3 2 1)
(3 2 1)
[(3 2 1) :a (3 2 1) :b (3 2 1) :c]
The reversed value could be calculated inside the containing let, and would then only need to be calculated once.
Also, due to the way your parts is defined, you are doing linear time lookups with each call to nth. It would be better to put parts in a vector and do indexed lookup. In fact you wouldn't need a reversed parts, and could do arithmetic based on the count of the vector to find the item to look up.
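The two suggestions above can be sketched as follows. This is only one possible rewrite under the assumption that the goal is to sum the absolute differences between characters mirrored around the middle of the string; process-fast is a hypothetical name, not part of the original code.

```clojure
(defn process-fast
  "Sums |text[i] - text[n-1-i]| over the first half of text."
  [text]
  (let [chars (vec text)   ; a vector gives effectively constant-time nth
        n     (count chars)
        half  (quot n 2)]  ; the middle character of an odd-length text is skipped
    (reduce (fn [acc i]
              (let [a (int (nth chars i))
                    b (int (nth chars (- n 1 i)))] ; mirrored index, no reverse needed
                (+ acc (Math/abs (- a b)))))
            0
            (range half))))

(process-fast "abc")    ; => 2
(process-fast "abcba")  ; => 0
(process-fast "abcd")   ; => 4
```

Indexing from the end with (- n 1 i) replaces both the per-iteration reverse and the split-at, so each string is traversed once.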

What's a more idiomatic and concise way of writing Pascal's Triangle with Clojure?

I implemented a naive solution for printing a Pascal's Triangle of N depth which I'll include below. My question is, in what ways could this be improved to make it more idiomatic? I feel like there are a number of things that seem overly verbose or awkward, for example, this if block feels unnatural: (if (zero? (+ a b)) 1 (+ a b)). Any feedback is appreciated, thank you!
(defn add-row [cnt acc]
  (let [prev (last acc)]
    (loop [n 0 row []]
      (if (= n cnt)
        row
        (let [a (nth prev (- n 1) 0)
              b (nth prev n 0)]
          (recur (inc n) (conj row (if (zero? (+ a b)) 1 (+ a b)))))))))
(defn pascals-triangle [n]
  (loop [cnt 1 acc []]
    (if (> cnt n)
      acc
      (recur (inc cnt) (conj acc (add-row cnt acc))))))
(defn pascal []
  (iterate (fn [row]
             (map +' `(0 ~@row) `(~@row 0)))
           [1]))
Or if you're going for maximum concision:
(defn pascal []
  (->> [1] (iterate #(map +' `(0 ~@%) `(~@% 0)))))
To expand on this: the higher-order-function perspective is to look at your original definition and realize something like: "I'm actually just computing a function f on an initial value, and then calling f again, and then f again...". That's a common pattern, and so there's a function defined to cover the boring details for you, letting you just specify f and the initial value. And because it returns a lazy sequence, you don't have to specify n now: you can defer that, and work with the full infinite sequence, with whatever terminating condition you want.
For example, perhaps I don't want the first n rows, I just want to find the first row whose sum is a perfect square. Then I can just (first (filter (comp perfect-square? sum) (pascal))), without having to worry about how large an n I'll need to choose up front (assuming the obvious definitions of perfect-square? and sum).
Thanks to fogus for an improvement: I need to use +' rather than just + so that this doesn't overflow when it gets past Long/MAX_VALUE.
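A short usage sketch of the lazy sequence described above. The helpers sum and perfect-square? are not given in the answer, so the definitions here are assumptions made for illustration (perfect-square? relies on a double square root, which is only reliable for modest values).

```clojure
(defn pascal []
  (iterate (fn [row] (map +' `(0 ~@row) `(~@row 0)))
           [1]))

;; Hypothetical helpers, assumed for this example only.
(defn sum [xs] (reduce +' xs))
(defn perfect-square? [n]
  (let [r (long (Math/sqrt n))]
    (= n (* r r))))

(take 5 (pascal))
;; => ([1] (1 1) (1 2 1) (1 3 3 1) (1 4 6 4 1))

;; Row sums are 1, 2, 4, 8, 16, ... so every other row qualifies.
(take 3 (filter (comp perfect-square? sum) (pascal)))
;; => ([1] (1 2 1) (1 4 6 4 1))
```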
(defn next-row [row]
  (concat [1] (map +' row (drop 1 row)) [1]))
(defn pascals-triangle [n]
  (take n (iterate next-row '(1))))
Not as terse as the others, but here's mine:)
(defn A []
  (iterate
    (comp (partial map (partial reduce +))
          (partial partition-all 2 1) (partial cons 0))
    [1]))

find the n-tuples of all integers below m whose sum is a prime

I am going through the Clojure in Action book and code similar to that below is given for a function that returns all pairs of numbers below m whose sum is a prime (assume prime? is given):
(defn pairs-for-primes [m]
(let [z (range 0 m)]
(for [a z b z :when (prime? (+ a b))]
(list a b))))
How would one generalize that to return the n-tuples of all numbers below m whose sum is a prime?
(defn all-ntuples-below [n m]
...
for can be used for a sort of "special case" of cartesian product, where you know the sets in advance at compile time. Since you don't actually know the sets you want the product of, you need to use a real cartesian-product function. For example, with clojure.math.combinatorics, you could write
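;; assumes org.clojure/math.combinatorics is on the classpath
(require '[clojure.math.combinatorics :refer [cartesian-product]])
(defn pairs-for-primes [m n]
  (let [z (range 0 m)
        tuples (apply cartesian-product (repeat n z))]
    (filter #(prime? (apply + %)) tuples)))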
But perhaps your question is about how to implement a cartesian product? It's not that hard, although the version below is not terribly performant:
(defn cartesian-product [sets]
  (cond (empty? sets) (list (list))
        (not (next sets)) (map list (first sets))
        :else (for [x (first sets)
                    tuple (cartesian-product (rest sets))]
                (cons x tuple))))
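Putting the hand-written cartesian-product to work, a self-contained sketch might look like this. The question assumes prime? is given, so the trial-division version below is only a stand-in, and ntuples-for-primes is a hypothetical name. Note that this cartesian-product takes a single collection of sets, unlike the variadic one in clojure.math.combinatorics.

```clojure
;; Stand-in primality test; the question treats prime? as given.
(defn prime? [n]
  (and (> n 1)
       (not-any? #(zero? (rem n %))
                 (take-while #(<= (* % %) n) (iterate inc 2)))))

(defn cartesian-product [sets]
  (cond (empty? sets)     (list (list))
        (not (next sets)) (map list (first sets))
        :else (for [x     (first sets)
                    tuple (cartesian-product (rest sets))]
                (cons x tuple))))

;; All n-tuples of integers in [0, m) whose sum is prime.
(defn ntuples-for-primes [n m]
  (filter #(prime? (apply + %))
          (cartesian-product (repeat n (range m)))))

(ntuples-for-primes 2 3)
;; => ((0 2) (1 1) (1 2) (2 0) (2 1))
(ntuples-for-primes 3 2)
;; => ((0 1 1) (1 0 1) (1 1 0) (1 1 1))
```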
You can use take to limit the results. Because pairs-for-primes returns a lazy sequence, take only causes it to compute the number of items required. Note that this yields the first n pairs rather than n-tuples:
(defn all-ntuples-below [n m]
  (take n (pairs-for-primes m)))

What's wrong with this clojure prime seq?

I can't figure out why this definition of a lazy primes sequence would cause non-termination. The stack-trace I get isn't very helpful (my one complaint about clojure is obtuse stack-traces).
(declare naturals is-prime? primes)
(defn naturals
  ([] (naturals 1))
  ([n] (lazy-seq (cons n (naturals (inc n))))))
(defn is-prime? [n]
  (not-any? #(zero? (rem n %))
            (take-while #(> n (* % %)) (primes))))
(defn primes
  ([] (lazy-seq (cons 2 (primes 3))))
  ([n] (let [m (first (filter is-prime? (naturals n)))]
         (lazy-seq (cons m (primes (+ 2 m)))))))
(take 10 (primes)) ; this results in a stack overflow error
Let's start executing primes, and we'll magically realise one seq just to be clear. I'll ignore naturals because it's correctly lazy:
> (magically-realise-seq (primes))
=> (magically-realise-seq (lazy-seq (cons 2 (primes 3))))
=> (cons 2 (primes 3))
=> (cons 2 (let [m (first (filter is-prime? (naturals 3)))]
             (lazy-seq (cons m (primes (+ 2 m))))))
=> (cons 2 (let [m (first (filter
                            (fn [n]
                              (not-any? #(zero? (rem n %))
                                        (take-while #(> n (* % %)) (primes))))
                            (naturals 3)))]
             (lazy-seq (cons m (primes (+ 2 m))))))
I've substituted is-prime? in as a fn at the end there—you can see that primes will get called again, and realised at least once as take-while pulls out elements. This will then cause the loop.
The issue is that primes uses is-prime? to compute its elements, and is-prime? in turn calls (primes), hence the stack overflow.
To calculate (primes 3), you need to calculate (first (filter is-prime? (naturals 3))), which calls (is-prime? 3), which calls (primes), which in turn calls (primes 3). In other words, you are doing:
user=> (declare a b)
#'user/b
user=> (defn a [] (b))
#'user/a
user=> (defn b [] (a))
#'user/b
user=> (a)
StackOverflowError user/b (NO_SOURCE_FILE:1)
To see how to generate prime numbers: Fast Prime Number Generation in Clojure
I think the problem is that you're trying to use (primes) before it has been constructed.
Changing is-prime? as follows fixes the problem:
(defn is-prime? [n]
  (not-any? #(zero? (rem n %))
            (take-while #(>= n (* % %)) (next (naturals)))))
(Note that I've replaced > with >=; otherwise it reports that 4 is prime. It still says that 1 is prime, which isn't true and may cause problems if you use is-prime? elsewhere.)
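If the 1-is-prime quirk matters, one possible variant adds an explicit lower bound and builds the primes sequence by filtering, so that is-prime? never depends on primes. This is a sketch of simple trial division, not a fast generator; (iterate inc 2) stands in for (next (naturals)).

```clojure
(defn is-prime? [n]
  (and (> n 1)  ; rules out 1, 0 and negative numbers
       (not-any? #(zero? (rem n %))
                 (take-while #(>= n (* % %)) (iterate inc 2)))))

;; No mutual recursion: primes depends on is-prime?, but not the other way round.
(defn primes []
  (filter is-prime? (iterate inc 2)))

(take 10 (primes))
;; => (2 3 5 7 11 13 17 19 23 29)
(is-prime? 1)  ; => false
(is-prime? 4)  ; => false
```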