How do I translate a complicated recurrence from Clojure to Maxima?

I think this little program:
(defn average [lst] (/ (reduce + lst) (count lst)))
(defn sqsum [lst] (reduce + (map #(* % %) lst)))
(defn tet [row col]
(cond (= [row col] [0 0]) 0
(= [row col] [1 0]) 1
(< row (inc col)) 0
(> row (inc col)) (average (for [i (range row)] (tet i col)))
(= row (inc col)) (Math/sqrt (- 1 (sqsum (for [i (range col)] (tet row i)))))))
gives me the coordinates of the vertices of generalised tetrahedra / euclidean simplices in various dimensions.
Unfortunately clojure will express things like sqrt(3/4) in floating point, whereas I'd like the answers in symbolic form.
Maxima would be ideal for this sort of thing, but I don't know how to express this relation in maxima.
Alternatively, solutions involving adding symbolic square roots to clojure would also be nice.

In Maxima, a memoizing function is defined by f[x, y] := ..., that is, with square brackets instead of parentheses for the arguments.
From what I can tell, this is a translation of the Clojure function:
average (lst) := apply ("+", lst) / length (lst);
sqsum (lst) := apply ("+", map (lambda ([x], x^2), lst));
tet [row, col] :=
if row < col + 1 then 0
else if row > col + 1 then average (makelist (tet [i, col], i, 0, row - 1))
else if row = col + 1 then sqrt (1 - sqsum (makelist (tet [row, i], i, 0, col - 1)));
tet [0, 0] : 0;
tet [1, 0] : 1;
E.g.:
radcan (tet[4, 3]);
=> sqrt(5)/2^(3/2)
radcan (tet[7, 6]);
=> 2/sqrt(7)
First one agrees with a[4, 3] above. Dunno about the second.
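On the Clojure side, the same memoization idea can be sketched with memoize. This is only an illustration and not part of the Maxima answer: tet* is a made-up helper name, and the results are still floating point, so this works as a numeric cross-check rather than a symbolic solution.

```clojure
(defn average [lst] (/ (reduce + lst) (count lst)))
(defn sqsum [lst] (reduce + (map #(* % %) lst)))

(declare tet)

;; tet* is a hypothetical helper holding the recurrence from the question;
;; it calls the memoized tet, so each [row col] pair is computed only once.
(defn tet* [row col]
  (cond (= [row col] [0 0]) 0
        (= [row col] [1 0]) 1
        (< row (inc col)) 0
        (> row (inc col)) (average (for [i (range row)] (tet i col)))
        :else (Math/sqrt (double (- 1 (sqsum (for [i (range col)] (tet row i))))))))

(def tet (memoize tet*))
```

With this, (tet 4 3) comes out near 0.7906, which matches sqrt(5)/2^(3/2), and (tet 7 6) comes out near 0.7559, which matches 2/sqrt(7).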

This does the business
a[0,0]:0;
a[1,0]:1;
for row:2 while row<=15 do (
(col:(row-1)),
for r:0 while r<=col do (a[r,col]:0),
for c:0 while c<col do (a[row,c]:(sum(a[i,c],i,0,row-1))/row),
a[row,col]:radcan(sqrt(1-sum(a[row,c]^2,c,0,col-1))),
disp(a[row,col]^2));
But is there any way to express it as the original recursion and memoize it so that it runs in finite time?

I've done the first few 'by hand' in maxima, like this, if anyone needs inspiration.
(This is the 2d iteration which is equivalent to the above recursion (once memoized))
So I guess my question now could be 'how do I express this as a for loop in maxima'
one-simplex
a[0,0]:0;
a[1,0]:1;
two-simplex (equilateral triangle)
a[0,1]:0;
a[1,1]:0;
a[2,0]:(a[0,0]+a[1,0])/2;
a[2,1]:sqrt(1-a[2,0]^2);
three-simplex (tetrahedron)
a[0,2]:0;
a[1,2]:0;
a[2,2]:0;
a[3,0]:(a[0,0]+a[1,0]+a[2,0])/3;
a[3,1]:(a[0,1]+a[1,1]+a[2,1])/3;
a[3,2]:sqrt(1-a[3,0]^2-a[3,1]^2);
four-simplex (pentachoron, the 4-dimensional analogue of the tetrahedron)
col:3;
a[0,col]:0;
a[1,col]:0;
a[2,col]:0;
a[3,col]:0;
col:0;
a[4,col]:(a[0,col]+a[1,col]+a[2,col]+a[3,col])/4;
col:1;
a[4,col]:(a[0,col]+a[1,col]+a[2,col]+a[3,col])/4;
col:2;
a[4,col]:(a[0,col]+a[1,col]+a[2,col]+a[3,col])/4;
a[4,3]:sqrt(1-a[4,0]^2-a[4,1]^2-a[4,2]^2);
radcan(%);

Related

Convert pseudo-code with nested for-loops to Clojure

I want to implement this pseudo code in Clojure:
function(n)
B[0] <-- 1
for m <-- 1 to n do
B[m] <-- 0
for k <-- 0 to m - 1 do
B[m] <-- B[m] − binom(m+1, k) * B[k]
B[m] <-- B[m]/(m+1)
return B[n]
My first thought was to do something like this:
(defn foo [n]
(if (= n 0)
(int 1)
(for [k (range 0 (- n 1))]
(* (binom (+ n 1) k)
(foo k)))))
but now I'm stuck and I don't know how to continue. The nested loops confuse me a lot when I try to translate them to Clojure.
I'd really appreciate some help on how to write this code in Clojure, I feel a bit lost.
Thanks in advance!
Some algorithms are naturally imperative in nature. Don't be afraid to write imperative code if that is the easiest solution, rather than trying to "force fit" the algorithm into a functional style.
This algorithm could easily use a mutable atom to store the B array:
(defn factorial [x]
(reduce * (range 2 (inc x))))
(defn binom [n k]
(/ (factorial n)
(factorial k) (factorial (- n k))))
(defn bernoulli [n]
(let [B (atom (vec (repeat n 0)))] ; allocate B[0]..B[n-1] = zeros
(swap! B assoc 0 1) ; B[0] = 1
(doseq [m (range 1 (inc n))] ; 1..n
(swap! B assoc m 0) ; B[m] = 0
(doseq [k (range m)] ; 0..(m-1)
(swap! B #(assoc % m ; B[m] = ...
(-
(get % m) ; B[m]
(*
(binom (inc m) k)
(get % k)))))) ; B[k]
(swap! B update m ; B[m] = B[m] ...
#(/ % (inc m))))
(get @B n)))
(dotest
(dotimes [i 10]
(spyx [i (bernoulli i)])))
with result
[i (bernoulli i)] => [0 1]
[i (bernoulli i)] => [1 -1/2]
[i (bernoulli i)] => [2 1/6]
[i (bernoulli i)] => [3 0N]
[i (bernoulli i)] => [4 -1/30]
[i (bernoulli i)] => [5 0N]
[i (bernoulli i)] => [6 1/42]
[i (bernoulli i)] => [7 0N]
[i (bernoulli i)] => [8 -1/30]
[i (bernoulli i)] => [9 0N]
You could also use with-local-vars for some algorithms, or even drop down into a (mutable) Java array. You can see an example of that in this mutable Java matrix example
The given pseudocode computes the nth Bernoulli number. It uses all the previous Bernoulli numbers to compute the result. Much like the factorial function, this lends itself to a recursive algorithm which may be implemented with memoize to avoid re-computation of earlier numbers:
(def factorial
"Returns n!."
(memoize (fn [n]
(if (< 1 n)
(* n (factorial (dec n)))
1N))))
(def bernoulli
"Returns the nth Bernoulli number."
(memoize (fn [n]
(if (zero? n)
1
(let [n! (factorial n)
term #(/ (* n! (bernoulli %))
(factorial %)
(factorial (- n % -1)))
terms (map term (range n))]
(reduce - 0 terms))))))
(map bernoulli (range 9)) ; => (1 -1/2 1/6 0N -1/30 0N 1/42 0N -1/30)
The pseudo code uses plenty of in-place updates which makes it a bit hard to read. But essentially, the code computes a list of numbers where every number is computed from the previous numbers in the list. And then we pick one of the numbers in the list.
I would implement this algorithm like
(defn compute-next-B [B]
(let [m (count B)
m+1 (inc m)
terms (map-indexed (fn [k Bk] (- (* Bk (binom m+1 k)))) B)]
(conj B (/ (apply + terms) m+1))))
(defn foo [n]
(->> [1]
(iterate compute-next-B)
(drop n)
first
last))
The outer loop from the pseudo code is the lazy sequence produced by (iterate compute-next-B ...). The inner loop from the pseudo code is the iteration inside (apply + terms) on the lazy sequence terms.
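To make the "outer loop as a lazy sequence" point concrete, here is a small sketch that looks at the first few states the iteration passes through. It reuses factorial and binom as defined in the earlier answer; nothing else is assumed.

```clojure
(defn factorial [x]
  (reduce * (range 2 (inc x))))

(defn binom [n k]
  (/ (factorial n)
     (factorial k) (factorial (- n k))))

;; One step of the outer loop: given B[0]..B[m-1], append B[m].
(defn compute-next-B [B]
  (let [m (count B)
        m+1 (inc m)
        terms (map-indexed (fn [k Bk] (- (* Bk (binom m+1 k)))) B)]
    (conj B (/ (apply + terms) m+1))))

;; Each element of the iteration is the whole B vector so far.
(take 3 (iterate compute-next-B [1]))
;; => ([1] [1 -1/2] [1 -1/2 1/6])
```

Dropping n of these states and taking the last entry of the next one is exactly what foo does.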

Why is this code wrong? How does 'recur' work?

I don't know why the code below is wrong:
(defn factorial [n]
(loop [n n
acc 1]
(if (zero? n)
acc
(recur (* acc n)(dec n)))))
(= 1 (factorial 1))
How does recur work?
The arguments to the recur are the wrong way round.
n should become (dec n)
acc should become (* acc n)
So it should be
(recur (dec n) (* acc n))
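Put back into the function from the question, the fix looks like this (a sketch; only the recur line differs from the original):

```clojure
(defn factorial [n]
  (loop [n n
         acc 1]
    (if (zero? n)
      acc
      ;; recur rebinds the loop variables by position:
      ;; the first value becomes n, the second becomes acc
      (recur (dec n) (* acc n)))))
```

Now (factorial 1) gives 1 and (factorial 5) gives 120.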
We can recast the given algorithm to see what's going on inside it.
If we represent the pair of arguments as a vector, the function that generates the next pair is
(fn [[n acc]] [(* acc n) (dec n)])
We can generate the endless sequence of possible pairs for a given no by applying iterate to the function above, starting with [no 1].
(fn [no]
(iterate (fn [[n acc]] [(* acc n) (dec n)]) [no 1]))
Applying this to 1 generates
([1 1] [1 0] [0 0] [0 -1] ...)
We stop at element 2, the first with an initial 0, returning the other 0.
If we put the arguments the right way round, we can get the proper factorial thus:
(defn factorial [no]
((comp second first)
(drop-while
(comp not zero? first)
(iterate (fn [[n acc]] [(dec n) (* acc n)]) [no 1]))))
This returns the second element of the first pair in the sequence with a zero first (Duh!).
Hopelessly overcomplicated for normal use, but does it work?
=> (map factorial (range 6))
(1 1 2 6 24 120)
Yes.

Find all ordered triples of distinct positive integers i, j, and k less than or equal to a given integer n that sum to a given integer s

This is exercise 2.41 in SICP.
I wrote this naive version myself:
(defn sum-three [n s]
(for [i (range n)
j (range n)
k (range n)
:when (and (= s (+ i j k))
(< 1 k j i n))]
[i j k]))
The question is: is this considered idiomatic in Clojure? And how can I optimize this piece of code, since it takes forever to compute (sum-three 500 500)?
Also, how can I have this function take an extra argument that specifies how many integers to sum? So instead of a sum of three, it should handle the more general case: a sum of two, four, five, and so on.
I suppose this cannot be achieved with a for loop? I'm not sure how to add the i j k bindings dynamically.
(Update: The fully optimized version is sum-c-opt at the bottom.)
I'd say it is idiomatic, if not the fastest way to do it while staying idiomatic. Well, perhaps using == in place of = when the inputs are known to be numbers would be more idiomatic (NB. these are not entirely equivalent on numbers; it doesn't matter here though.)
As a first optimization pass, you could start the ranges higher up and replace = with the number-specific ==:
(defn sum-three [n s]
(for [k (range n)
j (range (inc k) n)
i (range (inc j) n)
:when (== s (+ i j k))]
[i j k]))
(Changed ordering of the bindings since you want the smallest number last.)
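One further cut for the three-number case, as a sketch: once i and j are chosen, k is forced to be s - i - j, so the innermost loop can go away entirely. The name sum-three-direct is made up for this illustration, and the ranges run over 1..n inclusive to follow the exercise's wording ("positive integers less than or equal to n"), which differs from the (range n) used above.

```clojure
(defn sum-three-direct [n s]
  (for [i (range 1 (inc n))   ; largest number, 1..n
        j (range 1 i)         ; middle number, strictly below i
        :let [k (- s i j)]    ; the third number is determined by the sum
        :when (< 0 k j)]      ; keep it positive and strictly below j
    [i j k]))

(sum-three-direct 7 10)
;; => ([5 3 2] [5 4 1] [6 3 1] [7 2 1])
```

This does a quadratic rather than cubic amount of work in n, at the cost of being specific to three numbers.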
As for making the number of integers a parameter, here's one approach:
(defn sum-c [c n s]
(letfn [(go [c n s b]
(if (zero? c)
[[]]
(for [i (range b n)
is (go (dec c) n (- s i) (inc i))
:when (== s (apply + i is))]
(conj is i))))]
(go c n s 0)))
;; from the REPL:
user=> (sum-c 3 6 10)
([5 4 1] [5 3 2])
user=> (sum-c 3 7 10)
([6 4 0] [6 3 1] [5 4 1] [5 3 2])
Update: Rather spoils the exercise to use it, but math.combinatorics provides a combinations function which is tailor-made to solve this problem:
(require '[clojure.math.combinatorics :as c])
(c/combinations (range 10) 3)
;=> all combinations of 3 distinct numbers less than 10;
; will be returned as lists, but in fact will also be distinct
; as sets, so no (0 1 2) / (2 1 0) "duplicates modulo ordering";
; it also so happens that the individual lists will maintain the
; relative ordering of elements from the input, although the docs
; don't guarantee this
filter the output appropriately.
A further update: Thinking through the way sum-c above works gives one a further optimization idea. The point of the inner go function inside sum-c was to produce a seq of tuples summing up to a certain target value (its initial target minus the value of i at the current iteration in the for comprehension); yet we still validate the sums of the tuples returned from the recursive calls to go as if we were unsure whether they actually do their job.
Instead, we can make sure that the tuples produced are the correct ones by construction:
(defn sum-c-opt [c n s]
(let [m (max 0 (- s (* (dec c) (dec n))))]
(if (>= m n)
()
(letfn [(go [c s t]
(if (zero? c)
(list t)
(mapcat #(go (dec c) (- s %) (conj t %))
(range (max (inc (peek t))
(- s (* (dec c) (dec n))))
(min n (inc s))))))]
(mapcat #(go (dec c) (- s %) (list %)) (range m n))))))
This version returns the tuples as lists so as to preserve the expected ordering of results while maintaining code structure which is natural given this approach. You can convert them to vectors with a map vec pass.
For small values of the arguments, this will actually be slower than sum-c, but for larger values, it is much faster:
user> (time (last (sum-c-opt 3 500 500)))
"Elapsed time: 88.110716 msecs"
(168 167 165)
user> (time (last (sum-c 3 500 500)))
"Elapsed time: 13792.312323 msecs"
[168 167 165]
And just for added assurance that it does the same thing (beyond inductively proving correctness in both cases):
; NB. this illustrates Clojure's notion of equality as applied
; to vectors and lists
user> (= (sum-c 3 100 100) (sum-c-opt 3 100 100))
true
user> (= (sum-c 4 50 50) (sum-c-opt 4 50 50))
true
for is a macro so it's hard to extend your nice idiomatic answer to cover the general case. Fortunately clojure.math.combinatorics provides the cartesian-product function that will produce all the combinations of the sets of numbers. Which reduces the problem to filter the combinations:
(ns hello.core
(:require [clojure.math.combinatorics :as combo]))
(defn sum-three [n s i]
(filter #(= s (reduce + %))
(apply combo/cartesian-product (repeat i (range 1 (inc n))))))
hello.core> (sum-three 7 10 3)
((1 2 7) (1 3 6) (1 4 5) (1 5 4) (1 6 3) (1 7 2) (2 1 7)
(2 2 6) (2 3 5) (2 4 4) (2 5 3) (2 6 2) (2 7 1) (3 1 6)
(3 2 5) (3 3 4) (3 4 3) (3 5 2) (3 6 1) (4 1 5) (4 2 4)
(4 3 3) (4 4 2) (4 5 1) (5 1 4) (5 2 3) (5 3 2) (5 4 1)
(6 1 3) (6 2 2) (6 3 1) (7 1 2) (7 2 1))
assuming that order matters in the answers that is
To parameterize your existing code you can use reduce. This code shows a pattern that can be used where you want to parameterize the number of nested levels in a use of the for macro.
Your code without using for macro (using only functions) would be:
(defn sum-three [n s]
(mapcat (fn [i]
(mapcat (fn [j]
(filter (fn [[i j k]]
(and (= s (+ i j k))
(< 1 k j i n)))
(map (fn [k] [i j k]) (range n))))
(range n)))
(range n)))
The pattern is visible: the innermost map is wrapped by an outer mapcat, and so on. You want to parameterize the nesting level, hence:
(defn sum-c [c n s]
((reduce (fn [s _]
(fn [& i] (mapcat #(apply s (concat i [%])) (range n))))
(fn [& i] (filter #(and (= s (apply + %))
(apply < 1 (reverse %)))
(map #(concat i [%]) (range n))))
(range (dec c)))))

How can I split a collection into two parts given by a percentage

I have a collection which I'd like to split by an arbitrary percentage. The actual problem I'm trying to solve is to split a dataset into a training and cross-validation set.
The destination of each element should be chosen at random, but each source element should appear only once in the result and the size of the partitions is fixed. If the source collection has duplicates, the duplicates could appear in different output partitions or the same.
I have this implementation:
(defn split-shuffled
"Returns a 2 element vector partitioned by the percentage
specified by p. Elements are selected at random. Each
element of the source collection will appear only once in
the result."
[c p]
(let [m (count c)
idxs (into #{} (take (* m p) (shuffle (range m))))
afn (fn [i x] (if (idxs i) x))
bfn (fn [i x] (if-not (idxs i) x))]
[(keep-indexed afn c) (keep-indexed bfn c)]))
repl> (split-shuffled (range 10) 0.2)
[(4 6) (0 1 2 3 5 7 8 9)]
repl> (split-shuffled (range 10) 0.4)
[(1 4 6 7) (0 2 3 5 8 9)]
But I'm not happy that keep-indexed is called twice.
How can this be improved?
EDIT: I originally wanted to keep the order in the partitions, but I dropped that requirement without re-thinking, so @mikera's solution is correct!
Why do you need the indexes at all?
Just shuffle the collection directly:
(defn split-shuffled
[c p]
(let [c (shuffle c)
m (count c)
t (* m p)]
[(take t c) (drop t c)]))
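A small variation, as a sketch: (* m p) is generally a floating-point number, so it can help to round it to a whole count up front and let split-at produce both halves in one call. The name split-shuffled-rounded and the choice of rounding to the nearest integer are assumptions for this illustration, not part of the answer above.

```clojure
(defn split-shuffled-rounded
  "Shuffles c and splits it so that the first part holds roughly
  the fraction p of the elements (rounded to the nearest count)."
  [c p]
  (let [n (Math/round (double (* (count c) p)))]
    (split-at n (shuffle c))))
```

For example, (split-shuffled-rounded (range 10) 0.2) returns a pair whose parts hold 2 and 8 elements, and every source element lands in exactly one of them.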

Different solutions for Clojure implementation of problem

Here is a problem statement:
Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.
The solution is long,
(defn large [x y]
(if (> x y) x y))
(defn large-3 [x y z]
(if(> (large x y) z) (large x y) z))
(defn small [x y]
(if (< x y) x y))
(defn small-3 [x y z]
(if (< (small x y) z ) (small x y) z))
(defn second-largest [x y z]
(let [greatest (large-3 x y z)
smallest (small-3 x y z)]
(first (filter #(and (> greatest %) (< smallest %)) [x y z]))))
(defn square [a]
(* a a)
)
(defn sum-of-square [x y z]
(+ (square (large-3 x y z)) (square (second-largest x y z))))
Just wanted to know what different/succinct ways this problem can be solved in Clojure.
(defn foo [& xs]
(let [big-xs (take 2 (sort-by - xs))]
(reduce + (map * big-xs big-xs))))
; why only 3? how about N
(defn sum-of-squares [& nums]
(reduce + (map #(* % %) (drop 1 (sort nums)))))
or, if you want "the sum of the squares of the greatest two numbers":
(defn sum-of-squares [& nums]
(reduce + (map #(* % %) (take 2 (reverse (sort nums))))))
(take 2 (reverse (sort nums))) from Michał Marczyk's answer.
(See a sequence version of the problem together with a lazy solution in my second update to this answer below.)
(defn square [n]
(* n n))
;; generalises easily to larger numbers of arguments
(defn sum-of-larger-squares [x y z]
(apply + (map square (take 2 (reverse (sort [x y z]))))))
;; shorter; generalises easily if you want
;; 'the sum of the squares of all numbers but n smallest'
(defn sum-of-larger-squares [x y z]
(apply + (map square (drop 1 (sort [x y z])))))
Update:
To expand on the comments above, the first version's straightforward generalisation is this:
(defn sum-of-larger-squares [n & xs]
(apply + (map square (take n (reverse (sort xs))))))
The second version straightforwardly generalises to the version Arthur posted in the meantime:
(defn sum-of-larger-squares [n & xs]
(apply + (map square (drop n (sort xs)))))
Also, I've seen exactly the same problem being solved in Scheme, possibly even on SO... It included some fun solutions, like one which calculated the sum of all three squares, then subtracted the smallest square (that's very straightforward to express with Scheme primitives). That's 'unefficient' in that it calculates one extra square, but it's certainly very readable. Can't seem to find the link now, unfortunately.
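In Clojure, that subtract-the-smallest idea might look like the following sketch. The function name is made up here, and this is a reconstruction of the approach as described, not the Scheme solution itself.

```clojure
(defn sum-of-larger-squares-by-subtraction [x y z]
  (let [sq #(* % %)]
    ;; add up all three squares, then take away the square
    ;; of the smallest number
    (- (+ (sq x) (sq y) (sq z))
       (sq (min x y z)))))
```

For example, (sum-of-larger-squares-by-subtraction 1 2 3) gives 13.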
Update 2:
In response to Arthur Ulfeldt's comment on the question, a lazy solution to a (hopefully fun) different version of the problem. Code first, explanation below:
(use 'clojure.contrib.seq-utils) ; recently renamed to clojure.contrib.seq; from Clojure 1.2 on, reductions is in clojure.core and this line can be dropped
(defn moving-sum-of-smaller-squares [pred n nums]
(map first
(reductions (fn [[current-sum current-xs] y]
(if (pred y (peek current-xs)) ; compare y with the pred-largest kept element, which is the one that would be replaced
(let [z (peek current-xs)]
[(+ current-sum (- (* z z)) (* y y))
(vec (sort-by identity pred (conj (pop current-xs) y)))])
[current-sum
current-xs]))
(let [initial-xs (vec (sort-by identity pred (take n nums)))
initial-sum (reduce + (map #(* % %) initial-xs))]
[initial-sum initial-xs])
(drop n nums))))
The clojure.contrib.seq-utils (or c.c.seq) lib is there for the reductions function. iterate could be used instead, but not without some added complexity (unless one would be willing to calculate the length of the seq of numbers to be processed at the start, which would be at odds with the goal of remaining as lazy as possible).
Explanation with example of use:
user> (moving-sum-of-smaller-squares < 2 [9 3 2 1 0 5 3])
(90 13 5 1 1 1)
;; and to prove laziness...
user> (take 2 (moving-sum-of-smaller-squares < 2 (iterate inc 0)))
(1 1)
;; also, 'smaller' means pred-smaller here -- with
;; a different ordering, a different result is obtained
user> (take 10 (moving-sum-of-smaller-squares > 2 (iterate inc 0)))
(1 5 13 25 41 61 85 113 145 181)
Generally, (moving-sum-of-smaller-squares pred n nums) generates a lazy seq of sums of squares of the n pred-smallest numbers in increasingly long initial fragments of the original seq of numbers, where 'pred-smallest' means smallest with regard to the ordering induced by the predicate pred. With pred = >, the sum of the n greatest squares is calculated.
This function uses the trick I mentioned above when describing the Scheme solution which summed three squares, then subtracted the smallest one, and so is able to adjust the running sum by the correct amount without recalculating it at each step.
On the other hand, it does perform a lot of sorting; I find it's not really worthwhile to try and optimise this part, as the seqs being sorted are always n elements long and there's a maximum of one sorting operation at each step (none if the sum doesn't require adjustment).