StackOverflowError caused by memoized Fibonacci on Clojure - clojure

Config
Tested under clojure 1.10.3 and openjdk 17.0.1
Problem
Below is a slightly revised version of memoized Fibonacci; for the general technique, see the wiki entry on memoization.
(def fib
(memoize #(condp = %
0 (bigdec 0)
1 1
(+ (fib (dec %)) (fib (- % 2))))))
(fib 225) ; line 7
I had thought that the memoized Fibonacci above, in a functional language like Clojure, would be equivalent in spirit to imperative DP, e.g. in Python below:
def fib(n):
    dp = [0, 1] + [0] * n
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]
Question 1
Why did I get the following error when the Fibonacci input was raised to 225 in my case?
Syntax error (StackOverflowError) compiling at 7:1
I also tried the drop-in replacement memo from core.memoize and got the same error when the input was raised to 110.
Tracing
Below I trace the recursive calls of the naive Fibonacci vs. the memoized Fibonacci:
(ns fib.core)
(defn naive-fib [n]
(condp = n
0 (bigdec 0)
1 1
(+ (naive-fib (dec n)) (naive-fib (- n 2)))))
(def memo-fib
(memoize #(condp = %
0 (bigdec 0)
1 1
(+ (memo-fib (dec %)) (memo-fib (- % 2))))))
(in-ns 'user)
(require '[clojure.tools.trace :refer [trace-ns]])
(trace-ns 'fib.core)
(fib.core/naive-fib 5)
(println)
(fib.core/memo-fib 5)
The memoized Fibonacci clearly eliminated the overlapping sub-problems of the naive Fibonacci. At first sight nothing seemed likely to cause a StackOverflowError: the depth of the stack frames for the memoized Fibonacci was strictly linear in the input number n, and the width was limited to 1.
TRACE t427: (fib.core/naive-fib 5)
TRACE t428: | (fib.core/naive-fib 4)
TRACE t429: | | (fib.core/naive-fib 3)
TRACE t430: | | | (fib.core/naive-fib 2)
TRACE t431: | | | | (fib.core/naive-fib 1)
TRACE t431: | | | | => 1
TRACE t432: | | | | (fib.core/naive-fib 0)
TRACE t432: | | | | => 0M
TRACE t430: | | | => 1M
TRACE t433: | | | (fib.core/naive-fib 1)
TRACE t433: | | | => 1
TRACE t429: | | => 2M
TRACE t434: | | (fib.core/naive-fib 2)
TRACE t435: | | | (fib.core/naive-fib 1)
TRACE t435: | | | => 1
TRACE t436: | | | (fib.core/naive-fib 0)
TRACE t436: | | | => 0M
TRACE t434: | | => 1M
TRACE t428: | => 3M
TRACE t437: | (fib.core/naive-fib 3)
TRACE t438: | | (fib.core/naive-fib 2)
TRACE t439: | | | (fib.core/naive-fib 1)
TRACE t439: | | | => 1
TRACE t440: | | | (fib.core/naive-fib 0)
TRACE t440: | | | => 0M
TRACE t438: | | => 1M
TRACE t441: | | (fib.core/naive-fib 1)
TRACE t441: | | => 1
TRACE t437: | => 2M
TRACE t427: => 5M
TRACE t446: (fib.core/memo-fib 5)
TRACE t447: | (fib.core/memo-fib 4)
TRACE t448: | | (fib.core/memo-fib 3)
TRACE t449: | | | (fib.core/memo-fib 2)
TRACE t450: | | | | (fib.core/memo-fib 1)
TRACE t450: | | | | => 1
TRACE t451: | | | | (fib.core/memo-fib 0)
TRACE t451: | | | | => 0M
TRACE t449: | | | => 1M
TRACE t452: | | | (fib.core/memo-fib 1)
TRACE t452: | | | => 1
TRACE t448: | | => 2M
TRACE t453: | | (fib.core/memo-fib 2)
TRACE t453: | | => 1M
TRACE t447: | => 3M
TRACE t454: | (fib.core/memo-fib 3)
TRACE t454: | => 2M
TRACE t446: => 5M
Question 2
Why could Clojure assert at compile-time that the depth of merely 225 stack frames for the memoized Fibonacci in my case could potentially explode the whole JVM stack thus cease to run the recursion altogether? From the source code on memoize below, I could see that an empty hashmap was initiated to cache the arguments and the returns for the memoized Fibonacci. Did the said hashmap cause the assertion of StackOverflowError? Why?
(defn memoize
  "Returns a memoized version of a referentially transparent function. The
  memoized version of the function keeps a cache of the mapping from arguments
  to results and, when calls with the same arguments are repeated often, has
  higher performance at the expense of higher memory use."
  {:added "1.0"
   :static true}
  [f]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem args)]
        (val e)
        (let [ret (apply f args)]
          (swap! mem assoc args ret)
          ret)))))
Others - (for the sake of completeness but unrelated to the OP)
We can use loop/recur to achieve something like TCO, or rely on the laziness of iterate.
(ns fib.core)
(defn tail-fib [n]
(loop [n n
x (bigdec 0)
y 1]
(condp = n
0 0
1 y
(recur (dec n) y (+ x y)))))
(defn lazy-fib [n]
(->> n
(nth (iterate (fn [[x y]]
[y (+ x y)])
[(bigdec 0) 1]))
first))
(in-ns 'user)
(require '[clojure.tools.trace :refer [trace-ns]])
(trace-ns 'fib.core)
(fib.core/tail-fib 2000)
(println)
(fib.core/lazy-fib 2000)
The trace shows that neither of them makes recursive calls any longer.
TRACE t471: (fib.core/tail-fib 2000)
TRACE t471: => 4224696333392304878706725602341482782579852840250681098010280137314308584370130707224123599639141511088446087538909603607640194711643596029271983312598737326253555802606991585915229492453904998722256795316982874482472992263901833716778060607011615497886719879858311468870876264597369086722884023654422295243347964480139515349562972087652656069529806499841977448720155612802665404554171717881930324025204312082516817125M
TRACE t476: (fib.core/lazy-fib 2000)
TRACE t476: => 4224696333392304878706725602341482782579852840250681098010280137314308584370130707224123599639141511088446087538909603607640194711643596029271983312598737326253555802606991585915229492453904998722256795316982874482472992263901833716778060607011615497886719879858311468870876264597369086722884023654422295243347964480139515349562972087652656069529806499841977448720155612802665404554171717881930324025204312082516817125M

Memoization does not affect the stack when the cache is empty. Your Clojure code is not equivalent to the Python code because the Clojure version is recursive while the Python version is iterative.

memoize returns a function, so you have to use def:
(def fib
(memoize #(condp = %
0 (bigdec 0)
1 1
(+ (fib (dec %)) (fib (- % 2))))))
(fib 225)
=> 47068900554068939361891195233676009091941690850M

Not sure why on a fresh repl
(def fib
(memoize #(condp = %
0 (bigdec 0)
1 1
(+ (fib (dec %)) (fib (- % 2))))))
(fib 136) gives me an overflow
(fib 135) works fine
Then when I call (fib 136) again I get no error.
Doing, from a fresh repl,
(map fib (range 1 10000))
works fine
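One way to see why a cold call can overflow while calling fib on increasing inputs does not is to count how deep the recursion goes in each case. The sketch below is in Python rather than Clojure and only illustrates the idea. The names make_memo_fib and stats and the depth counter are hypothetical, introduced here for illustration. It counts logical calls, not JVM stack frames; how many JVM frames each memoized Clojure call costs is not measured here.

```python
def make_memo_fib():
    """Build a memoized recursive fib plus a record of its recursion depth.

    Hypothetical helper for illustration; not part of any library.
    """
    cache = {}
    stats = {"depth": 0, "max_depth": 0}

    def fib(n):
        # Every call occupies a stack level, even when the result is cached.
        stats["depth"] += 1
        stats["max_depth"] = max(stats["max_depth"], stats["depth"])
        try:
            if n not in cache:
                cache[n] = n if n < 2 else fib(n - 1) + fib(n - 2)
            return cache[n]
        finally:
            stats["depth"] -= 1

    return fib, stats


# Cold cache: a single call has to descend all the way to the base cases.
cold_fib, cold_stats = make_memo_fib()
cold_result = cold_fib(200)

# Warm cache: increasing inputs mean each call finds both sub-results cached.
warm_fib, warm_stats = make_memo_fib()
for k in range(201):
    warm_result = warm_fib(k)

print(cold_stats["max_depth"])  # depth grows linearly with n on a cold cache
print(warm_stats["max_depth"])  # depth stays constant once the cache is warm
```

With a cold cache the first call still recurses to a depth proportional to n before anything is stored, which is consistent with (fib 136) failing once and then succeeding after (fib 135) has filled the cache.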

I've carefully investigated the problem and finally figured it out, hopefully.
First, let's look at a comparable and handy recursive implementation of memoized Fibonacci in Python, before we study it in Clojure.
def memo_fib(n):
    def fib(n):
        print(f"ALL: {n}")
        if n not in dp:
            print(f"NEW: {n}")
            if n == 0:
                dp[0] = 0
            elif n == 1:
                dp[1] = 1
            else:
                dp[n - 1], dp[n - 2] = fib(n - 1), fib(n - 2)
                dp[n] = dp[n - 1] + dp[n - 2]
        return dp[n]
    dp = {}
    return fib(n)
memo_fib(5)
We add two print statements to trace the recursive execution of memoized Fibonacci. ALL marks every call that enters the call stack, even when its result is already memoized, while NEW marks only the calls whose result has yet to be memoized. Obviously, the ALL steps outnumber the NEW ones, which is also evident in the output below:
ALL: 5
NEW: 5
ALL: 4
NEW: 4
ALL: 3
NEW: 3
ALL: 2
NEW: 2
ALL: 1
NEW: 1
ALL: 0
NEW: 0
ALL: 1
ALL: 2
ALL: 3
Let's also compare it to naive Fibonacci in Python below,
def naive_fib(n):
    print(f"NAIVE: {n}")
    if n == 0:
        return 0
    if n == 1:
        return 1
    return naive_fib(n - 1) + naive_fib(n - 2)
naive_fib(5)
We can see below that memoized Fibonacci does reduce the number of recursive steps, and thus uses the stack more efficiently than naive Fibonacci, which contradicts the explanation @Eugene Pakhomov posted:
Memoization does not affect the stack when the cache is empty.
NAIVE: 5
NAIVE: 4
NAIVE: 3
NAIVE: 2
NAIVE: 1
NAIVE: 0
NAIVE: 1
NAIVE: 2
NAIVE: 1
NAIVE: 0
NAIVE: 3
NAIVE: 2
NAIVE: 1
NAIVE: 0
NAIVE: 1
Back to memoized Fibonacci in Clojure: what we were able to trace in the OP are just the NEW steps. To see the ALL steps, we would have to trace the memoize function itself from the inside. Please help if you know a convenient way to do that.
To conclude, memoize reduces repeated recursive calls, but it cannot entirely prevent recursive calls from spawning stack frames again for some of the overlapping sub-problems. A StackOverflowError can still surprise you if those sub-problems grow rapidly.

Related

clojure reduced not terminating reduce function

In the clojure documentation it says:
Usage: (reduced x)
Wraps x in a way such that a reduce will terminate with the value x
I am trying to return from a reduce function with a boolean and a vector values.
(def bp (reduce (fn [[balanced stack] singlenum]
(def stack2 (conj stack singlenum))
(println stack2)
(if (= 2 singlenum)
(reduced [false stack2])
)
[balanced stack2]
)
[true (vector)] [1 2 3 4]
))
bp evaluates to [true [1 2 3 4]], but I was expecting [false [1 2]]. The reduced call did not terminate the reduce. I was attempting to terminate the reduce with specific values.
You have the correct logic there. I just revised your usage of if and def.
if - I moved [balanced stack2] to the else part. Otherwise reduced will never be detected.
def - the def inside fn should be replaced with let
(def bp (reduce (fn [[balanced stack] singlenum]
(let [stack2 (conj stack singlenum)]
(println stack2)
(if (= 2 singlenum)
(reduced [false stack2])
[balanced stack2])))
[true (vector)]
[1 2 3 4]))
| | | | | stack=> []
| | | | | singlenum=> 1
| | | | (conj stack singlenum)=> [1]
| | | | stack2=> [1]
[1]
| | | (println stack2)=> nil
| | | | | singlenum=> 1
| | | | (= 2 singlenum)=> false
| | | | | balanced=> true
| | | | | stack2=> [1]
| | | (if (= 2 singlenum) (reduced #) [balanced stack2])=> [true [1]]
| | (let [stack2 #] (println stack2) (if # # #))=> [true [1]]
| | | | | stack=> [1]
| | | | | singlenum=> 2
| | | | (conj stack singlenum)=> [1 2]
| | | | stack2=> [1 2]
[1 2]
| | | (println stack2)=> nil
| | | | | singlenum=> 2
| | | | (= 2 singlenum)=> true
| | | | | | stack2=> [1 2]
| | | | (reduced [false stack2])=> #reduced[{:status :ready, :val [false [1 2]]} 0x5fbdbb78]
| | | (if (= 2 singlenum) (reduced #) [balanced stack2])=> #reduced[{:status :ready, :val [false [1 2]]} 0x5fbdbb78]
| | (let [stack2 #] (println stack2) (if # # #))=> #reduced[{:status :ready, :val [false [1 2]]} 0x5fbdbb78]
(reduce (fn # #) [true #] [1 2 3 4])=> [false [1 2]]
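The difference between the two versions can be sketched outside Clojure as well. The Python below is only an analogy: Reduced and reduce_with_early_exit are hypothetical stand-ins written for this example (Python's functools.reduce has no early exit), and they mimic the idea that the reducing loop stops only when the step function actually returns the wrapped value.

```python
class Reduced:
    """Hypothetical wrapper marking a value as 'stop here'."""

    def __init__(self, value):
        self.value = value


def reduce_with_early_exit(step, init, items):
    """Fold items with step, stopping as soon as step returns a Reduced."""
    acc = init
    for item in items:
        acc = step(acc, item)
        if isinstance(acc, Reduced):
            return acc.value  # unwrap and terminate
    return acc


def buggy_step(acc, num):
    balanced, stack = acc
    stack2 = stack + [num]
    if num == 2:
        Reduced((False, stack2))  # created, then discarded: never returned
    return (balanced, stack2)


def fixed_step(acc, num):
    balanced, stack = acc
    stack2 = stack + [num]
    if num == 2:
        return Reduced((False, stack2))  # the wrapper is the step's result
    return (balanced, stack2)


buggy_result = reduce_with_early_exit(buggy_step, (True, []), [1, 2, 3, 4])
fixed_result = reduce_with_early_exit(fixed_step, (True, []), [1, 2, 3, 4])
print(buggy_result)  # (True, [1, 2, 3, 4])
print(fixed_result)  # (False, [1, 2])
```

In the original function the if form had no else branch and was not the last expression, so its value was thrown away and the final [balanced stack2] was always returned.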

Clojure - Function that returns all integers up to a certain number

I want to create a function that when you input a number say 5 it'll return a vector of all numbers from 1 to 5, ie [1 2 3 4 5]
So far I am at
(defn counter [number]
(let [x 1
result []]
(when (<= x number)
(conj result x)
(inc x))))
Right now it'll put 1 into the vector but I want to say now (inc x) and run through again. Do I have to use recur?
Any help is much appreciated
The answers from @fl00r are fine, but I wanted to throw in a reducible one as well:
(defn get-range [n] ;; range/into
  ;; range excludes its end, so use (inc n) to include n itself
  (into [] (range 1 (inc n))))
Reducibles are a better fit for this use case because
range was rebuilt in Clojure 1.7 to be self-reducible, AND for the special case of long start/step/end values, it will use primitive longs during self reduction.
Using the into transducer form means that the final vector can be built directly, rather than building a lazy sequence and then putting it into a vector
into will automatically use transients when building the vector - there is a small overhead (compared to the loop approach) for very small ranges, but a big savings on larger ranges
Because you are building a concrete collection (a vector), there is no benefit from laziness, so an eager approach makes more sense
This approach is far more time and memory efficient than any of the sequence approaches. range as a reducible will consume no heap space (just a few locals). The transient vector is built by filling up arrays and then directly building persistent vector nodes as needed. By comparison, all of the sequence approaches will do boxed math on the iterator, build multiple nested cached sequence values, then copy those values one by one into a vector. The loop approach used above will use primitive math and obtains some of the perf benefits (but still has the downside of adding elements one-by-one to the vector vs transients).
Quick perf test (for vectors of various sizes):
Size | range/vec | iterate/vec | loop/recur | loop/recur' | range/into |
-----|---------------|---------------|---------------|---------------|---------------|
1 | 160.968880 ns | 180.287974 ns | 40.373079 ns | 79.203722 ns | 136.157046 ns |
10 | 378.058753 ns | 851.381372 ns | 342.720391 ns | 200.658997 ns | 253.015756 ns |
100 | 2.486726 µs | 8.034826 µs | 3.464423 µs | 1.471333 µs | 1.760118 µs |
1000 | 23.349414 µs | 88.188242 µs | 37.247193 µs | 16.443044 µs | 17.109882 µs |
Perf tests were done with Criterium quick-bench on Java 1.8/Clojure 1.8
Versions tested:
range/vec - 1st example from @fl00r - primitive math, 1 sequence, transients
iterate/vec - 2nd example from @fl00r - boxed math, 2 sequences, transients
loop/recur - 3rd example from @fl00r - primitive math, 0 sequences, no transients
loop/recur' - same as prior, but modified to use transients - primitive math, 0 sequences, transients
range/into - the example at the top of this answer - primitive math, 0 sequences, transients
Note that both of the last two have similar characteristics but loop/recur' uses a lot more code:
(defn get-range [n] ;; loop/recur'
(loop [m 1
res (transient [])]
(if (> m n)
(persistent! res)
(recur (inc m) (conj! res m)))))
There is a builtin function range. To achieve your goal (vector of numbers from 1 to n inclusively) you can wrap it as follows:
(defn get-range [n]
(->> n inc (range 1) vec))
(get-range 5)
#=> [1 2 3 4 5]
Also, you can go and use iterate function
(defn get-range [n]
(->> (iterate inc 1)
(take n)
vec))
Or use a loop:
(defn get-range [n]
(loop [m 1
res []]
(if (> m n)
res
(recur (inc m) (conj res m)))))

Logic error when checking diagonal - nQueens

After solving my error with values-list and being able to run my program until the end, I found that my diagonal check seems to have a logic error. My input is as follows:
(THREAT? '(1 3) '((1 0)(2 4)(3 0)(4 0)(5 0)(6 0)(7 0)(8 0)))
The first argument is the board space we are testing to see whether it is safe to place a queen there, and the second argument is the state of the board. The y values 1-8 give the column position of a piece, and a y value of 0 indicates that the row with that x value holds no piece. My code is as follows:
(defun diagonal(point1 point2)
(= (abs (- ( car point1 ) ( car point2 )))
(abs (- ( cadr point1 ) ( cadr point2 ))))
)
(defun THREAT?(x y)
; Checks threat on the vertical
(when (not (eq (values-list (cdr (nth (- (car x) 1 ) y )) ) '0 ) )
(return-from THREAT? t)
)
(loop for i from 0 to (list-length y)
; Checks threat on the horizontal
when (eq (values-list ( cdr x )) (values-list (cdr (nth i y))) )
do (return-from THREAT? t)
; With the help of the diagonal function checks along the diagonal
when (diagonal x (nth i y) )
do (return-from THREAT? t)
)
)
If my understanding is correct, my program should loop through every element of y and pass x and the current pair from y to the diagonal function. The diagonal function subtracts the two x coordinates and the two y coordinates, takes the absolute value of each difference, and checks whether the results are equal (for example, (1 2) and (2 3) are diagonal, and therefore |1 - 2| = 1 and |2 - 3| = 1). The diagonal function should return true if these numbers are equal. The corresponding when clause should only fire when it receives true from the diagonal function, and yet it seems to always return true, even when I give the program a completely blank board. How do I fix diagonal to correctly determine a threat on the board? Any and all help is greatly appreciated!
I have rewritten your code to better Lisp style.
much better naming.
procedures with useful names make comments redundant
individual procedures are better testable
got rid of the VALUES-LIST nonsense
get rid of all CAR, CDR, CADR. Use FIRST and SECOND.
introduced accessors for x and y components of a point
got rid of the strange control flow with RETURN-FROM, replaced it with a simple OR
actually directly iterate over a list, instead of using NTH all the time
EQ is not for comparing number equality, use = instead
don't place parentheses alone on a line.
indent and format the code correctly
don't put spaces between parentheses
put a space between an atom and an open parenthesis
Code:
(defun get-x (point)
(first point))
(defun get-y (point)
(second point))
(defun diagonal? (point1 point2)
(= (abs (- (get-x point1) (get-x point2)))
(abs (- (get-y point1) (get-y point2)))))
(defun vertical? (point)
(not (zerop (get-y point))))
(defun horizontal? (point1 point2)
(= (get-y point1)
(get-y point2)))
(defun threat? (point list-of-columns)
(or (vertical? (nth (1- (get-x point)) list-of-columns))
(loop for point2 in list-of-columns
when (or (horizontal? point point2)
(diagonal? point point2))
return t)))
Example
Now we can trace the three threat predicates:
? (trace vertical? diagonal? horizontal?)
NIL
Now you can call your example:
? (threat? '(1 3) '((1 0) (2 4) (3 0) (4 0) (5 0) (6 0) (7 0) (8 0)))
0> Calling (VERTICAL? (1 0))
<0 VERTICAL? returned NIL
0> Calling (HORIZONTAL? (1 3) (1 0))
<0 HORIZONTAL? returned NIL
0> Calling (DIAGONAL? (1 3) (1 0))
<0 DIAGONAL? returned NIL
0> Calling (HORIZONTAL? (1 3) (2 4))
<0 HORIZONTAL? returned NIL
0> Calling (DIAGONAL? (1 3) (2 4))
<0 DIAGONAL? returned T
T
This should help, so that you can better debug your code... Look at the trace output.
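The trace also hints at why a completely blank board still reports a threat: the (x 0) placeholder entries are compared as if they were real pieces. The sketch below shows this in Python rather than Lisp. The function names are hypothetical, and the separate check of the point's own row is left out for brevity, so this is an illustration of the comparison logic and not a full port of the code above.

```python
def is_diagonal(p1, p2):
    """True when the two points differ by the same amount in x and in y."""
    return abs(p1[0] - p2[0]) == abs(p1[1] - p2[1])


def threat_with_placeholders(point, columns):
    """Compare against every entry, including (x, 0) placeholders."""
    return any(p[1] == point[1] or is_diagonal(point, p) for p in columns)


def threat_skipping_empty(point, columns):
    """Same test, but ignore entries whose y is 0 (no piece in that row)."""
    return any(
        p[1] == point[1] or is_diagonal(point, p)
        for p in columns
        if p[1] != 0
    )


blank_board = [(x, 0) for x in range(1, 9)]
question_board = [(1, 0), (2, 4), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0)]

# (1, 3) against the placeholder (4, 0): |1 - 4| = 3 and |3 - 0| = 3.
print(threat_with_placeholders((1, 3), blank_board))   # True, a false alarm
print(threat_skipping_empty((1, 3), blank_board))      # False
print(threat_skipping_empty((1, 3), question_board))   # True, because of (2, 4)
```

Dropping the empty entries from the board representation altogether avoids the problem by construction.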
A version which does not use empty column descriptions
(defun get-x (point)
(first point))
(defun get-y (point)
(second point))
(defun diagonal? (point1 point2)
(= (abs (- (get-x point1) (get-x point2)))
(abs (- (get-y point1) (get-y point2)))))
(defun vertical? (point list-of-columns)
(let ((point2 (find (get-x point) list-of-columns :key #'get-x)))
(and point2 (not (zerop (get-y point2))))))
(defun horizontal? (point1 point2)
(= (get-y point1)
(get-y point2)))
(defun threat? (point list-of-columns)
(or (vertical? point list-of-columns)
(loop for point2 in list-of-columns
when (or (horizontal? point point2)
(diagonal? point point2))
return t)))
(defun print-board (board)
(format t "~%+-+-+-+-+-+-+-+-+")
(dotimes (y 8)
(terpri)
(dotimes (x 8)
(format t "|~a" (if (member (list x y) board :test #'equal) "x" " ")))
(format t "|~%+-+-+-+-+-+-+-+-+")))
Example:
CL-USER 138 > (threat? '(1 2) '((2 4)))
NIL
CL-USER 139 > (print-board '((1 2) (2 4)))
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| |x| | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | |x| | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
NIL
Another example:
CL-USER 140 > (threat? '(1 2) '((2 4) (4 5)))
T
CL-USER 141 > (print-board '((1 2) (2 4) (4 5)))
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| |x| | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | |x| | | | | |
+-+-+-+-+-+-+-+-+
| | | | |x| | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
| | | | | | | | |
+-+-+-+-+-+-+-+-+
NIL

Why is `(a) read as a list while `(a b) isn't?

While learning clojure, I was very surprised to find out that these two objects are different types:
(list? `(inc)) ;; true
(list? `(inc 1)) ;; false
In theory, I understand why the second form returns false, that object
is actually a clojure.lang.Cons. In practice, though, I don't
understand why that is happening.
Why does the reader read `(inc) different from `(inc 1)? What is happening under the hood?
When the reader encounters a syntax-quoted form, that turns out to be a collection, it will iterate over each element and call syntax-quote recursively. The result is consed, beginning with nil.
So it comes down to the question why the following holds:
> (list? (cons 'inc nil))
true
> (list? (cons 'inc (cons 1 nil)))
false
This seems to be a matter of definition.
list? is actually a function of very limited usefulness. In fact I have yet to see Clojure code that used list? without it being at best a poor choice, more often the cause of a bug.
If you want to know if something is "listy", seq? is a great choice.
in action:
user=> (pprint/print-table (for [item [[] () `(a) `(a b) (seq [1])]]
{'item (pr-str item)
'seq? (seq? item)
'list? (list? item)
'type (type item)}))
| item | seq? | list? | type |
|-----------------+-------+-------+------------------------------------------------|
| [] | false | false | class clojure.lang.PersistentVector |
| () | true | true | class clojure.lang.PersistentList$EmptyList |
| (user/a) | true | true | class clojure.lang.PersistentList |
| (user/a user/b) | true | false | class clojure.lang.Cons |
| (1) | true | false | class clojure.lang.PersistentVector$ChunkedSeq |

How does #(for [x %, i (range %2)] x) do what it does?

I'm working through the 4clojure.com problems (this is from problem 33), and I can't for the life of me figure out how this works:
(#(for [x %, i (range %2)] x) [1 2 3] 2) ; --> '(1 1 2 2 3 3)
I can see that for binds x to [1 2 3] and then does something twice (i is bound to '(0 1)), but I'd expect an answer like '([1 2 3] [1 2 3]). It looks like this code is somehow doing a mapcat on the output.
The docstring for for includes the following: Collections are iterated in a nested fashion, rightmost fastest.... This gives me the intuition that i is taking on the values 0 and 1 for each x, but I can't say I understand what's going on.
Can somebody explain what's going on in a way that improves my mental model of how for works? Many thanks!
When you fill in the function arguments you get the following:
(for [x [1 2 3]
i (range 2)]
x)
;; => (1 1 2 2 3 3)
Where:
(range 2) ;; => (0 1)
The rightmost item the docstring is referring to is i, which has two elements. So, if you unroll the loop, x and i would progress like the following table:
(clojure.pprint/print-table
(for [x [1 2 3] i (range 2)] {:x x :i i}))
| :x | :i |
|----+----|
| 1 | 0 |
| 1 | 1 |
| 2 | 0 |
| 2 | 1 |
| 3 | 0 |
| 3 | 1 |
The result of for is a list containing the results returned for every iteration of the loop. In this case you are just returning x, so your resulting list would correspond to only the x column in the above table.
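For readers who know Python, a list comprehension with two for clauses nests the same way (leftmost outermost, rightmost fastest), so it can serve as a rough mental model. This is an analogy only: Clojure's for is lazy and Python's comprehension is eager, and the variable names below are chosen for this example.

```python
xs = [1, 2, 3]
n = 2

# Two clauses: for each x, i runs through range(n); only x is kept.
flat = [x for x in xs for i in range(n)]

# Keeping both bindings reproduces the table of (x, i) combinations.
pairs = [(x, i) for x in xs for i in range(n)]

# The same thing written as explicit nested loops.
unrolled = []
for x in xs:
    for i in range(n):
        unrolled.append(x)

print(flat)   # [1, 1, 2, 2, 3, 3]
print(pairs)  # [(1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
```

The body is evaluated once per combination of bindings, which is why each x appears n times instead of the whole vector appearing n times.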