Tracking down a StackOverflow in a Clojure program, contains SSCCE

I am having a hard time tracking down a StackOverflowError produced by my Clojure implementation of the Bron-Kerbosch algorithm.
Below is my implementation of the algorithm itself, and here is a link to an SSCCE outlining the issue: http://pastebin.com/gtpTT4B4.
(defn Bron-Kerbosch [r p x graph cliques]
  ;(println (str "R:" r " P:" p " X:" x " Clq:" cliques))
  (cond (and (empty? p) (empty? x)) (concat cliques r)
        :else
        (let [neigh (neighV graph (dec (count p)))]
          (loop [loop-clq '(cliques)
                 loop-cnt (dec (count p))
                 loop-p '(p)
                 loop-x '(x)]
            (cond (= -1 loop-cnt) loop-clq
                  :else
                  (recur (concat loop-clq
                                 (Bron-Kerbosch (concat r loop-cnt)
                                                (concat p neigh)
                                                (filter (set x) neigh)
                                                graph cliques))
                         (dec loop-cnt)
                         (filter (set p) loop-cnt)
                         (concat x loop-cnt)))))))
Because the algorithm is recursive, I assume the issue lies in one of my two bootstrap (base-case) conditions: (cond (and (empty? p) (empty? x)) ...) or (cond (= -1 loop-cnt) ...).
This assumes, though, that I am building the lists x, r and p correctly. Judging by the output of the commented-out print statement (cliques is always printed as an EmptyList), my list building might also be the issue.
Along the same lines, the other issue I can see is that I may not be calling the algorithm properly in the BK-Call function (in the SSCCE).
My overall question is: what is causing this? Since that is somewhat too open, another question that might lead me to the answer is how I can make use of the print statement on the first line.
When the print statement is uncommented it produces the output
R:clojure.lang.LazySeq@2044e0b9 P:clojure.lang.LazySeq@7f9700a5 X:clojure.lang.LazySeq@1 Clq:clojure.lang.PersistentList$EmptyList@1
I assume that if I could see what x, r and p are at each call, I might be able to see where the algorithm is going wrong.
Can anyone point me in the right direction?
EDIT: The neighV function from the SSCCE
(defn neighV [graph nodenum]
(let [ret-list (for [i (range (count graph)) :when (contains? (graph i) nodenum)] i)]
ret-list))
EDIT2: Noisesmith's answer got me closer to the solution and made sense to me. I wrapped all of my concat calls in doall. After calling the function again I was getting "Cannot cast Long to Seq" errors. I figured that these stemmed from trying to concat loop-cnt onto lists in the function:
fptests.core> (BK-Call (sanity1))
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:505)
fptests.core> (concat 1 '(2 3))
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:505)
So I then wrapped each loop-cnt in a '() to turn it into a list before it is concatenated:
fptests.core> (concat '(1) '(2 3))
(1 2 3)
After making all of these changes, I ended up back at my stack overflow. Here is the new Bron-Kerbosch function with all of the edits. I guess I now have the same questions as before.
The new questions are: did I implement the suggested changes correctly, and does using '() make sense as a fix for the issue that came up after implementing noisesmith's changes?
(defn Bron-Kerbosch1 [r p x graph cliques]
  (cond (and (empty? p) (empty? x)) (doall (concat cliques r))
        :else
        (let [neigh (neighV graph (dec (count p)))]
          (loop [loop-clq '(cliques)
                 loop-cnt (dec (count p))
                 loop-p '(p)
                 loop-x '(x)]
            (cond (= -1 loop-cnt) loop-clq
                  :else
                  (recur (doall (concat loop-clq
                                        (Bron-Kerbosch1 (doall (concat r '(loop-cnt)))
                                                        (doall (concat p neigh))
                                                        (filter (set x) neigh)
                                                        graph cliques)))
                         (dec loop-cnt)
                         (filter (set p) '(loop-cnt))
                         (doall (concat x '(loop-cnt)))))))))
EDIT3: After patiently waiting for my prn statements to stop (not sure why I put them in when I knew it would end in a stack overflow), I found that most if not all of the printed statements were along the lines of
"R:" (loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt ...)
"P:" (range (count graph) 0 2 3) " X:" () " Clq:" ()
After inspecting this, I realized that I have not been calling the function recursively in the right way. I have been union'ing items onto P instead of removing them, which causes P to grow continuously. This is most likely the cause of my stack overflow. However, even with that fixed, I am still getting a stack overflow.
Now that I no longer keep union'ing onto P, the problem is that when I concat '(loop-cnt), it is not evaluated to a value but stays as the symbol loop-cnt. I suspect that my stack overflow now comes from my bootstrap condition never being met, because it cannot be met if loop-cnt is not evaluated to a number.
So I think my issue now lies with concatenating loop-cnt onto a list as a number rather than as a symbol.
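For comparison, here is a minimal sketch of the basic (non-pivoting) Bron-Kerbosch recursion using sets instead of lazily concatenated lists. It assumes the graph is a map from each node to the set of its neighbours, which is not the representation used in the SSCCE; the name bron-kerbosch and the sample graph are illustrative only. What it shows is the point from EDIT3: on each iteration the chosen vertex is removed from P and added to X, so P shrinks and the loop terminates.

```clojure
(require '[clojure.set :as set])

(defn bron-kerbosch
  "Returns a vector of maximal cliques (as sets).
  graph is assumed to be a map of node -> set of neighbouring nodes."
  [graph r p x]
  (if (and (empty? p) (empty? x))
    [r]                                   ; r cannot be extended: it is maximal
    (loop [p p, x x, cliques []]
      (if (empty? p)
        cliques
        (let [v  (first p)
              nv (graph v)]
          (recur (disj p v)               ; P shrinks on every iteration
                 (conj x v)               ; v is now excluded
                 (into cliques
                       (bron-kerbosch graph
                                      (conj r v)
                                      (set/intersection p nv)
                                      (set/intersection x nv)))))))))

;; A triangle 1-2-3 with an extra edge 3-4 (hypothetical test graph).
(def sample-graph {1 #{2 3}, 2 #{1 3}, 3 #{1 2 4}, 4 #{3}})

(set (bron-kerbosch sample-graph #{} (set (keys sample-graph)) #{}))
;; => #{#{1 2 3} #{3 4}}
```

Because R, P and X are eager sets here, there is no chain of unrealized lazy sequences, and the recursion depth is bounded by the size of the largest clique.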

concat is lazy. Recursive calls that build concat on top of concat, without realizing any of the prior layers of laziness, each add to the size of the stack of calls that will be needed to realize the lazy-seq.
Does this concatenation need to be lazy at all? If not, wrap the calls to concat in calls to doall. This makes the concatenation eager, which reduces the size of the call stack needed to realize the final result and thus eliminates the StackOverflowError.
Also, the correct way to print a lazy-seq and see its contents is prn. If you need the printed form as a string, pr-str returns what pr or prn would print.
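A small sketch of the difference, using a throwaway sequence (the names xs and realized are just for illustration):

```clojure
(def xs (concat [1 2] [3]))

(str xs)      ; typically something like "clojure.lang.LazySeq@7861", not the contents
(pr-str xs)   ; => "(1 2 3)"
(prn xs)      ; prints (1 2 3)

;; doall walks the sequence once and returns it with every element realized,
;; so later layers of concat are built on top of concrete data.
(def realized (doall (concat [1 2] [3])))
(realized? realized) ; => true
```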

You are misusing quoted lists, I think.
For example, in (defn Bron-Kerbosch1 [ ... ] ... ), '(cliques) evaluates to a list containing the symbol cliques, not to a list containing the argument cliques as its one element. You want (list cliques) for that.
Similar cases abound.
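A quick illustration of the difference at the REPL. The var cliques below is a stand-in value, not the argument from the question:

```clojure
(def cliques [[1 2 3]])

'(cliques)       ; => (cliques)      a list holding the symbol cliques
(list cliques)   ; => ([[1 2 3]])    a list holding the value of cliques
[cliques]        ; => [[[1 2 3]]]    a vector literal also evaluates its elements

(symbol? (first '(cliques)))          ; => true
(= cliques (first (list cliques)))    ; => true
```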

Related

Is it bad practice to try and keep track of iterations while using reduce/map in Clojure?

So being new to Clojure and functional programming in general, I sometimes (to quote a book) "feel like your favourite tool has been taken from you". Trying to get a better grasp on this stuff I'm doing string manipulation problems.
So knowing the functional paradigm is all about recursion (and other things) I've been using tail recursive functions to do things I'd normally do with loops, then trying to implement using map or reduce. For those more experienced, does this sound like a sane thing to do?
I'm starting to get frustrated because I'm running into problems where I need to keep track of the index of each character when iterating over strings but that's proving difficult because reduce and map feel "isolated". I can't increment a value while a string is being reduced...
Is there something I'm missing; a function for exactly this.. Or can this specific case just not be implemented using these core functions? Or is the way I'm going about it just wrong and un-functional-like which is why I'm stuck?
Here's an example I'm having:
This function takes five separate strings then using reduce, builds a vector containing all the characters at position char-at in each string. How could you change this code so that char-at (in the anonymous function) gets incremented after each string gets passed? This is what I mean by it feels "isolated" and I don't know how to get around this.
(defn new-string-from-five
"This function takes a character at position char-at from each of the strings to make a new vector"
[five-strings char-at]
(reduce (fn [result string]
(conj result (get-char-at string char-at)))
[]
five-strings))
Old :
"abc" "def" "ghi" "jkl" "mno" -> [a d g j m] (always taken from index 0)
Modified :
"abc" "def" "ghi" "jkl" "mno" ->[a e i j n] (index gets incremented and loops back around)
I don't think there's anything insane about writing string manipulation functions to get your head around things, though it's certainly not the only way. I personally found Clojure for the Brave and True, 4clojure, and the Clojurians Slack channel most helpful when learning Clojure.
On your question, probably the most common thing to do would be to add an index to your initial collection (in this case a string) using map-indexed:
user=> (map-indexed vector [9 9 9])
([0 9] [1 9] [2 9])
So for your example
(defn new-string-from-five
"This function takes a character at position char-at from each of the strings to make a new vector"
[five-strings char-at]
(reduce (fn [result [string-idx string]]
(conj result (get-char-at string (+ string-idx char-at))))
[]
(map-indexed vector five-strings)))
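Since get-char-at is not shown in the question, here is a self-contained sketch that assumes it behaves like nth. It also wraps the index with mod so that it "loops back around" as in the Modified example. The name chars-on-diagonal is made up for this sketch, and it assumes none of the strings is empty.

```clojure
(defn chars-on-diagonal
  "Takes the character at position (start + string index), wrapped to the
  length of each string. Assumes every string is non-empty."
  [strings start]
  (vec (map-indexed
         (fn [idx s] (nth s (mod (+ start idx) (count s))))
         strings)))

(chars-on-diagonal ["abc" "def" "ghi" "jkl" "mno"] 0)
;; => [\a \e \i \j \n]
```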
But how would I build map-indexed? Well
Non-lazily:
(defn map-indexed' [f coll]
(loop [idx 0
res []
rest-coll coll]
(if (empty? rest-coll)
res
(recur (inc idx) (conj res (f idx (first rest-coll))) (rest rest-coll)))))
Lazily (recommend not trying to understand this yet):
(defn map-indexed' [f coll]
  (letfn [(map-indexed'' [idx f coll]
            (if (empty? coll)
              '()
              ;; cons, unlike conj, does not force the rest of the sequence
              (lazy-seq (cons (f idx (first coll)) (map-indexed'' (inc idx) f (rest coll))))))]
    (map-indexed'' 0 f coll)))
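Another way to get a feel for what map-indexed does is to pair each element with a counter from an infinite (range). The helper name indexed is only for this sketch:

```clojure
(defn indexed
  "Pairs each element of coll with its position."
  [coll]
  (map vector (range) coll))

(indexed "abc")
;; => ([0 \a] [1 \b] [2 \c])
```

map stops as soon as the shorter collection runs out, so the infinite range is harmless here.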
You can use reductions:
(defn new-string-from-five
[five-strings]
(->> five-strings
(reductions
(fn [[res i] string]
[(get-char-at string i) (inc i)])
[nil 0])
rest
(mapv first)))
But in this case, I think map, mapv or map-indexed is cleaner. E.g.
(map-indexed
(fn [i s] (get-char-at s i))
["abc" "def" "ghi" "jkl" "mno"])

How to make reduce more readable in Clojure?

A reduce call has its f argument first. Visually speaking, this is often the biggest part of the form.
e.g.
(reduce
  (fn [[longest current] x]
    (let [tail (last current)
          next-seq (if (or (not tail) (> x tail))
                     (conj current x)
                     [x])
          new-longest (if (> (count next-seq) (count longest))
                        next-seq
                        longest)]
      [new-longest next-seq]))
  [[][]]
  col)
The problem is, the val argument (in this case [[][]]) and col argument come afterward, below, and it's a long way for your eyes to travel to match those with the parameters of f.
It would look more readable to me if it were in this order instead:
(reduceb val col
(fn [x y]
...))
Should I implement this macro, or am I approaching this entirely wrong in the first place?
You certainly shouldn't write that macro, since it is easily written as a function instead. I'm not super keen on writing it as a function, either, though; if you really want to pair the reduce with its last two args, you could write:
(-> (fn [x y]
...)
(reduce init coll))
Personally when I need a large function like this, I find that a comma actually serves as a good visual anchor, and makes it easier to tell that two forms are on that last line:
(reduce (fn [x y]
...)
init, coll)
Better still is usually to not write such a large reduce in the first place. Here you're combining at least two steps into one rather large and difficult step, by trying to find all at once the longest decreasing subsequence. Instead, try splitting the collection up into decreasing subsequences, and then take the largest one.
(defn decreasing-subsequences [xs]
  (lazy-seq
    (cond (empty? xs) []
          (not (next xs)) (list xs)
          :else (let [[x & [y :as more]] xs
                      remainder (decreasing-subsequences more)]
                  (if (> y x)
                    (cons [x] remainder)
                    (cons (cons x (first remainder)) (rest remainder)))))))
Then you can replace your reduce with:
(apply max-key count (decreasing-subsequences xs))
Now, the lazy function is not particularly shorter than your reduce, but it is doing one single thing, which means it can be understood more easily; also, it has a name (giving you a hint as to what it's supposed to do), and it can be reused in contexts where you're looking for some other property based on decreasing subsequences, not just the longest. You can even reuse it more often than that, if you replace the > in (> y x) with a function parameter, allowing you to split up into subsequences based on any predicate. Plus, as mentioned it is lazy, so you can use it in situations where a reduce of any sort would be impossible.
Speaking of ease of understanding, as you can see I misunderstood what your function is supposed to do when reading it. I'll leave as an exercise for you the task of converting this to strictly-increasing subsequences, where it looked to me like you were computing decreasing subsequences.
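As a rough sketch of the generalization mentioned above, the comparison can be passed in as a parameter. The name runs-by and the argument continues? are made up here; (continues? x y) is assumed to return true when y should stay in the same run as x.

```clojure
(defn runs-by
  "Lazily splits xs into maximal runs in which (continues? x y) holds
  for every adjacent pair x, y."
  [continues? xs]
  (lazy-seq
    (cond (empty? xs) []
          (not (next xs)) (list xs)
          :else (let [[x & [y :as more]] xs
                      remainder (runs-by continues? more)]
                  (if (continues? x y)
                    (cons (cons x (first remainder)) (rest remainder))
                    (cons [x] remainder))))))

(runs-by < [1 2 0 5 6 3])
;; => ((1 2) (0 5 6) (3))

(apply max-key count (runs-by < [1 2 0 5 6 3]))
;; => (0 5 6)
```

Passing < gives strictly increasing runs, and passing > gives strictly decreasing ones.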
You don't have to use reduce or recursion to get the descending (or ascending) sequences. Here we are returning all the descending sequences in order from longest to shortest:
(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])
(defn descending-sequences [xs]
  (->> xs
       (partition 2 1)
       (map (juxt (fn [[x y]] (> x y)) identity))
       (partition-by first)
       (filter ffirst)
       (map #(let [xs' (mapcat second %)]
               (take-nth 2 (cons (first xs') xs'))))
       (sort-by (comp - count))))
(descending-sequences in)
;;=> ((7 6 5 4 3 2) (3 2 1 0 -1) (7 6))
(partition 2 1) gives every possible comparison and partition-by allows you to mark out the runs of continuous decreases. At this point you can already see the answer and the rest of the code is removing the baggage that is no longer needed.
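To make the two key steps concrete, here are the intermediate values for a short input. The vars sample, pairs and runs exist only for this illustration:

```clojure
(def sample [3 2 1 5 4])

;; Every adjacent pair, i.e. every comparison that has to be made.
(def pairs (partition 2 1 sample))
;; => ((3 2) (2 1) (1 5) (5 4))

;; Consecutive pairs grouped by whether they are a decrease.
(def runs (partition-by (fn [[x y]] (> x y)) pairs))
;; => (((3 2) (2 1)) ((1 5)) ((5 4)))
```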
If you want the ascending sequences instead then you only need to change the > to a <:
;;=> ((-1 2 7) (6 7))
If, as in the question, you only want the longest sequence then put a first as the last function call in the thread last macro. Alternatively replace the sort-by with:
(apply max-key count)
For maximum readability you can name the operations:
(defn greatest-continuous [op xs]
  (let [op-pair? (fn [[x y]] (op x y))
        take-every-second #(take-nth 2 (cons (first %) %))
        make-canonical #(take-every-second (apply concat %))]
    (->> xs
         (partition 2 1)
         (partition-by op-pair?)
         (filter (comp op-pair? first))
         (map make-canonical)
         (apply max-key count))))
I feel your pain...they can be hard to read.
I see 2 possible improvements. The simplest is to write a wrapper similar to the Plumatic Plumbing defnk style:
(fnk-reduce { :fn (fn [state val] ... <new state value>)
:init []
:coll some-collection } )
so the function call has a single map arg, where each of the 3 pieces is labelled & can come in any order in the map literal.
Another possibility is to just extract the reducing fn and give it a name. This can be either internal or external to the code expression containing the reduce:
(let [glommer (fn [state value] (into state value)) ]
(reduce glommer #{} some-coll))
or possibly
(defn glommer [state value] (into state value))
(reduce glommer #{} some-coll)
As always, anything that increases clarity is preferred. If you haven't noticed already, I'm a big fan of Martin Fowler's idea of Introduce Explaining Variable refactoring. :)
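The fnk-reduce wrapper above is hypothetical, not an existing library function. A minimal sketch of it as a plain function (no macro needed) might look like this:

```clojure
(defn fnk-reduce
  "Hypothetical wrapper: reduce with labelled arguments passed in one map."
  [{f :fn, init :init, coll :coll}]
  (reduce f init coll))

(fnk-reduce {:init #{}
             :coll [[1 2] [2 3]]
             :fn   (fn [state value] (into state value))})
;; => #{1 2 3}
```

The :fn key is destructured to a local called f so that the fn macro is not shadowed inside the function body.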
I will apologize in advance for posting a longer solution to something where you wanted more brevity/clarity.
We are in the new age of clojure transducers and it appears a bit that your solution was passing the "longest" and "current" forward for record-keeping. Rather than passing that state forward, a stateful transducer would do the trick.
(def longest-decreasing
  (fn [rf]
    (let [longest (volatile! [])
          current (volatile! [])
          tail (volatile! nil)]
      (fn
        ([] (rf))
        ([result] (transduce identity rf result))
        ([result x] (do (if (or (nil? @tail) (< x @tail))
                          (if (> (count (vswap! current conj (vreset! tail x)))
                                 (count @longest))
                            (vreset! longest @current))
                          (vreset! current [(vreset! tail x)]))
                        @longest))))))
Before you dismiss this approach, realize that it just gives you the right answer and you can do some different things with it:
(def coll [2 1 10 9 8 40])
(transduce longest-decreasing conj coll) ;; => [10 9 8]
(transduce longest-decreasing + coll) ;; => 27
(reductions (longest-decreasing conj) [] coll) ;; => ([] [2] [2 1] [2 1] [2 1] [10 9 8] [10 9 8])
Again, I know that this may appear longer, but the potential to compose this with other transducers might be worth the effort (not sure if my arity 1 breaks that??)
I believe that iterate can be a more readable substitute for reduce. For example here is the iteratee function that iterate will use to solve this problem:
(defn step-state-hof [op]
(fn [{:keys [unprocessed current answer]}]
(let [[x y & more] unprocessed]
(let [next-current (if (op x y)
(conj current y)
[y])
next-answer (if (> (count next-current) (count answer))
next-current
answer)]
{:unprocessed (cons y more)
:current next-current
:answer next-answer}))))
current is built up until it becomes longer than answer, in which case a new answer is created. Whenever the condition op is not satisfied we start again building up a new current.
iterate itself returns an infinite sequence, so needs to be stopped when the iteratee has been called the right number of times:
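(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])
;; Start with the whole input unprocessed so that the first pair compared
;; is (3 2); the step function then runs (- (count in) 1) times.
(->> (iterate (step-state-hof >) {:unprocessed in
                                  :current (vec (take 1 in))})
     (drop (- (count in) 1))
     first
     :answer)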
;;=> [7 6 5 4 3 2]
Often you would use a drop-while or take-while to short circuit just when the answer has been obtained. We could do that here; however, there is no short circuiting required, as we know in advance that the inner function of step-state-hof needs to be called (- (count in) 1) times. That is one less than the count because it is processing two elements at a time. Note that first is forcing the final call.
I wanted this order for the form:
reduce
val, col
f
I was able to figure out that this technically satisfies my requirements:
> (apply reduce
(->>
[0 [1 2 3 4]]
(cons
(fn [acc x]
(+ acc x)))))
10
But it's not the easiest thing to read.
This looks much simpler:
> (defn reduce< [val col f]
(reduce f val col))
nil
> (reduce< 0 [1 2 3 4]
(fn [acc x]
(+ acc x)))
10
(< is shorthand for "parameters are rotated left"). Using reduce<, I can see what's being passed to f by the time my eyes get to the f argument, so I can just focus on reading the f implementation (which may get pretty long). Additionally, if f does get long, I no longer have to visually check the indentation of the val and col arguments to determine that they belong to the reduce symbol way farther up. I personally think this is more readable than binding f to a symbol before calling reduce, especially since fn can still accept a name for clarity.
This is a general solution, but the other answers here provide many good alternative ways to solve the specific problem I gave as an example.

Bindings inside lazy prime number generator function in Clojure

I have implemented a lazy prime number generator (nextprime returns the next prime starting from the number passed):
(defn allprimes
([] (allprimes 2))
([x] (lazy-seq (cons (nextprime x) (allprimes (nextprime x))))))
Let's assume nextprime is a costly function, in order not to execute it twice I have tried binding it to a symbol:
(defn allprimes
([] (allprimes 2))
([x] (let [next (nextprime x)]
(cons next (lazy-seq (allprimes (next)))))))
But this does not work (java.lang.Long cannot be cast to clojure.lang.IFn). Why?
Also, is there any difference between (cons n (lazy-seq ...)) and (lazy-seq (cons n ...)) ?
edit: thanks Kyle for pointing the error in the first question. If parentheses are removed from next, it works.
You have extra parentheses around next: (next) calls next as a function (IFn), but next is a Long.
For your second question, the last example in the Clojure docs gives details on the difference between the two solutions.
Roughly (cons n (lazy-seq ...)) will always evaluate n, even if it is not consumed, and (lazy-seq (cons n ...)) is fully lazy. It can matter if n is not just a number but some function that may be computation intensive.
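A small sketch that makes the difference observable by counting how often the head is computed. The names calls and expensive-head are invented for the demonstration:

```clojure
(def calls (atom 0))

(defn expensive-head
  "Stands in for a costly computation; counts how often it runs."
  []
  (swap! calls inc)
  :head)

;; The head is computed immediately, before anyone consumes the sequence.
(def eager (cons (expensive-head) (lazy-seq nil)))
(def calls-after-eager @calls)        ; 1

;; Nothing is computed until the sequence is realized.
(def lazy (lazy-seq (cons (expensive-head) nil)))
(def calls-after-lazy @calls)         ; still 1

(first lazy)
(def calls-after-realizing @calls)    ; 2
```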

Weird behaviour binding in loop recursion

I'm learning Clojure, and I'm trying to solve the Problem 31: Write a function which packs consecutive duplicates into sub-lists.
(= (__ [1 1 2 1 1 1 3 3]) '((1 1) (2) (1 1 1) (3 3)))
I know I can solve this using identity, and in a functional way, but I want to solve it using recursion, because I've not well established this idea in my brain.
My solution would be this:
(defn packing [lista]
(loop [[fst snd :as all] lista mem [] tmp '(fst)]
(print "all is " all "\n\n") ;; something is wrong; it always is an empty list
(if (seq? all)
(if (= fst snd)
(recur (rest all) mem (cons snd tmp))
(recur (rest all) (conj mem tmp) (list snd)))
(seq mem))))
My idea is a recursive loop always taking the first 2 items and comparing. If they are the same number, I include this inside a temporary list tmp; if they're different, I include my temporary list inside mem. (This is my final list; a better name would be final_list.)
Because it compares the first 2 items, but at the same time it needs a recursive loop only bypassing the first item, I named the entire list all.
I don't know if the logic is good, but even if it is wrong, I'm not sure why I receive an empty list when I print
(print "all is " all "\n\n")
A few points:
'(fst) creates a list containing a symbol fst, not the value of fst, this is one of the reasons to prefer using vectors, e.g., [fst]
you should avoid assuming the input will not be empty
you can use conj for both lists and vectors
destructuring is nestable
(defn packing [coll]
  (loop [[x & [y :as more] :as all] coll
         result []
         same '()]
    (if all
      (if (= x y)
        (recur more result (conj same x))
        (recur more (conj result (conj same x)) '()))
      result)))
In your code, all isn't empty at first. What happens is an infinite loop, so the output you end up seeing is always an empty list. In the first lines of output you can see that it works as expected.
The mistake is in (seq? all), because an empty list is a seq too: (seq? '()) returns true, so the loop keeps running on the empty list.
You need to change this to (empty? all) and swap the branches of the if.
The other mistake is '(fst), because it returns the symbol fst and not its value. Change it to (list fst). Your code would then be:
(defn badpacking [lista]
(loop [[fst snd :as all] lista mem [] tmp (list fst)]
(if (empty? all)
(seq mem)
(if (= fst snd)
(recur (rest all) mem (cons snd tmp))
(recur (rest all) (conj mem tmp) (list snd))))))
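The predicates involved can be checked directly at the REPL. This is only an illustration of why (seq? all) never becomes false once the input is exhausted:

```clojure
(seq? '())            ; => true   an empty list is still a seq
(rest [1])            ; => ()     rest never returns nil
(seq? (rest [1]))     ; => true   so the loop condition stays true forever
(empty? (rest [1]))   ; => true   empty? detects the end
(seq (rest [1]))      ; => nil    (seq x) is the usual "not empty" test
(next [1])            ; => nil    next returns nil instead of ()
```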

Grouping a sequence of bools in clojure?

I would like to transform the following sequence:
(def boollist [true false false false true false true])
Into the following:
[[true] [false false false true] [false true]]
My code leads to a Stackoverflow:
(defn sep [boollst]
(loop [lst boollst
separated [[]]
[left right] (take 2 lst)]
(if (nil? left) separated)
(recur (next lst)
(if (false? left)
(conj (last separated) left)
(conj separated [left]))
(take 2 (next lst)))))
Is there an elegant way of transforming this?
There's probably a much more elegant way, but this is what I came up with:
(defn f
([xs] (f xs [] []))
([[x & xs :as all] acc a]
(if (seq all)
(if x
(recur xs [] (conj a (conj acc x)))
(recur xs (conj acc x) a))
a)))
It just traverses the sequence keeping track of the current vector of falses, and a big accumulator of everything so far.
A short, "clever" solution would be:
(defn sep [lst]
(let [x (partition-by identity lst)]
(filter last (map concat (cons [] x) x))))
The "stack overflow" issue is due to the philosophy of Clojure regarding recursion and is easily avoided if approached correctly. You should always implement these types of functions* in a lazy way: If you can't find a trick for solving the problem using library functions, as I did above, you should use "lazy-seq" for the general solution (like pmjordan did) as explained here: http://clojure.org/lazy
* Functions that eat up a list and return a list as the result. (If something other than a list is returned, the idiomatic solution is to use "recur" and an accumulator, as shown by dfan's example, which I would not consider idiomatic in this case.)
Here's a version that uses lazy evaluation and is maybe a little more readable:
(defn f [bools]
(when-not (empty? bools)
(let
[[l & r] bools
[lr rr] (split-with false? r)]
(lazy-seq (cons
(cons l lr)
(f rr))))))
It doesn't return vectors though, so if that's a requirement you need to manually pass the result of the inner cons and of the function itself to vec, thus negating the advantage of using lazy evaluation.
The stack overflow error is because your recur is outside of the if. You evaluate the if form for side effects, then unconditionally recur. (feel free to edit for format, I'm not at a real keyboard).
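A sketch of the loop with every recur inside a branch of the if, so that the terminating branch actually returns. This is a rewrite rather than the asker's exact code: it tracks the group being built in a separate current binding, and keeping any trailing falses as a final group is an assumption, since the question does not cover that case.

```clojure
(defn sep
  "Splits bools into vectors, closing a group after each true."
  [bools]
  (loop [remaining (seq bools)
         separated []
         current   []]
    (if (nil? remaining)
      ;; Terminating branch: no recur here, so the loop returns.
      (if (seq current) (conj separated current) separated)
      (let [b       (first remaining)
            current (conj current b)]
        (if b
          (recur (next remaining) (conj separated current) [])
          (recur (next remaining) separated current))))))

(sep [true false false false true false true])
;; => [[true] [false false false true] [false true]]
```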