Clojure: Why does this give a StackOverflowError?

(reduce concat (repeat 10000 []))
I understand that flatten is probably a better way to do this but I am still curious as to why this causes an error.

This happens because concat produces a lazy sequence.
So, when you call
(concat a b)
no actual concatenation is done until you try to use the result.
Your code therefore builds 10000 nested lazy sequences. Realizing the outermost one has to descend through all of them, which exhausts the stack and throws a StackOverflowError.
I can see two ways to prevent the error.
The first is to force each concat with the doall function:
(reduce (comp doall concat) (repeat 10000 []))
The second is to use the eager into function instead of the lazy concat:
(reduce into (repeat 10000 []))
Update
As for your suggestion about using flatten, it's not a good solution, because flatten is recursive, so it'll try to flatten all nested collections as well. Consider the following example:
(flatten (repeat 3 [[1]]))
It produces the flattened sequence (1 1 1) instead of the concatenated one ([1] [1] [1]).
I think the best solution is to use concat with apply:
(apply concat (repeat 10000 []))
This produces a single lazy sequence and does not throw a StackOverflowError.
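
As a rough illustration of the two safe variants, here is a minimal sketch. The input parts is a made-up stand-in (non-empty vectors, so the results are easier to inspect), not something from the question.

```clojure
;; Hypothetical input: 10000 small vectors.
(def parts (repeat 10000 [1 2]))

;; apply passes every vector to a single concat call, so the result is
;; one lazy sequence rather than 10000 nested ones.
(def via-apply (apply concat parts))

;; into is eager: each step builds a real vector, so nothing is deferred.
(def via-into (reduce into [] parts))

(println (count via-apply))      ; prints 20000
(println (count via-into))       ; prints 20000
(println (= via-apply via-into)) ; prints true
```

The apply version stays lazy, while the into version pays the full cost up front and returns a vector.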

concat is lazy, so all the calls to concat are saved up until the results are used. doall forces lazy sequences and can prevent this error:
user> (reduce concat (repeat 10000 []))
StackOverflowError clojure.lang.RT.seq (RT.java:484)
user> (reduce (comp doall concat) (repeat 10000 []))
()

Related

Clojure - operate on each item in list of lists

I'm working on my first-ever functional program in Clojure. I'm having some issues figuring out how to step through each item in a list, in each list in a list, and operate on it while keeping return values. I'm sure the issue comes from my unfamiliarity with Clojure and functional programming and was hoping someone could explain the best method to do the following:
Pseudo-code algorithm:
for each lst in list
    for each item in lst
        return_values.append(do_something(item))
I first tried nesting two doseq functions and then calling my do_something function, which worked to call the function on the item, but didn't save my return values. I then tried a for and cons to an empty list, but was unable to get my return values outside of the for.
Would it be possible/preferable to break the list of lists down first? Could I still get a list of lists of return values?
In the end, I would like the result to be a list of lists of return values to match the input list of lists.
If anyone could explain the best method for doing this in Clojure, and why, it would be much appreciated.

A nested for will do the trick:
(for [lst my-list]
  (for [item lst] (do_something item)))
It takes the nested list my-list (a list of lists) and converts it into another nested list by applying do_something to each element.
In Clojure, for already returns a (lazy) sequence of values, so there is no need to collect them yourself. Furthermore, since Clojure's data structures are immutable, you can't do this by appending elements to an initially empty list with cons.
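
To make this concrete, here is a small sketch with made-up data. The names my-list and do-something are placeholders for your own list and function; squaring is chosen only for illustration.

```clojure
;; Hypothetical input and transformation, for illustration only.
(def my-list '((1 2) (3 4 5)))
(defn do-something [x] (* x x))

;; Nested for: the outer for yields one inner sequence per sublist.
(def with-for
  (for [lst my-list]
    (for [item lst]
      (do-something item))))

;; An equivalent formulation with nested map.
(def with-map
  (map (fn [lst] (map do-something lst)) my-list))

(println with-for)              ; prints ((1 4) (9 16 25))
(println (= with-for with-map)) ; prints true
```

Either form keeps the shape of the input, so the result is a sequence of sequences of return values.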

If you have a deeply nested list and you want to keep its structure but transform the values, you can use clojure.walk/postwalk to operate on each value, e.g.:
(require 'clojure.walk)
(def nested '(1 (2 3 (4 5)) 6))
(defn transform-values [coll f]
  ;; apply f to every non-list node; lists are passed through unchanged
  (clojure.walk/postwalk #(if (not (list? %))
                            (f %)
                            %)
                         coll))
(transform-values nested inc)
=> (2 (3 4 (5 6)) 7)
You can, of course, pass any function to transform-values.

This can be done as a simple recursive walk. The first implementation that comes to mind for this would be the following for sequences:
(defn deep-walk
  [f data]
  (map (fn [s] (if (seq? s)
                 (deep-walk f s)
                 (f s)))
       data))
And this slight variation for vectors:
(defn vec-deep-walk
  [f data]
  (vec (map (fn [s] (if (vector? s)
                      (vec-deep-walk f s)
                      (f s)))
            data)))
Just a quick test with the following:
(vec-deep-walk (partial + 1) [1 [2 3] 4 [5 [6 7]]])
Gives the following output:
[2 [3 4] 5 [6 [7 8]]]
Both walk functions take two parameters. The first is a single-argument function that is called for each non-seq (or non-vector) element; the second is the data to walk. The results come back in a nested structure with the same shape as the input.

Lazy concatenation of sequence in Clojure

Here's a beginner's question: Is there a way in Clojure to lazily concatenate an arbitrary number of sequences? I know there's lazy-cat macro, but I can't think of its correct application for an arbitrary number of sequences.
My use case is lazily loading data from an API via paginated (offset/limit) requests. Each request executed via request-fn below retrieves 100 results:
(map request-fn (iterate (partial + 100) 0))
When there are no more results, request-fn returns an empty sequence. This is when I stop the iteration:
(take-while seq (map request-fn (iterate (partial + 100) 0)))
For example, the API might return up to 500 results and can be mocked as:
(defn request-fn [offset] (when (< offset 500) (list offset)))
If I want to concatenate the results, I can use (apply concat results) but that eagerly evaluates the results sequence:
(apply concat (take-while seq (map request-fn (iterate (partial + 100) 0))))
Is there a way to concatenate the results sequence lazily, using either lazy-cat or something else?

For the record, apply consumes only as much of the argument sequence as it needs to determine which arity of the provided function to call. Since the maximum arity of concat is 3, apply will realize at most 3 items from the underlying sequence.
If those API calls are expensive and you really can't afford to make unnecessary ones, then you will need a function that accepts a seq-of-seqs and lazily concatenates them one at a time. I don't think there's anything built-in, but it's fairly straightforward to write your own:
(defn lazy-cat' [colls]
  (lazy-seq
    (if (seq colls)
      (concat (first colls) (lazy-cat' (next colls))))))
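
A short usage sketch, assuming the mocked request-fn from the question. The function is repeated here only so the block runs on its own; pages is a name introduced for illustration.

```clojure
;; Mock from the question: pages exist for offsets below 500.
(defn request-fn [offset]
  (when (< offset 500) (list offset)))

;; Same function as above, repeated so the block is self-contained.
(defn lazy-cat' [colls]
  (lazy-seq
    (if (seq colls)
      (concat (first colls) (lazy-cat' (next colls))))))

(def pages
  (take-while seq (map request-fn (iterate (partial + 100) 0))))

(println (lazy-cat' pages))                      ; prints (0 100 200 300 400)

;; It also works on an infinite sequence of sequences.
(println (take 3 (lazy-cat' (map list (range))))) ; prints (0 1 2)
```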

Better way of creating a flat list out of numbers and vectors

I've got a function like this:
(defn magic
  [a b c]
  (flatten (conj [] a b c)))
So on these inputs I get the following:
(magic 1 2 3) => (1 2 3)
(magic 1 [2 3] 4) => (1 2 3 4)
My question is, is there a better way of doing this?
The problem can be summarised as:
I don't know whether I will get numbers or vectors as input, but I need to return a single flat list

This could be slightly simplified (and generalized) as:
(defn magic [& args]
  (flatten (apply list args)))
Or, as pointed out in the comments, it can be simplified even further (since args above is already a seq):
(defn magic [& args]
  (flatten args))
Other than that, I don't see much else that can be improved about this. Is there anything in particular that's bothering you about your implementation?

If the input can contain seqs of seqs, you need to be more careful and descend into the nested collections recursively. Clojure has a built-in function for this, tree-seq; see the examples here:
http://clojuredocs.org/clojure_core/clojure.core/tree-seq
You'd want something like this (untested):
(defn nonempty-seq
  "Returns x as a seq if it's a non-empty collection, otherwise nil/false."
  [x]
  (and (coll? x) (seq x)))
;; expr stands for the nested input
(tree-seq nonempty-seq seq expr)
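
Note that tree-seq returns the branch nodes as well as the leaves, so a flat list needs one more filtering step. A minimal sketch, using sequential? as the branch test (a swapped-in predicate, not the nonempty-seq helper above) and a made-up input expr:

```clojure
;; Hypothetical nested input.
(def expr [1 [2 [3 4]] '(5) 6])

;; tree-seq yields every node, branches included, in depth-first order.
(def nodes (tree-seq sequential? seq expr))

;; Keeping only the non-sequential nodes gives the flat list of leaves.
(def leaves (remove sequential? nodes))

(println leaves)                    ; prints (1 2 3 4 5 6)
(println (= leaves (flatten expr))) ; prints true
```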

Is it possible to do destructured head/tail separation of lazy sequences in clojure?

I see some examples that show we can get a nice head/tail destructuring of a sequence in clojure as follows:
(if-let [[x & xs] (seq coll)]
However, I assume this won't work as desired for lazy sequences because it puts the values into a vector, which isn't lazy. I tried changing the vector form to a list form, and it gave me binding errors, quoted or not.
Without having binding like this, it seems that if I've got a lazy sequence where each element was a computationally-intensive equation of the previous element, I'd have to do that computation twice to get the head and tail as separate statements, right?
(let [head (first my-lazy-seq) ;; has to calculate the value of head.
tail (rest my-lazy-seq)] ;; also has to calculate the value of head to prepare the rest of the sequence.
Is there any way around this, or am I making an incorrect assumption somewhere?

user=> (let [[x & xs] (range)] [x (take 10 xs)])
[0 (1 2 3 4 5 6 7 8 9 10)]
xs is still a lazy seq, so you can use the destructuring with no problems. This will force the first element of xs, though. (Destructuring uses vector notation, but it doesn't necessarily use vectors under the covers.)
With respect to your second question: lazy seqs cache their results, so your second option would also work without extra recalculation. The head will only be calculated once.
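
The caching can be checked with a call counter. This is a minimal sketch: expensive and calls are hypothetical names, and iterate is used as the source because it produces an unchunked sequence, so map realizes one element at a time.

```clojure
;; calls counts how often the (hypothetical) expensive function runs.
(def calls (atom 0))
(defn expensive [x]
  (swap! calls inc)
  (* x x))

;; An infinite lazy sequence; nothing is computed yet.
(def my-lazy-seq (map expensive (iterate inc 1)))

(let [head (first my-lazy-seq)
      tail (rest my-lazy-seq)]
  (println head)                ; prints 1
  (println (first my-lazy-seq)) ; prints 1, served from the cache
  (println @calls)              ; prints 1
  (println (first tail))        ; prints 4
  (println @calls))             ; prints 2
```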

The binding vector [x & xs] isn't actually constructing a vector at runtime. It's just the notation used for destructuring into head & tail.
So it works fine on infinite sequences:
(if-let [[x & xs] (range)]
(apply str x (take 9 xs)))
=> "0123456789"
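The :as binding gives you the original sequence, which is still lazy, as you can observe below (the class shown comes from an older Clojure; since 1.7, (range) with no arguments reports clojure.lang.Iterate instead):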
(if-let [[x & xs :as my-seq] (range)]
(class my-seq))
=> clojure.lang.LazySeq

Clojure apply vs map

I have a sequence (foundApps) returned from a function and I want to map a function over all its elements. For some reason, apply and count work for the sequence but map doesn't:
(apply println foundApps)
(map println rest foundApps)
(map (fn [app] (println app)) foundApps)
(println (str "Found " (count foundApps) " apps to delete"))))
Prints:
{:description another descr, :title apptwo, :owner jim, :appstoreid 1235, :kind App, :key #<Key App(2)>} {:description another descr, :title apptwo, :owner jim, :appstoreid 1235, :kind App, :key #<Key App(4)>}
Found 2 apps to delete for id 1235
So apply seems to happily work for the sequence, but map doesn't. Where am I being stupid?

I have a simple explanation which this post is lacking. Let's imagine an abstract function F and a vector. So,
(apply F [1 2 3 4 5])
translates to
(F 1 2 3 4 5)
which means that F has to accept that many arguments; ideally it is variadic.
While
(map F [1 2 3 4 5])
translates to
[(F 1) (F 2) (F 3) (F 4) (F 5)]
which means that F has to accept a single argument, or at least behave that way.
There are some nuances about types, since map actually returns a lazy sequence instead of a vector. But for the sake of simplicity, I hope it's pardonable.
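
The difference is easy to see at the REPL. In this sketch f is a hypothetical variadic function that simply returns its arguments as a vector:

```clojure
;; f is a hypothetical variadic function that returns its arguments.
(defn f [& args] (vec args))

(println (apply f [1 2 3]))     ; prints [1 2 3]    (one call, three arguments)
(println (map f [1 2 3]))       ; prints ([1] [2] [3]) (three calls, one argument each)
(println (apply + [1 2 3 4 5])) ; prints 15
(println (map inc [1 2 3 4 5])) ; prints (2 3 4 5 6)
```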

Most likely you're being hit by map's laziness. (map produces a lazy sequence which is only realised when some code actually uses its elements. And even then the realisation happens in chunks, so that you have to walk the whole sequence to make sure it all got realised.) Try wrapping the map expression in a dorun:
(dorun (map println foundApps))
Also, since you're doing it just for the side effects, it might be cleaner to use doseq instead:
(doseq [fa foundApps]
(println fa))
Note that (map println foundApps) should work just fine at the REPL; I'm assuming you've extracted it from somewhere in your code where it's not being forced. There's no such difference with doseq which is strict (i.e. not lazy) and will walk its argument sequences for you under any circumstances. Also note that doseq returns nil as its value; it's only good for side-effects. Finally I've skipped the rest from your code; you might have meant (rest foundApps) (unless it's just a typo).
Also note that (apply println foundApps) will print all the foundApps on one line, whereas (dorun (map println foundApps)) will print each member of foundApps on its own line.
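
The effect of laziness inside a function body can be sketched without printing, by recording side effects in an atom. All names here (seen, record!, lazy-version, eager-version) are made up for the illustration.

```clojure
;; seen collects every value that record! is actually called with.
(def seen (atom []))
(defn record! [x] (swap! seen conj x))

(defn lazy-version [xs]
  (map record! xs) ; the lazy seq is created and discarded, never realized
  :done)

(defn eager-version [xs]
  (doseq [x xs] (record! x)) ; doseq walks xs immediately
  :done)

(lazy-version [1 2 3])
(println @seen) ; prints []
(eager-version [1 2 3])
(println @seen) ; prints [1 2 3]
```

At the REPL the first case is easy to miss, because printing the result of a top-level map expression realizes it.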

A little explanation might help. In general you use apply to splat a sequence of elements into a set of arguments to a function. So applying a function to some arguments just means passing them in as arguments to the function, in a single function call.
The map function will do what you want: it creates a new seq by passing each element of the input to a function and collecting the outputs. It does so lazily, though, so the values are only computed when you actually iterate over the result. To force this you can use the (doall my-seq) function, but most of the time you won't need to do that.
If you need to perform an operation immediately because it has side effects, like printing or saving to a database or something, then you typically use doseq.
So to append "foo" to all of your apps (assuming they are strings):
(map (fn [app] (str app "foo")) found-apps)
or using the shorthand for an anonymous function:
(map #(str % "foo") found-apps)
Doing the same but printing immediately can be done with either of these:
(doall (map #(println %) found-apps))
(doseq [app found-apps] (println app))
