What is the difference between butlast and drop-last in Clojure?
Is it only the laziness? Should I prefer one over the other?
Also, if you need to realize the whole collection, butlast is dramatically faster, which is logical if you look at their sources:
(def butlast
  (fn ^:static butlast [s]
    (loop [ret [] s s]
      (if (next s)
        (recur (conj ret (first s)) (next s))
        (seq ret)))))

(defn drop-last
  ([s] (drop-last 1 s))
  ([n s] (map (fn [x _] x) s (drop n s))))
So drop-last uses map, while butlast uses simple iteration with recur. Here is a little example:
user> (time (let [_ (butlast (range 10000000))]))
"Elapsed time: 2052.853726 msecs"
nil
user> (time (let [_ (doall (drop-last (range 10000000)))]))
"Elapsed time: 14072.259077 msecs"
nil
So I wouldn't blindly prefer one over the other. I use drop-last only when I really need laziness; otherwise, butlast.
Yes, laziness, as well as the fact that drop-last can also take n, indicating how many elements to drop from the end lazily.
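For example, in a REPL:
user> (butlast [1 2 3 4 5])
(1 2 3 4)
user> (drop-last 2 [1 2 3 4 5])
(1 2 3)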
There's a discussion here where someone is making the case that butlast is more readable and maybe a familiar idiom for Lisp programmers, but I generally just opt to use drop-last.
How would I create a transducer from the following ordinary code, where combo is the alias for clojure.math.combinatorics:
(defn row->evenly-divided [xs]
(->> (combo/combinations (sort-by - xs) 2)
(some (fn [[big small]]
(assert (>= big small))
(let [res (/ big small)]
(when (int? res)
res))))))
As noted in a comment, transducers are only applicable for processing each item. With this in mind, I've made the code a little more transducer-friendly by shifting the sorting so that it is now being done for each item. I don't think there's anything that can be done about the combinations part, however!
(defn row->evenly-divided [xs]
(->> (combo/combinations xs 2)
(some (fn [xy]
(let [res (apply / (sort-by - xy))]
(when (int? res)
res))))))
This is the same function but with an introduced transducer:
(def x-row->evenly-divided (comp
(map (partial sort-by -))
(map (partial apply /))
(filter int?)))
(defn row->evenly-divided-2 [xs]
(->> (combo/combinations xs 2)
(sequence x-row->evenly-divided)
first))
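For instance, with a hypothetical input row, both versions should find the one pair that divides evenly:
(row->evenly-divided-2 [5 9 2 8]) ;=> 4, since 8/2 is the only whole-number ratio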
I recently discovered the Specter library that provides data-structure navigation and transformation functions and is written in Clojure.
Implementing some of its API as a learning exercise seemed like a good idea. Specter implements an API that takes a function and a nested structure as arguments and returns a vector of elements from the nested structure that satisfy the function, like below:
(select (walker number?) [1 :a {:b 2}]) => [1 2]
Below is my attempt at implementing a function with similar API:
(defn select-walker [afn ds]
(vec (if (and (coll? ds) (not-empty ds))
(concat (select-walker afn (first ds))
(select-walker afn (rest ds)))
(if (afn ds) [ds]))))
(select-walker number? [1 :a {:b 2}]) => [1 2]
I have tried implementing select-walker by using list comprehension, looping, and using cons and conj. In all these cases
the return value was a nested list instead of a flat vector of elements.
Yet my implementation does not seem like idiomatic Clojure and has poor time and space complexity.
(time (dotimes [_ 1000] (select (walker number?) (range 100))))
"Elapsed time: 19.445396 msecs"
(time (dotimes [_ 1000] (select-walker number? (range 100))))
"Elapsed time: 237.000334 msecs"
Notice that my implementation is about 12 times slower than Specter's implementation.
I have three questions on the implementation of select-walker.
Is a tail-recursive implementation of select-walker possible?
Can select-walker be written in more idiomatic Clojure?
Any hints to make select-walker execute faster?
There are at least two possibilities to make it tail recursive. The first one is to process the data in a loop, like this:
(defn select-walker-rec [afn ds]
(loop [res [] ds ds]
(cond (empty? ds) res
(coll? (first ds)) (recur res
(doall (concat (first ds)
(rest ds))))
(afn (first ds)) (recur (conj res (first ds)) (rest ds))
:else (recur res (rest ds)))))
In the REPL:
user> (select-walker-rec number? [1 :a {:b 2}])
[1 2]
user> (time (dotimes [_ 1000] (select-walker-rec number? (range 100))))
"Elapsed time: 19.428887 msecs"
(the simple select-walker takes about 200 ms for me)
The second one (slower, though, and more suitable for more difficult tasks) is to use zippers:
(require '[clojure.zip :as z])
(defn select-walker-z [afn ds]
(loop [res [] curr (z/zipper coll? seq nil ds)]
(cond (z/end? curr) res
(z/branch? curr) (recur res (z/next curr))
(afn (z/node curr)) (recur (conj res (z/node curr))
(z/next curr))
:else (recur res (z/next curr)))))
user> (time (dotimes [_ 1000] (select-walker-z number? (range 100))))
"Elapsed time: 219.015153 msecs"
This one is really slow, since the zipper operates on more complex structures. Its great power brings unneeded overhead to this simple task.
The most idiomatic approach, I guess, is to use tree-seq:
(defn select-walker-t [afn ds]
(filter #(and (not (coll? %)) (afn %))
(tree-seq coll? seq ds)))
user> (time (dotimes [_ 1000] (select-walker-t number? (range 100))))
"Elapsed time: 1.320209 msecs"
It is incredibly fast, as it produces a lazy sequence of results. In fact, you should realize the sequence for a fair test:
user> (time (dotimes [_ 1000] (doall (select-walker-t number? (range 100)))))
"Elapsed time: 53.641014 msecs"
One more thing to notice about this variant is that it's not tail recursive, so it would fail on really deeply nested structures (maybe I'm mistaken, but I guess it's at about a couple of thousand levels of nesting); still, it's suitable for most cases.
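To match Specter's vector return, you can wrap the result in vec; a quick check against the question's example:
user> (vec (select-walker-t number? [1 :a {:b 2}]))
[1 2]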
I have a number of (unevaluated) expressions held in a vector; [ expr1 expr2 expr3 ... ]
What I wish to do is hand each expression to a separate thread and wait until one returns a value. At that point I'm not interested in the results from the other threads and would like to cancel them to save CPU resource.
(I realise that this could cause non-determinism, in that different runs of the program might cause different expressions to be evaluated first. I have this in hand.)
Is there a standard / idiomatic way of achieving the above?
Here's my take on it.
Basically you have to deliver a value to a shared promise inside each of your futures, then return a vector containing the future list and the resolved value, and then cancel all the futures in the list:
(defn run-and-cancel [& expr]
  (let [p (promise)
        ;; start one future per expression; the first to finish delivers the promise
        run-futures (fn [& expr] [(doall (map #(future (deliver p (eval %1))) expr)) @p])
        [fs res] (apply run-futures expr)]
    (run! future-cancel fs) ; eagerly cancel the rest (a bare map here would be lazy)
    res))
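A quick, hypothetical usage sketch, passing quoted expressions so eval can evaluate them:
(run-and-cancel '(do (Thread/sleep 1000) :slow)
                '(do (Thread/sleep 100) :fast))
;=> :fast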
It hasn't reached an official release yet, but core.async looks like it might be an interesting way of solving your problem - and other asynchronous problems, very neatly.
The leiningen incantation for core.async is (currently) as follows:
[org.clojure/core.async "0.1.0-SNAPSHOT"]
And here's some code to make a function that will take a number of time-consuming functions, and block until one of them returns.
(require '[clojure.core.async :refer [>!! chan alts!! thread]])
(defn return-first [& ops]
(let [v (map vector ops (repeatedly chan))]
(doseq [[op c] v]
(thread (>!! c (op))))
(let [[value channel] (alts!! (map second v))]
value)))
;; Make sure the function returns what we expect with a simple Thread/sleep
(assert (= (return-first (fn [] (Thread/sleep 3000) 3000)
(fn [] (Thread/sleep 2000) 2000)
(fn [] (Thread/sleep 5000) 5000))
2000))
In the sample above:
chan creates an asynchronous channel
>!! puts a value onto a channel
thread executes the body in another thread
alts!! takes a vector of channels, and returns when a value appears on any of them
There's way more to it than this, and I'm still getting my head round it, but there's a walkthrough here: https://github.com/clojure/core.async/blob/master/examples/walkthrough.clj
And David Nolen's blog has some great, if mind-boggling, posts on it (http://swannodette.github.io/)
Edit
Just seen that Michał Marczyk has answered a very similar question, but better, here, and it allows you to cancel/short-circuit: with Clojure threading long running processes and comparing their returns.
What you want is Java's CompletionService. I don't know of any wrapper around this in Clojure, but it wouldn't be hard to do with interop. The example below is loosely based around the example on the JavaDoc page for the ExecutorCompletionService.
(import '(java.util.concurrent ExecutorCompletionService Executors))

(defn f [col]
  (let [cs (ExecutorCompletionService. (Executors/newCachedThreadPool))
        futures (doall (map #(.submit cs %) col)) ; submit eagerly; a lazy map would never run
        result (.get (.take cs))]                 ; block until the first task completes
    (run! #(.cancel % true) futures)              ; cancel the rest
    result))
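Hypothetical usage - Clojure functions implement Callable, so they can be submitted directly:
(f [#(do (Thread/sleep 1000) :slow)
    #(do (Thread/sleep 100) :fast)])
;=> :fast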
You could use future-call to get a list of all futures, storing them in an atom. Then, compose each running future with a "shoot the other ones in the head" function so the first one will terminate all the remaining ones. Here is an example:
(defn first-out [& fns]
  (let [fs (atom [])
        terminate (fn [] (println "canceling..") (doall (map future-cancel @fs)))]
    (reset! fs (doall (map (fn [x] (future-call #(do (x) (terminate)))) fns)))))
(defn wait-for [n s]
(fn [] (print "start...") (flush) (Thread/sleep n) (print s) (flush)))
(first-out (wait-for 1000 "long") (wait-for 500 "short"))
Edit
Just noticed that the previous code does not return the first result, so it is mainly useful for side effects. Here is another version that returns the first result using a promise:
(defn first-out [& fns]
  (let [fs (atom [])
        ret (promise)
        terminate (fn [x] (println "canceling..")
                    (doall (map future-cancel @fs))
                    (deliver ret x))]
    (reset! fs (doall (map (fn [x] (future-call #(terminate (x)))) fns)))
    @ret))
(defn wait-for
  "this time, return the value"
  [n s]
  (fn [] (print "start...") (flush) (Thread/sleep n) (print s) (flush) s))
(first-out (wait-for 1000 "long") (wait-for 500 "short"))
While I don't know if there is an idiomatic way to achieve your goal, Clojure's future looks like a good fit:
Takes a body of expressions and yields a future object that will
invoke the body in another thread, and will cache the result and
return it on all subsequent calls to deref/@. If the computation has
not yet finished, calls to deref/@ will block, unless the variant of
deref with timeout is used.
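A minimal sketch of that timeout variant:
(let [f (future (Thread/sleep 1000) :done)]
  (deref f 100 :timed-out)) ;=> :timed-out, without blocking for the full second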
I'm working through a book on Clojure and ran into a stumbling block with "->>". The author provides an example of a comp that converts camelCased strings into more idiomatic hyphenated Clojure keywords. Here's the code using comp:
(require '[clojure.string :as str])
(def camel->keyword (comp keyword
str/join
(partial interpose \-)
(partial map str/lower-case)
#(str/split % #"(?<=[a-z])(?=[A-Z])")))
This makes a lot of sense, but I don't really like using partial all over the place to handle a variable number of arguments. Instead, an alternative is provided here:
(defn camel->keyword
[s]
(->> (str/split s #"(?<=[a-z])(?=[A-Z])")
(map str/lower-case)
(interpose \-)
str/join
keyword))
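Either way, the behaviour is the same; for example:
user> (camel->keyword "camelCasedString")
:camel-cased-string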
This syntax is much more readable, and mimics the way I would think about solving a problem (front to back, instead of back to front). Extending the comp to complete the aforementioned goal...
(def camel-pairs->map (comp (partial apply hash-map)
(partial map-indexed (fn [i x]
(if (odd? i)
x
(camel->keyword x))))))
What would be the equivalent using ->>? I'm not exactly sure how to thread map-indexed (or any iterative function) using ->>. This is wrong:
(defn camel-pairs->map
[s]
(->> (map-indexed (fn [i x]
(if (odd? i)
x
(camel-keyword x)))
(apply hash-map)))
Three problems: missing a parenthesis, missing the > in the name of camel->keyword, and not "seeding" your ->> macro with the initial expression s.
(defn camel-pairs->map [s]
(->> s
(map-indexed
(fn [i x]
(if (odd? i)
x
(camel->keyword x))))
(apply hash-map)))
Is this really more clear than, say:
(defn camel-pairs->map [s]
(into {}
(for [[k v] (partition 2 s)]
[(camel->keyword k) v])))
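With a hypothetical pair sequence, it produces the same map either way:
user> (camel-pairs->map ["redColor" 1 "blueSky" 2])
{:red-color 1, :blue-sky 2}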
I had an idea for a higher-order function today that I'm not sure how to write. I have several sparse, lazy infinite sequences, and I want to create an abstraction that lets me check to see if a given number is in any of these lazy sequences. To improve performance, I wanted to push the values of the sparse sequence into a hashmap (or set), dynamically increasing the number of values in the hashmap whenever it is necessary. Automatic memoization is not the answer here due to sparsity of the lazy seqs.
Probably code is easiest to understand, so here's what I have so far. How do I change the following code so that the predicate uses a closed-over hashmap, but if needed increases the size of the hashmap and redefines itself to use the new hashmap?
(defn make-lazy-predicate
  "Returns a predicate that returns true or false if a number is in
  coll. Coll must be an ordered, increasing lazy seq of numbers."
  [coll]
  (let [in-lazy-list? (fn [n coll top cache]
                        (if (> top n)
                          (not (nil? (cache n)))
                          (recur n (next coll) (first coll)
                                 (conj cache (first coll)))))]
    (fn [n] (in-lazy-list? n coll (first coll) (sorted-set)))))
(def my-lazy-list (iterate #(+ % 100) 1))
(let [in-my-list? (make-lazy-predicate my-lazy-list)]
(doall (filter in-my-list? (range 10000))))
How do I solve this problem without reverting to an imperative style?
This is a thread-safe variant of Adam's solution.
(defn make-lazy-predicate
[coll]
(let [state (atom {:mem #{} :unknown coll})
update-state (fn [{:keys [mem unknown] :as state} item]
(let [[just-checked remainder]
(split-with #(<= % item) unknown)]
(if (seq just-checked)
(-> state
(assoc :mem (apply conj mem just-checked))
(assoc :unknown remainder))
state)))]
    (fn [item]
      (get-in (if (< item (first (:unknown @state)))
                @state
                (swap! state update-state item))
              [:mem item]))))
One could also consider using refs, but then your predicate search might get rolled back by an enclosing transaction. This might or might not be what you want.
This function is based on the idea of how the core memoize function works. Only numbers already consumed from the lazy list are cached in a set. It uses the built-in take-while instead of doing the search manually.
(defn make-lazy-predicate [coll]
  (let [mem (atom #{})
        unknown (atom coll)]
    (fn [item]
      (if (< item (first @unknown))
        (@mem item)
        (let [just-checked (take-while #(>= item %) @unknown)]
          (swap! mem #(apply conj % just-checked))
          (swap! unknown #(drop (count just-checked) %))
          (= item (last just-checked)))))))
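A quick REPL check with the question's example data:
user> (def my-lazy-list (iterate #(+ % 100) 1))
user> (let [in-my-list? (make-lazy-predicate my-lazy-list)]
        (doall (filter in-my-list? (range 1000))))
(1 101 201 301 401 501 601 701 801 901)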