I translated a Rust code example to Clojure.
Rust (imperative and functional):
Note: the imperative and functional versions are shown together here for clarity; in the tests I ran them separately.
// The `AdditiveIterator` trait adds the `sum` method to iterators
use std::iter::AdditiveIterator;
use std::iter;

fn main() {
    println!("Find the sum of all the squared odd numbers under 1000");
    let upper = 1000u;

    // Imperative approach
    // Declare accumulator variable
    let mut acc = 0;
    // Iterate: 0, 1, 2, ... to infinity
    for n in iter::count(0u, 1) {
        // Square the number
        let n_squared = n * n;
        if n_squared >= upper {
            // Break loop if exceeded the upper limit
            break;
        } else if is_odd(n_squared) {
            // Accumulate value, if it's odd
            acc += n_squared;
        }
    }
    println!("imperative style: {}", acc);

    // Functional approach
    let sum_of_squared_odd_numbers =
        // All natural numbers
        iter::count(0u, 1).
        // Squared
        map(|n| n * n).
        // Below upper limit
        take_while(|&n| n < upper).
        // That are odd
        filter(|n| is_odd(*n)).
        // Sum them
        sum();
    println!("functional style: {}", sum_of_squared_odd_numbers);
}

fn is_odd(n: uint) -> bool {
    n % 2 == 1
}
Rust (imperative) time:
~/projects/rust_proj $> time ./hof_imperative
Find the sum of all the squared odd numbers under 1000
imperative style: 5456
real 0m0.006s
user 0m0.001s
sys 0m0.004s
~/projects/rust_proj $> time ./hof_imperative
Find the sum of all the squared odd numbers under 1000
imperative style: 5456
real 0m0.004s
user 0m0.000s
sys 0m0.004s
~/projects/rust_proj $> time ./hof_imperative
Find the sum of all the squared odd numbers under 1000
imperative style: 5456
real 0m0.005s
user 0m0.004s
sys 0m0.001s
Rust (Functional) time:
~/projects/rust_proj $> time ./hof
Find the sum of all the squared odd numbers under 1000
functional style: 5456
real 0m0.007s
user 0m0.001s
sys 0m0.004s
~/projects/rust_proj $> time ./hof
Find the sum of all the squared odd numbers under 1000
functional style: 5456
real 0m0.007s
user 0m0.007s
sys 0m0.000s
~/projects/rust_proj $> time ./hof
Find the sum of all the squared odd numbers under 1000
functional style: 5456
real 0m0.007s
user 0m0.004s
sys 0m0.003s
Clojure:
(defn sum-square-less-1000
  "Find the sum of all the squared odd numbers under 1000"
  []
  (->> (iterate inc 0)
       (map (fn [n] (* n n)))
       (take-while (partial > 1000))
       (filter odd?)
       (reduce +)))
Clojure time:
user> (time (sum-square-less-1000))
"Elapsed time: 0.443562 msecs"
5456
user> (time (sum-square-less-1000))
"Elapsed time: 0.201981 msecs"
5456
user> (time (sum-square-less-1000))
"Elapsed time: 0.4752 msecs"
5456
Questions:
What's the difference between (reduce +) and (apply +) in Clojure?
Is this Clojure code idiomatic?
Can I conclude that, for speed, Clojure > Rust imperative > Rust functional? Clojure really surprised me here with its performance.
If you look at the source of +, you will see that (reduce +) and (apply +) are identical for higher argument counts. (apply +) is optimized for the 1 or 2 argument versions though.
(range) is going to be much faster than (iterate inc 0) for most cases.
partial is slower than a simple anonymous function, and should be reserved for cases where you don't know how many more args will be supplied.
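To make that point concrete, here is a minimal comparison of the two forms (a sketch; `below-1000-partial` and `below-1000-fn` are made-up names). Both behave identically here, but the anonymous function compiles to a direct two-argument call, while partial adds a layer of indirection so it can accept any number of additional args:

```clojure
;; Sketch: both predicates test "is n below 1000?"
(def below-1000-partial (partial > 1000)) ; extra indirection
(def below-1000-fn      #(> 1000 %))      ; direct two-arg call

(below-1000-partial 999)  ;; => true
(below-1000-fn 999)       ;; => true
(below-1000-fn 1000)      ;; => false
```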
Benchmarking with criterium shows that applying those changes gives a 36% drop in execution time:
user> (crit/bench (->> (iterate inc 0)
(map (fn [n] (* n n)))
(take-while (partial > 1000))
(filter odd?)
(reduce +)))
WARNING: Final GC required 2.679748643529675 % of runtime
Evaluation count : 3522840 in 60 samples of 58714 calls.
Execution time mean : 16.954649 µs
Execution time std-deviation : 140.180401 ns
Execution time lower quantile : 16.720122 µs ( 2.5%)
Execution time upper quantile : 17.261693 µs (97.5%)
Overhead used : 2.208566 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 2 (3.3333 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
user> (crit/bench (->> (range)
(map (fn [n] (* n n)))
(take-while #(> 1000 %))
(filter odd?)
(reduce +)))
Evaluation count : 5521440 in 60 samples of 92024 calls.
Execution time mean : 10.993332 µs
Execution time std-deviation : 118.100723 ns
Execution time lower quantile : 10.855536 µs ( 2.5%)
Execution time upper quantile : 11.238964 µs (97.5%)
Overhead used : 2.208566 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 1 (1.6667 %)
low-mild 1 (1.6667 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
The Clojure code looks idiomatic in my opinion, but it's doing a lot of unnecessary work. Since only odd squares survive the filter, you can iterate directly over the odd numbers whose squares stay under 1000 (31² = 961 is the largest such square). Here is an alternative way:
(reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2))
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.180778 msecs"
5456
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.255972 msecs"
5456
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.346192 msecs"
5456
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.162615 msecs"
5456
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.257901 msecs"
5456
user=> (time (reduce #(+ %1 (* %2 %2)) 0 (range 1 32 2)))
"Elapsed time: 0.175507 msecs"
5456
You can't really conclude that one is faster than the other based on this test though. Benchmarking is a tricky game. You need to test your programs in production-like environments with heavy inputs to get any meaningful results.
What's the difference between (reduce +) and (apply +) in Clojure?
apply is a higher-order function of variable arity. Its first argument is a function, followed by any number of intervening args, and its last argument must be a seq of args. It works by consing the intervening args onto that seq and then calling the function with the resulting arguments.
Example:
(apply + 0 1 2 3 '(4 5 6 7))
=> (apply + '(0 1 2 3 4 5 6 7))
=> (+ 0 1 2 3 4 5 6 7)
=> result
As for reduce, the docs say it clearly:
user=> (doc reduce)
-------------------------
clojure.core/reduce
([f coll] [f val coll])
f should be a function of 2 arguments. If val is not supplied,
returns the result of applying f to the first 2 items in coll, then
applying f to that result and the 3rd item, etc. If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments. If coll has only 1 item, it
is returned and f is not called. If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.
nil
There are situations where you could use either apply f coll or reduce f coll, but I normally use apply when f has variable arity and reduce when f is a 2-ary function.
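For instance, str is variadic while conj is a natural two-argument step function, so each pairs with a different tool (a small illustrative sketch):

```clojure
;; str is variadic: apply makes one call that can use a single
;; StringBuilder internally, whereas reduce makes n-1 two-argument
;; calls, each allocating an intermediate string.
(apply str ["a" "b" "c"])   ;; => "abc"
(reduce str ["a" "b" "c"])  ;; => "abc" (same result, more allocation)

;; conj is a natural 2-ary step function: a classic reduce case.
(reduce conj [] '(1 2 3))   ;; => [1 2 3]
```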
Related
One intuitive way to calculate π is as the series sum
π = ( 1/1 - 1/3 + 1/5 - 1/7 + 1/9 ... ) × 4
The functions ρ and ρ' below compute the partial sum, and the time τ taken to calculate π is measured for each:
(defn term [k]
(let [x (/ 1. (inc (* 2 k)))]
(if (even? k)
x
(- x))))
(defn ρ [n]
(reduce
(fn [res k] (+ res (term k)))
0
(lazy-seq (range 0 n))))
(defn ρ' [n]
(loop [res 0 k 0]
(if (< k n)
(recur (+ res (term k)) (inc k))
res)))
(defn calc [P]
(let [start (System/nanoTime)
π (* (P 1000000) 4)
end (System/nanoTime)
τ (- end start)]
(printf "π=%.16f τ=%d\n" π τ)))
(calc ρ)
(calc ρ')
The result shows that ρ spends about 50% more time than ρ', which suggests the underlying reduce performs much worse than recur in this case. But why?
Rewriting your code and using a more accurate timer shows there is no significant difference. This is to be expected since both loop/recur and reduce are very basic forms and we would expect them to both be fairly optimized.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[criterium.core :as crit] ))
(def result (atom nil))
(defn term [k]
(let [x (/ 1. (inc (* 2 k)))]
(if (even? k)
x
(- x))))
(defn ρ [n]
(reduce
(fn [res k] (+ res (term k)))
0
(range 0 n)) )
(defn ρ' [n]
(loop [res 0 k 0]
(if (< k n)
(recur (+ res (term k)) (inc k))
res)) )
(defn calc [calc-fn N]
(let [pi (* (calc-fn N) 4)]
(reset! result pi)
pi))
We measure the execution time for both algorithms using Criterium:
(defn timings
[power]
(let [N (Math/pow 10 power)]
(newline)
(println :-----------------------------------------------------------------------------)
(spyx N)
(newline)
(crit/quick-bench (calc ρ N))
(println :rho @result)
(newline)
(crit/quick-bench (calc ρ' N))
(println :rho-prime N @result)
(newline)))
and we try it for 10^2, 10^4, and 10^6 values of N:
(dotest
(timings 2)
(timings 4)
(timings 6))
with results for 10^2:
-------------------------------
Clojure 1.10.1 Java 14
-------------------------------
Testing tst.demo.core
:-----------------------------------------------------------------------------
N => 100.0
Evaluation count : 135648 in 6 samples of 22608 calls.
Execution time mean : 4.877255 µs
Execution time std-deviation : 647.723342 ns
Execution time lower quantile : 4.438762 µs ( 2.5%)
Execution time upper quantile : 5.962740 µs (97.5%)
Overhead used : 2.165947 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 31.6928 % Variance is moderately inflated by outliers
:rho 3.1315929035585537
Evaluation count : 148434 in 6 samples of 24739 calls.
Execution time mean : 4.070798 µs
Execution time std-deviation : 68.430348 ns
Execution time lower quantile : 4.009978 µs ( 2.5%)
Execution time upper quantile : 4.170038 µs (97.5%)
Overhead used : 2.165947 ns
:rho-prime 100.0 3.1315929035585537
with results for 10^4:
:-----------------------------------------------------------------------------
N => 10000.0
Evaluation count : 1248 in 6 samples of 208 calls.
Execution time mean : 519.096208 µs
Execution time std-deviation : 143.478354 µs
Execution time lower quantile : 454.389510 µs ( 2.5%)
Execution time upper quantile : 767.610509 µs (97.5%)
Overhead used : 2.165947 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 65.1517 % Variance is severely inflated by outliers
:rho 3.1414926535900345
Evaluation count : 1392 in 6 samples of 232 calls.
Execution time mean : 431.020370 µs
Execution time std-deviation : 14.853924 µs
Execution time lower quantile : 420.838884 µs ( 2.5%)
Execution time upper quantile : 455.282989 µs (97.5%)
Overhead used : 2.165947 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
:rho-prime 10000.0 3.1414926535900345
with results for 10^6:
:-----------------------------------------------------------------------------
N => 1000000.0
Evaluation count : 18 in 6 samples of 3 calls.
Execution time mean : 46.080480 ms
Execution time std-deviation : 1.039714 ms
Execution time lower quantile : 45.132049 ms ( 2.5%)
Execution time upper quantile : 47.430310 ms (97.5%)
Overhead used : 2.165947 ns
:rho 3.1415916535897743
Evaluation count : 18 in 6 samples of 3 calls.
Execution time mean : 52.527777 ms
Execution time std-deviation : 17.483930 ms
Execution time lower quantile : 41.789520 ms ( 2.5%)
Execution time upper quantile : 82.539445 ms (97.5%)
Overhead used : 2.165947 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 81.7010 % Variance is severely inflated by outliers
:rho-prime 1000000.0 3.1415916535897743
Note that the times for rho and rho-prime flip-flop for the 10^4 and 10^6 cases. In any case, don't believe or worry much about timings that vary by less than 2x.
Update
I deleted the lazy-seq in the original code since clojure.core/range is already lazy. Also, I've never seen lazy-seq used without a cons and a recursive call to the generating function.
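For reference, the typical lazy-seq shape is a cons whose rest is a recursive call to the generating function; a minimal sketch (`numbers-from` is a made-up name):

```clojure
(defn numbers-from
  "An infinite lazy seq n, n+1, n+2, ... built the idiomatic way:
  lazy-seq wrapping a cons whose rest is a recursive call."
  [n]
  (lazy-seq (cons n (numbers-from (inc n)))))

(take 5 (numbers-from 0))  ;; => (0 1 2 3 4)
```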
Re clojure.core/range, we have the docs:
range
Returns a lazy seq of nums from start (inclusive) to end (exclusive),
by step, where start defaults to 0, step to 1, and end to infinity.
When step is equal to 0, returns an infinite sequence of start. When
start is equal to end, returns empty list.
In the source code, it calls out to the Java implementation:
([start end]
(if (and (instance? Long start) (instance? Long end))
(clojure.lang.LongRange/create start end)
(clojure.lang.Range/create start end)))
and the Java code indicates it is chunked:
public class Range extends ASeq implements IChunkedSeq, IReduce {
private static final int CHUNK_SIZE = 32;
<snip>
In addition to the other answers: performance can be significantly increased if you eliminate math boxing (the original versions were both about 25 ms). The loop/recur variant is then 2× faster.
(set! *unchecked-math* :warn-on-boxed)
(defn term ^double [^long k]
(let [x (/ 1. (inc (* 2 k)))]
(if (even? k)
x
(- x))))
(defn ρ [n]
(reduce
(fn [^double res ^long k] (+ res (term k)))
0
(range 0 n)))
(defn ρ' [^long n]
(loop [res (double 0) k 0]
(if (< k n)
(recur (+ res (term k)) (inc k))
res)))
(criterium.core/quick-bench
(ρ 1000000))
Evaluation count : 42 in 6 samples of 7 calls.
Execution time mean : 15.639294 ms
Execution time std-deviation : 371.972168 µs
Execution time lower quantile : 15.327698 ms ( 2.5%)
Execution time upper quantile : 16.227505 ms (97.5%)
Overhead used : 1.855553 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
=> nil
(criterium.core/quick-bench
(ρ' 1000000))
Evaluation count : 72 in 6 samples of 12 calls.
Execution time mean : 8.570961 ms
Execution time std-deviation : 302.554974 µs
Execution time lower quantile : 8.285648 ms ( 2.5%)
Execution time upper quantile : 8.919635 ms (97.5%)
Overhead used : 1.855553 ns
=> nil
Below is an improved benchmark intended to be more representative. The performance varies from case to case, but not by much.
(defn term [k]
(let [x (/ 1. (inc (* 2 k)))]
(if (even? k)
x
(- x))))
(defn ρ1 [n]
(loop [res 0 k 0]
(if (< k n)
(recur (+ res (term k)) (inc k))
res)))
(defn ρ2 [n]
(reduce
(fn [res k] (+ res (term k)))
0
(range 0 n)))
(defn ρ3 [n]
(reduce + 0 (map term (range 0 n))))
(defn ρ4 [n]
(transduce (map term) + 0 (range 0 n)))
(defn calc [ρname ρ n]
(let [start (System/nanoTime)
π (* (ρ n) 4)
end (System/nanoTime)
τ (- end start)]
(printf "ρ=%8s n=%10d π=%.16f τ=%10d\n" ρname n π τ)))
(def args
{:N (map #(long (Math/pow 10 %)) [4 6])
:T 10
:P {:recur ρ1 :reduce ρ2 :mreduce ρ3 :xreduce ρ4}})
(doseq [n (:N args)]
(dotimes [_ (:T args)]
(println "---")
(doseq [kv (:P args)] (calc (key kv) (val kv) n))))
I'm new to Clojure and trying to implement an exponential moving average function using tail recursion. After battling a little with stack overflows using lazy-seq and concat, I got to the following implementation which works, but is very slow:
(defn ema3 [c a]
  (loop [ct (rest c) res [(first c)]]
    (if (= (count ct) 0)
      res
      (recur
        (rest ct)
        (into ;; NOT lazy-seq or concat
          res
          [(+ (* a (first ct)) (* (- 1 a) (last res)))])))))
For a 10,000 item collection, Clojure will take around 1300ms, whereas a Python Pandas call such as
s.ewm(alpha=0.3, adjust=True).mean()
will take only 700 µs. How can I reduce that performance gap? Thank you.
Personally I would do this lazily with reductions. It's simpler to do than using loop/recur or building up a result vector by hand with reduce, and it also means you can consume the result as it is built up, rather than needing to wait for the last element to be finished before you can look at the first one.
If you care most about throughput then I suppose Taylor Wood's reduce is the best approach, but the lazy solution is only very slightly slower and is much more flexible.
(defn ema3-reductions [c a]
  (let [a' (- 1 a)]
    (reductions
      (fn [ave x]
        (+ (* a x)
           (* a' ave)))
      (first c)
      (rest c))))
user> (quick-bench (dorun (ema3-reductions (range 10000) 0.3)))
Evaluation count : 288 in 6 samples of 48 calls.
Execution time mean : 2.336732 ms
Execution time std-deviation : 282.205842 µs
Execution time lower quantile : 2.125654 ms ( 2.5%)
Execution time upper quantile : 2.686204 ms (97.5%)
Overhead used : 8.637601 ns
nil
user> (quick-bench (dorun (ema3-reduce (range 10000) 0.3)))
Evaluation count : 270 in 6 samples of 45 calls.
Execution time mean : 2.357937 ms
Execution time std-deviation : 26.934956 µs
Execution time lower quantile : 2.311448 ms ( 2.5%)
Execution time upper quantile : 2.381077 ms (97.5%)
Overhead used : 8.637601 ns
nil
Honestly in that benchmark you can't even tell the lazy version is slower than the vector version. I think my version is still slower, but it's a vanishingly trivial difference.
You can also speed things up if you tell Clojure to expect doubles, so it doesn't have to keep double-checking the types of a, c, and so on.
(defn ema3-reductions-prim [c ^double a]
(let [a' (- 1.0 a)]
(reductions (fn [ave x]
(+ (* a (double x))
(* a' (double ave))))
(first c)
(rest c))))
user> (quick-bench (dorun (ema3-reductions-prim (range 10000) 0.3)))
Evaluation count : 432 in 6 samples of 72 calls.
Execution time mean : 1.720125 ms
Execution time std-deviation : 385.880730 µs
Execution time lower quantile : 1.354539 ms ( 2.5%)
Execution time upper quantile : 2.141612 ms (97.5%)
Overhead used : 8.637601 ns
nil
Another 25% speedup, not too bad. I expect you could squeeze out a bit more by using primitives in either a reduce solution or with loop/recur if you were really desperate. It would be especially helpful in a loop because you wouldn't have to keep boxing and unboxing the intermediate results between double and Double.
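A sketch of that idea, assuming the input fits in a vector (`ema3-loop-prim` is a made-up name and I have not benchmarked it): the running average stays an unboxed double across recur, and only the stored results get boxed when conj'd onto the vector.

```clojure
(defn ema3-loop-prim [c ^double a]
  (let [v  (vec c)
        a' (- 1.0 a)
        n  (count v)]
    (loop [i   1
           ave (double (first v))       ; stays primitive across recur
           res [(first v)]]
      (if (< i n)
        (let [ave' (+ (* a (double (nth v i))) (* a' ave))]
          (recur (inc i) ave' (conj res ave')))
        res))))
```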
If res is a vector (which it is in your example) then using peek instead of last yields much better performance:
(defn ema3 [c a]
  (loop [ct (rest c) res [(first c)]]
    (if (= (count ct) 0)
      res
      (recur
        (rest ct)
        (into
          res
          [(+ (* a (first ct)) (* (- 1 a) (peek res)))])))))
Your example on my computer:
(time (ema3 (range 10000) 0.3))
"Elapsed time: 990.417668 msecs"
Using peek:
(time (ema3 (range 10000) 0.3))
"Elapsed time: 9.736761 msecs"
Here's a version using reduce that's even faster on my computer:
(defn ema3 [c a]
(reduce (fn [res ct]
(conj
res
(+ (* a ct)
(* (- 1 a) (peek res)))))
[(first c)]
(rest c)))
;; "Elapsed time: 0.98824 msecs"
Take these timings with a grain of salt. Use something like criterium for more thorough benchmarking. You might be able to squeeze out more gains using mutability/transients.
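For instance, a transient-based sketch (`ema3-transient` is a made-up name and unbenchmarked); note that peek does not work on transient vectors, so the previous average is carried in the accumulator alongside the transient:

```clojure
(defn ema3-transient [c a]
  (let [x0 (first c)]
    (persistent!
      (first
        (reduce (fn [[res prev] ct]
                  (let [nxt (+ (* a ct) (* (- 1 a) prev))]
                    ;; conj! onto the transient, carry nxt as "last"
                    [(conj! res nxt) nxt]))
                [(transient [x0]) x0]
                (rest c))))))
```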
How can I make memoize work when the argument to a memoized function is a sequence?
(defn foo
([x] (println "Hello First") (reduce + x))
([x y] (println "Hello Second") (reduce + (range x y))))
(def baz (memoize foo))
Passing one arg:
;; first call
(time (baz (range 1 1000000))) ;=> Hello First "Elapsed time: 14.870628 msecs"
;; second call
(time (baz (range 1 1000000))) ;=> "Elapsed time: 65.386561 msecs"
Passing 2 args:
;; first call
(time (baz 1 1000000)) ;=> Hello Second "Elapsed time: 18.619768 msecs"
;; second call
(time (baz 1 1000000)) ;=> "Elapsed time: 0.069684 msecs"
The second run of the function when passed 2 arguments seems to be what I expect.
However using a vector appears to work...
(time (baz [1 2 3 5 3 5 7 4 6 7 4 45 6 7])) ;=> Hello First "Elapsed time: 0.294963 msecs"
(time (baz [1 2 3 5 3 5 7 4 6 7 4 45 6 7])) ;=> "Elapsed time: 0.068229 msecs"
memoize does work with sequences, you just need to compare apples to apples. memoize looks up the parameter in the hash map of previously used ones, and as a result you end up comparing the sequences. Comparing long sequences is what takes a long time, whether they are vectors or not:
user> (def x (vec (range 1000000)))
;; => #'user/x
user> (def y (vec (range 1000000)))
;; => #'user/y
user> (time (= x y))
"Elapsed time: 64.351274 msecs"
;; => true
user> (time (baz x))
"Elapsed time: 67.42694 msecs"
;; => 499999500000
user> (time (baz x))
"Elapsed time: 73.231174 msecs"
;; => 499999500000
When you use very short input sequences, the timing is dominated by the reduce inside your function. But with very long ones most of the time you see is actually the comparison time inside memoize.
So technically memoize works, in the same way for all sequences. But working "technically" doesn't imply "being useful." As you have discovered yourself, it is useless (actually maybe even harmful) for input with expensive comparison semantics. Your second signature solves this problem.
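In other words, when the real input is expensive to compare, memoize on a cheap key that determines it, which is exactly what the two-argument signature does. A tiny sketch (`sum-range` is a made-up name):

```clojure
;; Cache on the (cheap-to-compare) bounds, not on the realized seq.
(def sum-range
  (memoize (fn [lo hi] (reduce + (range lo hi)))))

(sum-range 1 1000000)  ;; computed on the first call
(sum-range 1 1000000)  ;; => 499999500000, now just a map lookup
```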
I'm approximating π using the series
π = ( 1/1 - 1/3 + 1/5 - 1/7 + 1/9 ... ) × 4
The function for the k-th term of the series looks like this:
(defn- pi-series [k]
(/ (if (even? (inc k)) 1 -1)
(dec (* 2 k))))
And then my series generator looks like this*:
(defn pi [n]
(* 4
(loop [k 1
acc 0]
(if (= k (inc n))
acc
(recur (inc k)
(+ acc (double (pi-series k))))))))
Running pi with the value 999,999 produces the following:
(time (pi 999999))
;;=> "Elapsed time: 497.686 msecs"
;;=> 3.1415936535907734
That looks great, but I realized pi could be written more declaratively. Here's what I ended up with:
(defn pi-fn [n]
(* 4 (reduce +
(map #(double (pi-series %))
(range 1 (inc n))))))
Which resulted in the following:
(time (pi-fn 999999))
;;=> "Elapsed time: 4431.626 msecs"
;;=> 3.1415936535907734
NOTE: The declarative version took around 4 seconds longer. Why is it so much slower, and how can I update it to be as fast as the imperative version?
* I'm casting the result of pi-series to a double, because using Clojure's ratio type performed a lot slower.
By the way, you can express an alternating finite sum as a difference of two sums, eliminating the need to adjust each term for sign individually. For example,
(defn alt-sum [f n]
(- (apply + (map f (range 1 (inc n) 2)))
(apply + (map f (range 2 (inc n) 2)))))
(time (* 4 (alt-sum #(/ 1.0 (dec (+ % %))) 999999)))
; "Elapsed time: 195.244047 msecs"
;= 3.141593653590707
On my laptop pi runs at 2500 msec. However, pi and pi-fn (either version) run at approx. the same rate (10x slower than alt-sum). More often than not, pi-fn is faster than pi. Are you sure you didn't accidentally insert an extra 9 before the second timing? Contra Juan, I do not think you're iterating over the sequence more than once, since the terms are generated lazily.
scratch.core> (time (pi 999999))
"Elapsed time: 2682.86669 msecs"
3.1415936535907734
scratch.core> (time (pi-fn 999999))
"Elapsed time: 2082.071798 msecs"
3.1415936535907734
scratch.core> (time (pi-fn-juan 999999))
"Elapsed time: 1934.976217 msecs"
3.1415936535907734
scratch.core> (time (* 4 (alt-sum #(/ 1.0 (dec (+ % %))) 999999)))
"Elapsed time: 199.998438 msecs"
3.141593653590707
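If you do want the declarative shape without building an intermediate lazy seq, a transducer version fuses the map step into a single reduce pass (a sketch; `pi-xf` is a made-up name, and `pi-series` is copied from the question so the block is self-contained):

```clojure
(defn pi-series [k]
  (/ (if (even? (inc k)) 1 -1)
     (dec (* 2 k))))

;; transduce applies the (map ...) transducer inline while summing,
;; so no intermediate sequence of terms is ever realized.
(defn pi-xf [n]
  (* 4 (transduce (map #(double (pi-series %)))
                  +
                  (range 1 (inc n)))))
```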
I'm trying to get all "moving" partitions of size k of a string. Basically, I want to slide a window of size k along the string and collect each k-character word.
Here's an example,
k: 3
Input: ABDEFGH
Output: ABD, EFG, BDE, FGH, DEF
My idea was to walk along the input, drop the head, partition, and then drop the head again from the previously (now headless) sequence, but I'm not sure exactly how to do this. Also, maybe there's a better way? Below is the idea I had in mind.
(#(partition k input) (collection of s where head was consecutively dropped))
Strings in Clojure can be treated as seqs of characters, so you can partition them directly. To get a sequence of overlapping partitions, use the version that accepts a size and a step:
user> (partition 3 1 "abcdef")
((\a \b \c) (\b \c \d) (\c \d \e) (\d \e \f))
To put a character sequence back into a string, just apply str to it:
user> (apply str '(\a \b \c))
"abc"
To put it all together:
user> (map (partial apply str) (partition 3 1 "abcdef"))
("abc" "bcd" "cde" "def")
Here are implementations of partition and partition-all for strings that return a lazy seq of strings, doing the splitting with subs. If you need high performance for string transformations, these will be significantly faster (on average 8 times as fast; see the criterium benchmarks below) than creating char seqs.
(defn partition-string
"Like partition, but splits string using subs and returns a
lazy-seq of strings."
([n s]
(partition-string n n s))
([n p s]
(lazy-seq
(if-not (< (count s) n)
(cons
(subs s 0 n)
(->> (subs s p)
(partition-string n p)))))))
(defn partition-string-all
"Like partition-all, but splits string using subs and returns a
lazy-seq of strings."
([n s]
(partition-string-all n n s))
([n p s]
(let [less (if (< (count s) n)
(count s))]
(lazy-seq
(cons
(subs s 0 (or less n))
(if-not less
(->> (subs s p)
(partition-string-all n p))))))))
;; Alex answer:
;; (let [test-str "abcdefghijklmnopqrstuwxyz"]
;; (criterium.core/bench
;; (doall
;; (map (partial apply str) (partition 3 1 test-str)))))
;; WARNING: Final GC required 1.010207840526515 % of runtime
;; Evaluation count : 773220 in 60 samples of 12887 calls.
;; Execution time mean : 79.900801 µs
;; Execution time std-deviation : 2.008823 µs
;; Execution time lower quantile : 77.725304 µs ( 2.5%)
;; Execution time upper quantile : 83.888349 µs (97.5%)
;; Overhead used : 17.786101 ns
;; Found 3 outliers in 60 samples (5.0000 %)
;; low-severe 3 (5.0000 %)
;; Variance from outliers : 12.5585 % Variance is moderately inflated by outliers
;; KobbyPemson answer:
;; (let [test-str "abcdefghijklmnopqrstuwxyz"]
;; (criterium.core/bench
;; (doall
;; (moving-partition test-str 3))))
;; WARNING: Final GC required 1.674347646128195 % of runtime
;; Evaluation count : 386820 in 60 samples of 6447 calls.
;; Execution time mean : 161.928479 µs
;; Execution time std-deviation : 8.362590 µs
;; Execution time lower quantile : 154.707888 µs ( 2.5%)
;; Execution time upper quantile : 184.095816 µs (97.5%)
;; Overhead used : 17.786101 ns
;; Found 3 outliers in 60 samples (5.0000 %)
;; low-severe 2 (3.3333 %)
;; low-mild 1 (1.6667 %)
;; Variance from outliers : 36.8985 % Variance is moderately inflated by outliers
;; This answer
;; (let [test-str "abcdefghijklmnopqrstuwxyz"]
;; (criterium.core/bench
;; (doall
;; (partition-string 3 1 test-str))))
;; WARNING: Final GC required 1.317098148979236 % of runtime
;; Evaluation count : 5706000 in 60 samples of 95100 calls.
;; Execution time mean : 10.480174 µs
;; Execution time std-deviation : 240.957206 ns
;; Execution time lower quantile : 10.234580 µs ( 2.5%)
;; Execution time upper quantile : 11.075740 µs (97.5%)
;; Overhead used : 17.786101 ns
;; Found 3 outliers in 60 samples (5.0000 %)
;; low-severe 3 (5.0000 %)
;; Variance from outliers : 10.9961 % Variance is moderately inflated by outliers
(defn moving-partition
  [input k]
  (map #(.substring input % (+ k %))
       (range (- (count input) (dec k)))))