I am new to Clojure and would like to know the idiomatic way to do the following:
I would like to sum a collection of numbers nums, which may contain a large number of elements; assume they are all positive.
I don't care about the exact sum if it is very large. For example, if the sum is larger than 9999, I would simply return 10000 without summing the remaining numbers at all.
If I implemented this in an OO language such as Java, I might do it like this:
private int sum(int[] nums) {
    int sum = 0;
    for (int n : nums) {
        sum += n;
        // Check after adding, so a final element that crosses the limit is capped too.
        if (sum > 9999) {
            sum = 10000;
            break;
        }
    }
    return sum;
}
A naive implementation in Clojure may look like:
(let [sum (reduce + nums)]
(if (> sum 9999) 10000 sum))
However, this wastes CPU time summing the entire collection, which is not desired. I am looking for something like the take-while function, but for reduce, and cannot find it. Is there something like:
(reduce-while pred f val coll)
Or is there any other Clojure idiomatic way to solve this problem? I think the solution can be applied to a set of problems requiring similar logic.
Any comment is appreciated. Thanks.
If you're using Clojure 1.5 or later, you can take advantage of the new reduced function, which makes reduce stop early:
(reduce (fn [acc n] (let [s (+ acc n)] (if (> s 9999) (reduced 10000) s))) 0 nums)
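The reduce-while function asked about in the question is not part of clojure.core, but it can be sketched on top of reduced. The name reduce-while, its argument order, and the capped-sum wrapper below are assumptions made for illustration, not an existing API:

```clojure
;; Hypothetical helper: reduce coll with f starting from init, and stop as
;; soon as the newly accumulated value no longer satisfies pred.
;; The first value that fails pred is the one returned.
(defn reduce-while
  [pred f init coll]
  (reduce (fn [acc x]
            (let [next-acc (f acc x)]
              (if (pred next-acc)
                next-acc
                (reduced next-acc))))
          init
          coll))

;; Hypothetical wrapper for the question's use case: sum, capped at 10000.
(defn capped-sum [nums]
  (min 10000 (reduce-while #(<= % 9999) + 0 nums)))

(capped-sum [1 2 3])       ;; => 6
(capped-sum [5000 6000])   ;; => 10000
(capped-sum (range))       ;; => 10000, terminates even on an infinite seq
```

Returning the first failing value (rather than the last passing one) is a design choice; it lets the caller decide how to cap or report the overflow.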
One of the lesser known Clojure functions seems to be reductions. It will give you all the intermediate results of your computation:
(reductions + (range 4)) ;; => (0 1 3 6)
(reduce + (range 4)) ;; => 6
The last element of reductions' result seq will be the reduced value. There are multiple ways to enforce your predicate, e.g. using some:
(let [sums (reductions + nums)]
(if (some #(> % 9999) sums)
10000
(last sums)))
The reduce/reduced version given by @leonid-beschastny is probably faster (no lazy sequence overhead, reducers, ...), but this one will work in earlier Clojure versions, too.
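Because reductions returns a lazy sequence, some only realizes running sums until the first one exceeds the limit. A minimal sketch of that behaviour, where the function name capped-sum-lazy is made up for this example and an explicit initial value of 0 is passed so the empty case is handled the same way:

```clojure
;; Hypothetical wrapper around the reductions approach.
(defn capped-sum-lazy [nums]
  (let [sums (reductions + 0 nums)]
    (if (some #(> % 9999) sums)
      10000
      (last sums))))

(capped-sum-lazy [1 2 3])   ;; => 6
(capped-sum-lazy (range))   ;; => 10000, stops once a running sum passes 9999
```

When the limit is never reached, the sequence of sums is walked twice (once by some, once by last), which is part of the overhead mentioned above.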
Clojure provides means for lazy evaluation of values in (infinite) sequences. With this, values will only be computed when they get actually consumed.
An example of an infinite sequence of one repeated element:
(take 3 (repeat "Hello StackOverflow"))
;; => ("Hello StackOverflow" "Hello StackOverflow" "Hello StackOverflow")
Using take helps to only consume as many elements from the sequence as we want. Without it, an OutOfMemoryError would kill the process quickly.
Another example of an infinite sequence is the following:
(take 5 (iterate inc 1))
;; => (1 2 3 4 5)
Or a more advanced sequence providing the factorial function:
(defn factorial [n]
  (apply * (take n (iterate inc 1))))
(factorial 5)
;; => 120
Does Kotlin provide similar sequences? What do they look like?
I answered the question myself in order to document the knowledge here. This is fine according to Can I answer my own question?
In Kotlin, we can also make use of lazy evaluation using Sequences. In order to create a sequence, we may use generateSequence (with or without providing a seed).
fun <T : Any> generateSequence(
seed: T?,
nextFunction: (T) -> T?
): Sequence<T>
Returns a sequence defined by the starting value seed and the function nextFunction, which is invoked to calculate the next value based on the previous one on each iteration.
The following will show some examples comparing Clojure with Kotlin sequences.
1. A simple take from an infinite sequence of one static value
Clojure
(take 3 (repeat "Hello StackOverflow"))
Kotlin
generateSequence { "Hello StackOverflow" }.take(3).toList()
These are pretty similar. In Clojure we can use repeat, and in Kotlin it's simply generateSequence with a static value that will be yielded forever. In both cases, take is used to define the number of elements we want to compute.
Note: In Kotlin, we transform the resulting sequence into a list with toList()
2. A simple take from an infinite sequence of a dynamic value
Clojure
(take 5 (iterate inc 1))
Kotlin
generateSequence(1) { it.inc() }.take(5).toList()
This example is a bit different because the sequences yield the increment of the previous value infinitely. The Kotlin generateSequence can be invoked with a seed (here: 1) and a nextFunction (incrementing the previous value).
3. A cyclic repetition of values from a list
Clojure
(take 5 (drop 2 (cycle [:first :second :third ])))
;; => (:third :first :second :third :first)
Kotlin
listOf("first", "second", "third").let { elements ->
generateSequence(0) {
(it + 1) % elements.size
}.map(elements::get)
}.drop(2).take(5).toList()
In this example, we repeat the values of a list cyclically, drop the first two elements and then take 5. It happens to be quite verbose in Kotlin because repeating elements from the list isn't straightforward. In order to fix it, a simple extension function makes the relevant code more readable:
fun <T> List<T>.cyclicSequence() = generateSequence(0) {
(it + 1) % this.size
}.map(::get)
listOf("first", "second", "third").cyclicSequence().drop(2).take(5).toList()
4. Factorial
Last but not least, let's see how the factorial problem can be solved with a Kotlin sequence. First, let's review the Clojure version:
Clojure
(defn factorial [n]
(apply * (take n (iterate inc 1))))
We take n values from a sequence that yields incrementing numbers starting with 1 and multiply them together with the help of apply.
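For comparison with a fold-style accumulation, the same computation can also be sketched with reduce and an explicit initial value. The name factorial-fold is made up for this example:

```clojure
(defn factorial [n]
  (apply * (take n (iterate inc 1))))

;; Hypothetical variant: fold the sequence with an explicit starting value of 1.
(defn factorial-fold [n]
  (reduce * 1 (take n (iterate inc 1))))

(factorial 5)       ;; => 120
(factorial-fold 5)  ;; => 120
(factorial-fold 0)  ;; => 1, the initial value when nothing is taken
```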
Kotlin
fun factorial(n: Int) = generateSequence(1) { it.inc() }.take(n).fold(1) { v1, v2 ->
v1 * v2
}
Kotlin offers fold, which lets us accumulate the values easily.
I thought I understood recur, but the following usage doesn't make sense:
(fn gcd [a b]
(if (= b 0)
a
(recur b (rem a b))))
The function retrieves the greatest common divisor for two numbers. For 4 and 2, the function would give 2.
I know that recur can be bound to functions, but I would think that 'b' is just cycled through the recur without any change. You generally need to put in something like a (inc b) to allow the value in the loop to change.
What am I missing?
The gcd function here uses the Euclidean algorithm to find the Greatest Common Divisor of two numbers.
The function works and does terminate because the argument list is [a b] but recur is called with b and (rem a b). Note that b changes position here (from second place to first place).
The value of a is changed because the value of b is assigned to it. Also, the value of b is changed because (rem a b) is assigned to it (thus decreasing). Therefore both values decrease when the calls are repeated and eventually one of them reaches 0 (that stops the recursion).
(fn gcd [a b]
(if (= b 0) a
(recur b (rem a b))))
For example, suppose I call this function with arguments a = 24, b = 16.
The function is called recursively as long as b isn't zero.
(gcd 24 16)
=> (gcd 16 8) #_"because b=16 isn't zero, and 8 is the remainder of 24/16"
=> (gcd 8 0) #_"0 is the remainder of 16/8"
=> 8
This calculation stops because b reaches zero.
Yes, you want to change the values in a recursive call so that eventually your test will succeed and you'll break out of the recursion. This algorithm does exactly that: the new first parameter takes the value of the old second parameter, and the new second parameter is recomputed from the old first and second parameters.
Try adding something like (println "a:" a "b:" b) before the if statement in the function, and you'll see the values cycling through as it seeks the answer.
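As an alternative to printing, the argument pairs can be collected and returned. This is only a sketch; the name gcd-steps and the extra steps accumulator are assumptions added for illustration:

```clojure
;; Hypothetical tracing variant: returns every [a b] pair that recur visits.
(defn gcd-steps [a b]
  (loop [a a
         b b
         steps []]
    (let [steps (conj steps [a b])]
      (if (zero? b)
        steps
        ;; Same rebinding as in gcd: a <- b, b <- (rem a b).
        (recur b (rem a b) steps)))))

(gcd-steps 24 16)  ;; => [[24 16] [16 8] [8 0]]
```

The first element of the last pair is the greatest common divisor.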
I would like to loop through a vector starting from the nth element rather than from 0.
Here is how it looks in Java:
for (int i = firstIndex; i <= lastIndex; i++) {
    newText += contents[i] + " ";
}
If you are always dealing with a vector, then subvec is a good choice. For example:
(subvec contents firstIndex)
If you want to be compatible with sequences in general, you'll want to use drop. drop is O(n) w.r.t. the number of elements dropped, whereas subvec is always O(1). If you're only ever dropping a few elements, the difference is negligible. But for dropping a large number of elements (i.e., a large firstIndex), subvec will be a clear winner. However, subvec is only available on vectors.
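subvec also accepts an end index, which is exclusive, so an inclusive upper bound like lastIndex in the Java loop needs an inc. A small sketch with made-up sample data:

```clojure
;; Sample data assumed for illustration.
(def contents ["a" "b" "c" "d" "e"])

(subvec contents 2)            ;; => ["c" "d" "e"]

;; Elements from index 1 through index 3 inclusive.
(subvec contents 1 (inc 3))    ;; => ["b" "c" "d"]
```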
You can use the drop function to skip the first n elements and then loop over the result. For example, if you want to skip the first two elements:
user=> (drop 2 [1 2 3 4])
(3 4)
The following does roughly the same as the Java code you provided (without the trailing space and without an upper bound):
(require '[clojure.string :as str])
(str/join " " (drop first-index contents))
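To also honour an inclusive upper bound on any sequence, drop can be combined with take. A minimal sketch; the function name join-range and the sample data are assumptions for illustration:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: joins the elements from first-index through
;; last-index (inclusive) with single spaces.
(defn join-range [contents first-index last-index]
  (->> contents
       (drop first-index)
       (take (inc (- last-index first-index)))
       (str/join " ")))

(join-range ["a" "b" "c" "d" "e"] 1 3)  ;; => "b c d"
```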
As we know, (map f [a b c]) is equivalent to [(f a) (f b) (f c)].
My question is: the result of evaluating (map #(- (int %) (int \0)) "1234") is (1 2 3 4). Why does it return the results of applying #(- (int %) (int \0)) to every digit of "1234", rather than to the string "1234" as a whole? How should I understand this code example?
map calls seq on all arguments after the first. seq turns a string into a sequence of characters.
Clojure can treat a string as a sequence of characters. This is useful because you can:
map things over the string
partition the string
get locations by index
do everything else sequences do.
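It's perhaps a bit annoying having to remember to put the resulting sequence back into a string by wrapping the sequence-manipulating expression in a call to apply str (or clojure.string/join); calling str directly on a lazy sequence does not concatenate its elements.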
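A short sketch of what happens to the string, and of turning a character sequence back into a string afterwards:

```clojure
;; seq exposes the string as a sequence of characters.
(seq "1234")                          ;; => (\1 \2 \3 \4)

;; map therefore applies the function to each character in turn.
(map #(- (int %) (int \0)) "1234")    ;; => (1 2 3 4)

;; apply str concatenates the characters of a sequence into a string again.
(apply str (reverse "1234"))          ;; => "4321"
```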
I may be misinterpreting the exact phrasing on the question about which I've been thinking, but I'm curious as to how one would reduce an equation of multiple variables.
I'm assuming factoring plays a major role, however the only way I can think of doing it is to break the equation into a tree of operations and search the entire tree for duplicate nodes. I'm assuming that there's a better way, since many web applications do this quite quickly.
Any better way of doing this?
I would assume that the kind of reductions that they are looking for is that something like (2 + 3) * x should become (* 5 x) rather than (* (+ 2 3) x). In which case you can just recognize that subtrees are constant, and calculate them.
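Constant-subtree evaluation can be sketched as a small recursive function. This is a rough illustration only: the name fold-constants is made up, and it assumes prefix expressions written as quoted lists that use only + and *:

```clojure
;; Hypothetical helper: evaluates every sub-expression whose operands are
;; all numbers, and leaves everything else untouched.
(defn fold-constants [expr]
  (if (seq? expr)
    (let [[op & raw-args] expr
          args (map fold-constants raw-args)
          f    ({'+ + '* *} op)]          ; assumption: only + and * are known
      (if (and f (every? number? args))
        (apply f args)
        (cons op args)))
    expr))

(fold-constants '(* (+ 2 3) x))   ;; => (* 5 x)
(fold-constants '(+ (* 2 3) 4))   ;; => 10
```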
You can also use the associative and commutative laws to try to move things around first to assist in the process. So that 2 + x + 3 would become (+ 5 x) rather than (+ (+ 2 x) 3).
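One way to apply associativity and commutativity is to flatten nested uses of the same operator and then combine all numeric operands into one constant. The sketch below is an assumption-laden illustration (the names ops and simplify are made up, and only + and * are supported), not a complete simplifier:

```clojure
;; Operator symbol -> [function, identity value]. Assumption: only + and *.
(def ops {'+ [+ 0], '* [* 1]})

(defn simplify [expr]
  (if-not (seq? expr)
    expr
    (let [[op & raw-args] expr
          [f identity-val] (ops op)
          ;; Simplify each operand and splice in operands of a nested
          ;; expression that uses the same operator (associativity).
          args   (mapcat (fn [a]
                           (let [a (simplify a)]
                             (if (and (seq? a) (= op (first a)))
                               (rest a)
                               [a])))
                         raw-args)
          ;; Reordering operands is allowed by commutativity.
          const  (apply f (filter number? args))
          others (remove number? args)]
      (cond
        (empty? others)        const
        (= const identity-val) (if (next others) (cons op others) (first others))
        :else                  (cons op (cons const others))))))

(simplify '(+ (+ 2 x) 3))   ;; => (+ 5 x)
(simplify '(* (+ 2 3) x))   ;; => (* 5 x)
```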
Take this idea as far as you want. It has been deliberately given in an open-ended fashion. I'm sure that they would be happy to see you automatically recognize that x * x + 2 * x + 1 is (* (+ 1 x) (+ 1 x)) instead of (+ (+ (* x x) (* 2 x)) 1) but you can do a lot of good reductions without going there.
The general solution is to write a flex/bison translator and reduce the parsed expressions. Once you have a translation flow in place, you can add rules such as expr*expr + 2*expr + 1 -> (* (+ 1 expr) (+ 1 expr)) almost as simply as it is written here.
You can do it using a stack. It is much simpler that way. A solution is posted here
http://bluefintuna.wordpress.com/2008/07/15/infix-prefix/
One problem here is what one considers reduced. For example (written in infix for better readability, but prefix is similar), take x * x + x + 2 + 2 * x: one obvious reduction would be x * x + 3 * x + 2, another would be (x + 1) * (x + 2). As these are quite non-trivial problems, and judging from the wording of the assignment, I would assume you should bring your equation into some canonical form (e.g. a polynomial sorted from highest power to lowest; with more than one variable, sort by the sum of the powers) and reduce the coefficients there (calculating them when they are constant).
Be careful: some optimizations may seem valid but are not in general. E.g. don't reduce (/ (* x x) x) to x, as then x = 0 is suddenly valid, which it was not before.