I often have to run my data through a function if the data fulfill certain criteria. Typically, both the function f and the criteria checker pred are parameterized to the data. For this reason, I find myself wishing for a higher-order if-then-else which knows neither f nor pred.
For example, assume I want to add 10 to all even integers in (range 5). Instead of
(map #(if (even? %) (+ % 10) %) (range 5))
I would prefer to have a helper –let's call it fork– and do this:
(map (fork even? #(+ % 10)) (range 5))
I could go ahead and implement fork as a function. It would look like this:
(defn fork
([pred thenf elsef]
#(if (pred %) (thenf %) (elsef %)))
([pred thenf]
(fork pred thenf identity)))
Can this be done by elegantly combining core functions? Some nice chain of juxt / apply / some maybe?
Alternatively, do you know any Clojure library which implements the above (or similar)?
As Alan Thompson mentions, cond-> is a fairly standard way of implicitly getting the "else" part to be "return the value unchanged" these days. It doesn't really address your hope of being higher-order, though. I have another reason to dislike cond->: I think (and argued when cond-> was being invented) that it's a mistake for it to thread through each matching test, instead of just the first. It makes it impossible to use cond-> as an analogue to cond.
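To make the difference concrete, here is a small sketch using only clojure.core (the names threaded and first-only are just for illustration). cond-> applies every clause whose test is truthy, while cond stops at the first one:

```clojure
;; cond-> keeps threading through every clause whose test is truthy.
(def threaded
  (cond-> 4
    (even? 4) (+ 10)    ; 4 becomes 14
    (pos? 4)  (* 2)))   ; 14 becomes 28

;; cond stops at the first truthy test.
(def first-only
  (cond
    (even? 4) (+ 4 10)  ; taken, result is 14
    (pos? 4)  (* 4 2))) ; never evaluated

[threaded first-only]
;; => [28 14]
```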
If you agree with me, you might try flatland.useful.fn/fix, or one of the other tools in that family, which we wrote years before cond-> [1].
to-fix is exactly your fork, except that it can handle multiple clauses and accepts constants as well as functions (for example, maybe you want to add 10 to other even numbers but replace 0 with 20):
(map (to-fix zero? 20, even? #(+ % 10)) xs)
It's easy to replicate the behavior of cond-> using fix, but not the other way around, which is why I argue that fix was the better design choice.
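If you don't want to pull in the library, the first-match behaviour can be approximated in a few lines. This is only a rough sketch and not the library's implementation: the name first-match-fn is made up, and I'm assuming that any non-function "then" value should be returned as a constant.

```clojure
(defn first-match-fn
  "Hypothetical helper: takes pred/then pairs and returns a function that
  applies the `then` of the first matching pred, or returns x unchanged."
  [& clauses]
  (let [pairs (partition 2 clauses)]
    (fn [x]
      (if-let [[_ then] (first (filter (fn [[pred _]] (pred x)) pairs))]
        (if (fn? then) (then x) then) ; constants are returned as-is
        x))))

(map (first-match-fn zero? 20, even? #(+ % 10)) (range 5))
;; => (20 1 12 3 14)
```

Because only the first matching clause fires, 0 becomes 20 rather than 30.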
[1] Apparently we're just a couple weeks away from the 10-year anniversary of the final version of fix. How time flies.
I agree that it could be very useful to have some kind of higher-order functional construct for this but I am not aware of any such construct. It is true that you could implement a higher order fork function, but its usefulness would be quite limited and can easily be achieved using if or the cond-> macro, as suggested in the other answers.
What comes to mind, however, are transducers. You could fairly easily implement a forking transducer that can be composed with other transducers to build powerful and concise sequence processing algorithms.
The implementation could look like this:
(defn forking [pred true-transducer false-transducer]
(fn [step]
(let [true-step (true-transducer step)
false-step (false-transducer step)]
(fn
([] (step))
([dst x] ((if (pred x) true-step false-step) dst x))
([dst] dst))))) ;; flushing not performed.
And this is how you would use it in your example:
(eduction (forking even?
(map #(+ 10 %))
identity)
(range 20))
;; => (10 1 12 3 14 5 16 7 18 9 20 11 22 13 24 15 26 17 28 19)
But it can also be composed with other transducers to build more complex sequence processing algorithms:
(into []
(comp (forking even?
(comp (drop 4)
(map #(+ 10 %)))
(comp (filter #(< 10 %))
(map #(vector % % %))
cat))
(partition-all 3))
(range 20))
;; => [[18 20 11] [11 11 22] [13 13 13] [24 15 15] [15 26 17] [17 17 28] [19 19 19]]
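The forking transducer above does not handle completion, so a stateful transducer placed after it (such as partition-all) never gets to flush a trailing partial group. One possible way around that, assuming both branch transducers are stateless, is to forward completion to the shared downstream step exactly once. The name forking-flush is hypothetical:

```clojure
(defn forking-flush
  "Hypothetical variant of forking that forwards completion downstream.
  Assumes both branch transducers are stateless."
  [pred true-transducer false-transducer]
  (fn [step]
    (let [true-step  (true-transducer step)
          false-step (false-transducer step)]
      (fn
        ([] (step))
        ([dst x] ((if (pred x) true-step false-step) dst x))
        ;; complete the shared downstream step exactly once
        ([dst] (step dst))))))

(into []
      (comp (forking-flush even? (map #(+ 10 %)) identity)
            (partition-all 3))
      (range 5))
;; => [[10 1 12] [3 14]]
```

Stateful transducers inside a branch would still not be flushed by this sketch; handling them properly needs more care, because both branches share the same downstream step.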
Another way to define fork (with three inputs) could be:
(defn fork [pred then else]
(comp
(partial apply apply)
(juxt (comp {true then, false else} pred) list)))
Notice that in this version the returned function (and therefore pred, then and else) can take zero or more arguments. But let's take a more structured approach, defining some other useful combinators. Let's start by defining pick which corresponds to the categorical coproduct (sum) of morphisms:
(defn pick [actions]
(fn [[tag val]]
((actions tag) val)))
;alternatively
(defn pick [actions]
(comp
(partial apply apply)
(juxt (comp actions first) rest)))
E.g. (mapv (pick [inc dec]) [[0 1] [1 1]]) gives [2 0]. Using pick we can define switch which works like case:
(defn switch [test actions]
(comp
(pick actions)
(juxt test identity)))
E.g. (mapv (switch #(mod % 3) [inc dec -]) [3 4 5]) gives [4 3 -5]. Using switch we can easily define fork:
(defn fork [pred then else]
(switch pred {true then, false else}))
E.g. (mapv (fork even? inc dec) [0 1]) gives [1 0]. Finally, using fork let's also define fork* which receives zero or more predicate and action pairs and works like cond:
(defn fork* [& args]
(->> args
(partition 2)
reverse
(reduce
(fn [else [pred then]]
(fork pred then else))
identity)))
;equivalently
(defn fork* [& args]
(->> args
(partition 2)
(map (partial apply (partial partial fork)))
(apply comp)
(#(% identity))))
E.g. (mapv (fork* neg? -, even? inc) [-1 0 1]) gives [1 1 1].
Depending on the details, it is often easiest to accomplish this goal using the cond-> macro and friends:
(let [myfn (fn [val]
             (cond-> val
               (even? val) (+ 10)))]
  (mapv myfn (range 5)))
with result
[10 1 12 3 14]
There is a variant in the Tupelo library that is sometimes helpful:
(mapv #(cond-it-> %
(even? it) (+ it 10))
(range 5))
that allows you to use the special symbol it as you thread the value through multiple stages.
As the examples show, you have the option to define and name the transformer function (my favorite), or use the function literal syntax #(...)
Related
I want to do the following in Clojure as idiomatically as possible:
transduce a collection
associate each element of the input collection with the corresponding element in the output collection
return the result in a hashmap
Is there a succinct way to do this using core library functions?
If not, what improvements can you suggest to the following implementation?
(defn to-hash [coll xform]
(reduce
merge
(map
#(apply hash-map %)
(mapcat hash-map coll (into [] xform coll)))))
something like this should do the trick without intermediate collections:
(defn process [data xform]
(zipmap data (eduction xform data)))
user> (process [1 2 3] (comp (map inc) (map #(* % %))))
;;=> {1 4, 2 9, 3 16}
the docs on eduction say the following:
Returns a reducible/iterable application of the transducers
to the items in coll. Transducers are applied in order as if
combined with comp. Note that these applications will be
performed every time reduce/iterator is called.
so no additional collection is created.
This only works, of course, as long as there is a one-to-one relationship between input and output elements. What is the desired output for (process [1 -2 3] (filter pos?)) or (process [1 1 1 2 2 2] (dedupe)) ?
(by the way, your to-hash implementation has the same flaw)
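To see what actually happens in those two cases, here is process again (repeated so the snippet runs on its own) applied to both inputs. zipmap simply pairs keys and values by position and stops at the shorter sequence, so the keys no longer line up with the elements that produced the values:

```clojure
(defn process [data xform]
  (zipmap data (eduction xform data)))

;; filter drops -2, so 3 ends up paired with the key -2
(process [1 -2 3] (filter pos?))
;; => {1 1, -2 3}

;; dedupe emits (1 2); both values are paired with the repeated key 1
(process [1 1 1 2 2 2] (dedupe))
;; => {1 2}
```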
A transducer is a function that takes a reducing function and returns a new reducing function. To make it work with transducers where there is not a one-to-one mapping from elements in the input collection to the output, you will have to use your transducer to create a new reducing function (step2 in the code below) that will associate elements into your hash map. Something like this.
(def ^:dynamic assoc-k nil)
(defn assoc-step [dst x]
(assoc dst assoc-k x))
(defn to-hash [coll xform]
(let [step (xform (completing assoc-step))
step2 (fn [dst x] (binding [assoc-k x] (step dst x)))]
(reduce step2 {} coll)))
This implementation is quite basic and I am not sure to what extent it will work with stateful transducers. But it will work with the stateless ones, such as map and filter.
And we can test it with a transducer that keeps odd elements in the input collection and squares them:
(defn square [x] (* x x))
(to-hash (range 10) (comp (filter odd?) (map square)))
;; => {1 1, 3 9, 5 25, 7 49, 9 81}
(defn DoubleFrequency []
(def s (slurp "Example.txt"))
(def m (reduce #(assoc %1 %2 (inc (%1 %2 0)))
{}
(re-seq #".." s)))
(def c (count m))
(doseq [[k x] m]
(println k ":" (/ x c))))
I'm trying to apply concurrency to my program, and I want to use pmap, but I'm not sure how to work it into my current code here. The functionality is correct for a single core, but ideally I want to replace reduce with pmap in some way and achieve the same results.
first of all, the function you're trying to write already exists in clojure.core; it is called frequencies:
user> (frequencies [1 2 1 3 1 4 4])
;;=> {1 3, 2 1, 3 1, 4 2}
it is, indeed, single threaded. So let's try to make it parallel.
the initial approach with reduce is the right direction. It is not parallel either, but it can be turned into a parallel version with clojure's standard library concurrency facilities, namely reducers.
first of all, let's rewrite your reducer function a bit, to do the same thing, but in a more idiomatic way (it is optional, but good for readability):
#(assoc %1 %2 (inc (%1 %2 0))) => #(update %1 %2 (fnil inc 0))
then we can move on to a parallel reduce with fold:
(require '[clojure.core.reducers :as r])
(defn pfreq [data]
(r/fold
(partial merge-with +)
(fn [acc k] (update acc k (fnil inc 0)))
data))
the idea is that it splits your collection by chunks (if it is long enough), and then combines chunks' results with merge-with:
user> (pfreq [1 2 1 3 1 4 1 5 2])
;;=> {1 4, 2 2, 3 1, 4 1, 5 1}
notice also that the collection should be 'foldable'. By default, persistent vectors and maps are foldable, but the result of re-seq is not, so you should first convert it into a vector: (vec (re-seq #".." s)). Otherwise you won't get any parallelization, and fold falls back to a plain reduce.
You can also approach this with pmap, using the same strategy: split -> map -> combine:
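Putting the pieces together for the two-character case from the question, a sketch could look like this (the sample string s is made up here; in the question it comes from slurp):

```clojure
(require '[clojure.core.reducers :as r])

(defn pfreq [data]
  (r/fold
   (partial merge-with +)                    ; combine per-chunk results
   (fn [acc k] (update acc k (fnil inc 0)))  ; count within a chunk
   data))

;; made-up sample input standing in for the slurped file
(def s "aabbaacc")

;; vec makes the re-seq result foldable
(pfreq (vec (re-seq #".." s)))
;; => {"aa" 2, "bb" 1, "cc" 1}
```

On an input this small fold will not actually split the work; the point is only that the result matches what frequencies gives.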
(defn pfreq2 [chunk-size data]
(->> data
(partition-all chunk-size)
(pmap frequencies)
(apply merge-with +)))
but this is not as flexible and powerful, as the reducers pipelines.
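A quick way to convince yourself that the pmap version computes the same thing is to compare it with frequencies on some sample data (the chunk size of 3 is arbitrary):

```clojure
(defn pfreq2 [chunk-size data]
  (->> data
       (partition-all chunk-size) ; split
       (pmap frequencies)         ; count each chunk in parallel
       (apply merge-with +)))     ; combine the per-chunk counts

(= (pfreq2 3 [1 2 1 3 1 4 1 5 2])
   (frequencies [1 2 1 3 1 4 1 5 2]))
;; => true
```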
I'm currently learning Clojure, and I'm trying to learn how to do things the best way. Today I'm looking at the basic concept of doing things on a sequence, I know the basics of map, filter and reduce. Now I want to try to do a thing to pairs of elements in a sequence, and I found two ways of doing it. The function I apply is println. The output is simply 12 34 56 7
(def xs [1 2 3 4 5 6 7])
(defn work_on_pairs [xs]
(loop [data xs]
(if (empty? data)
data
(do
(println (str (first data) (second data)))
(recur (drop 2 data))))))
(work_on_pairs xs)
I mean, I could do like this
(map println (zipmap (take-nth 2 xs) (take-nth 2 (drop 1 xs))))
;; prints [1 2] [3 4] [5 6], and we lose the last element because zip.
But it is not really nice.. My background is in Python, where I could just say zip(xs[::2], xs[1::2]) But I guess this is not the Clojure way to do it.
So I'm looking for suggestions on how to do this same thing, in the best Clojure way.
I realize I'm so new to Clojure I don't even know what this kind of operation is called.
Thanks for any input
This can be done with partition-all:
(def xs [1 2 3 4 5 6 7])
(->> xs
(partition-all 2) ; Gives ((1 2) (3 4) (5 6) (7))
(map (partial apply str)) ; or use (map #(apply str %))
(apply println))
12 34 56 7
The map line is just to join the pairs so the "()" don't end up in the output.
If you want each pair printed on its own line, change (apply println) to (run! println). Your expected output seems to disagree with your code, so that's unclear.
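For reference, the one-pair-per-line variant could be wrapped up like this (print-pairs is just a name picked for the example):

```clojure
(def xs [1 2 3 4 5 6 7])

(defn print-pairs [coll]
  (->> coll
       (partition-all 2)       ; ((1 2) (3 4) (5 6) (7))
       (map #(apply str %))    ; ("12" "34" "56" "7")
       (run! println)))        ; print each string on its own line

(print-pairs xs)
;; prints 12, 34, 56 and 7, each on its own line
```

run! is meant for side effects and returns nil, so nothing is left unrealized the way it can be with a lazy map over println.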
If you want to dip into transducers, you can do something similar to the threading (->>) form of the accepted answer, but in a single pass over the data.
Assuming
(def xs [1 2 3 4 5 6 7])
has been evaluated already,
(transduce
(comp
(partition-all 2)
(map #(apply str %)))
conj
[]
xs)
should give you the same output if you wrap it in
(apply println ...)
We supply conj (reducing fn) and [] (initial data structure) to specify how the reduce process inside transduce should build up the result.
I wouldn't use a transducer for a list that small, or a process that simple, but it's good to know what's possible!
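When the reducing function is just conj into an empty vector, into with a transducer says the same thing a little more briefly. A small sketch (pair-strings is a name made up for the composed transducer):

```clojure
(def xs [1 2 3 4 5 6 7])

(def pair-strings
  (comp (partition-all 2)
        (map #(apply str %))))

(into [] pair-strings xs)
;; => ["12" "34" "56" "7"]

;; same result as the explicit transduce call
(= (into [] pair-strings xs)
   (transduce pair-strings conj [] xs))
;; => true
```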
I have a collection (a Java List) of tens of thousands of elements and I'm writing a Clojure function that needs to split this list into several parts based on predicates. In the end I have several Clojure collections with only elements matching the predicate associated with the collection.
The following code solves my problem but iterates over the input list 3 times. Is there a better way to do this?
(defn divide-into-groups [col]
(let [one (filter #(< % 3) col)
two (filter #(and (>= % 3) (< % 6)) col)
three (filter #(>= % 6) col)]
[one two three]))
(divide-into-groups (shuffle (range 10)))
;[(2 0 1) (4 3 5) (6 8 7 9)]
I'm really looking for a functional Clojure solution. I already know I could create three collections as vars and mutate them inside the divide-into-groups function and maybe that is the Clojure way. If so, then please say so.
(NOTE: the predicates I use above are not the ones in my production code. The data I'm working with is also not numbers. This is just a SSCCE. The answer to this question must be applicable to the general problem with arbitrary data in the collection and arbitrary predicates. And of course, performant. To be clear, the lazy lists returned by filter will all be completely iterated over and used to generate some output. So I cannot rely on lazy solutions ;-)
This is what group-by is for. The only thing you need other than your predicates is to give each of your predicate groups a "name" to dictate what group it will be in:
(defn divide-into-groups [xs]
(let [group (fn [x] (cond (>= x 6) :large
(>= 6 x 3) :medium
:else :small))]
(group-by group xs)))
user> (divide-into-groups (shuffle (range 10)))
{:small [1 2 0], :large [6 9 8 7], :medium [3 4 5]}
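For the general problem with arbitrary predicates, the grouping function can be built from a sequence of label/predicate pairs. This is a sketch under a couple of assumptions: the name group-by-preds is made up, the first matching predicate wins, and elements that match nothing end up under the key nil.

```clojure
(defn group-by-preds
  "Hypothetical helper: named-preds is a sequence of [label pred] pairs.
  Each element is grouped under the label of the first pred it satisfies."
  [named-preds coll]
  (let [label-of (fn [x]
                   (some (fn [[label pred]] (when (pred x) label))
                         named-preds))]
    (group-by label-of coll)))

(group-by-preds [[:small  #(< % 3)]
                 [:medium #(< % 6)]
                 [:large  (constantly true)]]
                (range 10))
;; => {:small [0 1 2], :medium [3 4 5], :large [6 7 8 9]}
```

The collection is traversed once, and each element is tested against the predicates only until one matches.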
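You could use partition-by [1], provided the input is already ordered so that elements of the same group are adjacent (partition-by only starts a new partition when the value of the function changes between consecutive elements).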
(partition-by (fn [x] (cond (< x 3) :coll-1
(and (>= x 3) (< x 6)) :coll-2
(>= x 6) :coll-3))
(range 10))
The required function can be constructed programmatically from the sequence of predicate functions. The unique value, ie :coll-1, :coll-2 etc can be anything, even the index of the predicate in the sequence.
EDIT:
;; updated to use map-indexed and some-fn as suggested by @Andre
(defn partitions
[preds coll]
(let [party-fn (apply some-fn
(map-indexed (fn [idx pred]
#(when (pred %1) idx))
preds))]
(partition-by party-fn coll)))
;; output
(partitions [ #(< %1 3) #(<= 3 %1 5) #(>= %1 6)] (range 10))
((0 1 2) (3 4 5) (6 7 8 9))
[1] - https://clojuredocs.org/clojure.core/partition-by
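One thing to keep in mind is that partition-by only looks at consecutive elements, so on unsorted input it produces one partition per run rather than one per group. A small comparison with group-by on made-up data:

```clojure
;; partition-by splits every time the function's value changes
(partition-by #(< % 3) [0 5 1 7 2])
;; => ((0) (5) (1) (7) (2))

;; group-by collects all elements with the same value, in one pass
(group-by #(< % 3) [0 5 1 7 2])
;; => {true [0 1 2], false [5 7]}
```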
I am learning Clojure and trying to solve Project Euler (http://projecteuler.net/) problems using this language.
Second problem asks to find the sum of the even-valued terms in Fibonacci sequence whose values do not exceed four million.
I've tried several approaches, and the following one seems the most promising, if only I could find where it's broken. Right now it returns 0. I am pretty sure there is a problem with the take-while condition but I can't figure it out.
(reduce +
(take-while (and even? (partial < 4000000))
(map first (iterate (fn [[a b]] [b (+ a b)]) [0 1]))))
To compose multiple predicates in this way, you can use every-pred:
(every-pred even? (partial > 4000000))
The return value of this expression is a function that takes an argument and returns true if it is both even and less than 4000000, false otherwise.
user> ((partial < 4000000) 1)
false
Partial puts the static arguments first and the free ones at the end, so it's building the opposite of what you want. It is essentially producing #(< 4000000 %) instead of #(< % 4000000) as you intended. So just change the < to >:
user> (reduce +
(take-while (and even? (partial > 4000000))
(map first (iterate (fn [[a b]] [b (+ a b)]) [0 1]))))
9227464
or perhaps it would be more clear to use the anonymous function form directly:
user> (reduce +
(take-while (and even? #(< % 4000000))
(map first (iterate (fn [[a b]] [b (+ a b)]) [0 1]))))
9227464
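Two things are going on in these snippets, and both can be checked at the REPL. The sketch below uses only clojure.core; below-4m is a name made up for the example. Note that and applied to two functions does not combine them: both are truthy values, so it simply returns the last one, which is why the even? test has no effect above.

```clojure
;; partial fixes the leading argument: (partial > 4000000) is #(> 4000000 %)
(def below-4m (partial > 4000000))

(below-4m 1)              ;; => true
((partial < 4000000) 1)   ;; => false

;; `and` on two functions returns the second function untouched
(identical? (and even? below-4m) below-4m)
;; => true
```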
Now that we have covered a bit about partial, let's break down a working solution. I'll use the thread-last macro ->> to show each step separately.
user> (->> (iterate (fn [[a b]] [b (+ a b)]) [0 1]) ;; start with the fibs
(map first) ;; keep only the answer
(take-while #(< % 4000000)) ;; stop when they get too big
(filter even?) ;; take only the even? ones
(reduce +)) ;; sum it all together.
4613732
From this we can see that we don't actually want to compose the predicates even? and less-than-4000000 in a take-while, because that would stop as soon as either condition was false, leaving only the number zero. Rather, we want to use one of the predicates as a limit and the other as a filter.
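This is easy to confirm by actually composing the two predicates with every-pred and handing the result to take-while (fibs is just a name for the sequence used throughout):

```clojure
(def fibs (map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))

;; take-while stops at the first odd number, which is the second element
(take-while (every-pred even? #(< % 4000000)) fibs)
;; => (0)

;; limit first, then filter
(->> fibs
     (take-while #(< % 4000000))
     (filter even?)
     (reduce +))
;; => 4613732
```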