Is there a reducing function in Clojure that performs the equivalent of `first`? - clojure

I'm often writing code of the form
(->> init
(map ...)
(filter ...)
(first))
When converting this into code that uses transducers I'll end up with something like
(transduce (comp (map ...) (filter ...)) (completing #(reduced %2)) nil init)
Writing (completing #(reduced %2)) instead of first doesn't sit well with me at all. It needlessly obscures a very straightforward task. Is there a more idiomatic way of performing this task?

I'd personally use your approach with a custom reducing function but here are some alternatives:
(let [[x] (into [] (comp (map inc) (filter even?) (take 1)) [0 1])]
x)
Using destructing :/
Or:
(first (eduction (map inc) (filter even?) [0 1])
Here you save on calling comp which is done for you. Though it's not super lazy. It'll realize up to 32 elements so it's potentially wasteful.
Fixed with a (take 1):
(first (eduction (map inc) (filter even?) (take 1) [0 1]))
Overall a bit shorter and not too unclear compared to:
(transduce (comp (map inc) (filter even?) (take 1)) (completing #(reduced %2)) nil [0 1])
If you need this a bunch, then I'd probably NOT create a custom reducer function but instead a function similar to transduce that takes xform, coll as the argument and returns the first value. It's clearer what it does and you can give it a nice docstring. If you want to save on calling comp you can also make it similar to eduction:
(defn single-xf
"Returns the first item of transducing the xforms over collection"
{:arglists '([xform* coll])}
[& xforms]
(transduce (apply comp (butlast xforms)) (completing #(reduced %2)) nil (last xforms)))
Example:
(single-xf (map inc) (filter even?) [0 1])

medley has find-first with a transducer arity and xforms has a reducing function called last. I think that the combination of the two is what you're after.
(ns foo.bar
(:require
[medley.core :as medley]
[net.cgrand.xforms.rfs :as rfs]))
(transduce (comp (map ,,,) (medley/find-first ,,,)) rfs/last init)

Related

Dispatching function calls on different formats of maps

I'm writing an agar.io clone. I've lately seen a lot of suggestions to limit use of records (like here), so I'm trying to do the whole project only using basic maps.*
I ended up creating constructors for different "types" of bacteria like
(defn new-bacterium [starting-position]
{:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
The "directed bacterium" has a new entry added to it. The :direction entry will be used to remember what direction it was heading in.
Here's the problem: I want to have one function take-turn that accepts the bacterium and the current state of the world, and returns a vector of [x, y] indicating the offset from the current position to move the bacterium to. I want to have a single function that's called because I can think right now of at least three kinds of bacteria that I'll want to have, and would like to have the ability to add new types later that each define their own take-turn.
A Can-Take-Turn protocol is out the window since I'm just using plain maps.
A take-turn multimethod seemed like it would work at first, but then I realized that I'd have no dispatch values to use in my current setup that would be extensible. I could have :direction be the dispatch function, and then dispatch on nil to use the "directed bacterium"'s take-turn, or default to get the base aimless behavior, but that doesn't give me a way of even having a third "player bacterium" type.
The only solution I can think of it to require that all bacterium have a :type field, and to dispatch on it, like:
(defn new-bacterium [starting-position]
{:type :aimless
:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :type :directed,
:direction starting-directions)))
(defmulti take-turn (fn [b _] (:type b)))
(defmethod take-turn :aimless [this world]
(println "Aimless turn!"))
(defmethod take-turn :directed [this world]
(println "Directed turn!"))
(take-turn (new-bacterium [0 0]) nil)
Aimless turn!
=> nil
(take-turn (new-directed-bacterium [0 0] nil) nil)
Directed turn!
=> nil
But now I'm back to basically dispatching on type, using a slower method than protocols. Is this a legitimate case to use records and protocols, or is there something about mutlimethods that I'm missing? I don't have a lot of practice with them.
* I also decided to try this because I was in the situation where I had a Bacterium record and wanted to create a new "directed" version of the record that had a single field direction added to it (inheritance basically). The original record implemented protocols though, and I didn't want to have to do something like nesting the original record in the new one, and routing all behavior to the nested instance. Every time I created a new type or changed a protocol, I would have to change all the routing, which was a lot of work.
You can use example-based multiple dispatch for this, as explained in this blog post. It is certainly not the most performant way to solve this problem, but arguably more flexible than multi-methods as it does not require you to declare a dispatch-method upfront. So it is open for extension to any data representation, even other things than maps. If you need performance, then multi-methods or protocols as you suggest, is probably the way to go.
First, you need to add a dependency on [bluebell/utils "1.5.0"] and require [bluebell.utils.ebmd :as ebmd]. Then you declare constructors for your data structures (copied from your question) and functions to test those data strucutres:
(defn new-bacterium [starting-position]
{:mass 0
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
(defn bacterium? [x]
(and (map? x)
(contains? x :position)))
(defn directed-bacterium? [x]
(and (bacterium? x)
(contains? x :direction)))
Now we are going to register those datastructures as so called arg-specs so that we can use them for dispatch:
(ebmd/def-arg-spec ::bacterium {:pred bacterium?
:pos [(new-bacterium [9 8])]
:neg [3 4]})
(ebmd/def-arg-spec ::directed-bacterium {:pred directed-bacterium?
:pos [(new-directed-bacterium [9 8] [3 4])]
:neg [(new-bacterium [3 4])]})
For each arg-spec, we need to declare a few example values under the :pos key, and a few non-examples under the :neg key. Those values are used to resolve the fact that a directed-bacterium is more specific than just a bacterium in order for the dispatch to work properly.
Finally, we are going to define a polymorphic take-turn function. We first declare it, using declare-poly:
(ebmd/declare-poly take-turn)
And then, we can provide different implementations for specific arguments:
(ebmd/def-poly take-turn [::bacterium x
::ebmd/any-arg world]
:aimless)
(ebmd/def-poly take-turn [::directed-bacterium x
::ebmd/any-arg world]
:directed)
Here, the ::ebmd/any-arg is an arg-spec that matches any argument. The above approach is open to extension just like multi-methods, but does not require you to declare a :type field upfront and is thus more flexible. But, as I said, it is also going to be slower than both multimethods and protocols, so ultimately this is a trade-off.
Here is the full solution: https://github.com/jonasseglare/bluebell-utils/blob/archive/2018-11-16-002/test/bluebell/utils/ebmd/bacteria_test.clj
Dispatching a multimethod by a :type field is indeed polymorphic dispatch that could be done with a protocol, but using multimethods allows you to dispatch on different fields. You can add a second multimethod that dispatches on something other than :type, which might be tricky to accomplish with a protocol (or even multiple protocols).
Since a multimethod can dispatch on anything, you could use a set as the dispatch value. Here's an alternative approach. It's not fully extensible, since the keys to select are determined within the dispatch function, but it might give you an idea for a better solution:
(defmulti take-turn (fn [b _] (clojure.set/intersection #{:direction} (set (keys b)))))
(defmethod take-turn #{} [this world]
(println "Aimless turn!"))
(defmethod take-turn #{:direction} [this world]
(println "Directed turn!"))
Fast paths exist for a reason, but Clojure doesn't stop you from doing anything you want to do, per say, including ad hoc predicate dispatch. The world is definitely your oyster. Observe this super quick and dirty example below.
First, we'll start off with an atom to store all of our polymorphic functions:
(def polies (atom {}))
In usage, the internal structure of the polies would look something like this:
{foo ; <- function name
{:dispatch [[pred0 fn0 1 ()] ; <- if (pred0 args) do (fn0 args)
[pred1 fn1 1 ()]
[pred2 fn2 2 '&]]
:prefer {:this-pred #{:that-pred :other-pred}}}
bar
{:dispatch [[pred0 fn0 1 ()]
[pred1 fn1 3 ()]]
:prefer {:some-pred #{:any-pred}}}}
Now, let's make it so that we can prefer predicates (like prefer-method):
(defn- get-parent [pfn x] (->> (parents x) (filter pfn) first))
(defn- in-this-or-parent-prefs? [poly v1 v2 f1 f2]
(if-let [p (-> #polies (get-in [poly :prefer v1]))]
(or (contains? p v2) (get-parent f1 v2) (get-parent f2 v1))))
(defn- default-sort [v1 v2]
(if (= v1 :poly/default)
1
(if (= v2 :poly/default)
-1
0)))
(defn- pref [poly v1 v2]
(if (-> poly (in-this-or-parent-prefs? v1 v2 #(pref poly v1 %) #(pref poly % v2)))
-1
(default-sort v1 v2)))
(defn- sort-disp [poly]
(swap! polies update-in [poly :dispatch] #(->> % (sort-by first (partial pref poly)) vec)))
(defn prefer [poly v1 v2]
(swap! polies update-in [poly :prefer v1] #(-> % (or #{}) (conj v2)))
(sort-disp poly)
nil)
Now, let's create our dispatch lookup system:
(defn- get-disp [poly filter-fn]
(-> #polies (get-in [poly :dispatch]) (->> (filter filter-fn)) first))
(defn- pred->disp [poly pred]
(get-disp poly #(-> % first (= pred))))
(defn- pred->poly-fn [poly pred]
(-> poly (pred->disp pred) second))
(defn- check-args-length [disp args]
((if (= '& (-> disp (nth 3) first)) >= =) (count args) (nth disp 2)))
(defn- args-are? [disp args]
(or (isa? (vec args) (first disp)) (isa? (mapv class args) (first disp))))
(defn- check-dispatch-on-args [disp args]
(if (-> disp first vector?)
(-> disp (args-are? args))
(-> disp first (apply args))))
(defn- disp*args? [disp args]
(and (check-args-length disp args)
(check-dispatch-on-args disp args)))
(defn- args->poly-fn [poly args]
(-> poly (get-disp #(disp*args? % args)) second))
Next, let's prepare our define macro with some initialization and setup functions:
(defn- poly-impl [poly args]
(if-let [poly-fn (-> poly (args->poly-fn args))]
(-> poly-fn (apply args))
(if-let [default-poly-fn (-> poly (pred->poly-fn :poly/default))]
(-> default-poly-fn (apply args))
(throw (ex-info (str "No poly for " poly " with " args) {})))))
(defn- remove-disp [poly pred]
(when-let [disp (pred->disp poly pred)]
(swap! polies update-in [poly :dispatch] #(->> % (remove #{disp}) vec))))
(defn- til& [args]
(count (take-while (partial not= '&) args)))
(defn- add-disp [poly poly-fn pred params]
(swap! polies update-in [poly :dispatch]
#(-> % (or []) (conj [pred poly-fn (til& params) (filter #{'&} params)]))))
(defn- setup-poly [poly poly-fn pred params]
(remove-disp poly pred)
(add-disp poly poly-fn pred params)
(sort-disp poly))
With that, we can finally build our polies by rubbing some macro juice on there:
(defmacro defpoly [poly-name pred params body]
`(do (when-not (-> ~poly-name quote resolve bound?)
(defn ~poly-name [& args#] (poly-impl ~poly-name args#)))
(let [poly-fn# (fn ~(symbol (str poly-name "-poly")) ~params ~body)]
(setup-poly ~poly-name poly-fn# ~pred (quote ~params)))
~poly-name))
Now you can build arbitrary predicate dispatch:
;; use defpoly like defmethod, but without a defmulti declaration
;; unlike defmethods, all params are passed to defpoly's predicate function
(defpoly myinc number? [x] (inc x))
(myinc 1)
;#_=> 2
(myinc "1")
;#_=> Execution error (ExceptionInfo) at user$poly_impl/invokeStatic (REPL:6).
;No poly for user$eval187$myinc__188#5c8eee0f with ("1")
(defpoly myinc :poly/default [x] (inc x))
(myinc "1")
;#_=> Execution error (ClassCastException) at user$eval245$fn__246/invoke (REPL:1).
;java.lang.String cannot be cast to java.lang.Number
(defpoly myinc string? [x] (inc (read-string x)))
(myinc "1")
;#_=> 2
(defpoly myinc
#(and (number? %1) (number? %2) (->> %& (filter (complement number?)) empty?))
[x y & z]
(inc (apply + x y z)))
(myinc 1 2 3)
;#_=> 7
(myinc 1 2 3 "4")
;#_=> Execution error (ArityException) at user$poly_impl/invokeStatic (REPL:5).
;Wrong number of args (4) passed to: user/eval523/fn--524
; ^ took the :poly/default path
And when using your example, we can see:
(defn new-bacterium [starting-position]
{:mass 0,
:position starting-position})
(defn new-directed-bacterium [starting-position starting-directions]
(-> (new-bacterium starting-position)
(assoc :direction starting-directions)))
(defpoly take-turn (fn [b _] (-> b keys set (contains? :direction)))
[this world]
(println "Directed turn!"))
;; or, if you'd rather use spec
(defpoly take-turn (fn [b _] (->> b (s/valid? (s/keys :req-un [::direction])))
[this world]
(println "Directed turn!"))
(take-turn (new-directed-bacterium [0 0] nil) nil)
;#_=> Directed turn!
;nil
(defpoly take-turn :poly/default [this world]
(println "Aimless turn!"))
(take-turn (new-bacterium [0 0]) nil)
;#_=> Aimless turn!
;nil
(defpoly take-turn #(-> %& first :show) [this world]
(println :this this :world world))
(take-turn (assoc (new-bacterium [0 0]) :show true) nil)
;#_=> :this {:mass 0, :position [0 0], :show true} :world nil
;nil
Now, let's try using isa? relationships, a la defmulti:
(derive java.util.Map ::collection)
(derive java.util.Collection ::collection)
;; always wrap classes in a vector to dispatch off of isa? relationships
(defpoly foo [::collection] [c] :a-collection)
(defpoly foo [String] [s] :a-string)
(foo [])
;#_=> :a-collection
(foo "bob")
;#_=> :a-string
And of course we can use prefer to disambiguate relationships:
(derive ::rect ::shape)
(defpoly bar [::rect ::shape] [x y] :rect-shape)
(defpoly bar [::shape ::rect] [x y] :shape-rect)
(bar ::rect ::rect)
;#_=> :rect-shape
(prefer bar [::shape ::rect] [::rect ::shape])
(bar ::rect ::rect)
;#_=> :shape-rect
Again, the world's your oyster! There's nothing stopping you from extending the language in any direction you want.

Idiom for transducing over an atom?

What is the idiomatic way to apply transducers to an atom's value?
This seems to do the job, but I'm unsure of correctness (and style ^^).
(let [xf1 (map inc)
xf2 (map #(+ % 2))
xf #(vec (eduction (comp xf2 xf1) %))
a (atom [1 2 3])]
(swap! a xf))
;=> [4 5 6]
(let [xf1 (map inc)
xf2 (map #(* % 2))
foo #(into [] (comp xf2 xf1) %)
a (atom [1 2 3])]
(swap! a foo))
;; => [3 5 7]
There are two things you need to take note.
comp in transducers works the opposite order as normal applications. That is, xf2 is applied prior to xf1. For each element, it is doubled then incremented.
eduction returns a sequence, so it's not the same type as your original value in the atom.

seq to vec conversion - Key must be integer

I want to get the indices of nil elements in a vector eg.
[1 nil 3 nil nil 4 3 nil] => [1 3 4 7]
(defn nil-indices [vec]
(vec (remove nil? (map
#(if (= (second %) nil) (first %))
(partition-all 2 (interleave (range (count vec)) vec)))))
)
Running this code results in
java.lang.IllegalArgumentException: Key must be integer
(NO_SOURCE_FILE:0)
If I leave out the (vec) call surrounding everything, it seems to work, but returns a sequence instead of a vector.
Thank you!
Try this instead:
(defn nil-indices [v]
(vec (remove nil? (map
#(if (= (second %) nil) (first %))
(partition-all 2 (interleave (range (count v)) v))))))
Clojure is a LISP-1: It has a single namespace for both functions and data, so when you called (vec ...), you were trying to pass your result sequence to your data as a parameter, not to the standard-library vec function.
See other answer for your problem (you are shadowing vec), but consider using a simpler approach.
map can take multiple arguments, in which case they are passed as additional arguments to the map function, e.g. (map f c1 c2 ...) calls (f (first c1) (first c2) ...) etc, until one of the sequence arguments is exhausted.
This means your (partition-all 2 (interleave ...)) is a very verbose way of saying (map list (range) v). There is also a function map-indexed which does the same thing. However, it only takes one sequence argument, so (map-indexed f c1 c2) is not legal.
Here is your function rewritten for clarity using map-indexed, threading, and nil?:
(defn nil-indices [v]
; Note: map fn called like (f range-item v-item)
; Not like (f (range-item v-item)) as in your code.
(->> (map-indexed #(when (nil? %2) %1) v) ;; like (map #(when ...) (range) v)
(remove nil?)
vec))
However, you can do this instead with reduction and the reduce-kv function. This function is like reduce, except the reduction function receives three arguments instead of two: the accumulator, the key of the item in the collection (index for vectors, key for maps), and the item itself. Using reduce-kv you can rewrite this function even more clearly (and it will probably run faster, especially with transients):
(defn nil-indices [v]
(reduce-kv #(if (nil? %3) (conj %1 %2) %1) [] v))

Clojure multimethod dispatching on functions and values

I have a function that returns the indexes in seq s at which value v exists:
(defn indexes-of [v s]
(map first (filter #(= v (last %)) (zipmap (range) s))))
What I'd like to do is extend this to apply any arbitrary function for the existence test. My idea is to use a multimethod, but I'm not sure exactly how to detect a function. I want to do this:
(defmulti indexes-of ???)
(defmethod indexes-of ??? [v s] ;; v is a function
(map first (filter v (zipmap (range) s))))
(defmethod indexes-of ??? [v s] ;; v is not a function
(indexes-of #(= v %) s))
Is a multimethod the way to go here? If so, how can I accomplish what I'm trying to do?
If you want to use a multimethod it should be on the filter function, which is the one changing according to the existence test type.
So
(defmulti filter-test (fn [value element]
(cond
(fn? value) :function
:else :value)))
(defmethod filter-test :function
[value element]
(apply value [element]))
(defmethod filter-test :value
[value element]
(= value element))
(defn indexes-of [v s]
(map first (filter #(filter-test v (last %)) (zipmap (range) s))))
Consider the JVM doesn't support first-class functions, or lambdas, out of the box, so there's no "function" data type to dispatch on, that's the reason the fn? test.
None the less the predicate solution proposed by noisesmith is the proper way to go in this situation IMO.
(defmulti indexes-of (fn [v _]
(if (fn? v)
:function
:value)))
(defmethod indexes-of :function
[f coll]
(keep-indexed (fn [i v] (when (f v) i)) coll))
(defmethod indexes-of :value
[v coll]
(indexes-of (partial = v) coll))
How about something simpler and more general:
(defn index-matches [predicate s]
(map first (filter (comp predicate second) (map vector (range) s))))
user> (index-matches even? (reverse (range 10)))
(1 3 5 7 9)
user> (index-matches #{3} [0 1 2 3 1 3 44 3 1 3])
(3 5 7 9)
thanks to a suggestion from lgrapenthin, this function is also now effective for lazy input:
user> (take 1 (index-matches #{300000} (range)))
(300000)

In Clojure, how to group elements?

In clojure, I want to aggregate this data:
(def data [[:morning :pear][:morning :mango][:evening :mango][:evening :pear]])
(group-by first data)
;{:morning [[:morning :pear][:morning :mango]],:evening [[:evening :mango][:evening :pear]]}
My problem is that :evening and :morning are redundant.
Instead, I would like to create the following collection:
([:morning (:pear :mango)] [:evening (:mango :pear)])
I came up with:
(for [[moment moment-fruit-vec] (group-by first data)] [moment (map second moment-fruit-vec)])
Is there a more idiomatic solution?
I've come across similar grouping problems. Usually I end up plugging merge-with or update-in into some seq processing step:
(apply merge-with list (map (partial apply hash-map) data))
You get a map, but this is just a seq of key-value pairs:
user> (apply merge-with list (map (partial apply hash-map) data))
{:morning (:pear :mango), :evening (:mango :pear)}
user> (seq *1)
([:morning (:pear :mango)] [:evening (:mango :pear)])
This solution only gets what you want if each key appears twice, however. This might be better:
(reduce (fn [map [x y]] (update-in map [x] #(cons y %))) {} data)
Both of these feel "more functional" but also feel a little convoluted. Don't be too quick to dismiss your solution, it's easy-to-understand and functional enough.
Don't be too quick to dismiss group-by, it has aggregated your data by the desired key and it hasn't changed the data. Any other function expecting a sequence of moment-fruit pairs will accept any value looked up in the map returned by group-by.
In terms of computing the summary my inclination was to reach for merge-with but for that I had to transform the input data into a sequence of maps and construct a "base-map" with the required keys and empty-vectors as values.
(let [i-maps (for [[moment fruit] data] {moment fruit})
base-map (into {}
(for [key (into #{} (map first data))]
[key []]))]
(apply merge-with conj base-map i-maps))
{:morning [:pear :mango], :evening [:mango :pear]}
Meditating on #mike t's answer, I've come up with:
(defn agg[x y] (if (coll? x) (cons y x) (list y x)))
(apply merge-with agg (map (partial apply hash-map) data))
This solution works also when the keys appear more than twice on data:
(apply merge-with agg (map (partial apply hash-map)
[[:morning :pear][:morning :mango][:evening :mango] [:evening :pear] [:evening :kiwi]]))
;{:morning (:mango :pear), :evening (:kiwi :pear :mango)}
maybe just modify the standard group-by a little bit:
(defn my-group-by
[fk fv coll]
(persistent!
(reduce
(fn [ret x]
(let [k (fk x)]
(assoc! ret k (conj (get ret k []) (fv x)))))
(transient {}) coll)))
then use it as:
(my-group-by first second data)