Clojure: construct function based on variables dynamically? - clojure

For the following data:
(def occurrence-data '(["John" "Artesyn" 1 31.0] ["Mike" "FlexPower" 2 31.0] ["John" "Eaton" 1 31.0]))
I would like to have a function:
(defn visit-numbers
"Produce a map from coordinates to number of customer visits from occurrence records."
[coordinates occurrences]
(let [selector ??? ; a function that would be equivalent to (juxt #(nth % c1) #(nth % c2) ..), where c1, c2, ... are elements of coordinates
]
(group-by selector occurrences)
))
For example, for coordinates = [1 3]
It should be
(group-by (juxt #(nth % 1) #(nth % 3)) occurrence-data)
I guess that this should be possible. I tried to use a list expression but have not figured it out yet.
My experiment was the following:
(def selector (list 'juxt '#(nth % 1) '#(nth % 3)))
(group-by selector occurrence-data)
Got error:
java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to clojure.lang.IFn
core.clj:6600 clojure.core/group-by[fn]
protocols.clj:143 clojure.core.protocols/fn
protocols.clj:19 clojure.core.protocols/fn[fn]
protocols.clj:31 clojure.core.protocols/seq-reduce
protocols.clj:48 clojure.core.protocols/fn
protocols.clj:13 clojure.core.protocols/fn[fn]
core.clj:6289 clojure.core/reduce
core.clj:6602 clojure.core/group-by
I have two problems to solve:
How do I make selector a function?
How do I dynamically construct such a function based on coordinates?
Thanks for your pointers and help!
I also guess that it might be possible to do this with a macro?
Or am I using too complicated a method to achieve my goal?

Simply call juxt directly to create your function, and define selector to hold that function:
(def selector (juxt #(nth % 1) #(nth % 3)))
To build it dynamically, write a function that creates the selector function:
(defn make-selector [& indexes] (apply juxt (map (fn[i] #(nth % i)) indexes)))
REPL example:
core> (def occurrence-data '(["John" "Artesyn" 1 31.0] ["Mike" "FlexPower" 2 31.0] ["John" "Eaton" 1 31.0]))
#'core/occurrence-data
core> (def selector (juxt #(nth % 1) #(nth % 3)))
#'core/selector
core> (group-by selector occurrence-data)
{["Artesyn" 31.0] [["John" "Artesyn" 1 31.0]], ["FlexPower" 31.0] [["Mike" "FlexPower" 2 31.0]], ["Eaton" 31.0] [["John" "Eaton" 1 31.0]]}
core> (group-by (make-selector 0 1 2) occurrence-data)
{["John" "Artesyn" 1] [["John" "Artesyn" 1 31.0]], ["Mike" "FlexPower" 2] [["Mike" "FlexPower" 2 31.0]], ["John" "Eaton" 1] [["John" "Eaton" 1 31.0]]}
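Putting the pieces together, here is a minimal sketch of the visit-numbers function from the question. It assumes that "number of customer visits" means the number of records sharing the selected values, which the question does not spell out. The name selector-for is hypothetical; unlike make-selector above it takes the coordinates as one collection, and it needs at least one coordinate because juxt requires at least one function.

```clojure
(def occurrence-data
  '(["John" "Artesyn" 1 31.0] ["Mike" "FlexPower" 2 31.0] ["John" "Eaton" 1 31.0]))

(defn selector-for
  "Hypothetical helper: build a function that picks the values at the given indexes."
  [indexes]
  (apply juxt (map (fn [i] #(nth % i)) indexes)))

(defn visit-numbers
  "Map from the values at coordinates to the number of records that share them.
  Assumption: a visit is counted once per record."
  [coordinates occurrences]
  (frequencies (map (selector-for coordinates) occurrences)))

(visit-numbers [2 3] occurrence-data)
;; => {[1 31.0] 2, [2 31.0] 1}
```

The quoted list in the question failed because '(juxt ...) is just data (a list of symbols), whereas (juxt ...) without the quote evaluates to an actual function.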

This is almost what clojure.set/index does:
(clojure.set/index occurrence-data [2 3])
;=>
; {{3 31.0, 2 2} #{["Mike" "FlexPower" 2 31.0]},
; {3 31.0, 2 1} #{["John" "Eaton" 1 31.0] ["John" "Artesyn" 1 31.0]}}
Here you can see, for example, that there are two records that share the same values at coordinates 2 and 3, those values being 1 and 31.0.
If you wanted to strip back out the indices and map to a count, then
(reduce-kv
(fn [a k v] (conj a {(vals k) (count v)}))
{}
(clojure.set/index occurrence-data [2 3]))
;=> {(31.0 1) 2, (31.0 2) 1}

Define
(defn group-by-indices [ns coll]
(group-by #(mapv % ns) coll))
then, for example,
(group-by-indices [1] occurrence-data)
;{["Artesyn"] [["John" "Artesyn" 1 31.0]],
; ["FlexPower"] [["Mike" "FlexPower" 2 31.0]],
; ["Eaton"] [["John" "Eaton" 1 31.0]]}
and
(group-by-indices [2 3] occurrence-data)
;{[1 31.0] [["John" "Artesyn" 1 31.0] ["John" "Eaton" 1 31.0]],
; [2 31.0] [["Mike" "FlexPower" 2 31.0]]}
If you want to keep the selection map, use select-keys instead of mapv. Then we're getting close to A.Webb's use of clojure.set/index, which is, other things being equal, the method of choice.
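As a rough sketch of the select-keys variant mentioned above (the name group-by-selection is made up for illustration): select-keys treats a vector as a map from index to value, so each group key becomes a small map that remembers which indices were selected.

```clojure
(def occurrence-data
  '(["John" "Artesyn" 1 31.0] ["Mike" "FlexPower" 2 31.0] ["John" "Eaton" 1 31.0]))

(defn group-by-selection
  "Like group-by-indices, but the group keys are index->value maps."
  [ns coll]
  (group-by #(select-keys % ns) coll))

(group-by-selection [2 3] occurrence-data)
;; => {{2 1, 3 31.0} [["John" "Artesyn" 1 31.0] ["John" "Eaton" 1 31.0]],
;;     {2 2, 3 31.0} [["Mike" "FlexPower" 2 31.0]]}
```

The difference from clojure.set/index is that the groups here are vectors in input order rather than sets.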

Related

Cumulative addition on a vector of maps in Clojure

I have a data set that looks like this
[{1 "a"} {2 "b"} {3 "c"}]
I want to transform it into a cumulative map that looks like
{1 "a" 3 "b" 6 "c"}
I think my current approach is long-winded. So far I have come up with
(reduce
(fn [sum item]
(assoc sum (+ (reduce + (keys sum))
(key (first item)))
(val (first item))))
split-map)
but the addition on the keys is incorrect. Does anyone know how I can improve on this?
and one more fun version:
(->> data
(reductions (fn [[sum] m] (update (first m) 0 + sum)) [0])
rest
(into {}))
;;=> {1 "a", 3 "b", 6 "c"}
The trick is that the reducing function operates on the previous and current key-value pairs, updating the current pair's key:
(reductions (fn [[sum] m] (update (first m) 0 + sum)) [0] data)
;;=> ([0] [1 "a"] [3 "b"] [6 "c"])
If you fancy transducers:
(require '[net.cgrand.xforms :as xf])
(let [data [{1 "a"} {2 "b"} {3 "c"}]]
(into {} (comp
(map first)
(xf/multiplex [(map last)
(comp (map first) (xf/reductions +) (drop 1))])
(partition-all 2)) data))
=> {1 "a", 3 "b", 6 "c"}
Here is a possible implementation of the computation and it makes extensive use of Clojure sequence functions:
(defn cumul-pairs [data]
(zipmap (rest (reductions ((map (comp key first)) +) 0 data))
(map (comp val first) data)))
(cumul-pairs [{1 "a"} {2 "b"} {3 "c"}])
;; => {1 "a", 3 "b", 6 "c"}
In this code, the expression (rest (reductions ((map (comp key first)) +) 0 data)) computes the keys of the resulting map and the expression (map (comp val first) data) computes the values. Then we combine them with zipmap. The function reductions works just like reduce but returns a sequence of all intermediate results instead of just the last one. The curious-looking subexpression ((map (comp key first)) +) is the reducing function: we use a mapping transducer to construct, from the + reducing function, a new reducing function that maps the input value before adding it.
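To make the transducer trick a little more concrete, here is a small sketch that names the reducing function (add-key is a name chosen only for this illustration) and calls it directly before handing it to reductions.

```clojure
;; A mapping transducer applied to + yields a reducing function that
;; first extracts the key of the single-entry map, then adds it.
(def add-key ((map (comp key first)) +))

(add-key 10 {5 "x"})
;; => 15

(reductions add-key 0 [{1 "a"} {2 "b"} {3 "c"}])
;; => (0 1 3 6)
```

The leading 0 is the initial value, which is why cumul-pairs drops it with rest.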
This is a bit of an awkward problem to solve succinctly. Here is one way:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(let-spy
[x1 [{1 "a"} {2 "b"} {3 "c"}]
nums (mapv #(first (first %)) x1)
chars (mapv #(second (first %)) x1)
nums-cum (reductions + nums)
pairs (mapv vector nums-cum chars) ; these 2 lines are
result (into {} pairs)] ; like `zipmap`
(is= result {1 "a", 3 "b", 6 "c"})))
By using my favorite template project
we are able to use let-spy from the Tupelo library
and see the results printed at each step:
-----------------------------------
Clojure 1.10.3 Java 15.0.2
-----------------------------------
Testing tst.demo.core
x1 => [{1 "a"} {2 "b"} {3 "c"}]
nums => [1 2 3]
chars => ["a" "b" "c"]
nums-cum => (1 3 6)
pairs => [[1 "a"] [3 "b"] [6 "c"]]
result => {1 "a", 3 "b", 6 "c"}
Ran 2 tests containing 1 assertions.
0 failures, 0 errors.
Once it is working with all unit tests, just trim off the -spy part to leave a normal (let ...) form.
Be sure to see this list of documentation sources, especially the Clojure CheatSheet.
Probably the easiest (most readable) version:
(def ml [{1 "a"} {2 "b"} {3 "c"}])
(defn cumsum [l] (reductions + l))
(let [m (into (sorted-map) ml)]
(zipmap (cumsum (keys m)) (vals m)))
;; => {1 "a", 3 "b", 6 "c"}
How about this?
(defn f [v]
(zipmap (reductions + (mapcat keys v)) (mapcat vals v)))
which works with the original vector of maps:
(f [{1 "a"} {2 "b"} {3 "c"}])
;; => {1 "a", 3 "b", 6 "c"}
.. and also with maps of varying length:
(f [{1 "a"} {2 "b"} {3 "c" 4 "d"} {5 "e" 6 "f" 7 "g"}])
;; => {1 "a", 3 "b", 6 "c", 10 "d", 15 "e", 21 "f", 28 "g"}
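For comparison with the reduce attempt in the question, here is one possible sketch that stays with a single reduce. Instead of re-summing the keys of the result on every step, it carries the running total alongside the result map. It assumes every input map has exactly one entry, and the name cumulative-map is made up for this example.

```clojure
(defn cumulative-map
  "Reduce over single-entry maps, carrying [result running-total]."
  [maps]
  (first
    (reduce (fn [[acc total] m]
              (let [[k v] (first m)      ; the only entry of m
                    total (+ total k)]   ; new running total
                [(assoc acc total v) total]))
            [{} 0]
            maps)))

(cumulative-map [{1 "a"} {2 "b"} {3 "c"}])
;; => {1 "a", 3 "b", 6 "c"}
```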

Clojure nested for loop with index

I've been trying to idiomatically loop through a nested vector like below:
[[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]]
I also need to return the coordinates once I've found a value.
E.g. the call (find-key-value 3) should return [1 2].
This is what I have so far, but it's not giving me the output that I need: it returns ([] [] [] [] [] [1 2] [] [] []), whereas I only need [1 2].
(defn find-key-value
[array value]
(for [x (range 0 (count array))]
(loop [y 0
ret []]
(cond
(= y (count (nth array x))) [x y]
:else (if (= value (get-in array [x y]))
(recur (+ 1 y) (conj ret [x y]))
(recur (+ 1 y) ret))))))
Does anyone have any ideas on how I can fix my code to get my desired result, or have a better approach in mind?
A list comprehension can be used to find coordinates of all values satisfying a predicate:
(defn find-locs [pred coll]
(for [[i vals] (map-indexed vector coll)
[j val] (map-indexed vector vals)
:when (pred val)]
[i j]))
(find-locs #(= 3 %) [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])
=> ([1 5])
(find-locs zero? [[0 1 1] [1 1 1] [1 0 1]])
=> ([0 0] [2 1])
The posed question seems to imply that the keywords in the inputs should be ignored, in which case the answer becomes:
(defn find-locs-ignore-keyword [pred coll]
(for [[i vals] (map-indexed vector coll)
[j val] (map-indexed vector (remove keyword? vals))
:when (pred val)]
[i j]))
(find-locs-ignore-keyword #(= 3 %) [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])
=> ([1 2])
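Since the question asks for a single coordinate rather than a list, a minimal sketch is to take the first element of that lazy result. The function find-first-loc is a hypothetical wrapper, and the comprehension is repeated here (with row and x instead of vals and val, to avoid shadowing core functions) so the snippet runs on its own.

```clojure
(defn find-locs-ignore-keyword [pred coll]
  (for [[i row] (map-indexed vector coll)
        [j x]   (map-indexed vector (remove keyword? row))
        :when (pred x)]
    [i j]))

(defn find-first-loc
  "Return the first matching [row col], or nil when nothing matches."
  [pred coll]
  (first (find-locs-ignore-keyword pred coll)))

(def data [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])

(find-first-loc #(= 3 %) data)
;; => [1 2]
(find-first-loc #(= 99 %) data)
;; => nil
```

Because for is lazy, first avoids scanning the whole input in most cases, although chunked sequences may still evaluate a few extra elements.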
There is a function in clojure.core that suits this task exactly: keep-indexed. It is essentially an indexed map plus a filter that drops nil results:
(defn find-val-idx [v data]
(ffirst (keep-indexed
(fn [i row]
(seq (keep-indexed
(fn [j [_ x]] (when (= v x) [i j]))
(partition 2 row))))
data)))
user> (find-val-idx 3 [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])
;;=> [1 2]
user> (find-val-idx 10 [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])
;;=> nil
user> (find-val-idx 1 [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]])
;;=> [0 0]
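As a small illustration of the "indexed map plus filter" description, the following sketch shows keep-indexed next to an equivalent spelled out with map-indexed and remove. The names odd-positions and odd-positions-2 exist only for this example.

```clojure
;; keep-indexed calls the function with index and item and keeps non-nil results.
(def odd-positions
  (keep-indexed (fn [i x] (when (odd? x) i)) [2 3 4 5]))
;; odd-positions => (1 3)

;; The same thing written as map-indexed followed by removing nils.
(def odd-positions-2
  (remove nil? (map-indexed (fn [i x] (when (odd? x) i)) [2 3 4 5])))
;; odd-positions-2 => (1 3)
```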
There is a map-indexed that is sometimes helpful. See the Clojure Cheatsheet and other docs listed here.
==> Could you please edit the question to clarify the search conditions?
Here is an outline of what you could do to search for the desired answer:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn coords
[data pred]
(let [result (atom [])]
(doseq [row (range (count data))
col (range (count (first data)))]
(let [elem (get-in data [row col])
keeper? (pred elem)]
(when keeper?
(swap! result conj [row col]))))
(deref result)))
(dotest
(let [data [[11 12 13]
[21 22 23]
[31 32 33]]
ends-in-2? (fn [x] (= 2 (mod x 10)))] ; true when the last decimal digit is 2
(is= (coords data ends-in-2?)
[[0 1]
[1 1]
[2 1]])))
It is based on the same template project as the docs. There are many variations (for example, you could use reduce instead of an atom).
Please review the docs listed above.
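The reduce variation mentioned above could look roughly like this. It is only a sketch: coords-reduce is a made-up name, and unlike the atom version it takes each row's own length, so ragged rows are allowed.

```clojure
(defn coords-reduce
  "Collect [row col] for every element of a 2D vector that satisfies pred."
  [data pred]
  (reduce (fn [acc [row col]]
            (if (pred (get-in data [row col]))
              (conj acc [row col])
              acc))
          []
          ;; every [row col] pair, in row-major order
          (for [row (range (count data))
                col (range (count (nth data row)))]
            [row col])))

(coords-reduce [[11 12 13]
                [21 22 23]
                [31 32 33]]
               #(= 2 (mod % 10)))
;; => [[0 1] [1 1] [2 1]]
```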
(defn vec-to-map [v] (into {} (into [] (map vec (partition 2 v)))))
(defn vec-vals [v] (vals (vec-to-map v)))
(defn map-vec-index [v el] (.indexOf (vec-vals v) el))
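(defn find-val-coord
([arr val] (find-val-coord arr val 0))
([arr val counter]
(when (seq arr) ; return nil once every row has been checked
(let [row (first arr)
idx (map-vec-index row val)]
(cond (<= 0 idx) [counter idx] ; .indexOf returns -1 when absent, so 0 is a valid hit
:else (recur (rest arr) val (inc counter)))))))
(def arr [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]]) ; the question's data
(find-val-coord arr 3) ;; => [1 2]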
We can also write functions to pick value or corresponding key
from array when coordinate is given:
(defn vec-keys [v] (keys (vec-to-map v)))
(defn get-val-coord [arr coord]
(nth (vec-vals (nth arr (first coord))) (second coord)))
(defn get-key-coord [arr coord]
(nth (vec-keys (nth arr (first coord))) (second coord)))
(get-val-coord arr [1 2]) ;; => 3
(get-key-coord arr [1 2]) ;; => :c
I might be over-engineering this answer slightly, but here is a non-recursive and non-lazy approach based on a single loop that will work for arbitrary and mixed levels of nesting and won't suffer from stack overflow due to recursion:
(defn find-key-value [array value]
(loop [remain [[[] array]]]
(if (empty? remain)
nil
(let [[[path x] & remain] remain]
(cond (= x value) path
(sequential? x)
(recur (into remain
(comp (remove keyword?)
(map-indexed (fn [i x] [(conj path i) x])))
x))
:default (recur remain))))))
(find-key-value [[:a 1 :b 1 :c 1] [:a 1 :b 1 :c 3] [:a 1 :b 1 :c 1]] 3)
;; => [1 2]
(find-key-value [[:a 1 [[[[[:c]]]] [[[9 [[[3]] :k]] 119]]]] [:a [[[1]]] :b 1]] 3)
;; => [0 1 1 0 0 1 0 0 0]
(find-key-value (last (take 20000 (iterate vector 3))) 3)
;; => [0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
A simpler solution, assuming 2D array where the inner vectors are
key value vectors, uses flattening of the 2D array and .indexOf.
(defn find-coord [arr val]
(let [m (count (first arr))
idx (.indexOf (flatten arr) val)]
[(quot idx m) (quot (dec (mod idx m)) 2)]))
(find-coord arr 3) ;;=> [1 2]

Partitioning across partitions in Clojure?

Here are some values. Each is a sequence of ascending (or otherwise grouped) values.
(def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
I can partition them each by value.
=> (map (partial partition-by first) input-vals)
((([1 :a] [1 :b]) ([2 :c]) ([3 :d] [3 :e])) (([1 :f]) ([2 :g] [2 :h] [2 :i]) ([3 :j] [3 :k])) (([1 :l]) ([3 :m])))
But that gets me 3 sequences of partitions. I want one single sequence of partitioned groups.
What I want to do is return a single lazy sequence of (potentially) lazy sequences that are the respective partitions joined. e.g. I want to produce this:
((([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))
Note that not all values appear in all sequences (there is no 2 in the third vector).
This is of course a simplification of my problem. The real data is a set of lazy streams coming from very large files, so nothing can be realised. But I think the solution for the above question is the solution for my problem.
Feel free to edit the title, I wasn't quite sure how to express it.
Try this horror:
(defn partition-many-by [f comp-f s]
(let [sorted-s (sort-by first comp-f s)
first-list (first (drop-while (complement seq) sorted-s))
match-val (f (first first-list))
remains (filter #(not (empty? %))
(map #(drop-while (fn [ss] (= match-val (f ss))) %)
sorted-s))]
(when match-val
(cons
(apply concat
(map #(take-while (fn [ss] (= match-val (f ss))) %)
sorted-s))
(lazy-seq (partition-many-by f comp-f remains))))))
It could possibly be improved to remove the double value check (take-while and drop-while).
Example usage:
(partition-many-by identity compare [[1 1 1 1 2 2 3 3 3 3] [1 1 2 2 2 2 3] [3]])
=> ((1 1 1 1 1 1) (2 2 2 2 2 2) (3 3 3 3 3 3))
Let's make this interesting and use sequences of infinite length for our input
(def twos (iterate #(+ 2 %) 0))
(def threes (iterate #(+ 3 %) 0))
(def fives (iterate #(+ 5 %) 0))
We'll need to lazily merge them. Let's ask for a comparator so we can apply to other data types as well.
(defn lazy-merge-by
([compfn xs ys]
(lazy-seq
(cond
(empty? xs) ys
(empty? ys) xs
:else (if (compfn (first xs) (first ys))
(cons (first xs) (lazy-merge-by compfn (rest xs) ys))
(cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
([compfn xs ys & more]
(apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))
Test
(take 15 (lazy-merge-by < twos threes fives))
;=> (0 0 0 2 3 4 5 6 6 8 9 10 10 12 12)
We can (lazily) partition by value if desired
(take 10 (partition-by identity (lazy-merge-by < twos threes fives)))
;=> ((0 0 0) (2) (3) (4) (5) (6 6) (8) (9) (10 10) (12 12))
Now, back to the sample input
(partition-by first (apply lazy-merge-by #(<= (first %) (first %2)) input-vals))
;=> (([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))
as desired less one extraneous set of outer parentheses.
I'm not sure whether I'm following, but you can flatten the result sequence, something like:
(flatten (partition-by identity (first input-vals)))
clojure.core/flatten
([x])
Takes any nested combination of sequential things (lists, vectors,
etc.) and returns their contents as a single, flat sequence.
(flatten nil) returns an empty sequence.
You can use the realized? function to test whether a lazy sequence has already been realized.
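A short sketch of what realized? reports may help here: it says whether a lazy sequence has been forced yet, not whether a value is lazy in the first place. For the latter, an instance? check against clojure.lang.LazySeq is one option. The name s is just an example binding.

```clojure
(def s (map inc [1 2 3]))   ; map returns an unrealized lazy seq

(realized? s)
;; => false

(first s)                   ; forces (at least) the first element

(realized? s)
;; => true

(instance? clojure.lang.LazySeq s)
;; => true
(instance? clojure.lang.LazySeq [1 2 3])
;; => false
```

Calling realized? on something that is not pending, such as a plain vector, throws rather than returning false.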
user> (def desired-result '((([1 :a] [1 :b] [1 :f] [1 :l])
([2 :c] [2 :g] [2 :h] [2 :i])
([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))))
#'user/desired-result
user> (def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
#'user/input-vals
user> (= desired-result (vector (vals (group-by first (apply concat input-vals)))))
true
I changed the input-vals slightly to correct for what I assume was a typographical error, if it was not an error I can update my code to accommodate the less regular structure.
Using the ->> (thread last) macro, we can have the equivalent code in a more readable form:
user> (= desired-result
(->> input-vals
(apply concat)
(group-by first)
vals
vector))
true
(partition-by first (sort-by first (mapcat identity input-vals)))

Apply a list of functions to a corresponding list of data in Clojure

So I have a list of functions and a list of data:
[fn1 fn2 fn3] [item1 item2 item3]
What can I do to apply each function to its corresponding data item:
[(fn1 item1) (fn2 item2) (fn3 item3)]
Example:
[str #(* 2 %) (partial inc)] [3 5 8]
=> ["3" 10 9]
You can use map
(map #(%1 %2) [str #(* 2 %) (partial inc)] [3 5 8])
("3" 10 9)
If you need a vector back, you can (apply vector ...)
(apply vector (map #(%1 %2) [str #(* 2 %) (partial inc)] [3 5 8]))
["3" 10 9]
Disclaimer: I don't know much Clojure, so there would probably be better ways to do this.
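If a vector is what you want, mapv does the same in one step. A small sketch (applied and shorter are names used only here); note that, like map, it stops at the shorter of the two collections.

```clojure
;; mapv is map that eagerly returns a vector.
(def applied (mapv #(%1 %2) [str #(* 2 %) inc] [3 5 8]))
;; applied => ["3" 10 9]

;; With unequal lengths, the extra data items are ignored.
(def shorter (mapv #(%1 %2) [inc dec] [1 2 3]))
;; shorter => [2 1]
```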
An alternative, not necessarily better:
user=> (for [[f x] (map vector [neg? pos? number?] [1 2 "foo"])]
#_=> (f x))
(false true false)
To make the map version suitable for varargs:
user=> (map (fn [f & args] (apply f args)) [+ - *] [1 2 3] [4 5 6] [7 8 9])
(12 -11 162)

How to remove multiple items from a list?

I have a list [2 3 5] which I want to use to remove items from another list like [1 2 3 4 5], so that I get [1 4].
thanks
Try this:
(let [a [1 2 3 4 5]
b [2 3 5]]
(remove (set b) a))
which returns (1 4).
The remove function, by the way, takes a predicate and a collection, and returns a sequence of the elements that don't satisfy the predicate (a set, in this example).
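To illustrate why a set works as the predicate: calling a set with an argument returns the element if it is a member and nil otherwise. The sketch below also shows one caveat that is easy to miss, namely that this does not work for removing nil or false, in which case contains? is a safer choice.

```clojure
(#{2 3 5} 3)
;; => 3   (truthy, so remove drops it)
(#{2 3 5} 4)
;; => nil (falsy, so remove keeps it)

;; Caveat: looking up nil or false returns a falsy value even when present.
(remove #{nil false} [1 nil 2 false])
;; => (1 nil 2 false)

(remove #(contains? #{nil false} %) [1 nil 2 false])
;; => (1 2)
```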
user=> (use 'clojure.set)
nil
user=> (difference (set [1 2 3 4 5]) (set [2 3 5]))
#{1 4}
Reference:
http://clojure.org/data_structures#toc22
http://clojure.org/api#difference
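One thing to keep in mind when choosing between the two approaches: going through sets discards order and duplicates, while remove keeps both. A quick sketch with made-up input (xs is just an example name):

```clojure
(require '[clojure.set :as set])

(def xs [5 1 1 4 2])

;; remove keeps the original order and any duplicates.
(remove #{2 5} xs)
;; => (1 1 4)

;; difference works on sets, so duplicates and order are lost.
(set/difference (set xs) #{2 5})
;; => #{1 4}
```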
You can do this yourself with something like:
(def a [2 3 5])
(def b [1 2 3 4 5])
(defn seq-contains?
[coll target] (some #(= target %) coll))
(filter #(not (seq-contains? a %)) b)
; (1 4)
A version based on the reducers library could be:
(require '[clojure.core.reducers :as r])
(defn seq-contains?
[coll target]
(some #(= target %) coll))
(defn my-remove
"Remove from seq a the values that are present in seq b."
[a b]
(into [] (r/filter #(not (seq-contains? b %)) a)))
(my-remove [1 2 3 4 5] [2 3 5] )
; [1 4]
EDIT Added seq-contains? code
Here is my take without using sets:
(defn my-diff-func [X Y]
(reduce #(remove (fn [x] (= x %2)) %1) X Y ))