Partitioning across partitions in Clojure? - clojure

Here are some values. Each is a sequence of ascending (or otherwise grouped) values.
(def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
I can partition them each by value.
=> (map (partial partition-by first) input-vals)
((([1 :a] [1 :b]) ([2 :c]) ([3 :d] [3 :e])) (([1 :f]) ([2 :g] [2 :h] [2 :i]) ([3 :j] [3 :k])) (([1 :l]) ([3 :m])))
But that gets me 3 sequences of partitions. I want one single sequence of partitioned groups.
What I want to do is return a single lazy sequence of (potentially) lazy sequences that are the respective partitions joined. e.g. I want to produce this:
((([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))
Note that not all values appear in all sequences (there is no 2 in the third vector).
This is of course a simplification of my problem. The real data is a set of lazy streams coming from very large files, so nothing can be realised. But I think the solution for the above question is the solution for my problem.
Feel free to edit the title, I wasn't quite sure how to express it.

Try this horror:
(defn partition-many-by [f comp-f s]
(let [sorted-s (sort-by first comp-f s)
first-list (first (drop-while (complement seq) sorted-s))
match-val (f (first first-list))
remains (filter #(not (empty? %))
(map #(drop-while (fn [ss] (= match-val (f ss))) %)
sorted-s))]
(when match-val
(cons
(apply concat
(map #(take-while (fn [ss] (= match-val (f ss))) %)
sorted-s))
(lazy-seq (partition-many-by f comp-f remains))))))
It could possibly be improved to remove the double value check (take-while and drop-while).
Example usage:
(partition-many-by identity [[1 1 1 1 2 2 3 3 3 3] [1 1 2 2 2 2 3] [3]])
=> ((1 1 1 1 1 1) (2 2 2 2 2 2) (3 3 3 3 3 3))

Let's make this interesting and use sequences of infinite length for our input
(def twos (iterate #(+ 2 %) 0))
(def threes (iterate #(+ 3 %) 0))
(def fives (iterate #(+ 5 %) 0))
We'll need to lazily merge them. Let's ask for a comparator so we can apply to other data types as well.
(defn lazy-merge-by
([compfn xs ys]
(lazy-seq
(cond
(empty? xs) ys
(empty? ys) xs
:else (if (compfn (first xs) (first ys))
(cons (first xs) (lazy-merge-by compfn (rest xs) ys))
(cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
([compfn xs ys & more]
(apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))
Test
(take 15 (lazy-merge-by < twos threes fives))
;=> (0 0 0 2 3 4 5 6 6 8 9 10 10 12 12)
We can (lazily) partition by value if desired
(take 10 (partition-by identity (lazy-merge-by < twos threes fives)))
;=> ((0 0 0) (2) (3) (4) (5) (6 6) (8) (9) (10 10) (12 12))
Now, back to the sample input
(partition-by first (apply lazy-merge-by #(<= (first %) (first %2)) input-vals))
;=> (([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))
as desired less one extraneous set of outer parentheses.

I'm not sure whether I'm following but you can faltten the result sequence, something like:
(flatten (partition-by identity (first input-vals)))
clojure.core/flatten
([x])
Takes any nested combination of sequential things (lists, vectors,
etc.) and returns their contents as a single, flat sequence.
(flatten nil) returns an empty sequence.
You can use realized? function to test whether a sequence is lazy or not.

user> (def desired-result '((([1 :a] [1 :b] [1 :f] [1 :l])
([2 :c] [2 :g] [2 :h] [2 :i])
([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))))
#'user/desired-result
user> (def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
#'user/input-vals
user> (= desired-result (vector (vals (group-by first (apply concat input-vals)))))
true
I changed the input-vals slightly to correct for what I assume was a typographical error, if it was not an error I can update my code to accommodate the less regular structure.
Using the ->> (thread last) macro, we can have the equivalent code in a more readable form:
user> (= desired-result
(->> input-vals
(apply concat)
(group-by first)
vals
vector))
true

(partition-by first (sort-by first (mapcat identity input-vals)))

Related

Clojure - Function that returns all the indices of a vector of vectors

If I have a vector [[[1 2 3] [4 5 6] [7 8 9]] [[10 11] [12 13]] [[14] [15]]]
How can I return the positions of each element in the vector?
For example 1 has index [0 0 0], 2 has index [0 0 1], etc
I want something like
(some-fn [[[1 2 3] [4 5 6] [7 8 9]] [[10 11] [12 13]] [[14] [15]]] 1)
=> [0 0 0]
I know that if I have a vector [1 2 3 4], I can do (.indexOf [1 2 3 4] 1) => 0 but how can I extend this to vectors within vectors.
Thanks
and one more solution with zippers:
(require '[clojure.zip :as z])
(defn find-in-vec [x data]
(loop [curr (z/vector-zip data)]
(cond (z/end? curr) nil
(= x (z/node curr)) (let [path (rseq (conj (z/path curr) x))]
(reverse (map #(.indexOf %2 %1) path (rest path))))
:else (recur (z/next curr)))))
user> (find-in-vec 11 data)
(1 0 1)
user> (find-in-vec 12 data)
(1 1 0)
user> (find-in-vec 18 data)
nil
user> (find-in-vec 8 data)
(0 2 1)
the idea is to make a depth-first search for an item, and then reconstruct a path to it, indexing it.
Maybe something like this.
Unlike Asthor's answer it works for any nesting depth (until it runs out of stack). Their answer will give the indices of all items that match, while mine will return the first one. Which one you want depends on the specific use-case.
(defn indexed [coll]
(map-indexed vector coll))
(defn nested-index-of [coll target]
(letfn [(step [indices coll]
(reduce (fn [_ [i x]]
(if (sequential? x)
(when-let [result (step (conj indices i) x)]
(reduced result))
(when (= x target)
(reduced (conj indices i)))))
nil, (indexed coll)))]
(step [] coll)))
(def x [[[1 2 3] [4 5 6] [7 8 9]] [[10 11] [12 13]] [[14] [15]]])
(nested-index-of x 2) ;=> [0 0 1]
(nested-index-of x 15) ;=> [2 1 0]
Edit: Target never changes, so the inner step fn doesn't need it as an argument.
Edit 2: Cause I'm procrastinating here, and recursion is a nice puzzle, maybe you wanted the indices of all matches.
You can tweak my first function slightly to carry around an accumulator.
(defn nested-indices-of [coll target]
(letfn [(step [indices acc coll]
(reduce (fn [acc [i x]]
(if (sequential? x)
(step (conj indices i) acc x)
(if (= x target)
(conj acc (conj indices i))
acc)))
acc, (indexed coll)))]
(step [] [] coll)))
(def y [[[1 2 3] [4 5 6] [7 8 9]] [[10 11] [12 13]] [[14] [15 [16 17 4]]]])
(nested-indices-of y 4) ;=> [[0 1 0] [2 1 1 2]]
Vectors within vectors are no different to ints within vectors:
(.indexOf [[[1 2 3] [4 5 6] [7 8 9]] [[10 11] [12 13]] [[14] [15]]] [[14] [15]])
;;=> 2
The above might be a bit difficult to read, but [[14] [15]] is the third element.
Something like
(defn indexer [vec number]
(for [[x set1] (map-indexed vector vec)
[y set2] (map-indexed vector set1)
[z val] (map-indexed vector set2)
:when (= number val)]
[x y z]))
Written directly into here so not tested. Giving more context on what this would be used for might make it easier to give a good answer as this feels like something you shouldn't end up doing in Clojure.
You can also try and flatten the vectors in some way
An other solution to find the path of every occurrences of a given number.
Usually with functional programming you can go for broader, general, elegant, bite size solution. You will always be able to optimize using language constructs or techniques as you need (tail recursion, use of accumulator, use of lazy-seq, etc)
(defn indexes-of-value [v coll]
(into []
(comp (map-indexed #(if (== v %2) %1))
(remove nil?))
coll))
(defn coord' [v path coll]
(cond
;; node is a leaf: empty or coll of numbers
(or (empty? coll)
(number? (first coll)))
(when-let [indexes (seq (indexes-of-value v coll))]
(map #(conj path %) indexes))
;; node is branch: a coll of colls
(coll? (first coll))
(seq (sequence (comp (map-indexed vector)
(mapcat #(coord' v (conj path (first %)) (second %))))
coll))))
(defn coords [v coll] (coord' v [] coll))
Execution examples:
(def coll [[2 1] [] [7 8 9] [[] [1 2 2 3 2]]])
(coords 2 coll)
=> ([0 0] [3 1 1] [3 1 2] [3 1 4])
As a bonus you can write a function to test if paths are all valid:
(defn valid-coords? [v coll coords]
(->> coords
(map #(get-in coll %))
(remove #(== v %))
empty?))
and try the solution with input generated with clojure.spec:
(s/def ::leaf-vec (s/coll-of nat-int? :kind vector?))
(s/def ::branch-vec (s/or :branch (s/coll-of ::branch-vec :kind vector?
:min-count 1)
:leaf ::leaf-vec))
(let [v 1
coll (first (gen/sample (s/gen ::branch-vec) 1))
res (coords v coll)]
(println "generated coll: " coll)
(if-not (valid-coords? v coll res)
(println "Error:" res)
:ok))
Here is a function that can recursively search for a target value, keeping track of the indexes as it goes:
(ns tst.clj.core
(:use clj.core tupelo.test)
(:require [tupelo.core :as t] ))
(t/refer-tupelo)
(defn index-impl
[idxs data tgt]
(apply glue
(for [[idx val] (zip (range (count data)) data)]
(let [idxs-curr (append idxs idx)]
(if (sequential? val)
(index-impl idxs-curr val tgt)
(if (= val tgt)
[{:idxs idxs-curr :val val}]
[nil]))))))
(defn index [data tgt]
(keep-if not-nil? (index-impl [] data tgt)))
(dotest
(let [data-1 [1 2 3]
data-2 [[1 2 3]
[10 11]
[]]
data-3 [[[1 2 3]
[4 5 6]
[7 8 9]]
[[10 11]
[12 13]]
[[20]
[21]]
[[30]]
[[]]]
]
(spyx (index data-1 2))
(spyx (index data-2 10))
(spyx (index data-3 13))
(spyx (index data-3 21))
(spyx (index data-3 99))
))
with results:
(index data-1 2) => [{:idxs [1], :val 2}]
(index data-2 10) => [{:idxs [1 0], :val 10}]
(index data-3 13) => [{:idxs [1 1 1], :val 13}]
(index data-3 21) => [{:idxs [2 1 0], :val 21}]
(index data-3 99) => []
If we add repeated values we get the following:
data-4 [[[1 2 3]
[4 5 6]
[7 8 9]]
[[10 11]
[12 2]]
[[20]
[21]]
[[30]]
[[2]]]
(index data-4 2) => [{:idxs [0 0 1], :val 2}
{:idxs [1 1 1], :val 2}
{:idxs [4 0 0], :val 2}]

Clojure; select all nth element from list of lists with unequal size, for n = 1, 2,

I'd like to have a function, such that,
(f '([1 4 7] [2 5 9] [3 6]))
would give
([1 2 3] [4 5 6] [7 9])
I tried
(apply map vector '([1 4 7] [2 5 9] [3 6]))
would only produce:
([1 2 3] [4 5 6])
I find it hard to describe my requirements that it's difficult for me to search for a ready solution.
Please help me either to improve my description, or pointer to a solution.
Thanks in advance!
I'd solve a more general problem which means you might reuse that function in the future. I'd change map so that it keeps going past the smallest map.
(defn map-all
"Like map but if given multiple collections will call the function f
with as many arguments as there are elements still left."
([f] (map f))
([f coll] (map f coll))
([f c1 & colls]
(let [step (fn step [cs]
(lazy-seq
(let [ss (keep seq cs)]
(when (seq ss)
(cons (map first ss)
(step (map rest ss)))))))]
(map #(apply f %) (step (conj colls c1))))))
(apply map-all vector '([1 4 7] [2 5 9] [3 6]))
(apply map-all vector '([1 false 7] [nil 5 9] [3 6] [8]))
Note, that as opposed to many other solutions, this one works fine even if any of the sequences contain nil or false.
or this way with loop/recur:
user> (defn transpose-all-2 [colls]
(loop [colls colls res []]
(if-let [colls (seq (filter seq colls))]
(recur (doall (map next colls))
(conj res (mapv first colls)))
res)))
#'user/transpose-all-2
user> (transpose-all-2 x)
[[1 2 3] [4 5 6] [7 9]]
user> (transpose-all-2 '((0 1 2 3) (4 5 6 7) (8 9)))
[[0 4 8] [1 5 9] [2 6] [3 7]]
If you know the maximum length of the vectors ahead of time, you could define
(defn tx [colls]
(lazy-seq
(cons (filterv identity (map first colls))
(tx (map rest colls)))))
then
(take 3 (tx '([1 4 7] [2 5 9] [3 6])))
A simple solution is
(defn transpose-all
[colls]
(lazy-seq
(let [ss (keep seq colls)]
(when (seq ss)
(cons (map first ss) (transpose-all (map rest ss)))))))
For example,
(transpose-all '([1 4 7] [2 5 9] [3 6] [11 12 13 14]))
;((1 2 3 11) (4 5 6 12) (7 9 13) (14))
Here is my own attempt:
(defn f [l]
(let [max-count (apply max (map count l))
l-patched (map (fn [e] (if (< (count e) max-count)
(concat e (take (- max-count (count e)) (repeat nil)))
e)) l)]
(map (fn [x] (filter identity x)) (apply map vector l-patched))
))
Another simple solution:
(->> jagged-list
(map #(concat % (repeat nil)))
(apply map vector)
(take-while (partial some identity)))
A jagged-list like this
'([1 4 7 ]
[2 5 9 ]
[3 6 ]
[11 12 13 14])
will produce:
'([1 2 3 11]
[4 5 6 12]
[7 9 nil 13]
[nil nil nil 14])
Here is another go that doesn't require you to know the vector length in advance:
(defn padzip [& [colls]]
(loop [acc [] colls colls]
(if (every? empty? colls) acc
(recur (conj acc (filterv some?
(map first colls))) (map rest colls)))))

Apply a list of functions to a corresponding list of data in Clojure

So I have a list of functions and a list of data:
[fn1 fn2 fn3] [item1 item2 item3]
What can I do to apply each function to its corresponding data item:
[(fn1 item1) (fn2 item2) (fn3 item3)]
Example:
[str #(* 2 %) (partial inc)] [3 5 8]
=> ["3" 10 9]
You can use map
(map #(%1 %2) [str #(* 2 %) (partial inc)] [3 5 8])
("3" 10 9)
If you need a vector back, you can (apply vector ...)
(apply vector (map #(%1 %2) [str #(* 2 %) (partial inc)] [3 5 8]))
["3" 10 9]
Disclaimer: I don't know much Clojure, so there would probably be better ways to do this.
An alternative, not necessarily better:
user=> (for [[f x] (map vector [neg? pos? number?] [1 2 "foo"])]
#_=> (f x))
(false true false)
To make the map version suitable for varargs:
user=> (map (fn [f & args] (apply f args)) [+ - *] [1 2 3] [4 5 6] [7 8 9])
(12 -11 162)

How to write a Clojure function that returns a list of adjacent pairs?

I'm trying to write a function adjacents that returns a vector of a sequence's adjacent pairs. So (adjacents [1 2 3]) would return [[1 2] [2 3]].
(defn adjacents [s]
(loop [[a b :as remaining] s
acc []]
(if (empty? b)
acc
(recur (rest remaining) (conj acc (vector a b))))))
My current implementation works for sequences of strings but with integers or characters the REPL outputs this error:
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:494)
The problem here is in the first evaluation loop of (adjacents [1 2 3]), a is bound to 1 and b to 2. Then you ask if b is empty?. But empty? works on sequences and b is not a sequence, it is a Long, namely 2. The predicate you could use for this case here is nil?:
user=> (defn adjacents [s]
#_=> (loop [[a b :as remaining] s acc []]
#_=> (if (nil? b)
#_=> acc
#_=> (recur (rest remaining) (conj acc (vector a b))))))
#'user/adjacents
user=> (adjacents [1 2 3 4 5])
[[1 2] [2 3] [3 4] [4 5]]
But, as #amalloy points out, this may fail to give the desired result if you have legitimate nils in your data:
user=> (adjacents [1 2 nil 4 5])
[[1 2]]
See his comment for suggested implementation using lists.
Note that Clojure's partition can be used to do this work without the perils of defining your own:
user=> (partition 2 1 [1 2 3 4 5])
((1 2) (2 3) (3 4) (4 5))
user=> (partition 2 1 [1 2 nil 4 5])
((1 2) (2 nil) (nil 4) (4 5))
Here is my short answer. Everything becomes a vector, but it works for all sequences.
(defn adjacent-pairs [s]
{:pre [(sequential? s)]}
(map vector (butlast s) (rest s)))
Testing:
user=> (defn adjacent-pairs [s] (map vector (butlast s) (rest s)))
#'user/adjacent-pairs
user=> (adjacent-pairs '(1 2 3 4 5 6))
([1 2] [2 3] [3 4] [4 5] [5 6])
user=> (adjacent-pairs [1 2 3 4 5 6])
([1 2] [2 3] [3 4] [4 5] [5 6])
user=>
This answer is probably less efficient than the one using partition above, however.

How to remove multiple items from a list?

I have a list [2 3 5] which I want to use to remove items from another list like [1 2 3 4 5], so that I get [1 4].
thanks
Try this:
(let [a [1 2 3 4 5]
b [2 3 5]]
(remove (set b) a))
which returns (1 4).
The remove function, by the way, takes a predicate and a collection, and returns a sequence of the elements that don't satisfy the predicate (a set, in this example).
user=> (use 'clojure.set)
nil
user=> (difference (set [1 2 3 4 5]) (set [2 3 5]))
#{1 4}
Reference:
http://clojure.org/data_structures#toc22
http://clojure.org/api#difference
You can do this yourself with something like:
(def a [2 3 5])
(def b [1 2 3 4 5])
(defn seq-contains?
[coll target] (some #(= target %) coll))
(filter #(not (seq-contains? a %)) b)
; (3 4 5)
A version based on the reducers library could be:
(require '[clojure.core.reducers :as r])
(defn seq-contains?
[coll target]
(some #(= target %) coll))
(defn my-remove
"remove values from seq b that are present in seq a"
[a b]
(into [] (r/filter #(not (seq-contains? b %)) a)))
(my-remove [1 2 3 4 5] [2 3 5] )
; [1 4]
EDIT Added seq-contains? code
Here is my take without using sets;
(defn my-diff-func [X Y]
(reduce #(remove (fn [x] (= x %2)) %1) X Y ))