Clojure: Semi-Flattening a nested Sequence - clojure

I have a list with embedded lists of vectors, which looks like:
(([1 2]) ([3 4] [5 6]) ([7 8]))
Which I know is not ideal to work with. I'd like to flatten this to ([1 2] [3 4] [5 6] [7 8]).
flatten doesn't work: it gives me (1 2 3 4 5 6 7 8).
How do I do this? I figure I need to create a new list based on the contents of each list item, not the items, and it's this part I can't find out how to do from the docs.

If you only want to flatten it one level you can use concat
(apply concat '(([1 2]) ([3 4] [5 6]) ([7 8])))
=> ([1 2] [3 4] [5 6] [7 8])

To turn a list-of-lists into a single list containing the elements of every sub-list, you want apply concat as nickik suggests.
However, there's usually a better solution: don't produce the list-of-lists to begin with! For example, let's imagine you have a function called get-names-for which takes a symbol and returns a list of all the cool things you could call that symbol:
(get-names-for '+) => (plus add cross junction)
If you want to get all the names for some list of symbols, you might try
(map get-names-for '[+ /])
=> ((plus add cross junction) (slash divide stroke))
But this leads to the problem you were having. You could glue them together with an apply concat, but better would be to use mapcat instead of map to begin with:
(mapcat get-names-for '[+ /])
=> (plus add cross junction slash divide stroke)

The code for flatten is fairly short:
(defn flatten
[x]
(filter (complement sequential?)
(rest (tree-seq sequential? seq x))))
It uses tree-seq to walk through the data structure and return a sequence of the atoms. Since we want all the bottom-level sequences, we could modify it like this:
(defn almost-flatten
[x]
(filter #(and (sequential? %) (not-any? sequential? %))
(rest (tree-seq #(and (sequential? %) (some sequential? %)) seq x))))
so we return all the sequences that don't contain sequences.

Also you may found useful this general 1 level flatten function I found on clojuremvc:
(defn flatten-1
"Flattens only the first level of a given sequence, e.g. [[1 2][3]] becomes
[1 2 3], but [[1 [2]] [3]] becomes [1 [2] 3]."
[seq]
(if (or (not (seqable? seq)) (nil? seq))
seq ; if seq is nil or not a sequence, don't do anything
(loop [acc [] [elt & others] seq]
(if (nil? elt) acc
(recur
(if (seqable? elt)
(apply conj acc elt) ; if elt is a sequence, add each element of elt
(conj acc elt)) ; if elt is not a sequence, add elt itself
others)))))
Example:
(flatten-1 (([1 2]) ([3 4] [5 6]) ([7 8])))
=>[[1 2] [3 4] [5 6] [7 8]]
concat exampe surely do job for you, but this flatten-1 is also allowing non seq elements inside a collection:
(flatten-1 '(1 2 ([3 4] [5 6]) ([7 8])))
=>[1 2 [3 4] [5 6] [7 8]]
;whereas
(apply concat '(1 2 ([3 4] [5 6]) ([7 8])))
=> java.lang.IllegalArgumentException:
Don't know how to create ISeq from: java.lang.Integer

Here's a function that will flatten down to the sequence level, regardless of uneven nesting:
(fn flt [s] (mapcat #(if (every? coll? %) (flt %) (list %)) s))
So if your original sequence was:
'(([1 2]) (([3 4]) ((([5 6])))) ([7 8]))
You'd still get the same result:
([1 2] [3 4] [5 6] [7 8])

Related

Clojure; select all nth element from list of lists with unequal size, for n = 1, 2,

I'd like to have a function, such that,
(f '([1 4 7] [2 5 9] [3 6]))
would give
([1 2 3] [4 5 6] [7 9])
I tried
(apply map vector '([1 4 7] [2 5 9] [3 6]))
would only produce:
([1 2 3] [4 5 6])
I find it hard to describe my requirements that it's difficult for me to search for a ready solution.
Please help me either to improve my description, or pointer to a solution.
Thanks in advance!
I'd solve a more general problem which means you might reuse that function in the future. I'd change map so that it keeps going past the smallest map.
(defn map-all
"Like map but if given multiple collections will call the function f
with as many arguments as there are elements still left."
([f] (map f))
([f coll] (map f coll))
([f c1 & colls]
(let [step (fn step [cs]
(lazy-seq
(let [ss (keep seq cs)]
(when (seq ss)
(cons (map first ss)
(step (map rest ss)))))))]
(map #(apply f %) (step (conj colls c1))))))
(apply map-all vector '([1 4 7] [2 5 9] [3 6]))
(apply map-all vector '([1 false 7] [nil 5 9] [3 6] [8]))
Note, that as opposed to many other solutions, this one works fine even if any of the sequences contain nil or false.
or this way with loop/recur:
user> (defn transpose-all-2 [colls]
(loop [colls colls res []]
(if-let [colls (seq (filter seq colls))]
(recur (doall (map next colls))
(conj res (mapv first colls)))
res)))
#'user/transpose-all-2
user> (transpose-all-2 x)
[[1 2 3] [4 5 6] [7 9]]
user> (transpose-all-2 '((0 1 2 3) (4 5 6 7) (8 9)))
[[0 4 8] [1 5 9] [2 6] [3 7]]
If you know the maximum length of the vectors ahead of time, you could define
(defn tx [colls]
(lazy-seq
(cons (filterv identity (map first colls))
(tx (map rest colls)))))
then
(take 3 (tx '([1 4 7] [2 5 9] [3 6])))
A simple solution is
(defn transpose-all
[colls]
(lazy-seq
(let [ss (keep seq colls)]
(when (seq ss)
(cons (map first ss) (transpose-all (map rest ss)))))))
For example,
(transpose-all '([1 4 7] [2 5 9] [3 6] [11 12 13 14]))
;((1 2 3 11) (4 5 6 12) (7 9 13) (14))
Here is my own attempt:
(defn f [l]
(let [max-count (apply max (map count l))
l-patched (map (fn [e] (if (< (count e) max-count)
(concat e (take (- max-count (count e)) (repeat nil)))
e)) l)]
(map (fn [x] (filter identity x)) (apply map vector l-patched))
))
Another simple solution:
(->> jagged-list
(map #(concat % (repeat nil)))
(apply map vector)
(take-while (partial some identity)))
A jagged-list like this
'([1 4 7 ]
[2 5 9 ]
[3 6 ]
[11 12 13 14])
will produce:
'([1 2 3 11]
[4 5 6 12]
[7 9 nil 13]
[nil nil nil 14])
Here is another go that doesn't require you to know the vector length in advance:
(defn padzip [& [colls]]
(loop [acc [] colls colls]
(if (every? empty? colls) acc
(recur (conj acc (filterv some?
(map first colls))) (map rest colls)))))

Tree traversal in Clojure

Consider a tree, which is defined by the following recursive definition:
a tree should be a vector of two elements: The first one is a number, the second one is either a list of trees or nil.
The following clojure data structure would be an example:
(def tree '[9 ([4 nil]
[6 nil]
[2 ([55 nil]
[22 nil]
[3 ([5 nil])])]
[67 ([44 nil])])])
This structure should be transformed into a list of all possible downwards connections, that can be made from any node to its childs. The connections should be represented as vectors containing the value of the parent node followed by the value of the child node. The order is not important:
(def result '([9 4]
[9 6]
[9 2]
[2 55]
[2 22]
[2 3]
[3 5]
[9 67]
[67 44])
I came up with this solution:
(defn get-connections [[x xs]]
(concat (map #(vector x (first %)) xs)
(mapcat get-connections xs)))
And, indeed:
(= (sort result)
(sort (get-connections tree)))
;; true
However, are there better ways to do so with just plain clojure? In the approach, I'm traversing each node's childs twice, this should be avoided. In this particular case, tail-recursion is not necessary, so a simple recursive version would be ok.
Moreover, I'd be interested which higher level abstractions could be used for solving this. What about Zippers or Clojure/Walk? And finally: Which of those techniques would be available in ClojureScript as well?
you could try combination of list comprehension + tree-seq:
user> (for [[value children] (tree-seq coll? second tree)
[child-value] children]
[value child-value])
;;=> ([9 4] [9 6] [9 2] [9 67] [2 55] [2 22] [2 3] [3 5] [67 44])
this should be available in cljs.
as far as i know, both zippers and clojure.walk are available in clojurescript, but in fact you don't need them for this trivial task. I guess tree-seq is rather idiomatic.
What about the double traversal, you could easily rearrange it into a single one like this:
(defn get-connections [[x xs]]
(mapcat #(cons [x (first %)] (get-connections %)) xs))
user> (get-connections tree)
;;=> ([9 4] [9 6] [9 2] [2 55] [2 22] [2 3] [3 5] [9 67] [67 44])
then you could add laziness, to make this solution truly idiomatic:
(defn get-connections [[x xs]]
(mapcat #(lazy-seq (cons [x (first %)] (get-connections %))) xs))

Clojure: how to merge several sorted vectors of vectors into common structure?

Need your help. Stuck on an intuitively simple task.
I have a few vectors of vectors. The first element of each of the sub-vectors is a numeric key. All parent vectors are sorted by these keys. For example:
[[1 a b] [3 c d] [4 f d] .... ]
[[1 aa bb] [2 cc dd] [3 ww qq] [5 f]... ]
[[3 ccc ddd] [4 fff ddd] ...]
Need to clarify that some key values in nested vectors may be missing, but sorting order guaranteed.
I need to merge all of these vectors into some unified structure by numeric keys. I also need to now, that a key was missed in original vector or vectors.
Like this:
[ [[1 a b][1 aa bb][]] [[][2 cc dd]] [[3 c d][3 ww qq][3 ccc ddd]] [[4 f d][][4 fff dd]]...]
You can break the problem down into two parts:
1) get the unique keys in sorted order
2) for each unique key, iterate through the list of vectors, and output either the entry for the key, or an empty list if missing
To get the unique keys, just pull out all the keys into lists, concat them into one big list, and then put them into a sorted-set:
(into
(sorted-set)
(apply concat
(for [vector vectors]
(map first vector))))
if we start with a list of vectors of:
(def vectors
[[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]])
then we get a sorted set of:
=> #{1 2 3 4 5}
so far so good. now for each key in that sorted set we need to iterate through the vectors, and get the entry with that key, or an empty list if it's missing. You can do that using two 'for' forms and then 'some' to find the entry (if present)
(for [k #{1 2 3 4 5}]
(for [vector vectors]
(or (some #(when (= (first %) k) %) vector )
[])))
this yields:
=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))
which I think is what you want. (if you need vectors and not lists, just use "(into [] ...)" in the appropriate places.)
Putting it all together, we get:
(defn unify-vectors [vectors]
(for [k (into (sorted-set)
(apply concat
(for [vector vectors]
(map first vector))))]
(for [vector vectors]
(or (some #(when (= (first %) k) %) vector)
[]))))
(unify-vectors
[[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]])
=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))
I do not have a complete solution for you, but as a hint: use group-by to sort your vectors for the first argument.
This will be more idiomatic and maybe just a few lines when it is ready.
So you could write something like
(group-by first [[1 :a :b] [3 :c :d] [4 :f :d]])
and do this for all vectors. Then you can sort / merge them with the keys provided by group-by.
This is a simple workaround, but doesn't meet the best practices of Clojure Programming. Just to give a simple idea here.
(def vectors
[
[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]]
)
(loop [i 1
result []]
(def sub-result [])
(doseq [v vectors]
(doseq [sub-v v]
(if
(= i (first sub-v))
(def sub-result (into sub-result [sub-v]))))
(if-not
(some #{i}
(map first v))
(def sub-result (into sub-result [[]]))
))
(if (< i 6)
(recur (inc i) (into result [sub-result]))
(print result)))

Partitioning across partitions in Clojure?

Here are some values. Each is a sequence of ascending (or otherwise grouped) values.
(def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
I can partition them each by value.
=> (map (partial partition-by first) input-vals)
((([1 :a] [1 :b]) ([2 :c]) ([3 :d] [3 :e])) (([1 :f]) ([2 :g] [2 :h] [2 :i]) ([3 :j] [3 :k])) (([1 :l]) ([3 :m])))
But that gets me 3 sequences of partitions. I want one single sequence of partitioned groups.
What I want to do is return a single lazy sequence of (potentially) lazy sequences that are the respective partitions joined. e.g. I want to produce this:
((([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))
Note that not all values appear in all sequences (there is no 2 in the third vector).
This is of course a simplification of my problem. The real data is a set of lazy streams coming from very large files, so nothing can be realised. But I think the solution for the above question is the solution for my problem.
Feel free to edit the title, I wasn't quite sure how to express it.
Try this horror:
(defn partition-many-by [f comp-f s]
(let [sorted-s (sort-by first comp-f s)
first-list (first (drop-while (complement seq) sorted-s))
match-val (f (first first-list))
remains (filter #(not (empty? %))
(map #(drop-while (fn [ss] (= match-val (f ss))) %)
sorted-s))]
(when match-val
(cons
(apply concat
(map #(take-while (fn [ss] (= match-val (f ss))) %)
sorted-s))
(lazy-seq (partition-many-by f comp-f remains))))))
It could possibly be improved to remove the double value check (take-while and drop-while).
Example usage:
(partition-many-by identity [[1 1 1 1 2 2 3 3 3 3] [1 1 2 2 2 2 3] [3]])
=> ((1 1 1 1 1 1) (2 2 2 2 2 2) (3 3 3 3 3 3))
Let's make this interesting and use sequences of infinite length for our input
(def twos (iterate #(+ 2 %) 0))
(def threes (iterate #(+ 3 %) 0))
(def fives (iterate #(+ 5 %) 0))
We'll need to lazily merge them. Let's ask for a comparator so we can apply to other data types as well.
(defn lazy-merge-by
([compfn xs ys]
(lazy-seq
(cond
(empty? xs) ys
(empty? ys) xs
:else (if (compfn (first xs) (first ys))
(cons (first xs) (lazy-merge-by compfn (rest xs) ys))
(cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
([compfn xs ys & more]
(apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))
Test
(take 15 (lazy-merge-by < twos threes fives))
;=> (0 0 0 2 3 4 5 6 6 8 9 10 10 12 12)
We can (lazily) partition by value if desired
(take 10 (partition-by identity (lazy-merge-by < twos threes fives)))
;=> ((0 0 0) (2) (3) (4) (5) (6 6) (8) (9) (10 10) (12 12))
Now, back to the sample input
(partition-by first (apply lazy-merge-by #(<= (first %) (first %2)) input-vals))
;=> (([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))
as desired less one extraneous set of outer parentheses.
I'm not sure whether I'm following but you can faltten the result sequence, something like:
(flatten (partition-by identity (first input-vals)))
clojure.core/flatten
([x])
Takes any nested combination of sequential things (lists, vectors,
etc.) and returns their contents as a single, flat sequence.
(flatten nil) returns an empty sequence.
You can use realized? function to test whether a sequence is lazy or not.
user> (def desired-result '((([1 :a] [1 :b] [1 :f] [1 :l])
([2 :c] [2 :g] [2 :h] [2 :i])
([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))))
#'user/desired-result
user> (def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
#'user/input-vals
user> (= desired-result (vector (vals (group-by first (apply concat input-vals)))))
true
I changed the input-vals slightly to correct for what I assume was a typographical error, if it was not an error I can update my code to accommodate the less regular structure.
Using the ->> (thread last) macro, we can have the equivalent code in a more readable form:
user> (= desired-result
(->> input-vals
(apply concat)
(group-by first)
vals
vector))
true
(partition-by first (sort-by first (mapcat identity input-vals)))

How to write a Clojure function that returns a list of adjacent pairs?

I'm trying to write a function adjacents that returns a vector of a sequence's adjacent pairs. So (adjacents [1 2 3]) would return [[1 2] [2 3]].
(defn adjacents [s]
(loop [[a b :as remaining] s
acc []]
(if (empty? b)
acc
(recur (rest remaining) (conj acc (vector a b))))))
My current implementation works for sequences of strings but with integers or characters the REPL outputs this error:
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:494)
The problem here is in the first evaluation loop of (adjacents [1 2 3]), a is bound to 1 and b to 2. Then you ask if b is empty?. But empty? works on sequences and b is not a sequence, it is a Long, namely 2. The predicate you could use for this case here is nil?:
user=> (defn adjacents [s]
#_=> (loop [[a b :as remaining] s acc []]
#_=> (if (nil? b)
#_=> acc
#_=> (recur (rest remaining) (conj acc (vector a b))))))
#'user/adjacents
user=> (adjacents [1 2 3 4 5])
[[1 2] [2 3] [3 4] [4 5]]
But, as #amalloy points out, this may fail to give the desired result if you have legitimate nils in your data:
user=> (adjacents [1 2 nil 4 5])
[[1 2]]
See his comment for suggested implementation using lists.
Note that Clojure's partition can be used to do this work without the perils of defining your own:
user=> (partition 2 1 [1 2 3 4 5])
((1 2) (2 3) (3 4) (4 5))
user=> (partition 2 1 [1 2 nil 4 5])
((1 2) (2 nil) (nil 4) (4 5))
Here is my short answer. Everything becomes a vector, but it works for all sequences.
(defn adjacent-pairs [s]
{:pre [(sequential? s)]}
(map vector (butlast s) (rest s)))
Testing:
user=> (defn adjacent-pairs [s] (map vector (butlast s) (rest s)))
#'user/adjacent-pairs
user=> (adjacent-pairs '(1 2 3 4 5 6))
([1 2] [2 3] [3 4] [4 5] [5 6])
user=> (adjacent-pairs [1 2 3 4 5 6])
([1 2] [2 3] [3 4] [4 5] [5 6])
user=>
This answer is probably less efficient than the one using partition above, however.