Tree traversal in Clojure - clojure

Consider a tree, which is defined by the following recursive definition:
a tree should be a vector of two elements: The first one is a number, the second one is either a list of trees or nil.
The following clojure data structure would be an example:
(def tree '[9 ([4 nil]
[6 nil]
[2 ([55 nil]
[22 nil]
[3 ([5 nil])])]
[67 ([44 nil])])])
This structure should be transformed into a list of all possible downwards connections, that can be made from any node to its childs. The connections should be represented as vectors containing the value of the parent node followed by the value of the child node. The order is not important:
(def result '([9 4]
[9 6]
[9 2]
[2 55]
[2 22]
[2 3]
[3 5]
[9 67]
[67 44])
I came up with this solution:
(defn get-connections [[x xs]]
(concat (map #(vector x (first %)) xs)
(mapcat get-connections xs)))
And, indeed:
(= (sort result)
(sort (get-connections tree)))
;; true
However, are there better ways to do so with just plain clojure? In the approach, I'm traversing each node's childs twice, this should be avoided. In this particular case, tail-recursion is not necessary, so a simple recursive version would be ok.
Moreover, I'd be interested which higher level abstractions could be used for solving this. What about Zippers or Clojure/Walk? And finally: Which of those techniques would be available in ClojureScript as well?

you could try combination of list comprehension + tree-seq:
user> (for [[value children] (tree-seq coll? second tree)
[child-value] children]
[value child-value])
;;=> ([9 4] [9 6] [9 2] [9 67] [2 55] [2 22] [2 3] [3 5] [67 44])
this should be available in cljs.
as far as i know, both zippers and clojure.walk are available in clojurescript, but in fact you don't need them for this trivial task. I guess tree-seq is rather idiomatic.
What about the double traversal, you could easily rearrange it into a single one like this:
(defn get-connections [[x xs]]
(mapcat #(cons [x (first %)] (get-connections %)) xs))
user> (get-connections tree)
;;=> ([9 4] [9 6] [9 2] [2 55] [2 22] [2 3] [3 5] [9 67] [67 44])
then you could add laziness, to make this solution truly idiomatic:
(defn get-connections [[x xs]]
(mapcat #(lazy-seq (cons [x (first %)] (get-connections %))) xs))

Related

Map all elements, preserving vector structures

For example, how could I best achieve this transformation:
[[[1 2]] [3 4] [[5] 6]] -> [[[2 3]] [4 5] [[6] 7]]
Is there an idiomatic way of doing this, with any number of levels?
You could use clojure.walk to increment numbers in arbitrarily nested structures:
(def data [[[1 2]] [3 4] [[5] 6]])
(clojure.walk/postwalk
#(if (number? %) (inc %) %)
data)
=> [[[2 3]] [4 5] [[6] 7]]

how to implement walk/postwalk traversal using clojure.zip

I can't figure out how to implement the clojure.walk/postwalk function using clojure.zip:
(clojure.walk/postwalk
#(do (println %)
%)
[1
[2 [3 4 5]]
[6 [7 8]]])
outputs:
1
2
3
4
5
[3 4 5]
[2 [3 4 5]]
6
7
8
[7 8]
[6 [7 8]]
[1 [2 [3 4 5]] [6 [7 8]]]
(defn postwalk [f loc]
(let [loc (if-some [loc (z/down loc)]
(loop [loc loc]
(let [loc (postwalk f loc)]
(if-some [loc (z/right loc)]
(recur loc)
(z/up loc))))
loc)]
(z/replace loc (f (z/node loc)))))
=> (postwalk #(doto % prn) (z/vector-zip [1 [2 [3 4 5]] [6 [7 8]]]))
1
2
3
4
5
[3 4 5]
[2 [3 4 5]]
6
7
8
[7 8]
[6 [7 8]]
[1 [2 [3 4 5]] [6 [7 8]]]
Edit: for prewalk, just perform the z/replace before going down.
(defn prewalk [f loc]
(let [loc (z/replace loc (f (z/node loc)))]
(if-some [loc (z/down loc)]
(loop [loc loc]
(let [loc (prewalk f loc)]
(if-some [loc (z/right loc)]
(recur loc)
(z/up loc))))
loc)))
I believe there is a more idiomatic way to implement post order traversal that preserves zipper style navigation:
(defn post-zip
[loc]
;; start from the deepest left child
(loop [loc loc]
(if-let [l (and (z/branch? loc) (z/down loc))]
(recur l)
loc)))
(defn post-next
[loc]
(if-let [sib (z/right loc)]
;; if we have a right sibling, move to it's deepest left child
(post-zip sib)
;; If there is no right sibling move up if possible
(if-let [parent (z/up loc)]
parent
;; otherwise we are done
(with-meta [(zip/node loc) :end] (meta loc)))))
This allows you to make conditional modifications in the same way you would with a normal zipper (i.e. you don't always have to z/replace, sometimes you can just call post-next with no change to the current node).
The implementation of postwalk given these helper functions becomes quite trivial:
(defn postwalk [f zipper]
(loop [loc (post-zip zipper)]
(if (z/end? loc)
loc
(recur (post-next (z/replace loc (f (z/node loc))))))))
This solution also has the advantage of not introducing recursion that could overflow the stack for large trees.

Clojure: how to merge several sorted vectors of vectors into common structure?

Need your help. Stuck on an intuitively simple task.
I have a few vectors of vectors. The first element of each of the sub-vectors is a numeric key. All parent vectors are sorted by these keys. For example:
[[1 a b] [3 c d] [4 f d] .... ]
[[1 aa bb] [2 cc dd] [3 ww qq] [5 f]... ]
[[3 ccc ddd] [4 fff ddd] ...]
Need to clarify that some key values in nested vectors may be missing, but sorting order guaranteed.
I need to merge all of these vectors into some unified structure by numeric keys. I also need to now, that a key was missed in original vector or vectors.
Like this:
[ [[1 a b][1 aa bb][]] [[][2 cc dd]] [[3 c d][3 ww qq][3 ccc ddd]] [[4 f d][][4 fff dd]]...]
You can break the problem down into two parts:
1) get the unique keys in sorted order
2) for each unique key, iterate through the list of vectors, and output either the entry for the key, or an empty list if missing
To get the unique keys, just pull out all the keys into lists, concat them into one big list, and then put them into a sorted-set:
(into
(sorted-set)
(apply concat
(for [vector vectors]
(map first vector))))
if we start with a list of vectors of:
(def vectors
[[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]])
then we get a sorted set of:
=> #{1 2 3 4 5}
so far so good. now for each key in that sorted set we need to iterate through the vectors, and get the entry with that key, or an empty list if it's missing. You can do that using two 'for' forms and then 'some' to find the entry (if present)
(for [k #{1 2 3 4 5}]
(for [vector vectors]
(or (some #(when (= (first %) k) %) vector )
[])))
this yields:
=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))
which I think is what you want. (if you need vectors and not lists, just use "(into [] ...)" in the appropriate places.)
Putting it all together, we get:
(defn unify-vectors [vectors]
(for [k (into (sorted-set)
(apply concat
(for [vector vectors]
(map first vector))))]
(for [vector vectors]
(or (some #(when (= (first %) k) %) vector)
[]))))
(unify-vectors
[[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]])
=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))
I do not have a complete solution for you, but as a hint: use group-by to sort your vectors for the first argument.
This will be more idiomatic and maybe just a few lines when it is ready.
So you could write something like
(group-by first [[1 :a :b] [3 :c :d] [4 :f :d]])
and do this for all vectors. Then you can sort / merge them with the keys provided by group-by.
This is a simple workaround, but doesn't meet the best practices of Clojure Programming. Just to give a simple idea here.
(def vectors
[
[[1 'a 'b] [3 'c 'd] [4 'f 'd]]
[[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
[[3 'ccc 'ddd] [4 'fff 'ddd]]]
)
(loop [i 1
result []]
(def sub-result [])
(doseq [v vectors]
(doseq [sub-v v]
(if
(= i (first sub-v))
(def sub-result (into sub-result [sub-v]))))
(if-not
(some #{i}
(map first v))
(def sub-result (into sub-result [[]]))
))
(if (< i 6)
(recur (inc i) (into result [sub-result]))
(print result)))

Partitioning across partitions in Clojure?

Here are some values. Each is a sequence of ascending (or otherwise grouped) values.
(def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
I can partition them each by value.
=> (map (partial partition-by first) input-vals)
((([1 :a] [1 :b]) ([2 :c]) ([3 :d] [3 :e])) (([1 :f]) ([2 :g] [2 :h] [2 :i]) ([3 :j] [3 :k])) (([1 :l]) ([3 :m])))
But that gets me 3 sequences of partitions. I want one single sequence of partitioned groups.
What I want to do is return a single lazy sequence of (potentially) lazy sequences that are the respective partitions joined. e.g. I want to produce this:
((([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))
Note that not all values appear in all sequences (there is no 2 in the third vector).
This is of course a simplification of my problem. The real data is a set of lazy streams coming from very large files, so nothing can be realised. But I think the solution for the above question is the solution for my problem.
Feel free to edit the title, I wasn't quite sure how to express it.
Try this horror:
(defn partition-many-by [f comp-f s]
(let [sorted-s (sort-by first comp-f s)
first-list (first (drop-while (complement seq) sorted-s))
match-val (f (first first-list))
remains (filter #(not (empty? %))
(map #(drop-while (fn [ss] (= match-val (f ss))) %)
sorted-s))]
(when match-val
(cons
(apply concat
(map #(take-while (fn [ss] (= match-val (f ss))) %)
sorted-s))
(lazy-seq (partition-many-by f comp-f remains))))))
It could possibly be improved to remove the double value check (take-while and drop-while).
Example usage:
(partition-many-by identity [[1 1 1 1 2 2 3 3 3 3] [1 1 2 2 2 2 3] [3]])
=> ((1 1 1 1 1 1) (2 2 2 2 2 2) (3 3 3 3 3 3))
Let's make this interesting and use sequences of infinite length for our input
(def twos (iterate #(+ 2 %) 0))
(def threes (iterate #(+ 3 %) 0))
(def fives (iterate #(+ 5 %) 0))
We'll need to lazily merge them. Let's ask for a comparator so we can apply to other data types as well.
(defn lazy-merge-by
([compfn xs ys]
(lazy-seq
(cond
(empty? xs) ys
(empty? ys) xs
:else (if (compfn (first xs) (first ys))
(cons (first xs) (lazy-merge-by compfn (rest xs) ys))
(cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
([compfn xs ys & more]
(apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))
Test
(take 15 (lazy-merge-by < twos threes fives))
;=> (0 0 0 2 3 4 5 6 6 8 9 10 10 12 12)
We can (lazily) partition by value if desired
(take 10 (partition-by identity (lazy-merge-by < twos threes fives)))
;=> ((0 0 0) (2) (3) (4) (5) (6 6) (8) (9) (10 10) (12 12))
Now, back to the sample input
(partition-by first (apply lazy-merge-by #(<= (first %) (first %2)) input-vals))
;=> (([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))
as desired less one extraneous set of outer parentheses.
I'm not sure whether I'm following but you can faltten the result sequence, something like:
(flatten (partition-by identity (first input-vals)))
clojure.core/flatten
([x])
Takes any nested combination of sequential things (lists, vectors,
etc.) and returns their contents as a single, flat sequence.
(flatten nil) returns an empty sequence.
You can use realized? function to test whether a sequence is lazy or not.
user> (def desired-result '((([1 :a] [1 :b] [1 :f] [1 :l])
([2 :c] [2 :g] [2 :h] [2 :i])
([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))))
#'user/desired-result
user> (def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]
[[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]]
[[1 :l] [3 :m]]])
#'user/input-vals
user> (= desired-result (vector (vals (group-by first (apply concat input-vals)))))
true
I changed the input-vals slightly to correct for what I assume was a typographical error, if it was not an error I can update my code to accommodate the less regular structure.
Using the ->> (thread last) macro, we can have the equivalent code in a more readable form:
user> (= desired-result
(->> input-vals
(apply concat)
(group-by first)
vals
vector))
true
(partition-by first (sort-by first (mapcat identity input-vals)))

Clojure: Semi-Flattening a nested Sequence

I have a list with embedded lists of vectors, which looks like:
(([1 2]) ([3 4] [5 6]) ([7 8]))
Which I know is not ideal to work with. I'd like to flatten this to ([1 2] [3 4] [5 6] [7 8]).
flatten doesn't work: it gives me (1 2 3 4 5 6 7 8).
How do I do this? I figure I need to create a new list based on the contents of each list item, not the items, and it's this part I can't find out how to do from the docs.
If you only want to flatten it one level you can use concat
(apply concat '(([1 2]) ([3 4] [5 6]) ([7 8])))
=> ([1 2] [3 4] [5 6] [7 8])
To turn a list-of-lists into a single list containing the elements of every sub-list, you want apply concat as nickik suggests.
However, there's usually a better solution: don't produce the list-of-lists to begin with! For example, let's imagine you have a function called get-names-for which takes a symbol and returns a list of all the cool things you could call that symbol:
(get-names-for '+) => (plus add cross junction)
If you want to get all the names for some list of symbols, you might try
(map get-names-for '[+ /])
=> ((plus add cross junction) (slash divide stroke))
But this leads to the problem you were having. You could glue them together with an apply concat, but better would be to use mapcat instead of map to begin with:
(mapcat get-names-for '[+ /])
=> (plus add cross junction slash divide stroke)
The code for flatten is fairly short:
(defn flatten
[x]
(filter (complement sequential?)
(rest (tree-seq sequential? seq x))))
It uses tree-seq to walk through the data structure and return a sequence of the atoms. Since we want all the bottom-level sequences, we could modify it like this:
(defn almost-flatten
[x]
(filter #(and (sequential? %) (not-any? sequential? %))
(rest (tree-seq #(and (sequential? %) (some sequential? %)) seq x))))
so we return all the sequences that don't contain sequences.
Also you may found useful this general 1 level flatten function I found on clojuremvc:
(defn flatten-1
"Flattens only the first level of a given sequence, e.g. [[1 2][3]] becomes
[1 2 3], but [[1 [2]] [3]] becomes [1 [2] 3]."
[seq]
(if (or (not (seqable? seq)) (nil? seq))
seq ; if seq is nil or not a sequence, don't do anything
(loop [acc [] [elt & others] seq]
(if (nil? elt) acc
(recur
(if (seqable? elt)
(apply conj acc elt) ; if elt is a sequence, add each element of elt
(conj acc elt)) ; if elt is not a sequence, add elt itself
others)))))
Example:
(flatten-1 (([1 2]) ([3 4] [5 6]) ([7 8])))
=>[[1 2] [3 4] [5 6] [7 8]]
concat exampe surely do job for you, but this flatten-1 is also allowing non seq elements inside a collection:
(flatten-1 '(1 2 ([3 4] [5 6]) ([7 8])))
=>[1 2 [3 4] [5 6] [7 8]]
;whereas
(apply concat '(1 2 ([3 4] [5 6]) ([7 8])))
=> java.lang.IllegalArgumentException:
Don't know how to create ISeq from: java.lang.Integer
Here's a function that will flatten down to the sequence level, regardless of uneven nesting:
(fn flt [s] (mapcat #(if (every? coll? %) (flt %) (list %)) s))
So if your original sequence was:
'(([1 2]) (([3 4]) ((([5 6])))) ([7 8]))
You'd still get the same result:
([1 2] [3 4] [5 6] [7 8])