I want to use clojure.zip to walk through this tree and print the node and its parent. I am having trouble getting the parent. For example the parent of :e is :b.
;;
;;       :a
;;      /  \
;;    :b    :c
;;    / \     \
;;  :d  :e    :f
;;
(require '[clojure.zip :as zip])

(def example [:a [:b [:d] [:e]] [:c [:f]]])
(def z (zip/vector-zip example))
(def locs (take-while (complement zip/end?) (iterate zip/next z)))

(defn parent-of [loc]
  (when-let [parent-loc (-> loc zip/up zip/left)]
    (zip/node parent-loc)))

(defn visit-all []
  (doseq [loc locs]
    (let [node (zip/node loc)]
      (when (keyword? node)
        (println node "has parent" (parent-of loc))))))
This is the result:
:a has parent nil
:b has parent :a
:d has parent :b
:e has parent [:d]
:c has parent [:b [:d] [:e]]
:f has parent :c
I could keep improving the parent-of function - my next thought would be to go to the left-most node. There will be an algorithm that will return the correct answer from all locations - however going about it this way seems like quite a lot of work for a common requirement.
Is there a better approach I should be taking?
Edit
This question is not about clojure.walk or Specter. I am looking for an answer that uses clojure.zip and gives the parent as I defined it in the question, which is simply the keyword above the loc. So if the loc given to parent-of is :f, I would expect it to return :c.
If someone can tell me, as a comment, that zippers are not really used anymore, and that, say, clojure.walk or Specter is the current best practice for navigating trees, then that would help.
Why are you using zip/left? The parent of the node is exactly the node above it, not a node above it and to the left for some reason. Removing that is all you need to do.
This is my own answer:
(defn parent-of [loc]
  (when-let [parent-loc (-> loc zip/up zip/up first)]
    (zip/node parent-loc)))
It gives the correct output in all cases:
:a has parent nil
:b has parent :a
:d has parent :b
:e has parent :b
:c has parent :a
:f has parent :c
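The answer above gets the right result because `first` pulls the node (a vector) out of the loc, and `zip/node` then reads the first element of that vector. A minimal sketch that stays within the zipper API is shown here, assuming the same layout as in the question, where each vector holds its keyword first and its children after it. The names `parent-keyword` and `node-parent-pairs` are hypothetical helpers, not part of clojure.zip.

```clojure
(require '[clojure.zip :as zip])

(defn parent-keyword
  "Hypothetical helper: returns the keyword heading the vector that
  encloses loc's own vector, or nil for the root keyword.
  Assumes every vector starts with its keyword, as in the question."
  [loc]
  ;; up once: the vector that holds this keyword
  ;; up twice: the enclosing (parent) vector
  ;; down: the first child of that vector, i.e. the parent keyword
  ;; some-> stops at nil, so zip/down is never called on nil
  (some-> loc zip/up zip/up zip/down zip/node))

(defn node-parent-pairs
  "Hypothetical helper: walks the tree and returns [keyword parent] pairs."
  [tree]
  (->> (zip/vector-zip tree)
       (iterate zip/next)
       (take-while (complement zip/end?))
       (filter (comp keyword? zip/node))
       (mapv (juxt zip/node parent-keyword))))

(node-parent-pairs [:a [:b [:d] [:e]] [:c [:f]]])
;; => [[:a nil] [:b :a] [:d :b] [:e :b] [:c :a] [:f :c]]
```

Returning the pairs as data instead of printing them makes the result easy to check at the REPL.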
Related
I want to select paths of a deeply nested map to keep.
For example:
{:a 1
:b {:c [{:d 1 :e 1}
{:d 2 :e 2}]
:f 1}
:g {:h {:i 4 :j [1 2 3]}}}
I want to select by paths, like so:
(select-paths m [[:a]
[:b :c :e]
[:b :f]
[:g :h :i]])
This would return
{:a 1
:b {:c [{:e 1}
{:e 2}]
:f 1}
:g {:h {:i 4}}}
Essentially the same as Elasticsearch's fields parameter. The format of the paths argument can be something else, this is just the first idea.
I tried two different solutions:
1. Going through the entire map and checking if the full path of the current element is in the given paths. I can't figure out how to handle lists of maps so that they are kept as lists of maps.
2. Creating select-keys statements from the given paths, but again I run into problems with lists of maps, and especially with trying to resolve paths of varying depths that have some common depth.
I looked at Specter but I didn't see anything that would do this. Any map or postwalk based solution I come up with turns into something incredibly convoluted at some point. I must be thinking about this the wrong way.
If there's a way to do this with raw json, that would be fine as well. Or even a Java solution.
There is no simple way to accomplish your goal. The automatic processing implied for the sequence under [:b :c] is also problematic.
You can get partway there using the Tupelo Forest library. See the Lightning Talk video from Clojure/Conj 2017.
I did some additional work in data destructuring that you may find useful building the tupelo.core/destruct macro (see examples here). You could follow a similar outline to build a recursive solution to your specific problem.
A related project is Meander. I have worked on my own version which is like a generalized version of tupelo.core/destruct. Given data like this
(def skynet-widgets [{:basic-info {:producer-code "Cyberdyne"}
:widgets [{:widget-code "Model-101"
:widget-type-code "t800"}
{:widget-code "Model-102"
:widget-type-code "t800"}
{:widget-code "Model-201"
:widget-type-code "t1000"}]
:widget-types [{:widget-type-code "t800"
:description "Resistance Infiltrator"}
{:widget-type-code "t1000"
:description "Mimetic polyalloy"}]}
{:basic-info {:producer-code "ACME"}
:widgets [{:widget-code "Dynamite"
:widget-type-code "c40"}]
:widget-types [{:widget-type-code "c40"
:description "Boom!"}]}])
You can search and extract data using a template like this:
(let [root-eid (td/add-entity-edn skynet-widgets)
results (td/match
[{:basic-info {:producer-code ?}
:widgets [{:widget-code ?
:widget-type-code wtc}]
:widget-types [{:widget-type-code wtc
:description ?}]}])]
(is= results
[{:description "Resistance Infiltrator" :widget-code "Model-101" :producer-code "Cyberdyne" :wtc "t800"}
{:description "Resistance Infiltrator" :widget-code "Model-102" :producer-code "Cyberdyne" :wtc "t800"}
{:description "Mimetic polyalloy" :widget-code "Model-201" :producer-code "Cyberdyne" :wtc "t1000"}
{:description "Boom!" :widget-code "Dynamite" :producer-code "ACME" :wtc "c40"}])))
This code is working (see here) but it needs more polish. You could use it as a guide to building a generalized select-paths function.
Can you add any details on how this problem arose or the specific context? That may point to ideas for an alternate solution.
One way of solving this problem would be to generate a set of all subpaths that you accept and then write a recursive function that traverses the data structure and keeps track of the path to the current node. The code that accomplishes that does not need to be very long:
(defn select-paths-from-set [current-path path-set data]
  (cond
    (map? data) (into {}
                      (remove nil?)
                      (for [[k v] data]
                        (let [p (conj current-path k)]
                          (if (contains? path-set p)
                            [k (select-paths-from-set p path-set v)]))))
    (sequential? data) (mapv (partial select-paths-from-set current-path path-set) data)
    :else data))

(defn select-paths [data paths]
  (select-paths-from-set []
                         (into #{}
                               (mapcat #(take-while seq (iterate butlast %)))
                               paths)
                         data))
(select-paths {:a 1
:b {:c [{:d 1 :e 1}
{:d 2 :e 2}]
:f 1}
:g {:h {:i 4 :j [1 2 3]}}}
[[:a]
[:b :c :e]
[:b :f]
[:g :h :i]])
;; => {:a 1, :b {:c [{:e 1} {:e 2}], :f 1}, :g {:h {:i 4}}}
I am using clojure.walk/postwalk to compare a predicate to every map in a nested collection and want to exit with true on the first true. How would I do that? I am ok with it walking the whole data structure and then returning true if there is a true match.
As a corollary question, I guess the same question could apply to when one performs a map as opposed to a postwalk.
UPDATE: this was truly a tired/lazy question; I should have provided a code example. That said, I'm leaving it up in case anyone is currently formulating an answer to my half-baked question. The only thing that is worse than asking one is taking it down after someone has been kind enough to start helping. I will be quite content if no one answers, if they request a better question, or if they just give me suggestions of what to research.
A slightly different way to do it, also using tree-seq:
(defn find-deep [pred data not-found]
(->> data
(tree-seq coll? seq)
(some #(when (pred %) [%]))
((fnil first [not-found]))))
user> (find-deep #(= (:c %) 30) [{:a 10 :b [{:c 20 :d {:c 30}}]}] ::none)
;;=> {:c 30}
user> (find-deep #(= (:c %) 40) [{:a 10 :b [{:c 20 :d {:c 30}}]}] ::none)
;;=> :user/none
You may be interested in this function I call walk-seq. It returns a lazy depth-first sequence over a data structure which you can then seek against to find the first match. I find it to be preferable here because it doesn't require callbacks and exceptions to exit early like clojure.walk/postwalk would.
(defn walk-seq
"Returns a lazy depth-first sequence of all forms within a data structure."
[form]
(tree-seq coll? seq form))
(defn seek
"Find the first element in the collection that matches pred,
else returns not-found. Note that using seek can lead to
poor performance and you should always use indexed data
structures instead of multiple seeks over the same data."
([pred coll]
(seek pred coll nil))
([pred coll not-found]
(reduce (fn [nf x] (if (pred x) (reduced x) nf)) not-found coll)))
Usage of walk-seq:
(walk-seq {:a [{:b -1} {:b 1}] :b 2})
=>
({:a [{:b -1} {:b 1}], :b 2}
[:a [{:b -1} {:b 1}]]
:a
[{:b -1} {:b 1}]
{:b -1}
[:b -1]
:b
-1
{:b 1}
[:b 1]
:b
1
[:b 2]
:b
2)
Combining the two:
(seek (every-pred number? pos?) (walk-seq {:a [{:b -1} {:b 1}] :b 2}))
=>
1
It can be done using postwalk by throwing an exception once the predicate is true, as I suggested in the comment. This approach is unconventional but concise, and it lets us reuse the logic of postwalk for walking the data structure:
(defn walk-some [pred data]
(try
(clojure.walk/postwalk
#(if (pred %)
(throw (ex-info "Found" {:data %}))
%)
data)
false
(catch clojure.lang.ExceptionInfo e
true)))
(walk-some #(and (number? %) (odd? %)) {:a [[9] 3]})
;; => true
(walk-some #(and (number? %) (even? %)) {:a [[9] 3]})
;; => false
Using exceptions for control flow is rarely needed, but occasionally it is useful to deviate a bit from convention. You may want to define a custom exception type for improved robustness in case your predicate can throw objects of type ExceptionInfo.
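One way to get that robustness without defining a new exception class is to tag the exception with a private sentinel and rethrow anything that does not carry it. This is only a sketch of that idea: the name `walk-some-guarded` and the `::found` key are made up for illustration and are not part of clojure.walk.

```clojure
(require '[clojure.walk :as walk])

(defn walk-some-guarded
  "Hypothetical variant of walk-some: returns true as soon as pred
  matches any form in data, false otherwise. An ExceptionInfo thrown
  by pred itself is rethrown instead of being mistaken for a match."
  [pred data]
  (let [sentinel (Object.)] ; unique per call, so it cannot collide
    (try
      (walk/postwalk
       (fn [form]
         (if (pred form)
           (throw (ex-info "Found" {::found sentinel}))
           form))
       data)
      false
      (catch clojure.lang.ExceptionInfo e
        (if (identical? sentinel (::found (ex-data e)))
          true
          (throw e))))))

(walk-some-guarded #(and (number? %) (odd? %)) {:a [[9] 3]})
;; => true
(walk-some-guarded #(and (number? %) (even? %)) {:a [[9] 3]})
;; => false
```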
I need to translate an array map that has this structure:
{A [(A B) (A C)], C [(C D)], B [(B nil)], D [(D E) (D F)]}
Into this equivalent list:
'(A (B (nil)) (C (D (E) (F))))
I have this function, which works fine for structures that are not very deep:
(def to-tree (memoize (fn [start nodes]
(list* start
(if-let [connections (seq (nodes start))]
(map #(to-tree (second %) nodes) connections))))))
However, as the number of nested elements grows, it throws a stack overflow error. How can I optimize this function, or rather, is there a way of doing this using walk or any other functional approach?
The input data that you provide looks a lot like an adjacency list. One approach you could take would be to convert your data into a graph and then create trees from it.
Here is a solution using loom to work with graphs. This example only uses one function from loom (loom.graph/digraph), so you could probably build something similar if adding a dependency is not an option for you.
Let's start by creating a directed graph from your data structure.
(require '[loom.graph]
         '[clojure.set])

(defn adj-list
  "Converts the data structure into an adjacency list."
  [ds]
  (into {} (map
            ;; convert [:a [[:a :b] [:a :c]]] => [:a [:b :c]]
            (fn [[k vs]] [k (map second vs)])
            ds)))

(defn ds->digraph
  "Creates a directed graph that mirrors the data structure."
  [ds]
  (loom.graph/digraph (adj-list ds)))
Once we have the graph built, we want to generate the trees from the root nodes of the graph. In your example, there is only one root node (A), but there is really nothing limiting it to just one.
Loom stores a list of all nodes in the graph as well as a set of all nodes with incoming edges to a given node in the graph. We can use these to find the root nodes.
(defn roots
"Finds the set of nodes that are root nodes in the graph.
Root nodes are those with no incoming edges."
[g]
(clojure.set/difference (:nodeset g)
(set (keys (:in g)))))
Given the root nodes, we now just need to create a tree for each. We can query the graph for the nodes adjacent to a given node, and then create trees for those recursively.
(defn to-tree
  "Given a node in a graph, create a tree (lazily).
  Assumes that n is a node in g."
  [g n]
  (if-let [succ (get-in g [:adj n])]
    (cons n (lazy-seq (map #(to-tree g %) succ)))
    (list n)))
(defn to-trees
"Convert a graph into a collection of trees, one for each root node."
[g]
(map #(to-tree g %) (roots g)))
...and that's it! Taking your input, we can generate the desired output:
(def input {:a [[:a :b] [:a :c]] :c [[:c :d]] :b [[:b nil]] :d [[:d :e] [:d :f]]})
(first (to-trees (ds->digraph input))) ; => (:a (:c (:d (:e) (:f))) (:b (nil)))
Here are a couple of inputs for generating structures that are deep or have multiple root nodes.
(def input-deep (into {} (map (fn [[x y z]] [x [[x y] [x z]]]) (partition 3 2 (range 1000)))))
(def input-2-roots {:a [[:a :b] [:a :c]] :b [[:b nil]] :c [[:c :d]] :e [[:e :b] [:e :d]]})
(to-trees (ds->digraph input-2-roots)) ; => ((:e (:b (nil)) (:d)) (:a (:c (:d)) (:b (nil))))
One of the cool things about this approach is that it can work with infinitely nested data structures, since generating the tree is lazy. You will get a StackOverflowError if you try to render the tree (because it is also infinitely nested), but actually generating it is no problem.
The easiest way to play with this is to create a structure with a cycle, such as in the following example. (Note that the :c node is necessary. If only :a and :b are in the graph, there are no root nodes!)
(def input-cycle {:a [[:a :b]] :b [[:b :a]] :c [[:c :a]]})
(def ts (to-trees (ds->digraph input-cycle)))
(-> ts first second first) ;; :a
(-> ts first second second first) ;; :b
You can test for this condition using loom.alg/dag?.
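If adding a dependency is not an option, the same three steps (adjacency list, root nodes, lazy tree) can be sketched with plain maps and sets. This is a rough equivalent rather than a replacement for loom; the names `adjacency`, `root-nodes` and `tree-from` are made up for this example. Like the version above, it builds the tree lazily, so realizing a very deep tree all at once (for example by printing it) can still exhaust the stack.

```clojure
(require '[clojure.set :as set])

(defn adjacency
  "Hypothetical helper: {:a [[:a :b] [:a :c]]} => {:a [:b :c]}"
  [ds]
  (into {} (map (fn [[k pairs]] [k (mapv second pairs)])) ds))

(defn root-nodes
  "Hypothetical helper: nodes that never appear as a successor."
  [adj]
  (set/difference (set (keys adj))
                  (set (mapcat val adj))))

(defn tree-from
  "Hypothetical helper: lazily builds the nested list rooted at n.
  Nodes without an entry in adj become leaves."
  [adj n]
  (cons n (lazy-seq (map #(tree-from adj %) (adj n)))))

(def plain-input
  {:a [[:a :b] [:a :c]] :c [[:c :d]] :b [[:b nil]] :d [[:d :e] [:d :f]]})

(let [adj (adjacency plain-input)]
  (map #(tree-from adj %) (root-nodes adj)))
;; => ((:a (:b (nil)) (:c (:d (:e) (:f)))))
```

Because the successors are kept in vectors, the children appear in the same order as in the input.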
user=> (into {} '((:a :b) (:c :d)))
Throws: ClassCastException clojure.lang.Keyword cannot be cast to java.util.Map$Entry clojure.lang.ATransientMap.conj (ATransientMap.java:44).
Whereas:
user=> (into {} (list [:a :b] [:c :d]))
Is fine. It's a strange difference, since many times other functions return lists when the thing they had to begin with was a vector:
user=> (into {} (partition 2 (interleave [:a :b] [:c :d])))
Will throw, because (partition 2 ...) results in ((:a :c) (:b :d)). So it's pretty annoying. You basically have to memorize both the return types of functions and the specific behaviors of functions like into, or you have to just let stuff blow up and fix it as you find it with stuff like (into {} (map vec (partition 2 (interleave [:a :b] [:c :d])))).
Is there a specific reason why into doesn't like the pairs as lists?
The reason is as you state: only vector pairs can be used to build maps. I don't know of a practical reason why this limitation exists. But there are also several other methods for constructing hash-maps. If you find yourself using partition, perhaps the answer is to use an alternate construction method.
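The error message in the question hints at what is going on: as far as I can tell, conj on a map accepts a map entry, a two-element vector, or a sequence of map entries. A list falls into the last case, so its first element (:a) is cast to a map entry and fails. The sketch below only illustrates that behavior; the var names are made up for the example.

```clojure
;; Two-element vectors are accepted as key/value pairs.
(def from-vectors (into {} [[:a :b] [:c :d]]))

;; A list is not treated as a pair, so convert each pair with vec first.
;; (map vec) is used here as a transducer.
(def from-lists (into {} (map vec) '((:a :b) (:c :d))))

;; A seq of map entries is accepted: each entry is added in turn.
(def from-entries (conj {} (seq {:a :b})))

;; The failing case from the question, wrapped so the outcome is a value.
(def list-pair-result
  (try
    (into {} '((:a :b) (:c :d)))
    (catch ClassCastException _ :class-cast-exception)))
```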
If you have parallel sequences of keys and values:
(zipmap [:a :c] [:b :d])
If you have all the items in a flat sequence:
(apply hash-map [:a :b :c :d])
Building a map from a sequence:
(into {} (for [[k v] xs]
[k (transform v)]))
I never realized this wouldn't work! Don't forget:
(apply hash-map (interleave [:a :b] [:c :d]))
;=> {:b :d, :a :c}
since hash-map implicitly creates pairs from the scalar args:
(hash-map :a :c :b :d)
;=> {:b :d, :a :c}
you don't really need the (partition 2...) which is the source of the problem.
Context
As an exercise for myself (I'm learning Clojure), I wanted to implement the depth-first search algorithm.
How I did it
Using recursion
(def graph
{:s {:a 3 :d 4}
:a {:s 3 :d 5 :b 4}
:b {:a 4 :e 5 :c 4}
:c {:b 4}
:d {:s 4 :a 5 :e 2}
:e {:d 2 :b 5 :f 4}
:f {:e 4 :g 1}})
(def stack [[:s]])
(def goal :g)
(defn cost [Graph start goal]
(goal (start Graph)))
(defn hasloop? [path]
(not (= (count path) (count (set path)))))
(defn atgoal? [path]
(= goal (last path)))
(defn solved? [stack]
(some true? (map atgoal? stack)))
(defn addtopath [path node]
(conj path node))
(defn pop* [stack]
(last stack))
(defn findpath [stack]
(if (not (solved? stack))
(let [first* (pop* stack) l (last first*) ]
(findpath (drop-last
(remove hasloop? (lazy-cat
(map #(addtopath first* %)
(keys (l graph))) stack)))))
[(first stack)]))
How to use
(findpath stack)
Question
I'm really interested in how this code can be improved in terms of readability, efficiency, and performance.
Do not use lazy-cat, your seq is realized if you do drop-last on it.
Recursion in Clojure should be done using loop/recur to avoid stack overflows.
Do not put several let bindings on a single line:
(let [first* (pop* stack)
      l (last first*)]
Use (if-not ...) instead of (if (not ...)). The same goes for (not= ...) instead of (not (= ...)).
Use lower-case var names (graph, not Graph). Keep capitalization to classes, records and protocols.
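Putting these points together, a loop/recur version could look like the sketch below. It is one possible rewrite, not the only one: it passes the graph, start and goal as arguments instead of relying on global vars, keeps the stack in a vector so that peek and pop work on its end, and skips nodes that are already on the current path. The name `find-path` is made up for this example.

```clojure
(def graph
  {:s {:a 3 :d 4}
   :a {:s 3 :d 5 :b 4}
   :b {:a 4 :e 5 :c 4}
   :c {:b 4}
   :d {:s 4 :a 5 :e 2}
   :e {:d 2 :b 5 :f 4}
   :f {:e 4 :g 1}})

(defn find-path
  "Depth-first search from start to goal.
  Returns the first path found as a vector, or nil if there is none."
  [graph start goal]
  (loop [stack [[start]]]               ; stack of partial paths
    (when-let [path (peek stack)]       ; nil when the stack is empty
      (let [node (peek path)]
        (if (= node goal)
          path
          (let [visited    (set path)
                next-paths (for [n (keys (graph node))
                                 :when (not (visited n))] ; avoid loops
                             (conj path n))]
            ;; recur is in tail position, so no stack frames pile up
            (recur (into (pop stack) next-paths))))))))

(find-path graph :s :g)
```

Which path comes back first depends on the order in which neighbours are pushed, so the result is a valid path but not necessarily the cheapest one.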