How do you make a binary search tree in Clojure?

In Scheme, I can use define-struct to make a binary search tree, but how do you do it in Clojure?

You can use structmaps. To define one:
(defstruct bintree :left :right :key)
To make an instance:
(struct-map bintree :left nil :right nil :key 0)
You can then access the values in the struct like this:
(:left tree)
etc.
Or you can create new accessor functions:
(def left-branch (accessor bintree :left))
and use it:
(left-branch tree)
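To go from a node definition to an actual search tree, something like the following could work. It is only a sketch: it uses plain maps with the same :left, :right and :key slots rather than structmaps (records via defrecord would be another option), the names bst-insert and bst-contains? are made up for this illustration, and keys are assumed to be numbers.

```clojure
;; Hypothetical helpers, not part of Clojure: a persistent BST built from plain maps.
;; nil stands for the empty tree.
(defn bst-insert
  "Returns a new tree with k inserted; the old tree is left untouched."
  [tree k]
  (cond
    (nil? tree)       {:key k :left nil :right nil}
    (< k (:key tree)) (update tree :left bst-insert k)
    (> k (:key tree)) (update tree :right bst-insert k)
    :else             tree)) ;; key already present

(defn bst-contains?
  "True if k is stored somewhere in tree."
  [tree k]
  (cond
    (nil? tree)       false
    (< k (:key tree)) (recur (:left tree) k)
    (> k (:key tree)) (recur (:right tree) k)
    :else             true))

;; Build a tree by inserting keys one at a time.
(def tree (reduce bst-insert nil [5 3 8 1 4]))

(bst-contains? tree 4) ;; => true
(bst-contains? tree 7) ;; => false
```

Because update returns a new map, every insert shares the untouched branches with the previous version of the tree.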

I don't know Clojure, but I bet it's the same way you do it in Scheme without define-struct ... just cons together the left and right branches. To find something, recurse until you hit an atom.
Seriously, though, structmaps sound like what you want. I found this page. Look for structmaps about half way down.

The simplest way is to use the tree that is already defined in the language: every sorted-map is really a tree. If you just need a different function to compare keys, use sorted-map-by.
;; define a function for comparing keys
(defn compare-key-fn [key1 key2] (< key1 key2))
;; define the tree and add elements
(def my-tree
  (-> ;; threading macro: each form receives the previous result as its first argument
      (sorted-map-by compare-key-fn) ;; returns an empty tree that uses the given function to compare keys
      (assoc 100 "data for key = 100") ;; below we add elements to the tree
      (assoc 2 "data for key = 2")
      (assoc 10 "data for key = 10")
      (assoc -2 "data for key = -2")))
;; accessing elements by key
(prn "element for key 100 =" (my-tree 100))
;; "erasing" elements from the tree: what we really do is return a new tree that contains all elements of the old one except the element we have just erased
(def my-new-tree
  (dissoc my-tree 2))
(prn my-new-tree) ;; to verify that element 2 is "erased"
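What a sorted map gives you over a hash map is ordered access. As a small illustration (using the default comparator and made-up values, not the data from the answer above), keys come back in sorted order and subseq / rsubseq return ranges:

```clojure
;; A sorted map with the default comparator; the values are placeholders.
(def m (sorted-map 100 "a" 2 "b" 10 "c" -2 "d"))

(keys m)              ;; => (-2 2 10 100)
(first m)             ;; => [-2 "d"]   smallest key
(subseq m >= 2 <= 10) ;; => ([2 "b"] [10 "c"])   entries with 2 <= key <= 10
(rsubseq m < 10)      ;; => ([2 "b"] [-2 "d"])   entries with key < 10, descending
```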

Related

Is manipulating a vector of nested maps possible using zippers?

I need to turn the following input into output by applying the following two rules:
remove all vectors that have "nope" as last item
remove each map that does not have at least one vector with "ds1" as last item
(def input
[{:simple1 [:from [:simple1 'ds1]]}
{:simple2 [:from-any [[:simple2 'nope] [:simple2 'ds1]]]}
{:walk1 [:from [:sub1 :sub2 'ds1]]}
{:unaffected [:from [:unaffected 'nope]]}
{:replaced-with-nil [:from [:the-original 'ds1]]}
{:concat1 [:concat [[:simple1 'ds1] [:simple2 'ds1]]]}
{:lookup-word [:lookup [:word 'word :word 'ds1]]}])
(def output
[{:simple1 [:from [:simple1 'ds1]]}
{:simple2 [:from-any [[:simple2 'ds1]]]}
{:walk1 [:from [:sub1 :sub2 'ds1]]}
{:replaced-with-nil [:from [:the-original 'ds1]]}
{:concat1 [:concat [[:simple1 'ds1] [:simple2 'ds1]]]}
{:lookup-word [:lookup [:word 'word :word 'ds1]]}])
I was wondering if performing this transformation is possible with zippers?
I'd recommend clojure.walk instead for this kind of general tree transformation. It can take a bit of fiddling to get the replacement functions right but it works nicely with any nesting of Clojure data structures, which AFAIK can be a bit more challenging in a zipper based approach.
We're looking to shrink our tree, so postwalk is my go-to here. It takes a function f and a tree root and goes through the tree, replacing each leaf value with (f leaf), then their parents and their parents etc. until finally replacing the root. (prewalk is similar but proceeds from root and down to leaves, so it's usually more natural when you're growing the tree by splitting branches.)
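To see the difference in visiting order, here is a small sketch that records every node each walker hands to its function. The helper visit-order is made up for this illustration; only postwalk and prewalk come from clojure.walk.

```clojure
(require '[clojure.walk :refer [postwalk prewalk]])

(defn visit-order
  "Hypothetical helper: returns the nodes in the order walker passes them to its function."
  [walker tree]
  (let [seen (atom [])]
    ;; The function returns each node unchanged, so the tree itself is not modified.
    (walker (fn [node] (swap! seen conj node) node) tree)
    @seen))

(visit-order postwalk [1 [2 3]]) ;; => [1 2 3 [2 3] [1 [2 3]]]   leaves first, root last
(visit-order prewalk  [1 [2 3]]) ;; => [[1 [2 3]] 1 [2 3] 2 3]   root first, leaves last
```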
The strategy here is to somehow construct a function that prunes any branch which meets our removal criteria, but returns any other value unchanged.
(ns shrink-tree
(:require [clojure.walk :refer [postwalk]]))
(letfn[(rule-1 [node]
(and (vector? node)
(= 'nope (last node))))
(rule-2 [node]
(and
(map? node)
(not-any? #(and (vector? %) (= 'ds1 (last %)))
(tree-seq vector? seq (-> node vals first)))))
(remove-marked [node]
(if (coll? node)
(into (empty node) (remove (some-fn rule-1 rule-2) node))
node))]
(= output (postwalk remove-marked input)))
;; => true
Here the fns rule-1 and rule-2 turn your rules into predicates, and remove-marked does the following:
If a node is a collection, it returns the same collection, less any members for which rule-1 or rule-2 returns truthy when called with that member. To check for either one at the same time we combine the predicates with some-fn.
Otherwise it returns the same node. This is how we keep values like 'ds1 or :from-any around.
You might also want to consider looking at specter. It supports these sorts of transformations by allowing you to select and transform arbitrarily complex structures.

"nth not supported" on PersistentHashSet when destructuring Set in Loop header

Clojure noob here.
I want to pull the first element and the rest out of a Set. Doing (first #{1}) and (rest #{1}) produces 1 and () respectively, which is mostly what I'd expect.
However in the code below, I use the destructuring [current-node & open-nodes] #{start} in my loop to pull something out of the set (at this point I don't really care about if it was the first or last item. I just want this form working) and it breaks.
Here's my function, half-implementing a grid search:
(defn navigate-to [grid start dest]
"provides route from start to dest, not including start"
(loop [[current-node & open-nodes] #{start} ;; << throws exception
closed-nodes #{}]
(if (= dest current-node)
[] ;; todo: return route
(let [all-current-neighbours (neighbours-of grid current-node) ;; << returns a set
open-neighbours (set/difference all-current-neighbours closed-nodes)]
(recur (set/union open-nodes open-neighbours)
(conj closed-nodes current-node))))))
When stepping through (with Cider), on the start of the first loop, it throws this exception:
UnsupportedOperationException nth not supported on this type: PersistentHashSet clojure.lang.RT.nthFrom (RT.java:933)
I could use a nested let form that does first/rest manually, but that seems wasteful. Is there a way to get destructured Sets working like this in the loop form? Is it just not supported on Sets?
Sets are unordered, so positional destructuring doesn't make much sense.
According to the documentation for Special Forms, which treats destructuring as well, sequential (vector) binding is specified to use nth and nthnext to look up the elements to bind.
Vector binding-exprs allow you to bind names to parts of sequential things (not just vectors), like vectors, lists, seqs, strings, arrays, and anything that supports nth.
Clojure hash sets (being instances of java.util.Set) do not support lookup by index.
I don't know the context of your example code, but in any case pouring the set contents into an ordered collection, for example (vec #{start}), would make the destructuring work.
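A minimal sketch of both cases, with made-up var names (destructured, failure) so the results can be inspected at the REPL:

```clojure
;; Pouring the set into a vector first makes positional destructuring work.
(def destructured
  (let [[x & more] (vec #{:start})]
    [x more]))
;; destructured => [:start nil]   (nothing is left after the single element)

;; Destructuring the set directly throws at runtime, because nth is not supported on sets.
(def failure
  (try
    (let [[x] #{1}] x)
    (catch UnsupportedOperationException e
      (.getMessage e))))
;; failure holds the exception message
```

Note that with more than one element in the set, which element ends up first in the vector is not something to rely on.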
As mentioned by others you cannot bind a set to a vector literal, because a set is not sequential. So even this simple let fails with nth not supported:
(let [[x] #{1}])
You could work around this by "destructuring" the set with the use of first and disj:
(loop [remaining-nodes #{start}
closed-nodes #{}]
(let [current-node (first remaining-nodes)
open-nodes (disj remaining-nodes current-node)]
;; rest of your code ...
))
Using (rest remaining-nodes) instead of (disj remaining-nodes current-node) could be possible, but as sets are unordered, rest is in theory not obliged to take out the same element as was extracted with first. Anyway disj will do the job.
NB: be sure to detect remaining-nodes being empty (in which case first returns nil), as missing that could lead to an endless loop.
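One way to package the first/disj pattern together with the emptiness check is a small helper that returns a vector, which can then be destructured positionally. This is only a sketch; pop-any and drain are names invented here, not core functions.

```clojure
(defn pop-any
  "Hypothetical helper: returns [element remaining-set] for a non-empty set, nil for an empty one."
  [s]
  (when (seq s)
    (let [x (first s)]
      [x (disj s x)])))

(defn drain
  "Visits every element of a set once, in whatever order first hands them out."
  [s]
  (loop [remaining s
         visited   []]
    ;; if-let stops the loop as soon as pop-any returns nil (empty set).
    (if-let [[node others] (pop-any remaining)]
      (recur others (conj visited node))
      visited)))

(pop-any #{:a}) ;; => [:a #{}]
(pop-any #{})   ;; => nil
(drain #{1 2 3}) ;; a vector holding 1, 2 and 3 in some order
```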
Algorithm for returning the route
For implementing the missing part in the algorithm (returning the route) you could maintain a map of paths. It would have one path for each visited node: a vector with the nodes leading from the start node to that node, keyed by that node.
You could use reduce to maintain that map of paths as you visit new nodes. With a new function used together with that reduce and an added nil test, the program could look like this:
(defn add-path
  "adds a node to a given path, which is added to a map of paths, keyed by that node"
  [[path paths] node]
  [path (assoc paths node (conj path node))])
(defn navigate-to
  "provides route from start to dest, including both"
  [grid start dest]
(loop [remaining-nodes #{start}
closed-nodes #{}
paths (hash-map start [start])]
(let [current-node (first remaining-nodes)
current-path (get paths current-node)
all-current-neighbours (neighbours-of grid current-node)
open-neighbours (set/difference all-current-neighbours closed-nodes)]
(if (contains? #{dest nil} current-node)
current-path ;; search complete
(recur (set/union (disj remaining-nodes current-node) open-neighbours)
(conj closed-nodes current-node)
(second (reduce add-path [current-path paths] open-neighbours)))))))
The essence of the algorithm is still the same, although I merged the original let with the one needed for destructuring the nodes. This is not absolutely needed, but it probably makes the code more readable.
Test
I tested this with a poor man's definition of grid and neighbours-of, based on this graph (digits are nodes, bars indicate linked nodes):
0--1  2
|  |  |
3--4--5
|
6--7--8
This graph seems a good candidate for a test as it has a loop, a dead end, and is connected.
The graph is encoded with grid being a vector, where each element represents a node. An element's index in that vector is the node's identifier. The content of each element is a set of neighbours, making the neighbours-of function a trivial thing (your implementation will be different):
(def grid [#{1 3} #{0 4} #{5}
#{0 4 6} #{1 3 5} #{2 4}
#{3 7} #{6 8} #{7} ])
(defn neighbours-of [grid node]
(get grid node))
Then the test is to find the route from node 0 to node 8:
(println (navigate-to grid 0 8))
Output is:
[0 1 4 3 6 7 8]
This outcome demonstrates that the algorithm does not guarantee a shortest route, only that a route will be found if it exists. I suppose the outcome could be different on different engines, depending on how the Clojure internals decide which element to take from a set with first.
After removing one of the necessary node links, like the one between node 7 and 8, the output is nil.
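If a shortest route matters, one option is a breadth-first search, which is a different technique from the set-based loop above. The sketch below reuses the same grid encoding; shortest-route is a name made up here, and the queue is clojure.lang.PersistentQueue holding whole paths.

```clojure
(def grid [#{1 3}   #{0 4}   #{5}
           #{0 4 6} #{1 3 5} #{2 4}
           #{3 7}   #{6 8}   #{7}])

(defn neighbours-of [grid node]
  (get grid node))

(defn shortest-route
  "Breadth-first search: returns a shortest path (as a vector of nodes) from start to dest, or nil."
  [grid start dest]
  (loop [queue   (conj clojure.lang.PersistentQueue/EMPTY [start]) ;; queue of paths
         visited #{start}]
    ;; peek returns nil once the queue is empty, which ends the search with nil.
    (when-let [path (peek queue)]
      (let [node (peek path)] ;; last node of the path at the head of the queue
        (if (= node dest)
          path
          (let [fresh (remove visited (neighbours-of grid node))]
            (recur (into (pop queue) (map #(conj path %) fresh))
                   (into visited fresh))))))))

(shortest-route grid 0 8) ;; => [0 3 6 7 8]
```

Paths are dequeued in order of length, so the first path that reaches dest is one with the fewest steps.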
NB: I found this an interesting question, and probably went a bit too far in my answer.

How do I flatten a sequence of sequences of maps into a sequence of vectors?

I'm trying to build a POS tagger in Clojure. I need to iterate over a file and build out feature vectors. The input is (text pos chunk) triples from a file like the following:
input from the file:
I PP B-NP
am VBP B-VB
groot NN B-NP
I've written functions to input the file, transform each line into a map, and then slide over a variable amount of the data.
(defn lazy-file-lines
"open a file and make it a lazy sequence."
[filename]
(letfn [(helper [rdr]
(lazy-seq
(if-let [line (.readLine rdr)]
(cons line (helper rdr))
(do (.close rdr) nil))))]
(helper (clojure.java.io/reader filename))))
(defn to-map
  "take a line from a file and make it a map."
  [lines]
  (map
    #(zipmap [:text :pos :chunk] (clojure.string/split (apply str %) #" "))
    lines))
(defn window
"create windows around the target word."
[size filelines]
(partition size 1 [] filelines))
I plan to use the above functions in the following way:
(take 2 (window 3(to-map(lazy-file-lines "/path/to/train.txt"))))
which gives the following output for the first two entries in the sequence:
(({:chunk B-NP, :pos NN, :text Confidence} {:chunk B-PP, :pos IN, :text in} {:chunk B-NP, :pos DT, :text the}) ({:chunk B-PP, :pos IN, :text in} {:chunk B-NP, :pos DT, :text the} {:chunk I-NP, :pos NN, :text pound}))
Given each sequence of maps within the sequence, I want to extract :pos and :text for each map and put them in one vector. Like so:
[Confidence in the NN IN DT]
[in the pound IN DT NN]
I've not been able to conceptualize how to handle this in clojure. My partial attempted solution is below:
(defn create-features
"creates the features and tags from the datafile."
[filename windowsize & features]
(map #(apply select-keys % [:text :pos])
(->>
(lazy-file-lines filename)
(window windowsize))))
I think one of the issues is that apply is referencing a sequence itself, so select-keys isn't operating on a map. I'm not sure how to nest another apply function into this, though.
Any thoughts on this code would be great. Thanks.
I'm not entirely sure what you want as input and output, and to be honest, I don't want to work through all of the code that you've provided to figure that out, since I don't think that all of the code is essential to the question. Someone else may give you an answer that's narrowly tailored to your code, but I think the real question is more general.
I'm guessing that the general idea of what you want to implement is that:
Given a sequence of sequence of maps, select those map entries that have particular keys, and then return a sequence of vectors representing map entries. If that's not what you want, I think that the following will probably give you an idea about how to proceed.
This method is not the most efficient or concise, but it breaks the problem down into a series of steps that are easy to understand:
(defn selkeys-or-not
"Like select-keys, but returns nil rather than {} if no keys match."
[keys map]
(not-empty (select-keys map keys)))
(defn seq-seqs-maps-to-seq-vecs
"Given a sequence of keys, and a sequence of sequences of maps,
returns a sequence of vectors, where each vector contains key-val
pairs from the maps for matching keys."
[keys seq-seqs-maps]
(let [maps (flatten seq-seqs-maps)]
(map vec
(apply concat
(filter identity
(map (partial selkeys-or-not keys) maps))))))
What's happening in the second function:
First, we flatten the outer sequence, since the fact that the maps are within inner sequences is irrelevant to our goals. This gives us a single sequence of maps.
Then we map a helper function selkeys-or-not over the sequence of maps, passing our keys to the helper function. select-keys returns {} when it finds nothing, but {} is truthy, and we want a falsey value in this case for the next step. selkeys-or-not returns a falsey value (nil) instead of {}.
Now we can filter out the nils using filter identity. filter returns a sequence containing all values for which its first argument returns a truthy value.
At this point we have a sequence of maps, but we want a sequence of vectors instead. Applying concat turns the sequence of maps into a sequence of map entries, and mapping vec over them turns the map entries into vectors.
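Each step can be tried on its own at the REPL. The maps here are tiny made-up examples, not the tagger data:

```clojure
;; select-keys returns an empty map when nothing matches, and {} is truthy.
(select-keys {:a 1 :b 2} [:z])             ;; => {}

;; not-empty turns the empty map into nil, which filter identity can drop.
(not-empty (select-keys {:a 1 :b 2} [:z])) ;; => nil
(filter identity [{:a 1} nil {:b 2}])      ;; => ({:a 1} {:b 2})

;; concat treats each map as a sequence of entries; vec makes each entry a plain vector.
(apply concat [{:a 1} {:b 2}])             ;; => ([:a 1] [:b 2])
(map vec (apply concat [{:a 1} {:b 2}]))   ;; => ([:a 1] [:b 2])
```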
(defn extract-line-seq
[ls]
(concat (map :text ls)
(map :pos ls)))
(extract-line-seq '({:chunk B-NP, :pos NN, :text Confidence} {:chunk B-PP, :pos IN, :text in} {:chunk B-NP, :pos DT, :text the}))
;-> (Confidence in the NN IN DT)
You can put it into a vector if you want outside of the function. This way laziness is an option to the caller.
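If vectors are wanted directly, a variation along these lines might do. It is a sketch only: window->feature-vec is a made-up name, and the sample data is typed in as strings rather than read from the training file.

```clojure
(defn window->feature-vec
  "Hypothetical helper: all :text values of a window followed by all :pos values, as one vector."
  [window]
  (into (mapv :text window) (map :pos window)))

;; Two windows shaped like the output shown in the question.
(def windows
  [[{:chunk "B-NP" :pos "NN" :text "Confidence"}
    {:chunk "B-PP" :pos "IN" :text "in"}
    {:chunk "B-NP" :pos "DT" :text "the"}]
   [{:chunk "B-PP" :pos "IN" :text "in"}
    {:chunk "B-NP" :pos "DT" :text "the"}
    {:chunk "I-NP" :pos "NN" :text "pound"}]])

(map window->feature-vec windows)
;; => (["Confidence" "in" "the" "NN" "IN" "DT"] ["in" "the" "pound" "IN" "DT" "NN"])
```

The outer map stays lazy, so windows are only turned into vectors as they are consumed.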

Traversing a tree in Pre-order

I am new to clojure as well as to Functional Programming. I am trying to traverse a tree in pre-order using:
(def preordercoll [])
(deftrace preorder [mytree]
(if-not (empty? mytree)
(do (println "position"(value mytree))
(cons (value mytree) (preorder (left-child mytree)))
(cons (value mytree) (preorder (right-child mytree))))
)preordercoll) )
(preorder [45[65 [90 nil nil] [81 nil nil]] [72[82 nil nil][96 nil nil]]])
I am unable to append the node values to a list. I tried using conj on the global variable preordercoll, but it doesn't work the way it would in an object-oriented language. So I tried using cons, but only a few values are returned, and in the wrong order. Can anyone tell me what mistake I am making?
I also thought of using partial, but could not work out how to supply the node value recursively. I am not asking for code, but please point me in the right direction to get the collection of values in pre-order.
You're on the right track, you just have the nesting of the cons calls a little off. First, a note on the evaluation of forms in Clojure. One of the key ideas is that every form evaluates to something*, which is why there is no "return" statement in the language: everything would be a return statement, so there is no point in having one. In the case of a do expression, the value of the whole expression is the value of its last form, so:
(do 1 2 3)
returns (evaluates to) 3. In the do expression in your code it returns the result of the second cons, and the first cons has no effect.
(do (println "position"(value mytree))
(cons (value mytree) (preorder (left-child mytree))) ;; <-- this does nothing
(cons (value mytree) (preorder (right-child mytree))))
Instead, it sounds like you would like an expression that starts with the result of calling preorder on the left tree, then concatenates the result of calling preorder on the right tree, then attaches the current node's value to the front of that.
(let [left-side (preorder (left-child mytree))
      right-side (preorder (right-child mytree))
      this-value (value mytree)]
  (do (println "position" this-value)
      (cons this-value (concat left-side right-side))))
*(for the pedants) "except the ignore reader macro #_"
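Put together, a complete version might look like the sketch below. The accessors value, left-child and right-child are not shown in the question, so the definitions here are assumptions based on the [value left right] vectors in the sample call; preorder-lazy is an extra, made-up alternative built on tree-seq.

```clojure
;; Assumed accessors for nodes shaped like [value left right]; nil is the empty tree.
(defn value [t] (first t))
(defn left-child [t] (second t))
(defn right-child [t] (nth t 2 nil))

(defn preorder
  "Returns the node values in pre-order: node, then left subtree, then right subtree."
  [t]
  (if (empty? t)
    []
    (cons (value t)
          (concat (preorder (left-child t))
                  (preorder (right-child t))))))

(defn preorder-lazy
  "Same result via tree-seq, which walks depth-first and yields a node before its children."
  [t]
  (map value (tree-seq some? #(remove nil? (rest %)) t)))

(def sample [45 [65 [90 nil nil] [81 nil nil]] [72 [82 nil nil] [96 nil nil]]])

(preorder sample)      ;; => (45 65 90 81 72 82 96)
(preorder-lazy sample) ;; => (45 65 90 81 72 82 96)
```

No global collection is needed: each call returns its own sequence and the caller combines the pieces.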

Convert map keys and values to string array

How do I convert a clojure map into string, almost key value pair, as shown below:
Clojure data:
(def data { :starks "Winter is coming" :Lannisters "Hear me roar" })
I want to convert the above to
"starks" "winter is coming" "Lannisters" "hear me roar"
I don't want any identifiers / delimiters between but obviously "starks" should always be followed by "winter is coming"
I tried this:
(str (keys data) (vals data))
Which outputs this:
"(:starks :Lannisters)(\"Winter is coming\" \"Hear me roar\")"
Which is not what I want at all...
The map data keys and values are not always the same so it needs to be generic
there will always be just one level, as in, the value will not contain a nested map etc..
Edit
What I'm actually trying to do:
I am trying to index a few thousand Neo4j nodes with clojure. To help me with this task, I am using Neocons Clojure neo4j library.
According to the documentation, the add-to-index accepts properties and values like so:
(nn/add-to-index (:id node) (:name idx) "username" "joe")))
which is, in my above case, going to look like
(nn/add-to-index (:id node) (:name idx) "starks" "winter is coming" "Lannisters" "Hear me roar")))
now, I have my Node, I can access the node properties with (:data node) and that gives me a clojure map.
The property differs pretty much from node to node so I'm trying to figure out how to pass that map to the library in the way that it understands..
Marius Danila's answer got me almost there.
Doing
(map name (apply concat data))
still complains of the third parameter, as it has the braces around the result.
So, how can I achieve this?
Do I just have to write a whole lot of if-not blocks to construct the properties myself?
Thanks
This should do the trick:
(map name (apply concat data))
A map can be viewed as a sequence of key-value pairs, which in turn are represented as two-element vectors. We concatenate the pairs and then extract the name from the keywords and strings (for strings this does nothing, for keywords it returns the bit without the ':').
From the code you've posted, I'm guessing you would use this like so:
(apply nn/add-to-index (list* (:id node) (:name idx) (map name (apply concat data))))
The (nn/add-to-index ...) function accepts only four arguments: the node, the index, and one key/value pair. So you have to loop through your data like this:
(doseq [[k v] data]
  (nn/add-to-index (:id node) (:name idx) (name k) (clojure.string/lower-case v)))
Unlike the str function in Clojure, the add-to-index function is more limited and does not accept variable parameter lists.
You can use vector to have array like random access:
=> (def v (vec (map name (apply concat data))))
=> (v 0)
;"starks"
=> (v 1)
;"Winter is coming"
You could try the following:
=> (interleave (map name (keys data)) (vals data))
;; which returns ("starks" "Winter is coming" "Lannisters" "Hear me roar")
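The "braces around the result" problem comes from passing the sequence as a single argument; apply spreads it into separate arguments instead. A sketch with a stand-in function follows. fake-index! and map->string-args are invented for this illustration and are not part of Neocons, whose real add-to-index may accept a different number of arguments.

```clojure
(def data {:starks "Winter is coming" :Lannisters "Hear me roar"})

(defn map->string-args
  "Hypothetical helper: flattens a one-level map into name, value, name, value, ..."
  [m]
  (mapcat (fn [[k v]] [(name k) v]) m))

(defn fake-index!
  "Stand-in for a variadic indexing function; NOT the Neocons API."
  [id index-name & kvs]
  {:id id :index index-name :pairs (partition 2 kvs)})

;; apply passes each element of the sequence as its own argument.
(apply fake-index! 42 "houses" (map->string-args data))
;; :pairs pairs each name with its own value, e.g. ("starks" "Winter is coming")
```

Since each key is emitted right before its own value, a key is always followed by the matching value regardless of the order in which the map yields its entries.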